
[Tuesday 27th January 2009 at 9:35 pm]
boggyb (Thomas)

Today's discovery was a full /var/log/wtmp file.

Those of you who know what that is are probably staring at this going "WTF?". For those who don't know (i.e. non-die-hard-Linux-geeks), this file tracks all logins and logouts: every time someone (or something) logs in or out, an entry gets appended to it. And, following the Unix philosophy, no program ever expects that this file might become full. Because, of course, such a thing could never possibly happen. Ever.
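For the curious, here's roughly what that per-program logging looks like. This is only a minimal sketch using glibc's updwtmp() - it's not our ftpd's actual code, and the username and terminal are made up - but note the punchline: updwtmp() returns void, so the program can't even find out whether the append worked.

    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <utmp.h>

    int main(void)
    {
        struct utmp ut;

        memset(&ut, 0, sizeof ut);
        ut.ut_type = USER_PROCESS;                /* "someone logged in" */
        ut.ut_pid = getpid();
        strncpy(ut.ut_line, "pts/0", sizeof ut.ut_line);     /* made up */
        strncpy(ut.ut_user, "someuser", sizeof ut.ut_user);  /* made up */
        ut.ut_tv.tv_sec = time(NULL);

        /* Append the record to wtmp. Returns void: a failed (or fatal,
           as it turns out) write is completely invisible to the caller. */
        updwtmp("/var/log/wtmp", &ut);
        return 0;
    }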

Ha. Ha. Ha.

It turned out that the ftpd variant we were using wrote to this file on login/logout (oh yes - on Linux it's the responsibility of each individual program to log account usage, not the operating system), and this particular system had a 2GB file size limit. Why, I don't know - even FAT32 can handle files larger than that. Anyway, given that this is a load box it was quite easy to hit the 2GB limit, and when that happened, rather than returning an error code, Linux's default behaviour is apparently to send a SIGXFSZ signal. And the default behaviour for *that* is to terminate the process.
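You can reproduce this without actually filling a 2GB disk by shrinking the per-process limit first. A sketch (the 1KB limit and the filename are made up, standing in for the real 2GB one):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(void)
    {
        /* Stand-in for the 2GB limit: cap this process's files at 1KB */
        struct rlimit rl = { 1024, 1024 };
        setrlimit(RLIMIT_FSIZE, &rl);

        int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        char buf[1024] = { 0 };

        write(fd, buf, sizeof buf);   /* fills the file right up to the limit */

        /* The offset now sits at the limit, so the kernel delivers SIGXFSZ
           here, and the default action kills the process mid-write(). */
        write(fd, buf, sizeof buf);

        puts("never reached");
        return 0;
    }

Run it and the second write() never returns; the shell just reports something like "File size limit exceeded".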

Comments:
From: jecook
Tuesday 27th January 2009 at 11:58 pm (UTC)
*hands you a very small violin*

We've had to deal with that issue for the better part of two years on the AIX clusters - some doofus made the /var filesystem waaaaaay too small, and we outgrew it.

From: boggyb
Wednesday 28th January 2009 at 6:06 pm (UTC)
Been there, done that - had one box where the out-of-memory killer ran rampant and ate crond. Of course, we only found out a couple of months later when /var filled...

From: pakennedy
Wednesday 28th January 2009 at 8:42 am (UTC)
So ftpd terminated every time someone tried to use it. What about local login? Was the shell deep-sixing every time you tried to log in?

From: boggyb
Wednesday 28th January 2009 at 6:08 pm (UTC)
I was able to ssh in fine, so presumably OpenSSH's sshd is smart enough to not self-destruct on a full wtmp.

From: tau_iota_mu_c
Wednesday 28th January 2009 at 10:26 am (UTC)
SIGXFSZ - does that imply that it was trying to write a 2GB core file (going by the manpage)? I.e. that it had tried to allocate 2GB, hit a 32-bit problem, failed, segfaulted, and then couldn't write the core file because it had mmapped >2GB?

From: boggyb
Wednesday 28th January 2009 at 6:11 pm (UTC)
I don't think so - I think the execution path went something like open(), seek(EOF), write(), and the SIGXFSZ was on the write(). Core dumps are supposed to be slightly intelligent, in that the core dump will be truncated if it exceeds the core limit.

The part of the manpages that confused me is that supposedly SIGXFSZ is only sent if you exceed the per-process file size limit, and ulimit happily told me that there was no limit set. There is supposed to be an EFBIG ("File too large") error code from write(), but I never saw that in the strace.
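Having poked at this a bit more: I suspect the strace never showed an error because EFBIG from write() only appears if the process ignores or catches SIGXFSZ - with the default disposition, the process dies before write() ever returns. And going by kernel source of that vintage, the same signal apparently also gets sent when a write to a file opened without O_LARGEFILE hits the 2GB mark, ulimit or no ulimit - which would explain seeing it with no limit set. Here's a sketch of the survivable version (the 1KB limit and filename are again made up, standing in for the 2GB one):

    #include <errno.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(void)
    {
        /* Opt out of the default death: with SIGXFSZ ignored, an
           oversized write() fails with EFBIG instead of killing us. */
        signal(SIGXFSZ, SIG_IGN);

        struct rlimit rl = { 1024, 1024 };   /* stand-in for the 2GB limit */
        setrlimit(RLIMIT_FSIZE, &rl);

        int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        char buf[1024] = { 0 };

        write(fd, buf, sizeof buf);          /* fills up to the limit */
        if (write(fd, buf, sizeof buf) < 0)  /* now fails instead of dying */
            printf("write: %s\n", strerror(errno));   /* "File too large" */
        return 0;
    }

Presumably sshd does something along these lines, which would explain why ssh logins kept working.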