The Two Nasty Bugs
The nasty bugs
You know what it is like: you never saw them, but some people did, and these people are reliable, so there must be something wrong. But nobody on the engineering team managed to reproduce the issues. And the issues are random, and only happen rarely. Very rarely, such as once in a month for a given customer, or even once in a year for another one. And you know what: these bugs are horrible, because it seems you’ll never catch them. They hide, and silently trigger when you are asleep.
First bug report: Can not run an external crash reporter on Linux
This was the nicest one, actually, and it took only few days to investigate it. Basically, in case of crash, a crash reporter was supposed to be ran, and send useful information (stack traces et al.). Unfortunately the crash reporter appeared not to work correctly. It did not start. For an unknown reason (no error messages)
Fortunately, this bug was reproducible internally.
Second bug report: I see log files in my database on Windows
Yes, log files in my database. Or more precisely log files in my database binary files. I was a bit surprised, too. It seems that, randomly, some log files that were supposed to be sent to the logs were redirected to another random file, including database ones.
Obviously, the database was not happy after that.
That was the report after months of investigation: the first ones were just the usual there was a random crash and I have no clue why – I scratched everything and now it works.
These two issues were actually related, and the root cause was the way file descriptors are handled.
Oh, what are file descriptors, but he way ?
The Three Little Descriptors
All C programmers know the three standard library input/output files:
stderr. These are the default standard input, output, and error streams. The first one typically receives the console input (keyboard), the later ones allows to display output.
If you look at the stdin manual page, you will see that
A file with associated buffering is called a stream and is declared to be a pointer to a defined type FILE. (…) At program start-up, three streams shall be predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output).
FILE streams are encapsulating a file descriptor, that is, a lower-level input or output resource (which is generally identified by an integer, ie. an
int), and provide additional features, such as buffering, and offer a set of advanced functions, such as
fprintf to print formatted output.
By default, the input/output streams are connected to the console (Linux, or DOS), but you may redirect them easily – this feature is especially known in the Linux/Un*x world:
In this very basic shell script example, the
ls -l /tmp/foo command standard output is redirected to the file
/tmp/list, and the standard error output is redirected to
/tmp/errorlog. Note that the
>/tmp/list can also be written as
1>/tmp/list: the ‘1’ and ‘2’ refer to the file descriptors 1 and 2, which are the standard output and error file descriptors. The standard input is the file descriptor 0:
Why are the standard input/output streams identified by 0, 1 and 2 ? Well, this is just a convention – and, by default, file descriptors are allocated increasingly, and the first three files opened (this is done by the system) are these streams.
We can redirect the standard streams easily using shell commands – but can we do that programmatically, especially for the output streams ?
Redirection to a File
We have two output file descriptors (1 and 2), and we want to capture them in a file – say, a log file. This is quite convenient, especially for server applications: log files can then be rotated, inspected…
The standard way to reassign a file descriptor is through the
dup2 standard library function. This function takes an opened file descriptor, and clone it to the second file descriptor provided. You can also extract the file descriptor beneath a
FILE stream through the
fileno standard library function: here’s a simple code to redirect the standard output to a file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
And that’s it! Simple, isn’t it ?
Here Comes The Problems
Redirection is what we typically do in my company – I even modified some of the involved code. But sending output to a log file has its drawbacks. Especially regarding log file size: the file is growing, and we need to clear it time to time, or the disk will eventually fill. This is generally done through a “rotate” operation, that is:
- closing the current log file
- renaming it with some timestamp
- opening a fresh one
- cleanup old logs if necessary
File descriptors, more generally, are special resources in most operating systems, as they are inherited when spawning a process. By default, all file descriptors are transmitted to spawned processes (on POSIX systems,
fork clone them, and
exec functions keep them opened by default), and this is a pain: pipes and sockets are also inherited. This may lead to the following case:
- main program opens a socket to a remote location
- spawning a process, which inherits this socket (file descriptor) too
- main program sens commands to remote location and closes the socket
- spawned program still lives, and still has the socket file descriptor copy opened
- remote location does not “sees” the client close() and is stuck
The thing is that spawned processed do not give a damn about already opened file descriptors (except for stdio). But the standard is broken for historical reasons, very unfortunately.
The solution is painful: you need to flag all possible opened file descriptors with the
FD_CLOEXEC flag set. This flag explicitly tells the system that the file descriptor shall not be inherited when spawning a process:
open(..., ... | O_CLOEXEC)
open(..., ... | O_CLOEXEC)+
fopen(..., ... + "e")
fcntl(..., F_DUPFD_CLOEXEC, (long) 0)
dup3(..., ..., O_CLOEXEC)
open(..., O_RDONLY | O_DIRECTORY | O_CLOEXEC)+
acept(..., ..., ...)=>
accept4(..., ..., ..., O_CLOEXEC)
socket(..., ..., ...)=>
socket(..., ... | SOCK_CLOEXEC, ...)
- and so on…
Some of these functions are non-standard (such as
accept4) unfortunately. Some other will not be available (Windows). Some old kernels (Linux) will sometimes not support these features – and some old glibc will not work either (especially for the “e” flag). Yes, this is a real pain.
Oh, one last thing: do not dare to use
fcntl(fd, F_SETFD, ...) to clear the
FD_CLOEXEC flag: this is insanely hopeless in a multithreaded environment. Can you guess why ?
So What Was That All About ?
Okay, back yo our nasty bugs!
- The first one was quite easy to investigate.
Here are the investigation steps, ordered temporally:
- “can not run an external crash reporter on Linux”; ie. can not
execv()the external crash reporter in a signal handler, with a status code returned by
waitpid()reporting a signal 6 caught during launch (ie. the launched program called
- we can not run the crash reporter outside the signal handler either
- we can not run the crash reporter using
- we can not run “/bin/echo foo” through
system()either; the command returns the status “return code 1”!
- … BUT we CAN run “/bin/echo -n” through
At this stage, it is clear that the issue is related to the output file descriptors.
Further investigation showed that:
- we redirected output to logs, opening the log file with the FD_CLOEXEC flag set, AND using the Linux-specific
- as a result or
dup3(), standard streams had their FD_CLOEXEC flag set
- spawning an external program led to have the standard streams closed
- attempting to write on the closed standard streams led to either a broken pipe signal, or an assertion failure depending on the program involved
The solution was simply to use the default
dup2() function for standard streams. These are the only streams we want to be inherited, after all!
- The second bug took a bit more time
The biggest issue is that the bug could not be reproduced at first internally. Until our Q&A team managed to reproduce it (these guys are really good at finding bugs, I must admit) three days in a row.
The logs reports were strange, and totally unrelated. But one log was kind of interesting:
1 2 3
The error was a corrupted produced XML document, with another unrelated log inside it, in the middle of the structure. The final error did not have any timestamp, but the previous one was just before midnight (GMT). Humm, a midnight bug ? What are the chances for this being a coincidence ?
I then realized that midnight was the trigger time for a number of maintenance operations, including… log rotation.
But after thoughtfully inspecting the code, everything looked fine. And the bug never popped up on Linux, it was Windows-specific (or at least, much frequent on this operating system)
But there was another clue: all weird logs were actually logs coming from the Java virtual machine (either through Log4j, or through
System.err) in Java, we are actually using a
PrintStream, which is an object wrapped in various
FilterOutputStream objects for buffering, the final instance being a
FileOutputStream, which is, according to the documentation an:
output stream for writing data to a File or to a FileDescriptor
And, in our case, the
FileDescriptor involved was
FileDescriptor.err, that is, Java handles to the standard output and error streams.
Fortunately, the sources of the Open JDK are public, and it is possible to dig into the real code beneath, that is, the native code:
Hummm… there is a Windows-specific implementation for FileOutputStream native calls. strange, isn’t it ?
1 2 3 4
writeSingle call is using a file descriptor returned by
1 2 3 4 5 6
And this macro just calls
GetObjectField(..., IO_handle_fdID) to retrieve … the “handle” member of the corresponding FileDescriptor class.
What ? A handle ? Not a file descriptor ?
HANDLE – and this was the root cause of all out problems.
Let me give you the reason: on Windows, a standard library handle is actually a kernel
HANDLE wrapped by the C library, nothing more.
It means that if you replace the standard library input/output streams, you’ll just replace the C library streams.
Here is the complete bug scenario on Windows after collecting all the clues:
- log rotation function is called at low level, and replaces the standard output and error streams 1 and 2 using
- the Visual C++ C library replaces the streams, by closing the underlying existing
HANDLE, and replacing by new ones
- the old
HANDLEare now totally meaningless (“dangling handles”)
- Java has no knowledge of the replacement, and still has the old
HANDLEstored in its
System.outcalls are now sending data to nowhere (this is bad)
- someone opens a new file, and Windows recycle the old
System.outcalls are now sending data to a valid — yet unrelated — file (this is very very bad)
- data corruption ensues (and not profit)
The root cause is that Java is unaware that the low-level
HANDLE has changed.
The solution is simple: let’s tell him!:
OutputStream instances, we need to:
- get the “
out” member field of the
FilterOutputStreamclass and mark it public (see
- unwrap all
FilterOutputStreamwrapping by calling fetching the “
out” member field recursively
- cast the resulting instance to a
- fetch the
SharedSecrets.getJavaIOFileDescriptorAccess()to set the
long) value to -1 temporarily
- rotate the logs
- fetch the new
long) through a native call to
- set the new
HANDLEusing the previous steps
This is a bit cumbersome – but mixing native code and Java is always cumbersome, so you have to be used to it :)
The two issues were rather different:
- one was a mistake created when attempting to fix the insane default handle inheritance on both POSIX and Windows API, causing external programs not to run anymore
- one was a design bug between the Visual C++ library and the JVM (ie. there is no
dup2()for HANDLEs), causing random corruptions on files or sockets
But in both cases, a simple standard stream mishandling was the root cause. But the consequences were nasty, the resolution painful, and the time spent unreasonable (the second bug took months to be investigated)
TL;DR: redirecting Stdio is Hard, and prone to nasty bugs if you are not careful enough!