The Two Nasty Bugs
You know what it is like: you never saw them, but some people did, and these people are reliable, so there must be something wrong. Yet nobody on the engineering team managed to reproduce the issues. And the issues are random, and only happen rarely. Very rarely, such as once a month for a given customer, or even once a year for another one. And you know what: these bugs are horrible, because it seems you’ll never catch them. They hide, and silently trigger when you are asleep.
First bug report: Cannot run an external crash reporter on Linux
This was the nicest one, actually, and it took only a few days to investigate. Basically, in case of a crash, a crash reporter was supposed to be run, and send useful information (stack traces and the like). Unfortunately, the crash reporter appeared not to work correctly: it did not start, for an unknown reason (and with no error messages).
Fortunately, this bug was reproducible internally.
Second bug report: I see log files in my database on Windows
Yes, log files in my database. Or, more precisely, log output in my database binary files. I was a bit surprised, too. It seems that, randomly, some log lines that were supposed to go to the log files were redirected to another random file, including database ones.
Obviously, the database was not happy after that.
That was the report after months of investigation: the first ones were just the usual “there was a random crash and I have no clue why - I scratched everything and now it works”.
These two issues were actually related, and the root cause was the way file descriptors are handled.
Oh, what are file descriptors, by the way?
The Three Little Descriptors
All C programmers know the three standard library input/output files: stdin, stdout, and stderr. These are the default standard input, output, and error streams. The first one typically receives the console input (keyboard); the latter two are used to display output.
If you look at the stdin manual page, you will see that
A file with associated buffering is called a stream and is declared to be a pointer to a defined type FILE. (…) At program start-up, three streams shall be predefined and need not be opened explicitly: standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output).
Basically, these FILE streams encapsulate a file descriptor, that is, a lower-level input or output resource (generally identified by an integer, ie. an int), and provide additional features, such as buffering, along with a set of higher-level functions, such as fprintf to print formatted output.
By default, the input/output streams are connected to the console (Linux, or DOS), but you may redirect them easily - this feature is especially well known in the Linux/Un*x world:
ls -l /tmp/foo >/tmp/list 2>/tmp/errorlog
In this very basic shell example, the standard output of the ls -l /tmp/foo command is redirected to the file /tmp/list, and the standard error output is redirected to /tmp/errorlog. Note that >/tmp/list can also be written 1>/tmp/list: the ‘1’ and ‘2’ refer to the file descriptors 1 and 2, which are the standard output and error file descriptors. The standard input is the file descriptor 0:
read A 0</tmp/bar
Why are the standard input/output streams identified by 0, 1 and 2? Well, this is just a convention - and, by default, file descriptors are allocated in increasing order, and the first three files opened (this is done by the system) are these streams.
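As a quick illustration of this convention, here is a minimal sketch (the say() helper name is mine, not from the original code) that writes directly to file descriptor 1 with the low-level write() call:

```c
#include <string.h>
#include <unistd.h>

/* Writing to file descriptor 1 with write() reaches the same place
   printf() does, just without stdio buffering: fd 1 IS the standard
   output, by convention. Returns the number of bytes written. */
static ssize_t say(const char *msg) {
    return write(1, msg, strlen(msg));
}
```

Calling say("hello\n") thus prints to the console - or to wherever file descriptor 1 has been redirected.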
We can redirect the standard streams easily using shell commands - but can we do that programmatically, especially for the output streams?
Redirection to a File
We have two output file descriptors (1 and 2), and we want to capture them in a file - say, a log file. This is quite convenient, especially for server applications: log files can then be rotated, inspected…
The standard way to reassign a file descriptor is through the dup2 standard library function. This function takes an opened file descriptor, and clones it to the second file descriptor provided. You can also extract the file descriptor beneath a FILE stream through the fileno standard library function. Here is a simple program that redirects the standard output to a file:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>

/** Redirect stdout and stderr. Returns 0 upon success. **/
static int redirect_std_out_err(const char *log_name) {
    int failure = 1;
    /* Open log FILE stream. */
    FILE *const fp = fopen(log_name, "wb");
    if (fp != NULL) {
        /* Extract log FILE fd, and standard stdout/stderr ones. */
        const int log_fd = fileno(fp);
        const int stdout_fd = fileno(stdout);
        const int stderr_fd = fileno(stderr);
        if (log_fd != -1 && stdout_fd != -1 && stderr_fd != -1) {
            /* Clone log fd to standard stdout/stderr ones. */
            if (dup2(log_fd, stdout_fd) != -1 && dup2(log_fd, stderr_fd) != -1) {
                failure = 0;
            }
        }
        /* Close initial file. We do not need it, as we cloned it. */
        fclose(fp);
    }
    return failure;
}

int main(int argc, char **argv) {
    /* Redirect stdout and stderr to a single log file. */
    if (argc != 2 || redirect_std_out_err(argv[1])) {
        return EXIT_FAILURE;
    }
    /* Print output to both streams. */
    printf("Hey, stdout is redirected!\n");
    fprintf(stderr, "Hey, stderr is also redirected!\n");
    return EXIT_SUCCESS;
}
And that’s it! Simple, isn’t it?
Here Come The Problems
Redirection is what we typically do in my company - I even modified some of the involved code. But sending output to a log file has its drawbacks, especially regarding log file size: the file keeps growing, and we need to clear it from time to time, or the disk will eventually fill up. This is generally done through a “rotate” operation, that is:
- closing the current log file
- renaming it with some timestamp
- opening a fresh one
- cleaning up old logs if necessary
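The steps above can be sketched in C as follows - a minimal sketch, with the rotate_log() name and the timestamp format being my own assumptions, and the old-log cleanup step left out:

```c
#include <stdio.h>
#include <time.h>

/* Rotate a log: close the current FILE, rename it with a timestamp
   suffix, and reopen a fresh file under the original name. */
static FILE *rotate_log(FILE *current, const char *log_name) {
    char archived[512];
    const time_t now = time(NULL);
    const struct tm *const tm_now = localtime(&now);

    /* 1. close the current log file */
    if (current != NULL)
        fclose(current);

    /* 2. rename it with some timestamp */
    snprintf(archived, sizeof archived, "%s.%04d%02d%02d-%02d%02d%02d",
             log_name,
             tm_now->tm_year + 1900, tm_now->tm_mon + 1, tm_now->tm_mday,
             tm_now->tm_hour, tm_now->tm_min, tm_now->tm_sec);
    rename(log_name, archived);

    /* 3. open a fresh one (step 4, cleaning up old logs, is omitted) */
    return fopen(log_name, "wb");
}
```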
File descriptors, more generally, are special resources in most operating systems, as they are inherited when spawning a process. By default, all file descriptors are transmitted to spawned processes (on POSIX systems, fork clones them, and the exec functions keep them opened by default), and this is a pain: pipes and sockets are also inherited. This may lead to the following case:
- main program opens a socket to a remote location
- a process is spawned, which inherits this socket (file descriptor) too
- main program sends commands to the remote location and closes the socket
- spawned program still lives, and still has its copy of the socket file descriptor opened
- remote location does not “see” the client close() and is stuck
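This stuck-peer scenario can be reproduced locally, with a pipe standing in for the socket - a sketch, with the read_after_parent_close() name being my own:

```c
#include <sys/wait.h>
#include <unistd.h>

/* The parent's close() of the write end is not enough to produce EOF
   while a child still holds an inherited copy of that descriptor -
   exactly like the remote peer stuck on a half-closed socket above. */
static ssize_t read_after_parent_close(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    const pid_t pid = fork();
    if (pid == 0) {
        /* Child: inherits both pipe ends and holds them for a moment,
           like the spawned process holding the socket open. */
        sleep(1);
        _exit(0);
    }

    close(fds[1]); /* parent closes ITS write end... */

    /* ...but read() only returns EOF (0) once the child exits and the
       last copy of the write end disappears: until then, it blocks. */
    char buf[1];
    const ssize_t n = read(fds[0], buf, 1);

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```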
The thing is that spawned processes do not give a damn about already opened file descriptors (except for stdio). But the default behavior is broken for historical reasons, very unfortunately.
The solution is painful: you need to set the FD_CLOEXEC flag on every file descriptor you open. This flag explicitly tells the system that the file descriptor shall not be inherited when spawning a process:
- open(..., ...) => open(..., ... | O_CLOEXEC)
- fopen(..., ...) => open(..., ... | O_CLOEXEC) + fdopen(...), OR fopen(..., ... + "e")
- dup(...) => fcntl(..., F_DUPFD_CLOEXEC, (long) 0)
- dup2(..., ...) => dup3(..., ..., O_CLOEXEC)
- opendir(...) => open(..., O_RDONLY | O_DIRECTORY | O_CLOEXEC) + fdopendir(...)
- pipe(...) => pipe2(..., O_CLOEXEC)
- accept(..., ..., ...) => accept4(..., ..., ..., O_CLOEXEC)
- socket(..., ..., ...) => socket(..., ... | SOCK_CLOEXEC, ...)
- and so on…
Some of these functions are non-standard (such as accept4), unfortunately. Some others will not be available at all (Windows). Some old Linux kernels will not support these features - and some old glibc versions will not work either (especially for the “e” fopen flag). Yes, this is a real pain.
Oh, one last thing: do not dare to use fcntl(fd, F_SETFD, ...) to set the FD_CLOEXEC flag after the fact: this is insanely hopeless in a multithreaded environment. Can you guess why?
So What Was That All About?
Okay, back to our nasty bugs!
- The first one was quite easy to investigate.
Here are the investigation steps, ordered temporally:
- “can not run an external crash reporter on Linux”; ie. can not fork()/execv() the external crash reporter in a signal handler, with a status code returned by waitpid() reporting a signal 6 caught during launch (ie. the launched program called abort())
- we can not run the crash reporter outside the signal handler either
- we can not run the crash reporter using system() either
- we can not run “/bin/echo foo” through system() either; the command returns the status “return code 1”!
- … BUT we CAN run “/bin/echo -n” through system()!
At this stage, it is clear that the issue is related to the output file descriptors.
Further investigation showed that:
- we redirected the output to logs, opening the log file with the FD_CLOEXEC flag set, AND using the Linux-specific dup3() call
- as a result of dup3(), the standard streams had their FD_CLOEXEC flag set
- spawning an external program led to the standard streams being closed
- attempting to write on the closed standard streams led to either a broken pipe signal, or an assertion failure, depending on the program involved
The solution was simply to use the default dup2() function for the standard streams. These are the only streams we want to be inherited, after all!
- The second bug took a bit more time.
The biggest issue is that the bug could not be reproduced internally at first. Until our QA team managed to reproduce it (these guys are really good at finding bugs, I must admit) three days in a row.
The log reports were strange, and totally unrelated. But one log was kind of interesting:
[2014/11/28-10:59:08.514] [info] ...
[2014/11/28-10:59:08.659] [info] ...
[Fatal Error] :2:999961: XML document structures must start and end within the same entity.
The error was a corrupted XML document produced by the program, with another, unrelated log entry inside it, in the middle of the structure. The final error did not have any timestamp, but the previous one was just before midnight (GMT). Hmm, a midnight bug? What are the chances of this being a coincidence?
I then realized that midnight was the trigger time for a number of maintenance operations, including… log rotation.
But after thoroughly inspecting the code, everything looked fine. And the bug never popped up on Linux; it was Windows-specific (or at least, much more frequent on this operating system).
But there was another clue: all the weird logs were actually logs coming from the Java virtual machine (either through Log4j, or through System.out.println() calls).
When using System.out (or System.err) in Java, we are actually using a PrintStream, which is an object wrapped in various FilterOutputStream objects for buffering, the final instance being a FileOutputStream, which is, according to the documentation, an:
output stream for writing data to a File or to a FileDescriptor
And, in our case, the FileDescriptor involved was FileDescriptor.out or FileDescriptor.err, that is, the Java handles to the standard output and error streams.
Fortunately, the sources of the OpenJDK are public, and it is possible to dig into the real code beneath, that is, the native code:
- ./jdk/src/share/classes/java/io/FileOutputStream.java
- ./jdk/src/windows/native/java/io/FileOutputStream_md.c
- ./jdk/src/solaris/native/java/io/FileOutputStream_md.c
Hummm… there is a Windows-specific implementation for the FileOutputStream native calls. Strange, isn’t it?
JNIEXPORT void JNICALL
Java_java_io_FileOutputStream_write(JNIEnv *env, jobject this, jint byte, jboolean append) {
    writeSingle(env, this, byte, append, fos_fd);
}
And the writeSingle call is using a file descriptor returned by GET_FD():
void
writeSingle(JNIEnv *env, jobject this, jint byte, jboolean append, jfieldID fid) {
    // Discard the 24 high-order bits of byte. See OutputStream#write(int)
    char c = (char) byte;
    jint n;
    FD fd = GET_FD(this, fid);
And this macro just calls GetObjectField(..., IO_handle_fdID) to retrieve… the “handle” member of the corresponding FileDescriptor class.
What? A handle? Not a file descriptor?
Yep: a HANDLE - and this was the root cause of all our problems.
Let me give you the reason: on Windows, a standard library file descriptor is actually a kernel HANDLE wrapped by the C library, nothing more.
It means that if you replace the standard library input/output streams, you’ll just replace the C library streams - the kernel HANDLEs held elsewhere in the process are left behind.
Here is the complete bug scenario on Windows, after collecting all the clues:
- the log rotation function is called at low level, and replaces the standard output and error streams 1 and 2 using dup2()
- the Visual C++ C library replaces the streams by closing the underlying existing HANDLEs, and replacing them with new ones
- the old HANDLEs are now totally meaningless (“dangling handles”)
- Java has no knowledge of the replacement, and still has the old HANDLE stored in its FileDescriptor internals
- Java System.out calls are now sending data to nowhere (this is bad)
- someone opens a new file, and Windows recycles the old HANDLE
- Java System.out calls are now sending data to a valid - yet unrelated - file (this is very, very bad)
- data corruption ensues (and no profit)
The root cause is that Java is unaware that the low-level HANDLE has changed.
The solution is simple: let’s tell it!
For each of the System.out and System.err OutputStream instances, we need to:
- get the “out” member field of the FilterOutputStream class and mark it accessible (see Field.setAccessible())
- unwrap all the FilterOutputStream layers by fetching the “out” member field recursively
- cast the resulting instance to a FileOutputStream
- fetch the FileDescriptor through FileOutputStream.getFD()
- use SharedSecrets.getJavaIOFileDescriptorAccess() to set the HANDLE (a long) value to -1 temporarily
- rotate the logs
- fetch the new HANDLE (a long) through a native call to _get_osfhandle()
- set the new HANDLE using the previous steps
This is a bit cumbersome - but mixing native code and Java is always cumbersome, so you had better get used to it :)
Epilogue
The two issues were rather different:
- one was a mistake made while attempting to fix the insane default handle inheritance on both the POSIX and Windows APIs, causing external programs not to run anymore
- one was a design bug between the Visual C++ library and the JVM (ie. there is no dup2() for HANDLEs), causing random corruption of files or sockets
But in both cases, a simple standard stream mishandling was the root cause. The consequences were nasty, the resolution painful, and the time spent unreasonable (the second bug took months to investigate).
TL;DR: redirecting Stdio is Hard, and prone to nasty bugs if you are not careful enough!