xavier roche's homework

random thoughts and sometimes pieces of code

I Found A Bug In strncat()

Yes, A Bug

This code won’t behave correctly (ie. partial copy) on many glibc versions on Linux:

strncat(dest, source, SIZE_MAX);

This is a corner case, but this is a bug.

(hint: the correct behavior is to do strcat(dest, source) equivalent)

A Secure Version Of Strcat

It began with code cleanup.

In httrack, there is a lot of legacy (old, dirty) code laying around. Rewriting everything from scratch would probably be a sane decision, but I just don’t have the time (and the courage) to do that, so I’m cleaning up bits and bits when possible, attempting to improve the code quality when possible.

One of the terrible design decision was to use C strings everywhere, and associated strcpy, strcat etc. primitives. It was an unfortunate decision, because of the lack of flexibility (immutable capacity), and security issues (buffer overflows that might be introduced).

In this hypothetical case, we’re adding a string to a structure foo:

struct foo {
  char filename[1024];
...
};

void append_filename(struct foo *foo, const char *filename) {
  strcat(foo->filename, filename);
}

This code will overflow if the passed string added to the existing string overflows the filename buffer.

The strlcat version exists on BSD, and is a reasonable solution to mitigate this problem. Unfortunately, some people do not like BSD, and decided that these functions will never be part of the glibc. (insert flame here)

Besides, switching to strlcat and its friends would involve checking every single occurrence of strcat, strcpy. etc., probably introducing new bugs!

I then decided to use a modified version of the dangerous libc primitives, using compile-time type checking:

/**
 * Check whether 'VAR' is of type char[].
 */
#if (defined(__GNUC__) && !defined(__cplusplus))
/* Note: char[] and const char[] are compatible */
#define HTS_IS_CHAR_BUFFER(VAR) ( __builtin_types_compatible_p ( typeof (VAR), char[] ) )
#else
/* Note: a bit lame as char[8] won't be seen. */
#define HTS_IS_CHAR_BUFFER(VAR) ( sizeof(VAR) != sizeof(char*) )
#endif
#define HTS_IS_NOT_CHAR_BUFFER(VAR) ( ! HTS_IS_CHAR_BUFFER(VAR) )

/**
 * Append at most N characters from "B" to "A".
 * If "A" is a char[] variable then the size is assumed to be the capacity of this array.
 */
#define strncatbuff(A, B, N) \
  ( HTS_IS_NOT_CHAR_BUFFER(A) \
  ? strncat(A, B, N) \
  : strncat_safe_(A, sizeof(A), B, \
  HTS_IS_NOT_CHAR_BUFFER(B) ? (size_t) -1 : sizeof(B), N, \
  "overflow while appending '" #B "' to '"#A"'", __FILE__, __LINE__) )

The idea was to sniff at compile-time the arguments passed to the modified strncat version: when an opaque char* is used, there’s nothing you can do. But when passing a regular char[...] array, a secure version can be used. This is not perfect (especially because with non-GCC compilers, char[8] are seen as char*), but at least it allows you to quickly mitigate many potential buffer overflows without touching existing code.

But at one point I did this quite daring thing to create the secure strcat version:

#define strcatbuff(A, B) strncatbuff(A, B, (size_t) -1)

The (size_t) -1 value was simply passed as the boundary of strncatbuff - at some point, we would be doing strncat(dest, source, (size_t) -1), and this seemed fine to me.

Soon after, bugs started to appear (strings were copied partially). It took me a bit of time (with the kind help of the bug reporter) to figure out that my daring macro was involved in this bug. This took me a while because the bug only appeared on Linux, on some glibc releases, and quite randomly.

I managed to have a reproducible minimalistic case:

/* strncat_size_t-1_testcase. */

/* gcc -Wall strncat_size_t-1_testcase.c -o strncat_size_t-1_testcase */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* strncat_size_t-1_testcase "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor" */
int main(int argc, char **argv) {
  const size_t size = strlen(argv[1]);
  char *const buffer = malloc(size + 1);
  int success;
  buffer[0] = '\0';
  strncat(buffer, argv[1], SIZE_MAX);
  success = strlen(buffer) == size;
  if (!success) {
    fprintf(stderr, "** result: '%s'\n", buffer);
  }
  assert(success);
  return success ? EXIT_SUCCESS : EXIT_FAILURE;
}

If you build this program, and use it:

gcc -Wall strncat_size_t-1_testcase.c -o strncat_size_t-1_testcase
./strncat_size_t-1_testcase "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor"

… then you’ll get an error. Or maybe you’ll get one. Randomly.

This Is Not A glibc Bug, But A Misuse Of Strncat

The only way to be sure was to carefully read the strncat standard.

The strncat() function shall append not more than n bytes (a null byte and bytes that follow it are not appended) from the array pointed to by s2 to the end of the string pointed to by s1. The initial byte of s2 overwrites the null byte at the end of s1. A terminating null byte is always appended to the result. If copying takes place between objects that overlap, the behavior is undefined.

With the help of specialists in comp.unix.programmer, several points must be noticed:

  • strncat is not a secure version of strcat (strlcat is), and the third argument is not the destination capacity
  • using (size_t) -1 as third argument is actually using SIZE_MAX
  • the third argument is an additional limit and can have any value

Therefore, strncat(..., ..., SIZE_MAX) should behave as strcat(..., ...).

Yes, yes, this is a corner case. But this is a legit one.

Where Is The Bug ?

The reason why not all glibc releases were impacted is simple: this is an optimization bug. (insert laughs from experienced developers)

The original strncat source is not used on x86-64 architectures: the optimized SSE2 version is used in place.

The code might have been introduced in 2013 in this commit which was supposed to introduce a “Faster strlen on x64” - but some existing files seems to have been merged at that time.

Unfortunately, the assembly source is uncommented, and I did not have the courage to dig deeper. But this source is buggy.

Funny Remark: By the way, I’m not even convinced that the assembly version is faster at all: code executing speed is probably irrelevant compared to L1/L2/L3/memory bandwidth issues.

What’S Next ?

First I obviously stopped using SIZE_MAX with strncat, and dispatch to strcat accordingly. This was a daring thing to use this corner case value anyway, prone to … corner-case bugs.

The second step was to fill a bug report in the glibc.

TL;DR: Beware when using corner cases, even when perfectly conforming to the standard. And never “optimize” code by rewriting obvious functions into a 300-lines of undocumented assembly!

Comments