The 0xFFFFFFFF problem
This is a pretty old document that I wrote for myself back when I was first learning C, but the explanation is pretty good, so I'm publishing it here for future reference. I hope to soon push a reference document that explains some tricky concepts from the C standard that are not so easy to remember off the top of one's head (type conversion, declarations, storage durations, scopes, and so on).
For future reference, always consult the latest C standard, as it has answers to many questions of this nature. The C11 draft can be found here.
Recently, I posted this question to Stack Overflow about the following program:
```c
#include <stdio.h>

int main(void) {
    int n = 0xFFFFFFFF;
    printf("%d\n", n); // was originally %u
}
```
In this explanation, I will disregard the fact that I used the wrong conversion specifier in my `printf` call (`%u` instead of `%d`); I have replaced it with the correct one above, and will come back to `%u` later on.
So, what exactly is happening here? Initially, I was confused as to why this program's output was `-1` (at least on a 64-bit Linux/GCC platform).
Let’s break the “problem” down into pieces and analyze each statement and operand separately:
- The hexadecimal constant `0xFFFFFFFF` is, in decimal form, equal to 2^32 - 1, or 4294967295, and in binary form equal to 11111111111111111111111111111111. The compiler treats it as an unsigned integer, not as a series of bits. Yes, it can be represented as a series of 32 bits (all 1s), but to the compiler that is irrelevant. In C99, a hexadecimal or octal constant is tested against the following types, in this order:
  - `int`
  - `unsigned int`
  - `long int`
  - `unsigned long int`
  - `long long int`
  - `unsigned long long int`

  The first type that is large enough to accommodate the value of the constant is the one the compiler gives it. In this case, `0xFFFFFFFF` is too large to fit in the first contender (`int`, or rather `signed int`), but is just the right size to be treated as an `unsigned int`, and thus the compiler treats it as that. (You can ask the compiler directly; see the first sketch after this list.)
- Next, let's look at the left operand of the `=` operator, `int`. When used by itself, it is equivalent to `signed int`, and thus I will refer to it as that from now on. The largest positive number a `signed int` can store is 2^31 - 1, or 2147483647 in decimal form, because it must also be able to represent negative integers and therefore has one less bit to work with.
- The types of the operands on the two sides of the `=` operator don't match: one is a `signed int`, the other an `unsigned int`. Thus, C has to implicitly convert the right operand to the type of the left operand (in an assignment, the right operand always has to abide by the type of the left one). Because the value 4294967295 cannot be represented in a `signed int`, the result of this conversion is implementation-defined, or an implementation-defined signal is raised (C11 6.3.1.3).
- Per my current understanding, what happens here is that the binary representation of `0xFFFFFFFF`, or rather 0b11111111111111111111111111111111, is simply copied into the storage of our `signed int n` variable (this is what GCC does; the standard only promises an implementation-defined result). Since the variable is a signed integer and this platform uses two's complement, a representation of all 1s is equal to -1 in decimal form, and therefore that is the output of our `printf()` call.
- Why are all 1s always equal to -1 (decimal) in two's complement? Let's look at the following example using an 8-bit number (the same rules apply for any other number of bits, including 32); a runnable version follows this list:
  - In a regular unsigned system, 1 (decimal) is equal to 0b00000001 in binary. Very obvious.
  - Let's get the negative value of this number (-1) using the two's complement operation:
    - First we invert the bits: 0b00000001 -> 0b11111110.
    - Then we add 1: 0b11111110 -> 0b11111111.
  - Upon completing the two's complement operation, we end up with the binary representation of the negative version of our original number (-1).
  - As you can see, we've ended up with the number -1 (decimal), which in binary is all 1s (all bits set). The same is true when the operation is applied to a number of any width.
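If you'd like to check the compiler's type choice yourself, C11's `_Generic` selection can report it. The following is a minimal sketch, assuming a C11 compiler and a platform with 32-bit `int` (the output would differ where `int` is wider):

```c
#include <stdio.h>
#include <limits.h>

/* Maps the type of an expression to a printable name via C11 _Generic. */
#define TYPE_NAME(x) _Generic((x),                          \
    int:                     "int",                         \
    unsigned int:            "unsigned int",                \
    long int:                "long int",                    \
    unsigned long int:       "unsigned long int",           \
    long long int:           "long long int",               \
    unsigned long long int:  "unsigned long long int")

int main(void) {
    /* 0x7FFFFFFF fits in a 32-bit int; 0xFFFFFFFF does not, so the
       next candidate in the C99 list, unsigned int, is selected. */
    printf("0x7FFFFFFF : %s\n", TYPE_NAME(0x7FFFFFFF)); // int
    printf("0xFFFFFFFF : %s\n", TYPE_NAME(0xFFFFFFFF)); // unsigned int
    printf("INT_MAX    : %d\n", INT_MAX);                // 2147483647 here
}
```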
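And here is the 8-bit walkthrough from the last bullet as a small program. It is just a sketch using the fixed-width types from `<stdint.h>`; note that the final conversion of 255 to `int8_t` is itself implementation-defined, though GCC documents that it wraps to -1:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t one      = 1;                        /* 0b00000001 */
    uint8_t inverted = (uint8_t)~one;            /* 0b11111110 */
    uint8_t negated  = (uint8_t)(inverted + 1);  /* 0b11111111 */

    printf("negated bits : 0x%02X\n", (unsigned)negated); /* 0xFF: all 1s */
    printf("as int8_t    : %d\n", (int8_t)negated);       /* -1 on GCC */
}
```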
The %u problem
As I said, in my original question I (for some unknown reason) swapped the `%d` conversion specifier for `%u`, resulting in the following statement:

```c
printf("%u\n", n);
```

This statement prints out `4294967295`. Why? Let's find out:
The `%u` conversion specifier expects an `unsigned int` as its argument. We, however, supplied a `signed int`. This time, no conversion takes place: arguments to a variadic function like `printf` only undergo the default argument promotions, and `printf` has no way of knowing what type we actually passed, so it cannot convert anything. Per the standard, a mismatch between the conversion specifier and the type of the argument makes the behaviour undefined.
In the case of GCC and glibc, `printf` simply assumes that the value passed for `%u` is of the correct type (`unsigned int` in this case) and reads its bits accordingly. 0b11111111111111111111111111111111 interpreted as an unsigned integer has the decimal value 4294967295, which is then simply printed.
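If you actually want the unsigned interpretation, the well-defined way is to perform the conversion yourself instead of relying on what the C library happens to do. A short sketch contrasting the two approaches (the first call is the undefined-behaviour version, which glibc happens to render as 4294967295; the second is portable, since conversion to `unsigned int` is defined as wrapping modulo UINT_MAX + 1):

```c
#include <stdio.h>

int main(void) {
    int n = 0xFFFFFFFF;  /* -1 on this platform (implementation-defined) */

    /* Undefined behaviour: %u expects unsigned int, but n is int.
       glibc reinterprets the bits and prints 4294967295 in practice. */
    printf("%u\n", n);

    /* Well-defined: the cast converts -1 to -1 + (UINT_MAX + 1),
       i.e. 4294967295, before printf ever sees it. */
    printf("%u\n", (unsigned int)n);
}
```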