The 0xFFFFFFFF problem
This is a pretty old document that I wrote for myself back when I was first learning C, but the explanation is pretty good, so I'm publishing it here for future reference. I hope to soon push a reference document that explains some tricky concepts from the C standard that are not so easy to remember off the top of one's head (type conversion, declarations, storage durations, scopes, and so on).
For future reference, always consult the latest C standard, as it has answers to many questions of this nature. The C11 draft can be found here.
Recently, I posted this question to Stack Overflow about the following program:
```c
#include <stdio.h>

int main(void) {
    int n = 0xFFFFFFFF;
    printf("%d\n", n); // was originally %u
}
```
In this explanation, I will disregard the fact that I used the wrong conversion specifier in my `printf` call (`%u` instead of `%d`); I have replaced it with the correct one above, and will come back to `%u` later on.
So, what exactly is happening here? Initially, I was confused as to why this program's output was `-1` (at least on a 64-bit Linux/GCC platform).
Let’s break the “problem” down into pieces and analyze each statement and operand separately:
- The hexadecimal constant `0xFFFFFFFF` is, in decimal form, equal to 2^32 - 1, or 4294967295, and in binary form equal to 11111111111111111111111111111111. The compiler treats it as an unsigned integer, not as a series of bits. Yes, it can be represented as a series of 32 bits (all 1s), but to the compiler that is irrelevant. In C99, a hexadecimal or octal constant is tested against the following types, in this order:
  - `int`
  - `unsigned int`
  - `long int`
  - `unsigned long int`
  - `long long int`
  - `unsigned long long int`

  The first type that is large enough to accommodate the value of the constant is the one the compiler gives it. In this case, `0xFFFFFFFF` is too large to fit in the first contender (`int`, or rather `signed int`), but is just the right size to be treated as an `unsigned int`, and thus the compiler treats it as that. (You can ask the compiler directly; see the first sketch after this list.)
- Next, let's look at the left operand of the `=` operator, `int`. When used by itself, it is equivalent to `signed int`, and thus I will refer to it as that from now on. The largest positive number a `signed int` can store is 2^31 - 1, or 2147483647 in decimal form, because it must also be able to represent negative integers and therefore has one less bit to work with.
- The types of the operands on the two sides of the `=` operator don't match: one is a `signed int`, the other an `unsigned int`. Thus, C has to implicitly convert the right operand to the type of the left operand (in an assignment, the right operand always has to abide by the type of the left one). Because the value 4294967295 cannot be represented in a `signed int`, the result of this conversion is implementation-defined, or an implementation-defined signal is raised (C11 6.3.1.3).
- Per my current understanding, what happens here is that the binary representation of `0xFFFFFFFF`, or rather 0b11111111111111111111111111111111, is simply copied into the storage of our `signed int n` variable (this is what GCC does; the standard only promises an implementation-defined result). Since the variable is a signed integer and this platform uses two's complement, a representation of all 1s is equal to -1 in decimal form, and therefore that is the output of our `printf()` call.
- Why are all 1s always equal to -1 (decimal) in two's complement? Let's look at the following example using an 8-bit number (the same rules apply for any other number of bits, including 32); a runnable version follows this list:
  - In a regular unsigned system, 1 (decimal) is equal to 0b00000001 in binary. Very obvious.
  - Let's get the negative value of this number (-1) using the two's complement operation:
    - First we invert the bits: 0b00000001 -> 0b11111110.
    - Then we add 1: 0b11111110 -> 0b11111111.
  - Upon completing the two's complement operation, we end up with the binary representation of the negative version of our original number (-1).
  - As you can see, we've ended up with the number -1 (decimal), which in binary is all 1s (all bits set). The same is true when the operation is applied to a number of any width.
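If you'd like to check the compiler's type choice yourself, C11's `_Generic` selection can report it. The following is a minimal sketch, assuming a C11 compiler and a platform with 32-bit `int` (the output would differ where `int` is wider):

```c
#include <stdio.h>
#include <limits.h>

/* Maps the type of an expression to a printable name via C11 _Generic. */
#define TYPE_NAME(x) _Generic((x),                          \
    int:                     "int",                         \
    unsigned int:            "unsigned int",                \
    long int:                "long int",                    \
    unsigned long int:       "unsigned long int",           \
    long long int:           "long long int",               \
    unsigned long long int:  "unsigned long long int")

int main(void) {
    /* 0x7FFFFFFF fits in a 32-bit int; 0xFFFFFFFF does not, so the
       next candidate in the C99 list, unsigned int, is selected. */
    printf("0x7FFFFFFF : %s\n", TYPE_NAME(0x7FFFFFFF)); // int
    printf("0xFFFFFFFF : %s\n", TYPE_NAME(0xFFFFFFFF)); // unsigned int
    printf("INT_MAX    : %d\n", INT_MAX);                // 2147483647 here
}
```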
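And here is the 8-bit walkthrough from the last bullet as a small program. It is just a sketch using the fixed-width types from `<stdint.h>`; note that the final conversion of 255 to `int8_t` is itself implementation-defined, though GCC documents that it wraps to -1:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t one      = 1;                        /* 0b00000001 */
    uint8_t inverted = (uint8_t)~one;            /* 0b11111110 */
    uint8_t negated  = (uint8_t)(inverted + 1);  /* 0b11111111 */

    printf("negated bits : 0x%02X\n", (unsigned)negated); /* 0xFF: all 1s */
    printf("as int8_t    : %d\n", (int8_t)negated);       /* -1 on GCC */
}
```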
The %u problem
As I said, in my original question I (for some unknown reason) swapped the `%d` conversion specifier for `%u`, resulting in the following statement:

```c
printf("%u\n", n);
```

This statement prints out `4294967295`. Why? Let's find out:
The `%u` conversion specifier expects an `unsigned int` as its argument. We, however, supplied a `signed int`. This time, no conversion takes place: arguments to a variadic function like `printf` only undergo the default argument promotions, and `printf` has no way of knowing what type we actually passed, so it cannot convert anything. Per the standard, a mismatch between the conversion specifier and the type of the argument makes the behaviour undefined.
In the case of GCC and glibc, `printf` simply assumes that the value passed for `%u` is of the correct type (`unsigned int` in this case) and reads its bits accordingly. 0b11111111111111111111111111111111 interpreted as an unsigned integer has the decimal value 4294967295, which is then simply printed.
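If you actually want the unsigned interpretation, the well-defined way is to perform the conversion yourself instead of relying on what the C library happens to do. A short sketch contrasting the two approaches (the first call is the undefined-behaviour version, which glibc happens to render as 4294967295; the second is portable, since conversion to `unsigned int` is defined as wrapping modulo UINT_MAX + 1):

```c
#include <stdio.h>

int main(void) {
    int n = 0xFFFFFFFF;  /* -1 on this platform (implementation-defined) */

    /* Undefined behaviour: %u expects unsigned int, but n is int.
       glibc reinterprets the bits and prints 4294967295 in practice. */
    printf("%u\n", n);

    /* Well-defined: the cast converts -1 to -1 + (UINT_MAX + 1),
       i.e. 4294967295, before printf ever sees it. */
    printf("%u\n", (unsigned int)n);
}
```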