Bit Manipulation

Table of Contents

While it may seem esoteric, the ability to manipulate the individual bits of a number is often required for low-level hardware programming, certain file formats, network protocols, and data structures.

Operations #

In addition to basic arithmetic operations such as addition, subtraction, multiplication and division, programming languages (and the CPUs they run on) support what are known as bitwise operations. The most important of these are not (or complement), and, or, xor, left shift and right shift.

The not operation changes each 1 to a 0 and each 0 to a one. For example, not 1101 is 0010.

The and operation compares the corresponding bits in two numbers. If both bits are 1, then the corresponding bit in the result is 1. Otherwise the corresponding bit will be 0. For example, 1001 and 1100 is 1000.

The or operation also compares the corresponding bits in two numbers. If either bit is 1, then the corresponding bit in the result is 1. For example, 1001 or 1100 is 1101.

The xor (exclusive or) operation works much like or, except that it wants either bit to be 1, but not both. So 1001 xor 1100 is 0101.

The shift operations shift the bits in a number left or right. For example 0110 shifted right by one is 0011, and 0110 shifted left by one is 1100.

Here’s how to do these operations in some different languages. The “C / Java / Python” column shows the most common syntax and applies to far more than just those three languages. Even the languages that are different, like Go and Rust, are usually pretty close.

	C / Java / Python	Go	Rust
a and b	`a & b`	`a & b`	`a & b`
a or b	`a \| b`	`a \| b`	`a \| b`
a xor b	`a ^ b`	`a ^ b`	`a ^ b`
not b	`~b`	`^b`	`!b`
x shift left n	`x << n`	`x << n`	`x << n`
x shift right n	`x >> n`	`x >> n`	`x >> n`

Basic Manipulations #

There are several basic bit manipulations: Turning bits on, turning them off, querying their current state and toggling them (settings 0’s to 1 and 1’s to 0). In each case this involves anding, oring or xoring the number you want to manipulate with another number called a mask.

Specifically, or turns bits on, xor toggles bits, and and both turns them off and queries their state. For example, the following code does all of these manipulations on the second and fourth bits of a number:

uint8_t mask = 10;  // 00001010
uint8_t x = 0xfd;   // 11111101
x = x ^ mask;       // Toggle. x = 11110111
x = x ^ mask;       // Toggle. x = 11111101
x = x & ~mask;      // Turn off. x = 11110101
x = x | mask;       // Turn on.  x = 11111111

if (x & mask)
  cout << "Second and fourth bits set." << endl;
else
  cout << "Second and fourth bits unset." << endl;

Mask Creation #

You can either create a mask by hand, or build it up by oring and shifting. For example:

uint8_t mask = 0;
mask = mask | 1;  // 00000001
mask = mask << 2; // 00000100
mask = mask | 1;  // 00000101
mask = mask << 3; // 00101000

If you get tired of typing, you can save yourself some time by writing inline functions like this one:

// Returns the nth bit of x.
inline unsigned bit(unsigned x, int n) {
	return (x >> n) & 1;
}

Groups of bits #

We use shifts in combination with and to extract group of bits. First, right shift to move the group to the end, and then use and to zero out the undesired upper bits.

For example, to extract the middle four bits from an 8-bit number:

uint8_t x = 0xac;     // 10101100
x = (x >> 2) & 0x0f;  // 00001011

To create a mask that’s n bits long, shift a 1 left by n and then subtract 1. For example, to create a 4-bit mask:

uint8_t mask = 1;  // 00000001
mask <<= 4;        // 00010000
mask -= 1;         // 00001111

Shifts are also used in combination with and and or to insert a group of bits. Zero out the relevant bits in the target integer, then left shift the group into position, and finally or it with the target.

For example, to put a 1001 in the middle of 10101001:

uint8_t group = 0x09;         // 00001001
uint8_t x = 0xa9;             // 10101001
uint8_t mask = (1 << 4) - 1;  // 00001111
x &= mask << 2;               // 10000001
x |= group << 2;              // 10100101

Note that shifts have at least one gotcha: Right shifting a signed integer will shift in 1s instead of 0s if the leading bit is 1. This preserves the sign when using shifts for fast multiplication and division by powers of 2. For bit manipulation purposes it’s best to stick to unsigned integers.

int8_t x = 0xfe;  // 11111110 = -2
x << 2 == 0xf8;   // 11111000 = -8
x >> 1 == 0xff;   // 11111111 = -1

Rotations #

Rotations are a lot like shifts, except instead of throwing away bits it shifts them in at the other end. While the vast majority of CPUs have an instruction to quickly perform this operation, most languages do not have an operator for it. Instead, you have to write it in terms of shifts and ors. Fortunately modern compilers are usually smart enough to optimize this to a single CPU instruction.

uint8_t x = 0xf2;             // 11110010
(x << 2) | (x >> 6) == 0xcb;  // 11001011, rotate left by 2
(x >> 2) | (x << 6) == 0xbc;  // 10111100, rotate right by 2

Bit Arrays #

Finally, lets put this together into something useful: A simple bit array.

void set(uint8_t *bits, int b) {
    bits[b / 8] |= (1 << (b % 8));
}

void clear(uint8_t *bits, int b) {
    bits[b / 8] &= ~(1 << b);
}

unsigned get(uint8_t *bits, int b) {
	int shift = b % 8;
    return (bits[b / 8] & (1 << shift)) >> shift;
}

Just remember, though, while it’s instructive (and fun) to play around and write your own, for production code you’re probably better off using one that’s already written and tested. In this case, C++ provides a couple of possibilities: std::vector<bool> and std::bitset.