# Bit Manipulation

## Table of Contents

While it may seem esoteric, the ability to manipulate the individual bits of a number is often required for low-level hardware programming, certain file formats, network protocols, and data structures.

## Operations #

In addition to basic arithmetic operations such as addition,
subtraction, multiplication and division, programming languages (and the
CPUs they run on) support what are known as **bitwise** operations.
The most important of these are *not* (or *complement*), *and*, *or*,
*xor*, *left shift* and *right shift*.

The *not* operation changes each 1 to a 0 and each 0 to a one. For
example, *not* 1101 is 0010.

The *and* operation compares the corresponding bits in two numbers. If
both bits are 1, then the corresponding bit in the result is 1.
Otherwise the corresponding bit will be 0. For example, 1001 *and* 1100
is 1000.

The *or* operation also compares the corresponding bits in two numbers.
If either bit is 1, then the corresponding bit in the result is 1. For
example, 1001 *or* 1100 is 1101.

The *xor* (exclusive or) operation works much like *or*, except that it
wants either bit to be 1, but not both. So 1001 *xor* 1100 is 0101.

The *shift* operations shift the bits in a number left or right. For
example 0110 shifted *right* by one is 0011, and 0110 shifted *left* by
one is 1100.

Here’s how to do these operations in some different languages. The “C / C++ / Python” column shows by far the most common syntax and applies to far more than just those three languages. Even the languages that are different, like Go, are usually pretty close.

C / C++ / Python | Go | |
---|---|---|

a and b |
`a & b` |
`a & b` |

a or b |
`a | b` |
`a | b` |

a xor b |
`a ^ b` |
`a ^ b` |

not b |
`~b` |
`^b` |

x shift left n |
`x << n` |
`x << n` |

x shift right n |
`x >> n` |
`x >> n` |

## Basic Manipulations #

There are several basic bit manipulations: Turning bits on, turning them
off, querying their current state and toggling them (settings 0’s to 1
and 1’s to 0). In each case this involves *and*ing, *or*ing or *xor*ing
the number you want to manipulate with another number called a **mask**.

Specifically, *or* turns bits on, *xor* toggles bits, and *and* both
turns them off and queries their state. For example, the following code
does all of these manipulations on the second and fourth bits of a number:

```
uint8_t mask = 10; // 00001010
uint8_t x = 0xfd; // 11111101
x = x ^ mask; // Toggle. x = 11110111
x = x ^ mask; // Toggle. x = 11111101
x = x & ~mask; // Turn off. x = 11110101
x = x | mask; // Turn on. x = 11111111
if (x & mask)
cout << "Second and fourth bits set." << endl;
else
cout << "Second and fourth bits unset." << endl;
```

### Mask Creation #

You can either create a mask by hand, or build it up by *or*ing and
shifting. For example:

```
uint8_t mask = 0;
mask = mask | 1; // 00000001
mask = mask << 2; // 00000100
mask = mask | 1; // 00000101
mask = mask << 3; // 00101000
```

If you get tired of typing, you can save yourself some time by writing inline functions like this one:

```
// Returns the nth bit of x.
inline unsigned bit(unsigned x, int n) {
return (x >> n) & 1;
}
```

### Groups of bits #

We use shifts in combination with *and* to extract group of bits. First, *right
shift* to move the group to the end, and then use *and* to zero out the
undesired upper bits.

For example, to extract the middle four bits from an 8-bit number:

```
uint8_t x = 0xac; // 10101100
x = (x >> 2) & 0x0f; // 00001011
```

To create a mask that’s n bits long, shift a 1 left by n and then subtract 1. For example, to create a 4-bit mask:

```
uint8_t mask = 1; // 00000001
mask <<= 4; // 00010000
mask -= 1; // 00001111
```

Shifts are also used in combination with *and* and *or* to insert a group
of bits. Zero out the relevant bits in the target integer, then *left shift*
the group into position, and finally *or* it with the target.

For example, to put a 1001 in the middle of 10101001:

```
uint8_t group = 0x09 // 00001001
uint8_t x = 0xa9; // 10101001
uint8_t mask = (1 << 4) - 1; // 00001111
x &= mask << 2; // 10000001
x |= group; // 10100101
```

Note that shifts have at least one gotcha: *Right shifting* a **signed** integer
will shift in 1s instead of 0s if the leading bit is 1. This preserves the sign
when using shifts for fast multiplication and division by powers of 2. For bit
manipulation purposes it’s best to stick to unsigned integers.

```
int8_t x = 0xfe; // 11111110 = -2
x << 2 == 0xf8; // 11111000 = -8
x >> 1 == 0xff; // 11111111 = -1
```

### Rotations #

Rotations are a lot like shifts, except instead of throwing away bits it shifts
them in at the other end. While the vast majority of CPUs have an instruction
to quickly perform this operation, most languages do not have an operator for
it. Instead, you have to write it in terms of *shift*s and *or*s. Fortunately
modern compilers are usually smart enough to optimize this to a single CPU
instruction.

```
uint8_t x = 0xf2; // 11110010
(x << 2) | (x >> 6) == 0xcb; // 11001011, rotate left by 2
(x >> 2) | (x << 6) == 0xbc; // 10111100, rotate right by 2
```

## Bit Arrays #

Finally, lets put this together into something useful: A simple bit array.

```
void set(uint8_t *bits, int b) {
bits[b / 8] |= (1 << (b % 8));
}
void clear(uint8_t *bits, int b) {
bits[b / 8] &= ~(1 << b);
}
unsigned get(uint8_t *bits, int b) {
int shift = b % 8;
return (bits[b / 8] & (1 << shift)) >> shift;
}
```

Just remember, though, while it’s instructive (and fun) to play around and write
your own, for production code you’re probably better off using one that’s
already written and tested. In this case, C++ provides a couple of
possibilities: `std::vector<bool>`

and `std::bitset`

.