I'll Never Use memset Again...
The Pitfalls of the memset function
0. Foreword
This problem originated from my first programming exam during my freshman year… It was a question involving block decomposition (data chunking), and in my program, I had an operation like this:
memset(mul_tag, 1, sizeof(mul_tag));
Unsurprisingly, the program resulted in a WA (Wrong Answer). I spent a very, very long time debugging. This line looked completely harmless, didn’t it? But as it turned out, simply changing this line fixed the program! Why??? The answer becomes clear when we look at the memset function prototype.
1. memset Function Introduction
The prototype for the memset function is as follows:
void *memset(void *s, int c, size_t n);
- s: A pointer to the block of memory to fill.
- c: The value to be set. Note: Although c is of type int, memset actually converts c to an unsigned char before filling.
- n: The number of bytes to be set to the value.
The purpose of memset is to set the first n bytes of the memory block pointed to by s to the value specified by c.
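To make the conversion to unsigned char concrete, here is a minimal sketch. The helper name first_byte_after_memset is made up for illustration, and the comments assume 8-bit bytes, which holds on every mainstream platform.

```cpp
#include <cstring>

// Hypothetical helper (not part of any library): fills a small buffer
// with memset and returns the value that ended up in its first byte.
unsigned char first_byte_after_memset(int c) {
    unsigned char buf[4];
    // memset converts c to unsigned char before writing it into each byte.
    std::memset(buf, c, sizeof(buf));
    return buf[0];
}
```

With 8-bit bytes, first_byte_after_memset(257) returns 1 and first_byte_after_memset(-1) returns 255: only the low byte of c survives.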
2. The Trap
memset performs its filling operation byte by byte. When a is an int array (assuming int occupies 4 bytes), memset(a, 1, sizeof(a)) will set each byte of each int element to 1. This results in each int element having the value 0x01010101, which is 16843009 in decimal, not the 1 we were hoping for.
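The trap can be reproduced in a few lines. This sketch uses std::int32_t instead of int so that the 4-byte assumption is explicit; the function name element_after_memset_one is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns what one element holds after the
// "harmless-looking" memset(a, 1, sizeof(a)) on a 32-bit integer array.
std::int32_t element_after_memset_one() {
    std::int32_t a[8];
    std::memset(a, 1, sizeof(a));  // every one of the 32 bytes becomes 0x01
    return a[0];                   // four 0x01 bytes, i.e. 0x01010101
}
```

Because all four bytes are identical, the result is 0x01010101 (16843009) on both little-endian and big-endian machines.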
3. Exceptions
Using memset(a, 1, sizeof(a)) is dangerous in most scenarios. However, there are a few exceptional cases where it works as expected or is safe:
- If a is a char array, memset(a, 1, sizeof(a)) is correct because each char element occupies exactly one byte.
- memset(a, 0, sizeof(a)) can safely be used to initialize an entire array of any integer type to 0, because an all-zero byte pattern is the value 0. (This is what we typically do! And it’s precisely why I initially thought memset(a, 1, sizeof(a)) would be fine!) It also works for floating-point and pointer arrays on mainstream platforms, but it should not be used on arrays of class types that are not trivially copyable.
- memset(a, -1, sizeof(a)) is safe for int arrays and will correctly initialize the elements to -1. Why? Hint: Computers store negative numbers using two’s complement representation. The two’s complement of -1 (for a 32-bit int) is 11111111 11111111 11111111 11111111, which means every byte is 0xFF. Therefore, memset(a, -1, sizeof(a)) fills every byte with 0xFF, effectively setting each int element to -1.
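The three cases above can be checked with a small sketch. The helper names memset_gives and char_fill_is_one are invented for this example, and the -1 case assumes two’s complement integers (guaranteed since C++20 and universal in practice before that).

```cpp
#include <cstring>

// Hypothetical helper: fills an int array with memset(a, byte_value, ...)
// and reports whether every element then equals `expected`.
bool memset_gives(int byte_value, int expected) {
    int a[16];
    std::memset(a, byte_value, sizeof(a));
    for (int x : a) {
        if (x != expected) return false;
    }
    return true;
}

// Hypothetical helper: the same check for a char array filled with 1.
bool char_fill_is_one() {
    char s[16];
    std::memset(s, 1, sizeof(s));
    for (char ch : s) {
        if (ch != 1) return false;
    }
    return true;
}
```

Here memset_gives(0, 0) and memset_gives(-1, -1) are true, char_fill_is_one() is true, and memset_gives(1, 1) is false, which is exactly the bug from the foreword.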
4. You Should Use std::fill
Instead of memset for non-zero/non-minus-one initializations (especially in C++), you should use std::fill.
std::fill Example (C++):
#include <algorithm>
#include <array> // Or use raw arrays

int main() {
    std::array<int, 10> a; // Or: int a[10];
    // Assign 1 to every element, not to every byte.
    std::fill(a.begin(), a.end(), 1); // Or: std::fill(a, a + 10, 1);
}
std::fill operates on elements of the container or array, assigning the specified value (1 in this case) correctly to each element, regardless of its underlying byte representation.
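Applied to the line from the foreword, the fix might look like the sketch below. The original post does not show how mul_tag was declared, so the element type (long long) and the size kBlocks are assumptions made purely for illustration.

```cpp
#include <algorithm>
#include <iterator>

constexpr int kBlocks = 1000;  // assumed size; the real value is not given
long long mul_tag[kBlocks];    // assumed element type for the multiply tags

// Resets every multiply tag to 1, the identity for multiplication.
void reset_mul_tag() {
    // std::begin/std::end work on raw arrays, so no manual length is needed.
    std::fill(std::begin(mul_tag), std::end(mul_tag), 1LL);
}
```

Using std::begin and std::end avoids repeating the array length, so the call keeps working if the size of mul_tag changes later.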