How do realloc and memcpy work?

I have two questions.

  1. Do realloc() and memcpy() copy the entries in an array to another in a way faster than just iterating on each element O(N) ? If the answer is yes then what do you think is its complexity ?

  2. If the size allocated is smaller than the original size, does realloc() copy the entries to somewhere else or just leave them as they are decreasing the size of the array ?


1 - No. They copy a block at a time. See http://www.embedded.com/design/configurable-systems/4024961/Optimizing-Memcpy-improves-speed for a pretty good analysis.

2 - This is implementation dependent. See http://www.gnu.org/software/libtool/manual/libc/Changing-Block-Size.html for glibc details. "In several allocation implementations, making a block smaller sometimes necessitates copying it"


Let's take a little closer look at memcpy and, while we're at it, at "big O" or Landau notation.

First, big-O. As i've talked about elsewhere, it's worth remembering the definition of big O, which is that some function g(n) is said to be O(f(n)) when there exists a constant k for which g(n)kf(n). What the constant does is lets you ignore the little details in favor of the important part. As everyone has noted, memcpy of n bytes will be O(n) in most any normal architecture, because no matter what you have to move those n bytes, one chunk at a time. So, a first, naive implementation of memcpy in C could be written

unsigned char *
memcpy(unsigned char * s1, unsigned char * s2, long size){
    long ix;
    for(ix=0; ix < size; ix++)
        s1[ix] = s2[ix];
    return s1;
}

This is in fact O(n), and might make you wonder why we even bother with a library routine. however, the thing about the libc functions is that they are the place where platform-specific utilities get written; if you want to optimize for the architecture, this is one of the places you can do it. So, depending on the architecture, there may be a more efficient implementation options; for example, in the IBM 360 archiecture, there is a MOVL instruction that moves data is big chunks using very highly optimized microcode. So in place of that loop, a 360 implementation of memcpy might instead look something like

LR 3,S1      LOAD S1 ADDR in Register 3
LR 4,S2      
MOVL 3,4,SIZE

(No guarantees that's exactly right 360 code by the way, but it'll serve for an illustration.) This implementation looks like instead of doing n steps around the loop as the C code did, it just executes 3 instructions.

What really happens, though, is that it's executing O(n) micro instructions under the covers. What's different between the two is the constant k; because the microcode is much faster, and because there's only three decode steps on the instructions, it is dramatically faster than the naive version, but it's still O(n) -- it's just the constant is smaller.

And that's why you can make good use of memcpy -- it's not asymptotically faster, but the implementation is as fast as someone could make it on that particular architecture.


  1. There is absolutely no way to copy N items faster than O(N). However, it might be able to copy multiple items at once, or use special processor instructions - so it still might be faster than you could do it yourself.
  2. I don't know for sure, but I'd assume that the memory is completely reallocated. That's the safest assumption, and it's probably implementation dependent anyway.

  1. The performance of memcpy can't really be better than O(N) but it can be optimized so that it outperforms manual copying; for example, it might be able to copy 4 bytes in the time it takes you to copy 1 byte. Many memcpy implementations are written in assembly using optimized instructions that can copy multiple elements at a time which is usually faster than copying data one byte at a time.

  2. I don't quite understand this question, if you use realloc to decrease the size of memory and it succeeds (returns non-NULL), the new location will contain the same data as the old location up to the size of the new request. If the memory location was changed as a result of calling realloc (not usual when decreasing the size) the contents will be copied, otherwise no copying needs to happen as the memory hasn't moved.