Does using heap memory (malloc/new) create a non-deterministic program?
Solution 1:
In the context of real-time systems, there is more to determinism than a repeatable "execution path". Another required property is that the timing of key events is bounded. In hard real-time systems, an event that occurs outside its allowed time interval (either before the start of that interval or after its end) represents a system failure.
In this context, usage of dynamic memory allocation can cause non-determinism, particularly if the program has a varying pattern of allocating, deallocating, and reallocating. The timing of allocations, deallocations, and reallocations can vary over time, making the timing of the system as a whole unpredictable.
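To make that concrete, here is a small measurement sketch (the sizes, iteration count, and use of the POSIX monotonic clock are my assumptions) that records the fastest and slowest of a run of malloc calls. On a general-purpose OS the two typically differ by orders of magnitude, which is exactly the kind of unbounded timing a hard real-time budget cannot tolerate.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Timestamp in nanoseconds from a monotonic clock (POSIX clock_gettime). */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

int main(void)
{
    long long min_ns = -1, max_ns = 0;

    for (int i = 0; i < 100000; i++) {
        size_t size = (size_t)(rand() % 4096) + 1;   /* a varying allocation pattern */
        long long t0 = now_ns();
        void *p = malloc(size);
        long long dt = now_ns() - t0;
        free(p);
        if (min_ns < 0 || dt < min_ns) min_ns = dt;
        if (dt > max_ns) max_ns = dt;
    }
    printf("malloc latency: min %lld ns, max %lld ns\n", min_ns, max_ns);
    return 0;
}
```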
Solution 2:
The comment, as stated, is incorrect.
Using a heap manager with non-deterministic behavior creates a program with non-deterministic behavior. But that is obvious.
Slightly less obvious is the existence of heap managers with deterministic behavior. Perhaps the most well-known example is the pool allocator. It has an array of N*M bytes and an available[] mask of N bits. To allocate, it scans for the first available entry (a bit test with a deterministic O(N) upper bound). To deallocate, it sets the available bit (O(1)). malloc(X) rounds X up to the next block size M to choose the right pool.
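A minimal sketch of that scheme, assuming a single pool and hypothetical names (pool_alloc / pool_free); a multi-size variant would keep one such pool per block size and pick the pool whose block size fits the rounded-up request:

```c
#include <stdbool.h>
#include <stddef.h>

#define N 32                       /* number of blocks in the pool */
#define M 64                       /* size of each block, in bytes */

static unsigned char pool[N][M];   /* the N*M bytes of backing storage */
static bool in_use[N];             /* the availability mask: false = free */

/* Allocate one block: scan the mask for a free entry.
 * Worst case is a full scan, so the upper bound is O(N) and deterministic. */
static void *pool_alloc(size_t size)
{
    if (size > M)
        return NULL;               /* request does not fit a block */
    for (size_t i = 0; i < N; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return pool[i];
        }
    }
    return NULL;                   /* pool exhausted: a bounded, explicit failure */
}

/* Deallocate one block: clear its mask entry, O(1). */
static void pool_free(void *p)
{
    if (p != NULL)
        in_use[((unsigned char (*)[M])p) - pool] = false;
}
```

Both operations touch only the mask, so their worst-case cost is fixed at compile time, which is what makes the scheme usable under a hard real-time budget.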
This might not be very efficient, especially if your choices of N and M are too high. And if you choose them too low, your program can fail. But the limits for N and M can be lower than for an equivalent program without dynamic memory allocation.
Solution 3:
Nothing in the C11 standard or in n1570 says that malloc is deterministic (or that it is not); neither does other documentation such as malloc(3) on Linux. BTW, many malloc implementations are free software.
But malloc can (and does) fail, and its performance is not known (a typical call to malloc on my desktop would practically take less than a microsecond, but I could imagine weird situations where it might take much longer, perhaps many milliseconds on a very loaded computer; read about thrashing). And my Linux desktop has ASLR (address space layout randomization), so running the same program twice gives different malloc-ed addresses (in the virtual address space of the process). BTW, here is a deterministic (under specific assumptions that you need to elaborate) but practically useless malloc implementation.
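One trivially conforming sketch of what such a deterministic but useless implementation could look like (my illustration, not necessarily the one linked above; interposing it over the C library's malloc is platform-specific, e.g. on Linux): every call fails immediately, in constant time, identically on every run.

```c
#include <errno.h>
#include <stddef.h>

/* Deterministic and O(1): every allocation fails the same way on every run.
 * A null return just means the request could not be satisfied, so this is
 * conforming, but obviously useless. */
void *malloc(size_t size)
{
    (void)size;
    errno = ENOMEM;    /* POSIX permits reporting the failure this way */
    return NULL;
}

void free(void *ptr)
{
    (void)ptr;         /* nothing was ever allocated, so nothing to release */
}
```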
determinism means that running a program twice will lead to the exact, same execution path
This is practically wrong in most embedded systems, because the physical environment is changing; for example, the software driving a rocket engine cannot expect the thrust, the drag, the wind speed, etc. to be exactly the same from one launch to the next.
(so I am surprised that you believe or wish that real-time systems are deterministic; they never are! Perhaps you care about WCET, which is increasingly difficult to predict because of caches)
BTW some "real-time" or "embedded" systems are implementing their own malloc
(or some variant of it). C++ programs can have their allocator-s, usable by standard containers. See also this and that, etc, etc.....
And high-level layers of embedded software (think of an autonomous automobile and its planning software) are certainly using heap allocation and perhaps even garbage collection techniques (some of which are "real-time"), but are generally not considered safety critical.
Solution 4:
tl;dr: It's not that dynamic memory allocation is inherently non-deterministic (as you defined it in terms of identical execution paths); it's that it generally makes your program unpredictable. Specifically, you can't predict whether the allocator might fail in the face of an arbitrary sequence of inputs.
You could have a non-deterministic allocator. This is actually common outside of your real-time world, where operating systems use things like address layout randomization. Of course, that would make your program non-deterministic.
But that's not an interesting case, so let's assume a perfectly deterministic allocator: the same sequence of allocations and deallocations will always result in the same blocks in the same locations and those allocations and deallocations will always have a bounded running time.
Now your program can be deterministic: the same set of inputs will lead to exactly the same execution path.
The problem is that if you're allocating and freeing memory in response to inputs, you can't predict whether an allocation will ever fail (and failure is not an option).
First, your program could leak memory. So if it needs to run indefinitely, eventually an allocation will fail.
But even if you can prove there are no leaks, you would need to know that there's never an input sequence that could demand more memory than is available.
But even if you can prove that the program will never need more memory than is available, the allocator might, depending on the sequence of allocations and frees, fragment memory and thus eventually be unable to find a contiguous block to satisfy an allocation, even though there is enough free memory overall.
It's very difficult to prove that there's no sequence of inputs that will lead to pathological fragmentation.
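To see why fragmentation alone can defeat an allocation, here is a small self-contained sketch (a toy first-fit scheme over a 64-byte arena, with invented names, not a model of any real allocator): after freeing every other block, half the arena is free, yet a 16-byte request still fails because no free run is contiguous.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define ARENA 64
static bool used[ARENA];              /* true = this byte is allocated */

/* Return the offset of the first contiguous free run of 'size' bytes,
 * or -1 if none exists, even when the total free space would suffice. */
static int first_fit(int size)
{
    for (int start = 0; start + size <= ARENA; start++) {
        int run = 0;
        while (run < size && !used[start + run]) run++;
        if (run == size) return start;
        start += run;                 /* skip past the byte that blocked the run */
    }
    return -1;
}

static int alloc(int size)
{
    int off = first_fit(size);
    if (off >= 0) memset(&used[off], true, (size_t)size);
    return off;
}

static void release(int off, int size)
{
    memset(&used[off], false, (size_t)size);
}

int main(void)
{
    /* Fill the arena with eight 8-byte blocks... */
    int off[8];
    for (int i = 0; i < 8; i++) off[i] = alloc(8);
    /* ...then free every other block, leaving 32 bytes free in total... */
    for (int i = 0; i < 8; i += 2) release(off[i], 8);
    /* ...yet a 16-byte request fails: no free run is contiguous. */
    printf("16-byte allocation -> offset %d\n", alloc(16));   /* prints -1 */
    return 0;
}
```

Nothing here leaks and the total demand never exceeds the arena, yet the request still fails; that is the property that is so hard to rule out for arbitrary input sequences.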
You can design allocators to guarantee there won't be fragmentation (e.g., by allocating blocks of only one size), but that puts a substantial constraint on the caller and possibly increases the amount of memory required due to the waste. And the caller must still prove that there are no leaks and that there's a satisfiable upper bound on the total memory required regardless of the sequence of inputs. This burden is so high that it's actually simpler to design the system so that it doesn't use dynamic memory allocation.
Solution 5:
The deal with real-time systems is that the program must strictly meet certain computation and memory restrictions regardless of the execution path taken (which may still vary considerably depending on the input). So what does the use of generic dynamic memory allocation (such as malloc/new) mean in this context? It means that at some point the developer is unable to determine the exact memory consumption, and it would be impossible to tell whether the resulting program will be able to meet its requirements, both for memory and for computation power.