Why do 32-bit processes have a 2 GB RAM limit?

I'm curious why there is a 2 GB limit for a 32-bit process on a 32-bit OS. According to the blog post Chat Question: Memory Limits for 32-bit and 64-bit processes, the limit can be extended to 3 GB, but the question remains.

I see that the limit of a 32-bit address space is 4 GB, so are the 2 or 3 GB limits just hard-coded into Windows? Why not 4 GB, as a 32-bit process can have on a 64-bit OS?

NOTE: This question was marked as a duplicate, but the referenced question refers to the 4 GB limit of the 32-bit address space. This is NOT what I am asking. I am specifically asking why Windows limits processes to 2 GB -- even on a 32-bit platform. The accepted answer mentions it, but it doesn't explain why.


Solution 1:

On the NT platform the 4 GB virtual address space is by default divided into two parts, the lower 2 GB for process address space and the upper 2 GB for system use.
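A quick way to observe this split from user mode is GetSystemInfo, which reports the lowest and highest addresses available to applications. A minimal sketch in C (the printed values are an expectation, not a guarantee; on a default 32-bit Windows the range covers roughly the lower 2 GB):

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        /* On a default 32-bit Windows this prints roughly
           0x00010000 .. 0x7FFEFFFF, i.e. the lower ~2 GB. */
        printf("Lowest  user-mode address: %p\n", si.lpMinimumApplicationAddress);
        printf("Highest user-mode address: %p\n", si.lpMaximumApplicationAddress);
        return 0;
    }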

This address space is virtual and not influenced by RAM size; the CPU and the OS memory manager map portions of RAM into the virtual address space as needed. The details are complex and will not be described here. The split itself was a design decision made in the interests of performance, security, and reliability.

Each process has its own private 2 GB address space, but there is only one system address space. Processes are isolated in their own private address spaces and cannot even see one another's. There is provision for sharing memory among two or more processes when necessary. The system address space is off limits to normal processes and is accessible only to kernel-level components such as the OS itself and device drivers. If a process goes astray, it can only hurt itself; other processes and the OS are unaffected.
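The sharing mentioned above is typically done with named section objects (file mappings), which map the same pages into the private address spaces of the cooperating processes. A minimal sketch in C; the section name Local\DemoSharedRegion is made up for illustration:

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        /* Create (or open) a 4 KB named section backed by the page file.
           Another process can open the same name and map the same pages
           into its own private address space. */
        HANDLE map = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL,
                                        PAGE_READWRITE, 0, 4096,
                                        "Local\\DemoSharedRegion");
        if (map == NULL) {
            fprintf(stderr, "CreateFileMapping failed: %lu\n", GetLastError());
            return 1;
        }

        char *view = (char *)MapViewOfFile(map, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
        if (view != NULL) {
            lstrcpyA(view, "visible to any process that maps this section");
            UnmapViewOfFile(view);
        }
        CloseHandle(map);
        return 0;
    }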

But why not give the system its own private address space, just like the processes have? That would allow the full 4 GB address space to be available to the system and to each process. It could have been done, but there was a problem.

Assume that had been done. The running process would have full access to its own code and data, and all would seem well. But what if that process makes an OS call that requires access to the system address space, such as for an I/O operation? Or what if an interrupt arrives that needs to be handled by the kernel?

Only the address space of the running process can be seen by the CPU. What to do? The solution is to perform a context switch that brings the system address space into view. The OS can do this quite efficiently, but it does take time. If the system address space needed to be accessed frequently, the overhead of context switches would become excessive and performance would suffer.

There had to be a better way.

The solution adopted was to divide the 4 GB total address space into two parts of 2 GB each: the process address space in the lower 2 GB and the system in the upper 2 GB. This keeps the system address space always in scope and accessible whenever needed, without a context switch. As often happens, the design decision was made for practical reasons.

2 GB may seem very small and restrictive now, but it was huge when NT was released in 1993. And don't forget that each process has its own 2 GB all to itself.

Solution 2:

According to the Windows Internals book, it was a design decision. They split the whole 4 GB virtual address space into two parts:

  • 2 GB kernel mode virtual address space (driver memory windows, etc.)
  • 2 GB user mode virtual address space (memory for userspace programs)

Then there's the non-recommended /3GB boot switch (which changes the split to 1 GB kernel / 3 GB user and may lead to nasty bugs with drivers that allocate at absolute addresses), PAE, and one more API that lets a program allocate non-paged memory and dynamically window it into its address space; I believe that one is Address Windowing Extensions (AWE).
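For completeness, a 32-bit program can check which limit it actually got (about 2 GB by default, about 3 GB with /3GB plus an image linked /LARGEADDRESSAWARE, close to 4 GB for a large-address-aware 32-bit process on 64-bit Windows) by asking GlobalMemoryStatusEx for the size of its user-mode virtual address space. A minimal sketch in C:

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        MEMORYSTATUSEX ms;
        ms.dwLength = sizeof(ms);
        GlobalMemoryStatusEx(&ms);

        /* ullTotalVirtual is the user-mode virtual address space of the
           calling process: ~2 GB by default, ~3 GB under /3GB with a
           /LARGEADDRESSAWARE image, ~4 GB for a large-address-aware
           32-bit process on 64-bit Windows. */
        printf("User-mode virtual address space: %llu MB\n",
               ms.ullTotalVirtual / (1024 * 1024));
        return 0;
    }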