What is the global interpreter lock (GIL) in CPython?
Python's GIL is intended to serialize access to interpreter internals from different threads. On multi-core systems, it means that multiple threads can't effectively make use of multiple cores. (If the GIL didn't lead to this problem, most people wouldn't care about the GIL - it's only being raised as an issue because of the increasing prevalence of multi-core systems.) If you want to understand it in detail, you can view this video or look at this set of slides. It might be too much information, but then you did ask for details :-)
Note that Python's GIL is only really an issue for CPython, the reference implementation. Jython and IronPython don't have a GIL. As a Python developer, you don't generally come across the GIL unless you're writing a C extension. C extension writers need to release the GIL when their extensions do blocking I/O, so that other threads in the Python process get a chance to run.
Suppose you have multiple threads which don't really touch each other's data. Those should execute as independently as possible. If you have a "global lock" which you need to acquire in order to (say) call a function, that can end up as a bottleneck. You can wind up not getting much benefit from having multiple threads in the first place.
To put it into a real world analogy: imagine 100 developers working at a company with only a single coffee mug. Most of the developers would spend their time waiting for coffee instead of coding.
None of this is Python-specific - I don't know the details of what Python needed a GIL for in the first place. However, hopefully it's given you a better idea of the general concept.
Let's first understand what the python GIL provides:
Any operation/instruction is executed in the interpreter. GIL ensures that interpreter is held by a single thread at a particular instant of time. And your python program with multiple threads works in a single interpreter. At any particular instant of time, this interpreter is held by a single thread. It means that only the thread which is holding the interpreter is running at any instant of time.
Now why is that an issue:
Your machine could be having multiple cores/processors. And multiple cores allow multiple threads to execute simultaneously i.e multiple threads could execute at any particular instant of time.. But since the interpreter is held by a single thread, other threads are not doing anything even though they have access to a core. So, you are not getting any advantage provided by multiple cores because at any instant only a single core, which is the core being used by the thread currently holding the interpreter, is being used. So, your program will take as long to execute as if it were a single threaded program.
However, potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Taken from here. So for such operations, a multithreaded operation will still be faster than a single threaded operation despite the presence of GIL. So, GIL is not always a bottleneck.
Edit: GIL is an implementation detail of CPython. IronPython and Jython don't have GIL, so a truly multithreaded program should be possible in them, thought I have never used PyPy and Jython and not sure of this.