Why are there no hashtables in the C standard library?
Why is there no hashtable support in the standard C library? Is there any specific reason for this?
C seems unusual by today's standards because there are no useful data structures defined. None. Not even strings — and if you think a C string is a data structure, well, we'll have to disagree on what a "data structure" is.
If you like C, then think of it as a "blank slate"... your entire application is made of code written by you and libraries you choose to pull in, plus a few fairly primitive standard library functions, with maybe one or two exceptions like qsort(). People use C these days to implement things like Python, Ruby, Apache, or the Linux kernel. These are projects that use all of their own data structures anyway, and they wouldn't be likely to use something like the STL.
Many C libraries implement generic hash tables. There are tradeoffs, and you can pick your favorite. Some of them are configurable using callbacks.
- GLib has a hash table object (documentation)
- Apache Portable Runtime has a hash table (documentation)
- Apple's Core Foundation library has a hash table (documentation) Note: Yes you can insert any object as key or value.
- UTHash is a hash table library (documentation)
- Another hash table library (link)
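To give an idea of what "configurable using callbacks" means, here is a minimal sketch of a chained hash table whose hashing and key comparison are supplied by the caller. Every name in it (table_new, table_put, table_get, table_free, str_hash, str_equal) is made up for illustration and is not the API of any library listed above; it never resizes and never copies keys, which are exactly the kind of tradeoffs the real libraries decide differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical API for illustration only. */
typedef size_t (*hash_fn)(const void *key);
typedef int (*equal_fn)(const void *a, const void *b);

struct entry {
    const void *key;      /* borrowed, not copied */
    void *value;
    struct entry *next;   /* collision chain */
};

struct table {
    struct entry **buckets;
    size_t nbuckets;
    hash_fn hash;         /* caller decides how keys are hashed */
    equal_fn equal;       /* caller decides when two keys match */
};

struct table *table_new(size_t nbuckets, hash_fn hash, equal_fn equal)
{
    if (nbuckets == 0)
        return NULL;
    struct table *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->buckets = calloc(nbuckets, sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    t->nbuckets = nbuckets;
    t->hash = hash;
    t->equal = equal;
    return t;
}

/* Insert or overwrite. Returns 0 on success, -1 if allocation fails. */
int table_put(struct table *t, const void *key, void *value)
{
    size_t i = t->hash(key) % t->nbuckets;
    for (struct entry *e = t->buckets[i]; e != NULL; e = e->next) {
        if (t->equal(e->key, key)) {
            e->value = value;
            return 0;
        }
    }
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = key;
    e->value = value;
    e->next = t->buckets[i];
    t->buckets[i] = e;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
void *table_get(const struct table *t, const void *key)
{
    size_t i = t->hash(key) % t->nbuckets;
    for (struct entry *e = t->buckets[i]; e != NULL; e = e->next)
        if (t->equal(e->key, key))
            return e->value;
    return NULL;
}

/* Frees the table and its entries; keys and values belong to the caller. */
void table_free(struct table *t)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < t->nbuckets; i++) {
        struct entry *e = t->buckets[i];
        while (e != NULL) {
            struct entry *next = e->next;
            free(e);
            e = next;
        }
    }
    free(t->buckets);
    free(t);
}

/* Example callbacks for NUL-terminated string keys. */
size_t str_hash(const void *key)
{
    const unsigned char *s = key;
    size_t h = 5381;
    while (*s != '\0')
        h = h * 33 + *s++;
    return h;
}

int str_equal(const void *a, const void *b)
{
    return strcmp(a, b) == 0;
}
```

A caller would create a table with table_new(16, str_hash, str_equal) and could swap in different callbacks for integer or struct keys without touching the table code.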
With all of these libraries that do what you want, what's the point of adding a hash table to the C standard?
There is no hashtable in the standard C library because either:
- no-one has submitted a proposal to the working group; or
- the working group has deemed it unnecessary.
That's the way ISO works. Proposals are put forward and accepted or rejected.
You have to be careful what you add to the standard library since you have two conflicting groups. As a user, you might want every data structure under the sun to be added to the standard to make the language more useful.
But, as a language implementor (as an aside, these are probably the people that tend to make up most of the various working groups so their view is likely to have more impact), you don't really want the hassle of having to implement stuff that may not be used by everyone. What was in the library when C89 appeared was there because the primary purpose of that standard was to codify existing practice rather than introduce new practices. All iterations of the standard since then have been a little freer in what they can do, but backwards compatibility is still an important issue.
Myself, I also have conflicts. I'd love to have all the features of the Java, C++ or Python libraries at my disposal in C. Of course, that would make it so much harder to learn everything for newcomers and, as one commenter stated, probably make it so any code monkey can pump out useful code, reducing my value in the process :-)
And I pretty much have all the data structures I'll ever need, from my long and (mostly) illustrious career. You're not limited to the standard library for this sort of stuff. There are plenty of third-party tools you can get to do the job and (like me) you can also roll your own.
If you want to know why certain decisions were made in each iteration, ISO (and ANSI originally, before ISO took over) usually publish rationale documents. The C89 one from ANSI can be found here. It contains this little beauty in the scope:
This Rationale focuses primarily on additions, clarifications, and changes made to the language as described in the Base Documents. It is not a rationale for the C language as a whole: the Committee was charged with codifying an existing language, not designing a new one. No attempt is made in this Rationale to defend the pre-existing syntax of the language, such as the syntax of declarations or the binding of operators.
I especially enjoy the admission that they're not responsible for any unholy mess that may have predated their attempts to standardise.
But, perhaps the real answer to your question lies in this bit, one of the guiding principles:
Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. Some of the facets of the spirit of C can be summarized in phrases like:
- Trust the programmer.
- Don't prevent the programmer from doing what needs to be done.
- Keep the language small and simple.
- Provide only one way to do an operation.
- Make it fast, even if it is not guaranteed to be portable.
That third one is probably the main reason why the library wasn't massively expanded with the initial standardisation effort - that, and the fact that such an expansion from a committee would probably have resulted in ANSI C being labeled C2038 rather than C89.
The standard C library doesn't include any large, persistent data structures - neither lists, nor trees, nor stacks, nor hashtables.
It's not really possible to give a definitive answer without asking the authors of the original C library. However, a plausible explanation is that the implementation of such data structures involves various tradeoffs, and only the author of the application is in the correct position to make those tradeoffs.
Note that the POSIX standard C library does specify generic hashtable functions: hcreate(), hsearch() and hdestroy(); and note also that their "one size fits all" nature tends to render them inadequate for most real-world use cases, supporting the argument above.
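For reference, here is a small sketch of those POSIX functions in use, counting word occurrences. The helper count_word and the MAX_WORDS limit are made up for this example. The limitations show up quickly: there is one table per process, its size is fixed at hcreate() time, keys must be strings, and there is no way to delete or iterate. The sketch also assumes glibc-like behaviour where hdestroy() does not free the keys; some other implementations do call free() on them, which would be wrong for the string literals used here.

```c
#include <search.h>
#include <stddef.h>

#define MAX_WORDS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: returns how many times `query` occurs in `words`,
   0 if it does not occur, or -1 on error. Uses the single process-wide
   table, so it must not be called while another hcreate() table is live. */
int count_word(char *const words[], size_t n, const char *query)
{
    int counts[MAX_WORDS];   /* storage for the per-word counters */
    size_t used = 0;
    int result = 0;

    if (n > MAX_WORDS || hcreate(MAX_WORDS * 2) == 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        ENTRY item = { .key = words[i], .data = NULL };
        ENTRY *found = hsearch(item, FIND);
        if (found != NULL) {
            ++*(int *)found->data;          /* seen before: bump counter */
        } else {
            counts[used] = 1;
            item.data = &counts[used++];
            if (hsearch(item, ENTER) == NULL) {   /* table full */
                result = -1;
                break;
            }
        }
    }

    if (result == 0) {
        ENTRY probe = { .key = (char *)query, .data = NULL };
        ENTRY *found = hsearch(probe, FIND);
        result = (found != NULL) ? *(int *)found->data : 0;
    }

    hdestroy();   /* assumption: keys are not freed here (glibc behaviour) */
    return result;
}
```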
Due to the lack of templates
This is a guess, but without templates like those in C++, implementing containers is very inelegant: you'd need dozens of definitions to cover all possible types, not to mention user-defined types.
There are C strategies to mitigate this, like playing around with void *, but they lose compile-time type checks.
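The standard qsort() is a convenient way to see both problems at once. In the sketch below (the function names cmp_int, cmp_double, sort_ints and sort_doubles are made up for illustration), each element type needs its own comparator, and nothing stops you from pairing a comparator with the wrong array.

```c
#include <stdlib.h>

/* One comparator per element type: this is the "dozens of definitions". */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Typed wrappers restore some safety, but again one is needed per type. */
void sort_ints(int *v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_int);
}

void sort_doubles(double *v, size_t n)
{
    qsort(v, n, sizeof *v, cmp_double);
}

/* The compiler would also accept the following without complaint, even
   though it reinterprets each double's bytes as an int:

       qsort(doubles, n, sizeof *doubles, cmp_int);

   Everything passes through void *, so the mismatch is invisible at
   compile time. */
```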
GLib and gnulib are my recommended implementations at the moment; see "Quick Way to Implement Dictionary in C".