Quick Way to Implement Dictionary in C
Section 6.6 of The C Programming Language presents a simple dictionary (hashtable) data structure. I don't think a useful dictionary implementation could get any simpler than this. For your convenience, I reproduce the code here.
struct nlist { /* table entry: */
struct nlist *next; /* next entry in chain */
char *name; /* defined name */
char *defn; /* replacement text */
};
#define HASHSIZE 101
static struct nlist *hashtab[HASHSIZE]; /* pointer table */
/* hash: form hash value for string s */
unsigned hash(char *s)
{
unsigned hashval;
for (hashval = 0; *s != '\0'; s++)
hashval = *s + 31 * hashval;
return hashval % HASHSIZE;
}
/* lookup: look for s in hashtab */
struct nlist *lookup(char *s)
{
struct nlist *np;
for (np = hashtab[hash(s)]; np != NULL; np = np->next)
if (strcmp(s, np->name) == 0)
return np; /* found */
return NULL; /* not found */
}
char *strdup(char *);
/* install: put (name, defn) in hashtab */
struct nlist *install(char *name, char *defn)
{
struct nlist *np;
unsigned hashval;
if ((np = lookup(name)) == NULL) { /* not found */
np = (struct nlist *) malloc(sizeof(*np));
if (np == NULL || (np->name = strdup(name)) == NULL)
return NULL;
hashval = hash(name);
np->next = hashtab[hashval];
hashtab[hashval] = np;
} else /* already there */
free((void *) np->defn); /*free previous defn */
if ((np->defn = strdup(defn)) == NULL)
return NULL;
return np;
}
char *strdup(char *s) /* make a duplicate of s */
{
char *p;
p = (char *) malloc(strlen(s)+1); /* +1 for ’\0’ */
if (p != NULL)
strcpy(p, s);
return p;
}
Note that if the hashes of two strings collide, it may lead to an O(n)
lookup time. You can reduce the likelihood of collisions by increasing the value of HASHSIZE
. For a complete discussion of the data structure, please consult the book.
The quickest way would be to use an already-existing implementation, like uthash.
And, if you really want to code it yourself, the algorithms from uthash
can be examined and re-used. It's BSD-licensed so, other than the requirement to convey the copyright notice, you're pretty well unlimited in what you can do with it.
For ease of implementation, it's hard to beat naively searching through an array. Aside from some error checking, this is a complete implementation (untested).
typedef struct dict_entry_s {
const char *key;
int value;
} dict_entry_s;
typedef struct dict_s {
int len;
int cap;
dict_entry_s *entry;
} dict_s, *dict_t;
int dict_find_index(dict_t dict, const char *key) {
for (int i = 0; i < dict->len; i++) {
if (!strcmp(dict->entry[i], key)) {
return i;
}
}
return -1;
}
int dict_find(dict_t dict, const char *key, int def) {
int idx = dict_find_index(dict, key);
return idx == -1 ? def : dict->entry[idx].value;
}
void dict_add(dict_t dict, const char *key, int value) {
int idx = dict_find_index(dict, key);
if (idx != -1) {
dict->entry[idx].value = value;
return;
}
if (dict->len == dict->cap) {
dict->cap *= 2;
dict->entry = realloc(dict->entry, dict->cap * sizeof(dict_entry_s));
}
dict->entry[dict->len].key = strdup(key);
dict->entry[dict->len].value = value;
dict->len++;
}
dict_t dict_new(void) {
dict_s proto = {0, 10, malloc(10 * sizeof(dict_entry_s))};
dict_t d = malloc(sizeof(dict_s));
*d = proto;
return d;
}
void dict_free(dict_t dict) {
for (int i = 0; i < dict->len; i++) {
free(dict->entry[i].key);
}
free(dict->entry);
free(dict);
}
I am surprised no one mentioned hsearch/hcreate set of libraries which although is not available on windows, but is mandated by POSIX, and therefore available in Linux / GNU systems.
The link has a simple and complete basic example that very well explains its usage.
It even has thread safe variant, is easy to use and very performant.
GLib and gnulib
These are your likely best bets if you don't have more specific requirements, since they are widely available, portable and likely efficient.
GLib: https://developer.gnome.org/glib/ by GNOME project. Several containers documented at: https://developer.gnome.org/glib/stable/glib-data-types.html including "Hash Tables" and "Balanced Binary Trees". License: LGPL
gnulib: https://www.gnu.org/software/gnulib/ by the GNU project. You are meant to copy paste the source into your code. Several containers documented at: https://www.gnu.org/software/gnulib/MODULES.html#ansic_ext_container including "rbtree-list", "linkedhash-list" and "rbtreehash-list". GPL license.
See also: Are there any open source C libraries with common data structures?