How do language bindings work?

For instance how would one make bindings from a library written in one language to another language? Would the bindings be written in the same language as the library or the language the bindings are for?

Is it possible to make bindings to and from all languages or does the language have to somehow support bindings? If that is the case then how does that support work?


Solution 1:

For the most part, languages out there are either implemented in C (Perl, Python, Ruby, Tcl ...) or are compatible with C (C++, C#, Objective-C). Therefore, for most languages it is easy to use a C library by writing some wrapper functions that convert data structures in that language into native C data structures. There is even an automatic (or semi-automatic, depending on the complexity required) tool for this: SWIG.

This is one of the main reasons most libraries are written in C: it makes it easy to port the low-level code to multiple target languages. Examples of libraries using this strategy include SQLite, Tk and wxWidgets.

Another strategy is to use OS features to export the library as a language-neutral shared library. On Windows these are DLLs and on Unix-like systems they are shared objects (.so files). Most Microsoft products use this strategy, so it doesn't matter what language the original code was written in: you can easily access the library as long as it is compiled as a DLL. Examples of non-Microsoft libraries using this strategy include libpurple and GTK.
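As a rough illustration of loading a plain shared library directly, here is a minimal sketch using Python's standard ctypes module to call strlen from the C standard library. It assumes a POSIX system where the C library can be located; the helper name c_strlen is made up for this example.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. If find_library returns None,
# CDLL(None) falls back to the symbols already loaded in this process
# (that fallback is POSIX-only, which is an assumption of this sketch).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Hypothetical helper: encode a Python str as a C string and call strlen."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("hello"))
```

The library itself knows nothing about Python here; all the type conversion happens on the calling side, which is what makes this approach language-neutral.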

The third option is to use IPC. The most common method is sockets, because they are familiar to most people and very cross-platform. Code that uses this method is not, strictly speaking, a library. It is a server, and its "API" is technically a service. But to the average programmer using the service it looks like a regular API, because most language bindings abstract away the network code and present simple function/method calls. Examples of "libraries" using this strategy include the X Window System, GIMP scripting and most databases such as MySQL and Oracle.
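To make the "service that looks like an API" idea concrete, here is a small sketch in Python. The line-based protocol, the UpperHandler service and the UpperClient binding are all invented for illustration; real services such as database servers use far richer protocols.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Toy service: reads one line per request and replies with it upper-cased."""

    def handle(self):
        for line in self.rfile:
            self.wfile.write(line.upper())


class UpperClient:
    """The "binding": hides the socket behind an ordinary method call."""

    def __init__(self, host, port):
        self._sock = socket.create_connection((host, port))
        self._file = self._sock.makefile("rwb")

    def upper(self, text: str) -> str:
        # Send one request line, then block until the reply line arrives.
        self._file.write(text.encode("utf-8") + b"\n")
        self._file.flush()
        return self._file.readline().decode("utf-8").rstrip("\n")

    def close(self):
        self._file.close()
        self._sock.close()


def start_server():
    """Start the toy service on a free local port in a background thread."""
    server = socketserver.TCPServer(("127.0.0.1", 0), UpperHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server()
    client = UpperClient(*server.server_address)
    print(client.upper("hello"))
    client.close()
    server.shutdown()
    server.server_close()
```

A caller only ever sees client.upper(...); a client for the same protocol could be written in any language that has sockets.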

There are other, more convoluted ways of providing access to libraries written in another language, including actually embedding that language's interpreter, but the three above are the most common.


Clarification

I feel I should clarify the difference between the first and second approaches.

In the first approach, the library is still compiled into a DLL or .so as in the second approach, but the main difference is that the DLL must conform to a higher-level standard/protocol. Tcl, for example, cannot load any arbitrary DLL because it expects all values going into and coming out of a function to be pointers to a struct Tcl_Obj. So in order to use a library compiled as a plain old DLL, you'd need to compile another DLL that accesses the first one via wrapper functions that convert all variables and function parameters into struct Tcl_Obj*.
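The wrapper layer can be modelled in a few lines of Python. This is only a toy model of the idea, not Tcl's actual C API: Obj stands in for the host's boxed value type, plain_add for a function exported by a plain library, and add_cmd for the wrapper you would have to write. All three names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Obj:
    """Hypothetical stand-in for the host's boxed value (string form only)."""
    rep: str


def plain_add(a: int, b: int) -> int:
    """Stands in for a function exported by a plain library."""
    return a + b


def add_cmd(args: List[Obj]) -> Obj:
    """Wrapper: unbox the arguments, call the plain function, box the result."""
    a, b = (int(arg.rep) for arg in args)
    return Obj(str(plain_add(a, b)))


# The host interpreter only ever sees wrappers that take and return Obj.
COMMANDS: Dict[str, Callable[[List[Obj]], Obj]] = {"add": add_cmd}

if __name__ == "__main__":
    print(COMMANDS["add"]([Obj("2"), Obj("3")]))
```

The plain function never learns about Obj; only the wrapper knows both sides, which is why it has to be written (or generated) once per host language.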

But some languages, like VB, can load plain old C DLLs. That would be an example of the second approach.

Solution 2:

In theory a binding framework could be built that takes a language-agnostic approach, but more often than not the binding features are built into the given language (or framework).

Extending the feature set is notoriously difficult when you take an agnostic approach. This is often seen when attempting to write database-agnostic code, for instance: the code cannot take advantage of the features a particular database engine provides, because only the lowest common denominator can be used.

The basis of bindings boils down to notification: notification that something has changed. This can generally be handled via a publish/subscribe pattern, which is by definition language-agnostic.
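A minimal sketch of that publish/subscribe idea in Python follows. The Observable class and its method names are assumptions made for this example rather than part of any particular binding framework.

```python
from typing import Any, Callable, List


class Observable:
    """Holds a value and notifies subscribers whenever it changes."""

    def __init__(self, value: Any):
        self._value = value
        self._subscribers: List[Callable[[Any], None]] = []

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        """Register a callback to be called with each new value."""
        self._subscribers.append(callback)

    @property
    def value(self) -> Any:
        return self._value

    @value.setter
    def value(self, new_value: Any) -> None:
        # Only publish when the value actually changes.
        if new_value != self._value:
            self._value = new_value
            for callback in self._subscribers:
                callback(new_value)


if __name__ == "__main__":
    name = Observable("old")
    name.subscribe(lambda v: print("changed to", v))
    name.value = "new"
```

Subscribers depend only on the callback contract, not on who publishes the change, which is what keeps the pattern independent of any one language.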