Recommendations for a scripting or plugin language for highly math-dependent user coding? [closed]
I have started a bounty for this question
...because I really want the community's input. I have looked at several languages / frameworks and thought 'well, this will probably work okay' -- but I would really appreciate advice that's based specifically on the problem I face, and especially from anyone with experience integrating / using what you recommend.
I work on scientific analysis software. It provides a lot of tools for mathematical transformation of data. One tool allows the user to enter in their own equation, which is run over the data set (a large 2D or 3D matrix of values) and evaluated.
This tool has a graphical equation editor, which internally builds an object-oriented expression tree with a different object for each operation (there would be an instance of the Logarithm class, for example, which is the node in the tree for calculating a logarithm of a value to a base; it has two children which are its inputs). A screenshot of part of it:
You can see the tree it's building on the left, and a few of many (fifty?) potential operations in the menu on the right.
This has a few downsides:
- A graphical editor becomes clumsy for complex equations
- There are some operations that are difficult to represent graphically, such as creating large matrices (the kernel for an n x n convolution, for example)
- It only allows equations: there is no branching or other logic
It was neat when it was much simpler, but not any more, for the kind of stuff our users want to be able to do with it. If I wrote it now I'd do it quite differently - and this is my chance :)
I would like to give users something more powerful, and let them write code - script or compiled - that can perform much more advanced operations. I am seeking SO's advice on what technology this should use, or the best approach to take towards it.
The rest of this question is quite long - I'm sorry. I've tried to describe the problem in detail. Thanks in advance for reading it :)
Important constraints:
- Our math operates on large matrices. In the above equation, V1 represents the input (one of potentially many) and is 2D or 3D, and each dimension can be large: on the order of thousands or hundreds of thousands. (We rarely calculate all of this at once, just slices / segments. But if the answer involves something which requires marshalling the data, be aware that the size and speed of this is a consideration.)
- The operations we provide allow you to write, say, 2 x V, which multiplies every element in V by 2. The result is another matrix the same size. In other words, a scripting or programming language which includes standard math primitives isn't enough: we need to be able to control what primitives are available, or how they are implemented. These operations can be complex: the input can be as simple as a number (2, 5.3, pi) or as complex as a 1, 2 or 3-dimensional matrix, which contains numerical, boolean or complex (paired values) data. My current thinking is that we need a language powerful enough that we can expose our data types as classes and implement the standard operators on them. A simple evaluator won't be enough.
- Rather than just writing operations that are evaluated iteratively on one or more inputs to provide an output, as currently (which is easily implementable through an expression evaluator), I'd like the user to be able to: produce outputs of a different size from the inputs; call other functions; etc. For the host program, it would be useful to be able to ask the user's code what part or slice of the inputs will be required to evaluate a slice or part of the output. I think exposing some part of our classes and using an OO language is probably the best way to achieve these points.
- Our audience is primarily research scientists who either are not used to coding, or are probably used to a language like Matlab or R.
- We use Embarcadero C++ Builder 2010 for development, with small amounts of Delphi. This may restrict what we can make use of - just because something's C++, say, doesn't mean it will work if it's only been coded against VC++ or GCC. It also has to be suitable for use with commercial software.
- Our software currently has a COM interface, and part of the application can be automated with our app being the out-of-process COM server. We could add COM interfaces to some internal objects, or make a second COM framework specifically for this, if required.
- The 'tools', including this one, are being migrated to a multithreaded framework. The end solution needs to be able to be executed in any thread, and multiple instances of it in many threads at once. This may affect a hosted language runtime - Python 2.x, for example, has a global lock.
- It would be great to use a language that comes with libraries for math or scientific use.
- Backwards compatibility with the old expression tool is not important. This is version 2: clean slate!
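To make the operator and slice requirements above concrete, here is a minimal Python sketch of what "exposing our data types as classes" might look like from the user's side. Everything in it is hypothetical: `Matrix` is a toy stand-in for the host's real data classes, and `PairwiseRowSum` with its `output_shape`, `required_input_rows` and `evaluate` methods is an invented plugin interface, not an existing API. Python is used only for illustration; the same shape could be expressed in any OO language with operator overloading.

```python
from numbers import Number


class Matrix:
    """Hypothetical host-provided 2D matrix (toy stand-in for the real data classes)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def shape(self):
        return (len(self.rows), len(self.rows[0]) if self.rows else 0)

    def _apply(self, other, op):
        # Scalars (int, float, bool, complex) are applied to every element.
        if isinstance(other, Number):
            return Matrix([[op(a, other) for a in row] for row in self.rows])
        # Matrices are combined element by element and must match in shape.
        if isinstance(other, Matrix):
            if other.shape != self.shape:
                raise ValueError("shape mismatch")
            return Matrix([[op(a, b) for a, b in zip(r1, r2)]
                           for r1, r2 in zip(self.rows, other.rows)])
        return NotImplemented

    def __mul__(self, other):
        return self._apply(other, lambda a, b: a * b)

    __rmul__ = __mul__  # lets the user write 2 * V as well as V * 2

    def __add__(self, other):
        return self._apply(other, lambda a, b: a + b)

    __radd__ = __add__


class PairwiseRowSum:
    """Hypothetical user plugin: output row i = input row i + input row i+1."""

    def output_shape(self, input_shape):
        # The output has one row fewer than the input.
        rows, cols = input_shape
        return (max(rows - 1, 0), cols)

    def required_input_rows(self, out_start, out_stop):
        # The host asks: which input rows are needed for output rows
        # [out_start, out_stop)? One extra trailing row is required.
        return (out_start, out_stop + 1)

    def evaluate(self, v):
        # v holds exactly the input rows reported by required_input_rows.
        return Matrix(v.rows[:-1]) + Matrix(v.rows[1:])


V = Matrix([[1, 2], [3, 4], [5, 6]])
doubled = 2 * V  # every element multiplied by 2, same shape as V

plugin = PairwiseRowSum()
start, stop = plugin.required_input_rows(1, 2)  # host wants output row 1 only
partial = plugin.evaluate(Matrix(V.rows[start:stop]))
```

The point of the design is that the host, not the language, decides what `*` and `+` mean for its data, and that the host can evaluate one output slice at a time by first asking the plugin which input rows it needs.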
Current ideas:
- RemObjects Pascal Script and DWScript are languages easily bindable to TObject-derived classes. I don't know if it is possible to provide operator overloading.
- Hosting the .Net runtime, and loading C# (say) based DLLs as plugins. I rather like this idea: I've seen this done where the host program provides a syntax highlighter, debugging, etc. I gather it was a huge amount of coding, though. It would enable the use of IronPython and F# too.
- RemObjects Hydra looks like an interesting way of achieving this. Unfortunately it advertises itself for Delphi, not C++ Builder; I'm investigating compatibility.
- Hosting something like Python, which is doable from RAD Studio
- Providing a BPL interface, and letting users code directly against our program if they buy a copy of RAD Studio (ie, provide a plugin interface, and expose classes through interfaces; maybe require plugins be compiled with a binary-compatible version of our IDE)
- ...
Thanks for your input! I appreciate all answers even if they aren't quite perfect - I can research, I am just after pointers on where to go, and opinions (please, opinions with reasons included in the answer :p) on how to approach it or what might be suitable. Every answer, no matter how short, will be appreciated. But if you recommend something in detail rather than just "use language X" I'll be very interested in reading it :)
Cheers,
David
Updates:
The following have been recommended so far:
Python: 2.6 has a global lock, that sounds like a game-killer. 3 (apparently) doesn't yet have wide support from useful libraries. It sounds to me (and I know I'm an outsider to the Python community) like it's fragmenting a bit - is it really safe to use?
Lua: doesn't seem to be directly OO, but provides "meta-mechanisms for implementing features, instead of providing a host of features directly in the language". That sounds very cool from a programmer point of view, but this isn't targeted at programmers wanting cool stuff to play around with. I'm not sure how well it would work given the target audience - I think a language which provides more basics built in would be better.
MS script / ActiveScript. We already provide an external COM interface which our users use to automate our software, usually in VBScript. However, I would like a more powerful (and, frankly, better designed) language than VBS, and I don't think JScript is suited either. I am also uncertain of what issues there might be marshalling data over COM - we have a lot of data, often very specifically typed, so speed and keeping those types are important.
Lisp: I hadn't even thought of that language, but I know it has lots of fans.
Hosting .Net plugins: not mentioned by anyone. Is this not a good idea? You get C#, F#, Python... Does it have the same marshalling issues COM might? (Does hosting the CLR work through COM?)
A couple of clarifications: by "matrix", I mean matrix in the Matlab variable sense, ie a huge table of values - not, say, a 4x4 transformation matrix as you might use for 3D software. It's data collected over time, thousands and thousands of values often many times a second. We're also not after a computer algebra system, but something where users can write full plugins and write their own math - although the system having the ability to handle complex math, like a computer algebra system can, would be useful. I would take 'full language' over 'algebra' though if the two don't mix, to allow complex branches / paths in user code, as well as an OO interface.
According to your needs, here are some guidelines:
- Make a distinction between language and library - you can have mathematical languages (like MATLAB) or mathematical libraries called from a high-level language (like Python);
- The language (or library) should be designed by mathematicians, for mathematicians;
- The used language should be an existing one (do not reinvent the wheel);
- You should be able to share the script content with existing software;
- You should not start such a big complex project (math scripting) from scratch.
So I guess it could reduce the candidate list:
- JavaScript was not designed for (and is not used for) this kind of work;
- Delphi scripts (DWS or PascalScript) were made mostly for automation, not calculation (and are not widely used);
- I don't know why you are talking about using the Delphi IDE in the customer application, but you should not use the Delphi IDE for such proprietary development: a primitive custom IDE will be more productive for your users than a full RAD tool;
- Lua should perhaps be considered: you can make whatever you want with this script engine - but there is not a huge community of mathematicians using Lua, unlike Python...
In the Open Source world, you could find a lot of very interesting solutions. See http://blog.interlinked.org/science/open_source_math_programs.html
I guess that Octave could be considered. It's simple, powerful, mature, well known, used by a lot of software, and cross platform.
As far as I know, you can call the Octave library from C/C++ code. It could be done from Delphi IMHO, after translation of the associated .h files.
But be aware of the GPL licence. If your software is proprietary, it could be impossible to distribute Octave as a part of your software. You may still be able to use Octave or other GPL software from your application if you keep a clear separation between your software and the GPL software. Note that Python itself is not GPL: it is distributed under the permissive PSF license.
Embedding Python could be a good solution. This language can be called from Delphi, and it would give you a good architecture without needing to call C libraries like Octave directly. Python could be your main gate to all other calculation libraries from your Delphi application. For instance, Octave can be called from some Python libraries. You can also use Python scripts to automate your own application, and there are some Python IDEs written in Delphi around. The Open Source license of every component is safe for this use, of course. The more I think about it, the more I like this latter solution...
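As a rough illustration of the "Python as the main gate" idea, here is a minimal sketch of the Python side only: a host function that runs a user script against a namespace the host controls, so the script sees just the primitives and inputs it has been given. The names `make_namespace`, `run_user_script`, `USER_SCRIPT` and the convention that the script assigns to `result` are all assumptions made up for this example. In the real application the interpreter would be embedded from C++ or Delphi, which is not shown here, and emptying `__builtins__` limits what names are visible but is not a security sandbox.

```python
import math


def make_namespace(inputs):
    """Build the names a user script may see (hypothetical host API)."""
    # Start with no builtins, so only what the host adds is visible.
    # Assumption: this controls the vocabulary; it is NOT a security boundary.
    namespace = {"__builtins__": {}}
    # Primitives the host chooses to expose.
    namespace.update({
        "log10": math.log10,
        "sqrt": math.sqrt,
        "pi": math.pi,
        "abs": abs,
        "len": len,
        "range": range,
    })
    # Input data, e.g. V1, supplied by the host for this run.
    namespace.update(inputs)
    return namespace


def run_user_script(source, inputs):
    """Execute user code and return whatever it assigned to `result`."""
    namespace = make_namespace(inputs)
    exec(compile(source, "<user script>", "exec"), namespace)
    if "result" not in namespace:
        raise ValueError("user script did not assign to 'result'")
    return namespace["result"]


# A user script with branching, which the graphical equation editor cannot express.
USER_SCRIPT = """
result = []
for x in V1:
    if x > 0:
        result.append(log10(x))
    else:
        result.append(0.0)
"""

output = run_user_script(USER_SCRIPT, {"V1": [100.0, -5.0, 1.0]})
```

Because each call builds its own namespace, separate runs do not share variables, which is one of the things a multithreaded host would need; whether the interpreter itself can run those calls in parallel is a separate question.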
Just my two cents. :)
No definitive answer, but a few other suggestions:-
- Have a look at the LMD Innovative ScriptPack, which supports native Pascal scripting and also ActiveScripting-based languages. Caveat: I use a lot of the LMD tools and components, but I haven't personally used ScriptPack. LMD also have an IDE-Tools package, which can really simplify the task of making a simple custom 'RAD' tool if you need to go that route.
- Another vote for Lua. I've used Lua as a script language in C++Builder 2010 apps, and it works well. You can leverage the C++Builder/Delphi RTTI to help the integration between Lua script and your C++ code.
Re. Lua: We have a product which for many years had an ultrasimple 'homebrew' scripting system within it. No loops, conditionals or procedures - just a sequence of parameterized commands. We wanted to extend this to something more powerful, and picking a third-party solution seemed a lot less pain than reinventing the wheel. Primary reasons for choosing Lua for this were:-
- Fast
- Published books available (Programming in Lua)
- Written in C
- Directly embeddable within our project via static linking
- MIT License
- C++ code can invoke Lua code and access Lua variables
- Lua code can call C++ functions
- Small deployment footprint. Lua and its standard libraries added under 200K to our .EXE, before compression.
I'm sure other languages could have been equally good, but it was the 'lightweight' nature of Lua that tipped it for me.
I like many of the answers here and, well, I am a biased Delphi nerd :) but I would suggest you use a combination: RO Pascal Script + ESBPCS for VCL.
I don't know if this sounds like you - but I would give it a go.
From the website, I extracted this link about the matrix non visual part of the library. There are many more, you might want to give that a go!