Suggestions for writing a programming language? [closed]

What tips can you give someone who wants to write a programming or scripting language? I am not asking how to program or how to design a compiler, but how to develop one quickly using tools and code generators.

Last time I tried, I coded it in C++, and the states and syntax took almost as long as writing the actual logic. I know the following tools would help.

I was thinking I could generate C++ code and have gcc compile that. Using the tools above, how long would you estimate it would take to write a programming or scripting language?


Variations on this question have been asked repeatedly, as far back as "Learning to write a compiler". Here is an incomplete list of SO resources on the topic.


Solution 1:

How long something like that might take depends on many different factors. For example, an experienced programmer can easily knock out a simple arithmetic expression evaluator in a couple of hours, with unit tests. But a novice programmer may have to learn about parsing techniques, recursive descent, abstract representation of expression trees, tree-walking strategies, and so on. This could easily take weeks or more, just for arithmetic expressions.
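To give a sense of scale, here is a minimal sketch of the kind of arithmetic expression evaluator described above, written as a hand-rolled recursive descent parser in Python. The grammar, the `Parser` class, and the `tokenize` and `evaluate` names are my own choices for illustration, not anything prescribed by a tool or book; it evaluates while parsing rather than building a tree.

```python
import re

# Tokens: numbers, the four operators, and parentheses.
TOKEN_RE = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Assumed grammar (one method per rule):
         expr   := term (('+' | '-') term)*
         term   := factor (('*' | '/') factor)*
         factor := NUMBER | '(' expr ')' | '-' factor
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":
            return -self.factor()
        if token in ("+", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

def evaluate(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

Under these assumptions, `evaluate("1 + 2 * 3")` returns `7.0` and `evaluate("(1 + 2) * 3")` returns `9.0`. Operator precedence falls out of the layering of `expr`, `term`, and `factor`, which is the main idea a newcomer has to absorb.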

However, don't let that discourage you. As Jeff and Joel were discussing with Eric Sink on a recent Stack Overflow podcast, writing a compiler is an excellent way to learn about many different aspects of programming. I've built a few compilers and they are among my most memorable programming projects.

Some classic books on building compilers are:

  • Compilers: Principles, Techniques, and Tools (also known as The Dragon Book)
  • Structure and Interpretation of Computer Programs (also known as SICP)
  • Algorithms + Data Structures = Programs

Solution 2:

Dave Hanson, who with Chris Fraser spent 10 years building one of the world's most carefully crafted compilers, told me once that one of the main things he learned from the experience was not to try to write a compiler in C or C++.

If you want to develop something quickly, don't generate native code; target an existing virtual machine such as the CLR, JVM, or the Lua virtual machine. Generate code using maximal munch.
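For readers who have not met maximal munch: it is a greedy tree-tiling approach to instruction selection. At each node of the intermediate tree you pick the largest instruction pattern ("tile") that matches, then recurse into the subtrees the tile did not cover. The sketch below is only an illustration under heavy assumptions: the tuple-based IR (`const`, `temp`, `add`, `mem`) and the instruction names (`LI`, `ADD`, `ADDI`, `LOAD`) belong to a made-up register machine, not to the CLR, JVM, or Lua VM.

```python
import itertools

def select(tree):
    """Return (instructions, result_register) for a tuple-based IR tree.

    Hypothetical IR nodes:
      ("const", n), ("temp", name), ("add", left, right), ("mem", address)
    """
    counter = itertools.count(1)
    code = []

    def fresh():
        # Allocate a new virtual register name.
        return f"r{next(counter)}"

    def munch(node):
        kind = node[0]
        if kind == "temp":
            return node[1]  # already lives in a register, no code needed
        if kind == "const":
            rd = fresh()
            code.append(f"LI {rd}, {node[1]}")
            return rd
        if kind == "mem":
            addr = node[1]
            # Largest tile first: mem(add(e, const)) covers three nodes.
            if addr[0] == "add" and addr[2][0] == "const":
                base = munch(addr[1])
                rd = fresh()
                code.append(f"LOAD {rd}, {addr[2][1]}({base})")
                return rd
            # Fallback tile: plain mem(e).
            base = munch(addr)
            rd = fresh()
            code.append(f"LOAD {rd}, 0({base})")
            return rd
        if kind == "add":
            left, right = node[1], node[2]
            # Two-node tile: add(e, const) becomes an add-immediate.
            if right[0] == "const":
                rs = munch(left)
                rd = fresh()
                code.append(f"ADDI {rd}, {rs}, {right[1]}")
                return rd
            rs1 = munch(left)
            rs2 = munch(right)
            rd = fresh()
            code.append(f"ADD {rd}, {rs1}, {rs2}")
            return rd
        raise ValueError(f"unknown IR node {kind!r}")

    result = munch(tree)
    return code, result
```

For the tree `("add", ("mem", ("add", ("temp", "fp"), ("const", 8))), ("const", 1))`, this sketch emits `LOAD r1, 8(fp)` followed by `ADDI r2, r1, 1`: two instructions for a five-node tree, where covering each node with its own instruction would take five.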

Another good option if you're writing an interpreter is just to use the memory management and other facilities of your underlying programming language. Parse to an AST and then interpret by tree walk of the AST. This will get you off the ground fast. Performance is not the greatest, but it's acceptable. (Using this technique I once wrote a PostScript interpreter in Modula-3. The first implementation took a week and although it later underwent some performance tuning, primarily in the lexer, it never had to be replaced.)
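As a rough sketch of the AST-plus-tree-walk approach, here is a tiny interpreter in Python. The node classes (`Num`, `Var`, `BinOp`, `Assign`, `Block`) and the `interpret` function are hypothetical names chosen for this example, and the parser that would build the tree is omitted. Note how much is borrowed from the host language: Python's garbage collector manages the nodes, and a plain `dict` serves as the variable environment.

```python
from dataclasses import dataclass

@dataclass
class Num:
    value: float

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

@dataclass
class Assign:
    name: str
    expr: object

@dataclass
class Block:
    statements: list

# Binary operators are delegated to the host language.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def interpret(node, env):
    """Evaluate an AST node by recursively walking the tree."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        if node.name not in env:
            raise NameError(f"undefined variable {node.name!r}")
        return env[node.name]
    if isinstance(node, BinOp):
        return OPS[node.op](interpret(node.left, env), interpret(node.right, env))
    if isinstance(node, Assign):
        env[node.name] = interpret(node.expr, env)
        return env[node.name]
    if isinstance(node, Block):
        # A block evaluates to the value of its last statement.
        result = None
        for statement in node.statements:
            result = interpret(statement, env)
        return result
    raise TypeError(f"unknown AST node {type(node).__name__}")

# Hand-built AST for:  x = 2; y = x + 3; x * y
program = Block([
    Assign("x", Num(2)),
    Assign("y", BinOp("+", Var("x"), Num(3))),
    BinOp("*", Var("x"), Var("y")),
])
```

Calling `interpret(program, {})` walks the tree and returns `10`. Adding a new language feature in this style usually means one new node class and one new branch in `interpret`, which is a large part of why it gets you off the ground fast.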

Avoid LALR parser generators; use something that saves your time, like ANTLR or the Elkhound GLR parser generator.