Comparing SQL and Prolog
I've started learning Prolog and wondering about the theoretical difference from the SQL language.
For example:
- both are declarative languages
- both support a fact-driven knowledge database
- both support question-styled data-retrieving
- both support functional dependencies
Any more common points? Any notable differences?
Most of the (earlier) answers here are a reflection of the fact that most people do not know what SQL is (its an implementation of Relational Calculus) or what that means (that it's a form of Predicate Logic). The following statements are true of both Prolog and SQL:
- they are both logic-driven
- they can both store, express and use relations (logical relationships in Prolog)
- they can both store and express complex logical conditions
- they both have facts(data in SQL) and can derive conclusions from those facts
- they both have queries, that do in fact mean the same thing
- they both have data(facts in Prolog) and use them similarly
- they are both programming languages
- they are both turing-complete (though it is somewhat difficult to access this in both of them)
- etc, etc..
Generally, people are not aware of these equivalences between them:
- "Facts" and "Data" are the same thing. This is straight out of Codd's original paper.
- A "Relation" in Relational Theory is the same thing as a "Table" in SQL, is the same thing as a Relation or relational function in Predicate Logic and is the same thing as a tuple-set in Set Theory
- An aliased table-expression (i.e., a View, etc.) in SQL is the same thing as a Rule in Prolog.
So what are their differences? Although they operate across the same conceptual domains, their focuses are in completely different directions. In Prolog terms, SQL is primarily a Fact and Relation(set) engine, whereas Prolog is primarily a Rules and Inferencing engine. Each can do the other, to a limited extent, but it becomes increasingly difficult with even small increases in complexity. For instance, you can do inferencing in SQL, but it is almost entirely manual in nature and not at all like the automatic forward-inferencing of Prolog. And yes, you can store data(facts) in Prolog, but it is not at all designed for the "storage, retrieval, projection and reduction of Trillions of rows with thousands of simultaneous users" that SQL is.
Plus, SQL is primarily a Server-language paradigm, whereas Prolog is primarily a Client-language paradigm.
You are correct: Prolog and SQL are theoretically related (you asked specifically about theoretical differences).
I want to complement RBarryYoung's answer, giving you some hints to understand the connection, so that you have a starting point to study and understand the technicalities.
Prolog and SQL share a core: every query expressible in a subset of Prolog can be expressed in a subset of SQL and viceversa, i.e. these subsets are logically equivalent.
To understand how this can be true, you need to examine on what theoretical underpinnings both Prolog and SQL are based:
- SQL1 is a mix of different parts, not always well integrated, apparently from both Relational Algebra (RA) and Tuple Relational Calculus (TRC)2 and other parts, not related to logic (i.e. SUM, AVG operators, ORDER BY, and so on).
- RA is equivalent in expressive power to safe (domain-independent) TRC (this is known as Codd's theorem).
- RA is equivalent in expressive power to safe Datalog without recursion and with (stratified) negation.
- Datalog can be considered a "loose subset" of Prolog3; a "loose subset" in the sense that there are complications w.r.t. the operational semantics of Prolog: "..ordering of the clauses is irrelevant in Datalog in contrast to Prolog which depends on the ordering of clauses for computing the result of the query call" (citing from here).
Of course something out of these subsets needs more translation effort.
Nonetheless, I think the claim that equivalence in expressive power of the two subsets is more than an appeal to Turing equivalence4 when considering Prolog-to-SQL translation.
Notes:
1) Unfortunately SQL can be used in contrast to RDBMS theoretical foundations (relational algebra-calculi); for example, SQL tables are not necessarily relations - as per RA - i.e. they can be without a (primary) key, so duplicate rows are permitted. Such tables are not sets but multisets (aka bags) of tuples. In such context, all theoretical results for RA, where relations are sets, are not necessarily valid.
2) For a translation from SQL to TRC, see A note on the translation of SQL to tuple calculus, also here (postscript paper).
3) For more on the differences between Datalog and Prolog see What You Always Wanted to Know About Datalog (And Never Dared to Ask) (pdf paper - links directly to page 6, heading H. Datalog and Prolog).
4) For the record: RA (and so their equivalents safe TRC and safe Datalog w/o recursion) is not Turing complete on purpose, to avoid never ending queries.
Historical note: Prolog and Codd's Relational Algebra were conceived around the same time (end of '60s early '70s) in different contexts - Colmerauer conceived Prolog for natural language processing and Codd conceived RA as a theoretical foundation of Relational DBMS. So, any theoretical connection between Prolog-Datalog-RA-SQL was necessarily established a posteriori and is implicit in the fact that they are all based on first-order predicate calculus (aka first order logic).
Prolog and SQL are both based on first order logic, but SQL tables are simple binary relations, whereas Prolog predicates are Horn clauses.
This is not some obscure theoretical point. SQL binary relations are statements of fact, of the form:
f(A, B, C ... N)
Where f is the name of the relation and A...N its variables. Prolog binary relations are implications, of the form:
A <- B, C, D ... N
Where A ... N are themselves Horn clauses. SQL relations are an efficient way to describe data. Prolog relations describe complex relationships between data, themselves stored as data.
It is important to understand that in Prolog, there is no separation between data and operation. Prolog facts, rules and queries are all Horn clauses, therefore data, something that gets lost in translation in most university courses. Prolog is not like C but with facts instead of variables and rules instead of functions. On the other hand, SQL is a lot like Prolog without the rules or queries.
SQL queries are also logic predicates, but SQL queries are not stored in the database itself. Rather, they are used to extract data sets from the database of facts. You could store a query as a table row in a SQL database but you couldn't execute it in that form.
Prolog queries are stored in the database like any other Prolog predicate, because they are like any other Prolog predicate. Queries are Horn clauses of the form:
<- B, C, D ... N
So implications with a precedent but no antecedent, therefore always false. Facts are Horn clauses with an antecedent but no precedent, of the form:
A <-
So always true. Prolog proves a query by refutation: if it can't find a fact (or rule) that proves it, it will state the goal is true, since a query is always false. In the process, some variables are bound so result sets can be constructed, much like SQL does with SELECT queries.
Modern SQL DBMSs have features like stored procedures and flow control language so SQL can be used for inference (not that you want to do inference in SQL). Prolog comes ready with an inference engine tuned to its database of Horn clauses, because that's an efficient way to do inference over databases of facts stated as binary relations (and no, not just because it's pretty).
Prolog's homoiconic nature (data is operation is data) means that new new data has to be added to the database, therefore to the program, because the database is the program. So everytime a new fact is added to its database (typically using assert/1) the whole program must be decompiled. This is a huge PITA and makes Prolog inefficient for storing large data sets, though there is no reason why it has to be inefficient at data retrieval and Prolog systems will use the same algorithms as SQL systems for that purpose. SQL on the other hand is well suited for both storage and retrieval.
Finally, Prolog has a few features that SQL just doesn't have, namely its super-pattern-matching called unification, negation as failure and syntactic elements that facilitate list processing and grammar declaration (Definite Clause Grammar notation). Those are just a happy accident and were mostly added to the language because they were en vogue in the time it was first created (thanks to LISP). SQL got recursive queries relatively recently so Prolog can't boast that anymore.
And of course both languages are weak at I/O and maths, although at least you can do some arithmetic in Prolog without having to pull your hair out all the way.
So, really, Prolog and SQL are as much alike as C and Haskell. They're both based on the same root abstraction, first order logic (like C and Haskell are both based on algebra) but things get very different pretty quickly after that. Also, from the point of view of language design, SQL tends to be fractured, with many different language features (pedicates, queries, data manipulation language....). Prolog's design is extremely consistent, so that the whole language is really just predicates and a few punctuation marks.
For me, the most important difference is this: I don't like SQL but I have to work with it. I love Prolog, but I can't use it at work. Life is unfair :)
There are many differences which I think become clear when you start using them. Remember because of changes in terminology, somethings which are called the same thing in the past mean very different things now.
Very broad overview of difference.
SQL statements work against a relational database and query (ask for) data from that database, changes to that data and the results are exactly expressed in the language, whereas in Prolog you define facts and a logic engine generates new facts based off of the existing facts. New data (facts) are created via evaluation.
They both use something called queries (but they work totally differently) and they both have data (but use it differently.)
The use cases for SQL and Prolog are also totally different. It would never make sense to store an address list in Prolog whereas that is exactly what SQL was designed to do.
Simply put SQL is used to access a data store and Prolog is an expression evaluator.