How would I implement a simple site search with php and mySQL?

Everyone is suggesting MySQL fulltext search, however you should be aware of a HUGE caveat. The Fulltext search engine is only available for the MyISAM engine (not InnoDB, which is the most commonly used engine due to its referential integrity and ACID compliance).

So you have a few options:

1. The simplest approach is outlined by Particle Tree. You can actaully get ranked searches off of pure SQL (no fulltext, no nothing). The SQL query below will search a table and rank results based off the number of occurrences of a string in the search fields:

SELECT
    SUM(((LENGTH(p.body) - LENGTH(REPLACE(p.body, 'term', '')))/4) +
        ((LENGTH(p.body) - LENGTH(REPLACE(p.body, 'search', '')))/6))
    AS Occurrences
FROM
    posts AS p
GROUP BY
    p.id
ORDER BY
    Occurrences DESC

edited their example to provide a bit more clarity

Variations on the above SQL query, adding WHERE statements (WHERE p.body LIKE '%whatever%you%want'), etc. will probably get you exactly what you need.

2. You can alter your database schema to support full text. Often what is done to keep the InnoDB referential integrity, ACID compliance, and speed without having to install plugins like Sphinx Fulltext Search Engine for MySQL is to split the quote data into it's own table. Basically you would have a table Quotes that is an InnoDB table that, rather than having your TEXT field "data" you have a reference "quote_data_id" which points to the ID on a Quote_Data table which is a MyISAM table. You can do your fulltext on the MyISAM table, join the IDs returned with your InnoDB tables and voila you have your results.

3. Install Sphinx. Good luck with this one.

Given what you described, I would HIGHLY recommend you take the 1st approach I presented since you have a simple database driven site. The 1st solution is simple, gets the job done quickly. Lucene will be a bitch to setup especially if you want to integrate it with the database as Lucene is designed mainly to index files not databases. Google custom site search just makes your site lose tons of reputation (makes you look amateurish and hacked), and MySQL fulltext will most likely cause you to alter your database schema.


Use Google Custom Site Search. I've heard they know a thing or two about searching.


Stackoverflow plans to use the Lucene search engine. There is a PHP port of this written for the Zend Framework but can be downloaded as a separate entity without needing all the ZF bloat. This is called Zend_Search_Lucene, documentation for which can be found here.