Google Firestore: Query on substring of a property value (text search)
I agree with @Kuba's answer, But still, it needs to add a small change to work perfectly for search by prefix. here what worked for me
For searching records starting with name queryText
collectionRef.where('name', '>=', queryText).where('name', '<=', queryText+ '\uf8ff')
.
The character \uf8ff
used in the query is a very high code point in the Unicode range (it is a Private Usage Area [PUA] code). Because it is after most regular characters in Unicode, the query matches all values that start with queryText
.
There's no such operator, allowed ones are ==
, <
, <=
, >
, >=
.
You can filter by prefixes only, for example for everything that starts between bar
and foo
you can use
collectionRef.where('name', '>=', 'bar').where('name', '<=', 'foo')
You can use external service like Algolia or ElasticSearch for that.
Full-Text Search, Relevant Search, and Trigram Search!
UPDATE - 2/17/21 - I created several new Full Text Search Options.
See Fireblog.io for details.
Also, side note, dgraph now has websockets for realtime... wow, never saw that coming, what a treat! Slash Dgraph - Amazing!
--Original Post--
A few notes here:
1.) \uf8ff
works the same way as ~
2.) You can use a where clause or start end clauses:
ref.orderBy('title').startAt(term).endAt(term + '~');
is exactly the same as
ref.where('title', '>=', term).where('title', '<=', term + '~');
3.) No, it does not work if you reverse startAt()
and endAt()
in every combination, however, you can achieve the same result by creating a second search field that is reversed, and combining the results.
Example: First you have to save a reversed version of the field when the field is created. Something like this:
// collection
const postRef = db.collection('posts')
async function searchTitle(term) {
// reverse term
const termR = term.split("").reverse().join("");
// define queries
const titles = postRef.orderBy('title').startAt(term).endAt(term + '~').get();
const titlesR = postRef.orderBy('titleRev').startAt(termR).endAt(termR + '~').get();
// get queries
const [titleSnap, titlesRSnap] = await Promise.all([
titles,
titlesR
]);
return (titleSnap.docs).concat(titlesRSnap.docs);
}
With this, you can search the last letters of a string field and the first, just not random middle letters or groups of letters. This is closer to the desired result. However, this won't really help us when we want random middle letters or words. Also, remember to save everything lowercase, or a lowercase copy for searching, so case won't be an issue.
4.) If you have only a few words, Ken Tan's Method will do everything you want, or at least after you modify it slightly. However, with only a paragraph of text, you will exponentially create more than 1MB of data, which is bigger than firestore's document size limit (I know, I tested it).
5.) If you could combine array-contains (or some form of arrays) with the \uf8ff
trick, you might could have a viable search that does not reach the limits. I tried every combination, even with maps, and a no go. Anyone figures this out, post it here.
6.) If you must get away from ALGOLIA and ELASTIC SEARCH, and I don't blame you at all, you could always use mySQL, postSQL, or neo4Js on Google Cloud. They are all 3 easy to set up, and they have free tiers. You would have one cloud function to save the data onCreate() and another onCall() function to search the data. Simple...ish. Why not just switch to mySQL then? The real-time data of course! When someone writes DGraph with websocks for real-time data, count me in!
Algolia and ElasticSearch were built to be search-only dbs, so there is nothing as quick... but you pay for it. Google, why do you lead us away from Google, and don't you follow MongoDB noSQL and allow searches?
While Kuba's answer is true as far as restrictions go, you can partially emulate this with a set-like structure:
{
'terms': {
'reebok': true,
'mens': true,
'tennis': true,
'racket': true
}
}
Now you can query with
collectionRef.where('terms.tennis', '==', true)
This works because Firestore will automatically create an index for every field. Unfortunately this doesn't work directly for compound queries because Firestore doesn't automatically create composite indexes.
You can still work around this by storing combinations of words but this gets ugly fast.
You're still probably better off with an outboard full text search.