One of the really cool things about eXist is its set of XPath extensions for fulltext searching. They mimic, in XPath, what XStreamDB does via XQuery.
I can do stuff like:
document(*)//text() &= "*image*"
and eXist will return any XML document (from its entire set of collections) that contains the string “image” somewhere in it (could be in /lom/general/title/langstring/Images Of Bangalore, or /lom/technical/format/image/jpeg, or whatever). It doesn’t care. And it’s very fast.
What’s more, I can do stuff like:
document(*)/*[ //format &= "*image*" and //text() &= "*earth*"]
which says “find me XML documents that have “image” somewhere in a “format” element (could be, say, /lom/technical/format), and contain the string “earth” somewhere (like, say, /lom/general/title/langstring/Earth At Night or /lom/general/title/langstring/Earthquakes)”.
I can also do something like:
document(*)//text() &= "*image* *kyoto*"
which will give me different results than
document(*)//text() &= "*image* *kyoto* *relig*"
because the second query also requires a match on “relig” (religion, religious, whatever). In this case, it returns a Buddhist temple in Kyoto, as opposed to the Kyoto Accord presentations at the University of Calgary, which are returned by the query before it.
The queries based on the fulltext extensions (using the &= operator to indicate “boolean and”; you can also use the |= operator to indicate “boolean or”) are amazingly fast. I’m getting results from rather complicated test queries on the entire 3600+ CAREO record set in a fraction of a second. Nice.
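To make the “boolean and” versus “boolean or” behaviour a bit more concrete, here is a rough Python sketch of how I think about the matching. It is only a toy model, not how eXist actually implements its index: the function names (tokenize, matches_all, matches_any) and the three sample records are made up for illustration, and I'm assuming matching is case-insensitive and done word by word with * as a wildcard.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenization)."""
    return re.findall(r"\w+", text.lower())


def matches_all(text, query):
    """Toy model of "boolean and": every pattern must match some word."""
    words = tokenize(text)
    patterns = query.lower().split()
    return all(any(fnmatchcase(w, p) for w in words) for p in patterns)


def matches_any(text, query):
    """Toy model of "boolean or": at least one pattern must match some word."""
    words = tokenize(text)
    patterns = query.lower().split()
    return any(any(fnmatchcase(w, p) for w in words) for p in patterns)


# Hypothetical sample records, NOT real CAREO data.
records = {
    "temple": "Image of a Buddhist temple in Kyoto, a religious site",
    "accord": "Kyoto Accord presentation with image slides",
    "earth": "Earth At Night",
}


def search(match, query):
    """Return the sorted keys of the sample records that satisfy the query."""
    return sorted(key for key, text in records.items() if match(text, query))


two_terms = search(matches_all, "*image* *kyoto*")
three_terms = search(matches_all, "*image* *kyoto* *relig*")
either_term = search(matches_any, "*earth* *relig*")

print(two_terms)    # ['accord', 'temple']
print(three_terms)  # ['temple']
print(either_term)  # ['earth', 'temple']
```

Adding a third term to the “and” query can only narrow the result set, which is the same effect as the Kyoto example above. The “or” version goes the other way and widens it.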