Wednesday, April 23, 2008

Minion - a new Java search engine from Sun Labs

It seems that a new open source (GPLv2) search engine for Java will be released soon. It is called Minion, and Stephen Green (aka The Search Guy) has just started blogging about it. He has already published a few posts, and Minion seems to have some interesting concepts:
  • For example, it can store a Date as a Date (as opposed to Lucene, which can store a Date only as a String).
  • It also seems to be convenient for stream indexing (the indexer is stateful).
  • Every document in the index has a unique key that never changes, so updates of documents are handled automatically under the hood (as far as I know, Lucene cannot offer such luxury).
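
To make the last point more concrete, here is a minimal sketch of the two behaviours in plain Java. It uses neither Minion nor Lucene: the classes AppendOnlyIndex and KeyedIndex and the field name unique_key are hypothetical, invented only for illustration, and Minion's real API may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index: NOT Minion's or Lucene's API, just the two behaviours side by side.
class KeyedIndexSketch {

    // Append-only store: the key is an ordinary field managed by the caller,
    // so nothing stops two documents from carrying the same key.
    static final class AppendOnlyIndex {
        private final List<Map<String, String>> docs = new ArrayList<>();

        void add(String key, String body) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("unique_key", key);
            doc.put("body", body);
            docs.add(doc);
        }

        // Emulated update: delete every document with this key, then add the new one.
        void update(String key, String body) {
            docs.removeIf(d -> key.equals(d.get("unique_key")));
            add(key, body);
        }

        int size() {
            return docs.size();
        }
    }

    // Keyed store: the key is part of the index contract, so indexing an
    // existing key replaces the old document instead of duplicating it.
    static final class KeyedIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        void index(String key, String body) {
            docs.put(key, body);
        }

        String get(String key) {
            return docs.get(key);
        }

        int size() {
            return docs.size();
        }
    }

    public static void main(String[] args) {
        AppendOnlyIndex appendOnly = new AppendOnlyIndex();
        appendOnly.add("doc-1", "first version");
        appendOnly.add("doc-1", "second version");
        System.out.println("append-only, after two adds: " + appendOnly.size());
        appendOnly.update("doc-1", "third version");
        System.out.println("append-only, after update: " + appendOnly.size());

        KeyedIndex keyed = new KeyedIndex();
        keyed.index("doc-1", "first version");
        keyed.index("doc-1", "second version");
        System.out.println("keyed, after two index calls: " + keyed.size());
        System.out.println("keyed, current body: " + keyed.get("doc-1"));
    }
}
```

With the append-only store the caller has to remember to go through update every time, otherwise duplicates creep in. With the keyed store there is only one way to write a document, which is the convenience Minion seems to promise.
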
References:
[1] http://blogs.sun.com/searchguy/entry/minion_an_open_source_search1
[2] http://blogs.sun.com/searchguy/entry/before_i_dive_in_some
[3] http://blogs.sun.com/searchguy/entry/the_rest_of_the_story

2 comments:

Lukas Zapletal said...

Point 3: you can do this with Lucene too. You just need a unique key, which you can create easily.

Lukas said...

Lukas, you are right, it is possible to create an artificial field holding a unique_key value. But Minion seems to do this automatically, and what I see as a possible benefit is that it prevents you from creating two documents with the same unique_key. In such a case it will automatically update the document (there is no notion of update in Lucene).

Anyway, Minion hasn't been open sourced yet, so I am speculating a bit :-)