Re: Desktop search tool using lucene

Tuesday, 28 June 2005

Mike MacCana wrote:

...
>
>They (meaning engineers at Red Hat) are discussing this. The solution
>won't use Lucene, as Lucene treats all file content as equal - i.e., it
>doesn't know about headings being different from body text and so on.
>
>Mike
>

    Also, Lucene suffers from the Java UTF-16 scandal: they chose a
16-bit character encoding which is good for Japanese, but bulks up European
languages by a factor of two, and a single 16-bit unit doesn't cover enough
characters to do a good job with Chinese.
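
    To put rough numbers on the encoding point, here is a minimal Python
sketch that compares the raw byte size of a few strings in UTF-8 and UTF-16.
The sample strings and the helper name encoded_sizes are my own illustration,
and this only measures plain encoded text; it says nothing about how Lucene
or Xapian actually lay out their indexes.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for a string, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample strings, chosen only for illustration.
samples = {
    "English": "desktop search",
    "Japanese": "デスクトップ検索",
    # A CJK Extension B ideograph: outside the 16-bit range, so UTF-16
    # needs a surrogate pair (two 16-bit units) to represent it.
    "rare CJK": "\U00020000",
}

for name, text in samples.items():
    utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{name}: {len(text)} code points, "
          f"{utf8_bytes} bytes UTF-8, {utf16_bytes} bytes UTF-16")
```

    ASCII text doubles in UTF-16, the Japanese sample is smaller in UTF-16
than in UTF-8, and the rare ideograph no longer fits in one 16-bit unit.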

    Because of this, Lucene loses a factor of two in performance
compared to C++ competitors such as Xapian, which is a minus for those
who care about performance on computers that aren't monster servers with
8 gigs of RAM and Ultra 320 disks.  (Funnily enough, we're not all that
happy with Lucene performance on such a machine...  But we've got a lot
of text...)
