How about using Elasticsearch (https://www.elastic.co/)?
It is built on top of Apache Lucene and has a powerful JSON-based query system.
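
To give a rough idea of what a JSON-based query looks like, here is a minimal sketch that builds a simple "match" query body in Python. The field name "content" and the search text are made up for illustration, and nothing is sent to a server; the printed body is what you would POST to an index's _search endpoint.

```python
import json


def build_match_query(field, text, size=10):
    """Build an Elasticsearch query DSL body for a full-text match on one field."""
    return {
        "query": {"match": {field: text}},  # full-text match on a single field
        "size": size,                        # maximum number of hits to return
    }


# "content" is a hypothetical field name, not taken from any real index.
body = build_match_query("content", "mailing list archive", size=5)
print(json.dumps(body, indent=2))
```

Because the query is plain JSON, any HTTP client can send it, so the search front end would not be tied to one language.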

On Fri, Sep 4, 2015 at 11:31 PM, Zach Villers <greyman@zanshin.xyz> wrote:
Here are a few possibilities I've found so far:

Apache Solr - http://lucene.apache.org/solr/

      Features - "Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called 'indexing') via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results."

      It's a Java servlet, which may be a mark against it. There are Docker images for it. To try out the Velocity-based search interface, move the Velocity .jar files into the server library, use the techproducts solrconfig.xml, and start the server with bin/solr start -e techproducts. You can use wget to mirror a site into the exampledocs directory, then index the HTML files with the bin/post command.
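
      Once documents are indexed, querying over HTTP GET could look roughly like the sketch below. It only builds the request URL and does not contact a server. The host and port are assumed to be the local default (localhost:8983), the core name is the techproducts example started above, and the query string is made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_solr_select_url(base_url, core, query, rows=10, fmt="json"):
    """Build a Solr /select URL for a simple query against one core."""
    params = {
        "q": query,   # the search query
        "rows": rows,  # maximum number of results to return
        "wt": fmt,     # response writer, e.g. json, xml or csv
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local default address; adjust for a real deployment.
url = build_solr_select_url(
    "http://localhost:8983/solr", "techproducts", "name:memory", rows=5
)
print(url)
```

      Fetching that URL with a browser, curl, or urllib should return the matching documents as JSON if the example server is running.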

Apache Spark - http://spark.apache.org/

    "Apache Spark™ is a fast and general engine for large-scale data processing."

     Here is a use case from someone using it as a search engine: http://www.maana.io/company/press-release/the-first-and-only-big-data-search-engine-powered-by-apache-spark/

     From a quick scan of the docs, you can use it from Java or Python.

OpenSearchServer - http://www.opensearchserver.com/

     Also runs on Java - I haven't tried it out yet.

Sphinx - Free open-source SQL full-text search engine -

     (Haven't tried this either)

Here is a link to a recent tech blog with a few other possibilities: http://www.mytechlogy.com/IT-blogs/8685/tech-blogs-top-5-open-source-search-engines/#.VenZa7OYphE

irc #aikidouke
infrastructure mailing list