I had a full-day client briefing today in St. Louis on Big Data analytics for enterprise research. During the briefing, architects and developers from both sides mentioned Lucene more than a dozen times as a useful tool for text indexing, even at the enterprise level.
For an open-source project, it says a lot about Lucene that two 100-year-old companies agree on its importance.
So I am now adding a second Apache project (the first being Apache Hadoop) to the jTool catalog.
Overview of Apache Lucene
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially one that must run cross-platform.
Lucene offers powerful features through a simple API:
Scalable, High-Performance Indexing
- over 95GB/hour on modern hardware
- small RAM requirements -- only 1MB heap
- incremental indexing as fast as batch indexing
- index size roughly 20-30% the size of text indexed
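To get a feel for what these figures mean in practice, here is a rough back-of-the-envelope sketch in Java. It only restates the 95GB/hour and 20-30% numbers from the list above as arithmetic; the 500 GB corpus and the class and method names are my own made-up example, not anything from Lucene, and real throughput will depend on hardware and on the documents being indexed.

```java
// Back-of-the-envelope sizing from the figures quoted in the list above.
// Assumption: the 500 GB corpus is a hypothetical example value.
final class IndexEstimate {
    static final double GB_PER_HOUR = 95.0; // "over 95GB/hour" on modern hardware
    static final double MIN_RATIO = 0.20;   // index is roughly 20-30% of the text size
    static final double MAX_RATIO = 0.30;

    /** Hours needed to index the corpus at exactly 95 GB/hour (an upper bound if the rate is higher). */
    static double hoursToIndex(double corpusGb) {
        return corpusGb / GB_PER_HOUR;
    }

    /** Returns {low, high}: the estimated index size range in GB. */
    static double[] indexSizeGb(double corpusGb) {
        return new double[] { corpusGb * MIN_RATIO, corpusGb * MAX_RATIO };
    }

    public static void main(String[] args) {
        double corpusGb = 500.0; // hypothetical corpus size
        double[] size = indexSizeGb(corpusGb);
        System.out.printf("Indexing time: at most about %.1f hours%n", hoursToIndex(corpusGb));
        System.out.printf("Index size: about %.0f to %.0f GB%n", size[0], size[1]);
    }
}
```

For the 500 GB example this works out to a little over five hours of indexing and an index of roughly 100 to 150 GB.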
Powerful, Accurate and Efficient Search Algorithms
- ranked searching -- best results returned first
- many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
- fielded searching (e.g., title, author, contents)
- date-range searching
- sorting by any field
- multiple-index searching with merged results
- simultaneous updating and searching
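To make "fielded" and "ranked" searching a bit more concrete, here is a toy sketch in plain Java. It is not Lucene's API and not its scoring formula: the TinyFieldIndex class, its methods, and the sample documents are all hypothetical, and ranking here is a simple count of how often the term occurs. It only illustrates the general idea of an inverted index keyed by field and term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: this is NOT Lucene's API or its scoring formula.
final class TinyFieldIndex {
    // "field:term" -> (document id -> number of occurrences in that field)
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    /** Lowercases the text, splits it on non-alphanumerics, and records each term under the field. */
    void add(int docId, String field, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) continue; // split can yield a leading empty token
            postings.computeIfAbsent(field + ":" + term, k -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    /** Returns ids of documents containing the term in the field, highest count first. */
    List<Integer> search(String field, String term) {
        Map<Integer, Integer> hits = postings.getOrDefault(
                field + ":" + term.toLowerCase(Locale.ROOT),
                Collections.<Integer, Integer>emptyMap());
        List<Integer> ids = new ArrayList<>(hits.keySet());
        // Rank by descending term count; break ties by ascending document id.
        ids.sort((a, b) -> {
            int byCount = Integer.compare(hits.get(b), hits.get(a));
            return byCount != 0 ? byCount : Integer.compare(a, b);
        });
        return ids;
    }

    public static void main(String[] args) {
        TinyFieldIndex index = new TinyFieldIndex();
        index.add(1, "title", "Lucene in Action");
        index.add(1, "contents", "A practical guide to Lucene.");
        index.add(2, "title", "Hadoop basics");
        index.add(2, "contents", "Hadoop jobs can feed Lucene; Lucene then serves queries.");

        // Document 2 mentions the term twice in "contents", so it ranks ahead of document 1.
        System.out.println(index.search("contents", "lucene")); // [2, 1]
        // Only document 1 has the term in its "title" field.
        System.out.println(index.search("title", "lucene"));    // [1]
    }
}
```

A real Lucene index does far more (analyzers, phrase and range queries, relevance scoring, on-disk segments), but the field-plus-term lookup followed by a ranking step is the basic shape of what the list above describes.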
Cross-Platform Solution
- available as open-source software under the Apache License, which lets you use Lucene in both commercial and open-source programs
- 100%-pure Java
- index-compatible implementations available in other programming languages