All Classes and Interfaces

Class
Description
Abstract parent class for analysis factories TokenizerFactory, TokenFilterFactory and CharFilterFactory.
 
SmartChineseAnalyzer abstract dictionary implementation.
Base class for payload encoders.
Simplifies the implementation of iterators a bit.
AbstractKnnCollector is the default implementation of a kNN collector, used for gathering kNN results and providing topDocs from the gathered neighbors.
Caches the results of a KnnVector search: a list of docs and their scores
 
 
 
Base implementation for PagedMutable and PagedGrowableWriter.
This class is the base of QueryConfigHandler and FieldConfig.
This class should be extended by nodes intending to represent range queries.
Search for all (approximate) vectors above a similarity threshold.
 
Abstract parent class for analysis factories that accept a stopwords file as input.
An object whose RAM usage can be computed.
Helper methods for constructing nested resource descriptions and debugging RAM usage.
Checks the "condition" part of affix definition, as in
An object representing the analysis result of a simple (non-compound) word
An object representing a prefix or a suffix applied to a word stem
 
This class acts as the base class for the implementations of the first normalization of the informative content in the DFR framework.
Model of the information gain based on the ratio of two Bernoulli processes.
Model of the information gain based on Laplace's law of succession.
This collector specializes in collecting the most relevant document (group head) for each group that matches the query.
Represents a group head.
 
Specialized implementation for sorting by score
 
General implementation using a FieldComparator to select the group head
A collector that collects all groups that match the query.
This exception is thrown when there is an attempt to access something that has already been closed.
Internal class used by Snowball stemmers
Provides a base class for analysis based offset strategies to extend from.
Wraps an Analyzer and string text that represents multiple values delimited by a specified character.
Helper class for loading named SPIs from classpath (e.g.
An Analyzer builds TokenStreams, which analyze text.
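For illustration, a minimal custom Analyzer might look like the following sketch (assuming a recent Lucene where StandardTokenizer and LowerCaseFilter live in the core analysis packages; exact package names vary across versions):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.standard.StandardTokenizer;

    // A minimal Analyzer: split on word boundaries, then lowercase.
    public final class SimpleLowercaseAnalyzer extends Analyzer {
      @Override
      protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new StandardTokenizer();
        TokenStream sink = new LowerCaseFilter(source);
        return new TokenStreamComponents(source, sink);
      }
    }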
Strategy defining how TokenStreamComponents are reused per call to Analyzer.tokenStream(String, java.io.Reader).
 
This class encapsulates the outer components of a token stream.
Manages analysis data configuration for SmartChineseAnalyzer
Extension to Analyzer suitable for Analyzers which wrap other Analyzers.
Analyzes the input text and then suggests matches based on prefix matches to any tokens in the indexed text.
Suggester that first analyzes the surface form, adds the analyzed form to a weighted FST, and then does the same thing at lookup time.
 
Factory for conjunctions
An AndQueryNode represents an AND boolean operation performed on a list of nodes.
An AnyQueryNode represents an ANY operator performed on a list of nodes.
Builds a BooleanQuery of SHOULD clauses, possibly with some minimum number to match.
Strips all characters after an apostrophe (including the apostrophe itself).
Factory for ApostropheFilter.
An approximate priority queue, which attempts to poll items by decreasing log of the weight, though exact ordering is not guaranteed.
Analyzer for Arabic.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies ArabicNormalizer to normalize the orthography.
Normalizer for Arabic.
A TokenFilter that applies ArabicStemmer to stem Arabic words.
Factory for ArabicStemFilter.
Stemmer for Arabic.
This class implements the stemming algorithm defined by a snowball script.
Arc distance computation style.
Analyzer for Armenian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
An InPlaceMergeSorter for object arrays.
An IntroSorter for object arrays.
A TimSorter for object arrays.
Methods for manipulating arrays.
Comparator for a fixed number of bytes.
This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
Factory for ASCIIFoldingFilter.
Base interface for attributes.
An AttributeFactory creates instances of AttributeImpls.
 
Expert: AttributeFactory returning an instance of the given clazz for the attributes it implements.
Base class for Attributes that can be added to a AttributeSource.
This interface is used to reflect contents of AttributeSource or AttributeImpl.
An AttributeSource contains a list of different AttributeImpls, and methods to add and get them.
This class holds the state of an AttributeSource.
Construction of basic automata.
Represents an automaton and all its states and transitions.
Records new states and transitions and then Automaton.Builder.finish() creates the Automaton.
Automaton provider for RegExp.toAutomaton(AutomatonProvider,int).
A Query that will match terms against a finite-state machine.
A FilteredTermsEnum that enumerates terms based upon what is accepted by a DFA.
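As a sketch of how these pieces fit together, one can compile a regular expression into an Automaton and wrap it in an AutomatonQuery; in practice RegexpQuery does this convenience wiring, and the field/term names here are illustrative:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.AutomatonQuery;
    import org.apache.lucene.util.automaton.Automaton;
    import org.apache.lucene.util.automaton.RegExp;

    class RegexAutomatonDemo {
      static AutomatonQuery regexQuery() {
        // Compile a regex into a finite-state machine, then match it against terms.
        Automaton automaton = new RegExp("lucen.*").toAutomaton();
        // The Term mainly names the target field; the pattern lives in the automaton.
        return new AutomatonQuery(new Term("body", "lucen.*"), automaton);
      }
    }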
Converts an Automaton into a TokenStream.
Edge between position nodes.
Node that contains original node id and position in TokenStream
Token Stream that outputs tokens from a topo sorted graph.
Calculate the final score as the average score of all payloads seen.
Axiomatic approaches for IR.
F1EXP is defined as Sum(tf(term_doc_freq) * ln(docLen) * IDF(term)), where IDF(t) = pow((N+1)/df(t), k); N = total number of docs, df = doc freq.
F1LOG is defined as Sum(tf(term_doc_freq) * ln(docLen) * IDF(term)), where IDF(t) = ln((N+1)/df(t)); N = total number of docs, df = doc freq.
F2EXP is defined as Sum(tfln(term_doc_freq, docLen) * IDF(term)), where IDF(t) = pow((N+1)/df(t), k); N = total number of docs, df = doc freq.
F2LOG is defined as Sum(tfln(term_doc_freq, docLen) * IDF(term)), where IDF(t) = ln((N+1)/df(t)); N = total number of docs, df = doc freq.
F3EXP is defined as Sum(tf(term_doc_freq) * IDF(term) - gamma(docLen, queryLen)), where IDF(t) = pow((N+1)/df(t), k); N = total number of docs, df = doc freq; gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl. NOTE: the gamma function of this similarity creates negative scores.
F3LOG is defined as Sum(tf(term_doc_freq) * IDF(term) - gamma(docLen, queryLen)), where IDF(t) = ln((N+1)/df(t)); N = total number of docs, df = doc freq; gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl. NOTE: the gamma function of this similarity creates negative scores.
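Restating the six definitions above in one place may help; the following is a LaTeX transcription of exactly the formulas given (notation as above: $N$ = total number of documents, $\mathrm{df}(t)$ = document frequency of term $t$, $s$ and $k$ = tunable parameters, $avdl$ = average document length):

$$S_{F1}(q,d) = \sum_{t \in q} \mathrm{tf}(t,d)\,\ln(\mathrm{docLen})\,\mathrm{IDF}(t)$$
$$S_{F2}(q,d) = \sum_{t \in q} \mathrm{tfln}(\mathrm{tf}(t,d),\mathrm{docLen})\,\mathrm{IDF}(t)$$
$$S_{F3}(q,d) = \sum_{t \in q} \mathrm{tf}(t,d)\,\mathrm{IDF}(t) - \gamma(\mathrm{docLen},\mathrm{queryLen}), \qquad \gamma = \frac{(\mathrm{docLen}-\mathrm{queryLen})\,\mathrm{queryLen}\,s}{avdl}$$

with $\mathrm{IDF}(t) = \left(\frac{N+1}{\mathrm{df}(t)}\right)^{k}$ for the EXP variants and $\mathrm{IDF}(t) = \ln\frac{N+1}{\mathrm{df}(t)}$ for the LOG variants.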
Base utility class for implementing a CharFilter.
Base class for implementing CompositeReaders based on an array of sub-readers.
Base implementation for a concrete Directory that uses a LockFactory for locking.
Attribute for Token.getBaseForm().
Attribute for Token.getBaseForm().
An abstract implementation of FragListBuilder.
 
Base FragmentsBuilder implementation that supports colored pre/post tags and multivalued fields.
 
All Geo3D shapes can derive from this base class, which furnishes some common code
Base query class for ShapeDocValues queries.
A base TermsEnum that adds default implementations for BaseTermsEnum.attributes(), BaseTermsEnum.termState(), BaseTermsEnum.seekExact(BytesRef), and BaseTermsEnum.seekExact(BytesRef, TermState). In some cases the default implementation may be slow and consume huge memory, so subclasses SHOULD have their own implementation if possible.
Base class of a family of 3D rectangles, bounded on six sides by X,Y,Z limits
This class acts as the base class for the specific basic model implementations in the DFR framework.
Geometric as limiting form of the Bose-Einstein model.
An approximation of the I(ne) model.
The basic tf-idf model of randomness.
Tf-idf model of randomness, based on a mixture of Poisson and inverse document frequency.
Factory for creating basic term queries
Stores all statistics commonly used ranking methods.
Analyzer for Basque.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
TokenFilter for Beider-Morse phonetic encoding.
Factory for BeiderMorseFilter.
Analyzer for Bengali.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies BengaliNormalizer to normalize the orthography.
Normalizer for Bengali.
A TokenFilter that applies BengaliStemmer to stem Bengali words.
Factory for BengaliStemFilter.
Stemmer for Bengali.
An indexed 128-bit BigInteger field.
SmartChineseAnalyzer Bigram dictionary.
Base class for a binary-encoded in-memory dictionary.
Base class for a binary-encoded in-memory dictionary.
Deprecated, for removal: This API element is subject to removal in a future version.
Deprecated, for removal: This API element is subject to removal in a future version.
 
 
A per-document numeric value.
Select documents using binary doc values
Field that stores a per-document BytesRef value.
A DocValuesFieldUpdates which holds updates of documents, of a single BinaryDocValuesField.
 
Buffers up pending byte[] per doc, then flushes when segment flushes.
 
 
 
An indexed binary field for fast range filters.
A binary representation of a range that wraps a BinaryDocValues field
 
 
Binds variable names in expressions to actual data.
Graph representing possible token pairs (bigrams) at each start offset in the sentence.
Implementation of the DocIdSet interface on top of a BitSet.
Bit mixing utilities.
Interface for Bitset-like structures.
Bits impl of the specified length with all bits set.
Bits impl of the specified length with no bits set.
Base implementation for a bit set.
A DocIdSetIterator which iterates over set bits in a bit set.
A producer of BitSets per segment.
A producer of Bits per segment.
Exposes a slice of an existing Bits as a new Bits.
Static helper methods for FST.Arc.BitTable.
A variety of high efficiency bit twiddling routines and encoders for primitives.
Basic parameters for indexing points on the BKD tree.
Offline Radix selector for BKD tree.
Sliced reference to points in an PointWriter.
Handles reading a block KD-tree in byte[] space previously written with BKDWriter.
 
Reusable DocIdSetIterator to handle low cardinality leaves.
Utility functions to build BKD trees.
Predicate for a fixed number of bytes.
Recursively builds a block KD-tree to assign all incoming points in N-dim space to smaller and smaller N-dim rectangles (cells) until the number of points in a given rectangle is <= config.maxPointsInLeafNode.
 
Flat representation of a KD-tree.
 
 
Extension of the AnalyzingInfixSuggester which transforms the weight after search to take into account the position of the searched term into the indexed text.
The different types of blender.
 
A Query that blends index statistics across multiple terms.
A Builder for BlendedTermQuery.
A BlendedTermQuery.RewriteMethod that creates a DisjunctionMaxQuery out of the sub queries.
A BlendedTermQuery.RewriteMethod defines how queries for individual terms should be merged.
Decodes the raw bytes of a block when the index is read, according to the BlockEncoder used during the writing of the index.
Encodes the raw bytes of a block when the index is written.
Writable byte buffer.
BlockGroupingCollector performs grouping with a single pass collector, as long as you are grouping by a doc block field, i.e., all documents sharing a given group value were indexed as a doc block using the atomic IndexWriter.addDocuments() or IndexWriter.updateDocuments() API.
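A brief sketch of the doc-block indexing this collector relies on (the writer and documents are illustrative; any open IndexWriter will do):

    import java.io.IOException;
    import java.util.List;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;

    class DocBlockDemo {
      // All documents sharing one group value are added as a single atomic block,
      // keeping their doc IDs contiguous so the collector can group in one pass.
      static void indexGroup(IndexWriter writer, List<Document> oneGroup) throws IOException {
        writer.addDocuments(oneGroup);
      }
    }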
 
 
Block header containing block metadata.
Reads/writes block header.
A blocking bounded min heap that stores floats.
 
 
Select a value from a block of documents.
Type of selection to perform.
One term block line.
Reads/writes block lines with terms encoded incrementally inside a block.
BulkScorer implementation of BlockMaxConjunctionScorer that focuses on top-level conjunctions over clauses that do not have two-phase iterators.
 
Scorer for conjunctions that checks the maximum scores of each clause in order to potentially skip over blocks that can't have competitive matches.
Reader for sequences of longs written with BlockPackedWriter.
A writer for large sequences of longs.
Seeks the block corresponding to a given term, reads the block bytes, and scans the block terms.
Handles a terms dict, but decouples all details of doc/freqs/positions reading to an instance of PostingsReaderBase.
 
Holds all state required for PostingsReaderBase to produce a PostingsEnum without re-seeking the terms dict.
Writes terms dict, block-encoding (column stride) each term's metadata for each set of terms between two index terms.
 
 
Writes blocks in the block file.
Class used to create index-time FuzzySet appropriately configured for each field.
A PostingsFormat useful for low doc-frequency fields such as primary keys.
 
 
 
A classifier approximating a naive Bayes classifier by using pure queries on BM25.
BM25 Similarity.
Collection statistics for the BM25 model.
Abstract FunctionValues implementation which supports retrieving boolean values.
 
A clause in a BooleanQuery.
Specifies how clauses are to occur in matching documents.
A BooleanModifierNode has the same behaviour as ModifierQueryNode; it only indicates that this modifier was added by BooleanQuery2ModifierNodeProcessor and not by the user.
This processor is used to apply the correct ModifierQueryNode to BooleanQueryNodes children.
A perceptron-based Boolean classifier (see http://en.wikipedia.org/wiki/Perceptron).
A Query that matches documents matching boolean combinations of other queries, e.g.
A builder for boolean queries.
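For example, a typical boolean combination built through the builder (field and term names are illustrative):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    class BooleanQueryDemo {
      static BooleanQuery build() {
        // MUST is required, SHOULD is optional, MUST_NOT excludes matches.
        return new BooleanQuery.Builder()
            .add(new TermQuery(new Term("body", "lucene")), BooleanClause.Occur.MUST)
            .add(new TermQuery(new Term("body", "search")), BooleanClause.Occur.SHOULD)
            .add(new TermQuery(new Term("body", "spam")), BooleanClause.Occur.MUST_NOT)
            .build();
      }
    }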
This processor is used to apply the correct ModifierQueryNode to BooleanQueryNodes children.
Builder for BooleanQuery
A BooleanQueryNode represents a list of elements which do not have an explicit boolean operator defined between them.
Builds a BooleanQuery object from a BooleanQueryNode object.
BulkScorer that is used for pure disjunctions and disjunctions that have low values of BooleanQuery.Builder.setMinimumNumberShouldMatch(int) and dense clauses.
 
 
 
Simple similarity that gives terms a score that is equal to their query boost.
 
This processor removes every BooleanQueryNode that contains only one child and returns this child.
Expert: the Weight for BooleanQuery, used to normalize, score and explain these queries.
 
Abstract parent class for those ValueSource implementations which apply boolean logic to their values
Add this Attribute to a TermsEnum returned by MultiTermQuery.getTermsEnum(Terms,AttributeSource) and update the boost on each returned term.
Implementation class for BoostAttribute.
Builder for PayloadScoreQuery
A Query wrapper that allows to give a boost to the wrapped query.
A BoostQueryNode boosts the QueryNode tree which is under this node.
This builder basically reads the Query object set on the BoostQueryNode child using QueryTreeBuilder.QUERY_TREE_BUILDER_TAGID and applies the boost value defined in the BoostQueryNode.
This processor iterates the query node tree looking for every FieldableNode that has StandardQueryConfigHandler.ConfigurationKeys.BOOST in its config.
Finds fragment boundaries: pluggable into BaseFragmentsBuilder
This interface describes methods that determine what the bounds are for a shape.
An interface for accumulating bounds information.
Implementation of "recursive graph bisection", also called "bipartite graph partitioning" and often abbreviated BP, an approach to doc ID assignment that aims at reducing the sum of the log gap between consecutive postings.
A forward index.
Use a LSB Radix Sorter to sort the (docID, termID) entries.
 
 
Exception that is thrown when not enough RAM is available.
 
A merge policy that reorders merged segments according to a BPIndexReorderer.
Analyzer for Brazilian Portuguese language.
 
Factory for BrazilianStemFilter.
A stemmer for Brazilian Portuguese words.
A BoundaryScanner implementation that uses BreakIterator to find boundaries in the text.
A PassageAdjuster that adjusts the Passage range to word boundaries hinted by the given BreakIterator.
Wraps RuleBasedBreakIterator, making object reuse convenient and emitting a rule status for emoji sequences.
BufferAllocationException forked from HPPC.
Wraps another Checksum with an internal buffer to speed up checksum calculations.
Simple implementation of ChecksumIndexInput that wraps another input and delegates calls.
Base implementation class for buffered IndexInput.
Implementation of an IndexInput that reads from a portion of a file.
This wrapper buffers incoming elements.
Holds buffered deletes and updates, by docID, term or query for a single segment.
 
 
 
Tracks the stream of FrozenBufferedUpdates.
 
Tracks the contiguous range of packets that have finished resolving.
Holds all per-segment internal state used while resolving deletions.
Buffers up pending vector value(s) per doc, then flushes when segment flushes.
 
 
 
Sorting ByteVectorValues that iterate over documents in the order of the provided sortMap
Sorting FloatVectorValues that iterate over documents in the order of the provided sortMap
This class is a workaround for JDK bug JDK-8252739.
This class is a workaround for JDK bug JDK-8252739.
Analyzer for Bulgarian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies BulgarianStemmer to stem Bulgarian words.
Factory for BulgarianStemFilter.
Light Stemmer for Bulgarian.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
Efficient sequential read/write of packed integers.
This class is used to score a range of documents at once, and is returned by Weight.bulkScorer(org.apache.lucene.index.LeafReaderContext).
DataInput backed by a byte array.
DataOutput backed by a byte array.
This class enables the allocation of fixed-size buffers and their management as part of a buffer array.
Abstract class for allocating and freeing byte blocks.
A simple ByteBlockPool.Allocator that never recycles.
A simple ByteBlockPool.Allocator that never recycles, but tracks how much total RAM is in use.
Reads in reverse from a ByteBlockPool.
A guard that is created for every ByteBufferIndexInput that tries on best effort to reject any access to the ByteBuffer behind, once it is unmapped.
Pass in an implementation of this interface to cleanup ByteBuffers.
Deprecated.
This class was made public for internal reasons (instanceof checks).
This class adds offset support to ByteBufferIndexInput, which is needed for slices.
Optimization of ByteBufferIndexInput for when there is only one buffer
A DataInput implementing RandomAccessInput and reading data from a list of ByteBuffers.
A DataOutput storing data in a list of ByteBuffers.
An implementation of a ByteBuffer allocation and recycling policy.
A ByteBuffer-based Directory implementation that can be used to store index files on the heap.
An IndexInput implementing RandomAccessInput and backed by a ByteBuffersDataInput.
An implementation for retrieving FunctionValues instances for byte knn vectors fields.
Automaton representation for matching UTF-8 byte[].
An FST Outputs implementation where each output is a sequence of bytes.
Class that Posting and PostingVector use to write interleaved byte streams into shared fixed-size byte[] arrays.
IndexInput that knows how to read the byte slices written by Posting and PostingVector.
Represents byte[], as a slice (offset + length) into an existing byte[].
A simple append only random-access BytesRef array that stores full copies of the appended bytes in a ByteBlockPool.
An extension of BytesRefIterator that allows retrieving the index of the current element
Used to iterate the elements of an array in a given order.
Represents a logical list of BytesRef backed by a ByteBlockPool.
A builder for BytesRef instances.
Specialized BytesRef comparator that StringSorter has optimizations for.
An implementation for retrieving FunctionValues instances for string based fields.
Enumerates all input (BytesRef) + output pairs in an FST.
Holds a single input (BytesRef) + output pair.
BytesRefHash is a special purpose hash-map like data-structure optimized for BytesRef instances.
Manages allocation of the per-term addresses.
A simple BytesRefHash.BytesStartArray that tracks memory allocation using a private Counter instance.
Thrown if a BytesRef exceeds the BytesRefHash limit of ByteBlockPool.BYTE_BLOCK_SIZE-2.
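A small usage sketch of BytesRefHash as a term-to-ordinal map (return-value conventions as commonly documented; treat the snippet as illustrative):

    import org.apache.lucene.util.BytesRef;
    import org.apache.lucene.util.BytesRefHash;

    class BytesRefHashDemo {
      static void demo() {
        BytesRefHash hash = new BytesRefHash();
        int ord = hash.add(new BytesRef("lucene")); // fresh insert: returns the new id
        int dup = hash.add(new BytesRef("lucene")); // already present: returns -(id + 1)
        BytesRef scratch = new BytesRef();
        hash.get(ord, scratch); // copies the stored bytes into scratch
      }
    }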
A simple iterator interface for BytesRef iteration.
Collects BytesRef and then allows one to iterate over their sorted order.
This attribute can be used if you have the raw term bytes to be indexed.
Implementation class for BytesTermAttribute.
An IndexOutput that wraps another instance and tracks the number of bytes written
This class implements a simple byte vector with access to the underlying array.
ByteVectorSimilarityFunction returns a similarity function between two knn vectors with byte elements.
Search for all (approximate) byte vectors above a similarity threshold.
A DoubleValuesSource which computes the vector similarity scores between the query vector and the KnnByteVectorField for documents.
This class provides access to per-document floating point vector values indexed as KnnByteVectorField.
FilterDirectory that tracks write amplification factor
Caches all docs, and optionally also scores, coming from a search, and is then able to replay them to another collector.
 
 
 
This expression value source shares one value cache when generating ExpressionFunctionValues, such that only one value along the whole generation tree corresponds to one name.
 
A wrapper of IndexWriter MergeContext.
A simplistic Lucene-based NaiveBayes classifier with a caching feature; see http://en.wikipedia.org/wiki/Naive_Bayes_classifier
This class can be used if the token attributes of a TokenStream are intended to be consumed more than once.
Class used to match candidate queries selected by a Presearcher from a Monitor query index.
 
A filter to apply normal capitalization rules to Tokens.
Analyzer for Catalan.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
A Cell is a portion of a trie.
Character category data.
Character category data.
 
 
 
 
 
 
Automaton representation for matching char[].
Utility class to write tokenizers or token filters.
A simple IO buffer to use with CharacterUtils.fill(CharacterBuffer, Reader).
Wraps a char[] as CharacterIterator for processing with a BreakIterator
A CharacterIterator used internally for use with BreakIterator
A simple class that stores key Strings as char[]'s in a hash table.
Empty CharArrayMap.UnmodifiableCharArrayMap optimized for speed.
 
Matches a character array
A simple class that stores Strings as char[]'s in a hash table.
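For instance, a CharArraySet can answer stopword lookups against a char[] slice without allocating Strings (a sketch; the constructor shown takes a collection plus an ignoreCase flag, and the package location has varied across Lucene versions):

    import java.util.Arrays;
    import org.apache.lucene.analysis.CharArraySet;

    class StopSetDemo {
      static boolean isStopWord(char[] token, int length) {
        // ignoreCase = true lets lookups match regardless of case.
        CharArraySet stopWords = new CharArraySet(Arrays.asList("the", "a", "an"), true);
        return stopWords.contains(token, 0, length);
      }
    }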
Forked from HPPC, holding int index and char value.
Subclasses of CharFilter can be chained to filter a Reader They can be used as Reader with additional offset correction.
Abstract parent class for analysis factories that create CharFilter instances.
This static holder class prevents classloading deadlock by delaying init of factories until needed.
A hash set of chars, implemented using open addressing with linear probing for collision resolution.
A hash map of char to Object, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index,key and value
An FST Outputs implementation where each output is a sequence of characters.
Utility functions for JapaneseCompletionFilter
Represents char[], as a slice (offset + length) into an existing char[].
Deprecated.
This comparator is only a transition mechanism
A builder for CharsRef instances.
This interface describes a character stream that maintains line and column number positions of the characters.
The term text of a Token.
Default implementation of CharTermAttribute.
An abstract base class for simple, character-oriented tokenizers.
Internal SmartChineseAnalyzer character type constants.
This class implements a simple char vector with access to the underlying array.
 
Like IntConsumer, but may throw checked exceptions.
Basic tool and API to check the health of an index and write a new segments file that removes reference to problematic segments.
The marker RuntimeException used by CheckIndex APIs when index integrity failure is detected.
 
 
Run-time configuration options for CheckIndex commands.
Returned from CheckIndex.checkIndex() detailing the health and status of the index.
Status from testing DocValues
Status from testing field infos.
Status from testing field norms.
Status from testing index sort
Status from testing livedocs
Status from testing PointValues
Holds the status of each segment in the index.
Status from testing soft deletes
Status from testing stored fields.
Status from testing term index.
Status from testing term vectors.
Status from testing vector values
Walks the entire N-dimensional points space, verifying that all points fall within the last cell's boundaries.
Utility class to check a block join index.
Extension of IndexInput, computing checksum as it goes.
Represents a circle on the earth's surface.
2D circle implementation containing spatial logic.
 
 
 
An Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter
 
Forms bigrams of CJK terms that are generated from StandardTokenizer or ICUTokenizer.
Factory for CJKBigramFilter.
A CharFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin, and folds halfwidth Katakana variants into the equivalent Kana.
Factory for CJKWidthCharFilter.
A TokenFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin, and folds halfwidth Katakana variants into the equivalent Kana.
Factory for CJKWidthFilter.
Filters ClassicTokenizer with ClassicFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
Normalizes tokens extracted with ClassicTokenizer.
Factory for ClassicFilter.
Expert: Historical scoring implementation.
A grammar-based tokenizer constructed with JFlex
Factory for ClassicTokenizer.
This class implements the classic Lucene StandardTokenizer up until 3.0
The result of a call to Classifier.assignClass(String) holding an assigned class of type T and a score.
A classifier, see http://en.wikipedia.org/wiki/Classifier_(mathematics), which assigns classes of type T
Helper class used by ServiceLoader to investigate parent/child relationships of ClassLoaders.
Simple ResourceLoader that uses ClassLoader.getResourceAsStream(String) and Class.forName(String,boolean,ClassLoader) to open resources and classes, respectively.
A supplier that creates RandomVectorScorer from an ordinal.
Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced.
Encodes/decodes an inverted index segment.
This static holder class prevents classloading deadlock by delaying init of default codecs and available codecs until needed.
LeafReader implemented by codec APIs.
Utility class for reading and writing versioned headers.
Removes words that are too long or too short from the stream.
Extension of CharTermAttributeImpl that encodes the term text as a binary Unicode collation key instead of as UTF-8 bytes.
Converts each token into its CollationKey, and then encodes the bytes as an index term.
Indexes collation keys as a single-valued SortedDocValuesField.
Expert: representation of a group in FirstPassGroupingCollector, tracking the top doc and FieldComparator slot.
 
Contains statistics for a collection (field).
Throw this exception in LeafCollector.collect(int) to prematurely terminate collection of the current leaf.
Methods for manipulating (sorting) and creating collections.
 
 
Expert: Collectors are primarily meant to be used to gather raw results from a search, and implement sorting or custom result filtering, collation, etc.
A manager of collectors.
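A minimal custom collector, sketched on top of SimpleCollector (which hides the per-leaf plumbing); this one merely counts hits:

    import java.io.IOException;
    import org.apache.lucene.search.ScoreMode;
    import org.apache.lucene.search.SimpleCollector;

    // Counts matching documents without computing scores.
    public final class CountingCollector extends SimpleCollector {
      public int count;

      @Override
      public void collect(int doc) throws IOException {
        count++;
      }

      @Override
      public ScoreMode scoreMode() {
        return ScoreMode.COMPLETE_NO_SCORES;
      }
    }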
Default implementation of MemoryTracker that tracks allocations and allows setting a memory limit per collector
A Query that treats multiple fields as a single stream and scores terms as if you had indexed them as a single term in a single field.
A builder for CombinedFieldQuery.
 
 
 
A suggestion generated by combining one or more original query terms
Class containing some useful methods used by command line tools
Construct bigrams for frequently occurring terms while indexing.
Constructs a CommonGramsFilter.
Wrap a CommonGramsFilter optimizing phrase queries by only returning single words when they are not a member of a bigram.
Configuration options common across queryparser implementations.
A query that executes high-frequency terms in an optional sub-query to prevent slow queries due to "common" terms like stopwords.
Base class for comparison operators useful within an "if"/conditional.
This class accumulates the (freq, norm) pairs that may produce competitive scores.
The Compile class is used to compile a stemmer table.
Immutable class holding compiled details for a given Automaton.
Automata are compiled into different internal forms for the most efficient execution depending upon the language they accept.
CompletionPostingsFormat for org.apache.lucene.backward_codecs.lucene50.Lucene50PostingsFormat.
CompletionPostingsFormat for org.apache.lucene.backward_codecs.lucene84.Lucene84PostingsFormat.
CompletionPostingsFormat for org.apache.lucene.backward_codecs.lucene90.Lucene90PostingsFormat.
Wraps an Analyzer to provide additional completion-only tuning (e.g.
Weighted FSTs for any indexed SuggestField is built on CompletionFieldsConsumer.write(Fields,NormsProducer).
 
 
Completion index (.cmp) is opened and read at instantiation to read in SuggestField numbers and their FST offsets in the Completion dictionary (.lkp).
A PostingsFormat which supports document suggestion based on indexed SuggestFields.
An enum that allows to control if suggester FSTs are loaded into memory or read off-heap
Abstract Query that match documents containing terms with a specified prefix filtered by BitsProducer.
Expert: Responsible for executing the query against an appropriate suggester and collecting the results via a collector.
Holder for suggester and field-level info for a suggest field
Wrapped Terms used by SuggestField and ContextSuggestField to access corresponding suggester and their attributes
A ConcatenateGraphFilter variant that additionally allows setting the payload and provides access to config options.
Expert: the Weight for CompletionQuery, used to score and explain these queries.
QueryParser which permits complex phrase query syntax eg "(john jon jonathan~) peters*".
 
2D Geometry object that supports spatial relationships with bounding boxes, triangles and points.
Used by withinTriangle to check the within relationship between a triangle and the query shape (e.g.
2D multi-component geometry implementation represented as an interval tree of components.
Base class for composite queries (such as AND/OR/NOT)
An internal BreakIterator for multilingual text, following recommendations from: UAX #29: Unicode Text Segmentation.
Instances of this reader type can only be used to get stored fields from the underlying LeafReaders, but it is not possible to directly retrieve postings.
 
A read-only Directory that consists of a view over a compound file.
Encodes/decodes compound files
 
Base class for decomposition token filters.
Compression algorithm used for suffixes of a block of terms.
Compression algorithm used for suffixes of a block of terms.
A compression mode.
A compression mode.
 
 
 
 
 
 
 
 
A data compressor.
A data compressor.
Concatenates/Joins every incoming token with a separator into one output token for every path through the token stream (which is a graph).
Attribute providing access to the term builder and UTF-16 conversion
Just escapes the ConcatenateGraphFilter.SEP_LABEL byte with an extra.
A TokenStream that takes an array of input TokenStreams as sources, and concatenates them together.
Concurrent version of ApproximatePriorityQueue, which trades away a bit of ordering for better concurrency by maintaining multiple sub ApproximatePriorityQueues that are locked independently.
This merger merges graphs in a concurrent manner, using HnswConcurrentMergeBuilder.
A MergeScheduler that runs each merge using a separate thread.
Access to ConcurrentMergeScheduler internals exposed to the test framework.
Utility class for concurrently loading queries into a Monitor.
Allows skipping TokenFilters based on the current set of attributes.
 
Abstract parent class for analysis factories that create ConditionalTokenFilter instances
An instance of this class represents a key that is used to retrieve a value from AbstractQueryConfig.
Utility class to generate the confusion matrix of a Classifier
a confusion matrix, backed by a Map representing the linearized matrix
BulkScorer implementation of ConjunctionScorer.
A conjunction of DocIdSetIterators.
Conjunction between a DocIdSetIterator and one or more BitSetIterators.
TwoPhaseIterator implementing a conjunction.
 
 
 
 
Scorer for conjunctions, sets of queries, all of which are required.
 
Common super class for multiple sub spans required in a document.
Helper methods for building conjunction iterators
n-gram connection cost data
n-gram connection cost data
 
 
 
 
 
 
Some useful constants.
A query that wraps another query and simply returns a constant score equal to 1 for every document that matches the query.
We return this as our BulkScorer so that if the CSQ wraps a query with its own optimized top-level scorer (e.g.
Builder for ConstantScoreQuery
A constant-scoring Scorer.
A Weight that has a constant score equal to the boost of the wrapped query.
Function that returns a constant byte vector value for every document.
Function that returns a constant float vector value for every document.
ConstNumberSource is the base class for all constant numbers
ConstValueSource returns a constant for all documents
 
 
 
A CompletionQuery that matches documents specified by a wrapped CompletionQuery supporting boosting and/or filtering by specified contexts.
 
Holder for context value meta data
SuggestField which additionally takes in a set of contexts.
The ContextSuggestField.PrefixTokenFilter wraps a TokenStream and adds a set of prefixes ahead.
Utility class that runs a thread to manage periodic reopens of a ReferenceManager, with methods to wait for specific index changes to become visible.
ICONV or OCONV replacement table
Assembles a QueryBuilder which uses only core Lucene Query objects
Assembles a QueryBuilder which uses Query objects from Lucene's sandbox and queries modules in addition to core queries.
Assembles a QueryBuilder which uses Query objects from Lucene's queries module in addition to core queries.
This exception is thrown when Lucene detects an inconsistency in the index.
Simple counter class
 
 
A Query that allows a configurable number of required matches per document.
 
A Scorer whose number of matches is per-document.
Utility class for parsing CSV text
A general-purpose Analyzer that can be created with a builder-style API.
Builder for CustomAnalyzer.
Factory class for a ConditionalTokenFilter
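A sketch of the builder-style API, composing factories by their SPI names ("standard", "lowercase" and "stop" are the usual registered factory names; the builder methods can throw IOException):

    import java.io.IOException;
    import org.apache.lucene.analysis.custom.CustomAnalyzer;

    class CustomAnalyzerDemo {
      static CustomAnalyzer build() throws IOException {
        // Factories are looked up by their SPI names and chained in order.
        return CustomAnalyzer.builder()
            .withTokenizer("standard")
            .addTokenFilter("lowercase")
            .addTokenFilter("stop")
            .build();
      }
    }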
Builds a QueryTree for a query that needs custom treatment
A BreakIterator that breaks the text whenever a certain separator, provided as a constructor argument, is found.
Analyzer for Czech language.
 
A TokenFilter that applies CzechStemmer to stem Czech words.
Factory for CzechStemFilter.
Light Stemmer for Czech.
Deprecated.
Visibility of this class will be reduced in a future release.
DFSA state with char labels on transitions.
Create tokens for phonetic matches based on Daitch–Mokotoff Soundex.
Analyzer for Danish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
Abstract base class for performing read operations of Lucene's low-level data types.
Abstract base class for performing write operations of Lucene's low-level data types.
Utility class for creating training / test / cross validation indexes from the original index.
Filters all tokens that cannot be parsed to a date, using the provided DateFormat.
Provides support for converting dates to strings and vice-versa.
Specifies the time granularity.
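For example (a sketch; the encoded form sorts lexicographically in date order):

    import java.text.ParseException;
    import java.util.Date;
    import org.apache.lucene.document.DateTools;

    class DateToolsDemo {
      static void demo() throws ParseException {
        // Truncate to day granularity and encode as an index-friendly string.
        String encoded = DateTools.dateToString(new Date(), DateTools.Resolution.DAY);
        Date roundTripped = DateTools.stringToDate(encoded);
        System.out.println(encoded + " -> " + roundTripped);
      }
    }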
Folds all Unicode digits in [:General_Category=Decimal_Number:] to Basic Latin digits (0-9).
Factory for DecimalDigitFilter.
A token that was generated from a compound.
A decompressor.
A decompressor.
Default policy is to allocate a bitset with 10% saturation given a unique term per document.
Simple Encoder implementation that does not modify the output
Default implementation of FlatVectorsScorer.
RandomVectorScorerSupplier for bytes vector
A RandomVectorScorer for byte vectors.
RandomVectorScorerSupplier for Float vector
A RandomVectorScorer for float vectors.
Default ICUTokenizerConfig that is generally applicable to many languages.
Creates a formatted snippet from the top passages.
Default provider returning scalar implementations.
 
ValueSource implementation which only returns the values from the provided ValueSources which are available for a particular docId.
A compression mode that trades speed for compression ratio.
A compression mode that trades speed for compression ratio.
 
 
 
 
An analyzer wrapper that doesn't allow wrapping components or readers.
 
A DeletedQueryNode represents a node that was deleted from the query node tree.
Characters before the delimiter are the "token", those after are the boost.
Characters before the delimiter are the "token", those after are the payload.
Characters before the delimiter are the "token", the textual integer after is the term frequency.
TermState serializer which encodes each file pointer as a delta relative to a base file pointer.
 
 
Implements the Divergence from Independence (DFI) model based on Chi-square statistics (i.e., standardized Chi-squared distance from independence in term frequency tf).
Implements the divergence from randomness (DFR) framework introduced in Gianni Amati and Cornelis Joost Van Rijsbergen.
An object representing homonym dictionary entries.
An object representing *.dic file entry with its word, flags and morphological data.
In-memory structure for the dictionary (.dic) and affix (.aff) data of a hunspell dictionary.
Dictionary interface for retrieving morphological data by id.
Dictionary interface for retrieving morphological data by id.
A simple interface representing a Dictionary.
Possible word breaks according to BREAK directives
Used to read flags as UTF-8 even if the rest of the file is in the default (8-bit) encoding
Implementation of Dictionary.FlagParsingStrategy that assumes each flag is encoded as two ASCII characters whose codes must be combined into a single character.
Abstraction of the process of parsing flags taken from the affix and dic files
A morpheme extracted from a compound token.
Implementation of Dictionary.FlagParsingStrategy that assumes each flag is encoded in its numerical form.
Simple implementation of Dictionary.FlagParsingStrategy that treats the chars in each String as individual flags.
Tool to build dictionaries.
Tool to build dictionaries.
Format of the dictionary.
A TokenFilter that decomposes compound words found in many Germanic languages.
A token stored in a Dictionary.
The Diff object generates a patch string.
 
The DiffIt class is a means to generate patch commands from an already prepared stemmer table.
A Directory implementation for all Unixes and Windows that uses DIRECT I/O to bypass OS level IO caching during merging.
 
 
Retrieves an instance previously written by DirectMonotonicWriter.
In-memory metadata that needs to be kept around for DirectMonotonicReader to read data from disk.
Write monotonically-increasing sequences of integers.
A Directory provides an abstraction layer for storing a list of files.
DirectoryReader is an implementation of CompositeReader that can read indexes in a Directory.
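A typical open-and-search sketch tying Directory and DirectoryReader together (the index path is illustrative):

    import java.io.IOException;
    import java.nio.file.Paths;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    class OpenIndexDemo {
      static void searchIndex(String path) throws IOException {
        // Directory abstracts the storage; DirectoryReader gives a point-in-time view.
        try (Directory dir = FSDirectory.open(Paths.get(path));
             DirectoryReader reader = DirectoryReader.open(dir)) {
          IndexSearcher searcher = new IndexSearcher(reader);
          // ... run queries against the searcher
        }
      }
    }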
 
 
Wraps Lucene99PostingsFormat format for on-disk storage, but then at read time loads and stores all terms and postings directly in RAM as byte[], int[].
 
 
 
 
 
 
 
 
 
 
 
 
Retrieves an instance previously written by DirectWriter
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Simple automaton-based spellchecker.
Holds a spelling correction for internal usage inside DirectSpellChecker.
Class for writing packed integers to be directly read from Directory.
A priority queue of DocIdSetIterators that orders by current doc ID.
A priority queue of DocIdSetIterators that orders by current doc ID.
 
Wrapper used in DisiPriorityQueue.
A DocIdSetIterator which is a disjunction of the approximations of the provided iterators.
A DocIdSetIterator which is a disjunction of the approximations of the provided iterators.
 
 
 
A MatchesIterator that combines matches from a set of sub-iterators
 
A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.
The Scorer for DisjunctionMaxQuery.
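A short construction sketch (field and term names are illustrative; the second argument is the tie-breaker multiplier described above):

    import java.util.List;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.DisjunctionMaxQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    class DisMaxDemo {
      static Query build() {
        Query inTitle = new TermQuery(new Term("title", "lucene"));
        Query inBody = new TermQuery(new Term("body", "lucene"));
        // Document score = max(clause scores) + 0.1 * (scores of other matching clauses).
        return new DisjunctionMaxQuery(List.of(inTitle, inBody), 0.1f);
      }
    }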
 
A helper to propagate block boundaries for disjunctions.
Base class for Scorers that score disjunctions.
A Scorer for OR like queries, counterpart of ConjunctionScorer.
Factory for NEAR queries
 
Distance computation styles, supporting various ways of computing distance to shapes.
Interface for queries that can be nested as subqueries into a span near.
A second pass grouping collector that keeps track of distinct values for a specified field for the top N group.
 
Returned by DistinctValuesCollector.getGroups(), representing the value and set of distinct values for the group.
 
The probabilistic distribution used to model term occurrence in information-based models.
Log-logistic distribution.
The smoothed power-law (SPL) distribution for the information-based framework that is described in the original paper.
A TopDocsCollector that controls diversity in results by ensuring no more than maxHitsPerKey results from a common source are collected in the final results.
An extension to ScoreDoc that includes a key used for grouping purposes
 
kNN byte vector query that joins matching children vector documents with their parent doc id.
kNN float vector query that joins matching children vector documents with their parent doc id.
 
This collects the nearest children vectors.
This is a minimum binary heap, inspired by LongHeap.
Keeps track of child node, parent node, and the stored score.
DiversifyingNearestChildrenKnnCollectorManager responsible for creating DiversifyingNearestChildrenKnnCollector instances.
Function to divide "a" by "b"
Dl4jModelReader reads the file generated by the Deeplearning4j library and provides a Word2VecModel with normalized vectors.
A DocIdSetIterator like BitSetIterator but with a doc base, in order to avoid storing previous 0s.
Comparator that sorts by ascending doc ID.
DocFreqValueSource returns the number of documents containing the term.
 
 
Utility class to help merging documents from sub-readers according to either simple concatenated (unsorted) order, or by a specified index-time sort, skipping deleted documents and remapping non-deleted documents.
 
 
Represents one sub-reader being merged
A DocIdSet contains a set of doc ids.
A builder of DocIdSets.
 
 
Utility class to efficiently add many docs in one go.
 
This abstract class defines methods to iterate over a set of non-decreasing doc ids.
A stream of doc IDs.
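The canonical consumption loop for any DocIdSetIterator, as a small sketch:

    import java.io.IOException;
    import org.apache.lucene.search.DocIdSetIterator;

    class DisiDemo {
      // Advance with nextDoc() until the iterator signals exhaustion.
      static int count(DocIdSetIterator it) throws IOException {
        int count = 0;
        for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
          count++; // process doc here
        }
        return count;
      }
    }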
 
Accumulator for documents that have a value for a field.
Serves as base class for FunctionValues based on DocTermsIndex.
Custom Exception to be thrown when the DocTermsIndex for a field cannot be generated
Utility class for converting Lucene Documents to Double vectors.
Documents are the unit of indexing and search.
 
 
 
A classifier, see http://en.wikipedia.org/wiki/Classifier_(mathematics), which assigns classes of type T to Documents
Dictionary with terms, weights, payload (optional) and contexts (optional) information taken from stored/indexed fields in a Lucene index.
A StoredFieldVisitor that creates a Document from stored fields.
This class accepts multiple added documents and directly writes segment files.
 
DocumentsWriterDeleteQueue is a non-blocking linked pending deletes queue.
 
 
 
 
 
 
 
This class controls DocumentsWriterPerThread flushing during indexing.
 
 
 
 
DocumentsWriterPerThreadPool controls DocumentsWriterPerThread instances and their thread assignments during indexing.
Controls the health status of a DocumentsWriter sessions.
Dictionary with terms, and optionally payload and context information, taken from stored fields in a Lucene index.
This class contains utility methods and constants for DocValues
Abstract API that consumes numeric, binary and sorted docvalues.
Tracks state of one binary sub-reader that we are merging
 
A merged TermsEnum.
Tracks state of one numeric sub-reader that we are merging
Tracks state of one sorted sub-reader that we are merging
Tracks state of one sorted numeric sub-reader that we are merging
Tracks state of one sorted set sub-reader that we are merging
Deprecated.
Use FieldExistsQuery instead.
Holds updates of a single DocValues field, for a set of documents within one segment.
 
An iterator over documents and their updated values.
 
Encodes/decodes per-document values.
This static holder class prevents classloading deadlock by delaying init of doc values formats until needed.
 
 
Set of longs, optimized for docvalues usage
Abstract API that produces numeric, binary, sorted, sortedset, and sortednumeric docvalues.
Rewrites MultiTermQueries into a filter, using DocValues for term enumeration.
 
Holds statistics for a DocValues field.
Holds DocValues statistics for a numeric field storing double values.
Holds DocValues statistics for a numeric field storing long values.
Holds statistics for a numeric DocValues field.
Holds statistics for a sorted DocValues field.
Holds DocValues statistics for a sorted-numeric field storing double values.
Holds DocValues statistics for a sorted-numeric field storing long values.
Holds statistics for a sorted-numeric DocValues field.
Holds statistics for a sorted-set DocValues field.
A Collector which computes statistics for a DocValues field.
 
 
DocValues types.
An in-place update to a DocValues field.
An in-place update to a binary DocValues field
An in-place update to a numeric DocValues field
 
Helper methods for parsing XML
Comparator based on Double.compare(double, double) for numHits.
Function that returns a constant double value for every document.
Forked from HPPC, holding int index and double value.
Abstract FunctionValues implementation which supports retrieving double values.
Syntactic sugar for encoding doubles as NumericDocValues via Double.doubleToRawLongBits(double).
Field that stores a per-document double value for scoring, sorting or value retrieval and index the field for fast range filters.
Obtains double field values from LeafReader.getNumericDocValues(java.lang.String) and makes those values available as other numeric types, casting as needed.
Filter for DoubleMetaphone (supporting secondary codes)
An indexed double field for fast range filters.
Builder for multi range queries for DoublePoints
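A sketch of indexing and range-querying a DoublePoint (field name and values are illustrative; range endpoints are inclusive for DoublePoint.newRangeQuery):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.DoublePoint;
    import org.apache.lucene.search.Query;

    class DoublePointDemo {
      static Document indexDoc() {
        Document doc = new Document();
        doc.add(new DoublePoint("price", 9.99)); // indexed into the points (BKD) structure
        return doc;
      }

      static Query priceBetween5And10() {
        return DoublePoint.newRangeQuery("price", 5.0, 10.0);
      }
    }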
An indexed Double Range field.
Represents a contiguous range of double values, with an inclusive minimum and exclusive maximum
DocValues field for DoubleRange.
Groups double values into ranges
A GroupSelector implementation that groups documents by double values
 
Per-segment, per-document double values, which can be calculated at search-time
Base class for producing DoubleValues
 
 
 
 
 
 
 
 
Allows Tokens with a given combination of flags to be dropped.
Provides a filter that will drop tokens matching a set of flags.
Abstract ValueSource implementation which wraps two ValueSources and applies an extendible float function to their values.
This builder does nothing.
Analyzer for Dutch language.
 
This class implements the stemming algorithm defined by a snowball script.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in all dimensions
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in X and Y.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in X and Z.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in X.
Creates new instances of EdgeNGramTokenFilter.
Tokenizes the given token into n-grams of given size(s).
Tokenizes the input from an edge into n-grams of given size(s).
Creates new instances of EdgeNGramTokenizer.
Internal tree node: represents geometry edge from [x1, y1] to [x2, y2].
Removes elisions from a TokenStream.
Factory for ElisionFilter.
Abstract base class implementing a DocValuesProducer that has no doc values.
An always exhausted token stream.
Encodes original text.
A ChecksumIndexInput wrapper that changes the endianness of the provided index output.
 
 
A IndexInput wrapper that changes the endianness of the provided index input.
A RandomAccessInput wrapper that changes the endianness of the provided index input.
A IndexOutput wrapper that changes the endianness of the provided index output.
Utility class to wrap open files
Analyzer for English.
A TokenFilter that applies EnglishMinimalStemmer to stem English words.
Minimal plural stemmer for English.
TokenFilter that removes possessives (trailing 's) from words.
This class implements the stemming algorithm defined by a snowball script.
Suggestion to add/edit dictionary entries to generate a given list of words created by WordFormGenerator.compress(java.util.List<java.lang.String>, java.util.Set<java.lang.String>, java.lang.Runnable).
Obtains int field values from LeafReader.getNumericDocValues(java.lang.String) and makes those values available as other numeric types, casting as needed.
A parser needs to implement EscapeQuerySyntax to allow the QueryNode to escape the queries, when the toQueryString method is called.
Type of escaping: String for escaping syntax, NORMAL for escaping reserved words (like AND) in terms
Implementation of EscapeQuerySyntax for the standard Lucene syntax.
Analyzer for Estonian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
Expert: Find exact phrases
 
The ExitableDirectoryReader wraps a real index DirectoryReader and allows for a QueryTimeout implementation object to be checked periodically to see if the thread should exit or not.
Wrapper class for another FilterAtomicReader.
 
 
Wrapper class for another PointValues implementation that is used by ExitableFields.
Wrapper class for a SubReaderWrapper that is used by the ExitableDirectoryReader.
Wrapper class for another Terms implementation that is used by ExitableFields.
Wrapper class for TermsEnum that is used by ExitableTerms for implementing an exitable enumeration of terms.
Exception that is thrown to prematurely terminate a term enumeration.
A query match containing the score explanation of the match
Expert: Describes the score computation for document and query.
Base class that computes the value of an expression for a document.
A DoubleValues which evaluates an expression
Helper class holding static methods for js math functions
A Rescorer that uses an expression to re-score first pass hits.
A DoubleValuesSource which evaluates a Expression given the context of an Bindings.
The ExtendableQueryParser enables arbitrary query parser extension based on a customizable field naming scheme.
Wraps an IntervalIterator and extends the bounds of its intervals
 
ExtensionQuery holds all query components extracted from the original query string like the query field and the extension query string.
The Extensions class represents an extension mapping to associate ParserExtension instances with extension keys.
This class represents a generic pair.
An implementation of a BytesRefSorter that allows appending BytesRefs to an OfflineSorter and returns a Closeable ExternalRefSorter.ByteSequenceIterator that iterates over sequences stored on disk.
Iterates over BytesRefs in a file, closes the reader when the iterator is exhausted.
An efficient implementation of JavaCC's CharStream interface.
Another highlighter implementation.
A DoubleValuesSource instance which can be used to read the values of a feature from a FeatureField for documents.
 
Field that can be used to store static scoring factors into documents.
 
 
 
 
 
 
 
Sorts using the value of a specified feature name from a FeatureField.
Expert: directly create a field for a document.
 
Specifies whether and how a field should be stored.
 
A query node implements FieldableNode interface to indicate that its children and itself are associated to a specific field.
This listener listens for every field configuration request and assigns a StandardQueryConfigHandler.ConfigurationKeys.BOOST to the equivalent FieldConfig based on a defined map (fieldName -> boostValue) stored in StandardQueryConfigHandler.ConfigurationKeys.FIELD_BOOST_MAP.
A base class for ValueSource implementations that retrieve values for a single field from DocValues.
Expert: a FieldComparator compares hits so as to determine their sort order when collecting the top results with TopFieldCollector.
Sorts by descending relevance.
Sorts by field's natural Term sort order.
Provides a FieldComparator for custom field sorting.
This class represents a field configuration.
This interface should be implemented by classes that wants to listen for field configuration requests.
This listener listens for every field configuration request and assigns a StandardQueryConfigHandler.ConfigurationKeys.DATE_RESOLUTION to the equivalent FieldConfig based on a defined map (fieldName -> DateTools.Resolution) stored in StandardQueryConfigHandler.ConfigurationKeys.FIELD_DATE_RESOLUTION_MAP.
Expert: A ScoreDoc which also contains information about how to sort the referenced document.
A Query that matches documents that contain either a KnnFloatVectorField, KnnByteVectorField or a field that indexes norms or doc values.
FieldFragList has a list of "frag info" that is used by FragmentsBuilder class to create fragments (snippets).
List of term offsets + weight for a frag info
Represents the list of term offsets for some text
Internal highlighter abstraction that operates on a per field basis.
Access to the Field Info file that describes document fields and whether or not they are indexed.
Collection of FieldInfos (accessible by number or by name).
 
 
 
 
Encodes/decodes FieldInfos
This class tracks the number and position / offset parameters of terms being added to the index.
Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.
Metadata and stats for one field in the index.
Reads/writes field metadata.
Pair of FieldMetadata and BlockTermState for a specific field.
Ultimately returns an OffsetsEnum yielding potentially highlightable words in the text.
FieldPhraseList has a list of WeightedPhraseInfo that is used by FragListBuilder to create a FieldFragList object.
Represents the list of term offsets and boost for some text
Term offsets (start + end)
FieldQuery breaks down query object into terms/phrases and keeps them in a QueryPhraseMap structure.
Internal structure of a query for highlighting: represents a nested query structure
A FieldQueryNode represents an element that contains a field/text tuple.
Builds a TermQuery object from a FieldQueryNode object.
BlockTree's implementation of Terms.
BlockTree's implementation of Terms.
Provides a Terms index for fields that have it, and lists which fields do.
Abstract API that consumes terms, doc, freq, prox, offset and payloads postings.
 
 
 
 
Efficient index format for block-based Codecs.
Abstract API that produces terms, doc, freq, prox, offset and payloads postings.
Forms an OR query of the provided query across multiple fields.
Iterates over terms across multiple fields.
FieldTermStack is a stack that keeps query terms in the specified field of the document to be highlighted.
Single term with its position/offsets in the document and IDF weight.
Describes the properties of a field.
This class efficiently buffers numeric and binary field updates and stores terms, values and metadata in a memory efficient way without creating large amounts of objects.
Struct like class that is used to iterate over all updates in this buffer
A factory of MatchHighlighter.FieldValueHighlighter classes that cover typical use cases (verbatim values, highlights, abbreviations).
 
Expert: a hit queue for sorting hits by terms in more than one field.
Extension of ScoreDoc to also store the FieldComparator slot.
An implementation of FieldValueHitQueue which is optimized in case there is more than one comparator.
An implementation of FieldValueHitQueue which is optimized in case there is just one comparator.
This interface should be implemented by QueryNode that holds a field and an arbitrary value.
This class provides the ability to track the reference counts of a set of index files and delete them when their counts decrease to 0.
Types of messages this file deleter will broadcast: REF (messages about references) and FILE (messages about files).
Tracks the reference count for a single index file:
Dictionary represented by a text file.
Expert: A Directory instance that switches files between two other Directory instances.
Simple ResourceLoader that opens resource files from the local file system, optionally resolving against a base directory.
Delegates all methods to a wrapped BinaryDocValues.
A codec that forwards all its method calls to another codec.
A FilterCodecReader contains another CodecReader, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
Collector delegator.
Directory implementation that delegates calls to another directory.
A FilterDirectoryReader wraps another DirectoryReader, allowing implementations to transform or extend it.
A DelegatingCacheHelper is a CacheHelper specialization for implementing long-lived caching behaviour for FilterDirectoryReader subclasses.
Factory class passed to FilterDirectoryReader constructor that allows subclasses to wrap the filtered DirectoryReader's subreaders.
Abstract decorator class of a DocIdSetIterator implementation that provides on-demand filter/validation mechanism on an underlying DocIdSetIterator.
An IntervalsSource that filters the intervals from another IntervalsSource
 
 
Abstract class for enumerating a subset of all terms.
Return value indicating whether the term should be accepted or the iteration should END.
IndexInput implementation that delegates calls to another directory.
Access to FilterIndexInput internals exposed to the test framework.
IndexOutput implementation that delegates calls to another directory.
 
Abstract base class for TokenFilters that may remove tokens.
An Iterator implementation that filters elements with a boolean predicate.
LeafCollector delegator.
A FilterLeafReader contains another LeafReader, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
Base class for filtering Fields implementations.
Base class for filtering PostingsEnum implementations.
Base class for filtering Terms implementations.
Base class for filtering TermsEnum implementations.
A MatchesIterator that delegates all calls to another MatchesIterator
A wrapper for MergePolicy instances.
Delegates all methods to a wrapped NumericDocValues.
Filter a Scorable, intercepting methods and optionally changing their return values
A FilterScorer contains another Scorer, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.
Delegates all methods to a wrapped SortedDocValues.
Delegates all methods to a wrapped SortedNumericDocValues.
Delegates all methods to a wrapped SortedSetDocValues.
A Spans implementation wrapping another spans instance, allowing to filter spans matches easily by implementing FilterSpans.accept(org.apache.lucene.queries.spans.Spans)
Status returned from FilterSpans.accept(Spans) that indicates whether a candidate match should be accepted, rejected, or rejected and move on to the next document.
Delegates all methods to a wrapped FloatVectorValues.
A FilterWeight contains another Weight and implements all abstract methods by calling the contained weight's method.
A filter that outputs a single token which is a concatenation of the sorted and de-duplicated set of input tokens.
Factory for FingerprintFilter.
Iterates all accepted strings.
Nodes for path stack.
Analyzer for Finnish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies FinnishLightStemmer to stem Finnish words.
Light Stemmer for Finnish.
This class implements the stemming algorithm defined by a snowball script.
FirstPassGroupingCollector is the first of two passes necessary to collect grouped hits.
Deprecated.
Fix the token filters that create broken offsets in the first place.
Deprecated.
Immutable twin of FixedBitSet.
BitSet of fixed length (numBits), backed by accessible (FixedBitSet.getBits()) long[], accessed with an int index, implementing Bits and DocIdSet.
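As a quick illustration of the FixedBitSet API (a minimal fragment; sizes and indices are arbitrary):

    import org.apache.lucene.util.FixedBitSet;

    FixedBitSet bits = new FixedBitSet(128);  // fixed capacity of 128 bits
    bits.set(5);
    bits.set(64);
    boolean isSet = bits.get(5);              // true
    int count = bits.cardinality();           // 2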
 
TermsIndexReader for simple every Nth terms indexes.
Selects every Nth term as an index term, and holds term bytes (mostly) fully expanded in memory.
Just like BytesRefArray except all values have the same length.
A FixedShingleFilter constructs shingles (token n-grams) from a token stream.
Factory for FixedShingleFilter
A structure similar to BytesRefHash, but specialized for sorted char sequences used for Hunspell flags.
 
This attribute can be used to pass different flags down the Tokenizer chain, e.g.
Default implementation of FlagsAttribute.
A bit vector scorer for scoring byte vectors.
 
 
Vectors' writer for a field
Converts an incoming graph token stream, such as one from SynonymGraphFilter, into a flat form so that all nodes form a single linear chain with no side paths.
Holds all tokens leaving a given input position.
Gathers up merged input positions into a single output position, only for the current "frontier" of nodes we've seen but can't yet output because they are not frozen.
Factory for FlattenGraphFilter.
Utilities for FlatVectorsScorer.
Encodes/decodes per-document vectors and provides a scoring interface for the flat stored vectors
Reads vectors from an index.
Provides mechanisms to score vectors that are stored in a flat file. The purpose of this class is to provide flexibility to the codec utilizing the vectors.
Vectors' writer for a field that allows additional indexing logic to be implemented by the caller
An array-backed list of float.
An iterator implementation for FloatArrayList.iterator().
Comparator based on Float.compare(float, float) for numHits.
Forked from HPPC, holding int index and float value.
Abstract FunctionValues implementation which supports retrieving float values.
Syntactic sugar for encoding floats as NumericDocValues via Float.floatToRawIntBits(float).
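In other words, the two lines below store the same doc-values bits; a minimal fragment with an illustrative field name:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.FloatDocValuesField;
    import org.apache.lucene.document.NumericDocValuesField;

    Document doc = new Document();
    doc.add(new FloatDocValuesField("rank", 0.5f));
    // ...which encodes the same value as:
    // doc.add(new NumericDocValuesField("rank", Float.floatToRawIntBits(0.5f)));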
Encode a character array Float as a BytesRef.
Field that stores a per-document float value for scoring, sorting or value retrieval and index the field for fast range filters.
Obtains float field values from LeafReader.getNumericDocValues(java.lang.String) and makes those values available as other numeric types, casting as needed.
A bounded min heap that stores floats.
An implementation for retrieving FunctionValues instances for float knn vectors fields.
An indexed float field for fast range filters.
Builder for multi range queries for FloatPoints
KNN search on top of N dimensional indexed float points.
 
 
An indexed Float Range field.
DocValues field for FloatRange.
 
FloatVectorSimilarityFunction returns a similarity function between two knn vectors with float elements.
Search for all (approximate) float vectors above a similarity threshold.
A DoubleValuesSource which computes the vector similarity scores between the query vector and the KnnFloatVectorField for documents.
This class provides access to per-document floating point vector values indexed as KnnFloatVectorField.
Default FlushPolicy implementation that flushes new segments based on RAM used and document count depending on the IndexWriter's IndexWriterConfig.
A FlushInfo provides information required for a FLUSH context.
FlushPolicy controls when segments are flushed from a RAM resident internal data-structure to the IndexWriter's Directory.
Query wrapper that forces its wrapped Query to use the default doc-by-doc BulkScorer.
Utility class to encode/decode increasing sequences of 128 integers.
Utility class to encode/decode increasing sequences of 128 integers.
Processes terms found in the original text, typically by applying some form of mark-up to highlight terms in HTML search results pages.
Encode all values in normal area with fixed bit width, which is determined by the max value in this block.
 
 
 
Reads from a single byte[].
FragListBuilder is an interface for FieldFragList builder classes.
An oracle for quickly checking that a specific part of a word can never be a valid word.
Implements the policy for breaking text into multiple fragments for consideration by the Highlighter class.
FragmentsBuilder is an interface for fragments (snippets) builder classes.
Builds an ngram model from the text sent to FreeTextSuggester.build(org.apache.lucene.search.suggest.InputIterator) and predicts based on the last grams-1 tokens in the request sent to FreeTextSuggester.lookup(java.lang.CharSequence, boolean, int).
Analyzer for French language.
 
A TokenFilter that applies FrenchLightStemmer to stem French words.
Light Stemmer for French.
A TokenFilter that applies FrenchMinimalStemmer to stem French words.
Light Stemmer for French.
This class implements the stemming algorithm defined by a snowball script.
Implements limited (iterators only, no stats) Fields interface over the in-RAM buffered fields/terms/postings, to flush postings through the PostingsFormat.
 
 
 
 
 
 
 
A TimSorter which sorts two parallel arrays of doc IDs and offsets in one go.
 
 
 
 
A ring buffer that tracks the frequency of the integers that it contains.
A bag of integers.
Holds buffered deletes and updates by term or query, once pushed.
This class helps iterating a term dictionary and consuming all the docs for each terms.
 
 
Base class for Directory implementations that store index files in the file system.
Base class for file system based locking implementation.
Represents a finite state machine (FST), using a compact byte[] format.
Represents a single arc.
Helper methods to read the bit-table of a direct addressing node.
Reads bytes stored in an FST.
Represents the FST metadata.
Specifies allowed range of each int input label for this FST.
Builds a minimal FST (maps an IntsRef term to an arbitrary output) from pre-sorted terms with outputs.
Expert: holds a pending (seen but not yet serialized) arc.
Fluent-style constructor for an FSTCompiler.
 
Reusable buffer for building nodes with fixed length arcs (binary search or direct addressing).
 
This class is used for FST backed by non-FSTReader DataOutput.
Expert: holds a pending (seen but not yet serialized) Node.
Finite state automata based implementation of "autocomplete" functionality.
A single completion for a given key.
Finite state automata based implementation of "autocomplete" functionality.
An adapter from Lookup API to FSTCompletion.
Immutable stateless FST-based index dictionary kept in memory.
Provides stateful FSTDictionary.Browser to seek in the FSTDictionary.
Builds an immutable FSTDictionary.
Can next() and advance() through the terms in an FST
A custom FST outputs implementation that stores block data (BytesRef), long ordStart, long numTerms.
 
FST term dict + Lucene50PBF
Abstraction for reading bytes necessary for FST.
A type of FSTReader which needs data to be initialized before use
An FST Outputs implementation for FSTTermsWriter.
Represents the metadata for one term.
FST-based terms dictionary reader.
FST-based term dict, using metadata as FST output.
 
Exposes a utility method to enumerate all paths intersecting an Automaton with an FST.
Holds a pair (automaton, fst) of states and accumulated output in the intersected machine.
A query that retrieves all documents with a DoubleValues value matching a predicate
Returns a score for each document based on a ValueSource, often some function of the value of a field.
A Query wrapping a ValueSource that matches docs in which the values in the value source match a configured range.
A query that wraps another query, and uses a DoubleValuesSource to replace or modify the wrapped query's score
 
 
 
Represents field values as different types.
Abstraction of the logic required to fill the value of a specified doc into a reusable MutableValue.
Builds a set of CompiledAutomaton for fuzzy matching on a given term, with specified maximum edit distance, fixed prefix and whether or not to allow transpositions.
A CompletionQuery that match documents containing terms within an edit distance of the specified prefix.
 
Configuration parameters for FuzzyQuerys
Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms.
 
 
 
Builder for FuzzyLikeThisQuery
Implements the fuzzy search query.
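For example, the following fragment matches terms within two edits of "lucene" (field name and edit distance are illustrative):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.FuzzyQuery;

    // Matches e.g. "lucen", "lucene" and "lucere" in the "title" field.
    FuzzyQuery q = new FuzzyQuery(new Term("title", "lucene"), 2);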
A FuzzyQueryNode represents an element that contains a field/text/similarity tuple.
Builds a FuzzyQuery object from a FuzzyQueryNode object.
This processor iterates the query node tree looking for every FuzzyQueryNode; when this kind of node is found, it checks the query configuration for StandardQueryConfigHandler.ConfigurationKeys.FUZZY_CONFIG, gets the fuzzy prefix length and default similarity from it, and sets them on the fuzzy node.
A class used to represent a set of many, potentially large, values (e.g.
Result from FuzzySet.contains(BytesRef): can never return definitively YES (always MAYBE), but can sometimes definitely return NO.
Implements a fuzzy AnalyzingSuggester.
An interval function equivalent to FuzzyQuery.
Subclass of TermsEnum for enumerating all terms that are similar to the specified filter term.
Used for sharing automata between segments
 
Thrown to indicate that there was an issue creating a fuzzy query for a given term.
Analyzer for Galician.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies GalicianMinimalStemmer to stem Galician words.
Minimal Stemmer for Galician
A TokenFilter that applies GalicianStemmer to stem Galician words.
Factory for GalicianStemFilter.
Galician stemmer implementing "Regras do lematizador para o galego".
The Gener object helps discard nodes which break the reduction effort and defends the structure against large reductions.
A class that traverses the entire dictionary and applies affix rules to check if those yield correct suggestions similar enough to the given misspelled word
 
 
 
A per-document 3D location field.
Add this to a document to index lat/lon or x/y/z point, indexed as a 3D point.
Compares documents by distance from an origin point, using a GeoDistanceShape to compute the distance
Compares documents by outside distance, using a GeoOutsideDistance to compute the distance
Sorts by outside distance from an origin location.
Sorts by distance from an origin location.
 
A GeoArea represents a standard 2-D breakdown of a part of a sphere.
Factory for GeoArea.
Shape that implements GeoArea.
Base extended areaShape object.
All bounding box shapes can derive from this base class, which furnishes some common code
Base object that supports bounds operations.
GeoCircles have all the characteristics of GeoBaseDistanceShapes, plus GeoSizeable.
Base class to create a composite of GeoAreaShapes
Base class to create a composite of GeoMembershipShapes
Base class to create a composite of GeoShapes.
Distance shapes have capabilities of both geohashing and distance computation (which also includes point membership determination).
Membership shapes have capabilities of both geohashing and membership determination.
GeoPaths have all the characteristics of GeoBaseDistanceShapes.
GeoBasePolygon objects are the base class of most GeoPolygon objects.
Base extended shape object.
All bounding box shapes have this interface in common.
Factory for GeoBBox.
Generic shape that supports bounds.
Interface describing circular area with a center and radius.
Class which constructs a GeoCircle representing an arbitrary circle.
GeoComplexPolygon objects are structures designed to handle very large numbers of edges.
Iterator execution interface, for tree traversal, plus count retrieval.
An instance of this class describes a single edge, and includes what is necessary to reliably determine intersection in the context of the even/odd algorithm used.
Iterator execution interface, for tree traversal.
An instance of this class represents a node in a tree.
An interface describing a tree.
This is the x-tree.
This is the y-tree.
This is the z-tree.
GeoCompositeAreaShape is a set of GeoAreaShapes, treated as a unit.
GeoCompositeMembershipShape is a set of GeoMembershipShapes, treated as a unit.
GeoCompositePolygon is a specific implementation of GeoCompositeAreaShape, which implements GeoPolygon explicitly.
GeoConcavePolygon objects are generic building blocks of more complex structures.
GeoConvexPolygon objects are generic building blocks of more complex structures.
Degenerate bounding box limited on two sides (left lon, right lon).
This GeoBBox represents an area rectangle of one specific latitude with no longitude bounds.
Degenerate longitude slice.
GeoShape representing a path across the surface of the globe, with a specified half-width.
This is the pre-calculated data for a path segment.
This is precalculated data for segment endpoint.
This class represents a degenerate point bounding box.
Degenerate bounding box limited on two sides (top lat, bottom lat).
An implementer of this interface is capable of computing the described "distance" values, which are meant to provide both actual distance values, as well as distance estimates that can be computed more cheaply.
Distance shapes have capabilities of both geohashing and distance computation (which also includes point membership determination).
reusable geopoint encoding methods
A predicate that checks whether a given point is within a component2D geometry.
A predicate that checks whether a given point is within a distance of another point.
 
Circular area with a center and a radius that represents the surface distance to the center.
A temporary description of a section of circle.
A description of a section of circle.
This GeoBBox represents an area rectangle limited only in latitude.
Bounding box limited on left and right.
Membership shapes have capabilities of both geohashing and membership determination.
Base class for LatLonGeometry and XYGeometry
This GeoBBox represents an area rectangle limited only in south latitude.
Bounding box limited on three sides (bottom lat, left lon, right lon), including the north pole.
Implemented by Geo3D shapes that can compute the distance from a point to the closest outside edge.
Interface describing a path.
Class which constructs a GeoPath representing an arbitrary path.
This class represents a point on the surface of a sphere or ellipsoid.
Interface describing a GeoPointShape shape. It may represent a degenerate bounding box or a degenerate circle, hence it extends such interfaces.
Class which constructs a GeoPointShape.
GeoPolygon interface description.
Class which constructs a GeoMembershipShape representing an arbitrary polygon.
Class for tracking the best shape for finding a pole, and whether or not the pole must be inside or outside of the shape.
Class representing a single (unused) edge.
Class representing a pool of unused edges, all linked together by vertices.
Class representing an iterator over an EdgeBuffer.
 
Use this class to specify a polygon with associated holes.
An instance of this class represents a known-good path of nodes that contains no coplanar points, no matter how assessed.
Exception we throw when we can't tile a polygon due to numerical precision issues.
Bounding box limited on four sides (top lat, bottom lat, left lon, right lon).
Fast implementation of a polygon representing S2 geometry cell.
Class which constructs a GeoPolygon representing an S2 Google pixel.
Generic shape.
Some shapes can compute radii of a geocircle in which they are inscribed.
This GeoBBox represents an area rectangle limited only in north latitude.
Bounding box limited on three sides (top lat, left lon, right lon).
Circular area with a center and cutoff angle that represents the latitude and longitude distance from the center where the planet will be cut.
GeoShape representing a path across the surface of the globe, with a specified half-width.
Base implementation of SegmentEndpoint
Endpoint that's a simple circle.
Endpoint that's a dual circle with cutoff(s).
Endpoint that's a single circle with cutoff(s).
 
Path components consist of both path segments and segment endpoints.
 
This is the pre-calculated data for a path segment.
Internal interface describing segment endpoint implementations.
 
Basic reusable geo-spatial utility methods
Used to define the orientation of 3 points: -1 = clockwise, 0 = collinear, 1 = counter-clockwise.
Degenerate bounding box wider than PI and limited on two sides (left lon, right lon).
Bounding box wider than PI but limited on left and right sides (left lon, right lon).
Bounding box wider than PI but limited on three sides (bottom lat, left lon, right lon).
Bounding box wider than PI but limited on four sides (top lat, bottom lat, left lon, right lon).
Bounding box wider than PI but limited on three sides (top lat, left lon, right lon).
Bounding box including the entire world.
This class implements the stemming algorithm defined by a snowball script.
Analyzer for German language.
 
A TokenFilter that applies GermanLightStemmer to stem German words.
Light Stemmer for German.
A TokenFilter that applies GermanMinimalStemmer to stem German words.
Minimal Stemmer for German.
Normalizes German characters according to the heuristics of the German2 snowball algorithm.
A TokenFilter that stems German words.
Factory for GermanStemFilter.
A stemmer for German words.
This class implements the stemming algorithm defined by a snowball script.
Utility to get document frequency and total number of occurrences (sum of the tf for each doc) of a term.
A collector that collects all ordinals from a specified field matching the query.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Formats text with different color intensity depending on the score of the term.
An abstract TokenFilter that exposes its input stream as a graph
 
Consumes a TokenStream and creates an Automaton where the transition labels are terms from the TermToBytesRefAttribute.
Outputs the dot (graphviz) string for the viterbi lattice.
Outputs the dot (graphviz) string for the viterbi lattice.
Analyzer for the Greek language.
 
Normalizes token text to lower case, removes some Greek diacritics, and standardizes final sigma to sigma.
A TokenFilter that applies GreekStemmer to stem Greek words.
Factory for GreekStemFilter.
A stemmer for Greek words, according to Development of a Stemmer for the Greek Language by Georgios Ntais.
This class implements the stemming algorithm defined by a snowball script.
Represents one group in the results.
Base class for computing grouped facets.
Represents a facet entry with a value and a count.
The grouped facet result.
Contains the local grouped segment counts for a particular segment.
 
Convenience class to perform grouping in a non distributed environment.
A GroupQueryNode represents a location where the original user typed real parentheses in the query string.
Builds no object, it only returns the Query object set on the GroupQueryNode object using a QueryTreeBuilder.QUERY_TREE_BUILDER_TAGID tag.
Concrete implementations of this class define what to collect for individual groups during the second-pass of a grouping search.
 
Defines a group, for use by grouping collectors
What to do with the current value
This class contains utility methods and constants for group varint.
Provides an abstraction for reading int values, so that decoding logic can be reused in different DataInput implementations.
 
Implements PackedInts.Mutable, but grows the bit count of the underlying packed ints on-demand.
An indexed half-float field for fast range filters.
This directory wrapper overrides Directory.copyFrom(Directory, String, String, IOContext) in order to optionally use a hard-link instead of a full byte-by-byte file copy if applicable.
Constants for primitive maps.
Base class for hashing functions that can be referred to by name.
Utility class to read buffered points from in-heap arrays.
Utility class to write new points into in-heap arrays.
Reusable implementation for a point value on-heap
Finds the optimal segmentation of a sentence into Chinese words
HighFreqTerms class extracts the top n most frequent terms (by document frequency) from an existing Lucene index and reports their document frequency.
Compares terms by docTermFreq
Priority queue for TermStats objects
Compares terms by totalTermFreq
HighFrequencyDictionary: terms taken from the given field of a Lucene index, which appear in a number of documents above a given threshold.
Marks up highlighted terms found in the best sections of text, using configurable Fragmenter, Scorer, Formatter, Encoder and tokenizers.
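Typical usage looks roughly like this (a sketch against the lucene-highlighter module; query, analyzer and storedText are assumed to exist):

    import org.apache.lucene.search.highlight.Highlighter;
    import org.apache.lucene.search.highlight.QueryScorer;
    import org.apache.lucene.search.highlight.SimpleHTMLFormatter;

    Highlighter highlighter =
        new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
    // Re-analyzes the stored text and returns the best-scoring fragment,
    // with matching terms wrapped in <B>...</B> by default.
    String fragment = highlighter.getBestFragment(analyzer, "body", storedText);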
 
QueryMatch object that contains the hit positions of a matching Query
Represents an individual hit
Analyzer for Hindi.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies HindiNormalizer to normalize the orthography.
Normalizer for Hindi.
A TokenFilter that applies HindiStemmer to stem Hindi words.
Factory for HindiStemFilter.
Light Stemmer for Hindi.
This class implements the stemming algorithm defined by a snowball script.
Expert: Priority queue containing hit docs
Used for defining custom algorithms to allow searches to early terminate
Implementation of HitsThresholdChecker which allows global hit counting
Default implementation of HitsThresholdChecker to be used for single threaded execution
Tokenizer for Chinese or mixed Chinese-English text.
Encodes bit vector values into an associated graph connecting the documents having values.
 
Interface for a builder building the OnHeapHnswGraph.
A graph builder that manages multiple workers, it only supports adding the whole graph all at once.
 
This searcher will obtain the lock and make a copy of neighborArray when seeking the graph such that concurrent modification of the graph will not impact the search
Hierarchical Navigable Small World graph.
NodesIterator that accepts nodes as an integer array.
Nodes iterator based on set representation of nodes.
Iterator over the graph nodes on a certain level; the iterator also provides the size, i.e. the total number of nodes to be iterated over.
Builder for HNSW graph.
A restricted, specialized knnCollector that can be used when building a graph.
Abstraction of merging multiple graphs into one on-heap graph
An interface that provides an HNSW graph.
Searches an HNSW graph to find nearest neighbors to a query vector.
This class allows OnHeapHnswGraph to be searched in a thread-safe manner by avoiding the unsafe methods (seek and nextNeighbor, which maintain state in the graph object) and instead maintaining the state in the searcher object.
Accessor to get Hotspot VM Options (if available).
A CharFilter that wraps another Reader and attempts to strip out HTML constructs.
 
Factory for HTMLStripCharFilter.
A simple query wrapper for debug purposes.
Analyzer for Hungarian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies HungarianLightStemmer to stem Hungarian words.
Light Stemmer for Hungarian.
This class implements the stemming algorithm defined by a snowball script.
A spell checker based on Hunspell dictionaries.
TokenFilter that uses hunspell affix rules and words to stem tokens.
TokenFilterFactory that creates instances of HunspellStemFilter.
This class represents a hyphen.
When the plain text is extracted from documents, we will often have many words hyphenated and broken into two lines.
This class represents a hyphenated word.
A TokenFilter that decomposes compound words found in many Germanic languages.
This tree structure stores the hyphenation patterns in an efficient way for fast lookup.
Provides a framework for the family of information-based models, as described in Stéphane Clinchant and Eric Gaussier.
Extension of CharTermAttributeImpl that encodes the term text as a binary Unicode collation key instead of as UTF-8 bytes.
Converts each token into its CollationKey, and then encodes bytes as an index term.
Indexes collation keys as a single-valued SortedDocValuesField.
A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
Factory for ICUFoldingFilter.
Normalize token text with ICU's Normalizer2.
Normalize token text with ICU's Normalizer2
Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/)
Class that allows for tailored Unicode Text Segmentation on a per-writing system basis.
Factory for ICUTokenizer.
A TokenFilter that transforms text with ICU.
Wrap a CharTermAttribute with the Replaceable API.
Factory for ICUTransformFilter.
Does nothing other than convert the char array to a byte array using the specified encoding.
Function that returns idf(long, long) for every document.
A PostingsFormat optimized for primary-key (ID) fields that also record a version (long) for each ID, delivered as a payload created by IDVersionPostingsFormat.longToBytes(long, org.apache.lucene.util.BytesRef) during indexing.
 
 
Iterates through terms in this field; this class is public so users can cast it to call IDVersionSegmentTermsEnum.seekExact(BytesRef, long) for optimistic-concurrency, and also IDVersionSegmentTermsEnum.getVersion() to get the version of the currently seek'd term.
 
 
Depending on the boolean value of the ifSource function, returns the value of the trueSource or falseSource function.
Annotation to not test a class or constructor with TestRandomChains integration test.
Per-document scoring factors.
Information about upcoming impacts, i.e.
DocIdSetIterator that skips non-competitive docs thanks to the indexed impacts.
Extension of PostingsEnum which also provides information about upcoming impacts.
Source of Impacts.
This selects the biggest Hnsw graph from the provided merge state and initializes a new HnswGraphBuilder with that graph as a starting point.
Computes the measure of divergence from independence for DFI scoring functions.
Normalized chi-squared measure of distance from independence
Saturated measure of distance from independence
Standardized measure of distance from independence
Represents a single field for indexing.
Describes the properties of a field.
Expert: represents a single commit into an index as seen by the IndexDeletionPolicy or IndexReader.
Expert: policy for deletion of stale index commits.
Immutable stateless index dictionary kept in RAM.
Stateful IndexDictionary.Browser to seek a term in this IndexDictionary and get its corresponding block file pointer in the block file.
Supplier for a new stateful IndexDictionary.Browser created on the immutable IndexDictionary.
Builds an immutable IndexDictionary.
Disk-based implementation of a DocIdSetIterator which can return the index of the current document, i.e.
Disk-based implementation of a DocIdSetIterator which can return the index of the current document, i.e.
Disk-based implementation of a DocIdSetIterator which can return the index of the current document, i.e.
 
 
 
This class keeps track of each SegmentInfos instance that is still "live", either because it corresponds to a segments_N file in the Directory (a "commit", i.e.
Holds details for each commit point.
This class contains useful constants representing filenames and extensions used by lucene, as well as convenience methods for querying whether a file name matches an extension (matchesExtension), as well as generating file names from a segment name, generation and extension ( fileNameFromGeneration, segmentFileName).
This exception is thrown when Lucene detects an index that is newer than this Lucene version.
This exception is thrown when Lucene detects an index that is too old for this Lucene version
Default general purpose indexing chain, which handles indexing all types of fields.
A schema of the field in the current document.
 
 
Abstract base class for input from a file in a Directory.
Merges indices specified on the command line into the index specified as the first command line argument.
 
Signals that no index was found in the Directory.
Controls how much information is stored in the postings lists.
A query that uses either an index structure (points or terms) or doc values in order to run a query, depending on which one is more efficient.
A DataOutput for appending data to a file in a Directory.
Access to org.apache.lucene.index package internals exposed to the test framework.
Public type exposing FieldInfo internal builders.
IndexReader is an abstract class, providing an interface for accessing a point-in-time view of an index.
A utility class that gives hooks in order to help build a cache based on the data that is contained in this index.
A cache key identifying a resource that is being cached.
A listener that is called when a resource gets closed.
A struct like class that represents a hierarchical relationship between IndexReader instances.
Class exposing static helper methods for generating DoubleValuesSource instances over some IndexReader statistics
 
 
 
 
 
 
Copy and rearrange index according to document selectors, from input dir to output dir.
 
Select document within a CodecReader
Implements search over a single IndexReader.
Supplier for IndexSearcher.LeafSlice slices which computes and caches the value on first invocation and returns cached value on subsequent invocation.
A class holding a subset of the IndexSearchers leaf contexts to be executed within a single thread.
Thrown when an attempt is made to add more than IndexSearcher.TooManyClauses.getMaxClauseCount() clauses.
Thrown when a client attempts to execute a Query that has more than IndexSearcher.TooManyClauses.getMaxClauseCount() total clauses cumulatively in all of its children.
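In its simplest form, searching looks like this (a minimal fragment; reader and query are assumed to exist):

    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;

    IndexSearcher searcher = new IndexSearcher(reader);  // reader: an open IndexReader
    TopDocs top = searcher.search(query, 10);            // top 10 hits by score
    for (ScoreDoc sd : top.scoreDocs) {
      // sd.doc is the document id, sd.score its relevance score
    }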
Handles how documents should be sorted in an index, both within a segment and between segments.
Used for sorting documents across segments
A comparator of doc IDs, used for sorting documents within a segment
Sorts documents based on double values from a NumericDocValues instance
Sorts documents based on float values from a NumericDocValues instance
Sorts documents based on integer values from a NumericDocValues instance
Sorts documents based on long values from a NumericDocValues instance
Provide a NumericDocValues instance for a LeafReader
Provide a SortedDocValues instance for a LeafReader
Sorts documents based on terms from a SortedDocValues instance
A range query that can take advantage of the fact that the index is sorted to speed up execution.
A doc ID set iterator that wraps a delegate iterator and only returns doc IDs in the range [firstDocInclusive, lastDoc).
Provides a DocIdSetIterator along with an accurate count of documents provided by the iterator (or -1 if an accurate count is unknown).
 
Compares the given document's value with a stored reference value.
Command-line tool that enables listing segments in an index, copying specific segments to another index, and deleting segments from an index.
This is an easy-to-use tool that upgrades all segments of an index from previous Lucene versions to the current segment file format.
An IndexWriter creates and maintains an index.
 
DocStats for this index
Interface for internal atomic events.
 
If DirectoryReader.open(IndexWriter) has been called (i.e., this writer is in near real-time mode), then after a merge completes, this class can be invoked to warm the reader on the newly merged segment, before the merge commits.
 
Access to IndexWriter internals exposed to the test framework.
Holds all the configuration that is used to create an IndexWriter.
Specifies the open mode for IndexWriter.
A callback event listener for recording key events happened inside IndexWriter
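Putting IndexWriter and IndexWriterConfig together, a typical indexing loop looks roughly like this (a sketch; the index path, analyzer and field contents are illustrative):

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;

    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    config.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    try (IndexWriter writer =
        new IndexWriter(FSDirectory.open(Paths.get("/tmp/idx")), config)) {
      Document doc = new Document();
      doc.add(new TextField("body", "hello lucene", Field.Store.YES));
      writer.addDocument(doc);
      writer.commit();
    }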
A TokenFilter that applies IndicNormalizer to normalize text in Indian Languages.
Normalizes the Unicode representation of text in Indian languages.
 
Analyzer for Indonesian (Bahasa)
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies IndonesianStemmer to stem Indonesian words.
Stemmer for Indonesian.
This class implements the stemming algorithm defined by a snowball script.
A Query that matches documents matching combinations of subqueries.
Combines scores of subscorers.
The Weight for IndriAndQuery, used to normalize, score and explain these queries.
Bayesian smoothing using Dirichlet priors as implemented in the Indri Search engine (http://www.lemurproject.org/indri.php).
Models p(w|C) as the number of occurrences of the term in the collection, divided by the total number of tokens + 1.
The Indri implementation of a disjunction scorer which stores the subscorers for the child queries.
A Basic abstract query that all IndriQueries can extend to implement toString, equals, getClauses, and iterator.
The Indri parent scorer that stores the boost so that IndriScorers can use the boost outside of the term.
An indexed 128-bit InetAddress field.
An indexed InetAddress Range Field
Attribute for Kuromoji inflection data.
Attribute for Kuromoji inflection data.
Debugging API for Lucene classes such as IndexWriter and SegmentInfos.
 
This creates a graph builder that is initialized with the provided HnswGraph.
An BytesRefSorter that keeps all the entries in memory.
Sorter implementation based on the merge-sort algorithm that merges in place (no extra memory will be allocated).
Interface for enumerating term,weight,payload triples for suggester consumption; currently only AnalyzingSuggester, FuzzySuggester and AnalyzingInfixSuggester support payloads.
Wraps a BytesRefIterator as a suggester InputIterator, with all weights set to 1 and carries no payload
A DataInput wrapping a plain InputStream.
 
 
An array-backed list of int.
An iterator implementation for IntArrayList.iterator().
A pool for int blocks similar to ByteBlockPool
Abstract class for allocating and freeing int blocks.
A simple IntBlockPool.Allocator that never recycles.
Comparator based on Integer.compare(int, int) for numHits.
Forked from HPPC, holding int index and int value.
Abstract FunctionValues implementation which supports retrieving int values.
A hash map of int to double, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
Encode a character array Integer as a BytesRef.
The "intersect" TermsEnum response to UniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
Block iteration order.
 
 
 
 
Constructs an IntervalsSource based on analyzed text.
Wraps an IntervalIterator and passes through those intervals that match the IntervalFilter.accept() function
Representation of an interval function that can be converted to IntervalsSource.
A DocIdSetIterator that also allows iteration over matching intervals in a document.
 
 
An extension of MatchesIterator that allows it to be treated as an IntervalIterator
A query that retrieves documents containing intervals returned from an IntervalsSource
Node that represents an interval function.
Builds a Query from an IntervalQueryNode.
This processor makes sure that StandardQueryConfigHandler.ConfigurationKeys.ANALYZER is defined in the QueryConfigHandler and injects this analyzer into IntervalQueryNodes.
Factory functions for creating interval sources.
 
 
 
 
A helper class for IntervalQuery that provides an IntervalIterator for a given field and segment
Field that stores a per-document int value for scoring, sorting or value retrieval and index the field for fast range filters.
Obtains int field values from LeafReader.getNumericDocValues(java.lang.String) and makes those values available as other numeric types, casting as needed.
A hash map of int to float, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
A hash set of ints, implemented using open addressing with linear probing for collision resolution.
A hash map of int to int, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
A hash map of int to Object, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
An indexed int field for fast range filters.
Builder for multi range queries for IntPoints
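For example (a minimal fragment; the field name and bounds are illustrative):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.IntPoint;
    import org.apache.lucene.search.Query;

    Document doc = new Document();
    doc.add(new IntPoint("count", 42));                     // index the point value
    Query range = IntPoint.newRangeQuery("count", 1, 100);  // 1 <= count <= 100, inclusive
    Query exact = IntPoint.newExactQuery("count", 42);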
An indexed Integer Range field.
DocValues field for IntRange.
 
Adaptive selection algorithm based on the introspective quick select algorithm.
Sorter implementation based on a variant of the quicksort algorithm called introsort: when the recursion level exceeds the log of the length of the array to sort, it falls back to heapsort.
An FST Outputs implementation where each output is a sequence of ints.
 
Represents int[], as a slice (offset + length) into an existing int[].
A builder for IntsRef instances.
Enumerates all input (IntsRef) + output pairs in an FST.
Holds a single input (IntsRef) + output pair.
Native int to int function
Exception thrown if TokenStream Tokens are incompatible with provided text
Describes how an IndexableField should be inverted for indexing terms and postings.
An IO operation with a single input that may throw an IOException.
IOContext holds additional details on the merge/search context.
Context is an enumeration which specifies the context in which the Directory is being used.
A Function that may throw an IOException
A Runnable that may throw an IOException
This is a result supplier that is allowed to throw an IOException.
Utilities for dealing with Closeables.
Deprecated, for removal: This API element is subject to removal in a future version.
was replaced by IOConsumer.
Deprecated, for removal: This API element is subject to removal in a future version.
was replaced by IOFunction.
Analyzer for Irish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
Normalises token text to lower case, handling t-prothesis and n-eclipsis (i.e., that 'nAthair' should become 'n-athair')
This class implements the stemming algorithm defined by a snowball script.
 
Analyzer for Italian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies ItalianLightStemmer to stem Italian words.
Light Stemmer for Italian.
This class implements the stemming algorithm defined by a snowball script.
Analyzer for Japanese that uses morphological analysis.
Atomically loads DEFAULT_STOP_SET, DEFAULT_STOP_TAGS in a lazy fashion once the outer class accesses the static final set the first time.
Replaces term text with the BaseFormAttribute.
Analyzer for Japanese completion suggester.
A TokenFilter that adds Japanese romanized tokens to the term attribute.
 
 
Completion mode
Utility methods for Japanese filters.
A TokenFilter that normalizes small letters (捨て仮名) in hiragana into normal letters.
Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
A TokenFilter that normalizes small letters (捨て仮名) in katakana into normal letters.
A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
Buffer that holds a Japanese number string and a position index used as a parsed-to marker
Removes tokens that match a set of part-of-speech tags.
A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
Tokenizer for Japanese that uses morphological analysis.
 
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
 
Token type reflecting the original source of this token
 
Factory for JapaneseTokenizer.
Similarity measure for short strings such as person names.
InfoStream implementation that logs every message using Java Utils Logging (JUL) with the supplied log level.
This class provides an empty implementation of JavascriptVisitor, which can be extended to create a visitor which only needs to handle a subset of the available methods.
An expression compiler for javascript expressions.
 
Overrides the ANTLR 4 generated JavascriptLexer to allow for proper error handling
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Allows for proper error handling in the ANTLR 4 parser
This interface defines a complete generic visitor for a parse tree produced by JavascriptParser.
Use a field value and find the Document Frequency within another field.
Utility for query time joining.
Similar to LongFunction for primitive argument and result.
Similar to BiConsumer for primitive arguments.
Converts a Katakana string to Romaji using the pre-defined Katakana-Romaji mapping rules.
 
This IndexDeletionPolicy implementation keeps only the most recent commit and immediately removes all prior commits after a new commit is done.
A TokenFilter that only keeps tokens with text contained in the required words.
Factory for KeepWordFilter.
"Tokenizes" the entire stream as a single token.
This attribute can be used to mark a token as a keyword.
Default implementation of KeywordAttribute.
Field that indexes a per-document String or BytesRef into an inverted index for fast filtering, stores values in a columnar fashion using DocValuesType.SORTED_SET doc values for sorting and faceting, and optionally stores values as stored fields for top-hits retrieval.
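In recent Lucene versions (9.6 and later) this single field can replace the usual StringField plus SortedSetDocValuesField pair; a minimal fragment with illustrative names:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.KeywordField;
    import org.apache.lucene.search.Query;

    Document doc = new Document();
    doc.add(new KeywordField("id", "doc-42", Field.Store.YES));
    // Exact-match filtering over the indexed terms:
    Query q = KeywordField.newExactQuery("id", "doc-42");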
Marks terms as keywords via the KeywordAttribute.
Factory for KeywordMarkerFilter.
This TokenFilter emits each incoming token twice, once as a keyword and once as a non-keyword; in other words, once with KeywordAttribute.setKeyword(boolean) set to true and once set to false.
Factory for KeywordRepeatFilter.
Emits the entire input as a single token.
Factory for KeywordTokenizer.
A k-Nearest Neighbor classifier based on NearestFuzzyQuery.
A k-Nearest Neighbor classifier (see http://en.wikipedia.org/wiki/K-nearest_neighbors ) based on MoreLikeThis
A k-Nearest Neighbor Document classifier (see http://en.wikipedia.org/wiki/K-nearest_neighbors) based on MoreLikeThis .
A field that contains a single byte numeric vector (or none) for each document.
Uses KnnVectorsReader.search(String, byte[], KnnCollector, Bits) to perform nearest neighbour search.
KnnCollector is a knn collector used for gathering kNN results and providing topDocs from the gathered neighbors
KnnCollectorManager responsible for creating KnnCollector instances.
Vectors' writer for a field
A field that contains a single floating-point numeric vector (or none) for each document.
Uses KnnVectorsReader.search(String, float[], KnnCollector, Bits) to perform nearest neighbour search.
Deprecated.
Deprecated.
Use FieldExistsQuery instead.
Deprecated.
Encodes/decodes per-document vector and any associated indexing structures required to support nearest-neighbor search
This static holder class prevents classloading deadlock by delaying init of doc values formats until needed.
Reads vectors from an index.
Writes vectors to an index.
 
View over multiple vector values supporting iterator-style access via DocIdMerger.
 
 
Tracks state of one sub-reader that we are merging
Analyzer for Korean that uses morphological analysis.
A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
Buffer that holds a Korean number string and a position index used as a parsed-to marker
Factory for KoreanNumberFilter.
Removes tokens that match a set of part-of-speech tags.
Replaces term text with the ReadingAttribute which is the Hangul transcription of Hanja characters.
Tokenizer for Korean that uses morphological analysis.
Decompound mode: this determines how the tokenizer handles POS.Type.COMPOUND, POS.Type.INFLECT and POS.Type.PREANALYSIS tokens.
 
Token type reflecting the original source of this token
 
Factory for KoreanTokenizer.
This class implements the stemming algorithm defined by a snowball script.
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A list of words used by Kstem
A high-performance kstem filter for English.
Factory for KStemFilter.
This class implements the Kstem algorithm
 
Associates a label with a CharArrayMatcher to distinguish different sources for terms in highlighting
The lambda (λw) parameter in information-based models.
Computes lambda as (docFreq + 1) / (numberOfDocuments + 1).
Computes lambda as (totalTermFreq + 1) / (numberOfDocuments + 1).
Optimized collector for large number of hits.
An indexed 2-Dimension Bounding Box field for the Geospatial Lat/Lon Coordinate system
An object for accumulating latitude/longitude bounds information.
Distance query for LatLonDocValuesField.
A per-document location field.
Finds all previously indexed geo points that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
Lat/Lon Geometry object.
An indexed location field.
Compares documents by distance from an origin point
 
Distance query for LatLonPoint.
Holder class for prototype sandboxed queries
Finds all previously indexed geo points that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
Sorts by distance from an origin location.
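For example, indexing a point and filtering by distance (a minimal fragment; the coordinates are illustrative):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.LatLonPoint;
    import org.apache.lucene.search.Query;

    Document doc = new Document();
    doc.add(new LatLonPoint("location", 40.7128, -74.0060));  // lat, lon in degrees
    // All points within 10 km of the origin point:
    Query nearby =
        LatLonPoint.newDistanceQuery("location", 40.7128, -74.0060, 10_000);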
A geo shape utility class for indexing and searching gis geometries whose vertices are latitude, longitude values (in decimal degrees).
Finds all previously indexed geo shapes that intersect the specified bounding box.
Holds spatial logic for a bounding box that works in the encoded space
A concrete implementation of ShapeDocValues for storing binary doc value representation of LatLonShape geometries in a LatLonShapeDocValuesField
Concrete implementation of a ShapeDocValuesField for geographic geometries.
Bounding Box query for ShapeDocValuesField representing XYShape
Finds all previously indexed geo shapes that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
Analyzer for Latvian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies LatvianStemmer to stem Latvian words.
Factory for LatvianStemFilter.
Light stemmer for Latvian.
 
Defers actually loading a field's value until you ask for it.
Collector decouples the score from the collected doc: the score computation is skipped entirely if it's not needed.
Expert: comparator that gets instantiated on each leaf from a top-level FieldComparator instance.
Provides read-only metadata about a leaf.
LeafReader is an abstract class, providing an interface for accessing an index.
Retrieves an instance previously written by LegacyDirectMonotonicWriter.
In-memory metadata that needs to be kept around for LegacyDirectMonotonicReader to read data from disk.
Write monotonically-increasing sequences of integers.
Retrieves an instance previously written by LegacyDirectWriter
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Class for writing packed integers to be directly read from Directory.
 
Immutable version of Packed64 which is constructed from an existing DataInput.
This class is similar to LegacyPacked64 except that it trades space for speed by ensuring that a single block needs to be read/written in order to read/write a value.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Legacy PackedInts operations.
Removes words that are too long or too short from the stream.
Factory for LengthFilter.
Wraps another BreakIterator to skip past breaks that would result in passages that are too short.
A LetterTokenizer is a tokenizer that divides text at non-letters.
Factory for LetterTokenizer.
Parametric description for generating a Levenshtein automaton of degree 1.
Parametric description for generating a Levenshtein automaton of degree 1, with transpositions as primitive edits.
Parametric description for generating a Levenshtein automaton of degree 2.
Parametric description for generating a Levenshtein automaton of degree 2, with transpositions as primitive edits.
Class to construct DFAs that match a word within some edit distance.
A ParametricDescription describes the structure of a Levenshtein DFA for some degree n.
Levenshtein edit distance class.
The Lift class is a data structure that is a variation of a Patricia trie.
Builder for MoreLikeThisQuery
FiniteStringsIterator which limits the number of iterated accepted strings.
This Analyzer limits the number of tokens while indexing.
This TokenFilter limits the number of tokens while indexing.
Lets all tokens pass through until it sees one with a start offset <= a configured limit, which won't pass and ends the stream.
This is a simplified version of org.apache.lucene.analysis.miscellaneous.LimitTokenOffsetFilter to prevent a dependency on analysis-common.jar.
This TokenFilter limits its emitted tokens to those with positions that are not greater than the configured limit.
Represents a line on the earth's surface.
2D geo line implementation represented as a balanced interval tree of edges.
Linear distance computation style.
LinearFloatFunction implements a linear function over another ValueSource.
Linear squared distance computation style.
Wraps another Outputs implementation and encodes one or more of its output values.
Passes the field value through as a String, no matter the type.
Analyzer for Lithuanian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
Format for live/deleted documents
Tracks live field values across NRT reader reopens.
Holds all the configuration used by IndexWriter with a few setters for settings that can be changed on an IndexWriter instance "live".
Bayesian smoothing using Dirichlet priors.
Language model based on the Jelinek-Mercer smoothing method.
Abstract superclass for language modeling Similarities.
A strategy for computing the collection language model.
Models p(w|C) as the number of occurrences of the term in the collection, divided by the total number of tokens + 1.
Stores the collection distribution of the current term.
An interprocess mutex lock.
Base class for Locking implementation.
This exception is thrown when the write.lock could not be acquired.
This exception is thrown when the write.lock could not be released.
Simple standalone tool that forever acquires and releases a lock using a specific LockFactory.
This class makes a best-effort check that a provided Lock is valid before any destructive filesystem operation.
Simple standalone server that must be running when you use VerifyingLockFactory.
This is a LogMergePolicy that measures size of a segment as the total byte size of the segment's files.
This is a LogMergePolicy that measures size of a segment as the number of documents (not taking deletions into account).
This class implements a MergePolicy that tries to merge segments into levels of exponentially increasing size, where each level has fewer segments than the value of the merge factor.
 
An array-backed list of long.
An iterator implementation for LongArrayList.iterator().
BitSet of fixed length (numBits), backed by accessible (LongBitSet.getBits()) long[], accessed with a long index.
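For orientation, a minimal sketch of LongBitSet usage (the wrapper class and values are illustrative, not part of this index):

    import org.apache.lucene.util.LongBitSet;

    public class LongBitSetSketch {
      public static void main(String[] args) {
        LongBitSet bits = new LongBitSet(1_000_000L); // number of bits is fixed at construction
        bits.set(42L);                                // set a single bit by long index
        System.out.println(bits.get(42L));            // true
        System.out.println(bits.cardinality());       // 1
      }
    }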
Comparator based on Long.compare(long, long) for numHits.
Forked from HPPC, holding int index and long value.
 
Abstract FunctionValues implementation which supports retrieving long values.
Field that stores a per-document long value for scoring, sorting or value retrieval and index the field for fast range filters.
Obtains long field values from LeafReader.getNumericDocValues(java.lang.String) and makes those values available as other numeric types, casting as needed.
A hash map of long to float, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
A hash set of longs, implemented using open addressing with linear probing for collision resolution.
A min heap that stores longs; a primitive priority queue that like all priority queues maintains a partial ordering of its elements such that the least element can always be found in constant time.
A hash map of long to int, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
A hash map of long to Object, implemented using open addressing with linear probing for collision resolution.
Forked from HPPC, holding int index, key and value.
An indexed long field for fast range filters.
Builder for multi range queries for LongPoints
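For orientation, a minimal sketch of indexing a LongPoint and filtering it with a range query (field name "timestamp" and the wrapper class are illustrative):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.LongPoint;
    import org.apache.lucene.search.Query;

    public class LongPointSketch {
      public static void main(String[] args) {
        Document doc = new Document();
        // Indexed as a 1D point for fast range filtering; not stored, no doc values.
        doc.add(new LongPoint("timestamp", 1_693_526_400_000L));
        // Bounds are inclusive on both ends.
        Query q = LongPoint.newRangeQuery("timestamp", 1_693_440_000_000L, 1_693_526_400_000L);
        System.out.println(q);
      }
    }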
An indexed Long Range field.
Represents a contiguous range of long values, with an inclusive minimum and exclusive maximum
DocValues field for LongRange.
Groups double values into ranges
A GroupSelector implementation that groups documents by long values
 
Represents long[], as a slice (offset + length) into an existing long[].
Per-segment, per-document long values, which can be calculated at search-time
Abstraction over an array of longs.
Base class for producing LongValues
A ConstantLongValuesSource that always returns a constant value
 
 
 
 
 
Simple Lookup interface for CharSequence suggestions.
 
A PriorityQueue that collects a fixed number of high-priority Lookup.LookupResult objects.
Result of a lookup.
This class implements the stemming algorithm defined by a snowball script.
Utility class that can efficiently compress arrays that mostly contain characters in the [0x1F,0x3F) or [0x5F,0x7F) ranges, which notably include all digits, lowercase characters, '.', '-' and '_'.
Normalizes token text to lower case.
Normalizes token text to lower case.
Factory for LowerCaseFilter.
A QueryCache that evicts queries using a LRU (least-recently-used) eviction policy in order to remain under a given maximum size and number of bytes used.
Cache of doc ids with a count.
 
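For orientation, a minimal sketch of installing the LRUQueryCache described above (the sizes and wrapper class are illustrative):

    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.LRUQueryCache;
    import org.apache.lucene.search.QueryCache;

    public class QueryCacheSketch {
      public static void main(String[] args) {
        // Cache up to 1000 queries, bounded to roughly 32 MB of RAM.
        QueryCache cache = new LRUQueryCache(1000, 32L * 1024 * 1024);
        IndexSearcher.setDefaultQueryCache(cache); // applies to newly created searchers
      }
    }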
An LSB radix sorter for unsigned int values.
A block-based terms index and dictionary that assigns terms to variable length blocks according to how they share prefixes.
Lucene 5.0 compound file format
Class for accessing a compound stream.
Offset/Length for a slice inside of a compound file
A StoredFieldsFormat that compresses documents in chunks in order to improve the compression ratio.
A serialized document; you need to decode its input in order to get an actual Document.
A TermVectorsFormat that compresses chunks of documents together in order to improve the compression ratio.
 
 
 
Lucene 5.0 live docs format
Lucene 5.0 postings format, which encodes postings in packed integer blocks for fast decode.
Holds all state required for Lucene50PostingsReader to produce a PostingsEnum without re-seeking the terms dict.
Concrete class that reads the docId (and possibly freq, pos, offset, payload) lists with the postings format.
 
 
Implements the skip list reader for block postings format that stores positions and payloads.
Lucene 5.0 stored fields format.
 
Configuration option for stored fields.
Lucene 6.0 Field Infos format.
Lucene 6.0 point format, which encodes dimensional values in a block KD-tree structure for fast 1D range and N dimensional shape intersection filtering.
Reads point values previously written with Lucene60PointsWriter
Implements the Lucene 7.0 index format, with configurable per-field postings and docvalues formats.
 
Lucene 7.0 DocValues format.
 
 
 
 
 
 
 
 
 
 
 
 
 
Lucene 7.0 Score normalization format.
 
 
 
Lucene 7.0 Segment info format.
Implements the Lucene 8.0 index format.
 
Lucene 8.0 DocValues format.
Configuration option for doc values.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Lucene 8.0 Score normalization format.
 
 
 
Implements the Lucene 8.4 index format, with configurable per-field postings and docvalues formats.
Lucene 5.0 postings format, which encodes postings in packed integer blocks for fast decode.
Holds all state required for Lucene84PostingsReader to produce a PostingsEnum without re-seeking the terms dict.
Concrete class that reads the docId (and possibly freq, pos, offset, payload) lists with the postings format.
 
 
Implements the skip list reader for block postings format that stores positions and payloads.
Implements the Lucene 8.6 index format, with configurable per-field postings and docvalues formats.
Lucene 8.6 point format, which encodes dimensional values in a block KD-tree structure for fast 1D range and N dimensional shape intersection filtering.
Reads point values previously written with Lucene86PointsWriter
Lucene 8.6 Segment info format.
Implements the Lucene 8.6 index format, with configurable per-field postings and docvalues formats.
Configuration option for the codec.
Lucene 8.7 stored fields format.
Configuration option for stored fields.
A block-based terms index and dictionary that assigns terms to variable length blocks according to how they share prefixes.
Block-based terms index and dictionary writer.
 
 
 
A helper class for an HNSW graph that serves as a comparator of the currently set bound value with a new value.
A helper class for an HNSW graph that serves as a comparator of the currently set maximum value with a new value.
A helper class for an HNSW graph that serves as a comparator of the currently set minimum value with a new value.
Implements the Lucene 9.0 index format
Configuration option for the codec.
Lucene 9.0 compound file format
 
 
Class for accessing a compound stream.
Offset/Length for a slice inside of a compound file
A StoredFieldsFormat that compresses documents in chunks in order to improve the compression ratio.
A serialized document; you need to decode its input in order to get an actual Document.
 
 
A TermVectorsFormat that compresses chunks of documents together in order to improve the compression ratio.
 
 
 
 
 
 
Lucene 9.0 DocValues format.
 
 
 
 
 
 
 
 
 
 
Lucene 9.0 Field Infos format.
Builder for HNSW graph.
Lucene 9.0 vector format, which encodes numeric vector values and an optional associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the vector values from the index input.
Read the nearest-neighbors graph from the index input
Lucene 9.0 live docs format
NeighborArray encodes the neighbors of a node and their mutual scores in the HNSW graph as a pair of growable arrays.
Lucene 9.0 Score normalization format.
 
 
 
An HnswGraph where all nodes and connections are held in memory.
Lucene 9.0 point format, which encodes dimensional values in a block KD-tree structure for fast 1D range and N dimensional shape intersection filtering.
Reads point values previously written with Lucene90PointsWriter
Writes dimensional values
Lucene 9.0 postings format, which encodes postings in packed integer blocks for fast decode.
Holds all state required for Lucene90PostingsReader to produce a PostingsEnum without re-seeking the terms dict.
Concrete class that reads the docId (and possibly freq, pos, offset, payload) lists with the postings format.
Concrete class that writes the docId (and possibly freq, pos, offset, payload) lists with the postings format.
 
 
Lucene 9.0 Segment info format.
Implements the skip list reader for block postings format that stores positions and payloads.
Writes skip lists with multiple levels, and supports skipping within blocks of ints.
Lucene 9.0 stored fields format.
Configuration option for stored fields.
A helper class for an HNSW graph that serves as a comparator of the currently set bound value with a new value.
A helper class for an HNSW graph that serves as a comparator of the currently set maximum value with a new value.
A helper class for an HNSW graph that serves as a comparator of the currently set minimum value with a new value.
Implements the Lucene 9.1 index format
Configuration option for the codec.
Lucene 9.1 vector format, which encodes numeric vector values and an optional associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the vector values from the index input.
Read the nearest-neighbors graph from the index input
NeighborArray encodes the neighbors of a node and their mutual scores in the HNSW graph as a pair of growable arrays.
An HnswGraph where all nodes and connections are held in memory.
Implements the Lucene 9.2 index format
Configuration option for the codec.
Lucene 9.2 vector format, which encodes numeric vector values and an optional associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the nearest-neighbors graph from the index input
Implements the Lucene 9.4 index format
Configuration option for the codec.
Lucene 9.0 Field Infos format.
Lucene 9.4 vector format, which encodes numeric vector values and an optional associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the nearest-neighbors graph from the index input
Implements the Lucene 9.5 index format
Configuration option for the codec.
Lucene 9.5 vector format, which encodes numeric vector values and an optional associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the nearest-neighbors graph from the index input
Implements the Lucene 9.9 index format
Configuration option for the codec.
Lucene 9.9 flat vector format, which encodes numeric vector values
Reads vectors from the index segments.
 
Writes vector values to index segments.
 
 
Lucene 9.9 vector format, which encodes numeric vector values into an associated graph connecting the documents having values.
Lucene 9.9 vector format, which encodes numeric vector values into an associated graph connecting the documents having values.
Reads vectors from the index segments along with index data structures supporting KNN search.
 
Read the nearest-neighbors graph from the index input
Writes vector values and knn graphs to index segments.
 
Lucene 9.9 postings format, which encodes postings in packed integer blocks for fast decode.
Holds all state required for Lucene99PostingsReader to produce a PostingsEnum without re-seeking the terms dict.
Concrete class that reads the docId (and possibly freq, pos, offset, payload) lists with the postings format.
Concrete class that writes the docId (and possibly freq, pos, offset, payload) lists with the postings format.
Optimized scalar quantized implementation of FlatVectorsScorer for quantized vectors stored in the Lucene99 format.
 
Calculates dot product on quantized vectors, applying the appropriate corrections
 
 
 
 
Format supporting vector quantization, storage, and retrieval
Reads Scalar Quantized vectors from the index segments along with index data structures.
 
 
Writes quantized vector values and metadata to index segments.
 
 
Returns a merged view over all the segment's QuantizedByteVectorValues.
 
 
 
 
 
 
Lucene 9.9 Segment info format.
Implements the skip list reader for block postings format that stores positions and payloads.
Writes skip lists with multiple levels, and supports skipping within blocks of ints.
Lucene Dictionary: terms taken from the given field of a Lucene index.
Damerau-Levenshtein (optimal string alignment) implemented in a consistent way as Lucene's FuzzyTermsEnum with the transpositions option enabled.
LZ4 compression and decompression routines.
Simple lossy LZ4.HashTable that only stores the last occurrence for each hash on 2^14 bytes of memory.
A record of previous occurrences of sequences of 4 bytes.
A higher-precision LZ4.HashTable.
 
16 bits per offset.
32 bits per value, only used when inputs exceed 64kB.
A compression mode that compromises on the compression ratio to provide fast compression and decompression.
A compression mode that compromises on the compression ratio to provide fast compression and decompression.
 
 
 
 
Helper class for keeping Lists of Objects associated with keys.
 
A Fields implementation that merges multiple Fields into one, and maps around deleted documents.
 
 
Simplistic CharFilter that applies the mappings contained in a NormalizeCharMap to the character stream, correcting the resulting offsets.
Factory for MappingCharFilter.
Exposes flex API, merged from flex API of sub-segments, remapping docIDs (this is used for segment merging).
 
A query that matches all documents.
Builder for MatchAllDocsQuery
A MatchAllDocsQueryNode indicates that a query node tree or subtree will match all documents if executed in the index.
Builds a MatchAllDocsQuery object from a MatchAllDocsQueryNode object.
This processor converts every WildcardQueryNode that is "*:*" to MatchAllDocsQueryNode.
Interface for the creation of new CandidateMatcher objects
Reports the positions and optionally offsets of all matching terms in a query for a single document
An iterator over match positions (and optionally offsets) for a single document and field
Contains static functions that aid the implementation of Matches and MatchesIterator interfaces.
An example highlighter that combines several lower-level highlighting utilities in this package into a fully featured, ready-to-use component.
Single document's highlights.
 
Actual per-field highlighter.
An OffsetRange of a match, together with the source query that caused it.
Class to hold the results of matching a single Document against queries held in the Monitor
Computes which segments have identical field name to number mappings, which allows stored fields and term vectors in this codec to be bulk-merged.
Computes which segments have identical field name to number mappings, which allows stored fields and term vectors in this codec to be bulk-merged.
A query that matches no documents.
A MatchNoDocsQueryNode indicates that a query node tree or subtree will not match any documents if executed in the index.
Builds a MatchNoDocsQuery object from a MatchNoDocsQueryNode object.
Utility class to compute a list of "match regions" for a given query, searcher and document(s) using Matches API.
Implements MatchRegionRetriever.FieldValueProvider wrapping a preloaded Document.
An abstraction that provides document values for a given field.
A callback for accepting a single document (and its associated leaf reader, leaf document ID) and its match offset ranges, as indicated by the Matches interface retrieved for the query.
Math static utility methods.
Returns the value of IndexReader.maxDoc() for every document.
MaxFloatFunction returns the max of its components.
Implementation class for MaxNonCompetitiveBoostAttribute.
Returns the maximum payload score seen, else 1 if there are no payloads on the doc.
Maintains the maximum score and its corresponding document id concurrently
 
 
Compute maximum scores based on Impacts and keep them in a cache in order not to run expensive similarity score computations multiple times on the same data.
Implemented by Geo3D shapes that can calculate if a point is within it or not.
Bitset collector which supports memory tracking
High-performance single-document main memory Apache Lucene fulltext search index.
 
 
 
 
 
 
A MemoryIndex.SlicedIntBlockPool.SliceWriter that allows writing multiple integer slices into a given IntBlockPool.
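A hedged sketch of the single-document MemoryIndex described above; the field name and text are illustrative:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.memory.MemoryIndex;
    import org.apache.lucene.search.TermQuery;

    public class MemoryIndexSketch {
      public static void main(String[] args) {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        MemoryIndex index = new MemoryIndex();
        index.addField("content", "readings about salmon fishing", analyzer);
        // search() returns a relevance score > 0 on a match, 0.0 otherwise.
        float score = index.search(new TermQuery(new Term("content", "salmon")));
        System.out.println(score > 0 ? "match" : "no match");
      }
    }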
Uses an Analyzer on content to get offsets and then populates a MemoryIndex.
Tracks dynamic allocations/deallocations of memory for transient objects
Provides a merged sorted view from several sorted iterators.
 
 
A MergeInfo provides information required for a MERGE context.
A simple extension to wrap MergePolicy to merge all tiny segments (or at least segments smaller than specified in MergeOnFlushMergePolicy.setSmallSegmentThresholdMB(double)) into one segment on commit.
Utility class to handle conversion between megabytes and bytes
Expert: a MergePolicy determines the sequence of primitive merge operations.
Thrown when a merge was explicitly aborted because IndexWriter.abortMerges() was called.
This interface represents the current context of the merge selection process.
Exception thrown if there are any problems while executing a merge.
 
A MergeSpecification instance provides the information necessary to perform multiple merges.
OneMerge provides the information necessary to perform an individual primitive merge operation, resulting in a single new segment.
Progress and state for an executing merge.
Reason for pausing the merge thread.
This is the RateLimiter that IndexWriter assigns to each running merge, to give MergeSchedulers ionice-like control.
Expert: IndexWriter uses an instance implementing this interface to execute the merges selected by a MergePolicy.
Provides access to new merges and executes the actual merge
Holds common state used during segment merging.
A map of doc IDs.
MergeTrigger is passed to MergePolicy.findMerges(MergeTrigger, SegmentInfos, MergePolicy.MergeContext) to indicate the event that triggered the merge.
Message interface for lazy loading.
Default implementation of Message interface.
Docs iterator that starts iterating from a configurable minimum document
MinFloatFunction returns the min of its components.
Generate min hash tokens from an incoming stream of tokens.
 
128 bits of state
Operations for minimizing automata.
 
 
 
 
 
 
 
 
Calculates the minimum payload seen
Node that represents a minimum-should-match restriction on a GroupQueryNode.
File-based Directory implementation that uses mmap for reading, and FSDirectory.FSIndexOutput for writing.
 
A ModifierQueryNode indicates the modifier value (+,-,?,NONE) for each term on the query string.
Modifier type: such as required (REQ), prohibited (NOT)
Builds no object, it only returns the Query object set on the ModifierQueryNode object using a QueryTreeBuilder.QUERY_TREE_BUILDER_TAGID tag.
A class that modifies the given misspelled word in various ways to get correct suggestions
 
Simple ResourceLoader that uses Module.getResourceAsStream(String) and Class.forName(Module,String) to open resources and classes, respectively.
A Monitor contains a set of Query objects with associated IDs, and efficiently matches them against sets of Document objects.
Statistics for the query cache and query index
 
Encapsulates various configuration settings for a Monitor's query index
Defines a query to be stored in a Monitor
Serializes and deserializes MonitorQuery objects into byte streams
For reporting events on a Monitor's query index
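A hedged sketch of registering and matching with the Monitor and MonitorQuery classes above (field name "body" and query id "q1" are illustrative; assumes the lucene-monitor module):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.monitor.*;
    import org.apache.lucene.search.TermQuery;

    public class MonitorSketch {
      public static void main(String[] args) throws Exception {
        try (Monitor monitor = new Monitor(new StandardAnalyzer())) {
          monitor.register(new MonitorQuery("q1", new TermQuery(new Term("body", "fox"))));
          Document doc = new Document();
          doc.add(new TextField("body", "the quick brown fox", Field.Store.NO));
          MatchingQueries<QueryMatch> matches = monitor.match(doc, QueryMatch.SIMPLE_MATCHER);
          System.out.println(matches.getMatchCount()); // 1
        }
      }
    }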
Provides random access to a stream written with MonotonicBlockPackedWriter.
A writer for large monotonically increasing sequences of positive longs.
 
 
Generate "more like this" similarity queries.
PriorityQueue that orders words by score.
Use for frequencies and to avoid renewing Integers.
 
A simple wrapper for MoreLikeThis for use in scenarios where a Query object is required eg in custom QueryParser extensions.
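A hedged sketch of typical MoreLikeThis usage (field name "body" and the surrounding helper method are illustrative):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queries.mlt.MoreLikeThis;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;

    public class MoreLikeThisSketch {
      static TopDocs findSimilar(Directory dir, int docId) throws Exception {
        try (IndexReader reader = DirectoryReader.open(dir)) {
          MoreLikeThis mlt = new MoreLikeThis(reader);
          mlt.setAnalyzer(new StandardAnalyzer());
          mlt.setFieldNames(new String[] {"body"});
          Query like = mlt.like(docId); // build a query from an existing document
          return new IndexSearcher(reader).search(like, 10);
        }
      }
    }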
Radix sorter for variable-length strings.
Concatenates multiple Bits together, on every lookup.
Abstract ValueSource implementation which wraps multiple ValueSources and applies an extendible boolean function to their values.
A Collector which allows running a search with several Collectors.
 
 
A CollectorManager implementation which wraps a set of CollectorManagers, just as MultiCollector does for Collectors.
A wrapper for CompositeIndexReader providing access to DocValues.
Implements SortedDocValues over n subs, using an OrdinalMap
Implements MultiSortedSetDocValues over n subs, using an OrdinalMap
This processor is used to expand terms so the query looks for the same term in different fields.
A QueryParser which constructs queries to search multiple fields.
Provides a single Fields term index view over an IndexReader.
FieldOffsetStrategy that combines offsets from multiple fields.
Abstract ValueSource implementation which wraps multiple ValueSources and applies an extendible float function to their values.
Abstract parent class for ValueSource implementations that wrap multiple ValueSources and apply their own logic.
 
MultiLeafKnnCollector is a specific KnnCollector that can exchange the top collected results across segments through a shared global queue.
Utility methods for working with an IndexReader as if it were a LeafReader.
This abstract class reads skip lists with multiple levels.
This abstract class writes skip lists with multiple levels.
Class to hold the results of matching a batch of Documents against queries held in the Monitor
Copy of LeafSimScorer that sums document's norms from multiple fields.
 
This tool splits an input index into multiple equal parts.
This class emulates deletions on the underlying index.
 
A TermFilteredPresearcher that indexes queries multiple times, with terms collected from different routes through a querytree.
A generalized version of PhraseQuery, with the possibility of adding more than one term at the same position that are treated as a disjunction (OR).
A builder for multi-phrase queries
 
Slower version of UnionPostingsEnum that delegates offsets and positions, for use by MatchesIterator
Takes the logical union of multiple PostingsEnum iterators.
Disjunction of postings ordered by docid.
Queue of terms for a single document.
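A minimal sketch of the MultiPhraseQuery builder described above (field name and terms are illustrative):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.MultiPhraseQuery;

    public class MultiPhraseSketch {
      public static void main(String[] args) {
        MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
        builder.add(new Term("body", "quick"));
        // Either alternative may appear at the second position (a disjunction).
        builder.add(new Term[] { new Term("body", "fox"), new Term("body", "foxes") });
        MultiPhraseQuery query = builder.build();
        System.out.println(query);
      }
    }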
A MultiPhraseQueryNode indicates that its children should be used to build a MultiPhraseQuery instead of PhraseQuery.
Builds a MultiPhraseQuery object from a MultiPhraseQueryNode object.
Exposes PostingsEnum, merged from PostingsEnum API of sub-segments.
Holds a PostingsEnum along with the corresponding ReaderSlice.
Abstract class for range queries involving multiple ranges against physical points such as IntPoint. All ranges are logically ORed together.
A builder for multirange queries.
A range represents anything with a min/max value that can compute its relation with another range and can compute if a point is inside it
Representation of a single clause in a MultiRangeQuery
An interval tree of Ranges for speeding up computations
Represents a range that can compute its relation with another range and can compute if a point is inside it
A CompositeReader which reads multiple indexes, appending their content.
A Multiset is a set that allows for duplicate elements.
Implements the CombSUM method for combining evidence from multiple similarity values, described in: Joseph A. Shaw and Edward A. Fox, "Combination of Multiple Searches".
 
 
 
Support for highlighting multi-term queries.
 
 
An abstract Query that matches documents containing a subset of terms provided by a FilteredTermsEnum enumeration.
Abstract class that defines how the query is rewritten.
A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, but adjusts the frequencies used for scoring to be blended across the terms; otherwise the rarest term typically ranks highest (often not useful, e.g. in the set of expanded terms in a FuzzyQuery).
A rewrite method that first translates each term into BooleanClause.Occur.SHOULD clause in a BooleanQuery, but the scores are only computed as the boost.
A rewrite method that first translates each term into BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
This class provides the functionality behind MultiTermQuery.CONSTANT_SCORE_BLENDED_REWRITE.
This class provides the functionality behind MultiTermQuery.CONSTANT_SCORE_REWRITE.
This processor instates the default MultiTermQuery.RewriteMethod, MultiTermQuery.CONSTANT_SCORE_BLENDED_REWRITE, for multi-term query nodes.
Exposes flex API, merged from flex API of sub-segments.
Exposes TermsEnum API, merged from TermsEnum API of sub-segments.
 
 
The MultiTrie is a Trie of Tries.
The MultiTrie is a Trie of Tries.
Obtains double field values from LeafReader.getSortedNumericDocValues(java.lang.String) and using a SortedNumericSelector it gives a single-valued ValueSource view of a field.
Obtains float field values from LeafReader.getSortedNumericDocValues(java.lang.String) and using a SortedNumericSelector it gives a single-valued ValueSource view of a field.
Obtains int field values from LeafReader.getSortedNumericDocValues(java.lang.String) and using a SortedNumericSelector it gives a single-valued ValueSource view of a field.
Obtains long field values from LeafReader.getSortedNumericDocValues(java.lang.String) and using a SortedNumericSelector it gives a single-valued ValueSource view of a field.
A ValueSource that abstractly represents ValueSources for poly fields, and other things.
This is a very fast, non-cryptographic hash suitable for general hash-based lookup.
One leaf PointValues.PointTree whose order of points can be changed.
Utility APIs for sorting and partitioning buffered points.
Base class for all mutable values.
MutableValue implementation of type boolean.
MutableValue implementation of type Date.
MutableValue implementation of type double.
MutableValue implementation of type float.
MutableValue implementation of type int.
MutableValue implementation of type long.
MutableValue implementation of type String.
Utility class to help extract the set of sub queries that have matched from a larger query.
 
Helper class for loading named SPIs from classpath.
Interface to support NamedSPILoader.lookup(String) by name.
A default ThreadFactory implementation that accepts the name prefix of the created threads as a constructor argument.
Implements LockFactory using native OS file locks.
 
Simplification of FuzzyLikeThisQuery, to be used in the context of KNN classification.
 
 
 
KNN search on top of 2D lat/lon indexed points.
 
 
A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
Similar to NearSpansOrdered, but for the unordered case.
NeighborArray encodes the neighbors of a node and their mutual scores in the HNSW graph as a pair of growable arrays.
NeighborQueue uses a LongHeap to store lists of arcs in an HNSW graph, represented as a neighbor node id with an associated score packed together as a sortable long, which is sorted primarily by score.
 
Analyzer for Nepali.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
N-Gram version of edit distance based on paper by Grzegorz Kondrak, "N-gram similarity and distance".
Factory for NGramTokenFilter.
A FragmentChecker based on all character n-grams possible in a certain language, keeping them in a relatively memory-efficient, but probabilistic data structure.
A callback for n-gram ranges in words
This is a PhraseQuery which is optimized for n-gram phrase query.
Tokenizes the input into n-grams of the given size(s).
Tokenizes the input into n-grams of the given size(s).
Factory for NGramTokenizer.
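A minimal sketch of consuming tokens from the NGramTokenizer described above (input and gram sizes are illustrative):

    import java.io.StringReader;
    import org.apache.lucene.analysis.ngram.NGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class NGramSketch {
      public static void main(String[] args) throws Exception {
        try (NGramTokenizer tokenizer = new NGramTokenizer(2, 3)) { // emit 2- and 3-grams
          tokenizer.setReader(new StringReader("fox"));
          CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
          tokenizer.reset();
          while (tokenizer.incrementToken()) {
            System.out.println(term.toString()); // fo, fox, ox
          }
          tokenizer.end();
        }
      }
    }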
An FSDirectory implementation that uses java.nio's FileChannel's positional read, which allows multiple threads to read from the same file without synchronizing.
Message bundle classes extend this class to implement a bundle.
Interface that exceptions should implement to support lazy loading of messages.
A NoChildOptimizationQueryNodeProcessor removes every BooleanQueryNode, BoostQueryNode, TokenizedPhraseQueryNode or ModifierQueryNode that does not have valid children.
 
An IndexDeletionPolicy which keeps all index commits around, never deleting them.
Use this LockFactory to disable locking entirely.
 
A source returning no matches
A MergePolicy which never returns merges to execute.
A MergeScheduler which never executes any merges.
 
 
Never returns offsets.
A null FST Outputs implementation; use this if you just want to build an FSA.
Normal distance computation style.
This class acts as the base class for the implementations of the term frequency normalization methods in the DFR framework.
Implementation used when there is no normalization.
Normalization model that assumes a uniform distribution of the term frequency.
Normalization model in which the term frequency is inversely related to the length.
Dirichlet Priors normalization
Pareto-Zipf Normalization
Holds a map of String input to String output, to be used with MappingCharFilter.
Builds a NormalizeCharMap.
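A minimal sketch of building a NormalizeCharMap and applying it through MappingCharFilter (the mapping and input are illustrative):

    import java.io.Reader;
    import java.io.StringReader;
    import org.apache.lucene.analysis.charfilter.MappingCharFilter;
    import org.apache.lucene.analysis.charfilter.NormalizeCharMap;

    public class MappingCharFilterSketch {
      public static void main(String[] args) throws Exception {
        NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
        builder.add("ph", "f"); // rewrite "ph" to "f" before tokenization
        NormalizeCharMap map = builder.build();
        Reader filtered = new MappingCharFilter(map, new StringReader("photograph"));
        char[] buf = new char[64];
        int len = filtered.read(buf);
        System.out.println(new String(buf, 0, len)); // fotograf
      }
    }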
Normal squared distance computation style.
Abstract API that consumes normalization values.
Tracks state of one numeric sub-reader that we are merging
Deprecated.
Use FieldExistsQuery instead.
Encodes/decodes per-document score normalization values.
Abstract API that produces field normalization values
Function that returns the decoded norm for every document.
Buffers up pending long per doc, then flushes when segment flushes.
 
Analyzer for Norwegian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies NorwegianLightStemmer to stem Norwegian words.
Light Stemmer for Norwegian.
A TokenFilter that applies NorwegianMinimalStemmer to stem Norwegian words.
Minimal Stemmer for Norwegian Bokmål (no-nb) and Nynorsk (no-nn)
This filter normalizes use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (ae, oe, aa) by transforming them to åÅæÆøØ.
This class implements the stemming algorithm defined by a snowball script.
 
 
 
 
This DocIdSet encodes the negation of another DocIdSet.
A NoTokenFoundQueryNode is used if a term is converted into no tokens by the tokenizer/lemmatizer/analyzer (null).
Factory for prohibited clauses
Wraps a RAM-resident directory around any provided delegate directory, to be used during NRT search.
NRTSuggester executes Top N search on a weighted FST specified by a CompletionScorer
Helper to encode/decode payload (surface + PAYLOAD_SEP + docID) output
Compares partial completion paths using CompletionScorer.score(float, float), breaks ties comparing path inputs
Builder for NRTSuggester
 
Fragmenter implementation which does not fragment the text.
This Format parses Long into date strings and vice-versa.
Returns the value of IndexReader.numDocs() for every document.
Abstract numeric comparator for comparing numeric values.
A per-document numeric value.
Field that stores a per-document long value for scoring, sorting or value retrieval.
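A minimal sketch of adding a NumericDocValuesField and sorting on it at search time (field name and value are illustrative; the same field name must be used in the SortField):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.NumericDocValuesField;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;

    public class DocValuesSketch {
      public static void main(String[] args) {
        Document doc = new Document();
        doc.add(new NumericDocValuesField("popularity", 42L)); // column-stride per-doc value
        // Later, at search time, sort descending by the same field:
        Sort sort = new Sort(new SortField("popularity", SortField.Type.LONG, true));
        System.out.println(sort);
      }
    }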
A DocValuesFieldUpdates which holds updates of documents, of a single NumericDocValuesField.
 
 
Buffers up pending long per doc, then flushes when segment flushes.
 
 
 
Assigns a payload to a token based on the TypeAttribute
Helper APIs to encode numeric values as sortable bytes and vice-versa.
Forked from HPPC, holding int index and Object value.
Read the vector values from the index input.
Read the vector values from the index input.
 
Dense vector values that are stored off-heap.
 
 
 
 
Read the vector values from the index input.
Read the vector values from the index input.
Read the vector values from the index input.
 
 
Dense vector values that are stored off-heap.
 
 
 
 
 
 
Provides off-heap storage of a finite state machine (FST), using the underlying index input instead of a byte store on the heap.
Read the quantized vector values and their score correction values from the index input.
Dense vector values that are stored off-heap.
 
 
Reads points from disk in a fixed-width format, previously written with OfflinePointWriter.
Reusable implementation for a point value offline
Writes points to disk in a fixed-width format.
On-disk sorting of byte arrays.
A bit more descriptive unit for constructors.
Utility class to read length-prefixed byte[] entries from an input.
Utility class to emit length-prefixed byte[] entries to an output stream for sorting.
 
Holds one partition of items, either loaded into memory or based on a file.
The start and end character offset of a Token.
Default implementation of OffsetAttribute.
Tracks a reference intervals source, and produces a pseudo-interval that appears either one position before or one position after each interval from the reference
 
This TokenFilter limits the number of tokens while indexing by adding up the current offset.
A non-empty range of offset positions.
An enumeration/iterator of a term and its offsets for use by FieldHighlighter.
A view over several OffsetsEnum instances, merging them in-place
Based on a MatchesIterator; does not look at submatches.
Based on a MatchesIterator with submatches.
 
Based on a PostingsEnum -- the typical/standard OE impl.
This strategy retrieves offsets directly from MatchesIterator, if they are available, otherwise it falls back to using OffsetsFromPositions.
This strategy applies to fields with stored positions but no offsets.
This strategy works for fields where we know the match occurred but there are no known positions or offsets.
This strategy works for fields where we know the match occurred but there are no known positions or offsets.
Determines how match offset regions are computed from MatchesIterator.
A per-field supplier of OffsetsRetrievalStrategy.
A wrapping merge policy that wraps the MergePolicy.OneMerge objects returned by the wrapped merge policy.
Provides on-heap storage of a finite state machine (FST), using a byte array or byte store allocated on the heap.
An HnswGraph where all nodes and connections are held in memory.
 
A OpaqueQueryNode is used to specify values that are not supposed to be parsed by the parser.
Processes TermRangeQuerys with open ranges.
A StringBuilder that allows one to access the array.
Automata operations.
 
 
 
The Optimizer class is a Trie that will be reduced (have empty rows removed).
The Optimizer class is a Trie that will be reduced (have empty rows removed).
Node that represents Intervals.or(IntervalsSource...).
 
 
Maps per-segment ordinals to/from global ordinal space, using a compact packed-ints representation.
 
 
Wraps a provided KnnCollector object, translating the provided vectorId ordinal to a documentId
This is just like Lucene90BlockTreeTermsWriter, except it also stores a version per term, and adds a method to its TermsEnum implementation to seekExact only if the version is >= the specified version.
 
 
 
 
 
BlockTree's implementation of Terms.
 
 
Iterates through terms in this field.
Holds a single input (IntsRef) + output pair.
 
An ordinal based TermState
Configuration for DirectMonotonicReader and IndexedDISI for reading sparse vectors.
Factory for disjunctions
A OrQueryNode represents an OR boolean operation performed on a list of nodes.
Represents the outputs for an FST, providing the basic algebra required for building and traversing the FST.
A DataOutput wrapping a plain OutputStream.
Implementation class for buffered IndexOutput that writes to an OutputStream.
This subclass is an optimization for writing primitives.
 
Overlays a 2nd LeafReader for the terms of one field, otherwise the primary reader is consulted.
Space optimized random access capable array of values with a fixed number of bits/value.
This class is similar to Packed64 except that it trades space for speed by ensuring that a single block needs to be read/written in order to read/write a value.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
A DataInput wrapper to read unaligned, variable-length packed integers.
A DataOutput wrapper to write unaligned, variable-length packed integers.
Simplistic compression for an array of unsigned long values.
A decoder for packed integers.
An encoder for packed integers.
A format to write packed ints.
Simple class that holds a format and a number of bits per value.
A packed integer array that can be modified.
 
A PackedInts.Reader which has all its values equal to 0 (bitsPerValue = 0).
A read-only random access array of positive integers.
A simple base for Readers that keeps track of valueCount and bitsPerValue.
Run-once iterator interface, to decode previously saved PackedInts.
 
A write-once Writer.
Utility class to compress integers into a LongValues instance.
A Builder for a PackedLongValues instance.
 
 
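A minimal sketch of the PackedLongValues builder described above (the overhead ratio and values are illustrative):

    import org.apache.lucene.util.packed.PackedInts;
    import org.apache.lucene.util.packed.PackedLongValues;

    public class PackedLongValuesSketch {
      public static void main(String[] args) {
        PackedLongValues.Builder builder =
            PackedLongValues.packedBuilder(PackedInts.COMPACT); // favor space over speed
        for (long v = 0; v < 100; v++) {
          builder.add(v * 7);
        }
        PackedLongValues values = builder.build(); // immutable once built
        System.out.println(values.get(10));        // 70
      }
    }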
Represents a logical byte[] as a series of pages.
Provides methods to read BytesRefs from a frozen PagedBytes.
An FST Outputs implementation, holding two other outputs.
Holds a single pair of two outputs.
An CompositeReader which reads multiple, parallel indexes.
An LeafReader which reads multiple, parallel indexes.
 
Matcher class that runs matching queries in parallel.
 
 
 
A query that returns all the matching child documents for a specific parent document indexed together in the same block.
This exception is thrown when parse errors are encountered.
This exception is thrown when parse errors are encountered.
This exception is thrown when parse errors are encountered.
Thrown when the xml queryparser encounters invalid syntax/configuration.
This class represents an extension base class to the Lucene standard QueryParser.
A multi-threaded matcher that collects all possible matches in one pass, and then partitions them amongst a number of worker threads to perform the actual matching.
 
 
Part of Speech attributes for Korean.
Part of Speech attributes for Korean.
A passage is a fragment of source text, scored and possibly with a list of sub-offsets (markers) to be highlighted.
Represents a passage (typically a sentence of the document).
Adjusts the range of one or more passages over a given value.
Formats a collection of passages over a given string, cleaning up and resolving restrictions concerning overlaps, allowed sub-ranges over the input string and length restrictions.
Creates a formatted snippet from the top passages.
 
 
Ranks passages found by UnifiedHighlighter.
Selects fragments of text that score best for the given set of highlight markers.
Tokenizer for path-like hierarchies.
SmartChineseAnalyzer internal node representation
A PathQueryNode is used to store queries like /company/USA/California /product/shoes/brown.
Term text with a beginning and end position
CaptureGroup uses Java regexes to emit multiple tokens - one for each capture group in one or more patterns.
This interface is used to connect the XML pattern file parser to the hyphenation tree.
Marks terms as keywords via the KeywordAttribute.
A SAX document handler to read and parse hyphenation patterns from an XML file.
CharFilter that uses a regular expression to select the target of a replacement string.
A TokenFilter which applies a Pattern to each token in the stream, replacing match occurrences with the specified replacement string.
This tokenizer uses regex pattern matching to construct distinct tokens for the input stream.
Factory for PatternTokenizer.
Set a type attribute to a parameterized value when tokens are matched by any of several regex patterns.
Value holding class for pattern typing rules.
Provides a filter that will analyze tokens with the analyzer from an arbitrary field type.
The payload of a Token.
Default implementation of PayloadAttribute.
Defines a way of converting payloads to float values, for use by PayloadScoreQuery
Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to BytesRef.
 
An abstract class that defines a way for PayloadScoreQuery instances to transform the cumulative effects of payload scores for a document.
Utility methods for encoding payloads.
Defines an interface for testing if two payloads should be considered to match.
Creates a payload matcher object based on a payload type and an operation.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
A Query class that uses a PayloadFunction to modify the score of a wrapped SpanQuery
SpanCollector for collecting payloads
Experimental class to get set of payloads for most standard Lucene queries.
This class handles accounting and applying pending deletes for live segment readers
 
This analyzer is used to facilitate scenarios where different fields require different analysis techniques.
Enables per field docvalues support.
 
Enables per field numeric vector support.
VectorReader that can wrap multiple delegate readers, selected by field.
 
Utility class creating a new MergeState to be restricted to a set of fields.
 
 
Enables per field postings support.
Group of fields written by one PostingsFormat
 
 
Provides the ability to use a different Similarity for different fields.
Analyzer for Persian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
CharFilter that replaces instances of Zero-width non-joiner with an ordinary space.
Factory for PersianCharFilter.
A TokenFilter that applies PersianNormalizer to normalize the orthography.
Normalizer for Persian.
A TokenFilter that applies PersianStemmer to stem Persian words.
Factory for PersianStemFilter.
Stemmer for Persian.
A SnapshotDeletionPolicy which adds a persistence layer so that snapshots can be maintained across the life of an application.
Utility class to encode sequences of 128 small positive integers.
Utility class to encode sequences of 128 small positive integers.
Utility class to encode sequences of 128 small positive integers.
Create tokens for phonetic matches.
Factory for PhoneticFilter.
Node that represents Intervals.phrase(String...).
Helps the FieldOffsetStrategy with position-sensitive queries.
Needed to support the ability to highlight a query irrespective of the field a query refers to (aka requireFieldMatch=false).
 
Base class for exact and sloppy phrase matching
Position of a term in a document that takes into account the term offset within the phrase.
A Query that matches documents containing a particular sequence of terms.
A builder for phrase queries.
Term postings and position information for phrase matching
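A minimal sketch of the two ways to build the PhraseQuery described above (field name and terms are illustrative):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.PhraseQuery;

    public class PhraseQuerySketch {
      public static void main(String[] args) {
        // Convenience constructor: slop of 1, field "body", terms in order.
        PhraseQuery q1 = new PhraseQuery(1, "body", "quick", "fox");
        // Builder form with explicit positions (one position left open between the terms):
        PhraseQuery q2 = new PhraseQuery.Builder()
            .add(new Term("body", "quick"), 0)
            .add(new Term("body", "fox"), 2)
            .build();
        System.out.println(q1 + " / " + q2);
      }
    }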
Builds a PhraseQuery object from a TokenizedPhraseQueryNode object.
 
 
Query node for PhraseQuery's slop factor.
This processor removes invalid SlopQueryNode objects in the query node tree.
Expert: Weight class for phrase matching
A generalized version of PhraseQuery, built with one or more MultiTermQuery that provides term expansions for multi-terms (one of the expanded terms must match).
Phrase term with expansions.
All PhraseWildcardQuery.PhraseTerm instances are lightweight and immutable.
Phrase term with no expansion.
Holds a pair of term bytes - term state.
Holds the TermState for all the collected Term, for a specific phrase term, for all segments.
Holds the TermState and TermStatistics for all the matched and collected Term, for all phrase terms, for all segments.
Accumulates the doc freq and total term freq.
Test counters incremented when assertions are enabled.
Split an index based on a Query.
 
Remove this file when adding back compat codecs
Dictionary represented by a text file.
We know about three kinds of planes.
Holds mathematical constants associated with the model of a planet.
Utility class for encoding / decoding from lat/lon (decimal degrees) into sortable doc value numerics (integers)
Relates all Geo3d shapes to a specific PlanetModel.
Represents a point on the earth's surface.
2D point implementation containing geo spatial logic.
Finds all previously indexed points that fall within the specified polygon.
 
 
Abstract query class to find all documents whose single or multi-dimensional point values, previously indexed with e.g. IntPoint, are contained in the specified set.
Iterator of encoded point values.
 
This query node represents a field query that holds a point value.
This processor is used to convert FieldQueryNodes to PointRangeQueryNodes.
Abstract class for range queries against single or multidimensional points such as IntPoint.
Creates a range query across 1D PointValues.
This query node represents a range query composed by PointQueryNode bounds, which means the bound values are Numbers.
Builds PointValues range queries out of PointRangeQueryNodes.
This processor is used to convert TermRangeQueryNodes to PointRangeQueryNodes.
One pass iterator through all points previously written with a PointWriter, abstracting away whether points are read from (offline) disk or simple arrays in heap.
This class holds the configuration used to parse numeric queries and create PointValues queries.
Encodes/decodes indexed points.
Abstract API to visit point values.
Abstract API to write points
Represents a dimensional point value written in the BKD tree.
Access to indexed numeric values.
We recurse the PointValues.PointTree, using a provided instance of this to guide the recursion.
Basic operations to read the KD-tree.
Used by PointValues.intersect(org.apache.lucene.index.PointValues.IntersectVisitor) to check how each recursive cell corresponds to the query.
Buffers up pending byte[][] value(s) per doc, then flushes when segment flushes.
 
Appends many points, and then at the end provides a PointReader to iterate those points.
Analyzer for Polish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
Represents a closed polygon on the earth's surface.
2D polygon implementation represented as a balanced interval tree of edges.
Transforms the token stream as per the Porter stemming algorithm.
Factory for PorterStemFilter.
Stemmer, implementing the Porter Stemming Algorithm
This class implements the stemming algorithm defined by a snowball script.
Analyzer for Portuguese.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies PortugueseLightStemmer to stem Portuguese words.
Light Stemmer for Portuguese
A TokenFilter that applies PortugueseMinimalStemmer to stem Portuguese words.
Minimal Stemmer for Portuguese
A TokenFilter that applies PortugueseStemmer to stem Portuguese words.
Portuguese stemmer implementing the RSLP (Removedor de Sufixos da Lingua Portuguesa) algorithm.
This class implements the stemming algorithm defined by a snowball script.
Part of speech classification for Korean based on Sejong corpus classification.
Part of speech tag for Korean based on Sejong corpus classification.
The type of the token.
Determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching.
Default implementation of PositionIncrementAttribute.
Determines how many positions this token spans.
Default implementation of PositionLengthAttribute.
Utility class to record Positions Spans
An FST Outputs implementation where each output is a non-negative long value.
A Collector implementation which wraps another Collector and makes sure only documents with scores > 0 are collected.
Iterates through the postings.
Encodes/decodes terms, postings, and proximity data.
This static holder class prevents classloading deadlock by delaying init of postings formats until needed.
The core terms dictionaries (BlockTermsReader, BlockTreeTermsReader) interact with a single instance of this class to manage creation of PostingsEnum instances.
Utility class to encode/decode postings block.
Like PostingsOffsetStrategy but also uses term vectors (only terms needed) for multi-term queries.
Class that plugs into term dictionaries, such as Lucene90BlockTreeTermsWriter, and handles writing postings.
Function to raise the base "a" to the power "b"
This processor pipeline extends StandardQueryNodeProcessorPipeline and enables boolean precedence on it.
This query parser works exactly as the standard query parser ( StandardQueryParser ), except that it respect the boolean precedence, so <a AND b OR c AND d> is parsed to <(+a +b) (+c +d)> instead of <+a +b +c +d>.
Prefix codes term instances (prefixes are shared).
Builds a PrefixCodedTerms: call add repeatedly, then finish.
An iterator over the list of terms stored in a PrefixCodedTerms.
A CompletionQuery which takes an Analyzer to analyze the prefix of the query term.
A Query that matches documents containing terms with a specified prefix.
A PrefixWildcardQueryNode represents wildcardquery that matches abc* or *.
Builds a PrefixQuery object from a PrefixWildcardQueryNode object.
A Presearcher is used by the Monitor to reduce the number of queries actually run against a Document.
Wraps a QueryMatch with information about which queries were selected by the presearcher
Wraps a MultiMatchingQueries with information on which presearcher queries were selected
InfoStream implementation over a PrintStream such as System.out.
A priority queue maintains a partial ordering of its elements such that the least element can always be found in constant time.
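A minimal sketch of subclassing the PriorityQueue described above; ordering is defined by overriding lessThan() (the element type and capacity are illustrative):

    import org.apache.lucene.util.PriorityQueue;

    public class PriorityQueueSketch {
      public static void main(String[] args) {
        // The "least" element sits at the top so it can be inspected or replaced cheaply.
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>(3) {
          @Override
          protected boolean lessThan(Integer a, Integer b) {
            return a < b;
          }
        };
        for (int v : new int[] {5, 1, 9, 7}) {
          pq.insertWithOverflow(v); // keeps the 3 largest values seen so far
        }
        System.out.println(pq.top()); // 5, the smallest of the retained values
      }
    }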
ProductFloatFunction returns the product of its components.
This class wraps a Collector and times the execution of setScorer(), collect(), doSetNextReader() and needsScores().
Public class for profiled timings of the Collectors used in the search.
A collector that profiles how much time is spent calling it.
A ConditionalTokenFilter that only applies its wrapped filters to tokens that are not contained in a protected set.
Factory for a ProtectedTermFilter
A ProximityQueryNode represents a query where the terms should meet specific distance conditions.
Utility class containing the distance condition and number.
Distance condition: PARAGRAPH, SENTENCE, or NUMBER
Controls how a LeafFieldComparator skips documents.
Extension of PostingsWriterBase, adding a push API for writing each element of the postings.
A version of ByteVectorValues, but additionally retrieving score correction offset for Scalar quantization scores.
Quantized vector reader
The abstract base class for queries.
Class to analyze and extract terms from a lucene query, to be used by a Presearcher in indexing.
An Analyzer used primarily at query time to wrap another analyzer and provide a layer of protection which prevents very common words from being passed into queries.
A BitSetProducer that wraps a query and caches matching BitSets per segment.
This interface is used by implementors classes that builds some kind of object from a query tree.
Implemented by objects that produce Lucene Query objects from XML streams.
Creates queries from the Analyzer chain.
Wraps a term and boost
Factory for QueryBuilder
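A minimal sketch of the QueryBuilder described above, which analyzes free text into boolean or phrase queries (field name and text are illustrative):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.QueryBuilder;

    public class QueryBuilderSketch {
      public static void main(String[] args) {
        QueryBuilder builder = new QueryBuilder(new StandardAnalyzer());
        Query bool = builder.createBooleanQuery("body", "quick brown fox");
        Query phrase = builder.createPhraseQuery("body", "quick brown fox");
        System.out.println(bool + " / " + phrase);
      }
    }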
A cache for queries.
 
A policy defining which filters should be cached.
This class can be used to hold any query configuration and no field configuration.
Split a disjunction query into its constituent parts, so that they can be indexed and run separately in the Monitor.
 
 
 
 
 
 
 
 
Represents a match for a specific query and document
A QueryNode is an interface implemented by all nodes on a QueryNode tree.
Error class with NLS support
This exception should be thrown if something wrong happens when dealing with QueryNodes.
A QueryNodeImpl is the default implementation of the interface QueryNode
Allow joining 2 QueryNode Trees, into one.
 
This should be thrown when an exception happens during the query parsing from string to the query node tree.
A QueryNodeProcessor is an interface for classes that process a QueryNode tree.
This is a default implementation for the QueryNodeProcessor interface, it's an abstract class, so it should be extended by classes that want to process a QueryNode tree.
 
A QueryNodeProcessorPipeline class should be used to build a query node processor pipeline.
This class is generated by JavaCC.
This class is generated by JavaCC.
 
 
 
 
The default operator for parsing queries.
This class is overridden by QueryParser in QueryParser.jj and acts to separate the majority of the Java code from the .jj grammar file.
Token literal values and constants.
Token literal values and constants.
This class is a helper for the query parser framework; it does all three query parser phases at once: text parsing, query processing and query building.
Flexible Query Parser message bundle class
Token Manager.
Token Manager.
This class defines utility methods to (help) parse query strings into Query objects.
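A minimal sketch of the classic QueryParser described above (the default field "body" and the query string are illustrative):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.Query;

    public class QueryParserSketch {
      public static void main(String[] args) throws Exception {
        QueryParser parser = new QueryParser("body", new StandardAnalyzer());
        Query query = parser.parse("+lucene (search OR index)"); // throws ParseException on bad syntax
        System.out.println(query);
      }
    }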
A record of timings for the various operations that may happen during query execution.
An extension of IndexSearcher that records profile information for all queries it executes.
This class is the internal representation of a profiled Query, corresponding to a single node in the query tree.
Scorer wrapper that will compute how much time is spent on moving the iterator, confirming matches and computing scores.
Helps measure how much time is spent running some methods.
This enum breaks down the query into different sections to describe what was timed.
This class tracks the dependency tree for queries (scoring and rewriting) and generates QueryProfilerBreakdown for each node in the tree.
Weight wrapper that will compute how much time it takes to build the Scorer and then return a Scorer that is wrapped in order to compute timings as well.
A Rescorer that uses a provided Query to assign scores to the first-pass hits.
Scorer implementation which scores text fragments by the number of unique query terms found.
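A hedged sketch of using the QueryScorer above with the highlighter (field name, query and text are illustrative; the default formatter wraps hits in <B> tags):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.highlight.Highlighter;
    import org.apache.lucene.search.highlight.QueryScorer;

    public class HighlighterSketch {
      public static void main(String[] args) throws Exception {
        QueryScorer scorer = new QueryScorer(new TermQuery(new Term("body", "fox")), "body");
        Highlighter highlighter = new Highlighter(scorer);
        String fragment = highlighter.getBestFragment(
            new StandardAnalyzer(), "body", "the quick brown fox jumps");
        System.out.println(fragment); // the quick brown <B>fox</B> jumps
      }
    }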
Utility class used to extract the terms used in a query, plus any weights.
 
Scorer implementation which scores text fragments by the number of unique query terms found.
Notified of the time it takes to run individual queries against a set of documents
Query timeout abstraction that controls whether a query should continue or be stopped.
An implementation of QueryTimeout that can be used by the ExitableDirectoryReader class to time out and exit out when a query takes a long time to rewrite.
A representation of a node in a query tree
 
 
This class should be used when there is a builder for each type of node.
QueryValueSource returns the relevance score of the query
Allows recursion through a query tree
A QuotedFieldQueryNode represents a phrase query.
Radix selector.
A straightforward implementation of FSDirectory using java.io.RandomAccessFile.
Estimates the size (memory representation) of Java objects.
 
Utility methods to estimate the RAM usage of objects.
Random Access Index API.
Random access values for byte[], but also includes accessing the score correction constant for the current vector in the buffer.
Provides random access to vectors by dense ordinal.
Byte vector values.
Float vector values.
A RandomVectorScorer for scoring random nodes in batches against an abstract query.
Creates a default scorer for random access vectors.
A supplier that creates RandomVectorScorer from an ordinal.
Query class for searching RangeField types by a defined PointValues.Relation.
Used by RangeFieldQuery to check how each internal or leaf node relates to the query.
RangeMapFloatFunction implements a map function over another ValueSource, mapping values that fall within min and max (inclusive) to a target value.
Builder for TermRangeQuery
This interface should be implemented by a QueryNode that represents some kind of range query.
Abstract base class to rate limit IO.
Simple class to rate limit IO.
Utility class to safely share DirectoryReader instances across multiple threads, while periodically reopening.
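A minimal sketch of the acquire/release discipline for the reader-sharing utility described above (the directory and surrounding helper are illustrative):

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.ReaderManager;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.store.Directory;

    public class ReaderManagerSketch {
      static void searchOnce(Directory dir) throws Exception {
        ReaderManager manager = new ReaderManager(dir);
        DirectoryReader reader = manager.acquire();
        try {
          new IndexSearcher(reader).search(new MatchAllDocsQuery(), 10);
        } finally {
          manager.release(reader); // always pair acquire() with release()
        }
        manager.maybeRefresh();    // pick up index changes, e.g. from a background thread
        manager.close();
      }
    }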
Holds shared SegmentReader instances.
 
This class merges the current on-disk DV with an incoming update DV instance, giving the incoming update precedence in terms of values; in other words, the values of the update always win over the on-disk version.
Subreader slice from a parent composite reader.
Common util methods for dealing with IndexReaders and IndexReaderContexts.
Attribute for Kuromoji reading data
Attribute for Korean reading data
Attribute for Kuromoji reading data
Attribute for Korean reading data
 
A Collector that decodes the stored query for each document hit, reparsing it every time.
An adapter class to use ByteBuffersDataOutput as a FSTReader.
ReciprocalFloatFunction implements a reciprocal function f(x) = a/(mx+b), based on the float value of a field or function as exported by ValueSource.
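A hedged sketch of the reciprocal function above as a scoring boost; with m=1 and a=b=1000, f(0)=1.0 and f(1000)=0.5, decaying smoothly as x grows (the field name "docAgeDays" is hypothetical):

    import org.apache.lucene.queries.function.FunctionQuery;
    import org.apache.lucene.queries.function.ValueSource;
    import org.apache.lucene.queries.function.valuesource.LongFieldSource;
    import org.apache.lucene.queries.function.valuesource.ReciprocalFloatFunction;

    public class ReciprocalSketch {
      public static void main(String[] args) {
        // f(x) = a/(m*x + b): newer documents (small x) score close to 1.0.
        ValueSource age = new LongFieldSource("docAgeDays"); // hypothetical field
        FunctionQuery boost =
            new FunctionQuery(new ReciprocalFloatFunction(age, 1f, 1000f, 1000f));
        System.out.println(boost);
      }
    }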
Represents a lat/lon rectangle.
2D rectangle implementation containing cartesian spatial logic.
A ByteBlockPool.Allocator implementation that recycles unused byte blocks in a buffer and reuses them in subsequent calls to RecyclingByteBlockAllocator.getByteBlock().
A IntBlockPool.Allocator implementation that recycles unused int blocks in a buffer and reuses them in subsequent calls to RecyclingIntBlockAllocator.getIntBlock().
The Reduce object is used to remove gaps in a Trie which stores a dictionary.
Manages reference counting for a given object.
Utility class to safely share instances of a certain type across multiple threads, while periodically refreshing them.
Use to receive notification when a refresh has finished.
A CompletionQuery which takes a regular expression as the prefix of the query term.
Regular Expression extension to Automaton.
The type of expression represented by a RegExp node.
Custom functional interface for supplying methods with the signature RegExp(int int1, RegExp exp1, RegExp exp2).
A fast regular expression query based on the org.apache.lucene.util.automaton package.
A query handler implementation that matches Regexp queries by indexing regex terms by their longest static substring, and generates ngrams from Document tokens to match them.
A RegexpQueryNode represents a RegexpQuery. Example: /[a-z]|[0-9]/
Builds a RegexpQuery object from a RegexpQueryNode object.
Processor for Regexp queries.
 
A processor that removes every instance of DeletedQueryNode from a query node tree.
A TokenFilter which filters out Tokens at the same position and Term text as the previous token in the stream.
This processor removes every QueryNode that is not a leaf and has no children.
Generates an iterator that spans repeating instances of a sub-iterator, avoiding minimization.
 
 
 
 
A Scorer for queries with a required subscorer and an excluding (prohibited) sub Scorer.
A Scorer for queries with a required part and an optional part.
Re-scores the topN results (TopDocs) from an original query.
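As a sketch of how Rescorer is typically used via the static QueryRescorer.rescore helper (searcher, cheapQuery and expensiveQuery are assumed to exist):

    // First pass: gather 100 candidates with an inexpensive query.
    TopDocs firstPass = searcher.search(cheapQuery, 100);
    // Second pass: combine first-pass scores with the expensive query's
    // scores (weighted by 2.0) and keep only the top 10.
    TopDocs rescored = QueryRescorer.rescore(searcher, firstPass, expensiveQuery, 2.0, 10);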
Abstraction for loading resources (streams, files, and classes).
Interface for a component that needs to be initialized by an implementation of ResourceLoader.
Internal class to enable reuse of the string reader by Analyzer.tokenStream(String,String)
Reads in reverse from a single byte[].
Tokenizer for domain-like hierarchies.
Implements reverse read from a RandomAccessInput.
Reverse token string, for example "country" => "yrtnuoc".
Factory for ReverseStringFilter.
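A minimal sketch of wiring this filter through CustomAnalyzer from the analysis-common module, assuming the factory's SPI name "reverseString":

    Analyzer reversing = CustomAnalyzer.builder()
        .withTokenizer("whitespace")
        .addTokenFilter("reverseString")  // "country" is emitted as "yrtnuoc"
        .build();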
 
DocIdSet implementation inspired by http://roaringbitmap.org/
A builder of RoaringDocIdSets.
DocIdSet implementation that can store documents up to 2^16-1 in a short[].
Acts like a forever-growing T[], but internally uses a circular buffer to reuse instances of T.
Implement to reset an instance
Acts like a forever growing char[] as you read characters into it from the provided reader, but internally it uses a circular buffer to only hold the characters that haven't been freed yet.
Analyzer for Romanian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
 
The Row class represents a row in a matrix representation of a trie.
Base class for stemmers that use a set of RSLP-like stemming steps.
A basic rule, with no exceptions.
A rule with a set of whole-word exceptions.
A rule with a set of exceptional suffixes.
A step containing a list of rules.
Finite-state automaton with fast run operation.
Analyzer for Russian language.
 
A TokenFilter that applies RussianLightStemmer to stem Russian words.
Light Stemmer for Russian.
This class implements the stemming algorithm defined by a snowball script.
An ExecutorService that executes tasks immediately in the calling thread during submit.
Default scalar quantized implementation of FlatVectorsScorer.
Quantized vector scorer supplier
Calculates and adjusts the scores correctly for quantized vectors, given the scalar quantization parameters.
Compares two byte vectors
Calculates dot product on quantized vectors, applying the appropriate corrections
Calculates euclidean distance on quantized vectors, applying the appropriate corrections
Calculates max inner product on quantized vectors, applying the appropriate corrections
Will scalar quantize float vectors into `int8` byte values.
 
 
 
This class is used to correlate the scores of the nearest neighbors with the errors in the scores.
Scales values to be between min and max.
 
This filter folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o.
This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
This Normalizer does the heavy lifting for a set of Scandinavian normalization filters, normalizing use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
List of possible foldings that can be used when configuring the filter
Allows access to the score of a Query
A child Scorer and its relationship to its parent.
A Scorer which wraps another scorer and caches the score of the current document.
Holds one hit in TopDocs.
How to aggregate multiple child hit scores into a single parent score.
Different modes of search.
An implementation of FragmentsBuilder that outputs score-order fragments.
Comparator for FieldFragList.WeightedFragInfo by boost, breaking ties by offset.
A Scorer is responsible for scoring a stream of tokens.
Expert: Common scoring functionality for different types of queries.
A supplier of Scorer.
Util class for Scorer related methods
A QueryMatch that reports scores for each match
Base rewrite method that translates each term into a query, and keeps the scores as computed by the query.
Special implementation of BytesStartArray that keeps parallel arrays for boost and docFreq
This attribute stores the UTR #24 script value for a token of text.
Implementation of ScriptAttribute that stores the script as an integer.
An iterator that locates ISO 15924 script boundaries in text.
Factory class used by SearcherManager to create new IndexSearchers.
Keeps track of current plus old IndexSearchers, closing the old ones once they have timed out.
Simple pruner that drops any searcher that is more than the specified number of seconds older than the newest searcher.
 
Utility class to safely share IndexSearcher instances across multiple threads, while periodically reopening.
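The usual acquire/release discipline for SearcherManager, as a sketch (dir and query are assumed; passing null uses the default SearcherFactory):

    SearcherManager mgr = new SearcherManager(dir, null);
    IndexSearcher s = mgr.acquire();
    try {
      TopDocs hits = s.search(query, 10);
    } finally {
      mgr.release(s);     // never use the searcher after releasing it
    }
    mgr.maybeRefresh();   // call periodically to pick up index changes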
Represents a group that is found during the first pass search.
 
 
 
 
SecondPassGroupingCollector runs over an already collected set of groups, further applying a GroupReducer to each group
A filtered TermsEnum that uses a BytesRefHash as a filter
Graph representing possible tokens at each start offset in the sentence.
Interface defining whether or not an object can be cached against a LeafReader
Embeds a [read-only] SegmentInfo and adds per-commit fields.
Holds core readers that are shared (unchanged) when SegmentReader is cloned or reopened
Manages the DocValuesProducer held by SegmentReader and keeps track of their reference counting.
Encapsulates multiple producers when there are docvalues updates as one producer
Information about a segment such as its name, directory, and files related to the segment.
Expert: Controls the format of the SegmentInfo (segment metadata file).
A collection of segmentInfo objects with methods for operating on those segments in relation to the file system.
Utility class for executing code that needs to do something with the current segments file.
Breaks text into sentences with a BreakIterator and allows subclasses to decompose these sentences into words.
The SegmentMerger class combines two or more Segments, represented by an IndexReader, into a single Segment.
 
 
IndexReader implementation over a single segment.
Access to SegmentReader internals exposed to the test framework.
Holder class for common parameters used during read.
Iterates through terms in this field.
Iterates through terms in this field.
 
 
 
Holder class for common parameters used during write.
SmartChineseAnalyzer internal token
Filters a SegToken by converting full-width Latin to half-width, then lowercasing Latin.
A pair of tokens in SegGraph
An implementation of a selection algorithm, i.e. one that finds the k-th greatest element of a collection without fully sorting it.
This attribute tracks what sentence a given token belongs to as well as potentially other sentence specific attributes.
Default implementation of SentenceAttribute.
A native int hash-based set where one value is reserved to mean "EMPTY" internally.
Analyzer for Serbian.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
Normalizes Serbian Cyrillic and Latin characters to "bald" Latin.
Normalizes Serbian Cyrillic to Latin.
This class implements the stemming algorithm defined by a snowball script.
Indicates that a geo3d object can be serialized and deserialized.
A MergeScheduler that simply does each merge sequentially, using the current thread.
Marks terms as keywords via the KeywordAttribute.
A convenient class which offers a semi-immutable object wrapper implementation which allows one to set the value of an object exactly once, and retrieve it many times.
Thrown when SetOnce.set(Object) is called more than once.
Holding object and marking that it was already set
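A minimal sketch of the SetOnce contract described above:

    SetOnce<String> once = new SetOnce<>();
    once.set("first");          // the single permitted set
    String v = once.get();      // "first"; may be read any number of times
    // once.set("second");      // would throw SetOnce.AlreadySetException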
A binary doc values format representation for LatLonShape and XYShape
 
Reads values from a ShapeDocValues Field
A doc values field for LatLonShape and XYShape that uses ShapeDocValues as the underlying binary doc value format.
A base shape utility class used for both LatLon (spherical) and XY (cartesian) shape fields.
Represents an encoded triangle using ShapeField.decodeTriangle(byte[], DecodedTriangle).
Type of triangle.
Query relation types.
Polygons are decomposed into tessellated triangles using the Tessellator; these triangles are encoded and inserted as separate indexed POINT fields.
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
A ShingleFilter constructs shingles (token n-grams) from a token stream.
 
Factory for ShingleFilter.
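As a sketch, wrapping StandardAnalyzer with min and max shingle size 2; for the input "please divide this" the output tokens are: please, please divide, divide, divide this, this (unigrams are kept by default):

    Analyzer shingled = new ShingleAnalyzerWrapper(new StandardAnalyzer(), 2, 2);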
Combination of a plane, and a sign value indicating what evaluation values are on the correct side of the plane.
Similarity defines the components of Lucene scoring.
Stores the weight for a query across the indexed collection.
A subclass of Similarity that provides a simplified API for its descendants.
Simple class that binds expression variable names to DoubleValuesSources or other Expressions.
BoolFunction implementation which applies an extendible boolean function to the values of a single wrapped ValueSource.
Simple boundary scanner implementation that divides fragments based on a set of separator characters.
Base Collector implementation that is used to collect all contexts.
Base FieldComparator implementation that is used for all contexts.
A simple implementation of FieldFragList.
A simple float function with a single argument
A simple implementation of FragListBuilder.
Fragmenter implementation which breaks text up into same-size fragments with no concerns over spotting sentence boundaries.
A simple implementation of FragmentsBuilder.
 
Does minimal parsing of a GeoJSON object, to extract either Polygon or MultiPolygon, either directly as the top-level type, or if the top-level type is Feature, as the geometry of that feature.
Simple Encoder implementation to escape text for HTML output
Simple Formatter implementation to highlight terms with a pre and post tag.
A very simple merged segment warmer that just ensures data structures are initialized.
A simplistic Lucene-based NaiveBayes classifier; see http://en.wikipedia.org/wiki/Naive_Bayes_classifier
A simplistic Lucene-based NaiveBayes classifier; see http://en.wikipedia.org/wiki/Naive_Bayes_classifier
This tokenizer uses a Lucene RegExp or (expert usage) a pre-built determinized Automaton, to locate tokens.
Factory for SimplePatternSplitTokenizer, for producing tokens by splitting according to the provided regexp.
This tokenizer uses a Lucene RegExp or (expert usage) a pre-built determinized Automaton, to locate tokens.
Factory for SimplePatternTokenizer, for matching tokens based on the provided regexp.
SimpleQueryParser is used to parse human readable query syntax.
 
Fragmenter implementation which breaks text up into same-size fragments but does not split up Spans.
Base class for queries that expand to sets of simple terms.
Callback to visit each matching term during "rewrite" in SimpleTerm.MatchingTermVisitor.visitMatchingTerm(Term)
 
Forked from BKDReader and simplified/specialized for SimpleText's usage
Forked from BKDWriter and simplified/specialized for SimpleText's usage
Plain-text index format.
Plain-text compound format.
Plain-text doc values format.
 
 
 
 
Plain-text field infos format.
 
 
For debugging, curiosity, transparency only! Do not use this codec in production.
Reads vector values from a simple text format.
 
 
 
Writes vector-valued fields in a plain-text format.
Reads/writes plain-text live docs.
 
Plain-text norms format.
Writes plain-text norms.
Reads plain-text norms.
For debugging, curiosity, transparency only! Do not use this codec in production.
 
 
For debugging, curiosity, transparency only! Do not use this codec in production.
Plain-text segments file format.
 
This class reads skip lists with multiple levels.
Plain-text skip data.
Plain-text stored fields format.
Reads plain-text stored fields.
Writes plain-text stored fields.
Plain-text term vectors format.
Reads plain-text term vectors.
 
 
 
 
 
 
Writes plain-text term vectors.
 
Parses shape geometry represented in WKT format
Enumerated type for Shapes
 
An implementation class of FragListBuilder that generates one FieldFragList.WeightedFragInfo object.
A function with a single argument
Implements LockFactory for a single in-process instance, meaning all locking will take place through this one instance.
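A sketch of installing it on a directory that only this process will ever open (the path is hypothetical; FSDirectory.open accepts a LockFactory):

    Directory dir = FSDirectory.open(Path.of("/tmp/idx"), new SingleInstanceLockFactory());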
 
Subclass of FilteredTermsEnum for enumerating a single term.
Exposes multi-valued view over a single-valued instance.
Exposes multi-valued iterator view over a single-valued iterator.
Directory that wraps another, and that sleeps and retries if obtaining the lock fails.
Math functions that trade off accuracy for speed.
Find all slop-valid position-combinations (matches) encountered while traversing/hopping the PhrasePositions.
A SlopQueryNode represents a phrase query with a slop.
This builder basically reads the Query object set on the SlopQueryNode child using QueryTreeBuilder.QUERY_TREE_BUILDER_TAGID and applies the slop value defined in the SlopQueryNode.
Wraps arbitrary readers for merging.
A merged CodecReader view of multiple CodecReader.
 
 
 
 
 
 
 
 
ImpactsEnum that doesn't index impacts but implements the API in a legal way.
Reports on slow queries in a given match run
An individual entry in the slow log
Floating point numbers smaller than 32 bits.
SmartChineseAnalyzer is an analyzer for Chinese or mixed Chinese-English text.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
An IndexDeletionPolicy that wraps any other IndexDeletionPolicy and adds the ability to hold and later release snapshots of an index.
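A sketch of a backup loop against this policy, assuming sdp is the SnapshotDeletionPolicy installed on the IndexWriterConfig:

    IndexCommit commit = sdp.snapshot();      // files of this commit won't be deleted
    try {
      for (String file : commit.getFileNames()) {
        // copy `file` from the index directory to backup storage
      }
    } finally {
      sdp.release(commit);                    // let the files be reclaimed again
    }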
A filter that stems words using a Snowball-generated stemmer.
Factory for SnowballFilter, with configurable language
Base class for a snowball stemmer
Parent class of all snowball stemmers, which must implement stem
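A sketch of a stemming chain; SnowballFilter's String constructor selects the stemmer by language name:

    Tokenizer source = new StandardTokenizer();
    TokenStream ts = new LowerCaseFilter(source);
    ts = new SnowballFilter(ts, "English");   // e.g. "running" stems to "run"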
This reader filters out documents that have a doc values value in the given field and treat these documents as soft deleted.
 
 
 
This MergePolicy allows carrying over soft-deleted documents across merges.
Parser for the Solr synonyms format.
Analyzer for Sorani Kurdish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies SoraniNormalizer to normalize the orthography.
Normalizes the Unicode representation of Sorani text.
A TokenFilter that applies SoraniStemmer to stem Sorani words.
Factory for SoraniStemFilter.
Light stemmer for Sorani
Encapsulates sort criteria for returned hits.
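A sketch of sorting by a numeric doc-values field with a relevance tie-break (the "price" field is hypothetical and must be indexed as a numeric doc value; searcher and query are assumed):

    Sort sort = new Sort(
        new SortField("price", SortField.Type.LONG),  // primary key
        SortField.FIELD_SCORE);                       // tie-break on score
    TopDocs hits = searcher.search(query, 10, sort);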
 
A per-document byte[] with presorted values.
Field that stores a per-document BytesRef value, indexed for sorting.
Implements a TermsEnum wrapping a provided SortedDocValues.
Buffers up pending byte[] per doc, deref and sorting via int ord, then flushes when segment flushes.
 
 
This wrapper buffers incoming elements and makes sure they are sorted based on given comparator.
A list of per-document numeric values, sorted according to Long.compare(long, long).
Field that stores per-document long values for scoring, sorting or value retrieval.
 
Similar to SortedNumericDocValuesRangeQuery but for a set
Buffers up pending long[] per doc, sorts, then flushes when segment flushes.
 
 
 
Selects a value from the document's list to use as the representative value
Wraps a SortedNumericDocValues and returns the last value (max)
Wraps a SortedNumericDocValues and returns the first value (min)
Type of selection to perform.
SortField for SortedNumericDocValues.
A SortFieldProvider for this sort field
A multi-valued version of SortedDocValues.
Field that stores a set of per-document BytesRef values, indexed for faceting, grouping and joining.
 
Implements a TermsEnum wrapping a provided SortedSetDocValues.
Buffers up pending byte[]s per doc, deref and sorting via int ord, then flushes when segment flushes.
 
 
 
Retrieves FunctionValues instances for multi-valued string based fields.
Selects a value from the document's set to use as the representative value
Wraps a SortedSetDocValues and returns the last ordinal (max)
Wraps a SortedSetDocValues and returns the middle ordinal (or max of the two)
Wraps a SortedSetDocValues and returns the middle ordinal (or min of the two)
Wraps a SortedSetDocValues and returns the first ordinal (min)
Type of selection to perform.
SortField for SortedSetDocValues.
A SortFieldProvider for this sort
Sorts documents of a given index by returning a permutation on the document IDs.
Base class for sorting algorithms implementations.
A permutation of doc IDs.
 
Stores information about how to sort documents by terms in an individual field.
A SortFieldProvider for field sorts
Specifies the type of the terms to be sorted, or special types such as CUSTOM
Reads/Writes a named SortField from a segment info file, used to record index sorts
 
A CodecReader which supports sorting documents by a given Sort.
 
 
Sorting FloatVectorValues that iterate over documents in the order of the provided sortMap
 
 
 
 
A visitor that copies every field it sees into the provided StoredFieldsWriter.
The strategy defining how a Hunspell dictionary should be loaded, with different tradeoffs.
 
 
 
A Rescorer that re-sorts according to a provided Sort.
Base class for building SpanQuerys
An interface defining the collection of postings information from the leaves of a Spans
Keep matches that contain another SpanScorer.
 
A priority queue of DocIdSetIterators that orders by current doc ID.
Wrapper used in SpanDisiPriorityQueue.
A DocIdSetIterator which is a disjunction of the approximations of the provided iterators.
Builder for SpanFirstQuery
Matches spans near the beginning of a field.
Formats text with different color intensity depending on the score of the term using the span tag.
Analyzer for Spanish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies SpanishLightStemmer to stem Spanish words.
Light Stemmer for Spanish
Deprecated.
Deprecated.
Deprecated.
A TokenFilter that applies SpanishPluralStemmer to stem Spanish words.
Plural Stemmer for Spanish
This class implements the stemming algorithm defined by a snowball script.
Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
Abstract class that defines how the query is rewritten.
A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
Builder for SpanNearQuery
Factory for SpanOrQuery
Matches spans which are near one another.
A builder for SpanNearQueries
 
 
Builder for SpanNotQuery
Removes matches which overlap with another SpanQuery or which are within x tokens before or y tokens after another SpanQuery.
Builder for SpanOrQuery
Matches the union of its clauses.
Builder that analyzes the text into a SpanOrQuery
Only return those matches that have a specific payload at the given position.
The payload type.
The payload type.
Base class for filtering a SpanQuery based on the position of a match.
 
Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position
Base class for span-based queries.
Interface for retrieving a SpanQuery.
Factory for SpanQueryBuilders
Iterates through combinations of start/end positions per-doc.
A basic Scorer over Spans.
Builder for SpanTermQuery
Matches spans containing a term.
Expert-only.
Enumeration defining what postings information should be retrieved from the index for a given Spans
 
Keep matches that are contained within another Spans.
A bit set that only stores longs that have at least one bit which is set.
Base query class for all spatial geometries: LatLonShape, LatLonPoint and XYShape.
Holds spatial logic for a bounding box that works in the encoded space
Utility class for implementing constant score logic specific to INTERSECT, WITHIN, and DISJOINT.
Visitor used for walking the BKD tree.
Spell Checker class (Main class), initially inspired by the David Spencer code.
Virtually slices the text on both sides of every occurrence of the specified character.
 
Query that matches String prefixes
Lowest level base class for surround queries
Simple single-term clause
Query that matches wildcards
Stable radix sorter for variable-length strings.
A MergeSorter taking advantage of temporary storage.
 
Filters StandardTokenizer with LowerCaseFilter and StopFilter, using a configurable list of stop words.
Default implementation of DirectoryReader.
 
Lookup tables for classes that can be serialized using a code.
This interface should be implemented by every class that wants to build Query objects from QueryNode objects.
This query configuration handler is used for almost every processor defined in the StandardQueryNodeProcessorPipeline processor pipeline.
Class holding keys for StandardQueryNodeProcessorPipeline options.
Boolean Operator: AND or OR
This pipeline has all the processors needed to process a query node tree, generated by StandardSyntaxParser, already assembled.
The StandardQueryParser is a pre-assembled query parser that supports most features of the classic Lucene query parser, allows dynamic configuration of some of its features (like multi-field expansion or wildcard query restrictions) and adds support for new query types and expressions.
This query tree builder only defines the necessary map to build a Query tree object.
Parser for the standard Lucene syntax
 
 
Token literal values and constants.
Token Manager.
A grammar-based tokenizer constructed with JFlex.
Factory for StandardTokenizer.
This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
3D rectangle, bounded on six sides by X,Y,Z limits
Pair of states.
A thin wrapper of IntIntHashMap that maps from a state in integer representation to its reference count; whenever the count of a state reaches 0, that state is removed from the set.
BlockTree statistics for a single field returned by FieldReader.getStats().
BlockTree statistics for a single field returned by FieldReader.getStats().
Represents a term and its details stored in the BlockTermState.
Reads block lines encoded incrementally, with all fields corresponding to the term of the line.
Reads terms blocks with the Shared Terms format.
Writes terms blocks with the Shared Terms format.
Stemmer uses the affix rules declared in the Dictionary to generate one or more stems for a word.
 
 
 
Provides the ability to override any KeywordAttribute aware stemmer with custom dictionary-based stemming.
This builder builds an FST for the StemmerOverrideFilter
A read-only 4-byte FST backed map that allows fast case-insensitive key value lookups for StemmerOverrideFilter
Some commonly-used stemming functions
Transforms the token stream as per the stemming algorithm.
Factory for StempelFilter using a Polish stemming table.
Stemmer class is a convenient facade for other stemmer-related classes.
The "intersect" TermsEnum response to STUniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
TermsEnum used when merging segments, to enumerate the terms of an input segment and get all the fields TermStates of each term.
Combines PostingsEnum for the same term for a given field from multiple segments.
Removes stop words from a token stream.
Removes stop words from a token stream.
Factory for StopFilter.
Base class for Analyzers that need to make use of stopword sets.
A field whose value is stored so that IndexSearcher.storedFields() and IndexReader.storedFields() will return the field and its value.
API for reading stored fields.
 
Controls the format of stored fields
 
Codec API for reading stored fields.
Codec API for writing stored fields: For every document, StoredFieldsWriter.startDocument() is called, informing the Codec that a new document has started.
 
Expert: provides a low-level means of accessing the stored field values in an index.
Enumeration of possible return values for StoredFieldVisitor.needsField(org.apache.lucene.index.FieldInfo).
Abstraction around a stored value.
Type of a StoredValue.
Abstract FunctionValues implementation which supports retrieving String values.
Used for parsing Version strings so we don't have to use the overkill String.split or StringTokenizer (which silently skips empty tokens).
Interface for string distances.
A field that is indexed but not tokenized: the entire String value is indexed as a single token.
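A sketch contrasting StringField with the analyzed TextField (writer is an assumed IndexWriter):

    Document doc = new Document();
    // Indexed as the single token "DOC-42"; suited to exact-match IDs.
    doc.add(new StringField("id", "DOC-42", Field.Store.YES));
    // Analyzed into individual terms; suited to full-text search.
    doc.add(new TextField("body", "a quick brown fox", Field.Store.NO));
    writer.addDocument(doc);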
Methods for manipulating strings.
String manipulation routines
PostingsFormat based on the Uniform Split technique and supporting Shared Terms.
Extends UniformSplitTerms for a shared-terms dictionary, with all the fields of a term in the same block line.
A block-based terms index and dictionary based on the Uniform Split technique, and sharing all the fields terms in the same dictionary, with all the fields of a term in the same block line.
Extends UniformSplitTermsWriter by sharing all the fields terms in the same dictionary and by writing all the fields of a term in the same block line.
 
 
 
A generator for misspelled word corrections based on Hunspell flags.
Field that indexes a string value and a weight as a weighted completion against a named suggester.
A cache allowing for CPU-cache-friendlier iteration over WordStorage entries that can be used for suggestions.
Adds document suggest capabilities to IndexSearcher.
 
An exception thrown when Hunspell.suggest(java.lang.String) call takes too long, if TimeoutPolicy.THROW_EXCEPTION is used.
Set of strategies for suggesting related terms
Bounded priority queue for TopSuggestDocs.SuggestScoreDocs.
Like StopFilter except it will not remove the last token if that token was not followed by some token separator.
Factory for SuggestStopFilter.
SuggestWord, used in suggestSimilar method in SpellChecker class.
Frequency first, then score.
Sorts SuggestWord instances
Score first, then frequency
SumFloatFunction returns the sum of its components.
Calculate the final score as the sum of scores of all payloads seen.
SumTotalTermFreqValueSource returns the number of tokens.
Annotation to suppress forbidden-apis errors inside a whole class, a method, or a field.
Analyzer for Swedish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies SwedishLightStemmer to stem Swedish words.
Light Stemmer for Swedish.
A TokenFilter that applies SwedishMinimalStemmer to stem Swedish words.
Minimal Stemmer for Swedish.
This class implements the stemming algorithm defined by a snowball script.
A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.
Deprecated.
Use SynonymGraphFilter instead, but be sure to also use FlattenGraphFilter at index time (not at search time).
 
 
Deprecated.
Use SynonymGraphFilterFactory instead, but be sure to also use FlattenGraphFilterFactory at index time (not at search time).
Applies single- or multi-token synonyms from a SynonymMap to an incoming TokenStream, producing a fully correct graph output.
 
 
Factory for SynonymGraphFilter.
A map of synonyms, keys and values are phrases.
Builds an FSTSynonymMap.
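A sketch of building a single-token synonym map and applying it (tokenizer is assumed; multi-token entries need SynonymMap.WORD_SEPARATOR between words):

    SynonymMap.Builder b = new SynonymMap.Builder(true);      // true = dedup entries
    b.add(new CharsRef("fast"), new CharsRef("quick"), true); // keep the original too
    SynonymMap map = b.build();
    TokenStream ts = new SynonymGraphFilter(tokenizer, map, true); // true = ignore case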
 
Abstraction for parsing synonym files.
A query that treats multiple terms as synonyms.
A builder for SynonymQuery.
 
 
 
 
QueryNode for clauses that are synonym of each other.
Builder for SynonymQueryNode.
A parser needs to implement the SyntaxParser interface.
Analyzer for Tamil.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
This class implements the stemming algorithm defined by a snowball script.
Executor wrapper responsible for the execution of concurrent tasks.
Holds all the sub-tasks that an operation is split into as it is parallelized, and exposes the ability to invoke those tasks, wait for them all to complete, and collect their results.
This TokenFilter provides the ability to set aside attribute states that have already been analyzed.
TokenStream output from a tee.
A convenience wrapper for storing the cached states as well the final state of the stream.
Analyzer for Telugu.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
A TokenFilter that applies TeluguNormalizer to normalize the orthography.
Normalizer for Telugu.
A TokenFilter that applies TeluguStemmer to stem Telugu words.
Factory for TeluguStemFilter.
Stemmer for Telugu.
A Term represents a word from text.
Wraps a term and boost
Word2Vec unit composed of a term and its associated vector.
A proximity query that lets you express an automaton, whose transitions are terms, to match documents.
 
 
Sorts by docID so we can quickly pull out all scorers that are on the same (lowest) docID.
Sorts by position so we can visit all scorers on one doc, by position.
 
 
Term of a block line.
 
 
Presearcher implementation that uses terms extracted from queries to index them in the Monitor, and builds a disjunction from terms in a document to match them.
Constructs a document disjunction from a set of terms
Sets the custom term frequency of a term within one document.
Default implementation of TermFrequencyAttribute.
Function that returns PostingsEnum.freq() for the supplied term in every document.
An implementation of GroupFacetCollector that computes grouped facets based on the indexed terms from DocValues.
 
 
 
 
 
A GroupSelector implementation that groups via SortedDocValues
Specialization for a disjunction over many terms that, by default, behaves like a ConstantScoreQuery over a BooleanQuery containing only BooleanClause.Occur.SHOULD clauses.
 
A MatchesIterator over a single term's postings list
Sorts by field's natural Term sort order, using ordinals.
 
A Query that matches documents containing a term.
Builder for TermQuery
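The most basic search, sketched (searcher over an existing index is assumed):

    Query q = new TermQuery(new Term("body", "lucene"));
    TopDocs top = searcher.search(q, 10);
    for (ScoreDoc sd : top.scoreDocs) {
      System.out.println(sd.doc + " scored " + sd.score);
    }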
A Query that matches documents within an range of terms.
This query node represents a range query composed of FieldQueryNode bounds, which means the bound values are strings.
Builds a TermRangeQuery object from a TermRangeQueryNode object.
This processor processes TermRangeQueryNodes.
Access to the terms in a specific field.
A collector that collects all terms from a specified field matching the query.
 
 
Expert: A Scorer for documents matching a Term.
Wrapper around a TermsEnum and an integer that identifies it.
Wrapper around a term that allows for quick equals comparisons.
A TokenStream created from a TermsEnum
This class is passed each token produced by the analyzer on each field during indexing; it stores these tokens in a hash table and allocates separate byte streams per token.
 
This class stores streams of information per term without knowing the size of the stream ahead of time.
 
 
BlockTermsReader interacts with an instance of this class to manage its terms index.
Similar to TermsEnum, except, the only "metadata" it reports for a given indexed term is the long fileOffset into the main terms dictionary file.
Base class for terms index implementations to plug into BlockTermsWriter.
Expert: Public for extension only.
A query that has an array of terms from a specific field.
Builds a BooleanQuery from all of the terms found in the XML element using the choice of analyzer
Encapsulates all required internal state to position the associated TermsEnum without re-seeking.
Maintains a IndexReader TermState view over IndexReader instances containing a single term.
Wrapper over TermState, ordinal value, term doc frequency and total term frequency
Contains statistics for a specific term
Holder for per-term statistics.
Holder for a term along with its statistics (TermStats.docFreq and TermStats.totalTermFreq).
 
 
 
 
 
This attribute is requested by TermsHashPerField to index the contents.
A filtered LeafReader that only includes the terms that are also in a provided set of terms.
 
 
Wraps a Terms with a LeafReader, typically from term vectors.
Uses term vectors that contain offsets.
API for reading term vectors.
 
 
 
Controls the format of term vectors
Codec API for reading term vectors:
Codec API for writing term vectors: For every document, TermVectorsWriter.startDocument(int) is called, informing the Codec how many fields will be written.
 
Calculates the weight of a Term
Ternary Search Tree.
The class creates a TST node.
Computes a triangular mesh tessellation for a given polygon.
Implementation of this interface will receive calls with internal data at each step of the triangulation algorithm.
Circular Doubly-linked list used for polygon coordinates
State of the tessellated split; avoids recursion.
Triangle in the tessellated mesh
A set of static methods returning accessors for internal, package-private functionality in Lucene.
Interface for a node that has text as a CharSequence
A field that is indexed and tokenized, without term vectors.
Low-level class used to record information about a section of a document with a score.
Implementation of Similarity with the Vector Space Model.
Function that returns TFIDFSimilarity.tf(float) for every document.
Analyzer for Thai language.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
Tokenizer that uses a BreakIterator to tokenize Thai text.
Factory for ThaiTokenizer.
Thrown by Lucene on detecting that Thread.interrupt() has been called.
Merges segments of approximately equal size, subject to an allowed number of segments per tier.
 
Holds score and explanation for a single candidate merge.
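A sketch of tuning TieredMergePolicy on an IndexWriterConfig (the numbers are illustrative, not recommendations; analyzer is assumed):

    TieredMergePolicy tmp = new TieredMergePolicy();
    tmp.setSegmentsPerTier(10.0);         // allow roughly 10 segments per tier
    tmp.setMaxMergedSegmentMB(5 * 1024);  // cap merged segments near 5 GB
    IndexWriterConfig iwc = new IndexWriterConfig(analyzer).setMergePolicy(tmp);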
 
The TimeLimitingBulkScorer is used to timeout search requests that take longer than the maximum allowed search time limit.
Thrown when elapsed search time exceeds allowed search time.
Deprecated.
Thrown when elapsed search time exceeds allowed search time.
Thread used to timeout search requests.
 
A KnnCollectorManager that collects results with a timeout.
A strategy determining what to do when Hunspell API calls take too much time
Sorter implementation based on the TimSort algorithm.
Just like ToParentBlockJoinQuery, except this query joins in reverse: you provide a Query matching parent documents and it joins down to child documents.
 
 
Analyzed token with morphological data from its dictionary.
Analyzed token with morphological data.
Describes the input token stream.
Describes the input token stream.
Describes the input token stream.
A TokenFilter is a TokenStream whose input is another TokenStream.
Abstract parent class for analysis factories that create TokenFilter instances.
This static holder class prevents classloading deadlock by delaying init of factories until needed.
One, or several overlapping tokens, along with the score(s) and the scope of the original text.
Binary dictionary implementation for a known-word dictionary model: Words are encoded into an FST mapping to a list of wordIDs.
Binary dictionary implementation for a known-word dictionary model: Words are encoded into an FST mapping to a list of wordIDs.
 
 
 
 
 
 
Thin wrapper around an FST with root-arc caching for Japanese.
Thin wrapper around an FST with root-arc caching for Hangul syllables (11,172 arcs).
A TokenizedPhraseQueryNode represents a node created by code that tokenizes/lemmatizes/analyzes.
A Tokenizer is a TokenStream whose input is a Reader.
Abstract parent class for analysis factories that create Tokenizer instances.
This static holder class prevents classloading deadlock by delaying init of factories until needed.
Token Manager Error.
Token Manager Error.
Token Manager Error.
Adds the OffsetAttribute.startOffset() and OffsetAttribute.endOffset() as a payload: the first 4 bytes are the start offset.
Convenience methods for obtaining a TokenStream for use with the Highlighter - can obtain from term vectors with offsets and positions or from an Analyzer re-parsing the stored content.
A TokenStream enumerates the sequence of tokens, either from Fields of a Document or from query text.
TokenStream created from a term vector field.
 
Analyzes the text, producing a single OffsetsEnum wrapping the TokenStream filtered to terms in the query, including wildcards.
 
Consumes a TokenStream and creates an Automaton where the transition labels are UTF8 bytes (or Unicode code points if unicodeArcs is true) from the TermToBytesRefAttribute.
 
 
Consumes a TokenStream and creates a TermAutomatonQuery where the transition labels are tokens from the TermToBytesRefAttribute.
This exception is thrown when determinizing an automaton would require too much work.
Static methods globally useful for 3d geometric work.
Exception thrown when BasicQueryFactory would exceed the limit of query clauses.
This query requires that you index children and parent docs as a single block, using the IndexWriter.addDocuments() or IndexWriter.updateDocuments() API.
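A sketch of the block-join pattern this query expects; field names and documents are hypothetical, and the join classes live in org.apache.lucene.search.join:

    // Index children first and the parent last, as one atomic block.
    writer.addDocuments(List.of(childA, childB, parentDoc));

    // Match children, then join up to their parents.
    BitSetProducer parents =
        new QueryBitSetProducer(new TermQuery(new Term("type", "parent")));
    Query childQuery = new TermQuery(new Term("color", "red"));
    Query join = new ToParentBlockJoinQuery(childQuery, parents, ScoreMode.Avg);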
 
 
 
 
A special sort field that allows sorting parent docs based on nested / child level fields.
 
 
 
 
Represents hits returned by IndexSearcher.search(Query,int).
 
 
 
A base class for all collectors that return a TopDocs output.
A Collector that sorts by SortField using FieldComparators.
 
 
Creates a TopFieldCollectorManager which uses a shared hit counter to maintain the number of hits and a shared MaxScoreAccumulator to propagate the minimum score across segments if the primary sort is by relevancy.
Represents hits returned by IndexSearcher.search(Query,int,Sort).
Represents result returned by a grouping search.
How the GroupDocs score (if any) should be merged.
A second-pass collector that collects the TopDocs for each group, and returns them as a TopGroups object
 
 
 
TopKnnCollector is a specific KnnCollector.
TopKnnCollectorManager responsible for creating TopKnnCollector instances.
A Collector implementation that collects the top-scoring hits, returning them as a TopDocs.
 
Scorable leaf collector
 
Creates a TopScoreDocCollectorManager which uses a shared hit counter to maintain the number of hits and a shared MaxScoreAccumulator to propagate the minimum score across segments.
TopDocs wrapper with an additional CharSequence key per ScoreDoc
ScoreDoc with an additional CharSequence key
Collector that collects completion and score, along with document id
Base rewrite method for collecting only the top terms via a priority queue.
 
Utility class for english translations of morphological data, used only for debugging.
Helper methods to ease implementing Object.toString().
Just counts the total number of hits.
Collector manager based on TotalHitCountCollector that allows users to parallelize counting the number of hits, expected to be used mostly wrapped in MultiCollectorManager.
Description of the total number of hits of a query.
How the TotalHits.value should be interpreted.
TotalTermFreqValueSource returns the total term freq (sum of term freqs across all documents).
A delegating Directory that records which files were written to and deleted.
 
Holds one transition from an Automaton.
A Trie is used to store a dictionary of words and their stems.
An automaton that achieves the same results as the non-weighted GeneratingSuggester.ngramScore(int, java.lang.String, java.lang.String, boolean), but faster (in O(s2.length) time).
Trims leading and trailing whitespace from Tokens in the stream.
Factory for TrimFilter.
A token filter for truncating the terms into a specific length.
Factory for TruncateTokenFilter.
Ternary Search Trie implementation.
Suggest implementation based on a Ternary Search Tree
Analyzer for Turkish.
Atomically loads the DEFAULT_STOP_SET in a lazy fashion once the outer class accesses the static final set the first time.
Normalizes Turkish token text to lower case.
This class implements the stemming algorithm defined by a snowball script.
An interface for implementations that support 2-phase commit.
A utility for executing 2-phase commit on several objects.
Thrown by TwoPhaseCommitTool.execute(TwoPhaseCommit...) when an object fails to commit().
Thrown by TwoPhaseCommitTool.execute(TwoPhaseCommit...) when an object fails to prepareCommit().
Returned by Scorer.twoPhaseIterator() to expose an approximation of a DocIdSetIterator.
 
Makes the TypeAttribute a payload.
Adds the TypeAttribute.type() as a synonym, i.e. another token at the same position.
Factory for TypeAsSynonymFilter.
A Token's lexical type.
Default implementation of TypeAttribute.
Removes tokens whose types appear in a set of blocked types from a token stream.
Factory class for TypeTokenFilter.
Filters UAX29URLEmailTokenizer with LowerCaseFilter and StopFilter, using a list of English stop words.
This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29 URLs and email addresses are also tokenized according to the relevant RFCs.
This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29 URLs and email addresses are also tokenized according to the relevant RFCs.
A parameter object to hold the components a FieldOffsetStrategy needs.
CharSequence with escaped chars information.
This file contains unicode properties used by various CharTokenizers.
Class to encode Java's UTF16 char[] into UTF8 byte[] without always allocating a new byte[] as String.getBytes(StandardCharsets.UTF_8) does.
Holds a codepoint along with the number of bytes required to represent it in UTF8
An Analyzer that uses UnicodeWhitespaceTokenizer.
A UnicodeWhitespaceTokenizer is a tokenizer that divides text at whitespace.
A Highlighter that can get offsets from either postings (IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS), term vectors (FieldType.setStoreTermVectorOffsets(boolean)), or via re-analyzing text.
Builder for UnifiedHighlighter.
Flags for controlling highlighting behavior.
Fetches stored fields for highlighting.
Source of term offsets; essential for highlighting.
Wraps an IndexReader that remembers/caches the last call to TermVectors.get(int) so that if the next call has the same ID, then it is reused.
PostingsFormat based on the Uniform Split technique.
Terms based on the Uniform Split technique.
A block-based terms index and dictionary based on the Uniform Split technique.
A block-based terms index and dictionary that assigns terms to nearly uniform length blocks.
Builds a FieldMetadata that is the union of multiple FieldMetadata.
Dictionary for unknown-word handling.
Dictionary for unknown-word handling.
 
 
 
 
 
 
 
 
This wrapper buffers the incoming elements and makes sure they are in random order.
An object with this interface is a wrapper around another object (e.g., a filter with a delegate).
This MergePolicy is used for upgrading all existing segments of an index when calling IndexWriter.forceMerge(int).
Normalizes token text to UPPER CASE.
Factory for UpperCaseFilter.
An FST Outputs implementation where each output is one or two non-negative long values.
Holds two long outputs.
A QueryCachingPolicy that tracks usage statistics of recently-used filters in order to decide on which filters are worth caching.
Class for building a User Dictionary.
Class for building a User Dictionary.
UserInputQueryBuilder uses one of two strategies for thread-safe parsing: synchronizing access to "parse" calls on a previously supplied QueryParser, or creating a new QueryParser object per request.
Converts UTF-32 automata to the equivalent UTF-8 representation.
 
 
Static helper methods.
Represents a path in TopNSearcher.
Holds a single input (IntsRef) + output, returned by shortestPaths().
Compares first by the provided comparator, and then tie breaks by path.input.
Utility class to find top N shortest paths from start point(s).
Holds the results for a top N search using Util.TopNSearcher
SmartChineseAnalyzer utility constants and methods
This interface should be implemented by QueryNode that holds an arbitrary value.
Instantiates FunctionValues for a particular reader.
 
 
 
 
A GroupSelector that groups via a ValueSource
Scorer which returns the result of FunctionValues.floatVal(int) as the score for a document, and which filters out documents that don't match ValueSourceScorer.matches(int).
A helper to parse the context of a variable name, which is the base variable, followed by the sequence of array (integer or string indexed) and member accesses.
Represents what a piece of a variable does.
 
Selects index terms according to provided pluggable VariableGapTermsIndexWriter.IndexTermSelector, and stores them in a prefix trie that's loaded entirely in RAM stored as an FST.
Sets an index term when docFreq >= docFreqThresh, or every `interval` terms.
Hook for selecting which terms should be placed in the terms index.
A 3d vector in space, not necessarily going through the origin.
The numeric datatype of the vector values.
An implementation for retrieving FunctionValues instances for knn vectors fields.
A provider of vectorization implementations.
This static holder class prevents classloading deadlock.
Computes the similarity score between a given query vector and different document vectors.
Perform a similarity-based graph search.
Vector similarity function; used in search to return top K most similar vectors to a target vector.
VectorSimilarityFunction returns a similarity function between two knn vectors.
An abstract class that provides the vector similarity scores between the query vector and the KnnFloatVectorField or KnnByteVectorField for documents.
Utilities for computations with numeric arrays, especially algebraic operations like vector dot products.
Interface for implementations of VectorUtil support.
Deprecated.
Use FloatVectorValues instead.
Streams vector values for indexing to the given codec's vectors writer.
Converts individual ValueSource instances to leverage the FunctionValues *Val functions that work with multiple values.
A LockFactory that wraps another LockFactory and verifies that each lock obtain/release is "correct" (never results in two processes holding the lock at the same time).
Use by certain classes to match version compatibility across releases of Lucene.
This is just like Lucene90BlockTreeTermsWriter, except it also stores a version per term, and adds a method to its TermsEnum implementation to seekExact only if the version is >= the specified version.
 
 
 
 
BlockTree's implementation of Terms.
A utility for keeping backwards compatibility on previously abstract methods (or similar replacements).
This implements the WAND (Weak AND) algorithm for dynamic pruning described in "Efficient Query Evaluation using a Two-Level Retrieval Process" by Broder, Carmel, Herscovici, Soffer and Zien.
Implements a combination of WeakHashMap and IdentityHashMap.
 
Expert: Calculate query weights and build query scorers.
Just wraps a Scorer and performs top scoring using it.
A weighted implementation of FieldFragList.
A weighted implementation of FragListBuilder.
Lightweight class to hold term, weight, and positions used for scoring this term.
Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
 
This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins.
Lightweight class to hold term and a weight value used for scoring this term
Suggester based on a weighted FST: it first traverses the prefix, then walks the n shortest paths to retrieve top-ranked suggestions.
 
An Analyzer that uses WhitespaceTokenizer.
A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
Factory for WhitespaceTokenizer.
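The standard TokenStream consumption loop, sketched with WhitespaceAnalyzer:

    Analyzer a = new WhitespaceAnalyzer();
    try (TokenStream ts = a.tokenStream("field", "Hello Lucene world")) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                      // mandatory before incrementToken()
      while (ts.incrementToken()) {
        System.out.println(term);      // Hello / Lucene / world
      }
      ts.end();                        // mandatory after the last token
    }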
Produces one single fragment for the entire text.
Extension of StandardTokenizer that is aware of Wikipedia syntax.
Factory for WikipediaTokenizer.
JFlex-generated tokenizer that is aware of Wikipedia syntax.
Node that represents Intervals.wildcard(BytesRef).
Implements the wildcard search query.
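As a sketch: ? matches exactly one character and * matches any sequence; leading wildcards are legal but costly:

    Query q = new WildcardQuery(new Term("body", "te?t*"));  // test, text, texting, ...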
A WildcardQueryNode represents a wildcard query; this does not apply to phrases.
Builds a WildcardQuery object from a WildcardQueryNode object.
The StandardSyntaxParser creates PrefixWildcardQueryNode nodes which have values containing the prefixed wildcard.
Word2VecModel is a class representing the parsed Word2Vec model containing the vectors for each word in the dictionary.
Applies single-token synonyms from a Word2Vec trained network to an incoming TokenStream.
The Word2VecSynonymProvider generates the list of synonyms of a term.
Supplies a Word2VecSynonymProvider cache to avoid multiple instances of Word2VecSynonymFilterFactory instantiating multiple instances of the same SynonymProvider.
 
A spell checker whose sole function is to offer suggestions by combining multiple terms into one word and/or breaking terms into multiple words.
Determines the order to list word break suggestions
 
 
 
 
 
 
 
 
Deprecated.
Use WordDelimiterGraphFilter instead: it produces a correct token graph, so that phrase queries work correctly when it is used in the search-time analyzer.
Deprecated.
Use WordDelimiterGraphFilterFactory instead: it produces a correct token graph, so that phrase queries work correctly when it is used in the search-time analyzer.
Splits words into subwords and performs optional transformations on subword groups, producing a correct token graph so that, for example, phrase queries work correctly when this filter is used in the search-time analyzer.
A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterGraphFilter rules.
SmartChineseAnalyzer Word Dictionary
A utility class used for generating possible word forms by adding affixes to stems (WordFormGenerator.getAllWordForms(String, String, Runnable)), and suggesting stems and flags to generate the given set of words (WordFormGenerator.compress(List, Set, Runnable)).
 
 
 
Loader for text files that represent a list of stopwords.
Parser for wordnet prolog format
Segment a sentence of Chinese text into words.
A data structure for memory-efficient word storage and fast lookup/enumeration.
 
Internal SmartChineseAnalyzer token type constants
 
 
A Collector that decodes the stored query for each document hit.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in Y and Z.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in Y
Represents a circle on the XY plane.
A per-document location field.
XYGeometry query for XYDocValuesField.
3D rectangle, bounded on six sides by X,Y,Z limits, degenerate in Z
Reusable cartesian geometry encoding methods.
Cartesian Geometry object.
Represents a line in cartesian space.
Represents a point in cartesian space.
Compares documents by distance from an origin point
An indexed XY position field.
Finds all previously indexed points that fall within the specified XY geometries.
Sorts by distance from an origin location.
Represents a polygon in cartesian space.
Represents a x/y cartesian rectangle.
A cartesian shape utility class for indexing and searching geometries whose vertices are unitless x, y values.
A concrete implementation of ShapeDocValues for storing the binary doc value representation of XYShape geometries in an XYShapeDocValuesField.
Concrete implementation of a ShapeDocValuesField for cartesian geometries.
Bounding Box query for ShapeDocValuesField representing XYShape
Finds all previously indexed cartesian shapes that satisfy the given ShapeField.QueryRelation with the specified array of XYGeometry.
An object for accumulating XYZ bounds information.
Interface for a family of 3D rectangles, bounded on six sides by X,Y,Z limits
Factory for XYZSolid.
This class implements the stemming algorithm defined by a snowball script.