Package org.apache.lucene.search
Class BayesianScoreEstimator
java.lang.Object
org.apache.lucene.search.BayesianScoreEstimator
Estimates BayesianScoreQuery parameters (alpha, beta, base rate) from corpus statistics via pseudo-query sampling.
The estimation algorithm:
- Reservoir-sample terms from the target field's indexed vocabulary
- Partition the sampled terms into pseudo-queries
- Run each pseudo-query via BM25 and collect the score distribution
- Estimate: beta = median(scores), alpha = 1 / std(scores)
- Estimate base rate: mean fraction of documents scoring above the 95th percentile
WARNING: This API is experimental and might change in incompatible ways in the next release.
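As a rough illustration of the "beta = median(scores), alpha = 1 / std(scores)" step, the sketch below computes both values from an array of already-collected scores. The class ScoreStatsSketch and its methods are hypothetical and not part of Lucene. It assumes a population standard deviation (dividing by n), which the description above does not specify, and it leaves out the base-rate step because the text does not say which score population the 95th percentile is taken over.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Lucene: derives beta and alpha from sampled scores. */
final class ScoreStatsSketch {

  /** Median of the scores; the input array is left unmodified. */
  static double median(double[] scores) {
    double[] sorted = scores.clone();
    Arrays.sort(sorted);
    int n = sorted.length;
    // Odd length: middle element. Even length: mean of the two middle elements.
    return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
  }

  /** Population standard deviation (divides by n). Assumption: the real estimator may divide by n - 1. */
  static double stdDev(double[] scores) {
    double mean = Arrays.stream(scores).average().orElse(0.0);
    double sumSq = 0.0;
    for (double s : scores) {
      sumSq += (s - mean) * (s - mean);
    }
    return Math.sqrt(sumSq / scores.length);
  }

  public static void main(String[] args) {
    // Made-up scores standing in for the BM25 scores of the pseudo-queries.
    double[] scores = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
    double beta = median(scores);
    // Note: if all scores are identical, stdDev is 0 and alpha becomes infinite.
    double alpha = 1.0 / stdDev(scores);
    System.out.println("beta = " + beta + ", alpha = " + alpha);
    // prints "beta = 4.5, alpha = 0.5"
  }
}
```

A caller would likely want to guard against a zero standard deviation, for example when the index is very small and every pseudo-query yields the same score.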
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static final record | BayesianScoreEstimator.Parameters | Estimated parameters for BayesianScoreQuery.
Method Summary
Modifier and Type | Method | Description
static BayesianScoreEstimator.Parameters | estimate(IndexSearcher searcher, String field) | Estimates parameters with default settings (50 samples, 5 tokens per query, seed 42).
static BayesianScoreEstimator.Parameters | estimate(IndexSearcher searcher, String field, int nSamples, int tokensPerQuery, long seed) | Estimates BayesianScoreQuery parameters from the given index.
Method Details
estimate
public static BayesianScoreEstimator.Parameters estimate(IndexSearcher searcher, String field, int nSamples, int tokensPerQuery, long seed) throws IOException
Estimates BayesianScoreQuery parameters from the given index.
Parameters:
searcher - the index searcher to sample from
field - the indexed text field to create pseudo-queries for
nSamples - number of pseudo-queries to sample (default 50)
tokensPerQuery - number of indexed terms per pseudo-query (default 5)
seed - random seed for reproducible sampling
Returns:
estimated alpha, beta, and base rate
Throws:
IOException - if an I/O error occurs reading the index
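To show how nSamples, tokensPerQuery, and seed could interact, here is a minimal sketch of seeded reservoir sampling followed by partitioning into pseudo-queries. PseudoQuerySketch and its methods are hypothetical and not part of Lucene. The sketch assumes that nSamples * tokensPerQuery terms are sampled in total and that terms arrive as a plain Iterator<String>, whereas the real estimator reads them from the index.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Lucene: seeded term sampling and pseudo-query partitioning. */
final class PseudoQuerySketch {

  /** Reservoir sampling (Algorithm R): uniformly samples up to k terms from a stream of unknown length. */
  static List<String> reservoirSample(Iterator<String> terms, int k, long seed) {
    Random random = new Random(seed); // same seed and same term order give the same sample
    List<String> reservoir = new ArrayList<>(k);
    int seen = 0;
    while (terms.hasNext()) {
      String term = terms.next();
      seen++;
      if (reservoir.size() < k) {
        reservoir.add(term); // fill the reservoir with the first k terms
      } else {
        // Keep the new term with probability k / seen by overwriting a random slot.
        int j = random.nextInt(seen);
        if (j < k) {
          reservoir.set(j, term);
        }
      }
    }
    return reservoir;
  }

  /** Splits the sampled terms into consecutive groups of at most tokensPerQuery terms. */
  static List<List<String>> partition(List<String> sampled, int tokensPerQuery) {
    List<List<String>> queries = new ArrayList<>();
    for (int i = 0; i < sampled.size(); i += tokensPerQuery) {
      int end = Math.min(i + tokensPerQuery, sampled.size());
      queries.add(new ArrayList<>(sampled.subList(i, end)));
    }
    return queries;
  }

  public static void main(String[] args) {
    // Made-up vocabulary standing in for the field's indexed terms.
    List<String> vocabulary = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      vocabulary.add("term" + i);
    }
    int nSamples = 50;
    int tokensPerQuery = 5;
    List<String> sampled = reservoirSample(vocabulary.iterator(), nSamples * tokensPerQuery, 42L);
    List<List<String>> pseudoQueries = partition(sampled, tokensPerQuery);
    System.out.println(pseudoQueries.size() + " pseudo-queries, first: " + pseudoQueries.get(0));
  }
}
```

With a fixed seed and an unchanged term order, repeated runs select the same terms, which is what makes the resulting estimates reproducible.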
estimate
public static BayesianScoreEstimator.Parameters estimate(IndexSearcher searcher, String field) throws IOException
Estimates parameters with default settings (50 samples, 5 tokens per query, seed 42).
Parameters:
searcher - the index searcher
field - the text field
Returns:
estimated parameters
Throws:
IOException - if an I/O error occurs