Users who want unlimited facets have typically set facet.limit to -1. This becomes a problem when counting millions of distinct values: the results can be far larger than expected, bogging down the cluster or even causing outages.
An alternative to unlimited facets is to page through the facet values with a bounded facet.limit, using a sequence of queries such as the following:
query1:
facet.sort=index
shard.shuffling.strategy=SEED
shard.shuffling.seed=10.1.1.250
facet.offset=0
facet.limit=15000
query2:
facet.sort=index
shard.shuffling.strategy=SEED
shard.shuffling.seed=10.1.1.250
facet.offset=15000
facet.limit=15000
Increasing facet.offset by the value of facet.limit on each iteration, while using the same seed node for shard.shuffling.seed, ensures consistent results across the queries in that group.
NOTE: When running several unrelated facet counts, it is best to use a different seed node for each one. Within the same facet query group (query1, query2), use the same seed node for every query.
NOTE: This type of facet counting is not available in CQL. You must use HTTP.
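The paging scheme above can be sketched as a small helper that builds the HTTP query string for each page. This is only an illustration, not an official client: the function name facet_page_params and the field name "category" are hypothetical, and q, rows, facet, and facet.field are the standard Solr parameters assumed to accompany the facet settings shown in query1 and query2.

```python
from urllib.parse import urlencode


def facet_page_params(field, page, limit=15000, seed="10.1.1.250"):
    """Build the parameters for one page of facet values (hypothetical helper).

    page is zero-based, so page 0 corresponds to query1 and page 1 to query2.
    """
    return {
        "q": "*:*",            # assumption: facet over all documents
        "rows": 0,             # assumption: only facet values are needed
        "facet": "true",
        "facet.field": field,  # hypothetical field name supplied by the caller
        "facet.sort": "index",
        "facet.limit": limit,
        "facet.offset": page * limit,  # advance by one full page each time
        "shard.shuffling.strategy": "SEED",
        "shard.shuffling.seed": seed,  # same seed node for the whole group
    }


# Print the query strings for the first two pages.
for page in range(2):
    print(urlencode(facet_page_params("category", page)))
```

Keeping the seed as a default argument makes it harder to change it by accident between pages of the same group.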
You have reached the end when a query returns fewer facet values than facet.limit, or when it returns no results at all.
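The stopping rule can be sketched as a loop, assuming a caller-supplied function that runs one facet query over HTTP and returns that page's (value, count) pairs. The names collect_facets, fetch_page, and fake_fetch are hypothetical; the fake fetcher below only stands in for a real HTTP call so the loop can be run on its own.

```python
def collect_facets(fetch_page, limit=15000):
    """Page through facet values until a short or empty page is returned.

    fetch_page(offset, limit) is a hypothetical callable that performs one
    facet query and returns a list of (value, count) pairs for that page.
    """
    results = []
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        results.extend(page)
        # A page shorter than the limit (including an empty one) is the last.
        if len(page) < limit:
            break
        offset += limit
    return results


# Stand-in for the HTTP call: 35 facet values already in index order.
VALUES = [(f"term{i:03d}", 1) for i in range(35)]


def fake_fetch(offset, limit):
    return VALUES[offset:offset + limit]


all_facets = collect_facets(fake_fetch, limit=15)
print(len(all_facets))
```

When the total number of values is an exact multiple of the limit, the loop issues one extra query that returns nothing, which is the "no further results" case.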