Issue
A Solr query returns the wrong number of rows because of a UTF-8 character encoding problem. For example, given the following schema and data:
CREATE TABLE keyspace1.terms(collection text,field text,term text,displayname list<text>,locale list<text>,metadata map<text,text>,parentterm text,solr_query text,source list<text>,sourcekey list<text>,sourcetype list<text>,PRIMARY KEY((collection,field,term)));
CREATE CUSTOM INDEX keyspace1 ON keyspace1.terms (solr_query) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE INDEX terms_field_idx ON keyspace1.terms(field);
INSERT INTO keyspace1.terms(collection,field,term,displayname, metadata) VALUES('sessionSolr', 'kwb-case', '@Eata Banana', ['Eata Banana'], {'metadata_key': 'value'});
INSERT INTO keyspace1.terms(collection,field,term,displayname, metadata) VALUES('sessionSolr', 'kwb-case', '@Eata Bönana2', ['Eata Bönana'], {'metadata_key': 'value'});
INSERT INTO keyspace1.terms(collection,field,term,displayname, metadata) VALUES('sessionSolr', 'kwb-case', '@Eata Bönano2', ['Eata Banana'], {'metadata_key': 'value'});
INSERT INTO keyspace1.terms(collection,field,term,displayname, metadata) VALUES('sessionSolr', 'kwb-case', '@Eata Bonana', ['Eata Bönana'], {'metadata_key': 'value'});
Now run a Solr query from the admin UI:
http://10.101.32.78:8983/solr/#/keyspace1.terms/query
numFound reports only 2 rows, which is incorrect because 4 rows were inserted.
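The same count can also be read outside the admin UI through Solr's select handler. The following is a minimal Python sketch, assuming the core is reachable at the host and port shown above and that a match-all query (*:*) is the one being run. The helper names select_url, num_found and count_rows are hypothetical and not part of DSE or Solr.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host, port, and core name are taken from the admin URL above;
# the match-all query is an assumption about what was run.
BASE_URL = "http://10.101.32.78:8983/solr/keyspace1.terms"


def select_url(base_url, query="*:*"):
    """Build a select URL that asks for JSON and zero documents (count only)."""
    params = urlencode({"q": query, "wt": "json", "rows": 0})
    return f"{base_url}/select?{params}"


def num_found(response_text):
    """Extract numFound from a Solr JSON response body."""
    return json.loads(response_text)["response"]["numFound"]


def count_rows(base_url=BASE_URL, query="*:*"):
    """Query Solr over HTTP and return how many documents matched."""
    with urlopen(select_url(base_url, query)) as resp:
        return num_found(resp.read().decode("utf-8"))
```

Calling count_rows() against the affected node should return the same value the admin UI shows. With the four inserts above, 4 is the expected result.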
Solution
The problem is caused by the character encoding the JVM uses by default. To resolve it, follow this procedure:
1) Add -Dfile.encoding=UTF8 to the bottom of the jvm.options file.
2) Restart DSE: sudo service dse restart
3) Reload the index on keyspace1.terms.
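Step 1 matters because the file.encoding property influences the JVM's default charset, which is used whenever bytes are converted to text without an explicit charset. The Python sketch below only illustrates the general effect of decoding UTF-8 bytes with a different charset. It is not DSE's code path, and ISO-8859-1 as the mismatched charset is an assumption made for the example.

```python
# Terms taken from the INSERT statements above.
terms = ["@Eata Banana", "@Eata Bönana2", "@Eata Bönano2", "@Eata Bonana"]


def decode_with(text, charset):
    """Encode text as UTF-8, then decode the resulting bytes with `charset`."""
    return text.encode("utf-8").decode(charset, errors="replace")


for term in terms:
    # ISO-8859-1 is a hypothetical mismatched default, chosen for illustration.
    garbled = decode_with(term, "iso-8859-1")
    status = "unchanged" if garbled == term else "garbled"
    print(f"{term!r} -> {garbled!r} ({status})")
```

Pure-ASCII terms pass through unchanged, while the terms containing ö come out altered. Two of the four inserted terms contain ö, although the article does not state which rows were missing from the result.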