CassandraSQLContext throws error: Solr search queries only supported on the 'solr_query' field

Summary

On a 2-node cluster with separate workloads, one Solr node and one Spark node, a query run using csc (CassandraSQLContext) fails with this error:

solr search queries only supported on the 'solr_query' field

Symptoms

The following is typical of the error seen in the Spark shell:

scala> csc.sql("select * from keyspace1.table1 where col3 = 'ABCDEF'").show()
WARN 2016-05-25 10:41:55,490 org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 6.0 (TID 12, 172.31.29.142): java.io.IOException: Exception during execution of SELECT "col1", "col2", "col3", "col4", "col5" FROM "keyspace1"."table1" WHERE token("col3") <= ? AND "col3" = ? ALLOW FILTERING: Solr search queries only supported on the 'solr_query' field
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.com$datastax$spark$connector$rdd$CassandraTableScanRDD$$fetchTokenRange(CassandraTableScanRDD.scala:215)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:229)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:229)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at com.datastax.spark.connector.util.CountingIterator.hasNext(CountingIterator.scala:12)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)

 

Cause

The CassandraSQLContext (csc) does not correctly handle a SELECT that filters on columns other than solr_query when those columns are indexed by DSE Search: the predicate is pushed down as CQL with ALLOW FILTERING (visible in the trace above), and the DSE Search index only accepts filters on the solr_query field, as the error message indicates. To run a SELECT on DSE Solr-indexed fields from the Spark shell using csc, the query has to be run on a mixed-workload node where both Solr and Spark are started.

Solution

The solution for separate-workload clusters is to use the HiveContext. CassandraSQLContext is being phased out starting with DSE 5.0, and the HiveContext (hc) will be the standard going forward.
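
If hc is not already defined in the Spark shell session (whether DSE pre-creates it depends on the DSE version), a HiveContext can be built from the existing SparkContext. The following is a minimal sketch assuming Spark 1.x, as bundled with DSE at the time; sc is the SparkContext the shell provides:

scala> import org.apache.spark.sql.hive.HiveContext
scala> val hc = new HiveContext(sc)  // reuse the shell's SparkContext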

The same query above is then run through hc instead of csc:

scala> hc.sql("select * from keyspace1.table1 where col3 = 'ABCDEF'").show()
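
The result is an ordinary DataFrame, so it can be captured and processed with the standard DataFrame API; a brief example, reusing the placeholder table and column names from above:

scala> val df = hc.sql("select * from keyspace1.table1 where col3 = 'ABCDEF'")
scala> df.count()                        // materializes the query and counts matching rows
scala> df.select("col1", "col2").show()  // project a subset of the columns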

 
