Summary
The Spark SQL Thrift Server was failing with a JVM OutOfMemoryError (either Java heap space or GC overhead limit exceeded) caused by a client application requesting the full contents of a table.
Symptoms
The following errors may be observed in thrift.log:
2017-04-11 00:41:55,017 org.apache.spark.util.Utils: Uncaught exception in thread task-result-getter-0
java.lang.OutOfMemoryError: GC overhead limit exceeded
2017-04-11 15:01:52,277 org.apache.spark.util.Utils: Uncaught exception in thread task-result-getter-0
java.lang.OutOfMemoryError: Java heap space
Cause
The problem was caused by a client application pulling the entire contents of a given table from DSE. Note: while this is not a common operation, there are scenarios where it is necessary.
By default, Spark runs all tasks in the job in parallel and collects their results at the same time. As a result, the entire result set must fit into the Thrift Server's JVM heap at once; for a large table this exhausts the heap and produces the out of memory conditions shown above.
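For illustration, here is a minimal sketch of the equivalent pattern in a Spark application, assuming Spark 2.x and a hypothetical keyspace and table ks.big_table (names are placeholders, not from the original case). Any query that materialises the full table on the driver behaves the same way:

import org.apache.spark.sql.SparkSession

object FullTablePull {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("full-table-pull").getOrCreate()

    // Every row of the table is shipped back to the driver JVM at once.
    // If the result set is larger than the driver heap, this fails with
    // java.lang.OutOfMemoryError (heap space or GC overhead limit exceeded).
    // ks.big_table is a hypothetical table used only for illustration.
    val rows = spark.sql("SELECT * FROM ks.big_table").collect()

    println(s"Fetched ${rows.length} rows")
    spark.stop()
  }
}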
Workaround
The Thrift Server can be configured so that Spark collects task results one at a time rather than all at once. While this is slower, results are returned incrementally, so the full data set never has to be held in the heap at the same time. The following parameter may be passed on the thrift server command line:
--conf spark.sql.thriftServer.incrementalCollect=true
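For example, when starting the Thrift Server through the dse command (the exact start command may vary by DSE version):

dse spark-sql-thriftserver start --conf spark.sql.thriftServer.incrementalCollect=true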
Solution
The user should consider other ways to return a smaller data set from DSE, using the distributed computation that Spark offers to do the “heavy lifting” (filtering, aggregation) on the cluster itself rather than in an upstream client application.
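As a hedged sketch of this approach, again using the hypothetical ks.big_table (the column names category, amount and event_date are also placeholders): instead of collecting every row, filter and aggregate in Spark so that only a small summary crosses back to the client.

import org.apache.spark.sql.SparkSession

object ServerSideAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("server-side-aggregation").getOrCreate()

    // The filter and aggregation run as distributed tasks on the cluster;
    // only one summary row per group is returned to the driver, so the
    // result easily fits in the JVM heap regardless of the table's size.
    val summary = spark.sql(
      """SELECT category, COUNT(*) AS row_count, SUM(amount) AS total
        |FROM ks.big_table
        |WHERE event_date >= '2017-01-01'
        |GROUP BY category""".stripMargin).collect()

    summary.foreach(println)
    spark.stop()
  }
}

The same principle applies to queries submitted through the Thrift Server itself: a JDBC client that issues the aggregating query above receives a handful of rows, whereas SELECT * forces the whole table through the server's heap.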