"java.lang.NoSuchMethodException" seen when attempting Spark streaming from Kafka

Summary

When attempting to stream data into Spark from Kafka, the job fails with a java.lang.NoSuchMethodException. Additional configuration is needed to resolve this.

Symptoms

The following error is seen:

15/02/18 08:39:11 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.NoSuchMethodException: kafka.serializer.StringDecoder.<init>(kafka.utils.VerifiableProperties)
at java.lang.Class.getConstructor0(Class.java:2892)
at java.lang.Class.getConstructor(Class.java:1723)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:106)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
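
For reference, this stack trace comes from the receiver-based Kafka API in Spark 1.x (KafkaUtils.createStream). The sketch below shows the kind of job that exercises this code path; the ZooKeeper address, consumer group, and topic name are placeholders, not values taken from this article:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // createStream instantiates kafka.serializer.StringDecoder reflectively
    // on the receiver; that lookup is what fails in the trace above.
    val stream = KafkaUtils.createStream(
      ssc,
      "zkhost:2181",         // ZooKeeper quorum (placeholder)
      "example-group",       // consumer group id (placeholder)
      Map("example" -> 1))   // topic -> receiver thread count (placeholder)

    stream.map(_._2).print()

    ssc.start()
    ssc.awaitTermination()
  }
}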

Cause

Two pieces of the default DSE Spark configuration interfere with Kafka streaming: the bundled libthrift-0.8.0.jar and the custom DSE client class loader. The class loader in particular prevents the Kafka receiver from resolving the kafka.serializer.StringDecoder constructor via reflection, producing the error above, so additional configuration is required to get streaming from Kafka to work.
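
In simplified form, the receiver looks up the decoder constructor reflectively, roughly as in the sketch below. When kafka.utils.VerifiableProperties is resolved through a different class loader than the one that loaded StringDecoder, the parameter types do not match and getConstructor throws the NoSuchMethodException seen above:

// Simplified illustration of the lookup performed in KafkaReceiver.onStart
// (KafkaInputDStream.scala:106 in the trace); not the exact Spark source.
val decoderClass = Class.forName("kafka.serializer.StringDecoder")

// getConstructor matches parameter types by Class identity, so a class
// loader mismatch makes this throw java.lang.NoSuchMethodException.
val ctor = decoderClass.getConstructor(classOf[kafka.utils.VerifiableProperties])
val decoder = ctor.newInstance(new kafka.utils.VerifiableProperties(new java.util.Properties()))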

Solution

1. Remove or rename "resources/spark/lib/libthrift-0.8.0.jar".

2. Comment out the following line in "resources/spark/bin/spark-class" (shown here as it should read after the change):

#SPARK_JAVA_OPTS+=" -Djava.system.class.loader=$DSE_CLIENT_CLASSLOADER"


Further information

The following internal bug reports cover this issue in more detail:

DSP-4720

DSP-4238

DSP-4804


Comments

    Rahul Gupta

    I have a 4-node cluster running DSE Spark. I guess this needs to be done on all 4 nodes?

    Also, do I need to restart DSE on all 4 nodes after this change?
