DataStax Help Center

Spark submit fails with "class not found" when deploying in cluster mode


Spark jobs can be submitted in "cluster" mode or "client" mode. The former launches the driver on one of the cluster nodes; the latter launches the driver on the node from which the job is submitted.
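The difference is controlled by the --deploy-mode flag passed to spark-submit. A minimal sketch (the application jar name here is a placeholder; the class name matches the example later in this note):

```shell
# Driver runs on the submitting node ("client" is the default):
dse spark-submit --deploy-mode client --class com.test.example app.jar

# Driver runs on one of the Spark worker nodes:
dse spark-submit --deploy-mode cluster --class com.test.example app.jar
```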

When submitting in cluster mode, a "class not found" error can occur if the relevant jar files are not accessible to the driver. This note works through an example showing how to make them available.


The following is typical of the type of error that might be seen:

Exception in thread "main" java.lang.reflect.InvocationTargetException 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils$


Although the jar files were made available to all nodes in the cluster (e.g. via an NFS share), the --driver-class-path option was not included, so the driver, launched on a worker node, could not find the required classes.


The following is an example of what was used to resolve the issue:

$ sudo -u cassandra dse spark-submit -v \
--conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
--jars $JARS \
--executor-memory 512M \
--total-executor-cores 2 \
--deploy-mode "cluster" \
--master spark:// \
--supervise \
--driver-class-path $JARS_COLON_SEP \
--class "com.test.example" $APP_JAR "$INPUT_PATH" --files $INPUT_PATH

The environment variables referenced above were set as follows. Note that --jars takes a comma-separated list, while --driver-class-path takes a colon-separated list:

JARS=/home/bob/spark_job/lib/nscala-time_2.10-2.0.0.jar,/home/bob/spark_job/lib/kafka_2.10-,/home/bob/spark_job/lib/kafka-clients-,/home/bob/spark_job/lib/spark-streaming-kafka_2.10-1.4.1.jar,/home/bob/spark_job/lib/zkclient-0.3.jar,/home/bob/spark_job/lib/protobuf-java-2.4.0a.jar
JARS_COLON_SEP=/home/bob/spark_job/lib/nscala-time_2.10-2.0.0.jar:/home/bob/spark_job/lib/kafka_2.10-
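Both variables list the same jar files, only with different separators. Rather than maintaining the two lists by hand, they can be generated from the jar directory; a minimal sketch, assuming the lib path from the example above (paste -sd joins the lines with the given delimiter):

```shell
#!/bin/sh
# Build the two list formats spark-submit needs from one directory of jars:
# --jars wants comma-separated, --driver-class-path wants colon-separated.
LIB_DIR=${LIB_DIR:-/home/bob/spark_job/lib}   # path from the example; override as needed

JARS=$(printf '%s\n' "$LIB_DIR"/*.jar | paste -sd, -)
JARS_COLON_SEP=$(printf '%s\n' "$LIB_DIR"/*.jar | paste -sd: -)

echo "$JARS"
echo "$JARS_COLON_SEP"
```

Keeping both lists derived from the same directory avoids the two drifting apart when jars are added or upgraded.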

Further info

The Spark documentation outlines the available spark-submit options and discusses examples of their use. The main distinction is that if the jar files need to be on the driver's system classpath, the --driver-class-path option is required; passing them via --jars alone is not sufficient in this scenario.
