Summary
This article describes an issue in DSE 5.1.0 where JARs specified with --driver-class-path or spark.driver.extraClassPath are not placed on the Spark driver classpath.
Symptoms
When running spark-submit with the --driver-class-path option, the JAR does not get placed in the driver classpath. For example, invoking Spark shell as follows:
$ dse spark --driver-class-path /path/to/my.jar
or
$ dse spark --conf spark.driver.extraClassPath=/path/to/my.jar
does not add the JAR to the classpath. No error is returned, but the Environment page of the Spark driver web UI (default port 4040) does not contain an entry for the JAR file. When the option works as intended, an entry like the following appears at the bottom of the page:
/path/to/my.jar System Classpath
Since the JAR is not placed in the classpath, applications requiring the JAR fail to run.
Cause
In DSE 5.1.0, there is an argument missing in a section of the modified spark-class shell script responsible for setting up the classpath for the DSE implementation of Apache Spark (internal defect ID DSP-13289).
As a result, the classpath is not populated with all the required arguments.
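The effect can be illustrated with a minimal sketch. It assumes that the argument being substituted already holds the classpath entries contributed by --driver-class-path, which is an inference from the workaround rather than something confirmed by the script source. The substitute_classpath function and the LAUNCH_CLASSPATH value are hypothetical and exist only for this illustration; they are not part of spark-class.

```shell
#!/usr/bin/env bash
# Illustration only: hypothetical values, not taken from a real installation.
LAUNCH_CLASSPATH="/opt/example/dse-spark/lib/*"
USER_CLASSPATH="/path/to/my.jar"   # assumed contribution of --driver-class-path

# Simplified stand-in for the classpath substitution step (hypothetical helper).
substitute_classpath() {
    local mode="$1" ARG="$2"
    if [ "$mode" = "buggy" ]; then
        ARG="$LAUNCH_CLASSPATH"        # existing value of ARG is discarded
    else
        ARG="$LAUNCH_CLASSPATH:$ARG"   # existing value of ARG is kept
    fi
    printf '%s\n' "$ARG"
}

buggy_cp="$(substitute_classpath buggy "$USER_CLASSPATH")"
fixed_cp="$(substitute_classpath fixed "$USER_CLASSPATH")"

echo "overwritten: $buggy_cp"
echo "appended:    $fixed_cp"
```

In the first case the user-supplied JAR no longer appears in the resulting classpath string; in the second it follows the launch classpath after a colon.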
Workaround
Modify the spark-class shell script in the following locations:
- for packaged installations - /usr/share/dse/spark/bin
- for tarball installations - <install_location>/resources/spark/bin
Step 1 - In this section of the exec_with_dse_impl() function:
if [ "$substituteClassPath" == "1" ]; then
    substituteClassPath=0
    ARG="$LAUNCH_CLASSPATH"
Replace the following line:
ARG="$LAUNCH_CLASSPATH"
with (add $ARG at the end):
ARG="$LAUNCH_CLASSPATH:$ARG"
Step 2 - Make the same update to the spark-class script on all nodes in the cluster.
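The manual edit can also be scripted. The following is a sketch rather than a supported tool: patch_spark_class is a hypothetical helper name, and the sed pattern assumes the line to change ends with ARG="$LAUNCH_CLASSPATH" exactly as shown in Step 1. It keeps a copy of the original script with a .bak suffix and does nothing if the change is already in place.

```shell
#!/usr/bin/env bash
# Sketch: apply the one-line workaround to a spark-class script.
# patch_spark_class is a hypothetical helper, not something shipped with DSE.

patch_spark_class() {
    local script="$1"

    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi

    # Skip files that already contain the corrected assignment.
    if grep -q 'ARG="\$LAUNCH_CLASSPATH:\$ARG"' "$script"; then
        echo "already patched: $script"
        return 0
    fi

    # Edit in place; the untouched original is saved as <script>.bak.
    sed -i.bak 's|ARG="\$LAUNCH_CLASSPATH"[[:space:]]*$|ARG="$LAUNCH_CLASSPATH:$ARG"|' "$script"

    # Confirm the replacement actually happened.
    if grep -q 'ARG="\$LAUNCH_CLASSPATH:\$ARG"' "$script"; then
        echo "patched: $script"
    else
        echo "expected line not found, nothing changed: $script" >&2
        return 1
    fi
}

# Example invocation for a packaged installation (usually needs root):
#   patch_spark_class /usr/share/dse/spark/bin/spark-class
```

Run the function once on each node, then confirm the result by checking that the JAR is listed in the Spark driver web UI.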
Solution
The fix for DSP-13289 is included in DSE 5.1.1. Upgrade affected clusters to DSE 5.1.1 or newer to obtain the fix.
See also
DSE doc - DSE 5.1.1 Release Notes