This article provides information on configuring multiple contact points (hosts) for Bring-Your-Own-Spark configurations.
Since DataStax Enterprise 5.0, the Bring-Your-Own-Spark (BYOS) feature supports connecting to a DSE cluster from an external Spark cluster.
When connecting to DSE Analytics using an external Hadoop client, one of the configuration items is the spark.hadoop.cassandra.host property. The value of this property is set to the IP address of one of the DSE nodes running in Analytics mode. For example, a generated BYOS configuration file, byos.properties, will contain an entry of the following form:
spark.hadoop.cassandra.host=<IP address of a DSE Analytics node>
Apache Hadoop does not natively support multiple connections for high availability, so this property cannot accept a list of IP addresses or hostnames. Apache Spark has kept the implementation as-is for backward compatibility.
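Because the property accepts only a single address, it can help to check a generated file before using it. The following is a minimal sketch in Python, not part of DSE or BYOS: the helper names parse_properties and check_single_host are hypothetical, the sample address 192.0.2.10 is a documentation-only placeholder, and the parser handles only simple key=value lines rather than the full Java properties syntax.

```python
import re

# Hypothetical sample content; 192.0.2.10 is a documentation-only address,
# not a value taken from a real byos.properties file.
SAMPLE = """\
# generated BYOS configuration (illustrative)
spark.hadoop.cassandra.host=192.0.2.10
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments.

    This is a simplification: it ignores ':' separators, escapes and
    line continuations that the Java properties format also allows.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_single_host(props, key="spark.hadoop.cassandra.host"):
    """Return the configured host, or raise if it is missing or a list."""
    value = props.get(key, "")
    # Split on commas and whitespace to detect an attempted host list.
    hosts = [h for h in re.split(r"[,\s]+", value) if h]
    if len(hosts) != 1:
        raise ValueError(
            f"{key} must name exactly one host, found {len(hosts)}: {value!r}"
        )
    return hosts[0]


if __name__ == "__main__":
    print(check_single_host(parse_properties(SAMPLE)))
```

A value such as 192.0.2.10,192.0.2.11 would make check_single_host raise ValueError, which mirrors the limitation described above: only one contact point can be supplied through this property.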
The DSE Analytics team is reviewing the feasibility of implementing this enhancement. The internal feature request IDs are DSP-10873, DSP-13151 and DSP-16183.