Overview
This article explains the warning messages relating to ports already in use when launching a Spark shell.
Background
Each SparkContext has an associated web interface (SparkUI) that displays useful information about Spark jobs and configuration. The SparkUI is accessible by browsing to http://<node_ip>:4040.
By default, the SparkUI binds to port 4040. If multiple SparkContexts are running on the same node, each additional SparkUI binds to the next available port, e.g. port 4041.
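The default port can also be overridden with the spark.ui.port property if a fixed, known port is preferred. Below is a minimal sketch, not taken from this article: the application name and the port value 4050 are arbitrary examples, and the same property can typically be passed to spark-shell on the command line (e.g. --conf spark.ui.port=4050).

import org.apache.spark.{SparkConf, SparkContext}

// Pin the web UI to a specific port before the SparkContext is created.
val conf = new SparkConf()
  .setMaster("local[*]")            // local test run only
  .setAppName("ui-port-example")    // hypothetical application name
  .set("spark.ui.port", "4050")     // default is 4040

val sc = new SparkContext(conf)     // SparkUI now listens on http://<node_ip>:4050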
Spark shell
When a Spark shell is launched, a SparkContext is automatically created, and a Spark web UI is therefore started and bound to port 4040.
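In effect, the shell performs the equivalent of the following sketch at startup (the real wiring lives in the org.apache.spark.repl module; this is only an approximation):

import org.apache.spark.{SparkConf, SparkContext}

// Creating the SparkContext is what starts the SparkUI; with no overrides it
// attempts to bind to port 4040.
val conf = new SparkConf().setMaster("local[*]").setAppName("Spark shell")
val sc = new SparkContext(conf)     // SparkUI now listening on port 4040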
To illustrate, consider a Spark shell whose java process has ID 11919:
$ lsof -i -n | grep LISTEN | grep 11919
java    11919 automaton  347u  IPv4  58510  0t0  TCP *:52154 (LISTEN)
java    11919 automaton  378u  IPv4  58356  0t0  TCP 127.0.0.1:38747 (LISTEN)
java    11919 automaton  379u  IPv4  58357  0t0  TCP *:55823 (LISTEN)
java    11919 automaton  380u  IPv4  58878  0t0  TCP *:4040 (LISTEN)
java    11919 automaton  399u  IPv4  59271  0t0  TCP *:43170 (LISTEN)
When a second Spark shell is launched, the following warning message is displayed:
WARN  2015-12-17 02:16:45 org.spark-project.jetty.util.component.AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
java.net.BindException: Address already in use
...
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) ~[na:1.7.0_45]
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.7.0_45]
        at org.spark-project.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187) ~[spark-core_2.10-1.4.1.3.jar:1.4.1.3]
...
        at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:228) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:238) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:238) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) [scala-library-2.10.5.jar:na]
        at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:238) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.ui.WebUI.bind(WebUI.scala:117) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:448) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
        at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:448) [spark-core_2.10-1.4.1.3.jar:1.4.1.3]
Further on, the log reports:
WARN 2015-12-17 02:16:45 org.apache.spark.util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
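The retry behaviour shown above is bounded: Spark only tries a limited number of successive ports, controlled by the spark.port.maxRetries property (16 by default). A minimal sketch of raising that limit, using an arbitrary example value and a hypothetical application name:

import org.apache.spark.{SparkConf, SparkContext}

// Allow Spark services (including the SparkUI) to try up to 32 successive
// ports before giving up; the default limit is 16.
val conf = new SparkConf()
  .setMaster("local[*]")                   // local test run only
  .setAppName("port-retries-example")      // hypothetical application name
  .set("spark.port.maxRetries", "32")

val sc = new SparkContext(conf)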
As with the first Spark shell, given that the second Spark shell spawned java process ID 12403, the output below shows that this process is bound to port 4041 instead:
$ lsof -i -n | grep LISTEN | grep 12403
java    12403 automaton  347u  IPv4  63010  0t0  TCP *:37127 (LISTEN)
java    12403 automaton  378u  IPv4  61103  0t0  TCP 127.0.0.1:33946 (LISTEN)
java    12403 automaton  379u  IPv4  61104  0t0  TCP *:45943 (LISTEN)
java    12403 automaton  380u  IPv4  61106  0t0  TCP *:4041 (LISTEN)
java    12403 automaton  399u  IPv4  64573  0t0  TCP *:58217 (LISTEN)
Conclusion
The warning message is generated by Spark and is important: it alerts the user that another SparkContext is already running and that the new Spark web UI is therefore bound to a different port. This avoids confusion when a user tries to monitor jobs but is inadvertently viewing the UI of a different SparkContext.
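If the warning is undesirable, for example for secondary shells used only for ad-hoc queries, the web UI can be disabled (or pinned to a known free port, as shown earlier) before the SparkContext is created. A minimal sketch, assuming the UI is simply not needed for that context; the same property can typically be passed to spark-shell via --conf:

import org.apache.spark.{SparkConf, SparkContext}

// Disabling the UI means no SparkUI port is opened at all, so no bind warning
// can occur. The application name is a hypothetical example.
val conf = new SparkConf()
  .setMaster("local[*]")               // local test run only
  .setAppName("no-ui-example")         // hypothetical application name
  .set("spark.ui.enabled", "false")    // do not start the SparkUI for this context

val sc = new SparkContext(conf)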
See also
Spark doc - Using the shell, Spark Programming Guide
Spark doc - Web Interfaces, Spark Monitoring and Instrumentation