Full ERROR Message Example
ERROR [RMI TCP Connection(1204)-127.0.0.1] 2020-07-23 04:29:12,024 StorageService.java:4813 - Error while decommissioning node
This ERROR typically produces a stack trace similar to the following (the stack trace will differ based on version):
ERROR [RMI TCP Connection(1204)-127.0.0.1] 2020-07-23 04:29:12,024 StorageService.java:4813 - Error while decommissioning node
org.apache.cassandra.streaming.StreamException: Stream failed
at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:92)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1764)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:456)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:817)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:753)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:634)
at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:240)
at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:216)
at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:536)
at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:936)
at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:622)
at org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:290)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
java.lang.RuntimeException: Error while decommissioning node: Failed to transfer all hints to 20a1d78c-f395-48b8-ae11-7e2e17440ab3
at org.apache.cassandra.service.StorageService.decommission(StorageService.java:4814)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
What does this ERROR message mean?
The error while decommissioning a node means that an exception occurred during one of the phases of the decommission process, causing Cassandra to abort the operation.
The error is logged from a catch block, so the underlying cause depends on which exception was thrown and the code path that raised it.
Most commonly, the exception comes from a failed streaming operation.
The stack trace provides the most detail about which phase of the decommission process the ERROR occurred in.
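To find the full stack trace for a given failure, search the system.log on the decommissioning node around the time the ERROR was reported; a minimal sketch, assuming the default log location:
grep -A 40 "Error while decommissioning node" /var/log/cassandra/system.log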
Why does this ERROR occur?
The error typically occurs due to a stream failure, but the stream can fail for a variety of reasons, including:
- Network failures
- Overloaded or under-provisioned nodes
- Running repairs
- Long GC pauses
- SSTable corruption
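Regardless of the underlying cause, stream progress (and the peer a failed stream was connected to) can be observed on the decommissioning node while the operation runs; a quick check using nodetool (the grep simply hides files that have already completed):
nodetool netstats | grep -v 100%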
How do you fix this ERROR?
When this ERROR occurs, the decommission operation is aborted. The underlying cause of the failure must be resolved before restarting the decommission process.
Examining the log entries is an effective way to determine the cause of the failure. The sections below can serve as a guide, but first note the timestamp of the decommission failure and examine only the relevant time period of the Cassandra logs across the cluster leading up to the point of failure.
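For example, to pull the ERROR and WARN entries from the window leading up to the failure shown above (the timestamp pattern and log path are illustrative and should be adjusted for your environment):
grep -E "^(ERROR|WARN).*2020-07-23 0[34]:" /var/log/cassandra/system.log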
Network failures
When network problems cause decommission to fail, examine the Cassandra logs for connectivity-related errors, for example:
ERROR [STREAM-OUT-/10.5.24.100:7000] 2020-05-23 14:53:24,442 StreamSession.java:603 - [Stream #539374e0-5e99-11e8-8f59-8567d2f2bd2c] Streaming error occurred on session with peer 10.3.22.100 org.apache.cassandra.io.FSReadError: java.io.IOException: Broken pipe
ERROR [internode-messaging RemoteMessageServer IO worker-5-6] 2020-03-30 05:27:16,582 MessageServer.java:306 - Connection reset by peer
java.io.IOException: Connection reset by peer
If the network error is believed to be temporary or caused by an increase in traffic, retry decommissioning the node, or wait until off-peak hours to do so. If the decommission still fails, perform a rolling restart of the cluster and retry.
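A rolling restart is performed one node at a time, waiting for each node to return to the UN (Up/Normal) state before moving on; a minimal sketch per node, assuming a package installation managed by systemd (the service name will differ for tarball or DSE installations):
nodetool drain                    # flush memtables and stop accepting new connections
sudo systemctl restart cassandra  # service name is an assumption
nodetool status                   # confirm the node reports UN before restarting the next one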
Overloaded or under-provisioned nodes
When this is the cause of the decommission failure, examine the Cassandra logs for signs that nodes in the cluster were overloaded to the point that they became unavailable during the decommission process.
This can include dropped messages, excessive tombstone warnings, or the inability to reach QUORUM, for example:
INFO [ScheduledTasks:1] 2020-05-23 14:09:20,509 MessagingService.java:1273 - MUTATION messages were dropped in last 5000 ms: 0 internal and 43 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 5960 ms
INFO [ScheduledTasks:1] 2020-05-23 14:09:20,509 MessagingService.java:1273 - READ messages were dropped in last 5000 ms: 0 internal and 136 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 5960 ms
WARN [ReadStage-47] 2020-05-23 16:48:00,809 ReadCommand.java:543 - Read 5000 live rows and 10000 tombstone cells for query SELECT * FROM <keyspace.table> LIMIT 5000 (see tombstone_warn_threshold)
ERROR [ReadRepairStage:518] 2020-05-23 15:07:52,411 CassandraDaemon.java:229 - Exception in thread Thread[ReadRepairStage:518,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
If the cluster is overloaded to the point that decommission fails, it may be necessary to retry during a period of lower activity, throttle the workload while the decommission completes, clear excessive tombstones, or add resources or nodes to better suit the cluster's needs.
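Before retrying, it can help to confirm the nodes have recovered and, if streaming itself is adding too much load, to lower the streaming throughput cap; a sketch using standard nodetool commands (the 50 Mb/s value is illustrative only):
nodetool tpstats                  # check each node for pending tasks and dropped message counts
nodetool getstreamthroughput
nodetool setstreamthroughput 50   # cap streaming at 50 Mb/s during the decommission; 0 removes the cap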
Running repairs
As part of the decommission process, it is necessary to stream data from the decommissioning node to other nodes in the cluster. For this reason, repair must be stopped before decommissioning a node.
If repair is not stopped before decommission starts, sstables being written/compacted as part of the repair process can cause decommission to fail with errors similar to the following:
WARN [STREAM-IN-/127.0.0.1:7001] 2020-05-08 11:42:58,896 CompressedStreamReader.java:119 - [Stream 6b7f2ae0-5269-11e8-ac52-0d9f9c87e6b4] Error while reading partition null from stream on ks='keyspace' and table='table'.
ERROR [STREAM-IN-/100.64.43.27:7001] 2020-05-08 11:42:58,897 StreamSession.java:597 - [Stream #6b7f2ae0-5269-11e8-ac52-0d9f9c87e6b4] Streaming error occurred on session with peer 100.0.0.27
If this happens, stop any running repair, then run the following command on each node in the cluster to clear any lingering repair streams before restarting the decommission process:
nodetool sjk mx -mc -b "org.apache.cassandra.db:type=StorageService" -op forceTerminateAllRepairSessions
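Before restarting the decommission, it can be worth confirming that no repair activity remains; a couple of quick checks (output formats vary by version):
nodetool compactionstats             # active validation compactions indicate repair is still running
nodetool netstats | grep -i repair   # any remaining repair stream sessions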
Long GC pauses
If a node in the cluster encounters a long GC pause during the decommission process, it can appear down to the decommissioning node and cause a stream failure. To determine whether this contributed to the failure, look for GCInspector log entries and check the duration of the pause; in the example below the pause is 5170ms, which is long enough to cause a failure:
WARN [Service Thread] 2020-05-23 14:09:15,508 GCInspector.java:282 - G1 Young Generation GC in 5170ms. G1 Eden Space: 18035507200 -> 0; G1 Old Gen: 12280584520 -> 26468408336; G1 Survivor Space: 1132462080 -> 662700032;
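To find pauses like this across the cluster, GCInspector entries can be filtered by duration; a minimal sketch, assuming the default log location (the four-digit pattern matches pauses of 1000ms or more):
grep "GCInspector" /var/log/cassandra/system.log | grep -E "in [0-9]{4,}ms"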
If this is the cause of the failure, take action to alleviate long GC pauses, for example:
- Reduce the number of tombstones being read
- Avoid large batch updates
- Avoid queries that perform full table scans on large tables
- Throttle the workload
SSTable corruption
If any node in the cluster, including the decommissioning node, encounters a corrupted sstable during streaming, it can cause decommission to fail. To determine if a corrupted sstable caused the problem, look for log messages that point to corruption during an sstable read, for example:
ERROR [CoreThread-3] 2020-05-16 05:16:07,516 VerbHandlers.java:77 - Unexpected error during execution of request READS.RANGE_READ (2216292): /10.100.123.108 -> /10.100.123.108 org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /var/lib/dse/cassandra/data/keyspace/table-4d442f92601711eabd8cd37a4e99c262/ac-16222-bti-Data.db
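To list every corrupted sstable a node has reported, search the system log on each node involved in the streaming; a sketch assuming the default log location:
grep "CorruptSSTableException" /var/log/cassandra/system.log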
If this is found, the corruption will need to be repaired before restarting the decommission process.
To attempt to repair the corruption online, run:
nodetool scrub <keyspace> <table>
Some types of corruption cannot be repaired online. When this happens, bring down the node and run the offline tool:
sstablescrub <keyspace> <table>
If the sstable is corrupted to the point that it cannot be repaired, but the replication factor for the keyspace is greater than one, delete the corrupted sstable from the node and run a full repair on the node to recover the data, for example:
nodetool repair --full <keyspace> <table>
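The corrupted sstable's files should only be removed while Cassandra is stopped on that node, and every component file of that sstable generation must be deleted together; a minimal sketch using the path from the example above (the service name is an assumption and paths will differ per installation):
sudo systemctl stop cassandra
rm /var/lib/dse/cassandra/data/keyspace/table-4d442f92601711eabd8cd37a4e99c262/ac-16222-bti-*
sudo systemctl start cassandra
Once the node is back up, run the full repair shown above to re-stream the missing data from the other replicas.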
If the replication factor is one, or more than one node is reporting corruption for the same table, it may be necessary to restore the table before decommissioning.