DataStax Help Center

Removing unreachable nodes from Gossip

Summary

Occasionally, when a node is decommissioned or removed, the process does not complete. This article describes how to remove the node from gossip using JMX tools.

Symptoms

The node may have long since been removed from the cluster, either with nodetool decommission on the node itself or with nodetool removenode <Host ID> on another node, but it still shows up in nodetool status. For example:

Datacenter: DC1
===============
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns   Host ID                               Rack
UL  10.0.0.1  275.12 GB  1       16.7%  22005584-544e-47bb-80a7-c5e283be137b  RAC1
UN  10.0.0.2  335.22 GB  1       16.7%  6e3e4793-9161-4873-98f1-bc040ae907e7  RAC2

The output of the nodetool netstats command may also show that all streams are 100% complete.
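As a rough aid for spotting this symptom, the sketch below scans nodetool status text and lists every node whose two-letter code is anything other than UN. The function name find_abnormal_nodes and the SAMPLE constant are illustrative names, not part of any DataStax tool, and the parsing assumes the row layout shown in the example output above.

```python
import re

# Sample text copied from the nodetool status output shown in this article.
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UL 10.0.0.1 275.12 GB 1 16.7% 22005584-544e-47bb-80a7-c5e283be137b RAC1
UN 10.0.0.2 335.22 GB 1 16.7% 6e3e4793-9161-4873-98f1-bc040ae907e7 RAC2
"""

# A node row starts with a two-letter code: U/D for status, N/L/J/M for state.
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)\s")
# Host IDs are UUIDs, which makes them easy to pick out of the row.
HOST_ID = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def find_abnormal_nodes(status_text):
    """Return (code, address, host_id) for every node row that is not UN."""
    found = []
    for line in status_text.splitlines():
        row = ROW.match(line.strip())
        if not row or row.group(1) == "UN":
            continue  # header line, separator, or a healthy node
        host = HOST_ID.search(line)
        found.append((row.group(1), row.group(2), host.group(0) if host else None))
    return found


if __name__ == "__main__":
    for code, address, host_id in find_abnormal_nodes(SAMPLE):
        print(code, address, host_id)
```

A node that stays in a state such as UL long after its removal was started is a candidate for the procedure described under Solution.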

Cause

Depending on which version of DSE is in use, this may be caused by a known issue with gossip. One of the most recent at the time of writing was CASSANDRA-10371.

Solution

Using a graphical JMX tool or a command-line tool, you can remove the node from gossip by invoking the unsafeAssassinateEndpoint(<IP address>) operation on the org.apache.cassandra.net:type=Gossiper MBean.
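For the command-line route, one possible approach is to generate a small jmxterm command script. The sketch below only builds the script text; it does not connect to anything. The helper name build_jmxterm_script is hypothetical, and the default port 7199 assumes the node uses Cassandra's standard JMX port with no authentication.

```python
import ipaddress

# ObjectName of the Gossiper MBean that exposes the assassinate operation.
GOSSIPER_BEAN = "org.apache.cassandra.net:type=Gossiper"


def build_jmxterm_script(jmx_host, endpoint_ip, jmx_port=7199):
    """Build jmxterm commands that assassinate endpoint_ip via jmx_host.

    jmx_host is a live node to connect to; endpoint_ip is the dead node
    that should be removed from gossip. Port 7199 is an assumed default.
    """
    # Validate the address up front; raises ValueError on a malformed IP.
    ip = ipaddress.ip_address(endpoint_ip)
    commands = [
        f"open {jmx_host}:{jmx_port}",           # connect to the live node
        f"bean {GOSSIPER_BEAN}",                 # select the Gossiper MBean
        f"run unsafeAssassinateEndpoint {ip}",   # invoke the operation
        "close",                                 # drop the JMX connection
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # Example addresses taken from the nodetool status output in this article.
    print(build_jmxterm_script("10.0.0.2", "10.0.0.1"), end="")
```

The generated text could be saved to a file and passed to jmxterm in non-interactive mode (for example with its -n and -i options, assuming a jmxterm jar is available on the machine). Because the operation is unsafe by name, it is worth double-checking the IP address before running it.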

Here's an example using jconsole:

 

[Image: assassinate.png]


Comments

  • HT

    It's also a good idea to manually remove the unreachable node from each node's system.peers table.

  • Sean Fuller

    This also works if you do not have jconsole access to the nodes:
    https://gist.github.com/justenwalker/8338334

  • Gregory Smith

    I have had several occasions where jmxterm timed out before the assassinate completed. Try jmxsh instead.
