Following an upgrade of DSE, it is sometimes necessary to run nodetool upgradesstables on your nodes to convert sstables to the format used by the new Cassandra version. This article answers some common questions that users ask when running nodetool upgradesstables.
Questions and Answers
- When do I need to run nodetool upgradesstables?
When upgrading DSE, you need to run nodetool upgradesstables if the underlying version of Cassandra goes up by a major version. For example, you would need to run nodetool upgradesstables when upgrading from DSE 4.6.x (uses Cassandra 2.0.x) to DSE 4.8.x (uses Cassandra 2.1.x). You can find the version of Cassandra in your version of DSE in the banner that cqlsh prints at startup, or in the release notes in our online documentation. If in doubt, or if you need to confirm the upgrade path from an older version of DSE, contact DataStax Support.
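As a rough illustration of the rule above, the sketch below compares the major.minor part of two Cassandra version strings. The function name needs_upgradesstables is hypothetical, and treating a change in the first two version components as "a major version" is an assumption based on the 2.0.x to 2.1.x example. It is not an authoritative test, so check the release notes when in doubt.

```shell
# Hypothetical helper: succeeds (exit 0) when the major.minor part of the
# old and new Cassandra versions differs, e.g. 2.0.x -> 2.1.x.
# Assumption: a change in the first two components marks a "major" upgrade.
needs_upgradesstables() {
    old_mm=$(printf '%s\n' "$1" | cut -d. -f1,2)   # e.g. 2.0.11 -> 2.0
    new_mm=$(printf '%s\n' "$2" | cut -d. -f1,2)   # e.g. 2.1.8  -> 2.1
    [ "$old_mm" != "$new_mm" ]
}
```

For example, needs_upgradesstables 2.0.11 2.1.8 succeeds, while needs_upgradesstables 2.1.5 2.1.8 does not. On a running node, nodetool version prints the Cassandra release version and can supply one of the two values.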
- Can I run nodetool upgradesstables on more than one node?
Yes - you can run nodetool upgradesstables on multiple nodes at the same time. However, to ensure your cluster can still service requests, use the replication factor as a guide to how many nodes to run it on at once. For example, if your replication factor is 3, run nodetool upgradesstables on only every third node at a time.
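The "every third node" guideline can be sketched as a small grouping helper. The function name upgrade_batches is hypothetical. It assumes the nodes are passed in ring order, that each node owns a single token, and that the node count is a multiple of the replication factor. With vnodes, or with a node count that does not divide evenly, this grouping may not keep replicas apart, so treat the output as a starting point only.

```shell
# Hypothetical helper: split a ring-ordered node list into RF batches so that
# each batch contains every RF-th node.
# Usage: upgrade_batches RF node1 node2 ...
upgrade_batches() (
    rf=$1
    shift
    [ "$#" -gt 0 ] || return 0
    printf '%s\n' "$@" | awk -v rf="$rf" '
        # Node at ring position NR goes into batch (NR - 1) mod RF.
        { idx = (NR - 1) % rf; batch[idx] = batch[idx] " " $0 }
        END { for (b = 0; b < rf; b++) print "batch " (b + 1) ":" batch[b] }
    '
)
```

With a replication factor of 3 and six nodes n1 to n6, the first batch is n1 and n4, the second n2 and n5, and the third n3 and n6. Run nodetool upgradesstables on one batch at a time.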
- How can I speed up nodetool upgradesstables?
You can speed up nodetool upgradesstables by unthrottling compaction throughput with 'nodetool setcompactionthroughput 0'. This takes effect on the fly (no restart of DSE is required). If your nodes start to struggle, you can throttle the load again with 'nodetool setcompactionthroughput 16' or a lower value (16 MB/s is the default).
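The sequence above can be wrapped in a small function so that the throttle is put back once the upgrade finishes. This is a minimal sketch. The function name upgradesstables_unthrottled is hypothetical, and restoring to 16 assumes the node was using the default. If compaction_throughput_mb_per_sec is set differently in cassandra.yaml, pass that value instead.

```shell
# Hypothetical wrapper: unthrottle compaction, run the upgrade, then restore
# the throttle whether or not the upgrade succeeded.
# Usage: upgradesstables_unthrottled [restore_mb]   (assumed default: 16)
upgradesstables_unthrottled() {
    restore_mb=${1:-16}

    # 0 disables compaction throttling; no restart is needed.
    nodetool setcompactionthroughput 0 || return 1

    nodetool upgradesstables
    status=$?

    # Put the throttle back before reporting the result of the upgrade.
    nodetool setcompactionthroughput "$restore_mb"
    return "$status"
}
```

If the node starts to struggle while the upgrade is running, 'nodetool setcompactionthroughput 16' can still be issued from another terminal.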
- How long will nodetool upgradesstables take to run?
The run time depends on how many sstables need to be rewritten, so the best guide is to track progress while the command is running. Under the data directory, if you look at the sstables for a table, you'll notice that every sstable file name includes a format version. For example, consider an upgrade between these versions:
DSE 4.6.x (uses Cassandra 2.0.x) has 'jb' in the file name
DSE 4.8.x (uses Cassandra 2.1.x) has 'ka' in the file name.
Once nodetool upgradesstables is running, you can gauge how much work is left by counting the sstable files that still have 'jb' in the file name. Run the commands from the data directory, and quote the pattern so that the shell passes it to find unexpanded. This find command shows how many older-format 'jb' sstable files are left:
find . -name '*jb*.db' | wc -l
This find command shows how many sstable files have been upgraded to the newer format:
find . -name '*ka*.db' | wc -l
Comparing these numbers over time gives some indication of how long nodetool upgradesstables has left to run. When nodetool upgradesstables has completed, all sstables will show 'ka' in the file name.
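The comparison over time can be sketched as two helpers: one that counts sstables of a given format, and one that turns two counts into a rough time estimate. The function names count_sstables and eta_seconds are hypothetical. The sketch assumes the 2.0.x/2.1.x naming scheme in which the format version appears as '-jb-' or '-ka-' before the generation number. It counts only the Data.db component so that each sstable is counted once. It skips the snapshots and backups directories on the assumption that files there are not rewritten. The estimate assumes sstables are upgraded at a steady rate, which varies with sstable size, so treat it as an indication only.

```shell
# Hypothetical helper: count sstables of one format version under a directory.
# $1 = data directory, $2 = format version such as jb or ka
count_sstables() {
    find "$1" -type f -name "*-$2-*-Data.db" \
        ! -path '*/snapshots/*' ! -path '*/backups/*' | wc -l | tr -d ' '
}

# Hypothetical helper: rough seconds remaining, from two counts of old-format
# sstables taken some time apart.
# $1 = earlier count, $2 = current count, $3 = seconds between the two counts
eta_seconds() {
    done_count=$(( $1 - $2 ))
    if [ "$done_count" -le 0 ]; then
        echo unknown        # no measurable progress in this interval
        return 0
    fi
    echo $(( $2 * $3 / done_count ))
}
```

For example, if count_sstables reports 120 'jb' sstables and then 90 a minute later, eta_seconds 120 90 60 prints 180, meaning roughly three more minutes at that rate.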