Summary
This article discusses a rare situation where nodes upgraded to DSE 5.0 fail to start.
Applies to
- DataStax Enterprise 5.0.13 and earlier
Symptoms
In very rare and specific scenarios, a node that previously ran DSE 4.8 and has been upgraded to an early version of DSE 5.0 fails to start immediately after the software binary upgrade. During startup, the LegacySchemaMigrator, which is responsible for converting the old schema, fails with a NullPointerException. Here is an example stack trace from a node upgraded to DSE 5.0.12:
ERROR [main] 2018-05-15 02:01:50,152 CassandraDaemon.java:678 - Exception encountered during startup
java.lang.NullPointerException: null
    at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:156) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.serializers.AbstractTextSerializer.deserialize(AbstractTextSerializer.java:41) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.serializers.AbstractTextSerializer.deserialize(AbstractTextSerializer.java:28) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:113) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.cql3.UntypedResultSet$Row.getString(UntypedResultSet.java:267) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:743) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:698) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:334) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:275) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:246) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:239) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_131]
    at org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:239) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:188) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:179) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_131]
    at org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:179) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:79) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:235) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:469) ~[dse-core-5.0.12.jar:5.0.12]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:582) ~[cassandra-all-3.0.15.2128.jar:3.0.15.2128]
    at com.datastax.bdp.DseModule.main(DseModule.java:91) [dse-core-5.0.12.jar:5.0.12]
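To check whether a node's log shows this symptom, the three distinguishing markers in the stack trace above can be searched for together. The following is only a rough sketch: the function name matches_symptom is hypothetical, the marker strings are copied from the example trace, and the caller is assumed to have already read the log contents into a string. It is a heuristic, not an official diagnostic.

```python
# Marker strings copied from the example stack trace above.
STARTUP_ERROR = "Exception encountered during startup"
NPE = "java.lang.NullPointerException"
MIGRATOR_FRAME = "org.apache.cassandra.schema.LegacySchemaMigrator"


def matches_symptom(log_text: str) -> bool:
    """Heuristic (hypothetical helper): True if a startup error in log_text
    is followed by an NPE and a LegacySchemaMigrator frame before the next
    startup error, or before the end of the text."""
    start = log_text.find(STARTUP_ERROR)
    while start != -1:
        next_start = log_text.find(STARTUP_ERROR, start + 1)
        end = next_start if next_start != -1 else len(log_text)
        block = log_text[start:end]
        if NPE in block and MIGRATOR_FRAME in block:
            return True
        start = next_start
    return False
```

A match only suggests that the node hit the failure described here; other startup failures that happen to log the same strings nearby would also match.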
Cause
Range tombstones in Apache Cassandra 2.1 SSTables are handled incorrectly by Cassandra 3.0: two CQL rows are created where only one should exist (CASSANDRA-14008).
In the rare case where a range tombstone exists in one of the legacy CQL schema SSTables, the redundant row contains null values. These null values cause the NullPointerException in the LegacySchemaMigrator, which runs on startup after a node is upgraded to DSE 5.0.
The fix was first included in DSE 5.0.14; earlier versions of DSE 5.0 do not contain it.
Solution
If the cluster is affected by the symptoms above, upgrade to DSE 5.0.14 or later to obtain the fix, which allows upgraded nodes to start successfully.
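When planning upgrades across several nodes, the version boundary above can be expressed as a small check. This is a minimal sketch under stated assumptions: the helper names parse_version and is_affected are hypothetical, version strings are assumed to be purely numeric and dot-separated (for example "5.0.12"), and only DSE 5.0.x releases before 5.0.14 are treated as affected.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '5.0.12' into (5, 0, 12)."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str) -> bool:
    """Hypothetical helper: True for DSE 5.0.x releases earlier than 5.0.14,
    the first release stated above to contain the fix."""
    parsed = parse_version(version)
    return parsed[:2] == (5, 0) and parsed < (5, 0, 14)


# Example: decide which target versions are safe to upgrade to.
for candidate in ("5.0.12", "5.0.13", "5.0.14"):
    print(candidate, "affected" if is_affected(candidate) else "contains the fix")
```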