Summary
This article discusses a race condition that causes SSTable reads to fail.
Applies to
- DataStax Enterprise 6.7.3
Symptom
In rare situations, reading SSTable files fails with an assertion error. The following example shows a read failure that occurred during a compaction operation:
ERROR [CompactionExecutor:5] 2019-05-21 14:50:57,702 CassandraDaemon.java:126 - Exception in thread Thread[CompactionExecutor:5,5,main]
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /data/shows/got-69a222a2765211e99f83f7c2988ecce9/aa-806-bti-Data.db
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:167)
    at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:101)
    at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:33)
    ...
Caused by: java.io.IOException: Error building row with data deserialized from RandomAccessReader:Prefetching rebufferer: (8/4) buffers read-ahead, 4096 buffer size
    at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:648)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:488)
    ...
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /data/shows/got-69a222a2765211e99f83f7c2988ecce9/aa-806-bti-Data.db
    at org.apache.cassandra.io.util.CompressedChunkReader$Standard.error(CompressedChunkReader.java:272)
    at org.apache.cassandra.io.util.CompressedChunkReader$Standard.readChunk(CompressedChunkReader.java:154)
    at org.apache.cassandra.io.util.ChunkReader.readScattered(ChunkReader.java:108)
    ...
Caused by: java.lang.AssertionError: Slab should have been unreferenced and all buffers returned before recycling
    at org.apache.cassandra.utils.memory.buffers.MemorySlabWithBumpPtr.recycle(MemorySlabWithBumpPtr.java:174)
    at org.apache.cassandra.utils.memory.buffers.TemporaryBufferPool.newSlab(TemporaryBufferPool.java:321)
    at org.apache.cassandra.utils.memory.buffers.TemporaryBufferPool.switchSharedSlab(TemporaryBufferPool.java:232)
    at org.apache.cassandra.utils.memory.buffers.TemporaryBufferPool.allocateFromShared(TemporaryBufferPool.java:197)
    at org.apache.cassandra.utils.memory.buffers.TemporaryBufferPool.allocate(TemporaryBufferPool.java:128)
    at org.apache.cassandra.io.util.CompressedChunkReader$Standard.doReadChunk(CompressedChunkReader.java:172)
    at org.apache.cassandra.io.util.CompressedChunkReader$Standard.readChunk(CompressedChunkReader.java:136)
    ...
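The innermost "Caused by" entry is what distinguishes this issue from genuine on-disk corruption, which raises the same CorruptSSTableException. As a rough aid, the following sketch checks a set of log lines for that signature. The class and method names are hypothetical helpers written for this article, not part of DSE; only the two matched strings come from the stack trace above.

```java
import java.util.List;

// Hypothetical helper, not part of DSE: looks for the signature of this issue in log lines.
final class SlabRecycleSignature {
    // Message text copied from the AssertionError in the example stack trace.
    static final String ASSERTION_MESSAGE =
        "Slab should have been unreferenced and all buffers returned before recycling";
    // Stack frame that throws the assertion in the example stack trace.
    static final String RECYCLE_FRAME = "MemorySlabWithBumpPtr.recycle";

    // Returns true only when both the assertion message and the recycle() frame appear.
    static boolean matches(List<String> logLines) {
        boolean sawMessage = false;
        boolean sawFrame = false;
        for (String line : logLines) {
            if (line.contains(ASSERTION_MESSAGE)) {
                sawMessage = true;
            }
            if (line.contains(RECYCLE_FRAME)) {
                sawFrame = true;
            }
        }
        return sawMessage && sawFrame;
    }
}
```

One way to use it is to read a node's log with java.nio.file.Files.readAllLines and pass the result to SlabRecycleSignature.matches. A match suggests the race condition described in this article, although it does not by itself rule out real SSTable corruption.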
Cause
A memory optimization for read operations implemented in DSE 6.7.3 (DB-3124) inadvertently allowed thread-per-core (TPC) threads to share memory buffer slabs in a temporary buffer pool. Under a rare race condition, a memory slab that is still referenced by a TPC read thread is recycled back into the pool for use by another thread.
When a node hits this condition, the MemorySlabWithBumpPtr.recycle() method throws an assertion error because it cannot recycle a slab back into the pool while a TPC thread is still using it.
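The failing check can be pictured with a small, simplified model. The sketch below is not DSE's implementation: RefCountedSlab and SlabRaceSketch are hypothetical classes that only assume a slab tracks how many buffers are still handed out and refuses to be recycled while that count is non-zero. It replays the problematic ordering deterministically rather than with real threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for a pooled memory slab; not DSE's code.
final class RefCountedSlab {
    // Number of buffers currently handed out from this slab.
    private final AtomicInteger references = new AtomicInteger();

    // A reader takes a buffer from the slab, which adds a reference.
    void acquire() {
        references.incrementAndGet();
    }

    // The reader returns its buffer, which drops the reference.
    void release() {
        references.decrementAndGet();
    }

    // Recycling is only valid once every buffer has been returned.
    void recycle() {
        if (references.get() != 0) {
            throw new AssertionError(
                "Slab should have been unreferenced and all buffers returned before recycling");
        }
    }
}

final class SlabRaceSketch {
    // Reader A still holds a buffer when another allocation path recycles the shared slab.
    static boolean recycleWhileReferencedFails() {
        RefCountedSlab shared = new RefCountedSlab();
        shared.acquire();              // reader A takes a buffer and has not returned it
        try {
            shared.recycle();          // a second thread recycles the slab too early
            return false;              // not reached in this model
        } catch (AssertionError expected) {
            return true;               // the same kind of failure as in the stack trace
        }
    }

    // The safe ordering: the buffer is returned before the slab is recycled.
    static boolean recycleAfterReleaseSucceeds() {
        RefCountedSlab slab = new RefCountedSlab();
        slab.acquire();
        slab.release();
        slab.recycle();                // no error because nothing references the slab
        return true;
    }
}
```

In this model the error appears only when recycle() runs between a reader's acquire() and release(). That narrow window is consistent with the issue being rare, and with a restart clearing it, because a restart discards all slabs and their outstanding references.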
Workaround
OPTION 1 - Restart DSE
Restart DSE on the affected node to reset all memory buffers and start from a clean slate.
OPTION 2 - Temporarily downgrade
If the issue persists after several restarts, temporarily downgrade to DSE 6.7.2 by reinstalling the binaries. This issue affects only DSE 6.7.3, so switching to DSE 6.7.2 stops the errors. This binary change does not require an upgrade operation on the SSTables.
Solution
A fix is included in DSE 6.7.4 (DB-3172). Upgrade to the latest version of DataStax Enterprise to get the latest fixes and improvements.