DataStax Help Center

OutOfMemoryError when repairing a large number of sstables

Summary

In some circumstances, nodes running incremental repair have been observed to run out of heap space.

 

Symptoms

During a manual incremental repair run with "nodetool repair -inc -par <keyspace>", G1GC has been observed to perform increasingly frequent garbage collections, with Old Generation pauses in the range of 10 seconds. These pauses become so frequent that 6 of them occur within a minute. Only a small fraction of the Old Generation is reclaimed each time, and this continues for many minutes until the node crashes with an OutOfMemoryError.

 

Cause

When an incremental repair runs, it keeps references to all the sstables it needs until the end of the repair session. With a large number of sstables, the number of references held in memory can grow until the heap space is exhausted. This can cause an OutOfMemoryError, leading to JVM instability.

The JVM heap dump will show the objects referenced by "org.apache.cassandra.service.ActiveRepairService".

Workaround

One way to alleviate the problem and reduce the risk of an OOM is to manually run "nodetool repair -inc -par" on one table at a time, which avoids the buildup of a large number of sstable references in memory.
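
The per-table workaround can be scripted. The following is a minimal sketch, not a supported tool: it assumes "nodetool" is on the PATH of the node being repaired, and the keyspace and table names ("my_keyspace", "users", "events") are hypothetical placeholders. It builds one "nodetool repair -inc -par <keyspace> <table>" command per table and runs them strictly one after another, so that only one table is under repair at a time. By default it only prints the commands (dry run).

```python
import subprocess
from typing import List, Sequence


def build_repair_commands(keyspace: str, tables: Sequence[str]) -> List[List[str]]:
    """Return one 'nodetool repair -inc -par <keyspace> <table>' command per table."""
    return [
        ["nodetool", "repair", "-inc", "-par", keyspace, table]
        for table in tables
    ]


def repair_tables_one_at_a_time(
    keyspace: str, tables: Sequence[str], dry_run: bool = True
) -> None:
    """Run (or, in dry-run mode, just print) the repairs sequentially."""
    for cmd in build_repair_commands(keyspace, tables):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so the loop stops at the first repair that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical keyspace and table names; replace with your own.
    repair_tables_one_at_a_time("my_keyspace", ["users", "events"])
```

Running the repairs sequentially rather than concurrently is the point of the workaround: each command finishes before the next begins. Pass dry_run=False only after checking the printed commands.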

Solution

Two Jira tickets are tracking the fix for this issue:
- CASSANDRA-11739
- DSP-9640 (internal Jira)

Tags

OutOfMemoryError heap repair incremental 
