What is Garbage Collection?
Garbage collection is the process by which Java removes data that is no longer needed from memory. A garbage collection pause will usually occur when a region of memory is full and the JVM needs to make space to continue. During a pause (also known as a stop-the-world event) all operations are suspended whilst the memory is freed. As this includes networking, the node will often appear to be down to other nodes in the cluster. Select and Insert statements will also wait, which usually affects read and write latencies.
The following are the two most common log messages that indicate this problem is occurring:
INFO [ScheduledTasks:1] 2013-03-07 18:44:46,795 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1835 ms for 3 collections, 2606015656 used; max is 10611589120
INFO [ScheduledTasks:1] 2013-03-07 19:45:08,029 GCInspector.java (line 122) GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864
Any pause of more than a second, or multiple pauses within a second that add up to a large fraction of that second, is a cause for concern and should be investigated.
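As a rough illustration of how such messages could be picked out of a log, the sketch below parses lines in the GCInspector format shown above and keeps those whose reported time exceeds one second. The function names and the 1000 ms default are illustrative choices, not part of Cassandra, and the pattern is assumed from the two sample messages only; other versions may log a different format.

```python
import re

# Pattern assumed from the two sample GCInspector messages above.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return the GC fields from one log line, or None if the line does not match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    return {
        "collector": match.group("collector"),
        "total_ms": int(match.group("ms")),          # time reported for all collections
        "collections": int(match.group("count")),
        "used_bytes": int(match.group("used")),
        "max_bytes": int(match.group("max")),
    }


def long_pauses(lines, threshold_ms=1000):
    """Return parsed entries whose reported GC time exceeds threshold_ms."""
    flagged = []
    for line in lines:
        entry = parse_gc_line(line)
        if entry is not None and entry["total_ms"] > threshold_ms:
            flagged.append(entry)
    return flagged
```

Note that the reported time covers all collections in the message, so a line such as "9866 ms for 8 collections" may represent several shorter pauses rather than one long one.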
At heart, this problem is caused by data being stored in memory faster than it can be removed again. Over time we've noticed that there are some common scenarios where this occurs. If the problem has appeared recently, a good starting point is to check whether there have been any recent changes to the applications.
Things to look out for:
- Excessive tombstone activity, often caused by heavy delete workloads.
- Large row updates or large batch updates. (Essentially you want to keep the size of each individual write below 1 MB at the most.)
- Large selects, either in total row size retrieved or unbounded selects (the latter is mitigated by paging in 4.0+).
- Extremely wide rows, which can cause problems in repairs, selects, caching and probably elsewhere.
On the server side, factors include:
- Missing or unusual JVM parameters. Compare the parameters in use with the default settings shipped with the latest release.
- JNA not found.
- Swap enabled.
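
One of the checks above, whether swap is enabled, can be sketched as follows. This assumes a Linux host where /proc/meminfo reports a `SwapTotal:` line in kB; the function names are hypothetical and the check only reports whether swap is configured, not whether it is in use.

```python
import os


def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Expected shape: "SwapTotal:  2097148 kB"
            return int(line.split()[1])
    return None


def swap_enabled(meminfo_text):
    """True when the host reports a non-zero amount of configured swap."""
    total = swap_total_kb(meminfo_text)
    return total is not None and total > 0


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        print("swap enabled:", swap_enabled(handle.read()))
```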