The DataStax agent on a node stops running unexpectedly after running out of memory.
The DataStax agent stops running, and agent.log shows an error similar to:
java.lang.OutOfMemoryError: GC overhead limit exceeded
From OpsCenter version 5.0 onwards, the DataStax agent requires more memory. The out-of-the-box memory settings allocated to the agent are not sufficient in some environments, causing the agent to run out of memory and stop running. This problem has also been seen where OpsCenter is collecting metrics for a large number of column families.
To work around the problem, allocate more memory to the DataStax agent by updating the datastax-agent-env.sh file. For example, the OpsCenter 5.1.2 agent ships with this entry in datastax-agent-env.sh:
JVM_OPTS="$JVM_OPTS -Xmx128M -Djclouds.mpu.parts.magnitude=100000 -Djclouds.mpu.parts.size=16777216"
Increase the maximum heap size (-Xmx), for example to 256M:
JVM_OPTS="$JVM_OPTS -Xmx256M -Djclouds.mpu.parts.magnitude=100000 -Djclouds.mpu.parts.size=16777216"
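As a minimal sketch, the heap increase can be scripted with sed. The file path and the datastax-agent service name are assumptions for a typical package install; adjust them for your environment (GNU sed is assumed for the in-place edit).

```shell
# Sketch only: path is an assumption; package installs commonly use
# /usr/share/datastax-agent/datastax-agent-env.sh
ENV_FILE="datastax-agent-env.sh"

# Sample file contents for illustration (matches the OpsCenter 5.1.2 default):
printf 'JVM_OPTS="$JVM_OPTS -Xmx128M -Djclouds.mpu.parts.magnitude=100000 -Djclouds.mpu.parts.size=16777216"\n' > "$ENV_FILE"

# Back up the file, then raise the maximum heap from 128M to 256M
cp "$ENV_FILE" "$ENV_FILE.bak"
sed -i 's/-Xmx128M/-Xmx256M/' "$ENV_FILE"

# Confirm the change
grep -- '-Xmx' "$ENV_FILE"

# Restart the agent so the new heap size takes effect (service name assumed):
# sudo service datastax-agent restart
```

Repeat on each node running an agent; the setting is per-node, not cluster-wide.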
Alternatively, because this issue can occur when OpsCenter is collecting metrics for a large number of column families, you can exclude keyspaces and column families using the "ignored_keyspaces" and "ignored_column_families" options respectively in <cluster_name>.conf to reduce memory consumption:
ignored_keyspaces = system, OpsCenter, <other keyspaces you can ignore>
ignored_column_families = <comma separated list of column families you can ignore>
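As a sketch, a <cluster_name>.conf excerpt with these options might look like the following. The [cassandra_metrics] section name, the file location (commonly /etc/opscenter/clusters/ for package installs), and the example keyspace and column family names are assumptions to verify against your OpsCenter version; ignored_column_families entries are typically given in keyspace.columnfamily form.

```ini
# Hypothetical excerpt of <cluster_name>.conf; names below are examples only
[cassandra_metrics]
ignored_keyspaces = system, OpsCenter
ignored_column_families = Keyspace1.Standard1, Keyspace1.Standard2
```

After editing the file, restart opscenterd so the change takes effect.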
The problem is being investigated in this internal OpsCenter JIRA:
OPSC-3426 Datastax-agent OOM - caused by backup/restore
For more information on this JIRA, contact DataStax Technical Support.