This article discusses latency issues on servers which have power-saving features.
Servers can show unusually high CPU utilisation or high latencies for Cassandra operations despite a low (or lower) level of application traffic to the cluster.
In some situations, the application traffic/load is well below the peak or prime time but the servers are still heavily loaded with high CPU utilisation according to utilities such as
To reduce the footprint of data centres, servers built in the last few years feature a power-saving mode that reduces not only a server's power consumption but also its overall heat output, which has a significant impact on cooling costs.
On Linux systems, this feature is called CPU frequency scaling (or CPU speed scaling) and allows the clock speed to be dynamically adjusted on running servers. This, for example, enables a server to run at lower clock speeds when the demand or load is low.
The CPUfreq governor manages the scaling of frequencies based on defined rules. On most Linux systems, the default ondemand governor switches the clock frequency to the maximum when demand is high, then switches to the lowest frequency when the system is idle.
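To see which governor each CPU is currently using, the per-CPU scaling_governor files under sysfs can be read directly. The following is a minimal sketch, not part of the original article: the function name report_governors and its optional directory argument are hypothetical, and the default path assumes the standard cpufreq sysfs layout.

```shell
# Count how many CPUs use each cpufreq governor.
# report_governors is a hypothetical helper name; the optional first
# argument overrides the sysfs CPU directory (useful for dry runs).
report_governors() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for gov_file in "$cpu_root"/cpu*/cpufreq/scaling_governor; do
        # Skip entries without cpufreq support (or an unmatched glob).
        [ -f "$gov_file" ] || continue
        cat "$gov_file"
    done | sort | uniq -c
}
```

Calling report_governors with no arguments reads the live system. Each output line is a count followed by a governor name, so a node that is still on the default would show every CPU under ondemand.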
However, the power saved by the ondemand governor is offset by the latency cost of switching clock frequencies. In certain conditions, the governor misjudges the load requirements and CPUs get pinned at lower frequencies, making it appear that processors are 100% utilised despite traffic on the server being low.
This behaviour has a detrimental effect on servers running DSE and Cassandra since the throughput gets capped at a lower rate than what they were designed to serve.
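One way to check whether CPUs are stuck at a low clock speed is to compare each CPU's current frequency with its hardware maximum. The sketch below assumes the standard cpufreq sysfs files scaling_cur_freq and cpuinfo_max_freq (both in kHz). The function name below_max_freq, the 90% default threshold and the directory argument are illustrative choices, not values from the original article.

```shell
# List CPUs whose current frequency is below a percentage of the
# hardware maximum. below_max_freq is a hypothetical helper name.
#   $1  threshold in percent (assumed default: 90)
#   $2  sysfs CPU directory (defaults to the standard location)
below_max_freq() {
    threshold="${1:-90}"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for dir in "$cpu_root"/cpu*/cpufreq; do
        # Skip CPUs that do not expose both frequency files.
        [ -r "$dir/scaling_cur_freq" ] && [ -r "$dir/cpuinfo_max_freq" ] || continue
        cur=$(cat "$dir/scaling_cur_freq")
        max=$(cat "$dir/cpuinfo_max_freq")
        # Integer comparison of cur/max against threshold/100.
        if [ $((cur * 100)) -lt $((max * threshold)) ]; then
            cpu=$(basename "$(dirname "$dir")")
            echo "$cpu: $cur kHz of $max kHz"
        fi
    done
}
```

A single reading is only a snapshot, and low frequencies on an idle node are expected. The output is most telling when the check is run while the node reports high CPU utilisation: CPUs listed at that moment are busy but not running at full speed.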
Do not use governors that lower the CPU frequency. Reconfigure all CPUs to use the performance governor, which locks the frequency at the maximum possible. This governor does not switch frequencies, which effectively means there are no power savings, but the servers always run at maximum throughput.
On most systems, set the governor as follows:
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
    [ -f "$CPUFREQ" ] || continue
    echo -n performance > "$CPUFREQ"
done
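After applying the change, it is worth confirming that every CPU actually reports the new governor. The following is a minimal sketch under the same sysfs layout assumption as the loop above. The function name verify_governor, its arguments and its exit codes are hypothetical conventions chosen for this example.

```shell
# Check that every CPU reports the expected governor.
# verify_governor is a hypothetical helper name.
#   $1  expected governor (default: performance)
#   $2  sysfs CPU directory (defaults to the standard location)
# Returns 0 if all match, 1 on any mismatch, 2 if no cpufreq files exist.
verify_governor() {
    expected="${1:-performance}"
    cpu_root="${2:-/sys/devices/system/cpu}"
    checked=0
    status=0
    for gov_file in "$cpu_root"/cpu*/cpufreq/scaling_governor; do
        [ -f "$gov_file" ] || continue
        checked=$((checked + 1))
        actual=$(cat "$gov_file")
        if [ "$actual" != "$expected" ]; then
            echo "$gov_file: $actual (expected $expected)" >&2
            status=1
        fi
    done
    if [ "$checked" -eq 0 ]; then
        echo "no cpufreq governor files found under $cpu_root" >&2
        return 2
    fi
    return "$status"
}
```

Running verify_governor with no arguments checks the live system for the performance governor. Because the result is reported through the exit status, the same function could also be called from a monitoring check or after a reboot to confirm that the setting was reapplied.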
WARNING - Values written to sysfs in this way are lost on reboot, so ensure that the change is made persistent. Consult your system administrator to implement this for your specific environment.
Blog post - Al Tobey's Cassandra tuning guide
RedHat doc - Power Management Guide | Using CPUfreq Governors