This article explains the significance of the compression ratio statistic.
One of the nodetool tablestats statistics is the SSTable compression ratio. It indicates how well the table's data is compressed, expressed as the ratio of the compressed SSTable data size to the original (uncompressed) size.
compressionRatio = (double) compressed/uncompressed;
Values typically range from 0 to 1:
- A low value reflects a high rate of compression.
- A ratio close to 1 indicates the data is hardly compressed.
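The formula and range above can be sketched in Java as follows. This is a minimal illustration, not Cassandra's implementation: the class name, method names, and byte counts are hypothetical and chosen only to show the arithmetic.

```java
import java.util.Locale;

class CompressionRatioSketch {
    // Hypothetical helper: compressed size divided by uncompressed size.
    static double compressionRatio(long compressedBytes, long uncompressedBytes) {
        if (uncompressedBytes <= 0) {
            throw new IllegalArgumentException("uncompressedBytes must be positive");
        }
        // The cast forces floating-point division instead of integer division.
        return (double) compressedBytes / uncompressedBytes;
    }

    // Fraction of space saved by compression: a low ratio means a high saving.
    static double spaceSaved(double ratio) {
        return 1.0 - ratio;
    }

    public static void main(String[] args) {
        // Made-up sizes: 250 bytes on disk for 1000 bytes of raw data.
        double ratio = compressionRatio(250L, 1000L);
        // Prints: ratio=0.25 saved=75%
        System.out.printf(Locale.ROOT, "ratio=%.2f saved=%.0f%%%n",
                ratio, spaceSaved(ratio) * 100);
    }
}
```

Without the `(double)` cast, dividing two `long` values would truncate the result to 0 whenever the compressed size is smaller than the uncompressed size.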
For tables with a high compression ratio (a value close to 1), the cost of compression may outweigh the benefits. Note that what is considered high is a subjective matter and depends on the cluster use case, performance characteristics, and access patterns. DataStax recommends that you perform tests with and without compression enabled to determine the configuration that best suits the application requirements.
In the output below, the compressed data is 12.9% of the original size, which indicates very good compression:
SSTable Compression Ratio: 0.1287426576442238
In contrast, the output below shows a ratio of 89.6%, which indicates the compressed data is still nearly as large as the original:
SSTable Compression Ratio: 0.8964684393508305
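As a rough sketch of how the two sample lines translate into the percentages quoted above, the following Java snippet extracts the ratio from a line of tablestats output and formats it as a percentage. The class and method names are hypothetical, and the code assumes the line has exactly the `SSTable Compression Ratio: <number>` form shown in the samples.

```java
import java.util.Locale;
import java.util.OptionalDouble;

class TablestatsRatioParser {
    // Label as it appears in the sample output above.
    private static final String LABEL = "SSTable Compression Ratio:";

    // Returns the ratio if the line is the compression ratio line, otherwise empty.
    static OptionalDouble parseRatio(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith(LABEL)) {
            return OptionalDouble.empty();
        }
        String value = trimmed.substring(LABEL.length()).trim();
        return OptionalDouble.of(Double.parseDouble(value));
    }

    // Formats a ratio as a percentage with one decimal place.
    static String asPercent(double ratio) {
        return String.format(Locale.ROOT, "%.1f%%", ratio * 100);
    }

    public static void main(String[] args) {
        String[] lines = {
            "SSTable Compression Ratio: 0.1287426576442238",
            "SSTable Compression Ratio: 0.8964684393508305"
        };
        for (String line : lines) {
            // Prints 12.9% for the first line and 89.6% for the second.
            parseRatio(line).ifPresent(r -> System.out.println(asPercent(r)));
        }
    }
}
```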