Overview
This article answers frequently asked questions about the use of disk_access_mode in cassandra.yaml and why this setting is important to DSE.
Applies to
- DSE 5.1 and earlier
What is the disk_access_mode setting in cassandra.yaml?
What is the cassandra.yaml setting disk_access_mode, which settings can I use, and how can I determine which setting is appropriate for my environment?
In DSE 5.0 and earlier, compressed data (the default) was always effectively read using disk_access_mode: mmap_index_only, even though the option was set to mmap. This read behavior occurred because DSE decompressed the data it read from disk using temporary buffers.
With CASSANDRA-8464, this behavior changed to decompress data directly, which enables mmap to work correctly. The problem is that when DSE performs a lot of random reads (the hot dataset is larger than memory), it is more efficient to set disk_access_mode: mmap_index_only to prevent page faults.
DataStax recommends conducting normal tuning and running system activity reports. As part of the tuning process, be sure to monitor page faults and system performance on operator machines. For example, use a performance monitoring tool like sar (System Activity Report).
Switching to mmap_index_only has improved performance in these use cases:
- The Linux OOM killer terminates DSE due to rapid off-heap memory growth (mmap allocating too much too fast)
- High end percentile latencies (for example, 99% and Max)
- Sporadic read timeouts occur, usually in conjunction with latencies as above
In these cases, set disk_access_mode: mmap_index_only.
Note: These read errors are not observed in DSE 6.x and later. In those versions, standard is the default mode, so data is not mapped into memory.
What are some example use cases?
When checking page faults, you can use sar -B to observe higher page faults as well as page scans and steals (see the sar man page for the column definitions):
03:28:24 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
...
03:49:18 PM 1704.00 0.00 39492.00 107.00 703.00 0.00 0.00 0.00 0.00
03:49:19 PM 1828.00 0.00 21955.00 89.00 1005.00 0.00 0.00 0.00 0.00
03:49:20 PM 1944.00 16.00 43172.00 79.00 981.00 0.00 0.00 0.00 0.00
03:49:21 PM 1868.00 264.00 223040.00 76.00 184417.00 10670.00 57090.00 19921.00 29.40
03:49:22 PM 2500.00 8.00 145066.00 111.00 97732.00 0.00 240918.00 91635.00 38.04
03:49:23 PM 2376.00 12.00 70774.00 116.00 234907.00 0.00 341864.00 223022.00 65.24
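The manual reading of the sar -B sample above can be approximated with a short script. The following is a minimal sketch, not a DataStax tool: the function names `parse_sar_b` and `reclaim_intervals` are hypothetical, and the parser assumes the 12-hour timestamp layout (time, AM/PM, then nine numeric columns) seen in the sample. It flags the intervals in which the kernel was scanning or stealing pages, which is the point where the sample output changes character.

```python
from typing import Dict, List

# Column names copied from the header row of the sample `sar -B` output.
COLUMNS = ["pgpgin/s", "pgpgout/s", "fault/s", "majflt/s", "pgfree/s",
           "pgscank/s", "pgscand/s", "pgsteal/s", "%vmeff"]


def parse_sar_b(text: str) -> List[Dict[str, object]]:
    """Parse the data rows of `sar -B` output into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Assumed row layout: time, AM/PM marker, then one number per column.
        if len(parts) != len(COLUMNS) + 2:
            continue
        try:
            values = [float(p) for p in parts[2:]]
        except ValueError:
            continue  # header row or any other non-numeric line
        row: Dict[str, object] = dict(zip(COLUMNS, values))
        row["time"] = f"{parts[0]} {parts[1]}"
        rows.append(row)
    return rows


def reclaim_intervals(rows: List[Dict[str, object]]) -> List[str]:
    """Return the timestamps at which page scanning or stealing was non-zero."""
    return [r["time"] for r in rows
            if r["pgscank/s"] + r["pgscand/s"] + r["pgsteal/s"] > 0]


# Two rows taken from the sample output, plus its header.
SAMPLE = """\
03:28:24 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
03:49:20 PM 1944.00 16.00 43172.00 79.00 981.00 0.00 0.00 0.00 0.00
03:49:21 PM 1868.00 264.00 223040.00 76.00 184417.00 10670.00 57090.00 19921.00 29.40
"""

if __name__ == "__main__":
    print(reclaim_intervals(parse_sar_b(SAMPLE)))  # ['03:49:21 PM']
```

Any threshold for what counts as "too many" faults is environment-specific, so the sketch only separates intervals with reclaim activity from those without it.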
When checking pmap, a large amount of RSS attributed to “anon” mappings can likewise point to a large amount of off-heap mapped memory. Here is an example pmap output:
sudo pmap -x $(cat /run/dse/dse.pid) | awk 'NR==2;/anon/' | more
Address Kbytes RSS Dirty Mode Mapping
0000000000400000 4 4 0 r-x-- java
0000000000600000 4 4 4 rw--- java
0000000000bde000 19676 19548 19548 rw--- [ anon ]
00000004c0000000 12588416 12588416 12588416 rw--- [ anon ]
00000007c0560000 1043072 0 0 ----- [ anon ]
00000032c8c00000 128 116 0 r-x-- ld-2.12.so
00000032c8e20000 4 4 4 r---- ld-2.12.so
00000032c8e21000 4 4 4 rw--- ld-2.12.so
00000032c8e22000 4 4 4 rw--- [ anon ]
00000032c9000000 1576 680 0 r-x-- libc-2.12.so
00000032c918a000 2048 0 0 ----- libc-2.12.so
When there’s lots of mmap-ing going on, you may see a lot of [ anon ] in the pmap output. This should be verifiable against rapid RSS growth of the DSE process.
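To track that growth over time, the RSS column of the [ anon ] rows can be summed from captured pmap -x output. This is a minimal sketch under the assumption that the output uses the six-column layout shown in the example (Address, Kbytes, RSS, Dirty, Mode, Mapping); the function name `anon_rss_kb` is hypothetical.

```python
def anon_rss_kb(pmap_output: str) -> int:
    """Sum the RSS column (in KB) of all `[ anon ]` rows in `pmap -x` output."""
    total = 0
    for line in pmap_output.splitlines():
        parts = line.split()
        # Expect: Address Kbytes RSS Dirty Mode Mapping...
        # The header row is skipped because its RSS field is not numeric.
        if len(parts) < 6 or not parts[2].isdigit():
            continue
        if " ".join(parts[5:]) == "[ anon ]":
            total += int(parts[2])
    return total


# Rows copied from the example pmap output.
SAMPLE = """\
Address Kbytes RSS Dirty Mode Mapping
0000000000400000 4 4 0 r-x-- java
0000000000bde000 19676 19548 19548 rw--- [ anon ]
00000004c0000000 12588416 12588416 12588416 rw--- [ anon ]
00000007c0560000 1043072 0 0 ----- [ anon ]
00000032c9000000 1576 680 0 r-x-- libc-2.12.so
"""

if __name__ == "__main__":
    print(anon_rss_kb(SAMPLE))  # 12607964
```

Running this against pmap snapshots taken a few minutes apart gives a simple series to compare with the RSS growth of the DSE process.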
Which disk_access_mode am I using?
If disk_access_mode is not explicitly set in cassandra.yaml, then the default is auto. In your DSE output.log or system.log, look for the Node configuration line (bolded for emphasis in this example).
For example, we can see the node configuration line and subsequent message:
INFO [main] 2018-12-17 19:35:53,236 Config.java:509 - Node configuration:[aggregated_request_timeout_in_ms=120000;... **disk_access_mode=auto**;...
...
INFO [main] 2018-12-17 19:35:53,237 DatabaseDescriptor.java:350 - **DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap**
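The same check can be scripted against the text of a log file. The sketch below is an illustration only: the function name `find_access_mode` is hypothetical, and the regular expression matches just the message format shown in the example for auto. Nodes configured with other values may log a differently worded message, which this pattern would not match.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the example log message; an assumption, not a full
# description of every message DSE can emit for this setting.
PATTERN = re.compile(
    r"DiskAccessMode '(?P<configured>\w+)' determined to be (?P<data>\w+), "
    r"indexAccessMode is (?P<index>\w+)"
)


def find_access_mode(log_text: str) -> Optional[Tuple[str, str, str]]:
    """Return (configured, data_mode, index_mode) from the last matching line."""
    result = None
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            # Keep overwriting so the most recent startup in the log wins.
            result = (match.group("configured"),
                      match.group("data"),
                      match.group("index"))
    return result


# The example message, without the emphasis markers added for this article.
SAMPLE_LOG = (
    "INFO [main] 2018-12-17 19:35:53,237 DatabaseDescriptor.java:350 - "
    "DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap\n"
)

if __name__ == "__main__":
    print(find_access_mode(SAMPLE_LOG))  # ('auto', 'mmap', 'mmap')
```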
The following settings are valid for disk_access_mode:
- auto: disk access mode is determined automatically. On a 64-bit OS, mmap is used for index files and SSTable files. On a 32-bit OS, mmap is used only for the smaller index files.
- mmap: use mmap for SSTable files and index files
- mmap_index_only: use mmap only for the smaller index files, not the larger SSTable files
- standard: use standard I/O, not mmap, for SSTable files and index files
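The descriptions of the four settings can be restated as a small lookup, which may help when reasoning about what a given value means for SSTable files versus index files. This is a sketch of the rules as described in this article, not DSE source code; the function name `effective_modes` and its `is_64bit` parameter are hypothetical.

```python
from typing import Tuple


def effective_modes(setting: str, is_64bit: bool = True) -> Tuple[str, str]:
    """Return (sstable_access, index_access) for a disk_access_mode value."""
    if setting == "auto":
        # Per the description above: 64-bit maps both kinds of file,
        # 32-bit maps only the smaller index files.
        setting = "mmap" if is_64bit else "mmap_index_only"
    table = {
        "mmap": ("mmap", "mmap"),
        "mmap_index_only": ("standard", "mmap"),
        "standard": ("standard", "standard"),
    }
    if setting not in table:
        raise ValueError(f"unknown disk_access_mode: {setting!r}")
    return table[setting]


if __name__ == "__main__":
    print(effective_modes("auto"))             # ('mmap', 'mmap')
    print(effective_modes("mmap_index_only"))  # ('standard', 'mmap')
```

The result for auto on a 64-bit OS agrees with the example log message, which reports mmap for both the data and index access modes.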
See also
KB articles:
Increased memory utilisation on nodes after upgrading to DSE 5.0 or 5.1
Nodes become unresponsive during high app traffic periods when mmap is enabled on DSE 6