Full ERROR Message Example
ERROR [NonPeriodicTasks:1] 2019-07-15 09:20:28,823 LogTransaction.java:272 - Transaction log [mc_txn_compaction_c9ff5650-a4b9-11e9-aac5-99b8bc39009e.log in /var/lib/cassandra/data3/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e] indicates txn was not completed, trying to abort it now
ERROR [NonPeriodicTasks:1] 2019-07-15 09:20:28,823 LogTransaction.java:275 - Failed to abort transaction log [mc_txn_compaction_c9ff5650-a4b9-11e9-aac5-99b8bc39009e.log in /var/lib/cassandra/data3/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e]
java.lang.RuntimeException: java.nio.file.FileSystemException: /var/lib/cassandra/data3/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/mc_txn_compaction_c9ff5650-a4b9-11e9-aac5-99b8bc39009e.log: Input/output error
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:590)
at org.apache.cassandra.io.util.FileUtils.appendAndSync(FileUtils.java:571)
at org.apache.cassandra.db.lifecycle.LogReplica.append(LogReplica.java:90)
at org.apache.cassandra.db.lifecycle.LogReplicaSet.append(LogReplicaSet.java:216)
at org.apache.cassandra.db.lifecycle.LogFile.addRecord(LogFile.java:348)
at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:261)
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:116)
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:106)
at org.apache.cassandra.utils.Throwables.perform(Throwables.java:101)
at org.apache.cassandra.db.lifecycle.LogTransaction$TransactionTidier.run(LogTransaction.java:273)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.FileSystemException: /var/lib/cassandra/data3/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/mc_txn_compaction_c9ff5650-a4b9-11e9-aac5-99b8bc39009e.log: Input/output error
at sun.nio.fs.UnixException.translateToIOException(Unknown Source)
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(Unknown Source)
at java.nio.file.spi.FileSystemProvider.newOutputStream(Unknown Source)
at java.nio.file.Files.newOutputStream(Unknown Source)
at java.nio.file.Files.write(Unknown Source)
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:583)
... 18 common frames omitted
What does this ERROR message mean?
In this context, the transaction log is a file that records the progress of low-level file transactions in the sstable write path, such as compactions, flushes, and streaming. This allows sstables to be swapped in or out of the write path atomically. If a transaction is incomplete, Cassandra attempts to abort it and roll back the change. This error occurs when that abort itself fails.
Why does this ERROR occur?
This error is usually a symptom of some other underlying I/O error. The stack trace normally includes a "Caused by" exception that gives the underlying reason. Here are some common "Caused by" reasons that have been observed by DataStax support in the past.
The following exception indicates an underlying low-level I/O error such as a failing disk.
Caused by: java.nio.file.FileSystemException: /var/lib/cassandra/data3/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/mc_txn_compaction_c9ff5650-a4b9-11e9-aac5-99b8bc39009e.log: Input/output error
at sun.nio.fs.UnixException.translateToIOException(Unknown Source)
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(Unknown Source)
at java.nio.file.spi.FileSystemProvider.newOutputStream(Unknown Source)
at java.nio.file.Files.newOutputStream(Unknown Source)
at java.nio.file.Files.write(Unknown Source)
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:583)
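The exception below indicates that a limit on the number of open files has been hit. In most cases the nofile ulimit of the Cassandra process is the one to review. On Linux, though, the exact wording "Too many open files in system" corresponds to the system-wide limit (fs.file-max), while exhausting the per-process nofile limit normally reports just "Too many open files".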
Caused by: java.nio.file.FileSystemException: /app/cassandra/dse-6.0.4/dse-data/data/dse_perf/write_latency_histograms_ks-9b2c3f: Too many open files in system
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at java.nio.file.Files.write(Files.java:3351)
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:585)
... 22 common frames omitted
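To see which open-file limits actually apply on a Linux node, the following is a minimal sketch rather than a supported tool. It assumes the standard /proc/<pid>/limits and /proc/sys/fs/file-nr files are available. The function names and the report format are illustrative only, and the Cassandra process ID must be supplied by the caller.

```python
from pathlib import Path


def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are returned as strings because either one may be 'unlimited'.
    Returns None if the row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns after the row name are: soft limit, hard limit, units.
            soft, hard = line[len("Max open files"):].split()[:2]
            return soft, hard
    return None


def file_table_usage(file_nr_text):
    """Return (allocated, maximum) from the text of /proc/sys/fs/file-nr.

    The file holds three numbers: allocated handles, allocated but unused
    handles, and the system-wide maximum.
    """
    allocated, _unused, maximum = (int(x) for x in file_nr_text.split())
    return allocated, maximum


def report(pid):
    """Print per-process and system-wide open-file limits for a given pid."""
    limits = max_open_files(Path(f"/proc/{pid}/limits").read_text())
    allocated, maximum = file_table_usage(Path("/proc/sys/fs/file-nr").read_text())
    print(f"pid {pid} nofile (soft, hard): {limits}")
    print(f"system file handles: {allocated} allocated of {maximum} maximum")
```

Calling report() with the process ID of the running Cassandra JVM prints the limits that process is actually running under, which can differ from the limits of your login shell.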
The next exception indicates that a permission error has occurred on the file, and the Cassandra process is not able to read or write it.
Caused by: java.nio.file.AccessDeniedException: /graphdb_data1/dse_data/cassandra/data/system/repairs-a3d277d1cfaf36f5a2a738d5eea9ad6a/ac_txn_flush_5df101b0-c615-11ea-8dd2-09f202282ffd.log
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at java.nio.file.Files.write(Files.java:3351)
at org.apache.cassandra.io.util.FileUtils.write(FileUtils.java:637)
... 23 common frames omitted
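The "Caused by" patterns shown above can be told apart mechanically when scanning a large log. The sketch below is illustrative only. It assumes each "Caused by:" entry sits on its own line, as it normally does in a Cassandra log file. The marker strings are taken from the examples in this article, while the diagnosis labels and the function name are made up for the example.

```python
# Marker substrings copied from the "Caused by" examples in this article.
# The diagnosis strings are this sketch's own shorthand, not Cassandra output.
KNOWN_CAUSES = [
    ("Too many open files", "open file limit reached"),
    ("Input/output error", "low-level I/O error (check the disk)"),
    ("AccessDeniedException", "permission problem on the file"),
]


def classify_caused_by(log_text):
    """Return one (diagnosis, line) pair for every 'Caused by:' line found."""
    findings = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("Caused by:"):
            continue
        diagnosis = "unrecognized cause"
        for marker, meaning in KNOWN_CAUSES:
            if marker in line:
                diagnosis = meaning
                break
        findings.append((diagnosis, line))
    return findings
```

Any line reported as "unrecognized cause" falls outside the cases covered here and needs to be read in full.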
How do you fix this ERROR?
The solution depends on the underlying cause indicated by the "Caused by" exception:
- For an "Input/output error", check the syslog and dmesg to see if the Linux kernel is reporting any underlying problems with your disk. You may also wish to use a SMART monitoring utility to check the health of your disk drives.
- For a "Too many open files in system" error, check the ulimits applied to the Cassandra process and make sure they are sufficient for the number of data files you have. Because this wording is associated with the system-wide limit on Linux, also check the fs.file-max kernel setting. Apply the recommended limits from the DSE documentation: https://docs.datastax.com/en/dse/6.7/dse-dev/datastax_enterprise/config/configRecommendedSettings.html#Setuserresourcelimits.
- For an AccessDeniedException, make sure that the user running the Cassandra process has read and write permission on the data directory and all the files within it. This error commonly occurs if you accidentally started Cassandra as root and later restarted it as a less privileged user. Cassandra will have created files owned by root, which it can no longer read or write as the other user.
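For the AccessDeniedException case, a quick way to find leftover files owned by the wrong user is to walk the data directory and compare owners. This is a minimal sketch, not a DataStax utility. The function name is hypothetical, and the data directory and expected user ID are supplied by the caller.

```python
import os


def paths_not_owned_by(data_dir, expected_uid):
    """Return a sorted list of directories and files under data_dir
    (including data_dir itself) whose owner is not expected_uid."""
    mismatched = []
    for root, _dirs, files in os.walk(data_dir):
        # Every directory, including data_dir, appears once as 'root'.
        candidates = [root] + [os.path.join(root, name) for name in files]
        for path in candidates:
            # lstat inspects the entry itself rather than following symlinks.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

As a usage example, if the service runs under an account named cassandra (an assumption, so substitute your own), pwd.getpwnam("cassandra").pw_uid gives the expected user ID. Any paths returned are candidates for an ownership fix before restarting the node. Note that os.walk silently skips directories it cannot read, so run the check as a user with access to the whole tree.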