Full ERROR Message Example:
This error message is quite generic and can be seen in different scenarios:
Scenario 1 - Assertion Error:
ERROR [SharedPool-Worker-1] 2018-11-12 11:16:15,326 Message.java:538 - Unexpected exception during request; channel = [id: 0x4368f2bd, /10.1.29.135:44088 => /10.1.12.51:9042]
java.lang.AssertionError: null
Scenario 2 - Marshal error:
ERROR [Native-Transport-Requests-3] 2020-01-13 00:19:15,918 ErrorMessage.java:387 - Unexpected exception during request
org.apache.cassandra.serializers.MarshalException: Invalid remaining data after end of UDT value
Scenario 3 - INFO message due to network issue:
INFO [epollEventLoopGroup-6-36] 2019-05-01 10:13:05,482 Message.java:627 - Unexpected exception during request; channel = [id: 0x55348049, L:/10.133.170.110:9042 ! R:/10.133.164.56:58464]
java.nio.channels.ClosedChannelException: null
What does this ERROR message mean?
The underlying problem depends on the scenario in which this ERROR or INFO message appears. In general, the message means that a query did not complete successfully because of a problem at runtime.
Why does this ERROR occur?
If the message refers to a communication channel, it includes the IP addresses and ports of the calling computer and of the server computer, like this:
channel = [id: 0x55348049, L:/10.133.170.110:9042 ! R:/10.133.164.56:58464]
where R and L indicate the Remote and Local addresses, respectively. In this case, there could be a network issue between the remote and local computers. For example, the port on the local computer may not be open, or a firewall between the computers may be blocking the traffic.
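When many such messages need to be reviewed, it can help to pull the two endpoints out of the channel text programmatically. The following is a minimal Python sketch, not part of any Cassandra tooling; the function name parse_channel is made up for illustration, and the pattern assumes IPv4 addresses in the L:/R: form shown above.

```python
import re

# Matches the labelled form shown above: L:/<ip>:<port> <separator> R:/<ip>:<port>
# Assumption: IPv4 addresses only; other channel formats are not handled.
CHANNEL_RE = re.compile(
    r"L:/(?P<local_ip>[\d.]+):(?P<local_port>\d+)"
    r"\s+\S+\s+"
    r"R:/(?P<remote_ip>[\d.]+):(?P<remote_port>\d+)"
)


def parse_channel(line):
    """Return the local and remote (ip, port) pairs, or None if no channel is found."""
    match = CHANNEL_RE.search(line)
    if match is None:
        return None
    return {
        "local": (match.group("local_ip"), int(match.group("local_port"))),
        "remote": (match.group("remote_ip"), int(match.group("remote_port"))),
    }


sample = "channel = [id: 0x55348049, L:/10.133.170.110:9042 ! R:/10.133.164.56:58464]"
print(parse_channel(sample))
# {'local': ('10.133.170.110', 9042), 'remote': ('10.133.164.56', 58464)}
```

The remote address identifies the client whose network path to the node is worth checking first.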
In other instances, the communication channel is not mentioned. In those cases, something other than the network connection caused the query to fail.
In the case of the marshal error (Scenario 2), the columns or data types that the query uses do not match the columns or data types of the Cassandra table.
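Because the same "Unexpected exception during request" text covers all three scenarios, the exception class on the line that follows it is the quickest way to tell them apart. A rough Python sketch of that triage is shown here; the mapping contains only the three exception classes from the examples above, and the function name classify is hypothetical.

```python
# Exception classes taken from the three example scenarios; anything else is
# reported as unclassified rather than guessed at.
SCENARIOS = {
    "java.lang.AssertionError": "Scenario 1 - assertion error",
    "org.apache.cassandra.serializers.MarshalException": "Scenario 2 - marshal error",
    "java.nio.channels.ClosedChannelException": "Scenario 3 - closed channel",
}


def classify(exception_line):
    """Map the exception line that follows the log message to a scenario."""
    # The exception class is the text before the first colon.
    class_name = exception_line.split(":", 1)[0].strip()
    return SCENARIOS.get(class_name, "unclassified")


print(classify("java.nio.channels.ClosedChannelException: null"))
# Scenario 3 - closed channel
```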
There are at least two Cassandra JIRA tickets that refer to this error:
CASSANDRA-12737
and
CASSANDRA-15263
A useful article where this error is also discussed is:
https://support.datastax.com/hc/en-us/articles/360004550998-FAQ-High-blocked-NTR-count-during-increased-workload-on-Cassandra-node
How do you fix this ERROR?
There is no single procedure to resolve these types of errors.
If the error is observed in scenarios 1 and 3 above, troubleshoot the network link between the calling computer and the Cassandra node. Useful checks include verifying that the node is listening on the expected port, running a traceroute between the two IP addresses, and confirming that the required ports are allowed through any firewall.
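A basic reachability test can be scripted from the client host. The sketch below uses only the Python standard library and simply attempts a TCP connection; the function name can_connect and the 3-second timeout are arbitrary choices for illustration. A successful connection shows only that the port is reachable, not that the node is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling can_connect("10.133.170.110", 9042) from the client machine would test the node address and port shown in the Scenario 3 message. A False result points toward a closed port, a firewall rule, or a routing problem.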
If the error is observed in scenario 2, check the actual query and confirm that its list of columns and their data types are correct and match the table schema.
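The comparison itself can be as simple as checking each column the query writes against the type recorded in the schema. The sketch below is illustrative only: the column names, the type names address_v1 and address_v2, and the function find_mismatches are all hypothetical, and the two dictionaries would have to be filled in by hand from the table definition and from the application code.

```python
def find_mismatches(table_columns, query_columns):
    """Compare the column types a query sends against the table schema.

    Both arguments map column name -> CQL type name (as plain strings).
    """
    problems = []
    for name, sent_type in query_columns.items():
        expected = table_columns.get(name)
        if expected is None:
            problems.append(f"{name}: not in table schema")
        elif expected != sent_type:
            problems.append(f"{name}: table has {expected}, query sends {sent_type}")
    return problems


# Hypothetical example: the application still uses an older UDT definition.
table_columns = {"id": "uuid", "name": "text", "address": "frozen<address_v2>"}
query_columns = {"id": "uuid", "name": "text", "address": "frozen<address_v1>"}

for problem in find_mismatches(table_columns, query_columns):
    print(problem)
# address: table has frozen<address_v2>, query sends frozen<address_v1>
```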