Summary
This article discusses errors in the logs of DataStax Enterprise nodes when processing messages from clients/drivers.
Applies to
- DataStax Enterprise 5.0.14 or earlier
- DataStax Enterprise 5.1.10 or earlier
- DataStax Enterprise 6.0.2 or earlier
Symptoms
Server-side logs report Netty messaging errors, IllegalStateException and TooLongFrameException, while processing messages from clients. The following example stack traces are from a node running DSE 5.0.11:
ERROR [RemoteMessageClient worker-8-19] 2018-08-16 12:34:56,108 ClientServerConnection.java:423 - Adjusted frame length exceeds 268435456: 290752842 - discarded
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 268435456: 290752842 - discarded
    at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:499) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:477) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:403) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at com.datastax.bdp.node.transport.MessageCodec$FrameDecoder.decode(MessageCodec.java:171) ~[dse-core-5.0.11.jar:5.0.11]
    at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:343) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:369) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    ...

ERROR [RemoteMessageClient worker-8-19] 2018-08-16 12:34:56,111 ClientServerConnection.java:423 - java.lang.IllegalStateException: Unexpected protocol magic number!
io.netty.handler.codec.DecoderException: java.lang.IllegalStateException: Unexpected protocol magic number!
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:400) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:307) [netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:293) [netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-all-4.0.34.Final.jar:4.0.34.Final]
    ...
Caused by: java.lang.IllegalStateException: Unexpected protocol magic number!
    at com.datastax.bdp.node.transport.MessageCodec$FrameDecoder.decode(MessageCodec.java:169) ~[dse-core-5.0.11.jar:5.0.11]
    at io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:343) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:369) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    ... 15 common frames omitted
Cause
DataStax Enterprise uses Netty-based transport for remote client connections and internode messaging. Each frame is encoded with a "magic number" at the beginning, followed by the length of the message, and finally the message itself. The magic number is an arbitrary value defined by the protocol and serves as a quick sanity check on the sender. If the magic number at the start of the frame matches the value defined by the protocol, the node accepts the message for processing. Otherwise, the frame is considered invalid or corrupt and is discarded.
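The frame layout described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the DSE implementation: the magic value 0xCAFEF00D, the 4-byte field widths, and the function names encode_frame and decode_frame are all assumptions made for the example.

```python
import struct

# Hypothetical values for illustration only. The real DSE magic number and
# header field widths are not stated in this article.
MAGIC = 0xCAFEF00D
HEADER = struct.Struct(">II")  # 4-byte magic, 4-byte body length, big-endian


def encode_frame(body: bytes) -> bytes:
    """Prefix the body with the magic number and the body length."""
    return HEADER.pack(MAGIC, len(body)) + body


def decode_frame(data: bytes) -> bytes:
    """Return the body, or raise if the frame does not start with MAGIC."""
    if len(data) < HEADER.size:
        raise ValueError("incomplete frame header")
    magic, length = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        # Mirrors the wording of the server-side error message.
        raise ValueError("Unexpected protocol magic number!")
    body = data[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("incomplete frame body")
    return body
```

A frame produced by encode_frame decodes cleanly, while changing even the first byte of the same frame makes decode_frame reject it, which is the behaviour a receiver relies on to detect a corrupt or misaligned stream.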
By default, the maximum frame size is 268435456 bytes (defined as native_transport_max_frame_size_in_mb: 256 in cassandra.yaml). Frames larger than this value are also discarded by the nodes. A frame is laid out as follows:
[magic number] [message length] [message body]
If the frame coming down the wire is larger than the maximum frame size, the body of the message "spills" over into the position where the node expects the magic number of the next frame. When the frame length is too long, the Netty decoder throws a TooLongFrameException and discards the frame.
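The arithmetic behind the limit, and the check that produces the "exceeds 268435456" message, can be reproduced as follows. This is a minimal sketch: check_frame_length and excess_bytes are hypothetical helper names, and the error text simply imitates the log line shown under Symptoms rather than calling into Netty.

```python
MAX_FRAME_SIZE_IN_MB = 256  # default quoted in this article
MAX_FRAME_BYTES = MAX_FRAME_SIZE_IN_MB * 1024 * 1024  # 268435456 bytes


def check_frame_length(length: int, limit: int = MAX_FRAME_BYTES) -> None:
    """Raise if the declared frame length is over the limit (hypothetical helper)."""
    if length > limit:
        raise ValueError(
            f"Adjusted frame length exceeds {limit}: {length} - discarded"
        )


def excess_bytes(length: int, limit: int = MAX_FRAME_BYTES) -> int:
    """Number of bytes by which a frame exceeds the limit, or 0 if it fits."""
    return max(0, length - limit)
```

Applied to the frame length from the example log, excess_bytes(290752842) shows that the frame was 22317386 bytes over the default limit.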
If the contents of the frame are corrupted, the decoder does not find the expected magic number at the beginning of the frame and throws an IllegalStateException. Possible causes of the corruption are network transmission errors and encryption/decryption errors (for example, an incorrect SSL configuration on either the client or the server side).
Solution
Review the amount of data being sent by the application via the driver. Redesign the data model so that the application does not write more than 256 MB of data to the cluster in a single frame.
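One possible way to keep writes under the limit is to split a large payload into smaller pieces on the application side before handing them to the driver. The sketch below shows only the splitting step. split_payload is a hypothetical helper and the chunk size is an arbitrary choice, and how the pieces are stored (for example, one piece per row) depends on the data model.

```python
from typing import Iterator

MAX_FRAME_BYTES = 256 * 1024 * 1024  # default limit quoted in this article


def split_payload(payload: bytes, chunk_bytes: int) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most chunk_bytes long."""
    if chunk_bytes <= 0 or chunk_bytes >= MAX_FRAME_BYTES:
        raise ValueError("chunk_bytes must be positive and below the frame limit")
    for offset in range(0, len(payload), chunk_bytes):
        yield payload[offset:offset + chunk_bytes]
```

Choosing a chunk size well below the limit leaves room for the rest of the message that travels in the same frame.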
In a future release of the Java driver, the maximum frame length will be configurable (JAVA-1293). The size of the frame the driver can send to the cluster will also be restricted (JAVA-1294).
On the server side, nodes will not be allowed to send large frames (internal improvement ID DSP-15664), and frame sizes will be restricted to a maximum value (CASSANDRA-12630).
Click the "Follow" button above to be notified when the improvements become available.