Full ERROR Message Example
ERROR 08:00:14 Batch of prepared statements for stresscql.batch_too_large is of size 58024, exceeding specified threshold of 51200 by 6824. (see batch_size_fail_threshold_in_kb)
What does this error mean?
This error means that a CQL BATCH statement is larger than the batch_size_fail_threshold_in_kb setting in the cassandra.yaml file allows.
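The relevant setting lives in cassandra.yaml alongside a related warn threshold. A minimal fragment, showing the default values:

```yaml
# cassandra.yaml (values shown are the defaults)
batch_size_warn_threshold_in_kb: 5    # log a WARN when a batch exceeds 5 KB
batch_size_fail_threshold_in_kb: 50   # reject the batch entirely above 50 KB
```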
Why does this error occur?
As the error suggests, the total amount of data placed into a single BATCH statement is larger than batch_size_fail_threshold_in_kb will allow. The default value for this setting is 50 KB in both Apache Cassandra and DataStax Enterprise. When a batch statement is executed in CQL, the BatchStatement.verifyBatchSize method checks the size of the data for all mutations in the batch. If the size is greater than the failThreshold value, the ERROR message is logged and an InvalidRequestException ("Batch too large") is thrown.
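The check can be sketched as follows. This is a simplified Python illustration of the logic in BatchStatement.verifyBatchSize, not Cassandra's actual Java code; the function and variable names are illustrative.

```python
FAIL_THRESHOLD_KB = 50  # batch_size_fail_threshold_in_kb default

def verify_batch_size(mutation_sizes, fail_threshold_kb=FAIL_THRESHOLD_KB):
    """Raise if the combined size of all mutations exceeds the threshold.

    mutation_sizes: sizes in bytes of each mutation in the batch.
    """
    total = sum(mutation_sizes)
    threshold = fail_threshold_kb * 1024
    if total > threshold:
        # Mirrors the wording of the ERROR message shown above.
        raise ValueError(
            f"Batch of prepared statements is of size {total}, exceeding "
            f"specified threshold of {threshold} by {total - threshold}."
        )
    return total
```

With the numbers from the example error, a single 58024-byte batch exceeds the 51200-byte (50 KB) threshold by 6824 bytes and is rejected.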
How do you fix this error?
You could increase the batch_size_fail_threshold_in_kb value to allow for larger batches. However, this is generally not recommended: larger batches place a significant load on the coordinator node for the request, which can lead to node instability. A better option is to break the request down into smaller batches. Whichever route you take, proper testing should always occur before implementing the change on a production cluster. If you need assistance with determining batch sizing and batch configuration settings, please reach out to the services team at DataStax.
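The "break it into smaller batches" approach can be sketched on the client side. The helper below is an assumption about how one might chunk work, not part of any DataStax driver API; it takes (statement, size-in-bytes) pairs and groups them so each batch stays under the fail threshold.

```python
def chunk_statements(statements, max_batch_bytes=50 * 1024):
    """Group (statement, size_in_bytes) pairs into batches whose total
    payload stays at or under max_batch_bytes (default: 50 KB)."""
    batches, current, current_size = [], [], 0
    for stmt, size in statements:
        # Start a new batch if adding this statement would cross the limit.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(stmt)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Note that a single statement larger than the threshold still lands in its own batch, since it cannot be split further; such statements need to be restructured (or the threshold raised) rather than chunked.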