When importing data into a Cassandra database using Sqoop, some import jobs fail with a java.lang.StackOverflowError.
Below is a sample error from a Sqoop import job:
attempt_201507310947_0006_m_000001_0: ERROR 11:02:41 Error running child : java.lang.StackOverflowError
In some instances, the default thread stack size for the JVM (set with the JVM option -Xss) is insufficient when processing large data sets. Each nested call made while iterating through the data allocates a new stack frame, and when the thread's stack memory runs out, the JVM throws a StackOverflowError.
The default stack size may vary depending on the OS and architecture (32-bit or 64-bit) that the job is run on. The following link quotes some default sizes
In such cases, the stack size can be adjusted by editing the /etc/dse/hadoop/mapred-site.xml file. Increase the stack size gradually, re-running the import job after each change, until the error is resolved.
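As a rough sketch of what such a change might look like, the fragment below sets the JVM options for map and reduce child tasks. It assumes the Hadoop 1.x property mapred.child.java.opts is the one in effect on your cluster; the -Xmx256M and -Xss512k values are illustrative assumptions, not recommended settings.

```xml
<!-- Illustrative fragment for /etc/dse/hadoop/mapred-site.xml.
     Assumption: task JVM options are passed via mapred.child.java.opts. -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- Example values only: keep whatever options are already set
       (such as -Xmx) and add or raise -Xss in small steps. -->
  <value>-Xmx256M -Xss512k</value>
</property>
```

If the property already exists in the file, edit its value rather than adding a second entry, since the new value replaces the whole option string instead of being merged with it.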