DataStax Help Center

Tuning memory for Hadoop tasks

The Hadoop task tracker runs in the same JVM as DSE. The individual tasks that process your data, however, run in child JVMs forked by the task tracker. The mapred.child.java.opts property controls how much memory is allocated to each mapper and reducer task. The default is 256M. If your jobs fail with out-of-memory exceptions, consider increasing this value; common settings range from 1G to 4G. Keep in mind that this amount of memory is allocated per mapper or reducer task. To estimate the total memory these tasks can use on a node, multiply the value by the total number of concurrent mappers and reducers configured for that node.
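The multiplication described above can be sketched in a few lines of Python. This is only an illustration: it assumes the property value is written as a JVM heap option such as -Xmx1G, the function names are made up for this example, and the slot counts (4 mappers, 2 reducers) are hypothetical. It counts heap only, so the real footprint of each child JVM will be somewhat higher.

```python
import re


def max_heap_mb(java_opts):
    """Return the -Xmx heap size, in MB, found in a JVM options string."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        raise ValueError("no -Xmx setting found in: " + java_opts)
    size, unit = int(match.group(1)), match.group(2).lower()
    # Convert gigabytes to megabytes; megabytes pass through unchanged.
    return size * 1024 if unit == "g" else size


def total_task_heap_mb(java_opts, map_slots, reduce_slots):
    """Heap per task multiplied by the number of concurrent tasks on a node."""
    return max_heap_mb(java_opts) * (map_slots + reduce_slots)


if __name__ == "__main__":
    # Hypothetical node: 4 concurrent mappers and 2 concurrent reducers at 1G each.
    print(total_task_heap_mb("-Xmx1G", 4, 2))  # 6144
```

With these example numbers, the tasks alone can claim about 6 GB of heap on the node, in addition to whatever the DSE JVM itself uses.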

You can set this property in mapred-site.xml, the Hadoop configuration file, on each node. You can also override it for individual jobs: a value set in the job configuration, or in your Pig or Hive script, takes precedence over the value in mapred-site.xml.
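As a rough illustration of what the mapred-site.xml entry looks like, the Python sketch below builds the property element and prints it. The helper names are invented for this example, and the -Xmx1G value is an assumption chosen from the range mentioned above, not a recommendation.

```python
import xml.etree.ElementTree as ET


def property_element(name, value):
    """Build a Hadoop-style <property> element with <name> and <value> children."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def render(prop):
    """Serialize the element to a string without an XML declaration."""
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Example value only; pick a heap size that fits your node.
    print(render(property_element("mapred.child.java.opts", "-Xmx1G")))
    # <property><name>mapred.child.java.opts</name><value>-Xmx1G</value></property>
```

The printed element belongs inside the configuration element of mapred-site.xml. For a per-job override in a Hive script, a statement of the form SET mapred.child.java.opts=-Xmx2G; should have the same effect for that job, assuming your setup does not mark the property as final.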
