DataStax Help Center

Enabling reserve job tracker for DSE Hadoop

Summary

The DataStax Enterprise Hadoop implementation runs a job tracker and a reserve job tracker, which takes over if a problem affects the availability of the primary. Information is available for setting up the job tracker here. However, there are limited instructions for setting up the reserve job tracker. This article provides a detailed example of how to set up the reserve job tracker.

 

Symptoms

If the job tracker fails, manual intervention is required to restart it.

 

Cause

If the reserve job tracker has not been configured, new jobs are not assigned to it when the job tracker fails.

 

Solution

The following example uses two analytics nodes with IPs 192.168.101.121 and 192.168.101.122. To enable the reserve job tracker, follow these steps.

 

List the existing job tracker using dsetool listjt:

$ dsetool listjt
DC                             JobTracker  
--                             --          
Analytics-Analytics            192.168.101.122

 

Use dsetool movejt <node IP> to set the reserve job tracker on 192.168.101.121. Do not set the reserve job tracker on the node that is running the primary job tracker (192.168.101.122 in this example).

 

$ dsetool movejt 192.168.101.121
Setting 'reserve' JT to point to /192.168.101.121

 

Listing the job trackers with dsetool listjt should now show both the primary and the reserve job tracker:

$ dsetool listjt
DC                             JobTracker  
--                             --          
Analytics-Analytics            192.168.101.122
------
Reserve-JT = 192.168.101.121
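The check above can also be scripted. The following is a minimal sketch, assuming the dsetool listjt output has the same layout as the example shown here; check_reserve_jt is a hypothetical helper name, not part of dsetool. It reads the listing on standard input and confirms that a reserve job tracker is present and is not on the same node as the primary.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSE): reads `dsetool listjt` output on
# stdin and verifies that a reserve job tracker is listed and that it is
# on a different node from the primary. Assumes the output layout shown
# in this article.
check_reserve_jt() {
    output=$(cat)

    # Primary: last field of the first non-reserve line ending in an IPv4 address.
    primary=$(printf '%s\n' "$output" |
        awk '!/^Reserve-JT/ && $NF ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $NF; exit }')

    # Reserve: last field of the "Reserve-JT = <ip>" line, if there is one.
    reserve=$(printf '%s\n' "$output" | awk '/^Reserve-JT/ { print $NF; exit }')

    if [ -z "$reserve" ]; then
        echo "no reserve job tracker configured"
        return 1
    fi
    if [ "$reserve" = "$primary" ]; then
        echo "reserve and primary job tracker are on the same node: $primary"
        return 1
    fi
    echo "primary=$primary reserve=$reserve"
}

# Usage on an analytics node (commented out because it needs a running cluster):
# dsetool listjt | check_reserve_jt
```

The function only inspects text, so it can be tried against saved output before being run on a live node.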

 

NOTE: A known issue (JIRA DSP-2627) prevents the job tracker from switching to the reserve job tracker. It is fixed in DSE 3.2.2 and later.

 

You will also need to ensure that your dse_system keyspace replicates using EverywhereStrategy. On an Analytics/Hadoop node, this keyspace contains information about the location of the job tracker. If only a single node holds a replica of that information, the other nodes cannot find the job tracker when that node is unavailable. For more information see this link under 'Changing replication settings'. You can check whether the dse_system keyspace uses EverywhereStrategy by querying the system.schema_keyspaces table.
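As a minimal sketch of that check, the helper below scans cqlsh output for the dse_system row and succeeds only if the row mentions EverywhereStrategy. The function name uses_everywhere_strategy is hypothetical, and the example assumes that cqlsh is on the PATH, that your cqlsh version accepts statements on standard input, and that system.schema_keyspaces exposes keyspace_name and strategy_class columns; adjust for your release if it differs.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSE): reads cqlsh query output on stdin
# and returns 0 only if the dse_system row mentions EverywhereStrategy.
uses_everywhere_strategy() {
    grep 'dse_system' | grep -q 'EverywhereStrategy'
}

# Usage against a live node (commented out because it needs a running cluster;
# assumes cqlsh reads statements from stdin):
# echo "SELECT keyspace_name, strategy_class FROM system.schema_keyspaces;" \
#     | cqlsh 192.168.101.121 \
#     | uses_everywhere_strategy \
#     && echo "dse_system uses EverywhereStrategy"
```

If the check fails, the 'Changing replication settings' documentation referenced above describes how to alter the keyspace.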

 

In summary, to set up high availability with a reserve job tracker, upgrade to DSE 3.2.2 or later, then follow the steps above to set a reserve job tracker. After the primary job tracker goes down, you will need to exit and reopen your client session (for example, hive) or restart the server so that the change of job tracker is picked up. Old jobs will need to be cleared out, because only new jobs are submitted to the active job tracker.

 
