The DataStax Enterprise Hadoop implementation runs a job tracker and a reserve job tracker, which takes over if a problem affects the availability of the primary. Information is available for setting up the job tracker here. However, there are only limited instructions for setting up the reserve job tracker. This article provides a detailed example of how to set up the reserve job tracker.
If the job tracker fails, manual intervention is required to restart it.
If the job tracker fails, new jobs will not be assigned to the reserve job tracker unless one has been configured.
The following example uses two analytics nodes with IPs 192.168.101.121 and 192.168.101.122. To enable the reserve job tracker, follow these steps.
List the existing jobtracker using dsetool listjt:
$ dsetool listjt
Use dsetool movejt <node IP> to set the reserve job tracker on 192.168.101.121. Do not set the reserve job tracker on the node that is running the primary job tracker.
$ dsetool movejt 192.168.101.121
Setting 'reserve' JT to point to /192.168.101.121
Listing jobtrackers with dsetool listjt should now show two job trackers:
$ dsetool listjt
Reserve-JT = 192.168.101.121
NOTE: A known issue (JIRA DSP-2627) prevents the job tracker from switching to the reserve job tracker. It is fixed in DSE 3.2.2 and later.
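The steps above can be sketched as a small shell helper. This is only an illustration: set_reserve_jt is a hypothetical name, not part of DSE, and it assumes you already know which node runs the primary job tracker (for example from the dsetool listjt output). The only dsetool subcommands it uses are the two shown in this article.

```shell
#!/bin/sh
# Sketch only: set the reserve job tracker, refusing to place it on the
# node that runs the primary job tracker.
# set_reserve_jt is a hypothetical helper, not a DSE command.
set_reserve_jt() {
    primary_ip="$1"   # node currently running the primary job tracker
    reserve_ip="$2"   # node that should host the reserve job tracker

    if [ -z "$primary_ip" ] || [ -z "$reserve_ip" ]; then
        echo "usage: set_reserve_jt <primary IP> <reserve IP>" >&2
        return 2
    fi

    # The reserve must not share a node with the primary.
    if [ "$primary_ip" = "$reserve_ip" ]; then
        echo "refusing: $reserve_ip runs the primary job tracker" >&2
        return 1
    fi

    # Same commands as in the manual steps above.
    dsetool movejt "$reserve_ip" || return 1
    dsetool listjt
}

# Example with the addresses used in this article (primary assumed on .122):
# set_reserve_jt 192.168.101.122 192.168.101.121
```

Passing the primary's address explicitly, rather than parsing it out of dsetool listjt, keeps the sketch independent of the exact output format of that command.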
You will also need to ensure that your dse_system keyspace is replicated with the everywhere strategy. On an Analytics/Hadoop node, this keyspace contains information about the location of the job tracker. If only a single node holds a replica of this information, the other nodes cannot find the job tracker when that node is unavailable. For more information see this link under 'Changing replication settings'. You can check whether the dse_system keyspace is replicated with the everywhere strategy by querying the system.schema_keyspaces table.
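As a rough sketch of that check, the helper below pipes a query to cqlsh on one of the nodes. check_dse_system_replication is a hypothetical name, and the sketch assumes that cqlsh is on the PATH, that it can reach the given host on its default port, and that your Cassandra version exposes the strategy_class and strategy_options columns in system.schema_keyspaces.

```shell
#!/bin/sh
# Sketch only: show the replication settings of the dse_system keyspace.
# check_dse_system_replication is a hypothetical helper, not a DSE command.
check_dse_system_replication() {
    host="$1"   # any node in the cluster, e.g. 192.168.101.121

    # cqlsh reads the statement from standard input.
    cqlsh "$host" <<'EOF'
SELECT keyspace_name, strategy_class, strategy_options
FROM system.schema_keyspaces
WHERE keyspace_name = 'dse_system';
EOF
}

# Example:
# check_dse_system_replication 192.168.101.121
```

The strategy_class column of the returned row should name the everywhere strategy. If it names a different strategy, change the replication settings as described under 'Changing replication settings'.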
In summary, to set up high availability with a reserve job tracker, upgrade to DSE 3.2.2 or later, then follow the steps above to set a reserve job tracker. After taking down the primary job tracker, you will need to exit and reopen your client session (for example, Hive) or restart the server for the change of job tracker to be picked up. Old jobs will need to be cleared out, because only new jobs are submitted to the active job tracker.