Repair service incorrectly reporting timeout warnings in OpsCenter releases prior to 5.2.3


In releases prior to 5.2.3, some users might see incorrect repair service timeout warnings.


The following types of errors might be seen in the repair service log:

2015-07-08 05:52:12+0000 [MyCluster] ERROR: Repair task (<Node'-502648577219449610'>, [-9185345399602795050, 9217528375793162274]) timed out after 3600 seconds.
2015-07-08 05:52:12+0000 [MyCluster] ERROR: Task (<Node'-502648577219449610'>, [-9185345399602795050, 9217528375793162274]) has failed 9 times.
2015-07-08 05:52:12+0000 [MyCluster] ERROR: 9 errors have occurred out of 100 allowed.
2015-07-08 05:52:12+0000 [MyCluster]  INFO: Adding repair task to end of queue
2015-07-08 05:52:12+0000 [MyCluster]  INFO: Beginning repair of range [-9185345399602795050, 9217528375793162274] on node ks=my_keyspace, cfs=[u'my_keyspace', u'my_keyspace_users_keyspaces']
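To see how many distinct repair tasks are being reported as timed out, the log can be summarised with a short script. The following is a minimal sketch, not an OpsCenter tool: the regular expression is modelled only on the sample "timed out" line above, and the names `TIMEOUT_RE` and `timed_out_tasks` are hypothetical. Other OpsCenter versions may format the message differently.

```python
import re
from collections import Counter

# Assumption: the pattern mirrors the sample "timed out" line shown above.
TIMEOUT_RE = re.compile(
    r"ERROR: Repair task \(<Node'(?P<node>-?\d+)'>, "
    r"\[(?P<start>-?\d+), (?P<end>-?\d+)\]\) "
    r"timed out after (?P<seconds>\d+) seconds"
)


def timed_out_tasks(lines):
    """Count reported timeouts per (node id, range start, range end)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            key = (
                match.group("node"),
                int(match.group("start")),
                int(match.group("end")),
            )
            counts[key] += 1
    return counts
```

Passing an open file handle for the repair service log to `timed_out_tasks` returns a counter keyed by node and token range. The reported ranges can then be compared against the node logs to confirm whether the repairs actually completed.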



This is a known issue, outlined in the following internal Jira:

OPSC-5955 - Repair service reporting timeouts when actually repairs are completing ok


Until it is possible to upgrade, checking the logs on the nodes themselves for repair failure messages is enough to confirm whether a repair actually failed. Running manual repairs around the ring is also an alternative.
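Checking the node logs can be partly automated. The sketch below is only an illustration: the function name `repair_failure_lines` and the default keywords are assumptions, not the exact wording of any Cassandra log message, so adjust them to match the messages your version emits. It returns every line that mentions "repair" together with one of the failure keywords.

```python
def repair_failure_lines(lines, keywords=("fail", "error", "exception")):
    """Return log lines mentioning 'repair' plus any failure keyword.

    The keywords are illustrative assumptions, matched case-insensitively.
    """
    hits = []
    for line in lines:
        lowered = line.lower()
        if "repair" in lowered and any(word in lowered for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits


def scan_log(path):
    """Scan one node log file; the path is supplied by the caller."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return repair_failure_lines(handle)
```

For example, `scan_log("/var/log/cassandra/system.log")` would scan a node log at that location, assuming that is where the node writes its log. An empty result for the period in question suggests the timeouts reported by the repair service were spurious.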


To resolve the issue permanently, upgrade to OpsCenter 5.2.3.


