Introduction
OpsCenter can collect metrics, manage repairs, perform backups, and perform other management operations on DataStax Enterprise (DSE) clusters even when Lifecycle Manager (LCM) is not used to deploy and configure DSE. As such, clusters can be added to and removed from OpsCenter for management purposes regardless of whether they are managed in LCM. However, all LCM clusters are registered with OpsCenter for management.
When an LCM-managed cluster is deleted from OpsCenter management, LCM jobs will fail with the error: "Cluster connection settings could not be updated: Cannot find cluster configuration in opscenterd. Please update your cluster connection settings manually." LCM does not attempt to automatically recover from this situation.
Workaround
LCM is capable of re-registering the cluster with OpsCenter, but first it's necessary to manually clear the old and currently invalid cluster id:
- Get a list of LCM clusters and make note of the ID of the affected cluster: curl http://127.0.0.1:8888/api/v1/lcm/clusters/ | json_pp
- Clear the opsc-cluster-id, signalling to LCM that it should re-register the cluster: curl -X PUT http://127.0.0.1:8888/api/v1/lcm/clusters/LCM-CLUSTER-UUID -H "Content-Type: application/json" -d '{"opsc-cluster-id": null}' | json_pp
- Run an install job in LCM. An install job is necessary even if DSE is already installed, because it causes LCM to re-register the cluster for management in OpsCenter. It is safe and efficient to run install jobs multiple times on the same nodes: LCM detects that DSE is already installed with the expected version and skips the software installation steps.
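The first two steps above can be wrapped in small shell functions, which avoids retyping the URL and payload. This is a minimal sketch, not an official DataStax tool: the function names and the OPSC_URL variable are hypothetical, and it assumes OpsCenter is reachable at the same address used in the commands above with authentication disabled.

```shell
#!/bin/sh
# Assumption: OpsCenter listens on the address used in the steps above.
# OPSC_URL is a hypothetical variable introduced for this sketch.
OPSC_URL="${OPSC_URL:-http://127.0.0.1:8888}"

# Build the LCM clusters endpoint. With no argument this is the collection
# endpoint; with a cluster UUID it is the endpoint for that single cluster.
lcm_clusters_url() {
    printf '%s/api/v1/lcm/clusters/%s\n' "$OPSC_URL" "${1:-}"
}

# Step 1: list LCM clusters so the ID of the affected cluster can be noted.
list_lcm_clusters() {
    curl -sS "$(lcm_clusters_url)"
}

# Step 2: clear opsc-cluster-id on one LCM cluster so that LCM re-registers
# it with OpsCenter during the next install job.
clear_opsc_cluster_id() {
    if [ -z "${1:-}" ]; then
        echo "usage: clear_opsc_cluster_id LCM-CLUSTER-UUID" >&2
        return 1
    fi
    curl -sS -X PUT "$(lcm_clusters_url "$1")" \
        -H "Content-Type: application/json" \
        -d '{"opsc-cluster-id": null}'
}

# Example usage (piping to json_pp is optional):
#   list_lcm_clusters | json_pp
#   clear_opsc_cluster_id LCM-CLUSTER-UUID | json_pp
```

The install job in the last step is still run from LCM afterwards; the sketch only covers the two API calls.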
Notes:
- LCM-CLUSTER-UUID must be replaced with the id found in step 1.
- Piping to json_pp is optional, but sending output to json_pp or another JSON formatter, or using a REST client such as Postman, makes the output much easier to read.
- If OpsCenter API/UI authentication is enabled, it's necessary to acquire a session token and include it in the requests in steps 1 and 2. This is documented at https://docs.datastax.com/en/opscenter/6.5/api/docs/authentication.html
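When authentication is enabled, the session token can be fetched once and then attached to each request. The sketch below is illustrative only. It assumes that a POST to /login with a JSON username and password returns an object containing a sessionid field, and that the token is passed in an opscenter-session header. Verify both against the authentication documentation linked above before relying on them. The function names and the OPSC_URL and OPSC_SESSION variables are hypothetical.

```shell
#!/bin/sh
# Assumption: same OpsCenter address as in the workaround steps.
OPSC_URL="${OPSC_URL:-http://127.0.0.1:8888}"

# Read a login response on stdin and print the value of its sessionid field.
# Assumption: the response is a JSON object such as {"sessionid": "..."}.
# Prints nothing if no sessionid field is present.
extract_session_id() {
    sed -n 's/.*"sessionid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Log in and print the session id.
# Assumption: the login endpoint and request body shape match the linked
# documentation. Usernames or passwords containing double quotes or
# backslashes would need JSON escaping, which this sketch does not do.
opsc_login() {
    curl -sS -X POST "$OPSC_URL/login" \
        -d "{\"username\": \"$1\", \"password\": \"$2\"}" | extract_session_id
}

# curl wrapper that attaches the session token to a request.
# Assumption: the token travels in the opscenter-session header.
opsc_curl() {
    curl -sS -H "opscenter-session: $OPSC_SESSION" "$@"
}

# Example usage:
#   OPSC_SESSION="$(opsc_login USERNAME PASSWORD)"
#   opsc_curl "$OPSC_URL/api/v1/lcm/clusters/" | json_pp
#   opsc_curl -X PUT "$OPSC_URL/api/v1/lcm/clusters/LCM-CLUSTER-UUID" \
#       -H "Content-Type: application/json" \
#       -d '{"opsc-cluster-id": null}' | json_pp
```

USERNAME and PASSWORD are placeholders for OpsCenter credentials, and LCM-CLUSTER-UUID must be replaced as described in the notes above.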
Technical Details
In order to reduce the frequency with which this issue is encountered in the field, OPSC-15026 has been filed as a request for OpsCenter deletion of clusters to automatically clear the opsc-cluster-id from LCM clusters. Inquire with DataStax support for an update on the status of this feature request.