Description
The current `health` endpoint used by the load balancer only checks the `getcapabilities` request. It should also check whether jobs can still be submitted (issues with the Slurm queue, filesystem down, ...).
Maybe add a `health` process that runs quickly but lets us add whatever checks we want?
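To make the idea concrete, here is a minimal sketch of what such a quick health check could look like, using only the Python standard library. Everything in it is an assumption for illustration, not existing rook code: the function names (`check_filesystem`, `check_slurm`, `run_checks`), the choice of `squeue` as the Slurm probe, and the shape of the result dictionary are all hypothetical.

```python
import shutil
import subprocess
import tempfile
from typing import Callable, Dict, Tuple

# Each check returns (ok, message). All names here are hypothetical.
CheckResult = Tuple[bool, str]


def check_filesystem(path: str) -> CheckResult:
    """Try to create and write a small temporary file under `path`."""
    try:
        with tempfile.NamedTemporaryFile(dir=path) as handle:
            handle.write(b"ok")
            handle.flush()
        return True, "writable"
    except OSError as exc:
        return False, f"not writable: {exc}"


def check_slurm(timeout: float = 5.0) -> CheckResult:
    """Ask Slurm for the queue; assumes a non-zero exit code means trouble."""
    if shutil.which("squeue") is None:
        return False, "squeue not found on PATH"
    try:
        proc = subprocess.run(
            ["squeue", "--noheader"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.SubprocessError) as exc:
        return False, f"squeue failed: {exc}"
    if proc.returncode != 0:
        return False, proc.stderr.strip() or "squeue returned an error"
    return True, "queue reachable"


def run_checks(checks: Dict[str, Callable[[], CheckResult]]) -> dict:
    """Run every check, never raise, and report an overall status."""
    results = {}
    for name, check in checks.items():
        try:
            ok, message = check()
        except Exception as exc:  # a broken check counts as a failed check
            ok, message = False, f"check raised: {exc}"
        results[name] = {"ok": ok, "message": message}
    healthy = all(item["ok"] for item in results.values())
    return {"healthy": healthy, "checks": results}


if __name__ == "__main__":
    report = run_checks(
        {
            # The directory to probe is an assumption; use the real job output dir.
            "filesystem": lambda: check_filesystem(tempfile.gettempdir()),
            "slurm": check_slurm,
        }
    )
    print(report)
```

New checks would just be extra entries in the dictionary passed to `run_checks`. How the result is exposed to the load balancer (for example as a WPS process or a separate route) is left open here.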
Environment
- rook version used, if any: 0.4.1
- Python version, if any:
- Operating System:
Steps to Reproduce
Additional Information