Intermittent Workload Restarts
Incident Report for amazee.io
Resolved
This incident has been resolved.
Posted Apr 05, 2024 - 06:03 UTC
Update
We're still investigating certain workload restarts. Certain workloads appear to trigger conditions on the compute nodes that then lead to the workloads on the affected compute node being rescheduled.
Posted Feb 13, 2024 - 09:39 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 24, 2024 - 22:51 UTC
Identified
Following up on the earlier incident regarding the intermittent workload restarts: we'll run an additional maintenance window after 21:00 UTC today to move workloads onto a new set of compute nodes. This action should stabilize the affected workloads and stop the intermittent restarts we are seeing.
Posted Jan 24, 2024 - 16:58 UTC
This incident affected: Switzerland (ch4.lagoon).