Incident Response Automation
Detect production issues and coordinate incident response
Prerequisites
- OpenClaw installed and running
- Monitoring tools (Sentry, Datadog, or Pingdom)
- Slack workspace
- On-call rotation defined
Required Skills
openclaw install sentry-debugger
openclaw install slack-digest
openclaw install docker-manager
Installation Steps
Install required skills
Install the Sentry debugger, Slack digest, and Docker manager skills.
openclaw install sentry-debugger slack-digest docker-manager
Configure alert sources
Set up webhooks from Sentry, Datadog, and/or Pingdom to your OpenClaw instance.
Define severity levels
Configure the severity assessment rules and corresponding actions (page, Slack mention, message, or log).
Set up on-call rotation
List the on-call engineers and configure the escalation path.
Add the config snippet
Copy the configuration below and customize the alert sources, severity levels, and on-call team.
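The "Set up on-call rotation" step mentions an escalation path without describing how it behaves. One way to think about it is sketched below. This is an illustration only, not OpenClaw's implementation: the function name `current_responder` and the 10-minute escalation interval are assumptions that do not come from the recipe.

```python
from datetime import timedelta

# Assumption: the recipe lists on-call engineers but gives no escalation
# interval, so 10 minutes is a made-up value for illustration.
ON_CALL = ["alice", "bob"]
ESCALATE_AFTER = timedelta(minutes=10)


def current_responder(on_call: list[str],
                      unacknowledged_for: timedelta,
                      escalate_after: timedelta = ESCALATE_AFTER) -> str:
    """Return who should be notified for an alert nobody has acknowledged.

    Each full `escalate_after` interval moves one step down the list.
    Once the list is exhausted, the last person stays responsible.
    """
    if not on_call:
        raise ValueError("on-call list is empty")
    step = int(unacknowledged_for / escalate_after)
    return on_call[min(step, len(on_call) - 1)]


if __name__ == "__main__":
    print(current_responder(ON_CALL, timedelta(minutes=0)))   # alice
    print(current_responder(ON_CALL, timedelta(minutes=12)))  # bob
```

Ordering the list from primary to secondary responder keeps the escalation path readable directly from the `onCall` array.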
Configuration
{
  "webhooks": {
    "alert": {
      "url": "/webhooks/alert",
      "sources": ["sentry", "datadog", "pingdom"],
      "actions": [
        "assess-severity",
        "create-incident-channel",
        "notify-on-call",
        "gather-diagnostics",
        "suggest-remediation"
      ]
    }
  },
  "incidentResponse": {
    "onCall": ["alice", "bob"],
    "severityLevels": {
      "critical": "page-immediately",
      "high": "slack-mention",
      "medium": "slack-message",
      "low": "log-only"
    }
  }
}
Add this to your openclaw.json and customize the values for your setup.
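Before copying the snippet into openclaw.json, it can help to sanity-check your customized values. The following is a minimal sketch of such a check, assuming the four action names shown in the recipe are the only ones in use (OpenClaw may accept others). The helper names `load_incident_config` and `action_for` are hypothetical and not part of OpenClaw.

```python
import json

# Assumption: these are just the action names that appear in the recipe;
# this is not an authoritative list of what OpenClaw supports.
KNOWN_ACTIONS = {"page-immediately", "slack-mention", "slack-message", "log-only"}

SAMPLE = """
{
  "incidentResponse": {
    "onCall": ["alice", "bob"],
    "severityLevels": {
      "critical": "page-immediately",
      "high": "slack-mention",
      "medium": "slack-message",
      "low": "log-only"
    }
  }
}
"""


def load_incident_config(text: str) -> dict:
    """Parse the JSON text and return the validated incidentResponse block."""
    incident = json.loads(text)["incidentResponse"]
    if not incident.get("onCall"):
        raise ValueError("incidentResponse.onCall must list at least one engineer")
    for level, action in incident["severityLevels"].items():
        if action not in KNOWN_ACTIONS:
            raise ValueError(f"unknown action {action!r} for severity {level!r}")
    return incident


def action_for(incident: dict, severity: str) -> str:
    """Look up the configured action for a severity level."""
    try:
        return incident["severityLevels"][severity]
    except KeyError:
        raise ValueError(f"no action configured for severity {severity!r}") from None


if __name__ == "__main__":
    incident = load_incident_config(SAMPLE)
    print(action_for(incident, "critical"))  # page-immediately
```

Failing loudly on an unknown severity or action is a deliberate choice here: a typo such as "slack-mesage" would otherwise only surface during a real incident.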
SOUL.md
## Incident Response Behavior
- Stay calm in all messaging. No exclamation marks, no "URGENT!!!" — a measured tone helps the team think clearly.
- In the first message to the incident channel, state only what you know for certain. Separate confirmed facts from hypotheses.
- Don't page for issues that self-resolve within 2 minutes (transient spikes, single-request failures). Wait, re-check, then escalate.
- If multiple alerts fire within 60 seconds, treat them as one incident. Look for a common cause before creating separate channels.
- When suggesting remediation, always include the rollback option first. The fastest fix is usually undoing the last deploy.
- Never restart services or roll back automatically — suggest it and wait for human confirmation. You don't have full context.
- Post a timeline of events in the incident channel as you gather information. This becomes the post-mortem foundation.
Add this to your SOUL.md to define the agent's behavior for this workflow.
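Two of the rules above are time-based: alerts within 60 seconds count as one incident, and issues that self-resolve within 2 minutes should not page anyone. The sketch below shows one possible reading of those rules. It is not how OpenClaw applies SOUL.md internally. The `Alert` type and both function names are hypothetical, and measuring the 60-second window from the first alert of each incident is an assumption (a sliding window would be another valid reading).

```python
from dataclasses import dataclass

GROUP_WINDOW_S = 60    # "multiple alerts fire within 60 seconds" (SOUL.md)
SELF_RESOLVE_S = 120   # "self-resolve within 2 minutes" (SOUL.md)


@dataclass
class Alert:
    source: str      # e.g. "sentry", "datadog", "pingdom"
    fired_at: float  # seconds on any consistent clock


def group_alerts(alerts: list[Alert],
                 window_s: float = GROUP_WINDOW_S) -> list[list[Alert]]:
    """Group alerts into incidents.

    An alert joins the latest incident if it fired at most `window_s`
    seconds after that incident's first alert; otherwise it starts a new one.
    """
    incidents: list[list[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.fired_at):
        if incidents and alert.fired_at - incidents[-1][0].fired_at <= window_s:
            incidents[-1].append(alert)
        else:
            incidents.append([alert])
    return incidents


def should_escalate(fired_at: float, now: float, still_firing: bool,
                    hold_s: float = SELF_RESOLVE_S) -> bool:
    """Escalate only if the alert is still firing after the hold-off period."""
    return still_firing and (now - fired_at) >= hold_s


if __name__ == "__main__":
    alerts = [Alert("sentry", 0), Alert("datadog", 30), Alert("pingdom", 200)]
    print([len(group) for group in group_alerts(alerts)])  # [2, 1]
    print(should_escalate(fired_at=0, now=60, still_firing=True))  # False
```

Keeping both thresholds as named constants makes it easy to see that they mirror the numbers written in SOUL.md, so the two stay in sync if you change one.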
Expected Behavior
When a production alert fires, OpenClaw assesses severity, creates a dedicated Slack incident channel, notifies the on-call engineer according to the configured severity level (paging only for critical alerts), gathers system diagnostics, and suggests remediation based on similar past incidents.
Usage Guide
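Detection, triage, and notification run without manual intervention. When an alert fires, OpenClaw creates a #incident-XXXX Slack channel, pages the on-call person for critical issues, and starts gathering diagnostics. Remediation actions such as restarts or rollbacks are only suggested and wait for human confirmation, as defined in SOUL.md. The remediation suggestions improve over time as the system learns from past incidents.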