qwen_agent/skills/incident-postmortem-ja/SKILL.md
2026-04-14 20:49:56 +08:00

---
name: incident-postmortem-ja
description: Create structured postmortems and 障害報告書 for incidents, outages, and service failures with clear timelines, root-cause analysis, and preventive actions. Use this whenever users ask for an incident report, postmortem, RCA, incident review, 障害報告, 障害報告書, 振り返り, or 再発防止計画 focused on system and process improvement; use it for formal incident analysis, not for routine status updates or personal blame.
---
# Incident Postmortem JA
## Overview
Produce structured incident postmortems in Japanese enterprise reporting style.
This skill is for:
- Service outage reports
- Internal incident retrospectives
- Customer-facing incident summaries
- Follow-up prevention planning
## Triggering Cues
Use this skill when user messages include:
- Chinese cues: 事故复盘、故障报告、根因分析、再发防止、事后报告
- Japanese cues: 障害報告、インシデント報告、ポストモーテム、原因分析、再発防止策
- English cues: postmortem, incident report, RCA, lessons learned
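
The cue lists above can be checked with a simple keyword matcher. The following is a minimal sketch, not the skill's actual triggering mechanism: the `CUES` table and the `matched_cues` function are hypothetical names, and the choice of substring matching for Chinese and Japanese versus whole-word matching for English is an assumption.

```python
import re

# Cue strings copied from the lists above; the table layout is an assumption.
CUES = {
    "zh": ["事故复盘", "故障报告", "根因分析", "再发防止", "事后报告"],
    "ja": ["障害報告", "インシデント報告", "ポストモーテム", "原因分析", "再発防止策"],
    "en": ["postmortem", "incident report", "rca", "lessons learned"],
}


def matched_cues(message: str) -> list[str]:
    """Return every cue found in the message, in table order."""
    lowered = message.lower()
    hits = []
    for language, cues in CUES.items():
        for cue in cues:
            if language == "en":
                # Whole-word match so that "rca" does not fire on "circa".
                found = re.search(rf"\b{re.escape(cue)}\b", lowered) is not None
            else:
                # Chinese and Japanese have no word spaces, so use substring match.
                found = cue in lowered
            if found:
                hits.append(cue)
    return hits
```

A non-empty result suggests the skill applies; an empty one does not rule it out, since users may describe an incident without using any listed cue.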
## Input Requirements
Ask the user for the following, or infer it from context:
1. Incident summary and impact window
2. Affected users/systems
3. Timeline events (with timestamps if possible)
4. Immediate mitigation and long-term fixes
5. Evidence links (logs, alert IDs, ticket IDs) if available
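
One way to track which inputs are still missing is a small record type. This is a sketch under stated assumptions: `IncidentInput`, `TimelineEvent`, and `missing_fields` are hypothetical names, and treating items 1 to 4 as required and item 5 as optional is an interpretation of the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class TimelineEvent:
    description: str
    timestamp: Optional[datetime] = None  # None when no timestamp was provided


@dataclass
class IncidentInput:
    summary: str = ""
    impact_window: str = ""
    affected: list[str] = field(default_factory=list)
    timeline: list[TimelineEvent] = field(default_factory=list)
    immediate_mitigation: str = ""
    long_term_fixes: list[str] = field(default_factory=list)
    evidence_links: list[str] = field(default_factory=list)  # optional (item 5)

    def missing_fields(self) -> list[str]:
        """Names of required inputs that are still empty."""
        required = [
            "summary",
            "impact_window",
            "affected",
            "timeline",
            "immediate_mitigation",
            "long_term_fixes",
        ]
        return [name for name in required if not getattr(self, name)]
```

The returned names can drive follow-up questions to the user before the report is drafted.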
## Output Format
Always structure the output with this template, in this order:
1. **概要 (Executive Summary)**
2. **影響範囲 (Impact Scope)**
3. **時系列 (Timeline)**
4. **原因分析 (Root Cause Analysis)**
5. **実施済み対策 (Mitigations Applied)**
6. **再発防止策 (Preventive Actions)**
7. **フォローアップ項目 (Owner / Due Date)**
For the root cause analysis section, prefer a simple 5-Whys chain when the available data supports it.
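
The template can be expressed as an ordered list of headings that a renderer fills in. The sketch below is illustrative only: `SECTIONS` mirrors the seven headings above, while `render_postmortem` and the `確認中 (pending verification)` placeholder for empty sections are assumptions, not part of the skill.

```python
# Section headings copied from the template above, in the required order.
SECTIONS = [
    ("概要", "Executive Summary"),
    ("影響範囲", "Impact Scope"),
    ("時系列", "Timeline"),
    ("原因分析", "Root Cause Analysis"),
    ("実施済み対策", "Mitigations Applied"),
    ("再発防止策", "Preventive Actions"),
    ("フォローアップ項目", "Owner / Due Date"),
]

# Hypothetical placeholder text for sections with no content yet.
PENDING = "確認中 (pending verification)"


def render_postmortem(content: dict[str, str]) -> str:
    """Render all seven sections; sections without content get the placeholder."""
    lines = []
    for number, (ja, en) in enumerate(SECTIONS, start=1):
        lines.append(f"{number}. **{ja} ({en})**")
        lines.append(content.get(ja, PENDING))
    return "\n".join(lines)
```

Rendering every heading even when a section is empty keeps the report structure stable and makes gaps visible rather than silently omitted.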
## Workflow
1. Build a factual timeline first.
2. Separate symptom, trigger, and root cause.
3. Distinguish temporary workarounds from permanent fixes.
4. Convert prevention ideas into assigned actions.
5. Keep language objective, accountable, and non-defensive.
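
Step 1 can be sketched as sorting events by timestamp while keeping untimed events visible. This is a minimal illustration, assuming ISO 8601 timestamp strings as input; `build_timeline` and the "time pending verification" label are hypothetical.

```python
from datetime import datetime
from typing import Optional


def build_timeline(events: list[tuple[Optional[str], str]]) -> list[str]:
    """Order (timestamp, text) events chronologically; untimed events go last."""
    timed = sorted(
        ((datetime.fromisoformat(ts), text) for ts, text in events if ts),
        key=lambda pair: pair[0],
    )
    untimed = [text for ts, text in events if not ts]
    lines = [f"{ts:%Y-%m-%d %H:%M} {text}" for ts, text in timed]
    # Events without a timestamp are kept and flagged rather than dropped.
    lines += [f"(time pending verification) {text}" for text in untimed]
    return lines
```

For example, with made-up events:

```python
build_timeline([
    ("2026-04-14T10:30", "Rollback completed"),
    (None, "Customer reports received"),
    ("2026-04-14T08:30", "Alert fired"),
])
```

The two timestamped events come back in chronological order, followed by the flagged untimed event.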
## Examples
### Example 1
Input:
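- 支付服务宕机 2 小时,影响约 3000 用户,已紧急回滚。 (Chinese: "The payment service was down for 2 hours, affecting about 3,000 users; an emergency rollback has been performed.")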
Output style:
- Full postmortem sections
- Quantified impact and timeline clarity
- Preventive action list with owners and deadlines
### Example 2
Input:
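- API 限流阈值配置错误导致高峰时段持续降级。 (Chinese: "A misconfigured API rate-limit threshold caused sustained degradation during peak hours.")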
Output style:
- Clear distinction between the configuration mistake and the underlying process gap
- Monitoring and alerting improvement actions
## Guidelines
- Do not assign personal blame.
- Focus on system/process improvements.
- Explicitly label unknown facts as pending verification.
- Keep follow-up items measurable and time-bound.
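
The time-bound part of the last guideline can be checked mechanically. The sketch below is an assumption-laden illustration: `FollowUp` and `follow_up_gaps` are hypothetical names, and whether an action is truly measurable still needs human judgment.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FollowUp:
    action: str
    owner: Optional[str] = None  # a team or role responsible for the action
    due: Optional[date] = None


def follow_up_gaps(item: FollowUp) -> list[str]:
    """List what a follow-up item still needs before it counts as assigned."""
    gaps = []
    if not item.owner:
        gaps.append("owner")
    if item.due is None:
        gaps.append("due date")
    return gaps
```

An item with gaps can be listed in the report with the missing fields marked as pending verification instead of being left out.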