DevOps & SRE Runbooks

$115/hr Starting at $150

I write infrastructure runbooks, incident response playbooks, and operational procedures for DevOps, SRE, and platform engineering teams — designed to be used at 3am by an on-call engineer who needs fast, unambiguous steps, not narrative prose.

Runbook scope: service degradation and outage response; database failure and recovery (Postgres, MySQL, Redis); Kubernetes node pressure, pod crash loops, and OOMKill events; certificate expiry and rotation; deployment failure and rollback; container registry and build pipeline failures; DNS resolution failures; and scheduled maintenance procedures.

Incident response playbooks: severity classification criteria; escalation path and notification tree; communication templates (internal Slack, external status page, customer communication); post-mortem structure and action item tracking; and on-call handoff procedures.

Format: decision-tree structure with exact commands, expected outputs, and conditional branches ('if X, proceed to step 5; if Y, escalate to DBA'). Every runbook ends with a verification step and escalation path.
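
As a rough illustration of that format, and not a fixed schema or a sample deliverable, a decision-tree runbook could be modeled in Python as below. The service name payments-api, the payments namespace, the step ids, and the observation labels are all hypothetical. The remediation chosen for each branch is only an example. The sketch shows each step carrying an exact command, an expected output, and conditional branches, plus a check that the runbook has both a verification step and an escalation path.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    command: str    # exact command the on-call engineer runs
    expected: str   # what the output should look like when things are healthy
    branches: Dict[str, str] = field(default_factory=dict)  # observation -> next step id


# Hypothetical runbook for a pod crash loop; names and branches are assumptions.
RUNBOOK: Dict[str, Step] = {
    "1": Step("kubectl get pods -n payments",
              "all pods Running",
              {"ok": "verify", "crashloop": "2"}),
    "2": Step("kubectl describe pod <pod> -n payments",
              "Last State shows a termination reason",
              {"oomkilled": "3", "other": "escalate"}),
    "3": Step("kubectl rollout undo deployment/payments-api -n payments",
              "deployment reported as rolled back",
              {"ok": "verify", "failed": "escalate"}),
    "verify": Step("kubectl rollout status deployment/payments-api -n payments",
                   "successfully rolled out"),
    "escalate": Step("page the platform on-call",
                     "page acknowledged"),
}


def validate(runbook: Dict[str, Step]) -> List[str]:
    """Return a list of structural problems; an empty list means the tree is sound."""
    problems = []
    for required in ("verify", "escalate"):
        if required not in runbook:
            problems.append(f"missing terminal step: {required}")
    for step_id, step in runbook.items():
        for observation, target in step.branches.items():
            if target not in runbook:
                problems.append(f"step {step_id}: '{observation}' points to unknown step {target}")
    return problems


def walk(runbook: Dict[str, Step], observations: List[str], start: str = "1") -> List[str]:
    """Follow the tree for a sequence of observations; unknown observations escalate."""
    path = [start]
    current = start
    for observation in observations:
        current = runbook[current].branches.get(observation, "escalate")
        path.append(current)
        if not runbook[current].branches:  # terminal step reached
            break
    return path
```

For example, walk(RUNBOOK, ["crashloop", "oomkilled", "ok"]) follows steps 1, 2, and 3 and ends at the verification step, while any observation the tree does not recognize falls through to escalation rather than leaving the engineer without a next step.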

My infrastructure background means the failure scenarios I document are ones I've debugged myself — not generic templates.

About

$115/hr Ongoing

Skills & Expertise

Communication Skills, Database Development, Design Verification Testing, DevOps, DNS, DocBook, Document Design, Engineering, Evaluation Design, JavaDoc, Software Deployment, Technical Editing, Technical Writing, Templates

Andrew Barth