DevOps · LinkedIn Post · September 24, 2025

GitLab's 2017 'oops' moment - One command. Wrong server. 6 hours of data gone.

Mojahid Ul Haque

DevOps Engineer

GitLab's 2017 "oops" moment

One command. Wrong server. 6 hours of production data… gone.

What went wrong?
- A spam attack overloaded their DB → replication lag.
- An engineer tried to resync the replica… but ran the wipe command on the primary.
- Backups? Many were broken or untested.
- Final fix: restoring from a 6-hour-old staging copy (painfully slow).

Lessons for us DevOps folks:
1. Backups mean nothing until you've tested restores.
2. Guardrails on destructive ops save careers.
3. Treat RPO/RTO as facts, not assumptions.
4. Blameless culture = faster learning, fewer cover-ups.
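What might a guardrail from lesson 2 look like? Here is a minimal Python sketch, not anything GitLab actually runs. The function name `confirm_destructive`, the `WrongHostError` class, and the hostnames in the usage note are all hypothetical. The idea: a destructive step proceeds only if the machine you are on, the host the runbook names, and the host you type all agree.

```python
import socket


class WrongHostError(RuntimeError):
    """Raised when a destructive operation targets an unexpected host."""


def confirm_destructive(expected_host, typed_host, actual_host=None):
    """Allow a destructive step only if every hostname agrees.

    expected_host: the host the runbook says to wipe (for example, the replica)
    typed_host:    what the operator typed to confirm the target
    actual_host:   the machine we are on; defaults to socket.gethostname()
    """
    actual = actual_host if actual_host is not None else socket.gethostname()
    if actual != expected_host:
        raise WrongHostError(
            f"refusing: running on {actual!r}, expected {expected_host!r}"
        )
    if typed_host != expected_host:
        raise WrongHostError(
            f"refusing: you typed {typed_host!r}, expected {expected_host!r}"
        )
    return True
```

Call it before the wipe, e.g. `confirm_destructive("db-replica", input("Type the host to wipe: "))`. Run on the primary by mistake, it raises instead of deleting anything.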

If you've never practiced a restore, you don't have a backup — you have a bedtime story.
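One way to turn RPO from an assumption into a fact is to measure it from the last restore you actually verified. A small Python sketch, with hypothetical function names and made-up timestamps:

```python
from datetime import datetime, timedelta


def measured_rpo(last_verified_restore, now):
    """Data-loss window if we had to restore right now.

    Only a backup that has been restored and checked counts.
    """
    return now - last_verified_restore


def meets_rpo(last_verified_restore, now, target):
    """True if the measured window is within the RPO target."""
    return measured_rpo(last_verified_restore, now) <= target
```

With a last verified restore point at 06:00 and an incident at 12:00 the same day, `measured_rpo` returns `timedelta(hours=6)`. Against a 1-hour target, `meets_rpo` returns `False`.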

Originally posted on LinkedIn
