The last time I did something you might call pretty stupid was when I started my previous job, freshly minted in a support role. I had to update the encrypted password field of a single user. I forgot the WHERE clause. I updated about 15,000 passwords with the same value. Oops. Quite an impression to make within your first week on the job.
These days I’m responsible for quite a bit more, so if I screw up, it’s going to be significantly more troubling.
That day finally came on Tuesday, when I deleted a SharePoint content database from the live server, believing I was deleting it from a development server I’d recently restored it to. It was 22:07, so there wasn’t much by way of resistance, because nothing was really accessing the database. Angrily I clicked through the deadlock warnings and deleted away…
As I scrolled up my list of connections in SSMS, I thought to myself, “Why was that deadlocked? Nothing should be accessing it other than me. I’d already closed all connections.”
The damage is done
When I realised what I’d done, I took about 60 seconds to let the fear in, panic and think about what unemployment would be like. Then I decided this was really happening, and it was down to me to put it right, somehow.
The first thing I did was check the backup situation, of course. Thankfully I’d recently moved the schedule on that server to start when most people are going home, 17:30. It had finished at 20:05. “Let us hope this backup is valid,” I thought to myself, as I used Restore-DbaBackup to make it happen.
While this happened, I confessed my boo boo to my boss’s boss, because why not. Her response was “Shut the front door!” Then various conversations were had with a few people in my team while the restore completed. I can’t stress how much of a relief it is to have colleagues who support you in these dark times, even if it’s just with kind words, or stories about DBAs before you who screwed up too.
Fast forward to the restore completing, and SharePoint was happy once again. I learned a few valuable lessons. I already knew them in theory, but the reality is quite different.
Make sure you’re using the right Recovery Model. If your database is that important, use the FULL Recovery Model and take regular log backups. If the 2-hour-old FULL backup had turned out to be no good, I would have been faced with going back to Friday’s FULL backup, which I’d have to request from a tape. Bad. Why Friday’s? Because Saturday’s and Sunday’s get overwritten and never make it to tape: the current backup “standard” I’ve inherited doesn’t apply date/time to the filename. Bad.
Keep more backups to hand. Adding the date/time to a backup filename is a must in my opinion. Don’t overwrite a backup just because it’s called WSS_Content.BAK regardless of the date and time it was created. If you can’t send the backups off-site as often as you’d like, at least keep them to hand a while longer rather than overwriting them. You can never have too many backups.
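To make the date/time naming idea concrete, here is a minimal sketch in Python. It is not how my backups are actually scheduled. The function names, the filename pattern and the example date are all made up for illustration, and it only builds names and lists candidates for pruning; it never deletes or overwrites anything.

```python
from datetime import datetime
from pathlib import Path


def timestamped_backup_name(database: str, taken_at: datetime, extension: str = "bak") -> str:
    """Build a name like WSS_Content_20240109_173000.bak (hypothetical pattern)."""
    # Year-month-day then hour-minute-second, so names sort in chronological order.
    return f"{database}_{taken_at:%Y%m%d_%H%M%S}.{extension}"


def backups_to_prune(folder: Path, database: str, keep: int) -> list[Path]:
    """Return the backups older than the newest `keep` copies, without deleting them."""
    # Assumes every matching file follows the pattern above; a database whose name
    # merely starts with the same prefix would also match this glob.
    candidates = sorted(folder.glob(f"{database}_*.bak"), reverse=True)
    return candidates[keep:]
```

With something like this, each run writes to its own file, and how many copies stay to hand becomes an explicit retention decision instead of a side effect of reusing one filename.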
Test your backups. I had zero confidence the backup I was restoring would work. The last time I restored a backup of this database was earlier that day, but the backup was from April. What if something had gone awry since then? Prior to that, had this database ever been restored, anywhere, by anyone?! See this awesome post over at dbatools.io about creating a dedicated backup test server.
Enable Instant File Initialization. If you can, because there are “security considerations” with this, apparently. When you’re being asked “how are things going” and the progress bar isn’t moving, while the server zeros that 230GB file it’s creating, you’ll wish you had it enabled. See here for wise words from Ozar or here for the boring stuff from Microsoft.
Don’t ignore or dismiss error messages. If I’d read the error message instead of bullishly clicking my way through it, I’d have saved some pain.
Why have I confessed this to the internet? Well, my entire department knows about it, and so will numerous senior managers several pay grades above me when the incident is quite rightly reviewed. I expect that before long, when word gets around, people I’d imagine have no interest or need to know will stop me in the hallway, saying “I heard about what you did.”
This isn’t what I’d imagined my first real blog post would be about, but here it is. I screwed up, but I’ve learned 🙂