Photo by Sarah Kilian on Unsplash
This story takes place in a company where I previously worked. Back then, I was a young developer, and I was tasked with deleting a former customer's data.
I checked the database and identified all the tables that would be impacted. Then, I wrote a SQL script to delete all the rows concerning this particular customer.
This script was not executed directly in the production environment. Instead, all database changes went through a migration tool equivalent to Liquibase or Flyway.
A few days later, our support team received a phone call from a customer who could no longer find his data in the application. After investigating, we realized that every customer's data had been deleted from the database. The main culprit was none other than my SQL script: some DELETE statements were missing their WHERE clause. In the meantime, the ops team tried to restore the production database from the most recent backup.
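To make the failure mode concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its `customer_id` column are hypothetical stand-ins, not the schema from this story; the point is only to show how much a single missing WHERE clause changes the outcome.

```python
import sqlite3


def make_db():
    # Hypothetical schema: four orders spread over three customers.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id) VALUES (?)",
        [(1,), (1,), (2,), (3,)],
    )
    conn.commit()
    return conn


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


# Intended statement: only the former customer's rows are removed.
intended = make_db()
intended.execute("DELETE FROM orders WHERE customer_id = ?", (1,))
intended_remaining = count_orders(intended)

# Same statement without the WHERE clause: every row is removed.
buggy = make_db()
buggy.execute("DELETE FROM orders")
buggy_remaining = count_orders(buggy)

print(intended_remaining, buggy_remaining)
```

Both statements are valid SQL and run without any error or warning, which is part of why this kind of mistake is easy to miss.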
At that time, I was petrified and I dreaded what my boss would say to me.
To my surprise, he did not single me out. Instead, he held the whole team responsible, and our quality checks in particular. In fact, he simply wanted to answer this question: How could this blunder have made its way to the production environment?
First of all, the SQL script was merged into the master branch with at least 2 approvers on my pull request. This means that our code review was not thorough enough.
Secondly, this script passed all our end-to-end tests and manual testing.
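As an illustration of the kind of automated check that could catch such a script, here is a small sketch, again in Python with sqlite3. It is not what our team actually had in place; the `run_cleanup` helper and the `orders` table are hypothetical. The idea is to run the cleanup inside a transaction and roll it back if any other customer loses rows.

```python
import sqlite3


def run_cleanup(conn, delete_sql, customer_id):
    """Run a cleanup statement; roll back if other customers lose rows."""
    others_sql = "SELECT COUNT(*) FROM orders WHERE customer_id <> ?"
    before = conn.execute(others_sql, (customer_id,)).fetchone()[0]
    conn.execute(delete_sql)  # sqlite3 opens a transaction implicitly here
    after = conn.execute(others_sql, (customer_id,)).fetchone()[0]
    if after != before:
        conn.rollback()  # undo the delete: it touched unrelated data
        return False
    conn.commit()
    return True


# Hypothetical data: customer 1 is the one being removed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,), (2,), (3,)]
)
conn.commit()

# The buggy statement is rejected and the data is left intact.
buggy_ok = run_cleanup(conn, "DELETE FROM orders", 1)
rows_after_buggy = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# The correct statement is accepted and committed.
fixed_ok = run_cleanup(conn, "DELETE FROM orders WHERE customer_id = 1", 1)
rows_after_fixed = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

A check like this only helps if the test data contains more than one customer. A fixture holding only the customer being deleted would pass with either statement.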
Finally, we discovered that the database backup system was not working properly and the last available backup was at least 2 months old. Luckily, the data loss had a fairly limited impact on our business, since it happened during summertime.
In my opinion, we can learn two things from this little story. In the first place, a good team and a good manager will not put the blame on you if you have made a mistake. Everyone makes mistakes, and a mistake is a good opportunity to question some of the team's practices.
In the second place, it is important for a team to protect itself against the risk of a fault, an error or a minor oversight. Having quality processes is not enough: those processes must themselves be tested and proven to work.
Thanks to Marc Barret for his time and his review
Top comments (1)
I wish I could say that I no longer committed flonky code that no one noticed in a code review. Unfortunately, far too many people don't seem to take their review responsibilities seriously enough. It's made worse by the fact that some of the more junior team members sort of assume that the more senior team members have enough experience that we don't make errors, or that whatever errors we're likely to have made are ones they won't be able to suss out.