
Harinder Seera 🇭🇲

Posted on • Edited on • Originally published at linkedin.com

Cognitive Bias in Performance Engineering

The words were along the lines of "... We can go live without performance testing this entity".

Human beings are prone to cognitive biases. Over time we learn to put strategies in place to recognize these biases and take action. Biases never go away; we just get better at recognizing them. Yet, from time to time, a bias will sneak through a blind spot.

This article describes one such situation, in which I was blindsided by several biases. After all, performance engineers are human beings too, and we are not immune to cognitive biases.

Let's first understand what cognitive bias is before getting to the story.

What is Cognitive Bias?

Cognitive bias is the tendency to think a certain way, often resulting in a deviation from rational and logical decision-making. Cognitive biases are often the result of our brain attempting to simplify information processing.

We are constantly bombarded with information. Biases help us declutter that information and reach decisions relatively quickly. However, biases can also subtly creep in and distort our decisions.

As human beings, we like to think we are rational, but in reality we are prone to biases that cause us to think and act irrationally. To some extent, every one of us will have exhibited a bias blind spot at some point in our lifetime.

Also, we are less likely to detect bias in ourselves than in others. Researchers from Carnegie Mellon University, City University London, Boston University, and the University of Colorado conducted a study to test this hypothesis.

According to the Cognitive Bias Codex, there are around 180 cognitive biases (the list keeps growing) that can impact our rational thinking. That is a lot of biases to deal with regularly.

Now that you know what cognitive bias is, can you think of a time when cognitive bias got the better of you? If yes, then give yourself a pat on the back. And if you have never experienced it, then I want to be like you.

Back to my experience

Before we start, let me define the term entity, which is referred to throughout the post. Consider an entity to be something that maintains a separate and distinct existence. Each entity is defined by a set of attributes. A corporation, a relationship and an address are examples of entities. Street type, street name and post code are examples of attributes of the address entity.

A generic ETL framework was developed by our development team to assist with loading entities into a database in the cloud. More than ten different types of entities were expected to be loaded using this framework. The framework was going to save us hours/days of development & testing effort per entity.

In the beginning, a considerable amount of testing and tuning effort went into implementing the framework. We managed to reduce the ETL time from more than 15 hours to less than 35 minutes. The framework was expected to handle millions of records per entity.

When optimization was complete, the first three entities were successfully loaded into the database. The timing observed in production for these entities was similar to what was observed during performance testing. The production data validated our test results.

We continued performance testing with the next six entities. Again, similar timings were observed between performance testing and production. This further built our confidence in the framework and the testing approach.

There were many things that helped build our confidence over time, such as:

  • Test data used for performance testing mirrored the skewness of production data. Based on production data, all nine entities had positive skewness.
  • Statistics gathered from production consistently showed that it took less than an hour to load an entity, which validated our test findings.
  • The next six entities did not encounter any major performance-related bottlenecks during testing. Again, production metrics validated this too.
  • Information gathered through performance testing and from loading nine entities in production confirmed that the ETL process could run at any time of day without impacting the online load.
  • ...
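To make "positive skewness" concrete, here is a minimal sketch of how the skew of a numeric column could be measured. It is not the tooling used on this project. The function name `sample_skewness`, the helper `quantile_sample`, and the two illustrative data sets are all assumptions made for the example.

```python
import math
from statistics import NormalDist, fmean


def sample_skewness(values):
    """Return the moment coefficient of skewness (third moment / variance^1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical, so there is no skew to measure
    return m3 / m2 ** 1.5


def quantile_sample(dist, n):
    """Return n evenly spaced quantiles of dist (a deterministic stand-in for sampling)."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Hypothetical data: a symmetric column and a long-right-tail column.
symmetric = quantile_sample(NormalDist(0, 1), 2000)
right_skewed = [math.exp(x) for x in quantile_sample(NormalDist(0, 0.5), 2000)]

print(f"symmetric skewness:    {sample_skewness(symmetric):.3f}")
print(f"right-skewed skewness: {sample_skewness(right_skewed):.3f}")
```

A value near zero suggests a roughly symmetric column, while a clearly positive value indicates a long right tail. A single skewness number says nothing about how many peaks the data has, which is exactly the gap that caught us out later.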

Finally, the day had arrived to load the last entity into production. The team was confident in the whole process and did not expect to encounter any issues.

To our surprise, we crossed the one-hour mark and the job was still running. Ten hours went by and the job was still running. After 12 hours we made the call to stop it.

The framework had a built-in mechanism to handle such a scenario. However, that is a discussion for another time.

The following day the team reconvened to conduct a root cause analysis (RCA) of what went wrong. Our analysis concluded that the data for this particular entity was multimodal rather than simply positively skewed, and neither the testing nor the framework had accounted for such a scenario. Once we modeled the correct data distribution in the performance environment, we were able to replicate and fix the issue.
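As an illustration of what separates a multimodal data set from a positively skewed one, the sketch below counts peaks in a smoothed histogram. It is a rough heuristic under stated assumptions, not the analysis from our RCA: `count_modes`, `quantile_sample`, the bin count, the `min_share` cut-off, and both data sets are hypothetical.

```python
import math
from statistics import NormalDist


def quantile_sample(dist, n):
    """Return n evenly spaced quantiles of dist (a deterministic stand-in for sampling)."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


def count_modes(values, bins=20, min_share=0.05):
    """Count peaks in a 3-bin-smoothed histogram of values.

    A bin counts as a peak when its smoothed count is at least
    min_share of all records and it is higher than its left neighbour
    and not lower than its right neighbour.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return 1
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    # Moving average over each bin and its two neighbours to damp noise.
    padded = [0] + counts + [0]
    smooth = [0.0] + [sum(padded[i:i + 3]) / 3 for i in range(bins)] + [0.0]
    threshold = min_share * len(values)
    modes = 0
    for i in range(1, bins + 1):
        here = smooth[i]
        if here >= threshold and here > smooth[i - 1] and here >= smooth[i + 1]:
            modes += 1
    return modes


# Hypothetical data: one long-tailed cluster versus two separate clusters.
positively_skewed = [math.exp(x) for x in quantile_sample(NormalDist(0, 0.5), 2000)]
two_clusters = (quantile_sample(NormalDist(10, 1), 1200)
                + quantile_sample(NormalDist(20, 1.5), 800))

print("modes in skewed data:   ", count_modes(positively_skewed))
print("modes in clustered data:", count_modes(two_clusters))
```

The bin count and cut-off are tuning knobs, and noisy real data may need a coarser histogram or a proper statistical test. The point is only that a peak count exposes a shape difference that a skewness figure alone can hide.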

So, you ask, where is the bias in all this? Well...

  • The bias was in us thinking that all the entities had positively skewed data.
  • Our confidence that the framework covered every scenario.
  • Making a decision based on ETL information gathered from the previous nine entities. There are times when you have to make decisions with incomplete data. In this case, however, we had the ability to analyze the production data for the tenth entity before making a decision. Because of overconfidence, we chose not to.

Overconfidence and confirmation bias led me to make an incorrect decision. I should have analyzed the production data for the tenth entity before making a call.
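One way to take the decision out of the hands of overconfidence is a simple data check before go-live. The sketch below compares a new entity's production sample against the profile that was already performance tested, using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions). It is an illustration only: the function names, the 0.2 threshold, and the sample values are assumptions, not something taken from our framework.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the largest gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    # Both empirical CDFs are step functions, so the largest gap
    # always occurs at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def needs_own_performance_test(tested_profile, production_sample, threshold=0.2):
    """Flag an entity whose data shape differs from what was already tested.

    The threshold is a hypothetical cut-off and should be tuned per project.
    """
    return ks_statistic(tested_profile, production_sample) > threshold


# Hypothetical records-per-key counts for a tested entity and a new one.
tested = [1, 1, 2, 2, 2, 3, 3, 4, 5, 8]
new_entity = [1, 2, 2, 3, 40, 41, 42, 43, 45, 48]

print("KS statistic:", ks_statistic(tested, new_entity))
print("needs its own test:", needs_own_performance_test(tested, new_entity))
```

A check like this does not replace judgment, but it forces the question "does this entity look like the ones we tested?" to be answered with data rather than with confidence built on past successes.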

So the takeaways from this post are:

  • To some extent, we have all experienced cognitive biases in our lives, and performance engineers are not immune to them.
  • From time to time we will get blindsided by our biases. However, that doesn't mean we shouldn't put strategies in place to counter their effects whenever possible. For example, ask a peer to review your work. Get others to challenge your decisions.
  • Share your examples with others so they can learn to recognize different types of biases.

Finally, if you have an example, I would love for you to share it with the rest of the community. It does not have to be related to performance engineering. It can be from a different field too. This will help everyone learn from others' experiences.


Thanks for reading!

If you enjoyed this article feel free to share it on social media 🙂

Say Hello on: Linkedin | Twitter | Polywork

Github repo: hseera
