Cyclomatic what? Even spell check doesn't recognise the word, but it's a super useful software metric for understanding how your software works.
I read about it a few years ago, but its use seems to have fallen away since. I feel it's a very valuable tool in a developer's arsenal, something that should be used in code reviews and for keeping your codebase maintainable. We all know to keep our code "simple", and we've all heard about the KISS principle, but were we ever told what simple really means, and how we should measure it?
Well this is where cyclomatic complexity enters the frame.
Definition
Courtesy of Wikipedia:
Cyclomatic complexity is a software metric, used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976. Cyclomatic complexity is computed using the control flow graph of the program: the nodes of the graph correspond to indivisible groups of commands of a program, and a directed edge connects two nodes if the second command might be executed immediately after the first command. Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.
What that actually means
In essence it's the number of distinct routes through a piece of logic, and it's generally considered in the context of a maintainability index: the more branches there are within a particular function, the harder it is to keep a mental model of how it works. The metric is roughly one plus the number of decision points (loops, if statements and so on). That gives a reasonable picture of how the value builds up: yes, x may be bigger than 100 and execution moves straight on, and that path through has a complexity of 1, but the code block/method itself still has a score of 11, as in the sketch below.
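As a rough sketch of the kind of method being described (hypothetical names, using the usual "one plus decision points" approximation; exact scores vary slightly by tool):

function describe(x: number): string {
  if (x > 100) return "off the scale"; // the simple path that sails straight through
  if (x > 90) return "nineties";
  if (x > 80) return "eighties";
  if (x > 70) return "seventies";
  if (x > 60) return "sixties";
  if (x > 50) return "fifties";
  if (x > 40) return "forties";
  if (x > 30) return "thirties";
  if (x > 20) return "twenties";
  if (x > 10) return "teens";
  return "single digits";              // 10 decision points + 1 = complexity 11
}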
Why should I care?
Code coverage is becoming an integral part of the development cycle. Cyclomatic complexity ultimately affects the number of unit tests you will need to write for a given piece of code. Every additional route through a method is an additional test to write, and an additional place your code could fall over or bugs could creep in.
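A rough sketch of what that means in practice (hypothetical function, Jest-style assertions shown as comments):

function shippingCost(weightKg: number, express: boolean): number {
  let cost = 10;
  if (weightKg > 10) { // +1 - needs a "heavy parcel" test and a "light parcel" test
    cost = 20;
  }
  if (express) {       // +1 - and an "express" test on top of that
    cost += 15;
  }
  return cost;         // complexity of 3, so roughly three tests to exercise the independent paths
}

// expect(shippingCost(5, false)).toBe(10);
// expect(shippingCost(12, false)).toBe(20);
// expect(shippingCost(5, true)).toBe(25);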
When you take all this into consideration, the cyclomatic complexity of your code ultimately determines its effectiveness, simplicity, maintainability and general practicality. So yeah, pretty important, don't you think?
High "complexity" translates directly to low readability, which also means it's harder for new developers to come in and understand what is going on.
I'm sure you've had the experience of looking at some code with no idea what was going on, or why it was written "that" way.
So next time you write something, bear in mind that the next person looking at it might not be you. Leave it in a state you'd be happy to find it in yourself. This approach has always helped me when finishing up a new feature.
How important is Cyclomatic Complexity?
Who doesn't love a good table? This one shows the different values a method could have and what they mean.
Complexity | What it means |
---|---|
1-10 | Structured and well-written code that is easily testable. |
11-20 | Fairly complex code that could be a challenge to test. Depending on what you are doing, values in this range can still be acceptable if there's a good reason for them. |
21-40 | Very complex code that is hard to test. You should look at refactoring it, breaking it down into smaller methods, or applying a design pattern. |
>40 | Crazy code that is not testable at all and nearly impossible to maintain or extend. Something is really wrong here and needs to be scrutinised further. |
These different levels help us better understand the code we are writing and the implications it will have on any testing resource we need. They also make us aware that high levels of complexity will cause us problems in the future, and that time should be spent refactoring at the next available opportunity.
What can we do to fix this?
The table above shows the different levels of complexity and when you should start looking at refactoring your code. We'll look at a few ways this can be achieved. By far the easiest is to remove any unneeded if or else statements. These sometimes creep in during development and then never get removed. One common example you may find in your code base goes like this:
var msg = "";
if (month == 12 && day == 25) { // + 2
msg = "Merry Christmas"; // +1
} else {
msg = "Have a nice day"; // +1
}
return msg; // +1 - Total 5
There doesn't look to be a lot wrong with the code above. However, if we simply remove the else statement and move the default message to the declaration, we remove one complexity point straight away. It's an easy change and a common one, as shown below.
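For reference, a sketch of that refactor (wrapped in a hypothetical greeting function, keeping the same counting style as above):

function greeting(month: number, day: number): string {
  var msg = "Have a nice day";    // default moved to the declaration
  if (month == 12 && day == 25) { // +2
    msg = "Merry Christmas";      // +1
  }
  return msg;                     // +1 - Total 4
}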
Another big culprit for high complexity is the case or switch statement.
switch (day) { // +1
case 0: return "Sunday"; // +2
case 1: return "Monday"; // +2
case 2: return "Tuesday"; // +2
case 3: return "Wednesday"; // +2
case 4: return "Thursday"; // +2
case 5: return "Friday"; // +2
case 6: return "Saturday"; // +2
default: throw new Exception(); // +2 Total 17!
}
In certain cases you can't get away from blocks of code like the one above; that's what switch statements were designed for. But sometimes a switch statement is just bad code design. The strategy pattern is a good approach to take if your switch statement is likely to grow. In the example above it's unlikely we'll be getting new days added to our calendar, but take, for example:
switch (carGarage) {
case 'seat': return contactSeat(); // +2
case 'audi': return contactAudi(); // +2
default: return contactFord(); // +2 - Total 6
}
We have 3 case statements here, but looking at what it's currently implementing, you could expect that to expand. Adding additional case statements is one possible way of extending this code, but that will increase the complexity with each additional case! A strategy pattern would tidy this up nicely.
enum CarDealerTypes { Seat, Audi, Ford }
interface CarDealerStrategy {
CallDealer();
}
class SeatDealer implements CarDealerStrategy {
CallDealer() {
CallSeat(); // +1
}
}
class AudiDealer implements CarDealerStrategy {
CallDealer() {
CallAudi(); // +1
}
}
class FordDealer implements CarDealerStrategy {
CallDealer() {
CallFord(); // +1
}
}
class Dealership {
// Here is our alternative to the case statements, easy right!
CallDealer(dealer: CarDealerStrategy) {
dealer.CallDealer(); // +1
}
// These are the methods that will ultimately be used
ContactAudiDealership() {
this.CallDealer(new AudiDealer()); // +1
}
}
It's a higher setup cost, and slightly more complicated to begin with. However, by the time your 15th case gets added, you'll be happy you decided to switch approaches! In addition, we have reduced the complexity from the original 3 in the case statement to 1 in the strategy pattern. Imagine if your switch statement was doing additional logic, with further if statements embedded; you can see this becoming a real struggle to test!
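As a quick usage sketch (just using the classes above), the call site stays flat no matter how many dealers get added:

// The branching is replaced by picking a strategy object.
const dealership = new Dealership();
dealership.CallDealer(new SeatDealer()); // was: case 'seat'
dealership.CallDealer(new AudiDealer()); // was: case 'audi'
dealership.CallDealer(new FordDealer()); // was: the default case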
Use that noggin!
As with everything development related, the most important thing to keep in mind is that you don't just change stuff because you want to.
Refactoring and improving your codebase is imperative to keeping a clean and concise environment. If you find your code is running smoothly, not causing you or your customers any issues, then don't change it because a code metric is telling you it's wrong.
Code is legacy the moment it is written, so your refactoring might be obsolete by the next round of development. Improve code when it is being amended anyway: a good programmer should fix any issues they find whilst working on a story or feature, but shouldn't go changing code that doesn't directly affect what they are currently doing and would require additional testing.
Tools
So you understand the concept and you understand how to fix it, but what's the easiest way to spot a potential problem? Most IDEs offer some built-in tools to help you out. I'll run through a couple now:
Visual Studio
Simply calculate your code metrics by going to Analyze | Calculate Code Metrics for Solution. Find more details here: Visual Studio - Code Metric Help
VS Code
I've linked a great extension that I've been using recently; it displays the complexity at the top of each function. Find it here: CodeMetric Extension
There are tools out there for most IDEs, so go and find the one that suits you!
I hope this introduction to cyclomatic complexity gives you something to think about, and helps you sometime in the future. The additional reading below delves further into the subject, so feel free to have a read if the topic interests you. As always, let us know what you think in the comments below.
This was originally posted on my own blog here: Design Puddle Blog - Coding Concepts- Cyclomatic Complexity
Additional Reading
McCabe's paper in full: http://mccabe.com/pdf/mccabe-nist235r.pdf
A different perspective on why you shouldn't use it: https://www.cqse.eu/en/blog/mccabe-cyclomatic-complexity/
And some more clarification: https://dzone.com/articles/what-exactly-is-mccabe-cyclomatic-complexity
Top comments (17)
I definitely like the write-up! I love to see software quality metrics being discussed. I do want to point out in your first example:
Your suggested reduction in cyclomatic complexity (which I agree with) actually increases your essential complexity... though not above the structured/unstructured threshold. You add an additional exit point to the function, which is a trade-off. Your second example:
The essential complexity is 8 because each case is an exit, as well as the `default` case. In this example you don't need any trade-offs between cyclomatic and essential complexity; it's bad all the way around. The threshold we use on my quality team (based on McCabe) is that a method in OO code with an essential complexity greater than 4 is considered unstructured. Again, I'm definitely glad to see good editorials on software quality metrics. Thanks for the post!
"Your suggested reduction in cyclomatic complexity (which I agree with) actually increases your essential complexity... though not above the structured/unstructured threshold. You add an additional exit point to the function, which is a trade off. Your second example:
"
An additional exit point won't be added, since the msg variable has a default; you only change the content of the variable when the if statement evaluates to true, so only one return statement is needed.
Very true! Thanks for the comment.
You're absolutely right. I was mixing code samples in my head. Thanks for the correction!
Thanks for the comment. I like the introduction to essential complexity. So long as the same code metrics are used throughout the codebase they should give an insight into code smells and areas ripe for refactoring!
Interesting. I assumed the topic would be more, well, complex.
One thing this method would seem to miss is that additional branches sometimes multiply the number of paths and sometimes don't.
There are 3 possible paths through the above code, but the following code has four:
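Roughly, the two shapes being compared look something like this (a hypothetical sketch, not the exact snippets from the comment):

// Example one: nested - if `a` is false, `b` is never even looked at (3 paths).
function nested(a: boolean, b: boolean): string {
  if (a) {
    if (b) {
      return "a and b";
    }
    return "a only";
  }
  return "neither";
}

// Example two: sequential - the two branches are independent (4 paths).
function sequential(a: boolean, b: boolean): string {
  let result = "";
  if (a) { result += "a"; }
  if (b) { result += "b"; }
  return result;
}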
However, the latter code's cyclomatic complexity would be lower. If I understand correctly, example one would have a cyclomatic complexity of 5, while example two would have a cyclomatic complexity of 3.
Hi Dustin, nested branches are definitely classed as more complex, have a look at your examples run through the Code Metrics extension in VSCode. The easiest way to gauge the count is to look at every possible route through the code, then add 1.
Hmmm, the cyclomatic complexity M of a given function is M = E - N + 2, where E and N are the edges and nodes in the graph of its control flow. The control flows of both these examples contain 7 nodes and 8 edges, so both have a cyclomatic complexity of M = 8 - 7 + 2 = 3. This is confirmed by running radon on the Python version @cathodion offered.
Nesting does hurt readability, as the CQSE post you linked to mentions.
However, the first example has fewer routes because of the nesting. If `a` is false, there's only one route to take, regardless of the value of `b`.
It's about all potential routes through, rather than specific routes. If you wanted to get 100% code coverage you would need to cover every possible route; this is what the complexity is trying to convey.
I'm not sure what you mean by potential vs. specific routes.
I think you're trying to say that it either goes into the if or the else statement, so there are two specific routes through. But those two routes vary massively depending on which it goes into, and the metric doesn't gauge which is more likely or weight them accordingly. I agree it's definitely not perfect; the piece arguing against it in the additional reading section describes its pitfalls well.
In the past I used this VS Code extension a lot. Some of my scripts had a complexity of 500+ and I had to spend a lot of time tuning them down to around 100. I learned a lot from that; however, it was showing up everywhere, so I decided to turn it off most of the time.
Great post.
Very useful! I didn't know about this concept. I will definitely add this to my tool box.
Thanks, glad it was of use.
This is one of the biggest advantages gained from using linters (static code analyzers).
Enforcing rules to keep the bad code away from the master branch!
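For instance (assuming your stack uses ESLint), the built-in complexity rule can fail a build once a function goes over a chosen threshold; the limit of 10 here is just an illustration:

// .eslintrc.js - ESLint's built-in "complexity" rule flags any function whose
// cyclomatic complexity goes over the configured maximum.
module.exports = {
  rules: {
    complexity: ["error", 10],
  },
};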
Very true, they can be annoying at times, but a bit of tuning usually gets them configured to be useful for your methods.