Maciej Posłuszny

Posted on Aug 19, 2017

Refactoring is not so scary

#refactoring #cleancode #languageagnostic

What is refactoring?

The answer to this question is very crucial for your understanding of the whole process. Refactoring is making changes in the structure of code, while not making changes to the functionality.
By the way did I say process? Yes I did, because it is a continous process and we often forget about that. Lately I've asked a fellow programmer in my company "So how do you integrate your development process with refactoring?". He said "Everytime after the sprint we are given some time to refactor". So why is this a bad practice? Because after a whole sprint you don't have as much context as you did when you've been coding every functionality. You have to: read the code, understand the code, switch to the other functionality, repeat. Think about how efficient it would be when you do this while you're coding this new fresh stuff.
So once again, refactoring is a process that you should apply in parallel to writing code!

Why do we want to refactor?

The answer is simple - refactoring saves costs in the future.
Every project which is getting bigger is also getting more complex. If the code quality doesn't go along with the complexity, everyone will spend a lot more time reading it. Even for the new, greenfield projects the ratio of writing/reading code might be something like 30%/70% (and it goes up to something like 5%/95% in bigger projects). If you think about optimization of these 70% and multiply it by the amount of team members, it could lead to huge savings. Now, don't you think that making your code more readable would help a lot in the future?

How do I refactor?

Let's say you're implementing a new feature and you just wrote that fresh new class that handles some operations. As the class is getting bigger you're going to see more flaws. How can you spot them? Clean code to the rescue!

Functions length should be approximately 7-20 lines of code. No more. In some specific situations you won't be able to achieve this, but as you spot some lines of code that could be given a name, just extract them to a new function (you don't have to count if, while etc. as whole line, just their bodies).
Function arguments should be approximately taking 3 arguments. Ideally you'll want to take zero arguments and all data would be taken from class' properties. It's not always possible so at least try to not make them more than 3.
Class length should be approximately 150 lines of code. Why? Because most probably it breaks SOLID's Single Responsibility Principle - a class should do only one thing and should do it well.
Indentation should be approximately 2 levels deep. Let's show you an example just for clarity:

public void doSomething() {
    while(conditionIsMet()) {
        if (anotherConditionIsMet()) {
            doSomeLogic();
        }
    }
}

Don't go beyond that and you're good to go. When you are about to go deeper - extract to functions!

If you have to use the word and in your function name, then there is something wrong with it. Once again Single Responsibility Principle of SOLID.
Extract to variables and functions everytime you can provide a clear information about its purpose. Especially extract the bodies of loops to functions and number/string literals to constants!

Before:

  product.grossValue = product.price * 1.23;

After:

product.grossValue = calculateGrossValue(product.price);

public double calculateGrossValue(int productPrice) {
  return productPrice * PRODUCT_FEE;
}

Simplify conditions by extracting them. Similar to previous one but now we are targeting conditions in if, while, for statements etc.

It's easy to see what is the condition for the if statement in this particular scenario

if (!inventory.cars.isEmpty()) {
  doSomething();
}

But wouldn't it be more readable like this?

if (carsInInventoryNotEmpty()) {
  doSomething();
}

private boolean carsInInventoryNotEmpty() {
  return !inventory.cars.isEmpty();
}

Never allow duplication of code! If a bug ever happens in one place, you will have to cover every other and most probably you will miss one of them. Instead look for a smart abstraction that you can extract. Design patterns are very helpful in getting out of sticky situation.
Don't use comments. Instead try to name your functions and variables better. Comments are allowed only for TODO, FIXME (not on master branch though) and some special cases where you would want to describe your intentions instead of mechanism.

These are my rules of thumb to keep my code clean and organized. You can read more about other rules in the Uncle Bob's book or in many other resources on the internet.
It's also good to take a look at catalog of refactorings and see more tools to make your code better. Also most basic refactorings should be available in your IDE. Use it whenever you can, because your IDE is smart and will not let you refactor if it will break anything!

Summary

The idea of refactoring is not to make everything perfect at once. Instead you want to make a lot of small changes that have any positive influence on your code. Every time you make a change, ask yourself: will it be more readable to me and other people that might come across this? If the answer is yes, then you should keep it.
Remember, you produce code for humans after all, not machines. And yes, you can make a difference :) Good luck!

What are your thoughts on this topic? Share your best practices in the comments!

Top comments (18)

Osaetin Daniel • Feb 14 '18

Don't use comments. Instead try to name your functions and variables > better. Comments are allowed only for TODO, FIXME (not on master branch > though) and some special cases where you would want to describe your > intentions instead of mechanism.

I don't agree with this. The only case where comments aren't useful are autogenerated comments by IDE's (Java I'm looking at you).

It makes it easier and quicker to understand the source code, Especially in open source projects with a lot of contributors.

I've found that it's easier for me to understand the code if they're helpful comments about what a particular block of code/function does. I can just skip the code if the comment says it does something different that I'm not interested in. I don't need to read it.

On the other hand, if there's no comment (and you have just "special" comments) you'll spend a lot of time reading a lot of code that has nothing to do with what you're interested in at that moment No matter how clear the function/method name is.

enguerran 🐐💨 • Jul 19 '18

Of course it is easier if comments explain the code.
But what if comments and code become desynchronised?
How about some self-explaining code instructions?

That is the purpose of don't use comments: functions names, variables names, code organization that self-explain its intention.

Jason C. McDonald • Oct 25 '19

I agree with intent-commenting, but I can answer your other two concerns...

But what if comments and code become desynchronised?

Then fix both. My team has used intent-commenting as a mandatory part of the code base for years, and desync nearly always indicates a logic bug. You'll note no one makes this complaint about function names, even though those are just as likely to fall out of sync. They fix the name.

How about some self-explaining code instructions?

Code should always self-explain what it does, but it is virtually impossible for code to explain its own code-agnostic intent.

To Comment Or Not To Comment?

Jason C. McDonald ・ Jan 20 ・ 12 min read

#coding #culture #productivity

ItsASine (Kayla) • May 23 '18

Don't use comments. Instead try to name your functions and variables better.

Comments are perfectly fine if they're doing their job correctly. They should focus on why it was done this way rather than what it does.

Bad code and bad comment:

var t = .24; // tax rate

Good code and good comment:

var taxRate = .24; // tax rate as per the 2018 tax code

The comment expresses that the hardcoded value was set to that, which helps a new developer or future you know why at a glance. They can be more helpful if I actually used functionality as an example rather than variable initialization, but it gets the gist across.

enguerran 🐐💨 • Jul 19 '18

I was reported a wrong result from our accountant. I checked the code: it seems than taxRate has a wrong value, but the comment says it is the tax rate as per the 2018 tax code. I checked on the internet and found that .24 was the tax rate as per the 2017 tax code, the tax rate as per the 2018 tax code is .25. Comment were source of noise.

rhymes • Aug 19 '17

Nice article Maciej!

A couple of things I would like to add:

tests :-) refactoring is definitely easier (and less scary) if you have tests for the code
code linting tools: most programming language and editors can easily integrate configurable linting tools so you can have a common setup for you and your team.

I'm not fluent in Java but in Ruby I always use tools like rubocop that can automatically tell you stuff like "your line is too long", "your method is too long", "hey there are too many parameters" and so on.

Bye!

Maciej Posłuszny • Aug 19 '17

Hey, thanks for your feedback and stopping by!

HawiCaesar • Aug 19 '17

Just what I needed. I was writing a pagination helper and i realised it wasnt following the single responsibility principle. I changed it up. Havent finished writing it. But this validated me today, thanks!!

Maciej Posłuszny • Aug 19 '17

Glad it helped! Cheers :)

Chris Shepherd • Aug 19 '17

"Functions length should be 7-20 lines of code." - shouldn't that be "Maximum 7-20 lines of code?".
Your example functions are not 7 lines long ;)

Ben Sinclair • May 24 '18

Well that could be simplified to "maximum 20 lines of code" if we were being all programmery about it.

Chris Shepherd • May 24 '18

Ahh but then it would all go wrong when someone puts in 0 or -1 lines of code.
and are we only allowing integers....

this is trickier than I thought ;)

Maciej Posłuszny • Aug 19 '17

Of course it should. Thank you :)

Lluís Josep Martínez • Feb 14 '18

I don't understand how do you compute 150 as the maximum lines of a class. It's a magic number! Please provide some scientific proof. Even in the Java API there classes with several hundred lines perfectly accomplishing SOLID principles. About comments I have no words, take any complex algorithm without comments and you won't understand shit.

Casey Brooks • Jul 18 '18

This is amazing. I've been working on a big refactoring/modularization project at work for the last 6 months, and this perfectly captures everything I've learned in this process, and so much more.

One thing I would add: learn your application's lifecycle, and design your APIs around that lifecycle. Rather than having "methodA() calls methodB() which calls methodC()...", recognize the fact that these are 3 sequential steps, and create a "lifecycle method" which does the work of calling them in sequence.

From

function a() { b() }
function b() { c() }
function c() { // so on... }

function a() { }
function b() { }
function c() { }
function onLIfecycleAction() {
    a();
    b();
    c();
}

This will make each method more individually-testable, and also more reusable. You really start to realize why you need this when one method passes parameters to another, because when each method relies on the lifecycle for moving data along, then you have to return those arguments from the method rather than just passing them, and you get more discriminatory about which data is going where in your application.