I've recently been reading Martin Fowler's book "Refactoring: Improving the Design of Existing Code", and while I could (and probably should) praise the book's overall quality and usefulness, there is something from the earlier chapters that particularly resonated with me: the concept of refactoring before starting a task, or 'Prefactoring'.
But why? Well, let us first consider refactoring in and of itself.
Refactoring in a nutshell
Refactoring has a rather bad reputation outside of the engineering department; it's usually seen as work that adds no new functionality, and therefore no value to the product. But those of us within engineering know that's contrary to the truth; just because the end result offers no tangible added value to customers doesn't necessarily mean the time was wasted. Refactoring, generally speaking, is done for two main reasons:
1. Maintaining existing code
We as engineers are by no means perfect, and even if we're not digesting articles and books in our free time, we are still always learning about programming with every class or function that we write. Once our code is written, and no longer just verbose specifications handed to us or ideas in our heads, we tend to interpret it differently and see it more clearly for what it is. Perhaps we can improve the efficiency of an algorithm, or simplify some code that turned out to be overengineered on our part. Perhaps we named something very specifically for our feature work or use case when it in fact has quite a general implementation that could be used by other subsystems. As long as we already have test suite coverage, all of these things and more can be cleaned up very quickly for a happier and healthier code base. For those familiar with the principles of Test Driven Development, this is where the 'Refactor' in "Red; Green; Refactor" comes in.
2. Extending existing code
We write code to fulfill a specification; specifications change over time, and the code changes with them. So as pragmatic programmers, we try to write clean, modular code that can be adapted and substituted as needed. But where is the line between being prudent and helpful to future developers of this code, and straying into 'YAGNI' (you aren't gonna need it) territory? We can only do our best with the information we have at the time to make the right decisions, but rarely do we have enough to be truly confident in them. This is where Prefactoring can shine.
What is Prefactoring?
As I mentioned at the start, while typical refactoring is done after we've written some code and the corresponding tests, prefactoring is (as the name implies) all about doing our refactoring before we start our next task or feature. It could be something done as part of your planning for the next sprint, or something you schedule to include in your sprint's capacity. What makes it so valuable is that it excels in situations where we might struggle with normal refactoring.
While cleaning up code as soon as we've written it is the best thing to do, it's not an avenue that is always available to us. Sometimes we're in a rush to finish a sprint, and we can't afford the time to clean everything up. On other occasions we may simply not know that we can improve on what we have, because we haven't learned a better approach yet. As for extending what we have, by the time we prefactor we know the direction the code is going in, because we're planning on (or are already in the process of) changing it for that very purpose. This eliminates much of the concern about overengineering what we aren't going to need. Additionally, although I might be pointing out the obvious, less code is easier to refactor than more code, so doing this before adding new functionality saves time overall!
An example
Let's start with a very basic deployment script that takes in a given (constant) set of files and processes them before deploying. Initially, we could have code that looks like the following:
web_files = [
    "my_page.html",
    "my_scripts.js",
    "another_page.html",
    "my_styles.css",
    "more_scripts.js",
    …
]

for web_file in web_files:
    with open(web_file, 'r') as file:
        if web_file.endswith(".html"):
            # do special HTML processing here
        # do typical file processing here
While maybe not the prettiest of solutions, it certainly does the job asked of it. But now let's introduce a change to the spec - we need to minify the JavaScript files when we process them.
Naturally, we could just add another if condition and be done with it. There's nothing inherently wrong with that approach, and depending on the particular situation it might be the most pragmatic one, but for the sake of this example let's pretend that there could already be many if conditions, or maybe we know that minifying .js files is just the first of many changes to come. Let's prefactor!
class FileProcessor:
    def __init__(self, file_name):
        self.file_name = file_name

    def process(self, file):
        # do typical file processing here

class HtmlFileProcessor(FileProcessor):
    def process(self, file):
        # do special HTML file processing here
        super().process(file)
By introducing the strategy pattern here, we can now simplify the original orchestration:
web_file_processors = [
    HtmlFileProcessor("my_page.html"),
    FileProcessor("my_scripts.js"),
    HtmlFileProcessor("another_page.html"),
    FileProcessor("my_styles.css"),
    FileProcessor("more_scripts.js"),
    …
]

for web_file_processor in web_file_processors:
    with open(web_file_processor.file_name, 'r') as file:
        web_file_processor.process(file)
What have we achieved here? From a product perspective, arguably nothing - no new functionality has been added. But from an engineering perspective, I would argue a lot of value has been gained! Primarily, we have separated out our processing logic from our orchestration logic. The orchestration can now live happily on its own with simpler unit tests and be closed for modification. On the other hand, our FileProcessor class is very much open for extension - as per the spec change we can add a JsFileProcessor that introduces our required minification, as well as any other strategies we may require in the future!
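For illustration, here's a minimal sketch of what that extension could look like. The processing bodies are toy stand-ins (the comments above leave the real logic unspecified), and for simplicity the sketch operates on text rather than an open file handle:

```python
class FileProcessor:
    def __init__(self, file_name):
        self.file_name = file_name

    def process(self, text):
        # stand-in for "typical file processing"
        return text.strip()


class JsFileProcessor(FileProcessor):
    def process(self, text):
        # toy minification: drop blank lines and indentation; a real
        # implementation would delegate to a proper JavaScript minifier
        lines = (line.strip() for line in text.splitlines())
        return super().process("".join(line for line in lines if line))
```

Note that the orchestration loop doesn't change at all - a JsFileProcessor("my_scripts.js") entry in the list is all the spec change requires.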
Suggested guidelines for prefactoring
As with refactoring, prefactoring should follow similar guidelines:
Prefactor with purpose! While our prefactoring is not explicitly adding value to the product, it should always be adding value to the code base, and we achieve this by focusing on making the code 'cleaner' than when we started. If the result is going to be harder to extend or maintain, then we should reconsider our approach.
Test coverage is essential. As one of my former Leads once told me, "refactoring without tests is no more than hopeful guesswork."
Do not add any new features or functionality as part of your prefactoring. Firstly, this misses the whole point of prefactoring as a separate, self-contained, and reviewable step. Secondly, it is more prone to accidentally introducing mistakes or regressions and will make reviewing the code more difficult. Ultimately, depending on how your tests are laid out, you should need only superficial changes at most to your test suite to keep everything passing.
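As a sketch of what that safety net might look like for the deployment example, with toy processing bodies standing in for the real logic, a few characterization tests could pin down current behaviour before any prefactoring begins:

```python
class FileProcessor:
    def __init__(self, file_name):
        self.file_name = file_name

    def process(self, text):
        # stand-in for "typical file processing"
        return text.strip()


class HtmlFileProcessor(FileProcessor):
    def process(self, text):
        # stand-in for "special HTML processing"
        return "<!DOCTYPE html>\n" + super().process(text)


# Characterization tests: if the prefactoring only moves code around,
# this suite should pass unchanged both before and after the rewrite.
assert FileProcessor("my_styles.css").process(" body {} ") == "body {}"
assert HtmlFileProcessor("my_page.html").process(" <p>hi</p> ") == "<!DOCTYPE html>\n<p>hi</p>"
```

If a prefactoring step forces sweeping rewrites of tests like these, that's a hint it changed behaviour rather than just structure.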
In summary
As those who have worked with me already know, I'm a big fan of refactoring because it challenges our engineering skills to get the best out of our code. To me, Prefactoring is a natural and wonderful extension of that. But what are your thoughts? Have you tried Prefactoring (under this name or another), or would you consider giving it a try? What other benefits or drawbacks should we take into account? Feel free to share your thoughts in the comments!