So you're going to start fixing some of your technical debt? Great! Just remember the one cardinal rule: Don't break the application.
Think about it - when we get the chance to pay down technical debt, the business is paying an opportunity cost for us to refactor code instead of building new features or even fixing existing bugs. That's okay and necessary, and there are ways of telling the business how important this activity is, but the one thing you can't do is this:
When paying down technical debt, you can't cause new bugs.
This is the equivalent of the medical Hippocratic Oath (first, do no harm). Sure, you're performing surgery on code instead of a live patient, but the principle is the same. If every time the business lets you make code better you cause user-facing defects, the business won't give you many more chances at code cleanup in the future.
So, how do we do the vitally needed "code surgery" while minimizing risk to the end users?
As it turns out, I have some ideas.
A while ago I gave a talk on .NET unit testing techniques, and while preparing I found that my ideas for improving a codebase tend to sort themselves into five buckets or phases that occur roughly sequentially:
- Build a Safety Net
- Improve Testability
- Improve your Tests
- Refactor the Code
- Expand your Tests
By going through this process repeatedly during code cleanup, you can progressively make your code better while minimizing risk, earning you greater trust and latitude in the future as well as cleaner, more maintainable code.
Build a Safety Net
The number one question I ask when getting ready to change code is this: do I have tests or processes in place that will catch a mistake if something gets past me?
This safety net could be unit tests, regression testing plans, or even such trivial things as the compiler or code analysis suite.
If you don't have a safety net that covers the major use cases of your application, you need to create one before you change anything.
Important Note: If you are unsure you adequately understand all cases in which the application is being used, you're likely to miss necessary testing scenarios. I recommend that you check with product management and look over Google Analytics or other logging metrics to get a full sense of the various types of requests your system deals with.
Snapshot Testing
When creating a safety net, you're looking for "pinning tests" that pin the current behavior of the system (right or wrong) in place so that you're explicitly aware of any change you're making to code behavior.
The number one technique I recommend for quickly creating such a test is the use of snapshot testing.
Snapshot testing compares an object against a previously saved snapshot and fails the test if any change is detected. If you'd like more specifics, see my article on using Jest for JavaScript / TypeScript pinning tests, as well as Snapper for .NET tests.
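To make the idea concrete, here's a minimal hand-rolled pinning test in C# with NUnit. It's a sketch of the concept rather than the Jest or Snapper APIs, and the QuoteGenerator class is a hypothetical stand-in for whatever legacy logic you want to pin:

```csharp
using System.IO;
using System.Text.Json;
using NUnit.Framework;

public class QuotePinningTests
{
    [Test]
    public void GenerateQuote_MatchesPinnedSnapshot()
    {
        // Exercise the legacy code exactly as it behaves today (right or wrong).
        // QuoteGenerator is a hypothetical class standing in for your real code.
        var quote = new QuoteGenerator().GenerateQuote(customerId: 42);

        // Serialize the whole result so a change in any field is detected
        string actual = JsonSerializer.Serialize(quote,
            new JsonSerializerOptions { WriteIndented = true });

        string snapshotPath = Path.Combine(
            TestContext.CurrentContext.TestDirectory, "Snapshots", "quote_customer42.json");

        if (!File.Exists(snapshotPath))
        {
            // First run: pin the current behavior by writing the snapshot file
            Directory.CreateDirectory(Path.GetDirectoryName(snapshotPath)!);
            File.WriteAllText(snapshotPath, actual);
            Assert.Inconclusive("Snapshot created; re-run the test to verify against it.");
        }

        // Subsequent runs: any difference means behavior has changed
        Assert.That(actual, Is.EqualTo(File.ReadAllText(snapshotPath)));
    }
}
```

Libraries like Jest and Snapper handle the snapshot storage and diffing for you, but even a sketch like this gives you a net that catches unintended behavior changes.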
UI Testing
UI testing involves automating interaction with an application's user interface. It can be applied to web, desktop, and mobile applications and lets you automatically exercise a system from the end user's perspective. The tools and techniques vary by language and platform, though Selenium is an extremely popular option for web testing.
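Here's a rough sketch of what that looks like in C# with Selenium WebDriver and NUnit; the URL and element IDs are made up for illustration:

```csharp
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public class LoginUiTests
{
    [Test]
    public void Login_WithValidCredentials_ShowsDashboard()
    {
        // Drives a real browser the same way an end user would
        using var driver = new ChromeDriver();

        driver.Navigate().GoToUrl("https://localhost:5001/login");     // hypothetical URL
        driver.FindElement(By.Id("username")).SendKeys("testuser");    // hypothetical element IDs
        driver.FindElement(By.Id("password")).SendKeys("testpassword");
        driver.FindElement(By.Id("login-button")).Click();

        // Pin the externally visible outcome rather than implementation details
        Assert.That(driver.Title, Does.Contain("Dashboard"));
    }
}
```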
Fiddler
If your code makes HTTP calls to other systems, it can be handy to look at the frequency and content of those calls. Fiddler acts as a proxy, intercepting outbound traffic and allowing you to inspect each call and its response. This way you can monitor the exact details of the calls your application makes to determine whether its behavior is changing.
Postman
On the other side of API development, you can use a tool like Postman to make calls to your application (or any other) and inspect the response. On top of that, Postman has features to run sets of calls in a group or on a schedule and to make basic assertions about the return values of REST calls. This lets non-developers use Postman to write a suite of integration tests.
Regression Testing
I'd be remiss if I didn't mention actual regression tests or smoke tests that the team can conduct on the application to verify that it still behaves as expected.
Improve Testability
Now that we have some basic end-to-end tests to verify we're not going to accidentally change behavior, let's make aspects of the application easier to test in isolation via unit tests.
Reduce Tight Coupling
A common issue is code that is tightly coupled to databases, web services, the file system, or other external resources that are irrelevant to the logic you actually want to test.
Instead of talking to these resources directly, you can encapsulate the calls behind a new interface. That interface can then be provided to your class or method, so your code can be tested against a mock implementation of that interface.
Doing so makes it much easier to write tests that exercise only the logic in question and isolate the actual code under test.
This process is called inversion of control (IoC) or dependency injection because, instead of a class being tightly coupled to what it calls, the dependency is injected into the class via a parameter, property, or constructor argument.
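As a quick sketch in C# (the IOrderRepository and InvoiceService names are hypothetical), the database dependency becomes an interface injected through the constructor:

```csharp
// The database dependency is hidden behind an interface...
public interface IOrderRepository
{
    decimal GetOrderTotal(int orderId);
}

// ...and injected into the class that holds the logic we want to test
public class InvoiceService
{
    private readonly IOrderRepository _orders;

    public InvoiceService(IOrderRepository orders)
    {
        _orders = orders; // a SQL implementation in production, a mock or fake in tests
    }

    public decimal CalculateInvoiceAmount(int orderId, decimal taxRate)
    {
        // Pure logic, now testable without a real database
        return _orders.GetOrderTotal(orderId) * (1 + taxRate);
    }
}
```

In a unit test you pass in a hand-rolled fake or a mock from a library such as Moq, so the test exercises only the invoice math.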
Working Around Database Limitations
If you want to test against the database, or if separating out the database calls is too difficult to do in one go, you may need some special setup logic for the database.
Specifically, before every test runs, you can check whether the expected records exist in the database and insert them if they do not. I generally view this as unnecessary and an anti-pattern due to the cumbersome test setup code you now have to maintain, but it is a solution.
The reason you want to run this logic before tests instead of afterwards is that if a test fails, the data may be left in an invalid state and the next run will fail even if your code is properly implemented.
.NET has some interesting capabilities built into Entity Framework that allow you to work with an in-memory database. This lets your code still work with database contexts for the sake of testing, while the contents of that database are entirely under your control at test time.
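Here's a minimal sketch of that approach using Entity Framework Core's in-memory provider (the Microsoft.EntityFrameworkCore.InMemory package); the Order entity and AppDbContext are hypothetical examples:

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;
using NUnit.Framework;

public class Order
{
    public int Id { get; set; }
    public bool IsOpen { get; set; }
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Order> Orders => Set<Order>();
}

public class OrderQueryTests
{
    [Test]
    public void OpenOrders_ReturnsOnlyOpenOrders()
    {
        // The in-memory provider stands in for the real database
        var options = new DbContextOptionsBuilder<AppDbContext>()
            .UseInMemoryDatabase(databaseName: "OpenOrdersTest")
            .Options;

        // Arrange: the test fully controls the database contents
        using (var context = new AppDbContext(options))
        {
            context.Orders.Add(new Order { Id = 1, IsOpen = true });
            context.Orders.Add(new Order { Id = 2, IsOpen = false });
            context.SaveChanges();
        }

        // Act / Assert: a fresh context sharing the same named in-memory database
        using (var context = new AppDbContext(options))
        {
            var openOrders = context.Orders.Where(o => o.IsOpen).ToList();
            Assert.That(openOrders, Has.Count.EqualTo(1));
        }
    }
}
```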
Improve your Tests
As you grow more tests, it's important to make sure those tests are maintainable, clear, easy to read, and effective.
Since the focus of this article is on reducing risk when making changes and not on cleaning up test code, we'll skip over this, but if you're curious about what progressively improving unit tests looks like, I recommend you read my article on refactoring C# unit tests.
Refactor the Code
Now that you have unit tests that have an adequate degree of coverage and simplicity, you should feel comfortable making changes. If you don't feel comfortable, you likely need to go back and add more varied tests or consult with a senior team member until you can confidently state that your risk mitigation plan adequately covers the work you need to do.
I'm not going to talk about the particular mechanics of altering code, as these vary based on the projects, languages, tools, and technologies involved, as well as the flavor of technical debt you're working to resolve.
What I will share is some general ideas on reducing the risk when actually making your changes as well as some ideas for releasing your changes in ways that reduce risk to the end user.
Use Intelligent Tooling
Most modern development environments have refactoring tools baked into them. These offer common operations like extracting a variable, pulling a method up to a base class (or interface definition), extracting a new method from a larger method, etc.
I strongly suggest you find tools you trust that can automate these processes, because they're a lot less likely to make a mistake than you are, and getting distracted or interrupted is not a weakness they suffer from.
I personally love the ReSharper Visual Studio extension and the JetBrains suite of products commonly referred to as the IntelliJ family of editors (though they serve many languages), but you need to find tools that you personally trust and get value from.
Feature Flags
When deploying refactored code, one of the things you can potentially do is keep a backup version of the logic around and use a feature flag to route your application to either the old or the new implementation.
Typically I start with the application routing all traffic to the new version of the refactored logic, but in an emergency I can set a configuration entry somewhere and the application will revert to the legacy implementation without my having to modify or deploy any code.
Once the code has been verified as working as expected, you can remove the feature flag and the legacy implementation from a future release.
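A minimal sketch of what that flag might look like in C#, assuming a hypothetical PricingService and a configuration key named Features:UseLegacyPricing:

```csharp
using Microsoft.Extensions.Configuration;

public class PricingService
{
    private readonly IConfiguration _config;

    public PricingService(IConfiguration config) => _config = config;

    public decimal CalculatePrice(decimal basePrice, int quantity)
    {
        // The flag lives in configuration, so an emergency rollback is a
        // settings change rather than a code change and redeploy
        bool useLegacy = _config.GetValue<bool>("Features:UseLegacyPricing");

        return useLegacy
            ? CalculatePriceLegacy(basePrice, quantity)       // proven implementation, kept as a backup
            : CalculatePriceRefactored(basePrice, quantity);  // refactored implementation, on by default
    }

    private decimal CalculatePriceLegacy(decimal basePrice, int quantity)
        => basePrice * quantity; // existing logic, left untouched until the flag is retired

    private decimal CalculatePriceRefactored(decimal basePrice, int quantity)
        => basePrice * quantity; // the newly refactored logic
}
```

Dedicated feature flag libraries can manage this for you, but even a plain configuration check buys you the rollback safety described above.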
Scientist
Scientist is another technique I use to reduce risk when releasing refactored code. Similar to the feature flag approach above, you keep both the old and new implementations of an algorithm.
What's different about Scientist is that it always returns the legacy version's result to the caller, but it compares the old and new implementations' results to see if they differ. If they do, Scientist lets you log the discrepancy wherever you want.
The advantage of this approach is that you get to find differences between the new version and the old version without risking exposing your users to defects in unproven new logic.
See my article, "Victimless Canary Testing with Scientist .NET" (Aug 31 '19), for more information.
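Here's a rough sketch of how such an experiment looks with the Scientist .NET package; the pricing methods are hypothetical, and it's worth double-checking the exact API against the package's current documentation:

```csharp
using GitHub; // the Scientist .NET package

public class PricingExperiment
{
    public decimal CalculatePrice(decimal basePrice, int quantity)
    {
        // The caller always gets the result of Use() (the legacy code).
        // Try() runs the refactored version, and any mismatch between the two
        // is reported to whatever result publisher you have configured.
        return Scientist.Science<decimal>("calculate-price", experiment =>
        {
            experiment.Use(() => CalculatePriceLegacy(basePrice, quantity));
            experiment.Try(() => CalculatePriceRefactored(basePrice, quantity));
        });
    }

    private decimal CalculatePriceLegacy(decimal basePrice, int quantity)
        => basePrice * quantity; // current production logic

    private decimal CalculatePriceRefactored(decimal basePrice, int quantity)
        => basePrice * quantity; // candidate replacement under observation
}
```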
Expand your Tests
Now that you're comfortable and equipped to start paying down technical debt, you're likely noticing places where testing is weak. Take this opportunity to expand your tests in the cleanest way possible, as it will only make your life easier.
Here are some quick high-level ideas for expanding tests:
Parameterized Unit Tests
Many unit testing frameworks allow you to write a single test method that takes in parameters from a number of individual test cases, as in the sketch below. This keeps the test code simple and maintainable.
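For example, with NUnit's TestCase attribute, each attribute becomes its own test case run through the same method (the DiscountCalculator class here is hypothetical):

```csharp
using NUnit.Framework;

public class DiscountCalculatorTests
{
    // One test method, many pinned cases; adding a case is a one-line change
    [TestCase(1, 0)]
    [TestCase(5, 5)]
    [TestCase(10, 10)]
    [TestCase(100, 15)]
    public void GetDiscountPercent_ReturnsExpectedPercent(int itemCount, int expectedPercent)
    {
        var calculator = new DiscountCalculator(); // hypothetical class under test

        Assert.That(calculator.GetDiscountPercent(itemCount), Is.EqualTo(expectedPercent));
    }
}
```

xUnit's [Theory] / [InlineData] and MSTest's [DataRow] offer the same capability.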
Require Tests during Code Review
While not a code change, one of the things I find helpful is to challenge my team members during code review and insist that changes carry the necessary unit tests when it makes sense to do so.
Providing Draft Test Plans to QA
Another thing I like to do during code review is to review a draft test plan that the developer has written up. This could be anything from a quick sentence on how to test the feature to a formal record in a test case tracking system.
The idea is that quality assurance can take these suggested test cases, accept or reject them, add the cases the developer didn't identify, and tidy up the test plan. This helps QA get oriented around what the new feature is (by having a draft test case) and helps developers think more like testers by giving them feedback from QA on their test plans.
Behavior Driven Development
Finally, behavior driven development (BDD) tools and workflows exist to enable product owners and quality assurance members to write tests in plain English and to let developers code special matching functions that interpret those statements and execute the tests.
The advantage of this is that you now have a wider variety of people writing test cases for your application, helping overall quality and making it clear that quality and testing are not just development's or QA's responsibility, but the entire team's.
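Tools vary here, but as one .NET example, SpecFlow maps plain-English Gherkin steps to C# binding methods like the hypothetical sketch below (the scenario and DiscountCalculator class are made up for illustration):

```csharp
using NUnit.Framework;
using TechTalk.SpecFlow;

// Matches a plain-English scenario a product owner or QA member could write:
//   Scenario: Bulk orders earn a discount
//     Given a cart with 10 items
//     When the discount is calculated
//     Then the discount should be 10 percent
[Binding]
public class DiscountSteps
{
    private int _itemCount;
    private int _discountPercent;

    [Given(@"a cart with (\d+) items")]
    public void GivenACartWithItems(int itemCount) => _itemCount = itemCount;

    [When(@"the discount is calculated")]
    public void WhenTheDiscountIsCalculated() =>
        _discountPercent = new DiscountCalculator().GetDiscountPercent(_itemCount); // hypothetical class

    [Then(@"the discount should be (\d+) percent")]
    public void ThenTheDiscountShouldBePercent(int expectedPercent) =>
        Assert.That(_discountPercent, Is.EqualTo(expectedPercent));
}
```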
Closing Thoughts
I hope this article has given you some new things to consider in preserving application quality during refactoring and beyond.
If you have a way of handling technical debt safely that I don't mention above, please let me know what that is in the comments.
Top comments (3)
Regarding the "Scientist" method: this really only works for routines that do not change state. If the new and legacy routines use the same underlying data, and the data is mutated as part of the routine, you run into problems.
Now you can argue for not sharing the data beneath these two implementations, but then you might have data-access problems that you wouldn't catch.
Ideally, you'd need a shared store with snapshot query capabilities, such that the legacy code can run but a snapshot of the data is created before it is mutated. Afterwards, the new routine runs on the snapshot, with all mutations resulting in a "dry run" that is discarded after the routine is done. Then you can compare the results. But this is of course much more work, and it might not be feasible for certain data storages or, as always, given deadlines.
You're spot on with all of this. I've been meaning to do a follow-on article for Scientist that explores this more, both for data modification and for methods with external impact like e-mail sending or external API calls.
You can use Scientist, but you have to structure your routines in such a way that they're essentially pure.
One more idea that just came to my mind is to have a complete copy of the data, which would enable you to verify afterwards whether the new routine correctly mutates the data as well as returning the expected result.
So you could run the legacy and new routines side by side for a couple of days, then diff the two databases and see where the new code creates different data. Some of the changes might be within tolerances, for things like timestamps that aren't identical down to the microsecond, but for many fields this would probably work quite well. Just a thought.