Split-test Your Code
When you split-test a sales page or landing page, you have a "control" piece and a set of alternatives.
The winner becomes the new control based on how it performs against a set of criteria.
The best sales page is not the prettiest, artsiest, fanciest, most colorful, or most linguistically eloquent page. The best page is the one that converts the most, and is thus the most profitable.
If the ugly page converts best, that's the one you use. Even if it violates all the so-called "best practices", it is still the one you use.
However, there is more to your criteria set than just conversions, and the criteria must be prioritized.
So here is a criteria set in order of prioritization...
- legal
- non-deceptive
- moral
- highest conversion rate
- aesthetics
If it's not legal, your conversion rate won't matter much, because you likely won't be in business for long.
If it's deceptive, you may get high conversions, but you'll get chargebacks down the road.
If it's immoral, you won't want that reflected in your branding or business reputation. This is a relative judgment, and thus depends on your industry, your personal values, your target customers, etc.
The above being equal, you choose your best performer.
Finally, all things listed above being equal, you choose the best aesthetics; i.e. beauty, simplicity, performance (like load speed).
Programming is split-testing
Software architecture is essentially the application of split-testing.
The code you use is the "control".
The code that doesn't make the cut is your set of alternatives.
The criteria for determining which code becomes the control, in order of priority, are...
- natural language documentation
- security
- robustness
- performance
- code readability
Notice what's not in the list...
- proper
- best practices
- what the cool kids are using
- a particular paradigm, like OOP (Object-Oriented Programming) or FP (Functional Programming)
Let's start with performance.
All things being equal, I want what's most performant: what uses the fewest cycles and the least memory.
However, if my fastest candidate is not robust, the second fastest becomes the control.
If the fastest candidate is also robust but has security concerns, the second-place alternative becomes the control.
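Here is a minimal sketch of that trade-off in TypeScript. The truncation functions are hypothetical examples, chosen only to illustrate a fast-but-fragile candidate losing to a robust one:

```typescript
// Candidate A: double bitwise NOT. Often the fastest way to truncate,
// but it coerces through 32-bit integers, so it silently corrupts
// values outside the 32-bit range. It fails the robustness criterion.
const truncFast = (x: number): number => ~~x;

// Candidate B: slightly slower, but correct across the full double range.
const truncSafe = (x: number): number => Math.trunc(x);

console.log(truncFast(2 ** 31 + 0.5)); // -2147483648 (wrong)
console.log(truncSafe(2 ** 31 + 0.5)); //  2147483648 (correct)
// Unless inputs are guaranteed to fit in 32 bits, truncSafe becomes the control.
```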
At the top we have "natural language documentation". This one is a "freebie": every candidate being split-tested can have it, and it gives you the desired software-architectural attributes like grokability and changeability.
This is important, because your control might be a gnarly, unreadable, but highly performant, robust, battle-tested, tightly secured piece of code.
But if the next developer just sees the gnarliness, they may not understand it and may try to simplify it or rewrite it without considering the full criteria set.
You need an official policy defining the criteria that determine which code becomes the control.
When you perf-test, robust-test, fuzz-test, load-test, and pen-test your code, keep that gathered intelligence close to the code itself. Explain why you're using a certain variant. Explain the intent of the code, its raison d'être (reason for existing).
If you don't, how are you going to remember what was performant or not?
If you don't document it, it's all for naught.
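Continuing the hypothetical truncation example from above, that documentation might look like this (the details in the comment are placeholders, not real benchmark results):

```typescript
/**
 * Truncates a pixel offset toward zero using double bitwise NOT.
 *
 * Why this variant is the control (split-test notes; details are illustrative):
 * - Fastest of the three candidates tested (others: Math.trunc,
 *   Math.floor with a sign branch) in the hot render loop.
 * - Inputs are guaranteed 16-bit offsets, so the 32-bit limit of
 *   the bitwise coercion is not a robustness concern here.
 *
 * Do not "simplify" this to Math.trunc without re-running the perf tests.
 */
const truncPixel = (x: number): number => ~~x;
```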
Finally, "code readability" is like the "aesthetics" of the code. All the above criteria being equal, you're going to choose the simplest, most readable code.
Tips and Caveats
This criteria set applies to algorithms and code that can be wrapped in a function. Evaluating frameworks and subframeworks (like Redux, for example) and third-party libraries involves a slightly more complex criteria set. (I'll explain those in a future article; please follow me if you're interested.)
Make sure the performance gain is statistically significant versus the alternatives. The margin of error is likely going to be about 5%.
The absolute time-to-run is not what's most important; its time relative to the alternatives is what you want to watch for.
Test with both small inputs and very large inputs. Some algorithms run fast with small inputs but slower with larger inputs, and vice versa.
Run the perf-tests separately from each other (to prevent the compiler from optimizing one and not the other).
Run the perf-tests a few times and keep the best score (because that's how well it can potentially perform).
If you change the code, retest to see how it was impacted, then update your "best score".
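Pulling those tips together, here is a minimal best-of-N harness sketch in TypeScript for Node.js. The names (bench, sumLoop, sumReduce) and input sizes are illustrative, and for brevity both candidates run in one process; per the tip above, separate processes give cleaner numbers:

```typescript
// A best-of-N micro-benchmark sketch. Keeps the best score per candidate
// and compares relative times across small and large inputs.
const bench = (label: string, fn: () => void, runs = 5): number => {
  let best = Infinity;
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    best = Math.min(best, performance.now() - start); // keep the best score
  }
  console.log(`${label}: ${best.toFixed(3)} ms (best of ${runs})`);
  return best;
};

const sumLoop = (xs: number[]): number => {
  let s = 0;
  for (let i = 0; i < xs.length; i++) s += xs[i];
  return s;
};
const sumReduce = (xs: number[]): number => xs.reduce((s, x) => s + x, 0);

// Test with small and very large inputs; rankings can flip with size.
for (const size of [1_000, 1_000_000]) {
  const data = Array.from({ length: size }, () => Math.random());
  const a = bench(`sumLoop   n=${size}`, () => void sumLoop(data));
  const b = bench(`sumReduce n=${size}`, () => void sumReduce(data));
  // Watch the relative difference, not the absolute time-to-run.
  console.log(`relative difference: ${(((b - a) / a) * 100).toFixed(1)}%`);
}
```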
Consider where the code runs, for example, in a...
- server
- serverless function
- desktop browser
- mobile browser
- mobile app
Consider how often it runs. For example, if it's a utility that is used all over your code base, then that is more impactful than a function that is used only in an offline cron job that runs once a week.
There are exceptions to the rules. There are management concerns; there is the ability to sell new approaches to your company and/or colleagues; and there is the lifecycle of the code base (will it be rewritten soon, or will there be a switch of frameworks, languages, or paradigms?).
What's next?
To keep this article from getting too long, I will post another article with examples. Please follow if you're interested.
If you have ideas, feedback, etc. please comment. Together we can all learn more.
P.S.
All of life's decisions are kind of about split-testing your alternatives, aren't they?