My regular readers (both of them) may notice that my previous article basically disavowed the idea of the "code smell". The term is too often a smug, arrogant, dismissive mantra that serves no meaningful purpose. So why is my very next article dedicated to, what I believe is, an actual code smell?? Well, on one level, I'm more than willing to troll myself from time to time. But on another level, there is really only one aspect of programming that I routinely view as a "code smell": Comments.
You may have heard others parrot the phrase "comments are a code smell". I certainly didn't come up with it and can claim no credit in its creation. But I have come to (almost always) agree with it. And the reason I agree with it really speaks to much deeper ideals about how to code.
Programmer Evolution
Most of us have followed a similar path to "enlightenment". It tends to look something like this:
1. Whatever Works
This whole coding thing is brand new. We don't really have any clue what we're doing. We're usually baffled when our code doesn't work. Even when it does work, we're often somewhat confused about why it works. We have no experience with any aspects of "professional" software development.
In this first stage of evolution, our code may look something like this:
const currSC = () => {
const db = connect('rt', 'kwqw92klad92;wqed');
return db.get('amt').from('sc').where('sess = ' + sess.Id);
}
const st = (amt) => {
return amt * 1.07;
}
let tot = currSC();
tot = st(tot);
At this stage in our evolution, there is only one standard of "quality": Does it work?? And if anyone tries to point out that some of this might be a biiiiiit obtuse, their objections sound rather silly to us. The code is clearly, obviously, self-explanatory. So why would we waste time with silly comments??
Inevitably, we end up having to revisit our own code. It may be months later. Heck, it may only be weeks later. But at some point, we're staring at our own logic - and it no longer seems so... logical.
2. Comments To The Rescue(?)
This is when we first start getting some of that comment religion. The abbreviations that felt so natural - only a short while ago - have now left our mind. But there's a "fix" for this! The how-to-code gurus tell us to "Use Comments!"
Now that we've seen the downside of writing unintelligible code, the Copious Comment Strategy makes sooooo much more sense. And all of the talking heads hail it as the mark of a good programmer. So... we dive in. Head friggin first. And we typically end up with something like this:
// gets the current shopping cart total, based on the user's session ID
// returns a number for the current total
const currSC = () => {
// connect to database
const db = connect('rt', 'kwqw92klad92;wqed');
// find any saved shopping cart total in the 'amt' column in the 'sc' table based on session ID
return db.get('amt').from('sc').where('sess = ' + sess.Id);
}
// calculate sales tax
const st = (amt) => {
return amt * 1.07;
}
// retrieve current total
let tot = currSC();
// add sales tax
tot = st(tot);
The pseudo-code above isn't... wrong. I consider it to be somewhat amateurish. And I wouldn't hire someone on my team who wrote code like this. But this approach is at least "functional".
Writing "functional" code is certainly not a bad thing. The bad thing is that far too many programmers reach this second stage of evolution - and they just stop.
Seriously.
If you never evolve past step 1, you'll probably never get hired to do software development. But I've met plenty of guys who've been in this game for many years - and yet, they've never advanced past this level of evolution.
3. Removing The Translator
The problem here is that farrrrr too many coders come to lean on their comments as though they are "translators". The thinking goes like this:
- The code I've written is hard to understand.
- So I'll add a bunch of comments to "translate" the code for anyone who reads the code after me.
I sincerely hope you recognize the horrific flaw in that logic. Despite the connotation of the word "code", the simple fact is that most algorithms needn't be ridiculously difficult to understand.
My girlfriend is incredibly intelligent. But she's never been, nor has she ever tried to be, a programmer. Despite this "difference" between us, there are times when I'm aggravated about a particular issue that I'm trying to solve in my code, and she ask asks me what's wrong. So... I show her the code.
And here's the thing: When she looks at my code, she generally understands the underlying concept that I'm trying to solve/accomplish/develop. Even though she's never coded a day in her life. Sure, she doesn't know all the specifics of the language constructs I'm using. But at a high level, she can actually read my code - without actually being a coder at all.
Let me be perfectly clear with this. You might be thinking:
Your non-programmer girlfriend understands your code because comments are written in plain language and can be understood by nearly anyone!
Umm... no. Absolutely not. I rarely use comments. I might have a few lines of comments in a few thousand lines of code. So the fact that she can grasp the workings of my code has nothing to do with comments. It has everything to do with the way that I write my code.
The more "evolved" we become, the more attuned we become to the idea of self-documenting code. I can lay no claim to this concept. It's been floating around programming circles for decades. But even now, in the 2020s, I feel that far too many devs have a poor grasp of this concept.
Think of it like this: If I move to your town and I plan to work in your office for a long time to come, and if I speak English but everyone in your office speaks Russian, should I write all my emails in English and then try to add a whole pile of "translation" notes to every message? Or should I just put in the effort to format the message, originally, in a way that anyone after me is likely to understand it? In other words, should I keep blathering in English and lean on some "translator" to make everyone else understand? Or should I make every effort to start writing my messages in Russian??
We can illustrate some of these translation roadblocks. The pseudo-code above has some glaring issues:
- Cryptic abbreviations
- Obtuse function names
- Magic literals
So if we're going to address those issues in the name of self-documenting code, we could do something like this:
// gets the current shopping cart total, based on the user's session ID
// returns a number for the current total
const getTotalFromShoppingCart = () => {
// connect to database
const database = connect(process.env.databaseUser, process.env.databasePassword);
// find any saved shopping cart total in the 'amt' column in the 'sc' table based on session ID
return database.get('amount').from('shoppingcarts').where('sessionid = ' + session.Id);
}
// calculate sales tax
const calculateSalesTax = (amount) => {
return amount * salesTax.florida;
}
// retrieve current total
const currentTotalInShoppingCart = getTotalFromShoppingCart();
// add sales tax
const currentTotalWithTax = calculateSalesTax(currentTotalInShoppingCart);
This is typically where many devs - even seasoned, experienced devs - look at my code and think:
OMG! Your variables are so longggg! Everything's so verbose! There's no way that I want to code like this!
But "long" is a relative term. Yes, I tend to favor full-word-name variables. And no, I don't necessarily expect you to do the same. But I do expect you to write your code in such a way that it's as self-explanatory as it can be.
In the example above, it's true that I've written some long variable names. And some long function names. But I've also written code that is, IMHO, fairly self-explanatory. You can just... read it. In fact, it's so self-explanatory that the comments now seem rather... silly. They just duplicate what you should already be able to understand just by reading the code.
So if we're really going to turn this into some "level 3" code, there's really no need for any of those comments. This means that, even though my code might feel a little "heavier" as you read through the variable names, this "weight" is counterbalanced by the fact that I don't have to chuck everything full of comments in order for you to understand it. This is illustrated by the readable nature of this:
const getTotalFromShoppingCart = () => {
const database = connect(process.env.databaseUser, process.env.databasePassword);
return database.get('amount').from('shoppingcarts').where('sessionid = ' + session.Id);
}
const calculateSalesTax = (amount) => {
return amount * salesTax.florida;
}
const currentTotalInShoppingCart = getTotalFromShoppingCart();
const currentTotalWithTax = calculateSalesTax(currentTotalInShoppingCart);
This is why I feel, very strongly, that comments are, in general, a "code smell". Because nearly all comments that I encounter are trying to explain to me what the code is doing.
I shouldn't have to read any comments to understand what your code is doing. I should be able to understand what your code is doing... by reading the code.
I'm sorry (not sorry) if this sounds a bit harsh. But if I have to read your comments to properly grok your code - then... you've probably written some pretty crappy code. Don't write esoteric code and then rely on your comments to tie it all together. Just write some clearer damn code.
When To Comment
At this point, you may be thinking that I demonize all comments. But that's not the case. There is a time and a place to use comments. But like Regular Expressions, comments should be used as a targeted tool. They're powerful, when used properly. But they can also be a sign of something truly wrong in the code.
Here are a few examples where comments don't suck:
IDE Tooling (e.g., JSDoc)
In modern coding environments, there are many instances where the comments aren't designed so much to speak to other coders. Instead, they're written to speak to the IDE. In other words, comments can help our development tools to "hook everything up" and check the data relationships as we code. That would typically look something like this:
/**
* @returns {number}
*/
const getTotalFromShoppingCart = () => {
const database = connect(process.env.databaseUser, process.env.databasePassword);
return database.get('amount').from('shoppingcarts').where('sessionid = ' + session.Id);
}
/**
* @param {number} amount
* @returns {number}
*/
const calculateSalesTax = (amount) => {
return amount * salesTax.florida;
}
const currentTotalInShoppingCart = getTotalFromShoppingCart();
const currentTotalWithTax = calculateSalesTax(currentTotalInShoppingCart);
I'm not always the biggest fan of JSDoc. But I don't generally have any problem with its use. And I understand the service it provides. So if this is how you're using comments, then... great. As long as this is the approach embraced by your team, then it's certainly a valid use of comments.
[Side Note: Don't be that guy - the one who insists on cramming JSDoc notation into all of his code files, even though the rest of the team has consciously eschewed it. Conversely, don't be that guy that refuses to add JSDoc comment blocks to his functions when the rest of the team has settled on it as a standard.]
Not What, But Why
As I've already tried to explain, if your comments tell me what your code is doing, then it's probably some crappy code. But there's a very different use-case to consider: Sometimes, we need to alert others as to why our code is doing what it's doing.
For example, consider the little pseudo-app we've been working with up to this point. Even after we've made things as "clear" as possible with expressive variable/function names, it can still sometimes be a challenge to understand why the code is doing what it's doing. Specifically, let's look at this function:
const calculateSalesTax = (amount) => {
return amount * salesTax.florida;
}
Seems pretty simple, right? It accepts a given amount
and returns that amount
multiplied by Florida's sales tax. But for a sharp coder, this begs a question:
Why have we hardcoded a sales tax number that is only specific to Florida???
If you're accustomed to building hardy, scaleable apps, this bit of hardcoded logic seems... off. In fact, it's the kind of "oversight" that might lead me to spontaneously refactor the code to utilize a lookupSalesTaxRate()
function - even if my current task didn't call for it. This is where a well-placed comment can do a world of good:
const calculateSalesTax = (amount) => {
// The client specifically told us that they only operate in Florida and they have no plans
// for expansion. They understand that we will need to rework this if they ever decide
// to do commerce outside the state of Florida.
return amount * salesTax.florida;
}
IMHO, this is an excellent use of a comment. It basically alerts all future coders that the current logic is not an oversight and should not be changed, unless that change is driven by the client.
To put this in different terms, we can already tell, by simply reading the code, what it's doing. But it's not clear, from the code, exactly why it's doing that. In this case, the comment can be extremely useful.
Conclusion
Last year, I worked in a shop that had PHP code like this:
//foreach
foreach ($items as $index => $item) {
}
I wish I was making that up, but I'm not even kidding. They had comments above basic language constructs, ostensibly to alert you to the fact that those basic language constructs were there. It was no different than putting a STOP AHEAD
sign one meter in front of the STOP
sign. It was one of the dumbest damn things I've ever seen in a codebase (and that's saying a lot).
Don't be "that guy". Be smart about how you use your comments. And, more often than not... don't use them. At all.
Top comments (41)
I disagree. Comment itself is not a "code smell", its misuse is.
There are situations where leaving a comment is a necessity.
Comments are not meant to describe things, but to clarify certain statements and leave notes, regarding use of certain constructions and tools.
For example :
I disagree with the above samples. I hope I don't go too much into detail :)
On first one I'd remove the comment and write a test which would ensure correct output. If later on somebody gets an idea to remove the
value === null
because it is "not needed" the test will break and they'll see why.Second one should be
userInput % 360
. To me the comment tells what the code does (modulo) so there is no need to reiterate it, and I would remove the comment upon seeing it. It also lies because the code doesn't do anything to check against negative values, so the comment could give a false feeling of correctness. Tests would ensure correctness.For the third one I'd add emphasis that it should tell exactly which 3rd party module is in question, and also suggest a short comment on why the bug has to be worked around the way it is (especially if the code for it isn't obvious). Reason: I've seen people use these "due to bug in 3rd party..." comments as-is and they're not useful. Of course the link is the meat, but because the Internet cannot guarantee links to work forever I'd always also add the minimum necessary required for understanding into the comment.
I agree, tests will prevent things from being removed without notice, but having one line right by the function would (hopefully) prevent the "refactoring" altogether.
Good point on modulus, bad example.
It was implied that the comment would state the name of the package or in the worst case link to GitHub issue will.
Bottom line, If you're not sure if you should leave the comment - you should.
It's a text, it's a core feature of every single language and it's meant to be used, not feared.
Maybe I've had bad experiences or bad luck, but I've noticed comments to be rather ineffective against "refactoring". You need only one person with too strong an opinion and less experience who successfully ignores what the comment warns or informs about, and goes ahead changing and breaking things.
Code reviews don't perfectly protect against this as those may get through by another who doesn't stop to care about the comment, possibly because it doesn't appear in the diff. Only tests seem to successfully protect against this, at least against breaking stuff. You'd have to go for slightly malicious mindset to start removing or changing valid tests. These "refactors" are often done thinking it is an improvement, with no malicious intent.
This is one reason why I keep comments as the absolute last tool to convey understanding that the code or naming things can't provide.
I'm pretty sure that this is exactly what I described in the article.
Please forgive me if I misinterpreted your article.
It seemed to me, that your article main idea was "don't use comments unless you're absolutely sure", opposite to "use
commentscode wisely".Descriptive, long variable names can be good. However you should still be cautious with length, repetition and context.
If the file already defines context you don't need to repeat that context in each name especially if those variables are not exported.
If you have multiple variables that have repetition of the same things you may want to consider changing the names for visual clarity to make emphasis on how they are different. The more similarity there are between two variable names, and the longer they are, the harder it becomes for human mind to keep track of them.
I've seen people do things like this.
I would say
currentTotalInShoppingCart
is roughly the maximum length you should usually have for a variable name. There are OK exceptions. With this particular case I'd probably remove the wordcurrent
for quicker distinction of differences, but even that decision depends on other variables in the file.There are no absolute correct answers, but you can do your best to make the code reading process effortless.
Agreed on all fronts. As I stated in my article, I strongly prefer full-word-name variables. What I didn't explain in the article is that this often leads me to think - deeply - about the "exact", most-concise way to name a variable such that it is 1) clear, 2) descriptive, and 3) as short as possible.
It's not uncommon for me to refactor code because, when I was first writing it, I couldn't think of anything clearer than
longYellowFruit
. But later, while I'm reading the code, I realize that it's soooo much simpler to just call the variablebanana
.I also share your frustration with context that is needlessly stuffed into the variable names. One great example of this annoyance is in objects. I see stuff like this far too often:
Despite my love of descriptive names, code like this drives me nuts. There's usually no reason for the repetition of
Name
inside each of the keys.Yup, and my comment wasn't really to question anything in your article, I think it is very much on point. Went more for providing more food for thought for the others that end up here and might find this topic for the first time :)
I've always found that way of thinking very childish. Code is read way more often than typed, so choosing variable names that are shorter to type is incredibly short-sighted.
It's not that I don't often find myself thinking that. But I often find myself thinking "damn I want a siesta right now" yet I don't just start sleeping in the office.
Comments should never (unless you're hacking bits in an OS kernel or something) explain the code. They should explain design choices:
So overall I'd say I pretty much agree with the article, though I find the title somewhat clickbaity. There's so many valid uses for comments that calling them a "code smell" is really stretching the expression a lot.
This has so many underrated implications.
Guilty as charged! But I made a point to illustrate exactly where comments are useful (critical, even). And my examples pretty much line up perfectly with yours.
And with regard to those who complain about variable name lengths, I also find their arguments to frequently be arbitrary. In other words, they complain about the length of the variables you've chosen, but when they decide to use a longer naming convention, it's totally "OK".
I get the point of this and I agree largely. Comments are tools and can be used or abused.
Some things need comments. For example, in C# some absolutely amazing code can be written in (e.g.) LINQ or Rx.NET, and without the comments you might as well be looking at hieroglyphics for a while.
Come to think of it though, LINQ, Rx.NET and perhaps SQL are the only examples of where I think comments are often essential I can think of. I don't even want to think about regular expressions - I prefer to pretend those things don't even exist.
I'm not familiar with Rx.NET. But as a C# dev myself (although it's been some years since I was "in the thick of it", this is exactly why I do not like LINQ. I've seen numerous C# devs write some absolute chicken scratch in LINQ, and then swear to anyone who will listen that it's "beautiful". You don't have to agree with me, but I stand by my original contention: If I need comments to understand your (LINQ) code, then it's some crappy code.
As for SQL, no, you absolutely do NOT need comments to understand a good SQL command. Some years ago, I was working with a guy who'd never seen the way that I wrote SQL statements (which was quite different from the way that he - or anyone else - wrote them). He took one look at it, processed it for a few seconds, nodded his head, and said, "Yeah... I get this." And from there forward, he always wrote his SQL statements in the exact same format that I do.
So ah, how do you write SQL statements?
To be clear, I don't mean to imply that I have some awesome magical way of writing queries that makes the hairiest queries easy to understand for even the most junior of devs. But this is an example of how I write them:
That may not look "revolutionary", but there are a few key points to this format:
firstName
oraccessLevel
. It's always referenced asperson.firstName
orpermission.accessLevel
.SELECT
, the order is alwaysSELECT
followed byFROM
followed by any-and-allJOIN
s, followed by any-and-all filters, followed byORDER BY
(if needed).Also, I never alias them into one-or-two-letter abbreviations. I hate this:
In contrast, I often see queries written something like this, and I hate it:
If this queries tends to "grow" over time, as the dev team decides that they need to add more return values and more
JOIN
s to it, it becomes ever-more-difficult to simply read.I see what you mean. Not magical but definitely better. I use SQL infrequently but I've written queries the second way and now I'm going to write them your way. It is much clearer 👍.
Thanks!
I tend to over comment my code, mostly because I'll look at it and then 6 months later I will loo at it again. It always ends up that I can't remember why I did anything in the code so the comments are to remind me. This was all internal code at my employer and mostly tools not apps. In the end, I suppose you could instead write a doc, and then add issues instead of adding "FIXME:" type of comments.
I still enjoy adding lots of comments, but most of the time my code tends to be self documenting.
I definitely don't like the idea of external documentation - cuz in a quarter century of doing this, I have never seen the external documentation kept up-to-date with the live code.
Also, I don't hate the
TODO
/FIXME
type comments. I personally think they can be useful when you're in a pre-launch state and the code is changing rapidly and you want to remind yourself that you need to do a little more work at this point in the code. But once the app is launched, I've rarely ever seen those comments acted upon. They become just as stale (and as meaningful) as a bit of "Kilroy Was Here" graffiti.Oh I agree. But I've been really happy with past self for writing for for future forgetful self. In these cases I'm the only person likely to see the code so I guess it works for this particular scenario.
For FOSS projects, I try to write self documenting code, but still try to put some comments in. I'm open to knowing alternative ways to give myself hints on what I was thinking without writing a design document! :-)
Great post. I completely agree that comments should not say WHAT the code does (such comments just duplicate the code), but WHY the code is here or WHY is it written that way, HOW it is supposed to work or WHEN it is supposed to be used.
You described WHAT and WHY.
HOW - For complex problems, briefly describe the logic behind. Not the actual two or three lines, but overall concept. Imagine something more than just a = b * 1.07, where several other methods are called, etc. With proper comment, you may not need to study it deeply. Also, HOW helps with writting tests and debugging.
WHEN - Describe appropriate use case. When it should and when it shouldn't be used. Let's say that you have method that is consuming a lot of time (which may be correct behavior) and it shouldn't be called in UI thread. Of course, only add such comment when there is a possibility of misusing the method.
Good points. I could quibble that "how" can often be accomplished by writing cleaner code. But that's not always the case.
I especially agree with "when". Sometimes you open a code file and there's a function/method there that doesn't seem to be called from anywhere within that file, and you literally can't imagine when it would ever need to be called. In those times, it's tempting to consider "cleaning" it up - by removing it. But it can be extremely helpful to have a comment there that explains, "this only gets called by a cronjob that's launched nightly from the client's environment".
Back in college, I had one prof who told us to comment EVERYTHING, and would grade us on our comments.
This was also the prof who wanted us to name our variables/functions "x", "foo" or "bar".
I caused him a lot of grief when I said we needed more explanatory names and less comments. I wish I knew his contact info so I could send this too him. 😂
Read the article with a frown ready to point out the usefulness of comments to explain the why, then got to that section and thought: well done sir :D
(although there are other nasty code smells that are guaranteed to cause trouble; a common example I've come across is functions returning an empty array/ object synchronously that gets populated async after the return with XHR call results, relying on pass-by-reference, when the proper way would be to return a promise. This kind of shit breaks React re-rendering as the prop doesn't change)
I don't normally agree with articles like this, as I feel so much of what we do in this particular human endeavor is subjective.
I completely agree, though. I'd go one step further and say that we should document our intent. I've gone into far too much code a few years after it was written and wondered why things were done the way they were. It's risky to have code like that, you never know what is intentional, what is a bug, and what is a common-law feature.
Hi Adam,
Again, nicely written article.
I would add that using long names to explain your code should be the norm. But good naming is an art.
The team should refuse to validate functional code if there are some naming issues.
Code is not just the instructions you give to the machine. It's also the way you inform the future maintainers of your code what it does. It's your communication tool to the future YOU that will read this code two years from now.
The issue with comments is that they become obsolete when the code evolves.
And the only thing that evolves with your code is your code...so everyone should keep in mind that code that works and have no bugs is not sufficient.
You should strive to produce self documented code. And most of the time self documenting code, means long names that convey meaningfull information.
I agree with all of this. And in my article, I completely forgot to point out the tendency of comments to become stale and unrepresentative of the very code in which they're embedded. I particularly enjoy this statement: