So often I see a variables type()
inside of its name and it hurts me a little inside. Tell me I'm right or prove me wrong below.
Examples
Pandas DataFrames
are probably the worst offender that I see
# bad
sales_df = get_sales()
# good
sales = get_sales()
Sometimes vanilla structures too!
# bad
items_list = ['sneakers', 'pencils', 'paper', ]
# good
items = ['sneakers', 'pencils', 'paper', ]
Edge Cases?
It's so common when you need to get inside a data structure in a special way that itsn't provided by the library.... I am not exactly sure of a good way around it.
# bad ??
sales = get_sales()
sales_dict = sales.to_dict()
# good
🤷♀️
Containers are plural
Always name your containers plural, so that naming while iterating is simple.
prices = {}
items = ['sneakers', 'pencils', 'paper', ]
for item in items:
prices[item] = get_price(item)
Before I start fights 🥊 in code review, am I inline here or just being pedantic?
Top comments (13)
"Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged—the compiler knows the types anyway and can check those, and it only confuses the programmer" - Linus Torvalds
With modern tools and IDE's I really don't see this having to be a thing anymore. If you can just hover over or cursor to a variable and know its type. Then there is no need to add a type affix. Sadly I still do see this in the wild. Either due to an old code base or an old mindset and almost always when I do come across it there are at least a few dozen variable names that have since changed and have not been updated.
That beside, if you can't tell the semantic meaning of a variable by its name, and thereby deduce the like type(s) in use, you need better names anyway.
For example, what is the type of
is_ready
? How aboutlike_count
?average_temp
?pending_requests
? (If you guessed "boolean", "integer", "float", and "something akin to a queue", you're right!)Duck typing 🦆
Which reminds me of something I put in my upcoming Python book...
I had no idea what Hungarian notation was. That is an amazing quote.
Just today I had vscode (pyright in particular) catch me with an edge case. I had a function that would return an iterable if there were values, but
False
if there were none. It caught me that I wasn't catching theFalse
case.It needs to be kept as simple as possible. That is the beautiful thing about Python. As long as your code is easy to read and understand, not just by you, but to all else that read it.
When I was going through my course the variables were often referenced with their type, but that just made the learning easier and less confusing whilst learning different types/classes etc.
When you actually start to write your own code then the naming with type is just an incredibly long drawn out process, unnecessarily. If it is a list, you know it's a list because the data is in [ ]. A dictionary is contained within { } so no need to call it a dict as it is obvious with key value pairs.
Some people may still want to use variable types when naming them. You gotta do yourself a favor from a time point of view surely!!
Simple, easy to understand and flowing. I'm a little OCD so I agree with you completely on this one.
Interesting point with the difference between courses and real work, I hadn't thought about that.
It's also a bit less obvious sometimes where variables are often coming in from other functions that we might no be familiar with.
There's also the fact that a variable's type may change in the course of refactoring, or even executing, a Python program. Systems Hungarian notation is evil in production code.
Apps Hungarian is an example of good practice, e.g. distinguishing between otherwise identically named variables and functions relating to rows vs. columns in Microsoft Excel, which Charles Simonyi was working on when he invented it. The central tenant is that the prefixes relate to the semantic meaning.
Systems Hungarian was a total misunderstanding of the intent, wherein the type was stated in a prefix. Your post pretty well sums up why that's a terrible idea.
I agree ;)
So, in many cases I agree with you, especially for most of the built in structures.
For those there often exists some common naming patterna. I think dicts are a bit harder to name in general. I often think of names like key2value/key_to_value or keys_values but it is very context dependent. I must admit I also use df prefix/postfix for data frames at times. It is short and communicates that the data is in the shape of a table. I, in general, prefer using type annotations over baking the information into the name as they are kind of made for it.
I think it's a decision to be made as a group for the codebase. So gather diverse opinions and decide as a team (no fights)!
Haha, I am too nice to actually start fights in code review. People already feel vulnerable enough. Good point basing it off of the group/codebase.