After working in python for many years, I decided it was time I learned a second programming language. Everyone seemed to suggest C++ was the hardest language to learn and the most unpleasant to work with, but it also seemed to be the most widely used for native applications. So I decided to learn it anyway and found myself slowly growing to appreciate it more and more (I heavily recommend this course; it's long but covers so much necessary information). Now all my hobby projects are written in C++ and I am constantly looking for excuses to write more C++ code in my day job. Yet the whole internet is filled with people talking about how awful C++ is, how Rust is going to bury it, etc. So I decided to write up a comparison of python and C++ and give my rationale for preferring the language everyone loves to hate.
Controlling memory
In python, every variable is an object reference. What this means is that a variable is not a container that holds a value; it is a name that refers to an object stored somewhere else in memory. This is not a problem, it's just how python is implemented. The problem comes in when we want to control how we move objects around. For example, what does the following code output?
a = [1, 2, 3]
def myFunc(b):
    b.append(4)
myFunc(a)
print(a)
The answer is [1, 2, 3, 4]. What about this?
a = [1, 2, 3]
def myFunc(b):
    c = b
    c.append(4)
myFunc(a)
print(a)
It still prints [1, 2, 3, 4]! Ok, now what about this:
a = [1, 2, 3]
def myFunc(b):
    b = [1, 2, 3]
    b.append(4)
myFunc(a)
print(a)
Now it prints [1, 2, 3]! What is happening here? Well, the variable b in the function is an object reference, because every variable in python is an object reference. When the function is called, its parameters are bound to the objects that were passed in as arguments. In other words, b and a both point to the same list. Not a copy, the exact same data stored in memory. That means any modification made to the object inside the function is still there after the function ends. That's why the append method permanently modified our list.
What about the second function? As it turns out, = in python does something different from what you might expect. It changes which object a name refers to, without touching the object the name previously referred to (python cleans up the old object once all references to it are gone). So c = b just makes c another name for the same list, which means that if we change the list through c we change it for b, and therefore for a.
Finally, in the last function the right-hand side of the assignment creates a new list and then binds b to it. b no longer refers to the object it was given at the start; it refers to a completely new list. But the variable a still refers to the old object, because b is a separate reference that merely pointed at the same data as a at the start of the function. Once you reassign b with =, that link is broken and there is no longer any way to reach the original list through b. Let's compare it with C++ code:
#include <vector>
#include <iostream>
// Takes its parameter by value: b is a copy of the caller's vector.
void myFunc1(std::vector<int> b) {
    b.push_back(4);
}
// Takes its parameter by reference: b is the caller's vector.
void myFunc2(std::vector<int> &b) {
    b.push_back(5);
}
int main() {
    std::vector<int> a {1, 2, 3};
    myFunc1(a);
    myFunc2(a);
    std::cout << '[';
    for (auto i: a) std::cout << i << ", ";
    std::cout << ']';
    return 0;
}
What is the output? It's [1, 2, 3, 5, ]. myFunc1 was not given a reference to a, it was given a copy, and we explicitly told it we only wanted it to have a copy! We know that no matter what that function does, a will come out the other side unchanged. In myFunc2 we passed a reference instead, meaning we understand that the function can modify the data. If we want to avoid copying a large object but still want to protect it from being modified, we can add a const modifier, more on that later. When assigning, inside a function or outside, we can declare a variable as a reference to another variable, as a copy of the data, or as a pointer to the data. In python, you only get object references. This means you need to be careful about modifying an object that was passed to you, and if you expect your function to modify an object you have to be very careful about throwing around the = operator.
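For what it's worth, python can give you the behavior of myFunc1, but you have to ask for the copy explicitly. Here is a minimal sketch using the standard copy module; the function names append_in_place and append_to_copy are made up for this illustration.

```python
import copy

def append_in_place(b):
    # b refers to the caller's list, so the caller sees the change
    b.append(4)

def append_to_copy(b):
    # copy.copy makes a shallow copy: a new list holding the same elements
    c = copy.copy(b)
    c.append(5)
    return c

a = [1, 2, 3]
append_in_place(a)
d = append_to_copy(a)
print(a)  # [1, 2, 3, 4]
print(d)  # [1, 2, 3, 4, 5]
```

Note that a shallow copy only duplicates the outer list. If the list held other mutable objects, those would still be shared, and copy.deepcopy would be needed to fully separate the two.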
Constants
This brings me to the next thing that is sorely lacking in python: constants. There are no constants in python. You can make some rules such as "all caps means don't change this", but it's always on the programmer not to throw that away and change a constant. For functions, where you can only pass things as object references, there is no way to protect your object from being changed. In very large code bases, if a constant does get changed somewhere it can be a pain to track down.
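To make the honor system concrete, here is a minimal sketch. The names MAX_RETRIES, DEFAULT_PORTS and add_debug_port are made up for this illustration. typing.Final (python 3.8 and later) lets a static type checker flag the reassignment, but as far as the interpreter is concerned it is only a hint.

```python
from typing import Final

MAX_RETRIES: Final = 3        # "constant" by naming convention and type hint
MAX_RETRIES = 10              # a type checker can complain; python itself does not

DEFAULT_PORTS: Final = [80, 443]

def add_debug_port(ports):
    # The function receives a reference, so it mutates the caller's "constant"
    ports.append(8080)

add_debug_port(DEFAULT_PORTS)
print(MAX_RETRIES)    # 10
print(DEFAULT_PORTS)  # [80, 443, 8080]
```

Using an immutable type such as a tuple for DEFAULT_PORTS would stop the mutation, but it would not stop the name from being rebound.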
Privacy
Python has no private members or private attributes, which kind of means it is not really an object oriented language despite everything being an object. When architecting your code, you frequently want to create objects that are in complete control of their own internal data. In strict OOP, objects should always be in complete control of their data, with everything accessed through getters and setters. Even in more data oriented design philosophies, there are data structures that need strict control over access. And for private methods, running them at the wrong time can completely break an object in the worst cases. In python, you are on the honor system. Prefixing names with _ or __ signals to readers and to your linter that you shouldn't be calling that method or directly accessing that data (a leading __ also triggers name mangling, which makes accidental access less likely but does not block it), but python will still run just fine. C++ gives you the option to encapsulate your data and control how it is accessed, ensuring things don't get changed in unpredictable ways. Methods and attributes marked as private cannot be accessed from outside the class; the compiler itself will reject the code. And it even gives you the ability to mark a specific function or a whole class as a "friend" for those rare cases where outside code needs access to a class's private members.
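Here is a minimal sketch of what the honor system looks like in practice. The Account class and its attributes are made up for this illustration.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance   # single underscore: "internal" by convention only
        self.__pin = "1234"       # double underscore: stored as _Account__pin

    def balance(self):
        return self._balance

acct = Account(100)
acct._balance = -1_000_000        # nothing stops outside code from doing this
leaked_pin = acct._Account__pin   # mangling renames the attribute, it does not hide it

try:
    acct.__pin                    # the unmangled name does not exist outside the class
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

print(acct.balance())        # -1000000
print(leaked_pin)            # 1234
print(direct_access_failed)  # True
```

So the double underscore protects against accidents and name clashes in subclasses, but anyone who knows the mangled name can still read or overwrite the data.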
Not just speed
The most obvious reason to use C++ over python is of course performance. To my knowledge, there are no languages that can go significantly faster than C++ (though of course many languages match it in speed), but that is not the whole story. I can do matrix multiplication using numpy and probably beat a naïve C++ implementation, but there is no way I could roll my own in python and expect it to run in anything close to a reasonable amount of time. Coding in python for years left me with a strong aversion to for loops and a completely mistaken intuition about code optimization. Basically, the only way to make python performant at all is to rely on libraries that implement the algorithms in C/C++ for you. Granted, implementing complex algorithms on your own is not necessarily a great use of time when libraries are almost guaranteed to do a better job regardless of language. However, learning a list of tricks for mashing your code into list comprehensions or creative uses of numpy will not make you a better general-purpose programmer, it will make you a better python programmer. And whenever you move to a different language, all those tricks will not help you at all. In C++ or other performant languages, you are more likely to learn about big-O notation or how the hardware works, and slowly build more efficient code with more transferable skills.
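To make "roll my own" concrete, here is a minimal sketch of a naive matrix multiply in pure python, with a timer around a modest 100x100 case. The function name matmul_naive is made up for this illustration, and the timing will vary by machine, so no numbers are claimed here.

```python
import time

def matmul_naive(a, b):
    """Multiply two matrices given as lists of rows, using plain loops."""
    n, m, p = len(a), len(b), len(b[0])
    result = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result

small = matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(small)  # [[19, 22], [43, 50]]

size = 100
big = [[1.0] * size for _ in range(size)]
start = time.perf_counter()
matmul_naive(big, big)
elapsed = time.perf_counter() - start
print(f"{size}x{size} multiply took {elapsed:.3f} s")
```

The triple loop runs size cubed multiply-adds through the interpreter, so the time grows quickly with matrix size. With numpy arrays, the same product written as a @ b is handed off to compiled code, which is exactly the "rely on a library" pattern described above.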
Typed
This is more up to personal preference, which is why I left it for last. There are pages of debate on statically vs dynamically typed languages, but I have fallen squarely on the side of statically typed languages. A variable really has no business changing its type; I have yet to see a solid reason for it. Function parameters being duck typed does make sense, but this can also be achieved with templates in C++, where the compiler can make sure your use of the template is valid. Python does not even have variable declarations: assigning to a name the interpreter has not seen before simply creates a new object reference. This means there are no checks on whether a variable was already declared or not, and it's entirely on the programmer to make sure they don't accidentally overwrite a variable.
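A minimal sketch of both points, with made-up names (value, total_length): a name can be silently rebound to an object of a different type, and a duck-typed function only finds out at runtime whether its argument was suitable.

```python
value = 42
type_before = type(value).__name__
value = "forty-two"          # same name, now a str; no declaration, no error
type_after = type(value).__name__

def total_length(items):
    # Duck typing: works for any iterable whose elements support len()
    return sum(len(item) for item in items)

lengths = total_length(["ab", [1, 2, 3], (4,)])   # 2 + 3 + 1

try:
    total_length([1, 2])     # ints have no len(); this only fails when it runs
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True

print(type_before, type_after)  # int str
print(lengths)                  # 6
print(failed_at_runtime)        # True
```

A C++ template playing the role of total_length would reject the last call at compile time, which is the check the paragraph above is referring to.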
Why I still use python 90% of the time
Portability
Now, after all this I still use python, because not all the complaints about C++ are undeserved. As a compiled language, C++ needs to be built for each system you are targeting. Achieving cross-platform functionality is not trivial, and each new computer architecture you target will require more troubleshooting. Python is not perfectly portable either, because you have to make sure all the packages you rely on, as well as their own dependencies, are compatible with your target architecture.
Speaking of packages, this is one area where I am torn as to which language has the right idea. C++ makes it a pain to manage multiple library dependencies, while python makes it as easy as possible to bring in more and more dependencies. For C++, this means rolling your own code more often than you should, while for python it means watching your list of dependencies balloon out of control. Installing three packages with pip can easily mean twenty packages are actually installed, and if any one of them happens not to support Windows or Linux it can break everything. But even with this caveat, I think python still edges out C++ and the various attempts at package managers floating around out there.
Syntax and maintainability
While you shouldn't just focus on which language uses fewer lines of code, it's hard to miss the stark contrast in how much code I needed to write in C++ vs python for my examples at the beginning. Despite what I said above about python's speed getting in the way of learning how to optimize code, its syntax makes it easy to write algorithms in a clear way. And python's ability to work well with C/C++ code means you can offload the heavier stuff to underlying libraries and have a clean and easy to read script sitting on top. Python is sometimes called a "glue" language or the "duct tape" of programming languages, and not at all derisively. It really does make scripting a very pleasant experience. It also makes it easier to get developers up to speed, which makes your code more maintainable in the future.
Bad C++ is VERY_BAD
Finally, working in legacy python code might get confusing at times because of how flexible it is, but legacy C++ code can bring a developer to tears. Nothing makes my heart sink faster than opening up some source code and seeing nothing but MACROS_IN_ALL_CAPS everywhere. These are not always the fault of the developer. Sometimes C++ just didn't have a feature to do what a developer needed when the code was written, and it would be foolish to think you can just clear away all the macros from a code base without breaking something somewhere. After you have learned C++ (and that beginner's course is 42 hours long just in lectures) you basically need to learn a whole new language: preprocessor directives. Some code bases have hundreds or thousands of lines of code just for their preprocessor directives. It makes one wonder why they are using C++ at all at that point, when they are essentially coding their own language. And a lot of it ties back to portability and making code cross-platform, which means it can't really be avoided unless you want to target only one OS and architecture. Knowing C++, in other words, does not mean you actually know how to read the majority of C++ out there, because everyone tweaks the language for themselves.
Conclusion
Python is essentially the English of programming languages. It is easy to write nonsense that is still technically valid. Lots of my complaints (no constants, dynamic types, etc.) can be solved by programmers being disciplined. In a paradoxical way, python's simplicity and flexibility make it a pretty bad language to start in. Its simplicity is deceiving, because it expects you to keep track of what the compiler keeps track of in C++. And its flexibility makes it very easy to make critical errors.
In the end, once you know enough, all languages boil down to personal preference, and I just like C++ better.