Code optimization and refactoring are crucial for enhancing the efficiency and speed of software. Share your experience of a specific instance wher...
I have 3 performance tips:
It's almost always one form or another of caching (assuming it isn't a bug). One of the earliest examples I did of this was in the 80s, where I pre-calculated the results of trig functions and stored them in an array. I could then perform the trig calculations needed for my game for the cost of an array lookup.
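For anyone who hasn't seen the trick, here is a minimal sketch of a precomputed trig table. The one-entry-per-degree resolution and the names `SIN_TABLE`, `fast_sin` and `fast_cos` are assumptions for illustration, not the original 80s code, and whether a lookup actually beats the library call on current hardware is something to measure rather than assume.

```python
import math

# Assumed resolution: one table entry per whole degree.
TABLE_SIZE = 360

# Pay the cost of math.sin once, up front.
SIN_TABLE = [math.sin(math.radians(deg)) for deg in range(TABLE_SIZE)]


def fast_sin(degrees: int) -> float:
    """Look up sin for a whole-degree angle; the modulo wraps any integer angle."""
    return SIN_TABLE[degrees % TABLE_SIZE]


def fast_cos(degrees: int) -> float:
    """cos(x) equals sin(x + 90 degrees), so one table serves both functions."""
    return SIN_TABLE[(degrees + 90) % TABLE_SIZE]
```

The trade-off is precision for speed: angles are rounded to the table's resolution, which was usually fine for games.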
I totally relate to this. In the past I've made stuff like my own 3D engine, demoscene effects etc. in super low powered environments. The speed boost from using pre-calculated trig tables (and using plain integer arithmetic wherever possible) was huge!
That's great!
I remember doing that exact optimization in Turbo Pascal in the early 90s when writing Wolfenstein 3D clones ;-)
Ah... back in the days when developers actually cared about performance 😉
About 10 years ago, on earlier Android devices, I worked on image processing apps and a couple of games. Floating point calculations were super slow on Android devices back then, specifically for image processing and video game math. So step one was to convert those floating point calculations to integer calculations, which already gave a pretty decent performance boost. Step two was to rewrite those routines in C and call them using the Java Native Interface.
The next problem was garbage collector activity, which would stall the game, especially animations. So the optimization trick was to recycle all objects, arrays etc., so nothing would ever be garbage-collected while the game was running. So if the game had entities such as enemies, projectiles etc., I would use so-called "pools" for each entity type, retrieve objects from them when needed, and put them back again when done.
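A rough sketch of the pool idea, in Python rather than the original Java. `Projectile`, `Pool`, `acquire` and `release` are hypothetical names, and returning `None` on exhaustion is just one possible policy.

```python
class Projectile:
    """Hypothetical game entity; its fields are overwritten on reuse."""

    def __init__(self) -> None:
        self.x = 0.0
        self.y = 0.0
        self.active = False


class Pool:
    """Pre-allocates a fixed number of objects and recycles them."""

    def __init__(self, factory, size: int) -> None:
        # All allocation happens here, before the game loop starts.
        self._free = [factory() for _ in range(size)]

    def acquire(self):
        # Returns None when the pool is exhausted instead of allocating.
        if not self._free:
            return None
        obj = self._free.pop()
        obj.active = True
        return obj

    def release(self, obj) -> None:
        obj.active = False
        self._free.append(obj)


projectiles = Pool(Projectile, size=2)
shot = projectiles.acquire()
shot.x, shot.y = 10.0, 20.0
projectiles.release(shot)        # back in the pool, ready for reuse
reused = projectiles.acquire()   # the same object again, no new allocation
```

Because nothing is allocated or dropped inside the game loop, the garbage collector has nothing to collect while the game runs.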
Pools are great for this sort of thing! I hear games using MonoGame using similar approaches, never allocating in the game loop. In C I just use static arrays for this kind of thing, saves a bunch of work on memory management too. In C++ you can use custom allocators that'll automate this behaviour for you.
By putting the lightest tests first in an IF condition, so that if the first, lightest test fails, the other tests are skipped immediately. Example:
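One way this can look, as a sketch: the slug validation and the function names are made up for illustration, and the call counter exists only to show that the heavier check is skipped. It relies on `and` short-circuiting, which most languages do.

```python
import re

# Counts how often the heavier check actually runs.
calls = {"expensive": 0}


def is_short_enough(text: str) -> bool:
    """Light test: a single length comparison."""
    return len(text) <= 20


def matches_pattern(text: str) -> bool:
    """Heavier test: a regular expression match over the whole string."""
    calls["expensive"] += 1
    return re.fullmatch(r"[a-z]+(-[a-z]+)*", text) is not None


def is_valid_slug(text: str) -> bool:
    # `and` short-circuits: if the light test fails, the regex never runs.
    return is_short_enough(text) and matches_pattern(text)
```

Calling `is_valid_slug` with an over-long string returns `False` without ever touching the regex.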
This is oddly specific, but for some reason Array.from is faster than a spread operator on strings in JavaScript.
A while back, working on a SaaS platform, I implemented something I named at the time "request caching": caching requests instead of the results. Much later I found out this practice is named 'query batching', and it worked just like the Dataloader library from Facebook, except that my solution worked with higher-order functions and could seamlessly integrate with transactions and other contexts like pagination. By adding this to the project's own ORM, the entire app got a performance boost.
By the way, this was at a time when Node apps were made with callbacks, not even with promises.
By the way, tcacher still has some advantages over Dataloader. But it could be lifted to the age of ESM modules.
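To illustrate the general idea of query batching, here is a minimal sketch using Python's asyncio. It is not the API of tcacher or Dataloader; `BatchLoader`, `load` and `fetch_users` are hypothetical names, and error handling is left out.

```python
import asyncio


class BatchLoader:
    """Collects keys requested in the same event-loop tick into one batch call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn   # async: list of keys -> list of values, same order
        self._pending = []          # (key, future) pairs waiting for dispatch
        self._task = None           # keeps a reference to the scheduled dispatch

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((key, future))
        if self._task is None:
            # The task only starts once the current synchronous code yields,
            # so every load() made before then lands in the same batch.
            self._task = loop.create_task(self._dispatch())
        return future

    async def _dispatch(self):
        batch, self._pending = self._pending, []
        self._task = None
        values = await self._batch_fn([key for key, _ in batch])
        for (_, future), value in zip(batch, values):
            future.set_result(value)


batch_calls = []


async def fetch_users(ids):
    """Stand-in for one database query that fetches many rows at once."""
    batch_calls.append(list(ids))
    return [f"user-{i}" for i in ids]


async def main():
    loader = BatchLoader(fetch_users)
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(3))


results = asyncio.run(main())
```

Three separate `load` calls end up as a single call to `fetch_users`, which is where the saving comes from.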
Ironically, I've not done much performance tuning/refactoring in the 30+ years I've been developing or maintaining software (except perhaps for caching, as Shai Almog says ... caching!).
But my most memorable and lauded work has often been quite the opposite.
That is, actively slowing things down or at least discarding performance as a criterion in order to pursue competing goals (generally, maintainability, the cost or even facility of maintenance).
I'm not the only person to have landed on a project that was a black box because no-one, since implementation, wants to touch it. Anyone who's looked at it saw a house of cards, a mysterium of complexity and fled. The risks of making changes or the costs of a complete replacement both judged too high ... Just leave it be, if it ain't broke don't fix it.
But then a rewrite is budgeted, mainly because new hardware is bought with new peripherals, firmware and OS etc... And so this black box needs porting which equates to a rewrite.
A deep analysis of the thing to be rewritten begins the job, teasing it apart, building an internal documentation of the old, an internal spec, all the things lost to time in this legacy system. Then a rewrite, but often the main goal this time is not to land here again, but to have software that can be maintained, enduring staff turnover. And with that goal eclipsing performance, with a new generation of hardware providing enormous performance gains, a lot of complicated and difficult to describe optimisations in the old software, on the old hardware, are tested against a simpler implementation on the new hardware ... scrutinising performance and accuracy and precision (I have always worked in the engineering and science realms). If the new is not significantly slower than the old, or if, on the new hardware, it is still faster, then with its clarity of code, internal documentation and interface specifications, it is the winner.
The result: performance optimisations removed in favour of simpler code that can be maintained into the future and evolve incrementally unlike the black box that just burst from its bubble.
I once managed a 1000x speed up from some code. The language had two similar types: lists and arrays.
Lists had some features not available to arrays but were basically wrappers around strings. So doing any operation like sorting them involved a lot of string manipulation.
One feature was taking over a minute to run, and it was due to this list processing. All of the things inside it could be done with arrays. So I tried changing it, converting to and from lists on either side of the system. I was hoping it would run in less than thirty seconds, but it came in at fractions of a second.
I guess the moral is know your data types and how they are handled internally.
for many years our company did a lot of fix/rescue/up-feature work on other people's projects (usually apis). the first three places i always looked to address performance issues were always:
database design. the persistence layer is almost always the most time-intensive component, and most of the time adding a few well-chosen indexes on columns made huge improvements.
heavy loops. lots of devs will put some heavy call, like a query, in a loop. on test data with five or whatever records, it works great, but when live data grows to a thousand records, it becomes an issue. migrate looped selects to joins or some similar set-based strategy. investigate memoization.
caching. there are a lot of caching strategies. generally, i like to start by throwing everything behind a cloudfront and start from there. if you have heavy components that are generally static or long-lived data, a good caching strategy can be a massive win.
worker queues. lots of tasks can just be deferred and a worker queue can make a huge improvement. keep an eye on the queues, though, since they're not free. i once had a fix project where the queue was so full that password recovery mails were taking two hours to get sent.
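a small sketch of the looped-select problem and the join that replaces it, using an in-memory SQLite database. the `authors`/`posts` schema, the index name and the function names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    -- Index on the column used for lookups and joins.
    CREATE INDEX idx_posts_author_id ON posts (author_id);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO posts VALUES (1, 1, 'Caching'), (2, 1, 'Pools'), (3, 2, 'Queues');
""")


def titles_with_loop(conn):
    """Looped selects: one query for the authors, then one more per author."""
    result = []
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result.extend((name, title) for (title,) in rows)
    return result


def titles_with_join(conn):
    """The same data in a single query, however many authors there are."""
    return conn.execute("""
        SELECT a.name, p.title
        FROM authors a
        JOIN posts p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """).fetchall()
```

both functions return the same rows. the looped version issues one query per author on top of the first, so the number of round trips grows with the data.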
of course, before starting any optimization, it's necessary to figure out where the pain points actually are. spend some time profiling; get actual data on performance by using a profiling tool or home-rolling your own. i wrote an api logging and tracking tool for laravel for our company to do precisely this and it has been very valuable.
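a home-rolled timer can be as small as a decorator. this is a sketch with made-up names (`timed`, `timings`, `handle_request`), not the laravel tool mentioned above. python also ships a full profiler, `cProfile`, in its standard library.

```python
import time
from collections import defaultdict
from functools import wraps

# Collected durations in seconds, keyed by function name.
timings = defaultdict(list)


def timed(fn):
    """Record the wall-clock duration of every call to fn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs even if fn raises, so failed calls are measured too.
            timings[fn.__name__].append(time.perf_counter() - start)
    return wrapper


@timed
def handle_request(n: int) -> int:
    """Stand-in for the code path being measured."""
    return sum(i * i for i in range(n))


for _ in range(3):
    handle_request(10_000)


def report():
    """Return {function name: (call count, average seconds)}."""
    return {
        name: (len(samples), sum(samples) / len(samples))
        for name, samples in timings.items()
    }
```

calling `report()` after some traffic shows which functions are worth optimizing before any code gets changed.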
I was part of a development team that was developing Microsoft ISAPI extensions in C++ for a web application.
Yes.
As for actual examples, there's two that come to mind:
MinBy query checking squared distances. The JIT generated really tight vectorized code for this and it was super fast.
When I joined my recent company, I was given the task to improve the performance of a Socket Server (Socket.io + Nodejs).
At that time we were only able to handle 2K concurrent users, after that our EC2 instance used to go down due to High CPU.
We were making an API call whenever a user connected to the Socket Server. Making lots of parallel API calls was resulting in high CPU, as the TCP three-way handshake behind each new HTTP connection is CPU intensive.
Then instead of making API calls we decided to do DB queries on the Socket Server only, eliminating the REST call.
Then we started queuing requests in a local in-memory queue and processing multiple users' requests in a single DB batch query (one DB query per 500 requests).
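The batching step could look roughly like this. It is a synchronous sketch against SQLite, not the actual Node.js server code; `BatchedLookup`, the `users` table and the callback style are assumptions, and the batch size is 3 only to keep the demo short.

```python
import sqlite3


class BatchedLookup:
    """Buffers user IDs and resolves them with one IN (...) query per batch."""

    def __init__(self, conn, batch_size=500):
        self._conn = conn
        self._batch_size = batch_size
        self._pending = []   # (user_id, callback) pairs waiting for a flush
        self.queries = 0     # number of DB queries actually issued

    def enqueue(self, user_id, callback):
        self._pending.append((user_id, callback))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        ids = [user_id for user_id, _ in batch]
        placeholders = ",".join("?" * len(ids))
        rows = self._conn.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", ids
        )
        names = dict(rows)
        self.queries += 1
        for user_id, callback in batch:
            callback(names.get(user_id))   # None if the user does not exist


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(i, f"u{i}") for i in range(1, 6)])

lookup = BatchedLookup(conn, batch_size=3)
resolved = []
for user_id in [1, 2, 3, 4]:
    lookup.enqueue(user_id, resolved.append)
lookup.flush()   # drain the partial batch left over
```

A real server would presumably also flush on a timer, so that a partly filled batch does not wait indefinitely for more requests.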
Then we used Redis to horizontally scale these socket servers.
Now we are handling 35K concurrent users and have tested our service to handle 100K concurrent connections per EC2 instance (1 CPU x 2GB RAM).
Nice read on JS performance optimisation here.