We have just spent the last 24 hours fixing a nasty bug in production.
But first, here's a tricky question for you: for how long do you think this code will run?
// I'm trying to be language-agnostic here
for (i = 0; i < 1000000; i++)
    thread.sleep(1);
It would seem that the answer is obvious - a million milliseconds, which is about 17 minutes.
WRONG.
The correct answer is - more than 4 hours.
Wait, what?! Bear with me ;-)
The nasty production bug story
We have a background worker on the backend - a huge while loop that runs through an array of millions of elements and does all kinds of in-memory checks and manipulations on them.
But we don't want the CPU to get stuck at 100% during this loop and choke the server, do we? We want the server to stay alive and kicking.
So what does the average Joe programmer do? That's right - Joe adds a short pause into the cycle and goes home.
If there are any game developers reading this - I can already hear you giggling and reaching for the popcorn.
Here's the thing: "pauses", AKA Delay(), AKA Sleep(), in most operating systems are based on timers. The default resolution of these timers can be as coarse as 12-15 ms (on Windows the default tick is about 15.6 ms). You cannot reliably pause for 1 millisecond in a loop like this - each pause typically lasts a full tick, around 15 ms.
So on a large array with a million elements, we get 15 ms * 1,000,000 / 1000 / 60 / 60 ≈ 4.17 hours - more than four hours.
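If you want to check this on your own machine, here is a minimal Python sketch. The function names `projected_hours` and `measured_sleep_ms` are made up for illustration; the first just repeats the arithmetic above, the second times what a requested 1 ms sleep really costs. The measured number depends heavily on the OS and its timer settings, so treat it as a rough probe, not a benchmark.

```python
import time


def projected_hours(iterations: int, per_sleep_ms: float) -> float:
    """Total time, in hours, spent sleeping once per iteration."""
    return iterations * per_sleep_ms / 1000 / 60 / 60


def measured_sleep_ms(requested_ms: float = 1.0, samples: int = 50) -> float:
    """Average wall-clock cost, in ms, of one sleep of requested_ms."""
    start = time.perf_counter()
    for _ in range(samples):
        time.sleep(requested_ms / 1000.0)
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / samples


if __name__ == "__main__":
    # What we naively expect vs. what a 15 ms timer tick gives us.
    print(f"{projected_hours(1_000_000, 1):.2f} h if sleep(1 ms) took 1 ms")
    print(f"{projected_hours(1_000_000, 15):.2f} h if it takes 15 ms")

    # What this particular machine actually does.
    actual_ms = measured_sleep_ms()
    print(f"measured {actual_ms:.2f} ms per sleep, "
          f"so about {projected_hours(1_000_000, actual_ms):.2f} h in total")
```

On a system with a fine-grained timer the measured value will be close to 1 ms, and the four-hour surprise will not reproduce there.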
And coming back to work in the morning, our Joe-the-programmer sees what? That his loop is still running from yesterday. The job that used to take 7 minutes (although it kept the CPU at 100%) now takes half a day to finish. In a "relaxed" mode, though.
Everything is broken and customers start creating the "huh?!" tickets.
And the gamedevs reading this laugh viciously.
Because in their game development world this happens all the time: a loop like ours is called a "busy loop" or a "tight loop", and you can't rely on timers to throttle it.
But how do we throttle properly?
1. Use multimedia timers or timers from OpenGL/DirectX (overkill)
2. Throttle every N-th step, not every step (inelegant and stinks)
3. Dump "pauses" completely and use the magic Thread.Yield instruction - which is a "polite" way to share resources and tell the operating system "hey, I'm still busy, but if you really need this, slow me down and let other threads do the work" (and this is the best way)
The 100% CPU load won't go away, but now it's not an issue - everything is fast and responsive.
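To make options 2 and 3 concrete, here is a rough Python sketch. The names `process_with_periodic_pause`, `process_with_yield` and `handle` are hypothetical stand-ins for our worker loop and its per-element checks, and `time.sleep(0)` is used as the Python spelling of a yield, as listed further down. It assumes the per-element work is a plain function call.

```python
import time


def process_with_periodic_pause(items, handle, every_n=10_000, pause_s=0.001):
    """Option 2: pause only on every N-th element, not on every element."""
    results = []
    for index, item in enumerate(items, start=1):
        results.append(handle(item))
        if index % every_n == 0:
            # One coarse pause per N elements keeps the total overhead small.
            time.sleep(pause_s)
    return results


def process_with_yield(items, handle):
    """Option 3: no timed pause at all, just offer the CPU to other threads."""
    results = []
    for item in items:
        results.append(handle(item))
        # Zero-length sleep: gives other threads a chance to run
        # without asking for any fixed delay.
        time.sleep(0)
    return results


if __name__ == "__main__":
    doubled = process_with_yield(range(1_000_000), lambda x: x * 2)
    print(len(doubled))
```

With option 2 a million elements and `every_n=10_000` means only 100 pauses, so even 15 ms per pause adds about a second and a half rather than hours.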
Thread.Yield is available in many languages:
C#: Thread.Yield
C++: std::this_thread::yield
Win32: SwitchToThread
Java: Thread.yield
Go: runtime.Gosched (I think...)
Visual Basic: DoEvents (kidding! ...although not really)
Python: time.sleep(0) (on Windows it's time.sleep(0.0001), don't ask me why... because Python...)
(by the way, it's not just Python - quite a few system libraries are smart enough to translate sleep(0) into a yield, including .NET, POSIX and WinAPI)
etc. Google your favorite language.
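For Python, the per-platform differences could be hidden behind one small helper. This is only a sketch: `cooperative_yield` is a made-up name, `os.sched_yield()` is a real standard-library call but exists only on some Unix-like systems, and the `0.0001` value for Windows is simply the one quoted in the list above, not something verified here.

```python
import os
import sys
import time


def cooperative_yield() -> None:
    """Give other threads a chance to run, without a fixed delay."""
    if hasattr(os, "sched_yield"):
        # Available on Unix-like platforms that support sched_yield.
        os.sched_yield()
    elif sys.platform == "win32":
        # Assumption: value taken from the list above, not verified.
        time.sleep(0.0001)
    else:
        time.sleep(0)


if __name__ == "__main__":
    for _ in range(1_000_000):
        # ... per-element work would go here ...
        cooperative_yield()
    print("done")
```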
The moral of the story, I guess, is that even the trivial things get super tricky at scale. Things that used to be nice and simple when we just launched our little SaaS now get complicated when you have thousands of companies using your stuff. But that's a nice problem to have, I guess.
Top comments (3)
For 1000 seconds,
public class java.lang.Thread implements java.lang.Runnable {
public static final int MIN_PRIORITY;
public static final int NORM_PRIORITY;
public static final int MAX_PRIORITY;
public static native java.lang.Thread currentThread();
public static native void yield();
public static native void sleep(long) throws java.lang.InterruptedException;
public static void sleep(long, int) throws java.lang.InterruptedException;
public static void onSpinWait();
public java.lang.Thread();
public java.lang.Thread(java.lang.Runnable);
public java.lang.Thread(java.lang.ThreadGroup, java.lang.Runnable);
public java.lang.Thread(java.lang.String);
public java.lang.Thread(java.lang.ThreadGroup, java.lang.String);
public java.lang.Thread(java.lang.Runnable, java.lang.String);
public java.lang.Thread(java.lang.ThreadGroup, java.lang.Runnable, java.lang.String);
public java.lang.Thread(java.lang.ThreadGroup, java.lang.Runnable, java.lang.String, long);
public java.lang.Thread(java.lang.ThreadGroup, java.lang.Runnable, java.lang.String, long, boolean);
public synchronized void start();
public void run();
public final void stop();
public void interrupt();
public static boolean interrupted();
public boolean isInterrupted();
public final native boolean isAlive();
public final void suspend();
public final void resume();
public final void setPriority(int);
public final int getPriority();
public final synchronized void setName(java.lang.String);
public final java.lang.String getName();
public final java.lang.ThreadGroup getThreadGroup();
public static int activeCount();
public static int enumerate(java.lang.Thread[]);
public native int countStackFrames();
public final synchronized void join(long) throws java.lang.InterruptedException;
public final synchronized void join(long, int) throws java.lang.InterruptedException;
public final void join() throws java.lang.InterruptedException;
public static void dumpStack();
public final void setDaemon(boolean);
public final boolean isDaemon();
public final void checkAccess();
public java.lang.String toString();
public java.lang.ClassLoader getContextClassLoader();
public void setContextClassLoader(java.lang.ClassLoader);
public static native boolean holdsLock(java.lang.Object);
public java.lang.StackTraceElement[] getStackTrace();
public static java.util.Map getAllStackTraces();
public long getId();
public java.lang.Thread$State getState();
public static void setDefaultUncaughtExceptionHandler(java.lang.Thread$UncaughtExceptionHandler);
public static java.lang.Thread$UncaughtExceptionHandler getDefaultUncaughtExceptionHandler();
public java.lang.Thread$UncaughtExceptionHandler getUncaughtExceptionHandler();
public void setUncaughtExceptionHandler(java.lang.Thread$UncaughtExceptionHandler);
}
I guess Java is smart enough to translate short "sleeps" into a yield then! Good to know.
In Java, sleep is a static method, so it can be called directly without any object: java.lang.Thread.sleep(millis, nanos).