Here's one of the most common interview questions you'll face when looking for a Node.js job: "Can you explain Node's Event Loop?" There are two types of engineers: those who can describe the Event Loop and those who cannot! This course will ensure that you are incredibly well prepared to answer that most important question.
This is the preface to an advanced course from Stephen Grider, an Engineering Architect and a distinguished Udemy Instructor Partner based in the San Francisco Bay Area. His engineering courses include the 3rd highest-rated React.js course on all of Udemy, among thousands of courses. Stephen has received over 400,000 reviews, and over a million students have studied his published materials.
The Advanced Node.js Concepts course contains 16 aggregate hours of lectures divided into several sections:
- The Internals of Node
- Enhancing Node performance
- Project Setup
- Data caching with Redis
- Automated Headless Browser Testing
- Wiring Up Continuous Integration
- Scalable Image/File Upload
The first two explore the anatomy of Node.js in considerable depth, while the rest address several challenges using standard libraries and approaches in a suspenseful manner. While working through the course materials, I greatly enjoyed the discoveries and solidified what I already knew.
Without notes, information flies away, which is why I took them while maintaining a high signal-to-noise ratio, as Chuck Zerby mentions in his book The Devil's Details: A History of Footnotes. The notes will be most helpful if you already have an introductory understanding of Node.js and are willing to further it. It may also be advisable to follow along with Stephen's course. I have also included addenda wherever they seemed relevant.
So, here we go, the notes, enjoy!
The Internals of Node.js
Node.js is not a new programming language but a runtime for JavaScript, built on top of the V8 engine. A programming language is primarily distinguished by its syntax, and Node.js does not introduce any new syntax. By that measure, TypeScript is a new programming language, since it does introduce new syntax.
In computing, a program can be defined by an executable file stored on disk, containing code or binary instructions for the processor. A process can be defined as an instance of a program, when it is loaded into memory and executed by the processor. The operating system allocates and manages resources, e.g. heap, stack pointers, registers, for the process in a secure isolated manner.
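As a small illustration of the program/process distinction, a running Node.js script can inspect the process the operating system created for it. This is only a sketch; the variable name info is made up for this example, while process.pid and process.memoryUsage() are standard Node.js APIs.

```javascript
// A running Node.js program is an OS process with its own id and memory.
const usage = process.memoryUsage();
const info = {
  pid: process.pid,              // id assigned by the operating system
  heapUsedBytes: usage.heapUsed, // V8 heap currently in use
  rssBytes: usage.rss,           // memory resident in RAM for this process
};
console.log(info);
```

Running the same file twice at the same time gives two processes with different pids, even though the program on disk is identical.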
Threads are units of instructions to be executed from a process. Scheduling is the order of execution, and it refers to the ability of an operating system to decide which thread to execute at any given time. Some threads are more important than others. However, note the following distinctions:
- In Node.js, threads are best suited for CPU-intensive tasks, not I/O-intensive tasks.[1] For example, a cryptographic computation such as crypto.pbkdf2() benefits from running on a separate thread, whereas fs.writeFile() gains little.
- Threads are not used in Node.js for network I/O. Instead, network requests are managed by libuv using facilities provided by the operating system, e.g. epoll on Linux and kqueue on macOS.
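To make the first point concrete, here is a minimal sketch of moving a CPU-bound loop onto a separate thread with the worker_threads module. The helper name sumInWorker and the summing workload are assumptions chosen for illustration; they are not from the course.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: sums 1..n on a separate thread so the main thread stays free.
function sumInWorker(n) {
  // The worker's code is passed as a string and evaluated in the new thread.
  const source = `
    const { parentPort, workerData } = require('worker_threads');
    let total = 0;
    for (let i = 1; i <= workerData; i++) total += i;
    parentPort.postMessage(total);
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1e6).then((total) => console.log(total)); // prints 500000500000
```

While the worker is busy, the main thread can keep serving timers and I/O callbacks, which is the benefit for CPU-heavy work. An I/O-heavy task would see little gain from this, since the waiting already happens outside the JavaScript thread.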
Further note that having more threads does not necessarily improve file I/O performance. File read-write performance depends on a completely different set of factors, such as bus speed, bus bandwidth, and the storage device.
In spite of that, using a thread from the thread pool to perform file I/O is necessary in order to enable asynchronicity. Having more threads does in fact mean that we can read more files simultaneously, but it doesn't mean the act of reading an individual file becomes faster 🤯
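The following sketch shows several file reads being started together. The helper name readManyFiles, the temporary directory prefix, and the file contents are all made up for this example; the fs/promises, os, and path calls are standard Node.js APIs.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes `count` small temp files, then reads them all concurrently.
async function readManyFiles(count) {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'pool-demo-'));
  const files = [];
  for (let i = 0; i < count; i++) {
    const file = path.join(dir, `file-${i}.txt`);
    await fs.writeFile(file, `contents ${i}`);
    files.push(file);
  }
  // Every read is started before any of them is awaited,
  // so several reads can be in flight on the thread pool at once.
  const contents = await Promise.all(files.map((f) => fs.readFile(f, 'utf8')));
  await fs.rm(dir, { recursive: true, force: true }); // clean up the temp files
  return contents;
}

readManyFiles(8).then((contents) => console.log(contents.length, 'files read'));
```

Starting the reads together lets more of them overlap, but each individual read still takes as long as the disk needs.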
The thread pool (also known as the Worker pool) is a collection of worker threads managed by the libuv library. By default, Node.js starts 4 threads, and the long-running operations referenced in the event loop are, in truth, thread pool tasks. Note that the event loop itself does not assign tasks to the thread pool; it only checks for and runs pending callbacks once thread pool tasks are complete. Thread pool tasks are assigned by the standard libraries themselves, e.g. the fs module. Custom thread pool tasks can be assigned from a Node.js C++ addon through N-API and WebWorker Threads. Although creating new workers with the worker_threads module is feasible, it may be preferable to use the built-in thread pool instead.[2]
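One way to observe the pool size is to start more hashing jobs than there are threads and compare when each finishes. This is a rough sketch: the helper name timeHashes and the iteration count are assumptions, and the actual timings depend on the machine and its core count.

```javascript
const crypto = require('crypto');

// Hypothetical helper: starts `count` pbkdf2 jobs at once and
// resolves with the time (in ms) at which each job finished.
function timeHashes(count) {
  const start = Date.now();
  const jobs = [];
  for (let i = 0; i < count; i++) {
    jobs.push(new Promise((resolve, reject) => {
      crypto.pbkdf2('password', 'salt', 50000, 64, 'sha512', (err) => {
        if (err) return reject(err);
        resolve(Date.now() - start);
      });
    }));
  }
  return Promise.all(jobs);
}

timeHashes(5).then((times) => console.log(times));
```

With the default pool of 4 threads, the fifth job typically has to wait for a free thread and so tends to finish noticeably later than the first four. The pool size can be changed by setting the UV_THREADPOOL_SIZE environment variable before the process starts.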
In the source code of the open-source Node.js project, the lib folder contains JavaScript code, mostly function definitions and wrappers over C++. In contrast, the src folder contains the C++ implementations of those functions, which pull in dependencies from the V8 project, the libuv project, the zlib project, the llhttp project, and many more, all of which are placed in the deps folder.
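The bundled dependencies can be seen from inside a running process through process.versions. A short sketch, printing only a few of the available keys:

```javascript
// Versions of some bundled dependencies the running Node.js binary was built with
const { node, v8, uv, zlib } = process.versions;
console.log({ node, v8, uv, zlib });
```

Here uv is the libuv version. The full process.versions object lists more entries than the four picked out above.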
For example, the latest implementation of the exported function named pbkdf2 in the file lib/internal/crypto/pbkdf2.js contains a reference to a class named PBKDF2Job. However, that class is never found in the JavaScript source code. It is instead found in the file src/crypto/crypto_pbkdf2.h, and is made available to internal JavaScript code through const {PBKDF2Job} = internalBinding('crypto');
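From user code, none of that internal wiring is visible. The sketch below calls only the public crypto API; the password, salt, and iteration count are arbitrary values chosen for the example.

```javascript
const crypto = require('crypto');

// Public API call; the actual key derivation runs in Node's C++ layer,
// reached through the internal binding described above.
const key = crypto.pbkdf2Sync('secret', 'salt', 1000, 32, 'sha256');
console.log(key.length, key.toString('hex'));
```

Note that internalBinding itself is not available to application code; it is only usable from Node's own lib folder.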
When do libuv and V8 come into play? V8 is the JavaScript engine that executes the code, and Node's C++ side uses the V8 API to translate between JavaScript values and their C++ equivalents. The libuv project provides file system access, some aspects of concurrency, and a lot of processing constructs on the C++ side.
To run threads well, CPUs come with multiple cores. A single core can also process more than one thread at a time, which is called simultaneous multithreading (Hyper-Threading in Intel's terminology). For example, let's say Thread 1 is reading from a file, and Thread 2 is multiplying 3x3. Reading a file always takes a non-zero amount of time, which the OS can detect, putting Thread 1 on pause to allow Thread 2 to run.
Threads inside the event loop. Node.js creates 1 thread by default and executes all code in that one thread. There is exactly one Event Loop per Node.js process; however, that Event Loop may initiate other smaller event loops within itself. Each iteration of an Event Loop is called a tick.
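Because all JavaScript runs on that one thread, a long synchronous computation holds up everything else, including timers. A minimal sketch, where the helper name measureTimerDelay and the durations are assumptions for illustration:

```javascript
// Hypothetical helper: schedules a 10 ms timer, then blocks the thread for `blockMs`.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), 10);
    // Busy-wait: the single JavaScript thread cannot run the timer callback meanwhile.
    while (Date.now() - start < blockMs) {}
  });
}

measureTimerDelay(100).then((ms) => console.log(`timer fired after ${ms} ms`));
```

The timer was requested for 10 ms, but its callback can only run once the blocking loop has finished, so the reported delay is at least the blocking duration.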
Stephen Grider mentions 5 major phases, checked by the event loop:
- Pending callbacks of timers to run? e.g. setTimeout, setInterval
- Pending callbacks of system operations and long-running operations to run? e.g. epoll(), inotify_init1(), kevent() are system operations, and Thread Pool tasks are long-running operations
- Pause execution, continue only when:
  - a new event is done for system operations
  - a new event is done for long-running operations
  - a timer is about to complete
- Pending callbacks for setImmediate?
- Pending close events to cleanup?
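The ordering of these phases can be observed from inside an I/O callback, where a setImmediate callback runs before a zero-delay timer. The helper name phaseOrder is made up for this sketch, and fs.stat on the current directory is just a convenient way to get an I/O callback.

```javascript
const fs = require('fs');

// Hypothetical helper: records the order in which callbacks run
// when both are scheduled from inside an I/O callback.
function phaseOrder() {
  return new Promise((resolve) => {
    const order = [];
    fs.stat('.', () => {
      order.push('io');
      // The timer has to wait for the next tick's timers check...
      setTimeout(() => { order.push('timeout'); resolve(order); }, 0);
      // ...while setImmediate is checked later in the current tick.
      setImmediate(() => order.push('immediate'));
    });
  });
}

phaseOrder().then((order) => console.log(order.join(' -> '))); // io -> immediate -> timeout
```

When the same two calls are made from the main module rather than from an I/O callback, their relative order is not guaranteed.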
So, the Event Loop makes one more iteration if it detects any pending timers, or system operations, or long-running operations.
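Whether a pending timer counts towards keeping the loop going can be inspected with hasRef(). This sketch is an addendum rather than something from the course: unref() tells Node.js that a timer alone should not keep the Event Loop iterating.

```javascript
// A pending interval normally keeps the Event Loop iterating.
const timer = setInterval(() => {}, 1000);
const before = timer.hasRef(); // true: this timer keeps the loop alive

timer.unref();                 // opt out: the loop may stop despite this timer
const after = timer.hasRef();  // false

clearInterval(timer);          // remove the timer entirely
console.log({ before, after });
```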
Is Node.js single-threaded? The answer isn't straightforward. The Event Loop of Node.js is single-threaded; however, file I/O operations, cryptographic operations, and some network operations (dns.lookup(), for instance) run on multiple threads through the thread pool.
Conclusion
I can't say I don't yearn for another, deeper and more advanced series of materials from Stephen Grider or someone like him. Nevertheless, I do think I am now well equipped to pursue the next steps in this learning journey by myself.
[1] Node.js API Docs: Worker threads
[2] Node.js Guide Docs: Don't Block the Event Loop
These are mostly the opinions of Stephen Grider. I researched this topic further after completing his course. The newer, more concise notes I made are available here at github.com/midnqp/nodejs