All programming languages have their advantages and disadvantages. Ruby’s ease of use and readable, expressive syntax make it a great introduction to the world of software engineering. A topic came up the other day in a discussion about the pluses and minuses of Ruby versus some other languages. Ruby sits in a grey area of being both, and neither, a compiled language and an interpreted one. In the standard implementation (CRuby), the code we write is first compiled into a lower-level representation called bytecode, and that bytecode is then executed by Ruby’s virtual machine rather than being turned into a standalone native executable ahead of time. Interpreted languages tend to be more portable to other environments, while a compiled binary targets a specific platform and is not guaranteed to work on a machine it was not compiled for. Compiled languages are generally faster and less resource intensive, and can be tuned down to the machine code level, which is much harder to do with interpreted languages.
A colleague told me that Ruby is slow by comparison because multi-threading is not possible with an interpreted language. In fact, Ruby does have a few ways to utilize multiple threads. But what does that even mean? When people talk about multi-threading, they usually mean “a technique by which a single set of code can be used by several processors at different stages of execution.” To clarify, it is a way to split a program into threads that can run in parallel on separate processors. If a program takes 4 minutes to compute, you may be able to break the workload into 4 parts and run them in 4 threads for a total processing time of, ideally, about 1 minute instead of 4.
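To make the "break the workload into 4 parts" idea concrete, here is a minimal sketch using Ruby's built-in Thread class. The workload (summing the squares of 1 to 1000) and the variable names are made up for illustration. Note that this only shows how the work is split and recombined; whether the four threads actually run at the same time depends on the Ruby implementation, as discussed later in this post.

```ruby
# Hypothetical workload: sum the squares of the numbers 1..1000.
numbers = (1..1000).to_a

# Break the workload into four parts of 250 numbers each.
slices = numbers.each_slice(250).to_a

# Start one thread per slice; each thread computes a partial sum.
threads = slices.map do |slice|
  Thread.new { slice.sum { |n| n * n } }
end

# Thread#value waits for the thread to finish and returns its block's result.
partial_sums = threads.map(&:value)
total = partial_sums.sum

puts total
```

The result is the same as computing the sum in a single loop; only the way the work is scheduled changes.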
We’ll get back to multi-threading in a moment. Two things often get confused, so let’s first talk about concurrency and parallelism. Parallelism is when two tasks literally run at the same time, while concurrency is when two tasks can start, run, and complete in overlapping time periods, which doesn’t mean that they are ever running at the same instant.
As you can see from the image above, concurrency is cool, but what I really want is parallelism. In Ruby, it turns out there are a few ways to achieve parallelism. We can use multiple processes or multiple threads, and again, each way has its pluses and minuses. In Ruby, we can use the fork() method to create a “copy” of the current process.
Using fork() can be very memory intensive, however, because resources are allocated for each forked process. So, if you are running a process that takes 10 MB to run but are running it 1000 times, it might get done in the timespan of running just a few processes, but the memory required could now approach 10 GB. Child processes also need to be managed: if a child exits and the parent never waits on it, the child lingers as a zombie entry in the process table, and if the parent dies before the child, the orphaned child can keep running in the background, eating up resources with no purpose. Examples of tools that use multiple processes include Unicorn and Resque.
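Here is a small sketch of what forking and cleaning up after child processes can look like. It assumes a Unix-like system (fork is not available on Windows), and the "work" is just a short sleep standing in for something real. Using the exit code to report back to the parent is an illustrative choice, not the only way to pass results between processes.

```ruby
# Fork three child processes; each one starts as a copy of the parent.
pids = 3.times.map do |i|
  fork do
    # This block runs only in the child process.
    sleep 0.1   # stand-in for real work
    exit!(i)    # illustrative: report back through the exit code
  end
end

# Wait on every child so that none is left behind as a zombie.
statuses = pids.map do |pid|
  _, status = Process.wait2(pid)
  status.exitstatus
end

puts statuses.inspect  # prints [0, 1, 2]
```

The important part is the Process.wait2 call: the parent collects each child's exit status, which lets the operating system release the child's process-table entry.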
Here is an infographic that explains some of the pluses and minuses of both multi-processing and multi-threading:
Looking at the table above, we can see that the advantages of multi-threading are more numerous than those of multi-processing. However, the drawbacks of multi-threading can be more complex to deal with. With multi-threading, resource usage is much lower because threads in the same process share memory, so an object is allocated once rather than once per worker. This is great! But it can also lead to data corruption and unexpected return values, because multiple threads may be reading and/or writing the same memory address. Ruby has a handy Thread class for creating threads inside a process; however, controlling and debugging how memory is accessed by multiple threads can be very complicated.
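As a rough illustration of guarding shared memory, the sketch below has four threads increment the same counter. The counter and lock names are made up for this example. Wrapping the update in a Mutex makes sure only one thread at a time performs the read-modify-write, which is one common way to avoid the corrupted values described above.

```ruby
counter = 0
lock = Mutex.new

# Four threads all update the same variable.
threads = 4.times.map do
  Thread.new do
    1000.times do
      # Only one thread at a time may run this block.
      lock.synchronize { counter += 1 }
    end
  end
end

# Wait for every thread to finish before reading the result.
threads.each(&:join)

puts counter  # prints 4000
```

Without the lock, the final value may or may not come out right depending on the Ruby implementation and timing, which is exactly what makes these bugs hard to track down.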
Something that safeguards against funky memory access is the GIL (Global Interpreter Lock), which is used by CRuby, the most common Ruby engine. The definition of a GIL is as follows: a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread can execute at a time. An interpreter that uses a GIL allows only one thread to execute interpreter code at a time, even if run on a multi-core processor. So basically, it won’t allow threads to run Ruby code truly in parallel, in order to safeguard against memory access/writing issues.
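One nuance worth seeing in code: on CRuby the lock is released while a thread is blocked, for example while sleeping or waiting on I/O, so threads can still overlap their waiting time. The sketch below uses sleep as a stand-in for a slow network call; the timings are approximate and will vary by machine.

```ruby
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Each thread "waits" for a quarter of a second, like a slow network call.
threads = 4.times.map do
  Thread.new { sleep 0.25 }
end
threads.each(&:join)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

# Run one after another, the four sleeps would take about 1 second.
puts format("4 threads finished in %.2f seconds", elapsed)
```

If the four sleeps were replaced with CPU-heavy Ruby code, CRuby would not show the same improvement, because only one thread could run Ruby code at any given moment.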
So are we just screwed in Ruby? The answer is kind of, but no. Implementations of Ruby such as CRuby have the GIL safeguard built in, but there are also implementations like JRuby and Rubinius that do not include it. There are also a few tools for working with server-side multi-threading: Thin, Sidekiq, and Puma.
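If you are not sure which implementation your code is running on, Ruby exposes it through the built-in RUBY_ENGINE constant. The sketch below is only a rough helper: the has_global_lock variable is a made-up name, and the mapping covers just the two engines I am sure of, leaving everything else as unknown.

```ruby
# RUBY_ENGINE is a built-in constant naming the running implementation.
engine = RUBY_ENGINE

# Illustrative mapping only; any other engine is treated as unknown (nil).
has_global_lock =
  case engine
  when "ruby"  then true   # CRuby
  when "jruby" then false  # JRuby
  end

puts "engine: #{engine}, global lock: #{has_global_lock.inspect}"
```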
In conclusion, there are a few ways to run multiple tasks simultaneously in Ruby. Multi-processing and multi-threading are both possible. They each have different pros and cons, and the choice between them should depend on the application.