DEV Community

Bertug Korucu for Kodeas

Posted on • Edited on

Bulk Update Multiple Records with Separate Data — Laravel

As a rule of thumb, we should never run database queries inside a for-loop!

“Database transaction” is an expensive operation.

For example, let’s say we designed an inventory software and it’s being used in production for a year, and reached 1,000,000 transactions.

Suddenly, we learnt that we didn’t add the VAT to our transactions. For the transactions in the future, it’s pretty easy to deal with, maybe with a mutator.

class Transaction extends Model { 
    public $vat = 0.20;

    public function setPriceAttribute($value) {
        $this->attributes['price'] += $value * $this->vat;
    }
}
Enter fullscreen mode Exit fullscreen mode

Future records are pretty easy to deal with. However how are we going to edit the 1 million records from the past.

For editing data from the past, I prefer to create a Seeder.

php artisan make:seeder AddVatToTransactions

How not to do it?

Drake says no

class AddVatToTransactions extends Seeder {

  public function run() 
  {
    $vat = 0.20;
    $transactions = Transaction::get();

    foreach ($transactions as $transaction) {
       $transaction->price += $transaction->price * $vat
       $transaction->save();
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

However, running it in a loop of 1 million and making a “database transaction” in each iteration of the loop — not a good idea! (Spoiler Alert: It’ll freeze your system 😀)


Then, how to do it?

Again, in our AddVatToTransactions Seeder:

The idea in mysql query is “CASE Statements”

UPDATE db.transactions
SET PRICE = CASE  
              WHEN id = 3 THEN 500
              WHEN id = 4 THEN 300
           END 
WHERE ID IN (3, 4)
Enter fullscreen mode Exit fullscreen mode

Now, let’s do it in Laravel:

$vat = 0.20;
$transactions = Transaction::get();

$cases = [];
$ids = [];
$params = [];

foreach ($transactions as $transaction) {
    $cases[] = "WHEN {$transaction->id} then ?";
    $params[] = $transaction->profit * $vat;
    $ids[] = $transaction->id;
}

$ids = implode(',', $ids);
$cases = implode(' ', $cases);

if (!empty($ids)) {
    \DB::update("UPDATE transactions SET `price` = CASE `id` {$cases} END WHERE `id` in ({$ids})", $params);
}
Enter fullscreen mode Exit fullscreen mode

This will make one database transaction to write it all your updates.⚡️

Drake says yes


🗣 I can hear some of you saying: “It’s still FREEZING”

So.. Optimizing it even further:

#1: “Select” only the data you need from the database to consume less RAM.

In our example, we only use “id” and “price” columns. So let’s only select them.

$transactions = Transaction::select('id', 'price')->get();
Enter fullscreen mode Exit fullscreen mode

#2: “Chunk” your collection to separate your transaction to multiple database transactions.

In Laravel, you can chunk collections like

Transaction::get()->chunk(5000);
Enter fullscreen mode Exit fullscreen mode

Let’s apply all in our example

Here, first we divide our $transactions collection into 5000 chunks and we do a “database transaction” per 5k records at once.

$vat = 0.20;
$transactions = Transaction::get();

foreach ($transactions->chunk(5000) as $chunk) {
   $cases = [];
   $ids = [];
   $params = [];

   foreach ($chunk as $transaction) {
       $cases[] = "WHEN {$transaction->id} then ?";
       $params[] = $transaction->profit * $vat;
       $ids[] = $transaction->id;
   }

   $ids = implode(',', $ids);
   $cases = implode(' ', $cases);

   if (!empty($ids)) {
       \DB::update("UPDATE transactions SET `price` = CASE `id` {$cases} END WHERE `id` in ({$ids})", $params);
   }
}
Enter fullscreen mode Exit fullscreen mode

Hope you like this trick!

Please let me know what you think in the comments 💬

Happy coding! 😊

Top comments (4)

Collapse
 
squatto profile image
Scott Carpenter

Using CASE statements is a clever way to do this! You just need to make sure that you pay attention to the length of your query so that you don't go over the max query length/packet size. In MySQL you can check the max by looking at the max_allowed_packet variable: show variables like 'max_allowed_packet';

When I have a one-off update to run against a large number of records, I typically create an Artisan command (or a job) that updates a small subset of the records, and then I'll schedule it to run every minute (or a longer period, if needed). It spreads the load over time and is easy to keep track of the progress. Once it's done, I remove the call from the scheduler and delete the command/job class.

Collapse
 
bertugkorucu profile image
Bertug Korucu • Edited

Spreading it to several jobs is indeed a great strategy! I guess it's all about picking the right strategy for the job.

If the task is not a time-sensitive issue, spreading it overtime like you said beats the bulk update - but if it's time-sensitive, I go with the bulk update strategy.

Collapse
 
sahilkashyap64 profile image
Sahil kashyap • Edited

Thank you,I used this in my Upload multiple images and rename them using drag & drop UI using dropzone.js

Link to article

img

Github link

Collapse
 
timoye profile image
Timothy Soladoye

I remember how this saved me few years back!