• GoodJob Bulk Enqueue

    A common pattern in Rails applications is to queue similar jobs for a collection of objects. For example:

    post.watchers.find_each do |user|
      NotifyOfChanges.perform_later(user, post)
    end
    

    The above will generate 1 INSERT SQL statement for each job queued. I recently noticed that GoodJob introduced a bulk enqueue feature. It allows using a single INSERT statement for all those jobs, similar to Rails’s #insert_all:

    GoodJob::Bulk.enqueue do
      post.watchers.find_each do |user|
        NotifyOfChanges.perform_later(user, post)
      end
    end
    

    Let’s see what the performance is locally:

    class NoOpJob < ApplicationJob
      def perform
      end
    end
    
    require 'benchmark/ips'
    
    Benchmark.ips do |x|
      x.config(:time => 10, :warmup => 5)
    
    
      x.report('Single Inserts') {
        ApplicationRecord.transaction do
          500.times { NoOpJob.perform_later }
        end
      }
      x.report('Bulk Inserts') {
        ApplicationRecord.transaction do
          GoodJob::Bulk.enqueue do
            500.times { NoOpJob.perform_later }
          end
        end
      }
    
      x.compare!
    end
    
    $ rails runner benchmark.rb
    Running via Spring preloader in process 46655
    Warming up --------------------------------------
          Single Inserts     1.000  i/100ms
            Bulk Inserts     1.000  i/100ms
    Calculating -------------------------------------
          Single Inserts      0.833  (± 0.0%) i/s -      9.000  in  10.823196s
            Bulk Inserts      4.746  (± 0.0%) i/s -     48.000  in  10.155956s
    
    Comparison:
            Bulk Inserts:        4.7 i/s
          Single Inserts:        0.8 i/s - 5.70x  slower
    

    Locally, we can see a significant performance boost due to fewer round trips to the database. But using bulk enqueue can be even more impactful than that. Production systems typically see much more concurrent load than my local machine. When the queueing is wrapped in a transaction, many individual inserts keep that transaction open longer, which can be very disruptive: long-running transactions can slow the whole system down. Bulk inserting records is a great way to keep transactions short, and the GoodJob feature provides an easy way to do that while keeping the semantics of the code the same.
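
    To make the round-trip difference concrete, here is a toy sketch of the buffer-then-flush idea. It is not GoodJob's implementation: `FakeQueue`, `enqueue`, and `bulk` are hypothetical names, and the "database" only counts how many writes it receives.

    ```ruby
    # Toy model only: counts "round trips" instead of talking to a database.
    # FakeQueue and its methods are hypothetical, not part of GoodJob.
    class FakeQueue
      attr_reader :round_trips, :rows

      def initialize
        @round_trips = 0
        @rows = []
        @buffer = nil
      end

      # Enqueue one job: buffered inside a bulk block, written immediately otherwise.
      def enqueue(job)
        if @buffer
          @buffer << job
        else
          write([job])
        end
      end

      # Collect every job enqueued inside the block, then write them in one batch.
      def bulk
        @buffer = []
        yield
        jobs = @buffer
        @buffer = nil
        write(jobs) unless jobs.empty?
      ensure
        # If the block raised, drop whatever was buffered.
        @buffer = nil
      end

      private

      # Stands in for one INSERT statement, however many rows it carries.
      def write(jobs)
        @round_trips += 1
        @rows.concat(jobs)
      end
    end

    single = FakeQueue.new
    500.times { |i| single.enqueue(i) }

    bulk = FakeQueue.new
    bulk.bulk { 500.times { |i| bulk.enqueue(i) } }

    puts "single: #{single.round_trips} round trips, bulk: #{bulk.round_trips}"
    # prints "single: 500 round trips, bulk: 1"
    ```

    The calling code inside the block is unchanged, which is the same property that makes the real feature easy to adopt.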

    Read on →

  • The REPL: Issue 104 - April 2023

    Making A Network Call: Mitigate The Risk

    Nate Berkopec, well known for his Ruby/Rails performance work, offers some good advice on mitigating the performance risk of making network calls: make calls in background jobs whenever possible, set aggressive network timeouts, and use circuit breakers to fail fast when you detect that a system is misbehaving.

    I’m not saying this is easy, I’m saying it’s necessary.
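
    As a rough illustration of the circuit breaker idea, here is a minimal sketch. It is not taken from the article: the `CircuitBreaker` class, its `threshold:` and `cool_off:` parameters, and the injectable `clock:` are all assumptions made for the example.

    ```ruby
    # Hypothetical, minimal circuit breaker; not code from the linked article.
    class CircuitBreaker
      class OpenError < StandardError; end

      def initialize(threshold:, cool_off:,
                     clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
        @threshold = threshold # consecutive failures before the circuit opens
        @cool_off = cool_off   # seconds to fail fast before trying again
        @clock = clock         # injectable so the example does not need to sleep
        @failures = 0
        @opened_at = nil
      end

      # Runs the block unless the circuit is open; tracks consecutive failures.
      def call
        raise OpenError, "circuit open, failing fast" if open?

        result = yield
        @failures = 0
        @opened_at = nil
        result
      rescue OpenError
        raise # failing fast does not count as another upstream failure
      rescue StandardError
        @failures += 1
        @opened_at = @clock.call if @failures >= @threshold
        raise
      end

      def open?
        !@opened_at.nil? && (@clock.call - @opened_at) < @cool_off
      end
    end

    now = 0.0
    breaker = CircuitBreaker.new(threshold: 2, cool_off: 30, clock: -> { now })

    2.times do
      begin
        breaker.call { raise IOError, "upstream timed out" }
      rescue IOError
        # the caller would normally log or fall back here
      end
    end

    puts breaker.open? # prints "true"
    ```

    Once open, further calls raise `OpenError` without touching the slow dependency until the cool-off period has passed.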

    Makefile Tutorial By Example

    make is tried and true technology. I don’t write Makefiles often. When I do, having a mental model of how make treats dependencies helps make the whole enterprise more efficient and enjoyable. This guide has plenty of material to get you started.

    Pure sh Bible

    A very ingenious collection of recipes for sh that avoid spawning new processes. Some of the syntax is clever, but terrifying to read. Case in point:

    trim_string() {
        trim=${1#${1%%[![:space:]]*}}
        trim=${trim%${trim##*[![:space:]]}}
    
        printf '%s\n' "$trim"
    }
    

    Read on →

  • The REPL: Issue 103 - March 2023

    What Is ChatGPT Doing … and Why Does It Work?

    Stephen Wolfram explains what ChatGPT is doing. This article is very technical and on the long side. I found it quite enlightening to learn about what LLMs are doing. A more accessible article can be found, but I hate its name: “normie” is an objectionable term. It’s condescending, like you are not “in on something”.

    I wish people would stop insisting that Git branches are nothing but refs

    The article covers the friction between the mental model of using git and the actual implementation. I’ve read a ton of complaints about git’s UX, the leaky abstraction, and all that. It’s true: using git without knowing some of the plumbing is hard and doesn’t make much sense. In my experience, what I’ve learned over the years from articles, and especially from the Building Git book, has made me much better at it, because I now understand the internals.

    In any case, the author seems correct to me: saying that branches are just refs is not helpful. A branch is a moving ref, which implies a series of commits. That is how most people think and talk about branches.
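
    The "moving ref that implies a series of commits" idea can be sketched with a toy model. This is not how git stores objects; `Commit`, `ToyRepo`, and the commit ids are made up for the illustration.

    ```ruby
    # Toy model, not git's storage format: a commit knows its parent,
    # and a branch is just a name that points at one commit.
    Commit = Struct.new(:id, :parent)

    class ToyRepo
      def initialize
        @refs = {}
      end

      # Committing creates a commit and moves the branch ref forward to it.
      def commit(branch, id)
        @refs[branch] = Commit.new(id, @refs[branch])
      end

      # The "branch" people talk about: every commit reachable from the ref.
      def history(branch)
        ids = []
        commit = @refs[branch]
        while commit
          ids << commit.id
          commit = commit.parent
        end
        ids
      end
    end

    repo = ToyRepo.new
    %w[a1 b2 c3].each { |id| repo.commit("main", id) }
    p repo.history("main") # prints ["c3", "b2", "a1"]
    ```

    The ref itself only ever holds one commit, yet following parents from it yields the series of commits that people usually mean by "the branch".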

    Read on →

  • TIL: rails restart

    I first started writing Rails in 2010. Today I learned on the Ruby on Rails Blog that you can restart a running server in development with:

    $ bin/rails restart
    

    Up until today, I always quit my server (ctrl-c) and restarted it when I wanted to pick up a change that won’t be hot-reloaded (e.g. a change to an initializer). This works, but it is slower, especially when I am using foreman to start a fleet of processes (e.g. webpacker, background workers).

    Learning is a life-long process.

    Read on →

  • The REPL: Issue 102 - February 2023

    That Wild Ask A Manager Story

    This article references a story that is new to me. In short, the person who went through the series of interviews is not the same person who shows up for work. Instead of over-reacting, Jacob Kaplan-Moss argues that we should do nothing:

    The premise here is simple: designing a human process around pathological cases leads to processes that are themselves pathological.

    Postgres DDL Statements and Availability

    This is a great reference for how each schema-changing operation affects availability in Postgres.

    A career ending mistake

    John Arundel talks about career paths in software engineering:

    As software engineers, we’re constantly making detailed, elaborate plans for computers to execute. Isn’t it weird that we rarely give a moment’s thought to the program for our own careers?

    Read on →