• The REPL: Issue 100 - December 2022

    Just Use Postgres for Everything

    Complexity can be reduced by having fewer dependencies and systems. Postgres is a fantastic technology, and it gets better with every release. I’ve been doing what this article advocates for years: using Postgres by default (e.g. for JSON storage, backing a job queue, full-text search), and only moving away when needed.

    SQLite’s automatic indexes

    Preetam Jinka explains how SQLite handles joins on un-indexed fields: it creates a temporary index! This saves SQLite from having to implement hash joins.

    What I learned from pairing by default

    Eve Ragins talks about the lessons learned from pairing by default. I’ve done a fair amount of pairing, but my sweet spot is no more than 2 or 3 hours a day. After that it becomes too tiresome. There is also some exploratory work that I would rather do by myself, to avoid having to talk through everything I am thinking.

    Read on →

  • The REPL: Issue 99 - November 2022

    Postgres: Safely renaming a table with no downtime using updatable views

    Once again, Brandur posts a practical example of using Postgres effectively. The article covers how to rename a table safely using views. Other renames can be a bit more complicated. In the article's example, a table was renamed from chainwheel to sprocket. In a typical app, there will also be foreign keys pointing to the table, named chainwheel_id (or similar). Those would still need to be renamed to sprocket_id. Postgres includes support for generated columns:

    A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.

    but it doesn’t quite have all the functionality needed to change a column name without downtime.

    Vanilla Rails is plenty

    Jorge Manrubia, from 37signals, objects to the criticism that Rails encourages poor separation of concerns. Among the things that I agree with is that plain Ruby objects (POROs) are probably underused in most applications. I don’t like some of the prescriptions in the article, though.

    I don’t like concerns. While it’s nice that functionality is split into its own file, when included in models they end up making the API of the ActiveRecord model bigger. It’s already huge to start with. With large code bases, it can be very challenging to know all the ways that ActiveRecord objects are being used. Adding more domain methods doesn’t make it better. Instead, I’ve had better luck using service objects. They make the APIs narrower. A win in my book.

    In the last few years, I’ve found that separating data from functionality is one of the patterns that gives great results and scales well. Value or data objects encapsulate the data. Other classes manipulate that data. Each has its own lifecycle. Mixing them together is the OOP way – which Rails leans heavily on – but it tends to create very broad interfaces (see ActiveRecord).

    Read on →

  • Asdf, Direnv Together

    I previously wrote about how I use asdf and direnv together to set up per-project postgres versions. I recently learned about asdf-direnv, a direnv plugin for asdf.

    asdf works by creating shims of every executable. This adds some overhead. The plugin works by leveraging direnv to change the PATH to the actual executable, instead of the shim.
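
    To make the shim idea concrete, here is a minimal sketch. It is a toy, not the code asdf actually generates: it builds a hypothetical tool called hello-tool in a temporary directory, plus a shim whose only job is to forward to it, and shows that whichever directory comes first in PATH decides what gets run. All file and directory names are made up for illustration.

    ```shell
    # Illustrative only: a toy shim, not what asdf really generates.
    demo_dir="$(mktemp -d)"
    mkdir -p "$demo_dir/installs/bin" "$demo_dir/shims"

    # The "real" executable (hypothetical tool name).
    printf '#!/bin/sh\necho "hello from the real tool"\n' > "$demo_dir/installs/bin/hello-tool"

    # The shim: an extra process whose only job is to exec the real one.
    printf '#!/bin/sh\nexec "%s/installs/bin/hello-tool" "$@"\n' "$demo_dir" > "$demo_dir/shims/hello-tool"

    chmod +x "$demo_dir/installs/bin/hello-tool" "$demo_dir/shims/hello-tool"

    # With the shim directory first in PATH, every call goes through the shim.
    via_shim="$(PATH="$demo_dir/shims:$PATH"; command -v hello-tool)"

    # With the install directory first, the shell finds the real executable directly.
    direct="$(PATH="$demo_dir/installs/bin:$PATH"; command -v hello-tool)"

    echo "via shim: $via_shim"
    echo "direct:   $direct"
    ```

    Both paths print the same greeting when executed; the difference is the extra process (and, in a real shim, the version lookup) on every invocation.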

    Results

    I use asdf to install most tools whose versions I want to control precisely for my projects. Usually, this means the ruby and postgres versions. Let’s time the performance without using asdf-direnv:

    $ which ruby
    /Users/ylansegal/.asdf/shims/ruby
    
    $ time ruby -e "puts 'hello'"
    hello
    ruby -e "puts 'hello'"  0.04s user 0.02s system 38% cpu 0.155 total
    
    
    $ which psql
    /Users/ylansegal/.asdf/shims/psql
    
    $ time psql -c 'select now()'
                  now
    -------------------------------
     2022-11-28 17:01:07.470615-08
    (1 row)
    
    Time: 0.142 ms
    psql -c 'select now()'  0.01s user 0.01s system 12% cpu 0.129 total
    

    Installing asdf-direnv is straightforward, as listed in the documentation. Once enabled in my .envrc file:

    $ cat .envrc
    use asdf
    watch_file ".ruby-version"
    

    We can see the performance gains:

    $ which ruby
    /Users/ylansegal/.asdf/installs/ruby/3.0.4/bin/ruby
    
    $ time ruby -e "puts 'hello'"
    hello
    ruby -e "puts 'hello'"  0.04s user 0.02s system 93% cpu 0.065 total
    
    $ which psql
    /Users/ylansegal/.asdf/installs/postgres/13.5/bin/psql
    
    $ time psql -c 'select now()'
                  now
    -------------------------------
     2022-11-28 17:01:42.357192-08
    (1 row)
    
    Time: 0.195 ms
    psql -c 'select now()'  0.00s user 0.00s system 56% cpu 0.012 total
    
    Command   With shim (s)   Without shim (s)
    ruby      0.155           0.065
    psql      0.129           0.012

    The savings are about 90 ms for ruby and about 117 ms for psql. A common rule of thumb is that anything under 200 ms is perceived as “immediate”. Even so, my terminal feels much snappier.

    I’ve been using this setup for a few weeks. The only issue I’ve encountered is that the plugin occasionally fails to pick up changes in .ruby-version, even though the documentation states that watch_file should fix that. I’ve been able to work around it with touch .envrc, which forces the PATH to be recalculated.
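
    As a quick check on the numbers, the snippet below subtracts the two totals reported by time for each command. The values are copied from the runs above. The savings_ms helper is a made-up name, and awk is used only because plain shell arithmetic has no decimals.

    ```shell
    # Totals (in seconds) copied from the `time` output in this post.
    # savings_ms is a hypothetical helper: $1 = total with shim, $2 = total without.
    savings_ms() {
      awk -v with_shim="$1" -v without_shim="$2" 'BEGIN { printf "%.0f\n", (with_shim - without_shim) * 1000 }'
    }

    echo "ruby: $(savings_ms 0.155 0.065) ms saved"
    echo "psql: $(savings_ms 0.129 0.012) ms saved"
    ```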

    Read on →

  • The REPL: Issue 98 - October 2022

    Rebase dependent branches

    Taylor Blau at the GitHub blog highlights a new feature in git (v2.38) that I am super excited about: you can now git rebase --update-refs. Since reading that, I’ve already saved a lot of time (and minimized mistakes) when working on a set of branches that build on each other.
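
    For readers who have not tried it, here is a small sketch of the workflow in a throwaway repository. It assumes git 2.38 or newer. The branch names (part-1, part-2), the file names, and the commit_file helper are made up for the demo.

    ```shell
    # Requires git >= 2.38. Branch and file names are hypothetical.
    repo="$(mktemp -d)"
    cd "$repo"
    git init -q
    git config user.email "demo@example.com"
    git config user.name "Demo"
    git config commit.gpgsign false

    # Small helper: add one file and commit it.
    commit_file() {
      echo "$1" > "$1.txt"
      git add "$1.txt"
      git commit -q -m "$1"
    }

    base_branch="$(git symbolic-ref --short HEAD)"  # main or master, depending on config
    commit_file base

    # Two branches stacked on top of each other.
    git checkout -q -b part-1
    commit_file part-1
    git checkout -q -b part-2
    commit_file part-2

    # Meanwhile, the base branch moves forward.
    git checkout -q "$base_branch"
    commit_file new-base-work

    # Rebase only the tip of the stack; --update-refs also moves part-1.
    git checkout -q part-2
    git rebase -q --update-refs "$base_branch"

    git log --oneline --decorate
    ```

    Without --update-refs, part-1 would still point at its old commit after the rebase and would need to be rebased or reset by hand.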

    Partitioning in Postgres, 2022 edition

    Brandur highlights that Postgres has made great usability improvements to partitioning over the last few years. It is now relatively easy to take advantage of it.

    Add Data class implementation: Simple immutable value object

    A new immutable value object, Data, has been merged into Ruby and is slated for release soon. It’s stricter than a Struct, which in many cases is exactly what you need from a value object.

    Read on →

  • Git Monorepo Improved Performance

    git recently shipped some performance improvements when working with large repositories, as announced on the GitHub blog.

    I tested in a large repository. With default configuration:

    $ time git status
    On branch master
    Your branch is behind 'origin/master' by 686 commits, and can be fast-forwarded.
      (use "git pull" to update your local branch)
    
    nothing to commit, working tree clean
    git status  0.40s user 8.55s system 429% cpu 2.082 total
    

    We then configure fsmonitor and untrackedcache:

    $ git config core.fsmonitor true
    $ git config core.untrackedcache true
    

    And run twice, to warm up the cache:

    $ time git status
    On branch master
    Your branch is behind 'origin/master' by 686 commits, and can be fast-forwarded.
      (use "git pull" to update your local branch)
    
    nothing to commit, working tree clean
    git status  0.38s user 1.43s system 159% cpu 1.141 total
    
    $ time git status
    On branch master
    Your branch is behind 'origin/master' by 686 commits, and can be fast-forwarded.
      (use "git pull" to update your local branch)
    
    nothing to commit, working tree clean
    git status  0.13s user 0.03s system 92% cpu 0.178 total
    

    The improvement is significant: git status went from 2.082 s to 0.178 s once the cache was warm. That is under 200 ms, which users generally perceive as instantaneous. I’m thrilled!
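
    To put a number on it, this snippet computes the speedup from the first and last time totals shown above. The speedup helper is a made-up name; awk handles the decimal division.

    ```shell
    # Totals (in seconds) copied from the `time git status` runs in this post.
    # speedup is a hypothetical helper: $1 = total before, $2 = total after.
    speedup() {
      awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
    }

    echo "git status is roughly $(speedup 2.082 0.178)x faster with a warm cache"
    ```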

    Read on →