Ylan Segal

The REPL: Issue 63 - November 2019

The Language Agnostic, All-Purpose, Incredible, Makefile

This post is a great introduction to make, one of the most versatile Unix programming tools, which has lost favor in recent years. I use make in some of my personal projects, but have yet to take advantage of it in large Ruby projects at work.

How containers work: overlayfs

Julia Evans explains how overlayfs – a union filesystem – powers containers and makes it much more efficient to build images from other images. Great read.

Large Teams: Finding a Green Build

I work on a large engineering team. The main Slack channel for our engineering department has 425 people in it. The code base is split into many repositories, dominated by a big Ruby on Rails application with many contributors. At the time of writing, the last 1,000 commits in master on that repository were made by 169 contributors.

The continuous integration for said mono-repo is heavily parallelized but still takes ~30 minutes to complete. Occasionally, a branch is merged that causes the build to fail. Usually, the case is that the specs worked correctly for that branch (otherwise we can’t merge), but new changes in master are not compatible. As hard as the team tries to maintain a green build (i.e. a build that passes and is deployable), a red build is somewhat frequent.

Branching off for new work from a commit that is not green guarantees that, later down the line, your branch’s build will fail, even if your code is nowhere near the failures. The solution is to merge master (hopefully green by now!), and wait for new builds.

I developed some scar tissue around this. I noticed that I started opening Circle CI, finding the last green build for the project’s workflow, copying the commit hash, and branching-off of that for my work. The results were very positive: I haven’t seen failures in one of my branches that are not related to my work since. The process is a bit tedious, though.

Automation to the rescue. Circle CI has an API, but it seems geared toward individual builds, while I was interested in whole workflows. However, Circle CI posts its build results to GitHub, along with some of the other integrations we use. GitHub’s GraphQL API lets me query the status of the last few commits on master and find the most recent one that was successful.

Let’s see the commented code:

#!/usr/bin/env ruby

# Use bundler, self-contained in the same script file, without needing
# a Gemfile and Gemfile.lock
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "git"
end

require "net/http"
require "json"

# A GitHub token can be obtained at github.com
token = ENV.fetch("GITHUB_API_TOKEN")

# Find the GitHub owner and repo from the git remote.
owner, repo =
  Git
  .open(".")
  .remotes
  .first
  .url
  .match(%r{git@github\.com:(.*)/(.*)\.git})
  .captures

# This is my first GraphQL query ever!
# I think I should be able to query only
# for builds with state == SUCCESS
# but I don't know how
query = <<~GRAPHQL
  query {
    repository(owner: \"#{owner}\", name: \"#{repo}\") {
      defaultBranchRef {
        target {
          ... on Commit {
            history(first: 50) {
              nodes {
                oid
                status {
                  state
                }
              }
            }
          }
        }
      }
    }
  }
GRAPHQL

github_url = URI("https://api.github.com/graphql")
headers = { "Authorization" => "bearer #{token}" }
body = { query: query }.to_json

# Send request to GitHub and parse response
response = JSON.parse(Net::HTTP.post(github_url, body, headers).body)

# Find commit nodes
nodes = response.dig("data", "repository", "defaultBranchRef", "target", "history", "nodes")
# Find the first "green" one.
last_build = nodes.find { |node| node.dig("status", "state") == "SUCCESS" }

if last_build
  puts last_build["oid"]
else
  # If we don't have a green build,
  # let's exit with non-zero status
  # to be a good unix citizen
  exit 1
end
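
The script above assumes the request succeeds. If the token is rejected or the repository is not found, the response may carry no commit nodes, and calling find on nil would raise. Here is a minimal sketch of a more defensive selection step, assuming the same response shape as the query above. The name first_green_oid and the sample payload are hypothetical and are not part of the original script:

```ruby
require "json"

# Hypothetical helper: returns the oid of the most recent commit whose
# status state is SUCCESS, or nil when there is none (or when the
# response has no commit nodes, e.g. because the API returned errors).
def first_green_oid(response)
  nodes = response.dig("data", "repository", "defaultBranchRef", "target", "history", "nodes")
  return nil unless nodes.is_a?(Array)

  green = nodes.find { |node| node.dig("status", "state") == "SUCCESS" }
  green && green["oid"]
end

# Made-up payload, shaped like the response to the query above.
# The second commit has no status at all, which dig handles safely.
sample = JSON.parse(<<~PAYLOAD)
  {"data": {"repository": {"defaultBranchRef": {"target": {"history": {"nodes": [
    {"oid": "aaa111", "status": {"state": "FAILURE"}},
    {"oid": "bbb222", "status": null},
    {"oid": "ccc333", "status": {"state": "SUCCESS"}}
  ]}}}}}}
PAYLOAD

puts first_green_oid(sample) # prints ccc333
```

Returning nil rather than raising keeps the final if/else of the script unchanged: a missing green build and a failed request both end in a non-zero exit status.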

I named the script git-last-green-build and put it in my path, which makes it available for invoking on any project that has a GitHub remote:

$ git last-green-build
8f8767ef8e176a851f04fade3fbd11406c084a7a

Since the command’s output is just the commit hash, it can be chained like so:

$ git last-green-build | xargs git switch --create my_new_branch

Or what I most commonly use:

$ git switch master

$ git fetch origin master

$ git last-green-build | xargs git merge

Now my local master points to a green build, which I can use to branch-off, rebase, etc.
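
One caveat about the script: its regular expression only matches SSH-style remotes (git@github.com:owner/repo.git). On a repository cloned over HTTPS, match would return nil and the call to captures would raise. A minimal sketch of a more permissive parser follows. The constant GITHUB_REMOTE, the method owner_and_repo, and the example URLs are all hypothetical names chosen for illustration:

```ruby
# Hypothetical pattern: accepts SSH and HTTPS GitHub remotes, with or
# without a trailing ".git".
GITHUB_REMOTE = %r{\A(?:git@github\.com:|https://github\.com/)([^/]+)/(.+?)(?:\.git)?/?\z}

# Returns [owner, repo], or nil when the URL is not a recognized
# GitHub remote.
def owner_and_repo(remote_url)
  match = GITHUB_REMOTE.match(remote_url)
  match && match.captures
end

p owner_and_repo("git@github.com:example-org/example-repo.git")
p owner_and_repo("https://github.com/example-org/example-repo")
```

Both calls yield the same owner and repository pair, so the rest of the script would not need to know which protocol the remote uses.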

The REPL: Issue 62 - October 2019

The Night Watch

In this article James Mickens writes about being a systems programmer. The writing is witty and funny. It’s not new, but it is new to me. A few choice quotes:

One time I tried to create a list<map>, and my syntax errors caused the dead to walk among the living. Such things are clearly unfortunate.

Indeed, the common discovery mode for an impossibly large buffer error is that your program seems to be working fine, and then it tries to display a string that should say “Hello world,” but instead it prints “#a[5]:3!” or another syntactically correct Perl script

However, when HCI people debug their code, it’s like an art show or a meeting of the United Nations. There are tea breaks and witticisms exchanged in French; wearing a non-functional scarf is optional, but encouraged.

Do you see the difference between our lives? When you asked a girl to the prom, you discovered that her father was a cop. When I asked a girl to the prom, I DISCOVERED THAT HER FATHER WAS STALIN.

Empathy is a Technical Skill

Andrea Goulet writes an interesting article about empathy. The takeaway is that technical-minded folks should think of empathy as a skill that can be learned, and used effectively to achieve your aims. From experience, I can attest that increasing your empathy is like having a super power.

pg_flame

This project looks really promising. It formats the output of Postgres EXPLAIN ANALYZE as a flame graph, which can help in figuring out which parts of your queries are worth digging into.

The REPL: Issue 61 - September 2019

Building A Relational Database Using Kafka

Robert Yokota explores building a relational database on top of Kafka. It follows his previous article on creating an in-memory cache backed by Kafka. RDBM systems are commonly thought of as keeping track of tables and rows. The semantics of SQL reinforce the concept of rows being updatable. In practice though, most implementations use an immutable log under the hood. That is what makes transactions possible, each with its own consistent view of the world. Kafka can be thought of as an “exposed” MVCC system, and the current state of the data can be derived by consuming the messages in a topic. The article is interesting in that it assembles a relational database by using different existing open-source projects.
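
The idea of deriving current state from an immutable log can be sketched in a few lines. This is only an illustration, not the approach from the article: a plain Ruby array stands in for a Kafka topic, no Kafka client is involved, and the replay method and sample records are hypothetical. Treating a nil value as a deletion mirrors the tombstone convention of Kafka's compacted topics:

```ruby
# A topic modeled as an append-only array of [key, value] messages.
# A nil value marks a deletion (a "tombstone").
log = [
  ["user:1", { "name" => "Ana" }],
  ["user:2", { "name" => "Luis" }],
  ["user:1", { "name" => "Ana Maria" }], # later message supersedes the first
  ["user:2", nil]                        # tombstone removes user:2
]

# Hypothetical helper: folds the log, in order, into a hash holding
# the latest value for each key.
def replay(log)
  log.each_with_object({}) do |(key, value), state|
    if value.nil?
      state.delete(key)
    else
      state[key] = value
    end
  end
end

p replay(log)
```

The log itself is never modified. An "update" is a newer message for the same key, and the table a reader sees is whatever the replay produces up to the point it has consumed.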

3 Key Ideas Behind The Erlang Thesis

Yiming Chen summarizes Joe Armstrong’s thesis: “Making reliable distributed systems in the presence of software errors”. The 3 key ideas identified: Concurrency oriented programming, abstracting concurrency, and let-it-fail philosophy. Armstrong is Erlang’s creator, and his thesis has been very influential in the Erlang and Elixir communities.

The REPL: Issue 60 - August 2019

Issue 60! I’ve been posting my favorite links to tech articles every month for the last 5 years! I’ve linked to 163 in that time (not including the links in this post). And, now that I am looking back… I realize that I’ve made a mistake and I re-used #53 for the 2018-12 and 2019-12 issues. ¯\_(ツ)_/¯

Engineers Don’t Solve Problems

This article by Dean Chahim is not about software engineering or computer science. It’s about Mexico City’s infrastructure and the decades-long battle to prevent flooding in the city. The article struck a chord with me: Mexico City is my home town, and it’s where I went to university to obtain my degree in Civil Engineering. The article illustrates how engineers make trade-offs that might have far-reaching consequences, and are not immune from political and socio-economic influence. There are lessons there for all engineers.

How to Build Good Software

Software has characteristics that make it hard to build with traditional management techniques; effective development requires a different, more exploratory and iterative approach.

Li Hongyi writes a thoughtful article on why software projects are not the same as other engineering projects, and require different management techniques. Successful software projects are very iterative and oscillate between cycles of discovery and consolidation.

Arcs of Seniority

Stevan Popovic breaks down engineering seniority into a few factors: Independence, Authority, Design, and Influence. Over the course of one’s career, each of these develops and marks a different type of seniority. As expected, not everyone reaches the same maturity in all factors at once. Each senior engineer has their own mix. The illustrations in the article are particularly helpful.