Ylan Segal

The REPL: Issue 37 - August 2017

The fallacies of web application performance

Is performance only a production concern? Are threads enough for multi-core concurrency? Are there cost-free solutions to performance problems? José Valim answers these and some other questions in this post. José is the creator of Elixir and was part of the Rails core team. I’ve found most of his writing to be worth my time. This is no exception.

Developing with Kafka and Rails Applications

Sam Goldman explains how Blue Apron uses Ruby on Rails to work with Apache Kafka. Part of the article touches on which gems they use to process Kafka streams. The other portion describes how to set up a local development environment. Docker is leveraged effectively to make a complicated setup easy to spin up locally: the final product has 4 different services: ZooKeeper, a Kafka broker, a schema registry, and a REST proxy for Kafka.

An Intro to Compilers

Nicole Orchard writes an introductory post on how compilers work. Specifically those leveraging the LLVM toolchain – used by Swift, most Mac gcc compilers, Crystal and many more. It takes a simple “Hello, Compiler!” program through the 3 phases: Front-end, Optimizer and Back-end. Short and sweet.

Book Review: Understanding the Four Rules of Simple Design

Understanding the Four Rules of Simple Design by Corey Haines is a book about how to approach software design, written from the perspective of the author’s years of involvement in Code Retreats. A Code Retreat is a day-long practice session for software developers where they can explore different ways of building software by practicing deliberately, without the pressure of having to deliver production code. I’ve previously written about my experience in a code retreat.

The book uses the same base example that code retreats do: Conway’s Game of Life. This example is specifically chosen because the rules are simple enough to understand quickly, yet it is possible to write an implementation in many different ways with interesting tradeoffs.

The 4 rules of simple design, first enumerated by Kent Beck, are presented in simplified form as:

  1. Tests Pass
  2. Express Intent
  3. No Duplication (DRY)
  4. Small

Each of these rules is expanded on in detail with plenty of examples. One of my favorite quotes:

In the end, most design guidelines are best internalized and applied subconsciously.

This book converges on many of the better patterns that I like about the Ruby community: outside-in test-driven development, writing small intention-revealing methods, consciously thinking about each object’s public API, and avoiding over-designing for a future that may not materialize. I enjoyed reading it very much.


Testing a Puts Method

When I code long-running tasks, I often want to see some sort of progress report in my terminal to let me know that my code is still running. Let’s take a simple example:

class ThumbnailCreator
  def process
    images.each_with_index do |image, index|
      # ...
      puts "Processed #{index + 1} images" if index % 10 == 0
    end
  end

  private

  def images
    # ... somehow find eligible images for processing
  end
end

The above code will print a new line to the console for every 10th image processed. While this approach works, it is also hard to test and causes undesired output when running my tests. Can we do better? Where does the puts method come from?

pry(main)> show-doc ThumbnailCreator#puts

From: io.c (C Method):
Owner: Kernel
Visibility: private
Signature: puts(*arg1)
Number of lines: 3

Equivalent to

    $stdout.puts(obj, ...)

pry makes it easy to trace the source of that method to the Kernel module. Furthermore, it lets us know that Kernel#puts is equivalent to calling $stdout.puts. $stdout is a global Ruby variable, which holds the current standard output. We can make that explicit in our code:

class ThumbnailCreator
  def process
    images.each_with_index do |image, index|
      # ...
      $stdout.puts "Processed #{index + 1} images" if index % 10 == 0
    end
  end
end
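
As a quick check of the equivalence that pry reports, here is a minimal sketch in plain Ruby. It assumes nothing beyond the standard library; StringIO (covered further down in this post) is used only to capture what gets written, and the variable names are my own for illustration.

```ruby
require "stringio"

# Kernel#puts is a private method mixed into every object.
puts_owner = method(:puts).owner # Kernel

# $stdout is a global variable, so it can be reassigned.
# Always restore the original afterwards.
original_stdout = $stdout
captured = StringIO.new
begin
  $stdout = captured
  puts "implicit receiver"         # goes through Kernel#puts
  $stdout.puts "explicit receiver" # same destination, stated explicitly
ensure
  $stdout = original_stdout
end

captured.string # => "implicit receiver\nexplicit receiver\n"
```

Both calls end up in the same object, which is what makes the explicit receiver a faithful rewrite of the original code.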

Adding an explicit receiver for the puts makes the code a bit longer and more verbose – usually things that rubyists shun. It also makes it clear that our class is collaborating with $stdout, a different object. Once we realize that, it follows that we can also make this collaboration configurable through dependency injection.

class ThumbnailCreator
  def initialize(out = $stdout)
    @out = out
  end

  def process
    images.each_with_index do |image, index|
      # ...
      @out.puts "Processed #{index + 1} images" if index % 10 == 0
    end
  end
end

All existing code that uses our class continues to work as before: the default value for out ensures that, by default, we continue printing to $stdout. However, in our tests, we can now inject a different collaborator. What can we use?

So far, we’ve used only one method on out. Ruby will happily let us inject any object that we want, as long as it implements puts in a compatible manner (in terms of arity). However, there is a risk that our tests can become too coupled to our implementation by only passing an object that implements the narrowest of interfaces. Ruby’s stdlib includes a class that we can use: StringIO

$ ri StringIO

= StringIO < Data

------------------------------------------------------------------------------
= Includes:
(from ruby core)
  Enumerable
  IO::generic_readable
  IO::generic_writable

(from ruby core)
------------------------------------------------------------------------------
Pseudo I/O on String object.

Commonly used to simulate `$stdio` or `$stderr`

=== Examples

  require 'stringio'

  io = StringIO.new
  io.puts "Hello World"
  io.string #=> "Hello World\n"
------------------------------------------------------------------------------
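
To illustrate the coupling risk mentioned above, here is a rough sketch. PutsOnlyRecorder and the report lambda are hypothetical names invented for this example. The lambda stands in for an implementation that later switches from puts to <<, which is also a valid way to write to $stdout.

```ruby
require "stringio"

# Hypothetical hand-rolled collaborator: implements only #puts.
class PutsOnlyRecorder
  attr_reader :lines

  def initialize
    @lines = []
  end

  def puts(*args)
    @lines.concat(args.map(&:to_s))
    nil
  end
end

# Hypothetical implementation detail: writes with << instead of puts.
report = ->(out) { out << "Processed 10 images\n" }

# The narrow test double breaks, even though $stdout would work fine.
narrow_error =
  begin
    report.call(PutsOnlyRecorder.new)
    nil
  rescue NoMethodError => e
    e
  end

# StringIO supports the same writing methods as a real IO.
io = StringIO.new
report.call(io)
io.string # => "Processed 10 images\n"
```

The narrow double fails on a change that would be harmless in production, while StringIO keeps accepting the output.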

Our tests can now use and verify the collaborator:

require "rspec"

describe ThumbnailCreator do
  subject { described_class.new(out) }
  let(:out) { StringIO.new }

  it "shows progress while processing images" do
    subject.process

    expect(out.string).to match(/Processed/)
  end
end
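
For readers who want to try the idea without RSpec, here is a self-contained sketch. The image lookup is elided in the post, so StubbedThumbnailCreator and its 25 placeholder file names are assumptions made purely for illustration. The progress message uses index + 1, as in the first listing.

```ruby
require "stringio"

# Same shape as the class in the post; the processing step stays elided.
class ThumbnailCreator
  def initialize(out = $stdout)
    @out = out
  end

  def process
    images.each_with_index do |image, index|
      # ... real thumbnail work would go here
      @out.puts "Processed #{index + 1} images" if index % 10 == 0
    end
  end

  private

  # Placeholder: the post does not show how images are found.
  def images
    []
  end
end

# Hypothetical subclass that supplies 25 placeholder images.
class StubbedThumbnailCreator < ThumbnailCreator
  private

  def images
    Array.new(25) { |i| "image-#{i}.png" }
  end
end

out = StringIO.new
StubbedThumbnailCreator.new(out).process

progress_lines = out.string.lines.map(&:chomp)
# => ["Processed 1 images", "Processed 11 images", "Processed 21 images"]
```

Nothing is printed to the terminal while this runs, because all the progress output goes to the injected StringIO.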

Conclusion

Often classes collaborate implicitly with other objects. Making the collaboration explicit allows us to use dependency injection as a way to configure behavior, resulting in a more modular design. Our initial motivation to test our code resulted in a better design, at little cost.

The REPL: Issue 36 - July 2017

Is Ruby Too Slow For Web-Scale?

Nate Berkopec writes a long post about Ruby performance and how it affects web applications. Notwithstanding the click-bait title, Nate brings up that raw performance might not be as significant as many teams would like to think. Many of us work on applications that receive only a modest amount of traffic. In these organizations, the trade-off between engineering productivity and server costs tilts towards productivity.

Five ways to paginate in Postgres, from the basic to the exotic

Most web-applications encounter a need to paginate results into multiple page loads. Joe Nelson works his way from the most simple implementations (LIMIT and OFFSET) to the more complex. He discusses the benefits and drawbacks of each. The techniques described cover most of the typical web-application needs. The more exotic ones – like stable page loads that return the same results even if elements are added or deleted from the collection – require more exotic solutions. They are usually expensive to compute.

An engineer’s guide to cloud capacity planning

Patrick McKenzie writes a great guide on how to plan server capacity in the cloud. He covers decoupling the application from knowledge of its deployment environment, advises automating provisioning and deployment, and covers how to estimate capacity and what to focus on as traffic grows. This is another great article from Increment.

The REPL: Issue 35 - June 2017

You Are Not Google

Ozan Onay reminds us that Google, Amazon and other tech giants have problems at scales that most in the industry don’t have. Adopting their practices might not suit your organization or project. As with most things in engineering, choosing a good solution depends on a deep understanding of the problem.

The Many Meanings of Event-Driven Architecture

Event-Driven Architecture means different things to different people. In this talk, Martin Fowler dives into the nuances of what it means and provides a framework for talking about event-driven architecture in a meaningful way.

Postgres EXPLAIN Visualizer (pev)

This tool provides a handy way to visualize the results of an EXPLAIN query in Postgres. I found this very useful. I wish something like this existed for other databases as well.