Ylan Segal

The REPL: Issue 8 - March 2015

Turning The Database Inside Out With Apache Samza

Based on a talk at Strange Loop 2014, this post was eye-opening. Although it’s supposed to be about Apache Samza, most of the talk is devoted to talking about databases in general and what they are good at: Keeping global state, replication, secondary indexing, caching, and materialized views. This high-level view provided me with a lot of new perspective of how to think of databases. The many illustrations in the article are beautiful. Please go and read.

Your Most Important Skill: Empathy

The legendary Chad Fowler makes the case that empathy is a skill that everyone will benefit from developing further. Provides great list of why that is. Most importantly, he also details how to practice.

Git From The Inside Out

Git has often been criticized for having an inconsistent interface and leaking unneeded abstractions to the user. Some of that criticism is warranted. Nonetheless, git is one of my favorite programs. I use it hundreds of times throughout the day, always on the command-line, complemented by tig, the ncurses client for git. This article talks about the internals of git: How it stores data on disk for commits, trees, objects, tags, branches, etc. It is well written, well organized and a pleasure to read. If you read this guide, it will make it easier for you to interact with git because you will understand it’s intrenals. However, I think you should read it because it shows how great functionality can be achieved with software with minimal dependencies and using only the local filesystem as a data store.

Dipping My Toes in Elixir

I recently came across two interesting posts about the Phoenix framework: A benchmark comparing the performance against Rails and a how-to for creating a simple JSON API. I was interested by the performance characteristics described in both articles. What really got my attention though was the syntax: The feel for it was very ruby-like.

Phoenix is written in Elixir, a relatively young functional programming language design by Jose Valim, committer in the Rails Core team. Elixir compiles to byte-code that runs on the Erlang virtual machine, which lists as it’s main features scalability and fault-tolerance.

The main Elixir website has the best getting started guide I have ever read. I started off by installing Elixir, which on my Mac was as simple as $ brew install elixir. That took care of the Erlang VM and whatever else it needed. A few moments later I had a working elixir installation, ready to go.

The guide is written in clear and concise manner, introducing the reader to the language, the syntax, the tools and the concepts in a gradual manner. I am not familiar with functional programming, other than what I have heard described in a few podcast. Even so, I was able to follow along quickly and start exploring. The guide also includes a more advanced section covering mix and OTP.

mix is the main tool that Elixir uses for compiling code, resolving and getting dependencies, running tests, etc. Think of it as the equivalent of make or rake. OTP stands for “Open Telecom Platform” and is a set of libraries included with Erlang that allows developers to build fault-tolerant, distributed applications.

mix is a pleasure to work with: Be it creating a new project, installing dependencies or running your test, you go through mix. (As opposed to rails, rake, bundler)

ExUnit, the included testing framework is familiar to anyone having used any XUnit framework before. The error messages are helpful by default. It includes a great feature where examples inside the documentation of a module can be executed as tests directly. I was very happy to see that the guide introduces tests and TDD early on. It shows the values of the Elixir community, load and clear.

Concurrency is prominent. In fact, in order to have any kind of state at all, you need to use separate Elixir Processes, which are like light-weight threads, not system processes. Messages are sent between processes. I like the idea of having to think about concurrency early, as opposed than doing later in the application life-cycle. Elixir processes are allowed to fail fast on errors. It’s not a big deal, since they are under Supervisors that will just restart them if they fail.

Pattern-matching at first seems weird. They grow on you. It allows the programmer to deal with different cases easily and separate them out into their own functions, foregoing a lot of conditionals. Defining functions many times, with different guard clauses is great. It’s hard to describe (because I don’t know the correct language yet), but here is my take on the Fibonacci sequence:

1
2
3
4
5
6
7
8
9
10
11
defmodule F do
  def fib(x) when x in 1..2  do
    1
  end

  def fib(x) do
    fib(x - 2) + fib(x - 1)
  end
end

IO.puts F.fib(5)

In conclusion: I have barely gotten my feet wet, but have been impressed with what I have seen. I was not expecting this level of polish from a new language. I am itching to find a personal project to write in Elixir!

Book Review: The Ruby Way

Probably one of the most well-known books among rubyists, “The Ruby Way” by Hal Fulton with André Arko, has now been updated and released in its third edition. The first part of the book is dedicated to the language itself and covers syntax, semantics, some comparison to other languages and specific issues, like garbage collection, that developers are well served to know when writing ruby.

We do find OOP to be a useful tool and a meaningful way of thinking about problems; we do not claim that it cures cancer

The Ruby Way

The majority of the book is divided into sections that deal with specific task that a developer may encounter. From basics like working with String, numerical calculations and Enumerable collections to more advanced techniques like Threads and Concurrency, Metaprogramming, Network Programming and Distributed Ruby. Each chapter has plenty of code examples and thorough explanations.

I expect my copy to get plenty of used as my programming takes me to unknown or forgotten parts of ruby.

Links:

Fuzzy Match 'All-The-Things'

I started using fuzzy matching when I switched to Sublime Text 2 a few years ago (I currently use Atom, which also has the same feature built-in). The seemingly little improvement increased my productively greatly. It saves a few moments while opening files, but more importantly it prevents context switching. It lets me start working with a file (usually prompted by knowing or wanting to know the contents of it) without needing to think about the location of that file.

In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately

Wikipedia

Such functionality is useful outside of code editors. Thanks to Unix’s modularity, general purpose fuzzy matches can accomplish a lot.

There are a few tools out there, but I found Selecta and Percol to be the more polished o the options. As far as I can tell, they are interchangeable, since they operate on STDIN and STOUT exclusively. I prefer percol on a day to day, because it has better support for the using the keyboard arrows for selection and it has colored output showing the matched portions. It is also full-screen. However, in terms of functionality, both perform equally well.

Changing Projects

My most-common use case for fuzzy matching is changing to project directories. I generally work on a few different projects throughout the day, which reside on different parent directories in my home directories. I created an function cdp that will search common locations using find and pipe the results to the fuzzy matcher for filtering. The output then goes to cd.

1
2
3
4
# .bashrc, .profile, .zshrc or other file sourced by the shel initialization
cdp () {
  cd $(find ~/Development ~/Personal -maxdepth 1 -type d | percol)
}

Fuzz: Fuzzy Match Current Directory Contents

I find myself wanting to find a file in the current directory often. Copying, diffing, working with git, running tests, etc. I created a general-purpose finder that uses find to find all files below the current directory, excluding those in directories starting with .. In addition it can be passed an argument to pre-filter for a string, which I only used on very large projects where find would take a long time to complete and make the fuzzy matching excessively slow.

1
2
3
4
5
# .bashrc, .profile, .zshrc or other file sourced by the shel initialization
fuzz () {
  search_term=$1
  find . -wholename \*$search_term\* -not -path './.*/*' | percol
}

zsh biding for ^S

The culmination of all this, comes by binding ^S on my zsh shell to run fuzz. This allows me to run fuzz while writing arguments to a command already started on the command line and once the fuzzy match is done, returning control to the shell with the argument in place. This allows me to use fuzzy matching on-demand, for any command, across my shell. It has quickly become a tool I reach for throughout the day.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# Only for zsh
# ^S for fuzzy matching
# By default, ^S freezes terminal output and ^Q resumes it. Disable that so
# that those keys can be used for other things.
unsetopt flowcontrol
# Run Selecta in the current working directory, appending the selected path, if
# any, to the current command.
function insert-fuzzy-path-in-command-line() {
    local selected_path
    # Print a newline or we'll clobber the old prompt.
    echo
    # Find the path; abort if the user doesn't select anything.
    selected_path=$(fuzz) || return
    # Append the selection to the current command buffer.
    eval 'LBUFFER="$LBUFFER$selected_path"'
    # Redraw the prompt since Selecta has drawn several new lines of text.
    zle reset-prompt
}
# Create the zle widget
zle -N insert-fuzzy-path-in-command-line
# Bind the key to the newly created widget
bindkey "^S" "insert-fuzzy-path-in-command-line"

It is worth noting that most of this ideas where obtained directly from the Selecta README. Thanks Gary Bernhardt!


UPDATE (03/10/2015): A new project was just announced that has another interchangeable fuzzy picker utility, pick. This one is written in C, so it promises to be faster. More importantly, the project shows some more fantastic ideas on where to leverage fuzzy matching.

The REPL: Issue 7 - February 2015

Introducing Discrete Integration

Gustave Dutra at Plataformatec, discusses a concept called Discrete Integration. The process described is very similar to what I use at work at the moment. A sort of dialed-back version of Continuous Integration. It reaps a lot of the benefits of rapid feedback, just not as extreme. Feature branches are worked-on until they are at a stable state before merging into master. Good developer communications is encouraged at all stages.

Require Only What You Require

Janko Marohnić presents a well-thought out argument against requiring all code in gems up-front. His concerns are around what this communicates to other developers. The intention of each file or library required is lost as are the dependencies of each individual class. Finally, it makes it harder to clean-up dependencies once the are no longer used. I intend to adopt this style in the next gem I work on. I have actually done a similar thing in some Rails project, with a smaller scope: For example, if a class uses an external service wrapper, I require it on the top of the class even though it’s already required by bundler, as a notice to other developers on the project.

Programming Is Not A Craft

I previously wrote about my thoughts on the Software Craftsmanship metaphor. My main objection to the metaphor is that it evokes imagery that is not helpful to software development and completely unnecessary. In this article Dan North’s tells us that the risk behind the craftsmanship movement lies in putting software ahead of the benefits it provides. My understanding is that so called ‘craftsmen’ pride themselves in delivering value first, but the article did provoke in my a deep sense that there is a lot of navel-gazing going on. Let’s just focus on professionalism and excellence. Keep calm and code on.

Mark Methods Private When You Don’t Test Them

In his usual clear way Pat Shaughnessy writes about when to mark methods as private. Essentially, methods that have tests are part of what is considered the public API of the object. Untested methods and those that are private are not. Ruby performs the most basically check against calling private methods, that can be easily circumvented by using send. However, marking methods as private is more about communicating to other developers what part is internal to the class and should not be relied upon. This is very much in vein with Sandi Metz take on testing