Ylan Segal

Benchmarking With Abprof & Abcompare

This week at RubyConf I learned about two new tools.

ripgrep (rg) combines the usability of The Silver Searcher (an ack clone) with the raw speed of grep.

And abcompare:

Determine which of two programs is faster, statistically.

Let’s use abcompare to check the speed of ag against rg.

$ abcompare "ag 'protected$'" "rg 'protected$'"
...
Based on measured P value 1.938532867962195e-05, we believe there is a speed difference.
As of end of run, p value is 1.938532867962195e-05. Now run more times to check, or with lower p.
Lower (faster?) process is 2, command line: "rg 'protected$'"
Lower command is (very) roughly 2.804074569852101 times lower (faster?) -- assuming linear sampling, checking at median.
         Checking at mean, it would be 2.794296598639456 lower (faster?).

Process 1 mean result: 0.08557533333333334
Process 1 median result: 0.085886
Process 2 mean result: 0.030625
Process 2 median result: 0.030629

And:

$ abcompare "ag '< ApiController'" "rg '< ApiController'"
...
Trial 3, Welch's T-test p value: 0.00019649943055588537   (Guessed smaller: 2)
Based on measured P value 0.00019649943055588537, we believe there is a speed difference.
As of end of run, p value is 0.00019649943055588537. Now run more times to check, or with lower p.
Lower (faster?) process is 2, command line: "rg '< ApiController'"
Lower command is (very) roughly 2.3338791691803844 times lower (faster?) -- assuming linear sampling, checking at median.
         Checking at mean, it would be 2.408501905599531 lower (faster?).

Process 1 mean result: 0.082154
Process 1 median result: 0.08124
Process 2 mean result: 0.03411
Process 2 median result: 0.034809

The above was tried on my largest project, with ~3,000 Ruby files in it:

$ find . -name '*\.rb' | wc -l
  2757
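
The "times lower" figure that abcompare reports at the median can be checked by hand, since it is just the ratio of the two medians it prints. A small sketch using the numbers from the first run (the variable names are mine, not part of abcompare):

```shell
# Median timings copied from the first abcompare run above.
ag_median=0.085886
rg_median=0.030629

# Ratio of medians: how many times lower rg's median is than ag's.
ratio=$(awk -v a="$ag_median" -v b="$rg_median" 'BEGIN { printf "%.4f", a / b }')
echo "$ratio"   # prints 2.8041
```

That agrees with the roughly 2.8 that abcompare reported for that run.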

Conclusion

For my use case, it seems clear that rg performs faster than ag. Don’t take my word for it, though. Run a benchmark yourself!

The REPL: Issue 27 - October 2016

Karafka

Karafka is a framework that simplifies the development of Apache Kafka-based Ruby applications. It looks like a Rails-like abstraction that removes some of the boilerplate and decisions around how to structure a Kafka application. I don’t know if it’s ready for production, but it’s worth keeping an eye on.

MiniTest is not “Just Ruby”, it is “Just Rails”

Victor Shepelev writes his opinion about RSpec and MiniTest and how they differ. I don’t subscribe to all the author’s opinions or conclusions, but I do prefer RSpec and I have never found the “It’s just Ruby” argument for MiniTest very convincing. If anything, I find that having a distinct shape, structure and feel for tests is a net positive. It promotes shifting from “This is the part that specifies behavior” to “This is the part that implements behavior” in a cleaner way.

Be Kind

Being a good and kind person pays dividends. I love this story. You should read it.

Subtleties of Xargs on Mac and Linux

xargs is one of my go-to tools in Unix. It reads items from stdin (separated by blanks or newlines) and executes another command with those items as arguments. It’s very useful to glue commands together.

Its default behavior is slightly different on Mac (or BSD) and Linux, in a subtle way. On the Mac, if there is no input from stdin, it will not execute the command. On Linux, it will execute it without any argument.

As an example, let’s say that we want to use rubocop (a Ruby syntax checker and linter) to check only RSpec files in a project. We can write something like this:

$ find . -name '*_spec.rb' | xargs rubocop

On a project that has two spec files, expanding the above example:

$ find . -name '*_spec.rb'
./spec/one_spec.rb
./spec/two_spec.rb

xargs will execute the equivalent of:

 $ rubocop ./spec/one_spec.rb ./spec/two_spec.rb
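
To make the batching visible without needing rubocop, here is a small sketch that substitutes echo for the real command (the "checking" label and file names are only illustrative). By default all items go to a single invocation; with -n 1, the command runs once per item:

```shell
# All items are passed to a single invocation by default.
batched=$(printf '%s\n' one_spec.rb two_spec.rb | xargs echo checking)
echo "$batched"
# prints: checking one_spec.rb two_spec.rb

# With -n 1, the command runs once per item.
one_each=$(printf '%s\n' one_spec.rb two_spec.rb | xargs -n 1 echo checking)
echo "$one_each"
# prints:
# checking one_spec.rb
# checking two_spec.rb
```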

The subtlety in behavior comes in when no files are found. To illustrate, let’s see the difference in a trivial example:

$ uname
Linux
$ echo "" | xargs echo "Hello"
Hello

$ uname
Darwin
$ echo "" | xargs echo "Hello"
$

On Linux, xargs will execute the utility; on a Mac it will not. The Linux version can be configured to have the same behavior as the Mac:

$ uname
Linux
$ echo "" | xargs --no-run-if-empty echo "Hello"
$

Unfortunately, the --no-run-if-empty option is not recognized by the Mac:

$ uname
Darwin
$ echo "" | xargs --no-run-if-empty echo "Hello"
xargs: illegal option
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
             [-L number] [-n number [-x]] [-P maxprocs] [-s size]
             [utility [argument ...]]

Why is this important? In the original example, if no files are found, rubocop will not be invoked at all on the Mac, but will be invoked with no arguments on Linux. In my case, that is unwanted behavior because rubocop will then check all files in the project.
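
One way to sidestep the difference, assuming the file list comes from find as in the example above, is to let find invoke the command itself. With the -exec ... {} + form, both GNU and BSD find only run the command when at least one file matches. The function name run_on_specs is made up for illustration, and this is a sketch rather than the only possible fix:

```shell
# Hypothetical helper: run a command on every *_spec.rb file under a
# directory, and do not run it at all when no such file exists.
# Usage: run_on_specs DIR COMMAND [ARGS...]
run_on_specs() {
  dir=$1
  shift
  # "{} +" appends all matching paths to a single invocation of the command;
  # with no matches, find never invokes it.
  find "$dir" -name '*_spec.rb' -exec "$@" {} +
}

# Example: run_on_specs . rubocop
```

Because xargs is no longer involved, the same line behaves identically on Mac and Linux.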

Conclusion

When writing bash scripts that are intended to run on different Unix versions, make sure you understand and test the behavior of the commands used. Sometimes they differ in subtle ways.

The REPL: Issue 26 - September 2016

Microservices – Please, don’t

Sean Kelly writes a cautionary post about microservices, organized around debunking five fallacies that he has encountered about them: they keep the code cleaner, they are easy, they are faster, they are simple for engineers, and they scale better.

The science of encryption: prime numbers and mod n arithmetic

While looking into Apache Milagro, I found a link to this short paper on the math behind public-key cryptography. It’s a great introduction, or refresher, to the mathematics that makes the secure web work. The paper itself has no author information, but the URL suggests that it was written by Kathryn Mann at the University of California at Berkeley.

Concurrency in Ruby 3 with Guilds

Olivier Lacan has a great explanation of Koichi Sasada’s recent proposal for bringing better parallelism to Ruby 3. The proposal introduces a new abstraction, called Guilds, that is implemented in terms of existing Threads and Fibers but can actually execute in parallel, because Guilds have stronger guarantees around accessing shared state. In particular, guilds won’t be able to access objects in other guilds without explicitly transferring them via channels. It’s exciting to think about Ruby’s performance not being bound by the Global Interpreter Lock (GIL).

Goodbye StartSSL, Hello Let's Encrypt

Mozilla is considering taking action against two Certificate Authorities, WoSign and StartCom, after an investigation into improper behavior, including not reporting that WoSign bought StartCom outright.

As I wrote about earlier, this blog used a StartCom TLS certificate, under their StartSSL brand, which was free. At the time, the only reason why I didn’t pick Let’s Encrypt was because the certificate expiration is every 3 months. However, given the contents of the report, I would much rather use an organization that wants to make the web better – not exploit it.

Obtaining and installing the new certificate turned out to be an easy process.

Obtaining A Certificate From Let’s Encrypt

I used the certbot client to obtain a certificate. On my Mac, I installed it via Homebrew:

$ brew install certbot

certbot can request and install the certificate if it’s executed on the same machine that runs the web server. In my case, I just wanted the certificates to be generated and downloaded locally.

$ sudo certbot certonly --manual

During the in-terminal process, certbot will ask for the intended domain and instruct you to make some specified content available at a particular URL in that domain. This is to prove that you, the person requesting the TLS certificate, are an administrator for that domain. For me, this involved copying one new file to my hosting service.

After that, the certificate is issued immediately and available locally at /etc/letsencrypt/live/ylan.segal-family.com (your domain will vary).

Installing The New Certificate

The last time I installed a certificate, I had to open a support ticket at Nearly Free Speech. Since then, they have made the process automated and available from the control panel. The instructions are to paste into the provided form the certificate (including the full cert chain) and private key, all into the same field:

$ cat fullchain.pem privkey.pem | pbcopy

A few seconds later, the new certificate was installed and being served on this domain.
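
Since Let’s Encrypt certificates expire every 3 months, it can be handy to check how much time the current one has left. A minimal sketch using the openssl x509 subcommand and its -checkend option; the function name and the file path in the usage comment are my own assumptions, not something certbot provides:

```shell
# Hypothetical helper: succeeds if the certificate in file $1 will still be
# valid $2 days from now, and fails if it will have expired by then.
cert_valid_for_days() {
  cert_file=$1
  days=$2
  # -checkend takes a number of seconds and sets the exit status accordingly.
  openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null
}

# Example usage (adjust the path for your own domain):
# cert_valid_for_days fullchain.pem 30 || echo "Time to renew"
```

When given a file with a full chain, openssl x509 reads the first certificate, which in fullchain.pem is the one for the domain itself.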

Conclusion

The Let’s Encrypt process ended up being simpler than StartSSL, since there was no need to manually create the private key and certificate signing request: It’s all done with one command.