Very handy recipes for testing ActiveRecord
transactions.
Using cat to start a pipeline is about composing commands: it makes it easier to build pipelines in steps. Technically, you may be adding an extra process that you don’t need, but in day-to-day Unix pipe operations, the performance is never an issue.
Nice visual animation of how removing stuff improves the design. The pie chart in particular was great!
Let’s resume with our example:
class FindScore
  DEFAULT_SCORE = 0
  URL = 'http://example.com'.freeze

  def initialize(user, http_client = HTTParty)
    @user = user
    @http_client = http_client
  end

  def call
    make_api_request(@user, @http_client)
      .then { parse_response(_1) }
      .then { extract_score(_1) }
  end

  private

  def make_api_request(user, http_client = @http_client, url = URL)
    http_client.post(
      url,
      body: { user: user.id }
    )
  end

  def parse_response(response)
    JSON.parse(response.body)
  end

  def extract_score(response_body, default = DEFAULT_SCORE)
    response_body.fetch("score", default)
  end
end
That class doesn’t handle any errors. Each of the private methods can fail in different ways. For the sake of example, let’s say that we can encounter HTTP errors in make_api_request, the response may fail to be valid JSON, or the response might have a different JSON shape than what we expect. One way to handle them is via exceptions or checking for specific conditions, and then ensuring that the value passed along is what the next step in our pipeline expects:
class FindScore
  DEFAULT_SCORE = 0
  URL = 'http://example.com'.freeze

  def initialize(user, http_client = HTTParty)
    @user = user
    @http_client = http_client
  end

  def call
    make_api_request(@user, @http_client)
      .then { parse_response(_1) }
      .then { extract_score(_1) }
  end

  private

  def make_api_request(user, http_client = @http_client, url = URL)
    response = http_client.post(
      url,
      body: { user: user.id }
    )
    response.ok? ? response.body : "{}"
  end

  def parse_response(body)
    JSON.parse(body)
  rescue JSON::ParserError
    {}
  end

  def extract_score(response_body, default = DEFAULT_SCORE)
    response_body.fetch("score", default)
  end
end
In that version, #make_api_request checks for a successful response, passing the response body to #parse_response. If the response is not successful, however, it returns "{}", a string that #parse_response can still parse as JSON. In a similar manner, parsing JSON might raise JSON::ParserError. #parse_response rescues the exception and returns a hash, as expected by #extract_score.
The code is now more resilient: it can handle some errors and recover from them by returning a value that can be used in the next method. However, these errors are being swallowed. What if we wanted to add some logging or metrics for each error, so we can understand our system better? One way is to add a logging statement on the error branch of each method. I prefer another way: using result objects.
For our purposes a result object can either be a success or an error. In either case, it wraps another value, and it has some methods that act differently in each case. This object is known as a result monad, but let’s not dwell on that. Our result object will make it easier to write pipelines of method calls, without sacrificing error handling.
A very minimal implementation looks like this:
class Ok
  def initialize(value)
    @value = value
  end

  # Pipelines continue: yield the wrapped value to the block.
  def and_then
    yield @value
  end

  # Unwrapping an Ok ignores the fallback.
  def value_or(_other = nil)
    @value
  end
end

class Error
  def initialize(error)
    @error = error
  end

  # Pipelines short-circuit: the block is never called.
  def and_then
    self
  end

  # Unwrapping an Error returns the fallback argument,
  # or yields the error if given a block.
  def value_or(other = nil)
    block_given? ? yield(@error) : other
  end
end
The polymorphic interface for Ok and Error has two methods: #and_then, which is used to pipeline operations, and #value_or, which is used to unwrap the value. Let’s see some examples:
Ok.new(1)
  .and_then { |n| Ok.new(n * 2) } # => 1 * 2 = 2
  .and_then { |n| Ok.new(n + 1) } # => 2 + 1 = 3
  .value_or(:error)
# => 3

Ok.new(1)
  .and_then { |n| Ok.new(n * 2) } # => 1 * 2 = 2
  .and_then { |n| Error.new("something went wrong") }
  .and_then { |n| Ok.new(n + 1) } # => Never called
  .and_then { |n| raise "Hell" }  # => Never called either
  .value_or { :error }
# => :error
A chain of #and_then calls continues much like #then does, expecting a result object as a return value. However, if the return value at any point is an Error, subsequent blocks will not execute; the chain instead keeps returning the same result object. We then have a powerful way of constructing pipelines, where error handling can be left to the end.
Our class with error handling can now be written as:
class FindScore
  DEFAULT_SCORE = 0
  URL = 'http://example.com'.freeze

  def initialize(user, http_client = HTTParty)
    @user = user
    @http_client = http_client
  end

  def call
    make_api_request(@user, @http_client)
      .and_then { parse_response(_1) }
      .and_then { extract_score(_1) }
      .value_or { |error_message|
        log.error "FindScore failed for #{@user}: #{error_message}"
        DEFAULT_SCORE
      }
  end

  private

  def make_api_request(user, http_client = @http_client, url = URL)
    response = http_client.post(
      url,
      body: { user: user.id }
    )
    response.ok? ? Ok.new(response.body) : Error.new("HTTP Status Code: #{response.status_code}")
  end

  def parse_response(body)
    Ok.new(JSON.parse(body))
  rescue JSON::ParserError => ex
    Error.new(ex.to_s)
  end

  def extract_score(parsed_json)
    score = parsed_json["score"]
    score.present? ? Ok.new(score) : Error.new("Score not found in response")
  end
end
Now, each method is responsible for returning either an Ok or an Error. The #call method is responsible for constructing the overall pipeline and handling the failure (i.e. returning DEFAULT_SCORE), and with a single line, it also logs all errors.
This technique is quite powerful. The result objects are not limited to private class methods; public methods can return them just as well. The Ok and Error implementation here is quite minimal, as a demonstration for this post. There are full-featured libraries out there (e.g. dry-rb), or you can roll your own pretty easily and expand the API to suit your needs (e.g. #ok?, #error?, #value!, #error, #fmap).
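As a sketch of what such an expanded API might look like (the predicate and #fmap names below follow common result-object conventions, not any particular library):

```ruby
class Ok
  def initialize(value)
    @value = value
  end

  def ok?
    true
  end

  def error?
    false
  end

  def value!
    @value
  end

  # #fmap wraps the block's plain return value back into an Ok,
  # so blocks don't have to build result objects themselves.
  def fmap
    Ok.new(yield(@value))
  end
end

class Error
  def initialize(error)
    @error = error
  end

  def ok?
    false
  end

  def error?
    true
  end

  def error
    @error
  end

  # Errors pass through #fmap untouched, just like #and_then.
  def fmap
    self
  end
end
```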
As I concluded in my previous post, writing Ruby classes so that the class reads in the same order as the operations are performed leads to more legible code. Adding result objects advances those same goals, and makes error conditions a first-class concern.
I agree with the author that there is a lot of Pop Culture in software companies, in the sense that they forget about the past, and there is a bias for “newer is better”. Thus, we get all the articles advising to choose “boring” technology with a proven track record.
There also does seem to be a good amount of contagion in the current round of layoffs. Companies are firing people even if they are doing well. I disagree that it is irrational; I dislike that characterization. It seems like a crutch for failing to understand the motivations of the people making the decisions. I believe that company executives do know that layoffs are bad for morale and create problems down the line. There are some pretty smart people in company management. I think that they are making those decisions in spite of knowing that there are real downsides. Maybe the pressure from boards or investors is too much. Even if it is a case of copying what others are doing, it need not be irrational. There is an incentive to go with the flow: it’s safe. No one ever got fired for buying IBM. If things go wrong, you won’t be blamed for making the same decision everyone else made.
Mike Burns writes about how iteratively building a collection is an anti-pattern:
What follows are some lengthy method definitions followed by rewrites that are not only more concise but also more clear in their intentions.
It resonates with me that the pattern should be avoided. Brevity and clarity are great, but I think minimizing mutation is an even better reason to avoid building collections iteratively. Written in a functional style, your code performs less mutation of data structures, which means that it handles less state. Handling state is where a lot of complexity hides, and it is the source of many bugs. In fact, in Joe Armstrong’s estimation:
State is the root of all evil. In particular functions with side effects should be avoided.
The style of Ruby that the article encourages removes the state handling from your code. 👍🏻
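A minimal illustration of the contrast (my own hypothetical example, not from the article):

```ruby
numbers = (1..5)

# Iterative building: an accumulator is mutated step by step.
squares_of_odds = []
numbers.each do |n|
  squares_of_odds << n * n if n.odd?
end

# Functional style: no intermediate state to track.
functional = numbers.select(&:odd?).map { |n| n * n }

p squares_of_odds # => [1, 9, 25]
p functional      # => [1, 9, 25]
```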
Every few years, my routers start acting up in strange ways. Some devices function great, while others seem to have intermittent downloads. This article confirms my suspicions: routers just wear out:
In general, routers can and do fail. The primary cause of failure for consumer grade equipment is heat stress. Most consumer grade hardware runs far too hot and have respectively poor air circulation compared to their ventilation needs.
To increase ventilation, I’ve started raising my router from the surface it’s on with a Lego structure that increases airflow from the bottom. It seems to improve heat dissipation by the imprecise measure of “it feels cooler to my touch”. 🤷🏻♂️
Some languages, like Go, have a built-in linter, which applies a universal style that’s been set by the language designers. That’s the most totalitarian approach. A decree on what a program should look like that’s barely two feet removed from a compile error. I don’t like that one bit.
It reminds me of Newspeak, the new INGSOC language from Orwell’s 1984. Not because of any sinister political undertones, but in the pursuit of a minimalist language, with no redundant terms or ambiguities or flair. Imagine every novel written in the same style, Hemingway indistinguishable from Dickens, Tolkien from Rowling. It would be awfully gray to enjoy the English language if there was only a single shade of prose.
The best code to me is indeed its own form of poetry, and style is an integral part of such expression. And in no language more so than Ruby.
There are probably people who would prefer the more conventional, literal style, and they could encode that preference in a lint rule. I’d have absolutely no problem with that, as long as they’re not trying to force me to abide by their stylistic preferences.
Now, in The Rails Doctrine DHH writes about Convention over Configuration:
One of the early productivity mottos of Rails went: “You’re not a beautiful and unique snowflake”. It postulated that by giving up vain individuality, you can leapfrog the toils of mundane decisions, and make faster progress in areas that really matter.
Who cares what format your database primary keys are described by? Does it really matter whether it’s “id”, “postId”, “posts_id”, or “pid”? Is this a decision that’s worthy of recurrent deliberation? No.
Let me try to unpack the two posts. Table names and primary and foreign key columns have strong conventions in Rails:
class Post < ApplicationRecord
  has_many :comments
end

class Comment < ApplicationRecord
  belongs_to :post
end
Implicit in the code above is that our database has a table posts with a primary key field id, and a table named comments with a primary key field id and a foreign key post_id referencing posts.id. There are many arguments to make about these specific conventions: about the plural table names, about using snake case, about the Ruby class names being singular, etc. The point of convention over configuration is that those choices don’t matter. The convention saves us from making unimportant decisions; in DHH’s words:
Part of the Rails’ mission is to swing its machete at the thick, and ever growing, jungle of recurring decisions that face developers creating information systems for the web. There are thousands of such decisions that just need to be made once, and if someone else can do it for you, all the better.
Let’s get back to Ruby syntax. The argument seems to be that the following two ways of writing the same code are different forms of self expression and the building blocks of poetry:
has_many :posts
has_many(:posts)
I am not convinced by that argument. I used to think that creating a style-guide for each team was a worthwhile exercise. Since then, I’ve been on 3 different teams in 12 years (hardly a lot by tech standards). I’ve come to experience the power of hitting the ground running on an unfamiliar Rails application, exactly because of convention over configuration. I’d rather we have more of that, with a convention for code style across teams.
In A writer’s Ruby, DHH says:
Imagine every novel written in the same style, Hemingway indistinguishable from Dickens, Tolkien from Rowling. It would be awfully gray to enjoy the English language if there was only a single shade of prose.
That would be an awful world indeed, but I don’t think it’s a fair comparison. Novels are mostly individual works of art. Code style is mostly a team endeavour with very different goals than literary works. Presumably, the team is working towards producing a maintainable code base that is easy to work on by current and future members of the team. Predictable style and idioms are better than poetic code. I would hate for all the novels I read to have the same style. I would hate just as much for the instructions on my dishwasher to be in the style of James Joyce or Leo Tolstoy.
For now, I am using standardrb, even if I don’t like all the conventions.
Mind blown. This promises to be scalable and elastic with minimal code shenanigans. Like they mention: it doesn’t solve a problem, it removes it. Thanks to the power of the Erlang VM.
Overall, very good advice for Rails performance.
Like having your cake and eating it too: Ruby got faster, and it is using less memory. I’ve personally seen a 20% improvement in speed building this blog.
Ruby 3.3 was released. It promises performance improvements, especially with YJIT turned on. Let’s see how it does using jekyll, the static site generator for this very blog.
$ ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
$ jekyll build
Configuration file: /Users/ylansegal/Personal/blog/_config.yml
Source: /Users/ylansegal/Personal/blog/src
Destination: /Users/ylansegal/Personal/blog/_site
Incremental build: disabled. Enable with --incremental
Generating...
Jekyll Feed: Generating feed for posts
done in 2.676 seconds.
Auto-regeneration: disabled. Use --watch to enable.
$ jekyll build
[...]
done in 2.685 seconds.
$ jekyll build
[...]
done in 2.679 seconds.
Average: 2.68 seconds
$ ruby -v
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) +YJIT [arm64-darwin23]
$ jekyll build
Configuration file: /Users/ylansegal/Personal/blog/_config.yml
Source: /Users/ylansegal/Personal/blog/src
Destination: /Users/ylansegal/Personal/blog/_site
Incremental build: disabled. Enable with --incremental
Generating...
Jekyll Feed: Generating feed for posts
done in 2.079 seconds.
Auto-regeneration: disabled. Use --watch to enable.
$ jekyll build
[..]
done in 2.212 seconds.
$ jekyll build
[...]
done in 2.163 seconds.
Average: 2.15 seconds
Of course this is unscientific, but 20% local reduction in build time is impressive!
I don’t write Go, and don’t have any insights into this new queue. However, the introduction explaining why a database-backed queue solves the dual-write problem is clear and applicable to any other system. Transactionality is one of the main reasons that I recommend GoodJob. The other one, also discussed, is that operationally, having fewer data stores is a win.
A number of Postgres improvements also make it more performant than previous attempts at using db-backed job queues.
Coincidentally, I was listening to a recent podcast where they were discussing Redis as a store for background queuing. I was disappointed that they didn’t mention the dual-write problem as a point in favor of relational database queues.
Concise and interesting. I’ve internalized avoiding premature optimization as a best practice. Rule 5 has me thinking a bit. Most web application programming doesn’t need to think much about data structures in memory, but rather about how to design the database schema for storage.
Rails 7.1 is out with some very interesting features for Postgres users. Composite primary key support in particular caught my eye: when partitioning tables, using a composite primary key that includes the partition key is a best practice. Now, Rails supports composite primary keys in the model and associations (through query_constraints), ensuring that when reading from the table, the partition key is always used.
Improved support for CTEs is also welcome!
It turns out that the way you structure classes (and more precisely, variable instantiation) in Ruby 3.2+ has performance implications. Ben Sheldon discusses how to structure classes to take advantage of those optimizations. It is an interesting demonstration of how code style and the Ruby interpreter interact.
The article doesn’t mention how much this impacts performance. I wonder: on a typical web request, how much can you save by structuring your classes for optimization?
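The gist, as I understand it (my own sketch, not code from the article): Ruby 3.2’s object shapes cache instance-variable layouts, so setting every instance variable in #initialize, in a consistent order, lets all instances share one shape:

```ruby
# Object-shape friendly: every instance variable is assigned up front,
# in the same order for all instances, so they share a single shape.
class ShapeFriendly
  def initialize
    @name = nil
    @cache = nil
  end
end

# Lazily defined ivars (e.g. via ||= in readers) can give otherwise
# identical objects different shapes, defeating the shape cache.
class ShapeUnfriendly
  def name
    @name ||= "default"
  end
end

p ShapeFriendly.new.instance_variables   # => [:@name, :@cache]
p ShapeUnfriendly.new.instance_variables # => []
```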
It’s called TLDR and it blows up if your tests take more than 1.8 seconds to run.
Testing is a near and dear topic to me. I have not tried this new framework, but I have some initial thoughts:
TLDR automatically prepends the most-recently modified test file to the beginning of the suite
This is brilliant. I have a script that guesses which test files to run on a branch based on what changed in git. After reading this, I immediately incorporated ordering the files by modification date.
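The ordering itself is a one-liner; a sketch of the idea (the test/ glob is a hypothetical layout):

```ruby
# Run the most recently modified test files first, so the tests you are
# actively working on fail (or pass) as early as possible.
test_files = Dir.glob("test/**/*_test.rb")
by_recency = test_files.sort_by { |path| -File.mtime(path).to_i }
```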
This book focuses on software design, identified as a continuous process that spans the complete lifecycle of a software system. The first part of the book proposes that the main issue in software design lies in managing complexity.
If a software system is hard to understand and modify, then it is complicated; if it is easy to understand and modify, then it is simple.
The rest of the book is a collection of principles, techniques, and heuristics to help remove or hide complexity, replacing it with a simpler design.
It is easier to tell whether a design is simple than it is to create a simple design,
Probably the most salient piece of advice is that “modules should be deep”: a module is deep when it provides a narrow interface to its callers while delivering a lot of functionality that abstracts away the details of the implementation.
Adding garbage collection to a system actually shrinks its overall interface, since it eliminates the interface for freeing objects. The implementation of a garbage collector is quite complex, but that complexity is hidden from programmers.
Overall, I found the book worthwhile, especially its attitude that the overall design of a system is constantly shifting. Individual programmers add to or remove from its complexity in small increments every time they make changes to the system. Cutting corners too often will leave the code in a state that is hard to recover from.
My own attitudes to software design align well with Ousterhout’s except for comments and tests. The author uses comments as a design aid: Writing interface comments first, before implementing any code, so that they guide the design. It gets the programmer thinking about how the module will be used, instead of how it will be implemented. As for tests:
The problem with test-driven development is that it focuses attention on getting specific features working, rather than finding the best design.
I wholeheartedly agree with the goal of writing comments first: outside-in thinking results in better design. Focusing on how a module will be used from a caller’s perspective improves the module’s API. Sometimes comments can serve that purpose, but I think that the author misses the point that test-driven development (TDD) accomplishes that purpose as well. When you write your tests first, by definition you are forced to think about how the module will be called, because the test itself uses it! In fact, TDD works best when you start writing tests in the outermost layer of your system and work your way inwards. It takes some time to get used to, because the outermost test won’t pass until the innermost implementation is complete. The gain is that those tests inform the design through the layers. As for the criticism about TDD being too focused on getting specific features working, I think that describes a “shallow” TDD. TDD is typically a red-green-refactor loop. Red: write a failing test. Green: make it pass. Refactor: improve the design. I would agree with Ousterhout if we stopped at red-green, but the last step, the refactor, is what makes it complete: red improves the API design, green makes it correct, refactor improves the internal design.
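A toy illustration of the loop (the #slugify method and its spec are hypothetical):

```ruby
# Red: write the test helper and the failing expectation first. The test
# describes the API from the caller's perspective, before #slugify exists.
def assert_equal(expected, actual)
  raise "Expected #{expected.inspect}, got #{actual.inspect}" unless expected == actual
end

# Green: the simplest implementation that makes the expectation pass.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

assert_equal "hello-world", slugify("Hello, World!")

# Refactor: with the test as a safety net, improve the internals
# without changing the behavior the test pins down.
```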
Links:
I’ve seen some recent posts on social media about the great performance of Ruby + YJIT. It’s time to give it a try!
I got it working locally with asdf:
$ asdf install rust 1.72.1
$ export ASDF_RUST_VERSION=1.72.1
$ export RUBY_CONFIGURE_OPTS=--enable-yjit
$ asdf install ruby 3.2.1
$ asdf shell ruby 3.2.1
$ ruby --yjit -v
ruby 3.2.1 (2023-02-08 revision 31819e82c8) +YJIT [arm64-darwin22]
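Once it’s running, you can also confirm from inside a Ruby process whether YJIT is actually active (RubyVM::YJIT ships with CRuby; the constant may be absent on builds compiled without Rust, so the sketch below checks for it first):

```ruby
# Report whether this Ruby was built with YJIT and whether it is enabled.
yjit_built = defined?(RubyVM::YJIT) ? true : false
yjit_on = yjit_built && RubyVM::YJIT.enabled?

puts "YJIT built in: #{yjit_built}"
puts "YJIT enabled:  #{yjit_on}"
```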
The author points out that rails app:update should be used with caution, because it might make unwanted changes to your application or remove manually added configuration. Fair enough. What I don’t understand is the remedy: not using it at all! That is what version control is for. I’ve upgraded multiple apps, multiple times, using rails app:update. In every case, before committing the changes to version control, I inspect each one and make an informed decision about whether to keep it.