• The REPL: Issue 70 - June 2020

    Why Your Microservices Architecture Needs Aggregates

    Dave Taubler writes a comprehensive piece about the use of Domain-Driven Design concepts in microservice architectures. The use of aggregates, entities and invariants can prevent unwanted dependencies between objects and leaky references, and can delineate a clear boundary around groups of data. Ultimately, the use of aggregates simplifies things like sharding, message passing, idempotent retries, caching and tracking changes.

    I’ve been thinking about event schema design a lot lately, and found it interesting that, in the context of messaging between services via a message broker, the author advises using the same aggregate as the message:

    So what should we pass as our messages? As it turns out, if we’ve embraced Aggregates, then we have our clear answer. Anytime an Aggregate is changed, that Aggregate should be passed as the message.
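
    A minimal sketch of that advice, assuming a hypothetical `Order` aggregate with `OrderLine` entities (neither name comes from the article) and JSON as the wire format. The point it illustrates is that the message carries the whole aggregate after a change, not only the field that changed:

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class OrderLine:
    # Entity that lives inside the aggregate boundary (hypothetical).
    sku: str
    quantity: int


@dataclass
class Order:
    # Aggregate root (hypothetical): lines are only changed through the order.
    order_id: str
    status: str = "open"
    lines: List[OrderLine] = field(default_factory=list)

    def add_line(self, sku: str, quantity: int) -> None:
        # Invariant enforced at the aggregate boundary.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.lines.append(OrderLine(sku, quantity))


def to_message(order: Order) -> str:
    # Serialize the entire aggregate, including its nested entities.
    return json.dumps(asdict(order), sort_keys=True)


order = Order(order_id="o-1")
order.add_line("sku-42", 2)
message = to_message(order)  # what would be handed to the message broker
```

    Because every message holds the full state of the aggregate, a consumer that receives the same message twice can simply overwrite what it has, which is one way the idempotent retries mentioned above become simpler.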

    Why Tacit Knowledge is More Important Than Deliberate Practice

    The author talks about tacit knowledge:

    Tacit knowledge is knowledge that cannot be captured through words alone.

    Most of the article is new to me: The idea that there is knowledge that can’t be expressed through words alone, and that is distinct and separate from deliberate practice.

    I can relate to the part about expertise: An experienced software engineer can come up with a good design – or critique an existing one – in seconds. Clearly, they are not going down a list of heuristics one by one. In their mind, the patterns have been established and the brain quickly comes up with a solution. This is the subject of “Blink” by Malcolm Gladwell.

    My sense is that as a person gains expertise, they gain intuition about the field, and their brain gets wired for rapid pattern recognition. They reach this stage before they can put into words why their intuition went in that direction. That comes at a later stage of expertise, when the person can articulate and reason about the intuition and communicate it to others.

    I believe the author is saying that because people can’t articulate their intuition, they hold tacit knowledge that can’t be articulated. I think that is a flawed syllogism. In fact, the author points at examples of fields where it was previously thought that apprenticeship was the only way to transfer tacit knowledge. Then someone came along and turned that knowledge into explicit knowledge.

    I believe that knowledge can be gained by learning explicit and implicit knowledge. On one end, reading books and academic material goes a long way, but at times it can be disconnected from application: the proverbial ivory tower. On the other end, there are apprenticeships that focus on following rules and procedures, without real understanding, that can nevertheless bring proficiency: Most people learn to play the guitar like this. Of course, there is a hybrid: Academic project-based learning that aims to cover both modes of learning. In my experience all three can work, even for the same person. It depends a lot on familiarity with the material adjacent to the new material that we are learning, and how the new material fits into the learner’s existing mental model.

    There are interesting points in the article. My takeaway is that the real jump in understanding is when you can turn tacit knowledge into explicit knowledge, which is distinct from the author’s point about embracing tacit knowledge.

    Read on →

  • The REPL: Issue 69 - May 2020

    Schema evolution in Avro, Protocol Buffers and Thrift

    Martin Kleppmann writes about schema evolution in binary protocols – namely Avro, Protocol Buffers, and Thrift. Schema evolution is an important concern when building systems connected via event streams and immutable logs. This post inspired me to dig deeper into Avro Schema Evolution.

    Why does writing matter in remote work?

    The pandemic has shifted a lot of people to working remotely. Writing is an important part of remote work.

    While writing forces people to think clearly, writing also forces teams to think clearly.

    I think this is one of the most important points in the article: Writing things down helps develop a train of thought and connect things together. Holes in logic or implicit conjectures become evident. Writing for the consumption of others increases this effect.

    I’ve found that collaborating on well-written work is easier than collaborating on work that is hard to follow, doesn’t spell out assumptions, doesn’t show examples, etc.

    Tools for better thinking

    A collection of different tools for problem solving, systems thinking, and decision making. Some are new to me, but all are interesting and useful in different situations. Great to have as a reference.

    Read on →

  • Avro Schema Evolution

    I’m involved in an ambitious project to split a monolith into a service-oriented architecture connected via a stream of events using Kafka topics. Kafka is agnostic to the data serialization format used, and I’ve been looking at Avro in particular.

    Avro – like Protocol Buffers and Thrift – is a binary format, making it more space-efficient than a verbose text format like JSON. It stands out in that its schemas support logical and complex types, and in its decoupling between the writer’s schema and the reader’s schema, which provides flexibility.

    The schema for an application’s data is expected to change over time. In database-backed applications, this is typically done by changing the data shape with a migration. I’ve written about how deploying schema migrations needs to be thought through carefully. In that article I covered a few different strategies, but they all shared a common trait: All the data has one shape before the migration, and a different one afterwards. That is incompatible with applications that rely on an immutable log. Let’s explore Avro’s schema evolution.
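
    To make the writer/reader decoupling concrete, here is a simplified, pure-Python illustration of how a record written with an old schema can be read with a newer one. It does not use an Avro library and only covers flat records; the `User` schemas and the `resolve` helper are hypothetical. It mirrors two of Avro’s resolution rules: fields that only the writer knows are ignored, and fields that only the reader knows are filled from the reader’s default.

```python
# Sentinel so a default of None (Avro null) differs from "no default given".
_NO_DEFAULT = object()

# Hypothetical schema used when the event was written.
WRITER_SCHEMA_V1 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "legacy_code", "type": "string"},
    ],
}

# Hypothetical newer schema used by the consumer.
READER_SCHEMA_V2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record: dict, writer_schema: dict, reader_schema: dict) -> dict:
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for f in reader_schema["fields"]:
        name = f["name"]
        if name in writer_fields:
            resolved[name] = record[name]
        else:
            # Reader-only field: it must declare a default.
            default = f.get("default", _NO_DEFAULT)
            if default is _NO_DEFAULT:
                raise ValueError(f"no default for missing field {name!r}")
            resolved[name] = default
    # Writer-only fields (legacy_code here) are dropped.
    return resolved


old_event = {"id": 7, "legacy_code": "A1"}
new_view = resolve(old_event, WRITER_SCHEMA_V1, READER_SCHEMA_V2)
```

    The old event stays untouched in the log; only the reader’s view of it changes, which is why this approach fits an immutable log better than a migration that rewrites data in place.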

    Read on →

  • Book Review: Building Git

    Building Git by James Coglan

    git is a widely successful version control system, used in software companies large and small. Its distributed nature changed software development in many ways. In Building Git, James Coglan re-implements a subset of git’s functionality from the ground up, using Ruby, which has a large standard library and is higher-level than the original C.

    git itself is a large project with a lot of functionality. The book covers a lot of ground, in a step-by-step fashion. Each line of code is explained both conceptually and syntactically.

    Read on →

  • Postgres Ranges

    In my previous posts about bi-temporal data, I dealt with a lot of queries whose WHERE clauses operated on dates. For example:

    SELECT
        employee_id,
        committee_id
    FROM
        committee_membership
    WHERE
        valid_from <= '2020-05-02'
        AND '2020-05-02' < valid_up_to
        AND tx_applicable_from <= NOW()
        AND NOW() < tx_applicable_up_to
    

    The underlying schema looks like this:

    CREATE TABLE committee_membership (
      employee_id int NOT NULL,
      committee_id int NOT NULL,
      valid_from date NOT NULL,
      valid_up_to date NOT NULL,
      tx_applicable_from date NOT NULL,
      tx_applicable_up_to date NOT NULL
    )
    

    The four dates in the table share the same structure. There are two prefixes, valid and tx_applicable, and two suffixes, from and up_to. This structure hints that the dates represent two different concepts: An interval in time that delineates validity and an interval that delineates applicability.
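
    The same observation can be sketched outside the database. The Python below is only an illustration, with a hypothetical `DateRange` type and made-up sample rows: each from/up_to pair becomes one half-open interval, and the query’s `from <= d AND d < up_to` comparisons become a single containment check.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    # Half-open interval [start, end): start is included, end is excluded.
    start: date
    end: date

    def contains(self, day: date) -> bool:
        return self.start <= day < self.end


@dataclass(frozen=True)
class Membership:
    employee_id: int
    committee_id: int
    valid: DateRange          # stands in for valid_from / valid_up_to
    tx_applicable: DateRange  # stands in for tx_applicable_from / tx_applicable_up_to


def members_as_of(rows, valid_on: date, known_on: date):
    # Same filter as the WHERE clause above, one containment check per concept.
    return [
        (r.employee_id, r.committee_id)
        for r in rows
        if r.valid.contains(valid_on) and r.tx_applicable.contains(known_on)
    ]


# Made-up sample data for illustration only.
always = DateRange(date(2020, 1, 1), date(9999, 12, 31))
rows = [
    Membership(1, 10, DateRange(date(2020, 1, 1), date(2020, 6, 1)), always),
    Membership(2, 10, DateRange(date(2020, 1, 1), date(2020, 5, 2)), always),
]
result = members_as_of(rows, date(2020, 5, 2), date(2020, 5, 3))
```

    The second row is excluded because its validity ends on 2020-05-02 and the upper bound is exclusive, matching the strict `<` in the original query.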

    Read on →