The Yelp Engineering team has been posting regularly about they structure data consumption between different teams. The backbone of their system is Apache Kafka, but they have created a lot of tooling around it. In this announcement, they have open sourced (Apache License 2.0) many of these tools. MySQL Streamer pipes data from MySQL to Kafka. Schematizer stores and tracks the various data schemas used throughout their pipeline. There is a lot to learn in this projects and the blog itself, which provides an overview of how they approach dealing with data.
Troy Hunt writes a lengthy post about his experience offshoring development work to teams in India, China and the Philippines. He goes through the motivation for offshoring in the first place, the challanges and rewards, and the differences he encountered in different countries. His conclusion:
if you’re looking at hourly rate as a metric for outsourcing success, you’re doing it very, very wrong!
The United States National Institute for Standards and Technology has come up with new guidelines for password policies. If you are wondering which password rules to follow in your product, these are a great baseline. Note that the NIST policies contradict the FBI’s. While you are at it, consider if you actually need to store a password at all. Medium, for example, emails you a link to log in. Because of “forgot your password” functionality in most sites, access to your email is essentially equal to access to the site. Medium just made it explicit and removed the need for them to store users passwords. Those passwords are probably re-used elsewhere. If you don’t store them, you can’t loose them. Right?
This week at RubyConf I learned about two new tools.
Karafka is a framework used to simplify Apache Kafka based Ruby applications development. It looks like a Rails-like abstraction to remove some of the boilerplate and decisions around how to structure a Kafka application. I don’t know if it’s ready for production, but worth keeping an eye on it.
Victor Shepelev writes his opinion about RSpec and MiniTest and how the differ. I don’t subscribe to all the author’s opinions or conclusions, but I do prefer RSpec and I have never found the “It’s just Ruby” argument for MiniTest very convincing. If anything, I find that having a distinct shape, structure and feel for test is a net positive. It promotes shifting from “This is the part that specifies behavior” to “This is the part that implements behavior” in a cleaner way.
Being a good and kind person pays dividends. I love this story. You should read it.
xargsis one of my go-to tools in Unix. It reads lines from stdin and executes another command with each line as an argument. It’s very useful to glue commands together.
Sean Kelly writes a cautionary post about microservices, organized into debunking 5 fallacies that he has encountered about microservices: They keep the code cleaner, they are easy, they are faster, they are simple for engineers and, they scale better.
While looking into Apache Milagro, I found a link to this short paper on the math behind public-key cryptography. It’s a great introduction, or refresher, to the mathematics that makes the secure web work. The paper itself has no author information, but the URL suggests that it written by Kathryn Mann at the University of California at Berkley.
Olivier Lacan has a great explanation of Koichi Sasada recent proposal for bringing better parallelism to Ruby 3. The proposal is to introduce a new abstraction, called Guilds that is implemented in terms of existing Threads and Fibers, but can actually execute in parallel, because they have stronger guarantees around accessing shared state. In particular, guilds won’t be able to access objects in other guilds, without explicitly transferring them via channels. It’s exciting to think about Ruby’s performance not being bound by the Global Interpreter Lock (GIL).