Ylan Segal

The REPL: Issue 26 - September 2016

Microservices – Please, don’t

Sean Kelly writes a cautionary post about microservices, organized around debunking five fallacies that he has encountered: that they keep the code cleaner, that they are easy, that they are faster, that they are simple for engineers, and that they scale better.

The science of encryption: prime numbers and mod n arithmetic

While looking into Apache Milagro, I found a link to this short paper on the math behind public-key cryptography. It’s a great introduction, or refresher, to the mathematics that makes the secure web work. The paper itself has no author information, but the URL suggests that it was written by Kathryn Mann at the University of California at Berkeley.

Concurrency in Ruby 3 with Guilds

Olivier Lacan has a great explanation of Koichi Sasada’s recent proposal for bringing better parallelism to Ruby 3. The proposal is to introduce a new abstraction called Guilds, implemented in terms of existing Threads and Fibers, that can actually execute in parallel because guilds have stronger guarantees around accessing shared state. In particular, guilds won’t be able to access objects in other guilds without explicitly transferring them via channels. It’s exciting to think about Ruby’s performance not being bound by the Global Interpreter Lock (GIL).

Goodbye StartSSL, Hello Let's Encrypt

Mozilla is considering taking action against two Certificate Authorities, WoSign and StartCom, after an investigation into improper behavior, including not reporting that WoSign bought StartCom outright.

As I wrote about earlier, this blog used a StartCom TLS certificate, under their StartSSL brand, which was free. At the time, the only reason I didn’t pick Let’s Encrypt was that its certificates expire every 3 months. However, given the contents of the report, I would much rather use an organization that wants to make the web better, not exploit it.

Obtaining and installing the new certificate turned out to be an easy process.

Obtaining A Certificate From Let’s Encrypt

I used the certbot client to obtain a certificate. On my Mac, I installed it via Homebrew:

$ brew install certbot

certbot can request and install the certificate if it’s executed on the same machine that runs the web server. In my case, I just wanted the certificates to be generated and downloaded locally.

$ sudo certbot certonly --manual

During the in-terminal process, certbot will ask for the intended domain and instruct you to make some specified content available at a particular URL in that domain. This is to prove that you, the person requesting the TLS certificate, are an administrator for that domain. For me, this involved copying one new file to my hosting service.

After that, the certificate is issued immediately and available locally at /etc/letsencrypt/live/ylan.segal-family.com (your domain will vary).
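
To make the "specified content at a particular URL" step more concrete, here is a minimal Python sketch of placing such a file in a local copy of the site. It assumes the challenge is the ACME HTTP-01 type, which is fetched from a path under /.well-known/acme-challenge/. The function name, the web root directory, the token, and the content are all hypothetical placeholders, not values printed by certbot.

```python
from pathlib import Path


def write_challenge_file(webroot: str, token: str, content: str) -> Path:
    """Write the challenge content where the web server will serve it.

    Assumption: an HTTP-01 challenge, which the CA fetches from
    http://<domain>/.well-known/acme-challenge/<token>.
    """
    challenge_dir = Path(webroot) / ".well-known" / "acme-challenge"
    # Create the nested directories if the site does not have them yet.
    challenge_dir.mkdir(parents=True, exist_ok=True)
    challenge_file = challenge_dir / token
    challenge_file.write_text(content)
    return challenge_file
```

The resulting file still has to be uploaded to wherever the site is actually hosted, which in my case was the single file copy mentioned above.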

Installing The New Certificate

The last time I installed a certificate, I had to open a support ticket at Nearly Free Speech. Since then, they have made the process automated and available from the control panel. The instructions are to paste the certificate (including the full cert chain) and the private key into the same field of the provided form:

$ cat fullchain.pem privkey.pem | pbcopy

A few seconds later, the new certificate was installed and being served on this domain.

Conclusion

The Let’s Encrypt process ended up being simpler than StartSSL, since there was no need to manually create the private key and certificate signing request: It’s all done with one command.

Redirecting to an External Server May Leak Tokens in Headers

While working on an HTTP API that serves binary files to client applications, I came upon some unexpected behavior.

Imagine that we have a /file/:id endpoint, but that instead of responding with the binary, it redirects to an external storage service, like AWS S3. Our endpoint is also protected, so that users need an access token. A typical request/response cycle:

$ curl --include --header "Authorization: Bearer SECRET_TOKEN" http://localhost:3000/file/12345
HTTP/1.1 302 Found
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Location: https://example.com/some-path
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
X-Request-Id: 8025fbf8-8513-401b-8ebc-32752cfd7c59
X-Runtime: 0.002428
Transfer-Encoding: chunked

<html><body>You are being <a href="https://example.com/some-path">redirected</a>.</body></html>

Now, let’s instruct curl to follow redirects and be more verbose so that we can see the headers sent in the requests, as well as the responses. I’ll omit some output (with ...) for clarity.

$ curl --verbose --location --header "Authorization: Bearer SECRET_TOKEN" http://localhost:3000/file/12345
> GET /file/12345 HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.43.0
> Accept: */*
> Authorization: Bearer SECRET_TOKEN
>
< HTTP/1.1 302 Found
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: https://example.com/some-path
< Content-Type: text/html; charset=utf-8
< Cache-Control: no-cache
< X-Request-Id: 4c2254fb-d5a5-46d2-9a8b-9cbfecc8b2ec
< X-Runtime: 0.002537
< Transfer-Encoding: chunked
<

> GET /some-path HTTP/1.1
> Host: example.com
> User-Agent: curl/7.43.0
> Accept: */*
> Authorization: Bearer SECRET_TOKEN
>
< HTTP/1.1 404 Not Found
...

curl, as requested, followed the redirect response, but in doing so, it included the original Authorization header in the request to another domain [1]. We have just leaked our secret and given a third party a valid token to access our system. To be fair, after some thought, I think it’s reasonable for curl to interpret that the header is to be sent in all requests, since we are also telling it to follow redirects. From the manual:

WARNING: headers set with this option will be set in all requests - even after redirects are followed, like when told with -L, --location. This can lead to the header being sent to other hosts than the original host, so sensitive headers should be used with caution combined with following redirects.

Who does that?

curl’s behavior (sending explicitly set headers on redirects) was also observed in some other User Agents, notably the library used by one of our client applications. However, it doesn’t seem to be universal. For example, httpie does not leak the header:

$ http --verbose --follow http://localhost:3000/file/12345 "Authorization: Bearer SECRET_TOKEN"
GET /file/12345 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Authorization: Bearer SECRET_TOKEN
Connection: keep-alive
Host: localhost:3000
User-Agent: HTTPie/0.9.6



HTTP/1.1 302 Found
Cache-Control: no-cache
Content-Type: text/html; charset=utf-8
Location: https://example.com/some-path
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: a53eb96b-f58c-4eb0-bbbd-bdca3eee8cc6
X-Runtime: 0.002384
X-XSS-Protection: 1; mode=block

GET /some-path HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: example.com
User-Agent: HTTPie/0.9.6



HTTP/1.1 404 Not Found
...

As you can see, the Authorization header is conspicuous for its absence in the second request.
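
For illustration, here is a minimal Python sketch of one policy a client could apply to get the behavior seen above: keep the headers when the redirect stays on the same origin, and drop sensitive ones otherwise. I have not checked how httpie implements this, so treat the function names and the set of sensitive headers as assumptions.

```python
from urllib.parse import urlsplit

# Assumption: only Authorization is treated as sensitive in this sketch.
SENSITIVE_HEADERS = {"authorization"}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def headers_for_redirect(original_url: str, redirect_url: str, headers: dict) -> dict:
    """Return the headers that are safe to send to the redirect target."""
    if origin(original_url) == origin(redirect_url):
        return dict(headers)
    # Different origin: strip anything that could carry credentials.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }
```

With the URLs from the example, a redirect from http://localhost:3000/file/12345 to https://example.com/some-path would keep Accept and drop Authorization.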

Mitigation

Since we can’t predict the behavior of all User Agents that are going to use our API, we can design our APIs differently on the server.

Use Token As a Parameter

If we are using OAuth2 (which my example implies, because of the use of a Bearer token), the specification allows for the token to be passed as a URI Query Parameter named access_token. Since that makes it part of the original URL, it will certainly not be included by any client that follows redirection. However, I have seen its use flagged as risky by several security audits. One of the objections is that parameters in URLs are commonly written to logs, which exposes tokens unnecessarily.

The OAuth2 specification also allows a Form-Encoded Body Parameter, also named access_token. This keeps the token out of the URL, and it still won’t be sent on any redirect. However, the request must have an application/x-www-form-urlencoded content type, which may conflict with the rest of the application wanting it to be application/json or similar.
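
Both ways of passing the token as a parameter can be sketched from the client side in Python. This only builds the requests and does not send them. The URL and token are the placeholders used throughout this post, and the variable names are mine. Note one assumption: as I read the Bearer token specification (RFC 6750), the body parameter is not allowed on GET requests, so the second request uses POST, which the /file/:id endpoint would have to accept.

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN = "SECRET_TOKEN"  # placeholder token from the examples above
BASE_URL = "http://localhost:3000/file/12345"

# Option 1: URI query parameter. The token becomes part of the URL.
query_request = Request(f"{BASE_URL}?{urlencode({'access_token': TOKEN})}")

# Option 2: form-encoded body parameter. The token travels in the body,
# and the content type must be application/x-www-form-urlencoded.
form_request = Request(
    BASE_URL,
    data=urlencode({"access_token": TOKEN}).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",  # assumption: the endpoint accepts POST
)
```

Neither request carries an Authorization header, so there is nothing for a redirect-following client to forward.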

Use Basic Authentication

Basic Authentication is a method for a User Agent to provide credentials to the server (usually username and password). Most User Agents have good support for it and understand that its use is limited to the original URL.

$ curl --verbose --location --user SECRET_TOKEN: http://localhost:3000/file/12345
> GET /file/12345 HTTP/1.1
> Host: localhost:3000
> Authorization: Basic U0VDUkVUX1RPS0VOOg==
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 302 Found
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: https://example.com/some-path
< Content-Type: text/html; charset=utf-8
< Cache-Control: no-cache
< X-Request-Id: f3211d6c-4e77-448b-a44d-7ad080fe5d3f
< X-Runtime: 0.002391
< Transfer-Encoding: chunked
<

> GET /some-path HTTP/1.1
> Host: example.com
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
...

The U0VDUkVUX1RPS0VOOg== in the Authorization header above is the secret, Base64 encoded:

$ echo U0VDUkVUX1RPS0VOOg== | base64 --decode
SECRET_TOKEN:
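
The same encoding can be sketched in Python, together with how a server might recover the token from the header. The function names are hypothetical. Using the token as the username with an empty password mirrors the curl --user SECRET_TOKEN: invocation above.

```python
import base64


def basic_auth_header(username: str, password: str = "") -> str:
    """Build the Authorization header value for Basic Authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def token_from_basic_auth(header_value: str) -> str:
    """Extract the username (our token) from a Basic Authorization header."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Everything before the first colon is the username.
    username, _, _password = decoded.partition(":")
    return username
```

Base64 is an encoding, not encryption, so the token is only as private as the connection it travels over.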

Don’t Redirect At All

Of course, redirecting is not the only option: Your endpoint can act as a proxy, reading the contents from the external server and passing them along to the client. The penalty is that the client connection to your server will stay open longer, and your server will consume more computation resources and transfer more data than it would for a redirect.
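
The core of such a proxy is a copy loop. The following Python sketch assumes the upstream response and the client connection are both exposed as binary file-like objects. The function name and the chunk size are arbitrary choices of mine, not part of any framework.

```python
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # arbitrary; tune for your workload


def proxy_body(upstream: BinaryIO, client: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy the upstream body to the client in chunks; return bytes copied."""
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:
            # An empty read means the upstream body is exhausted.
            break
        client.write(chunk)
        total += len(chunk)
    return total
```

Copying in chunks keeps memory use bounded for large files, but the server stays busy for the whole transfer, which is the cost described above.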

Conclusion

Be careful when redirecting to external servers while using header-based authentication. Some clients may forward those headers along to a third party.


  1. We can ignore the 404 response. This is a made-up example, and it’s irrelevant how the external server actually responded.

The REPL: Issue 25 - August 2016

Types

Gary Bernhardt writes a great article on types, type systems and the differences in typing in different programming languages. He clarifies some of the adjectives commonly associated with types: static, dynamic, weak, strong. It’s a very interesting read, as are some of the comments in the gist. Gary has also re-started his Destroy All Software screencast series: I haven’t watched any of the new ones, but I learned a lot from the old ones.

CloudFlare, SSL and unhealthy security absolutism

Troy Hunt explores the services that CloudFlare provides as a content delivery network (CDN), in particular with respect to SSL (or, more properly, TLS). As with most interesting things in life, it’s not black and white: CloudFlare is not evil, as some recent blog posts claim, and it provides valuable services, but users need to be aware of what the security guarantees are or, more importantly, what they are not. Security is hard and nuanced. The more you know…

The Log: What every software engineer should know about real-time data’s unifying abstraction

In the last few weeks I have been reading a lot about data pipelines. Many companies have been moving from centralized databases for all their data to distributed systems that present a set of challenges, in particular how to make the data produced in one system available to other systems in a robust and consistent manner. In this article, Jay Kreps explains the Log in detail: the underlying abstraction necessary to understand database systems, replication, transactions, etc. The Log, in this context, refers to a storage abstraction that is an append-only, totally-ordered sequence of records, ordered by time. The article is long, but thorough and absolutely worth your time. Many of the concepts are similar to what is described in a post about Apache Samza, also an enlightening read.

The REPL: Issue 24 - July 2016

A Critique of the CAP Theorem

Papers We Love San Diego is having their first meeting later this month, which unfortunately I won’t be able to attend. I was somewhat intimidated about reading Computer Science papers because of my lack of formal training, but Martin Kleppmann’s paper is very approachable. I found the paper very interesting and insightful and found that I was familiar with most of the concepts on which the paper is based. I’m looking forward to the next meeting.

How To Explain Zero-Knowledge Protocols To Your Children

A few hops away from a story on Hacker News, I found this whimsical introduction to Zero-Knowledge protocols, which I was ignorant of. If the topic piques your interest, read another introductory article.

Ten Rules For Negotiating a Job Offer

Salary negotiation can be uncomfortable. It’s a skill that we usually get to practice only once every few years. In the first part of a series, Haseeb gives practical advice on how and why to negotiate your salary. I found the part about exploding offers particularly interesting. As it happens, last year I received an exploding job offer that was good until the end of the day! I played my cards pretty much as the author suggests: The company never relented. They would not give me any more time. I walked away without any regrets. That particular company’s high-pressure tactics, more than anything else, tell me that I would not have been a good fit.