Ylan Segal

Redirecting to an External Server May Leak Tokens in Headers

While working on an HTTP API that serves binary files to client applications, I came upon some unexpected behavior.

Imagine that we have a /file/:id endpoint that, instead of responding with the binary, redirects to an external storage service, like AWS S3. Our endpoint is also protected: users need an access token. A typical request/response cycle:

$ curl --include --header "Authorization: Bearer SECRET_TOKEN" http://localhost:3000/file/12345
HTTP/1.1 302 Found
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Location: https://example.com/some-path
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
X-Request-Id: 8025fbf8-8513-401b-8ebc-32752cfd7c59
X-Runtime: 0.002428
Transfer-Encoding: chunked

<html><body>You are being <a href="https://example.com/some-path">redirected</a>.</body></html>

Now, let’s instruct curl to follow redirects and be more verbose so that we can see the headers sent in the requests, as well as the responses. I’ll omit some output (with ...) for clarity.

$ curl --verbose --location --header "Authorization: Bearer SECRET_TOKEN" http://localhost:3000/file/12345
> GET /file/12345 HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.43.0
> Accept: */*
> Authorization: Bearer SECRET_TOKEN
>
< HTTP/1.1 302 Found
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: https://example.com/some-path
< Content-Type: text/html; charset=utf-8
< Cache-Control: no-cache
< X-Request-Id: 4c2254fb-d5a5-46d2-9a8b-9cbfecc8b2ec
< X-Runtime: 0.002537
< Transfer-Encoding: chunked
<

> GET /some-path HTTP/1.1
> Host: example.com
> User-Agent: curl/7.43.0
> Accept: */*
> Authorization: Bearer SECRET_TOKEN
>
< HTTP/1.1 404 Not Found
...

curl, as requested, followed the redirect, but in doing so it included the original Authorization header in the request to another domain [1]. We have just leaked our secret, handing a third party a valid token to access our system. To be fair, after some thought, I think it is reasonable for curl to interpret the header as one to be sent on all requests, since we are also telling it to follow redirects. From the manual:

WARNING: headers set with this option will be set in all requests - even after redirects are followed, like when told with -L, --location. This can lead to the header being sent to other hosts than the original host, so sensitive headers should be used with caution combined with following redirects.

Who does that?

curl’s behavior (sending explicitly set headers on redirects) was also observed in some other User Agents, notably the library used by one of our client applications. However, it doesn’t seem to be universal. For example, httpie does not leak the header:

$ http --verbose --follow http://localhost:3000/file/12345 "Authorization: Bearer SECRET_TOKEN"
GET /file/12345 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Authorization: Bearer SECRET_TOKEN
Connection: keep-alive
Host: localhost:3000
User-Agent: HTTPie/0.9.6



HTTP/1.1 302 Found
Cache-Control: no-cache
Content-Type: text/html; charset=utf-8
Location: https://example.com/some-path
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: a53eb96b-f58c-4eb0-bbbd-bdca3eee8cc6
X-Runtime: 0.002384
X-XSS-Protection: 1; mode=block

GET /some-path HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: example.com
User-Agent: HTTPie/0.9.6



HTTP/1.1 404 Not Found
...

As you can see, the Authorization header is conspicuously absent from the second request.
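Clients that avoid the leak are, in effect, doing an origin check before re-attaching credentials to the redirected request. A minimal sketch of that logic in Python (the helper names are mine, not any particular library’s implementation):

```python
from urllib.parse import urlsplit

def same_origin(url_a, url_b):
    """Two URLs share an origin when scheme, host and port all match.

    Default ports are not normalized in this sketch.
    """
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)

def headers_for_redirect(headers, original_url, redirect_url):
    """Drop the Authorization header before following a cross-origin redirect."""
    if same_origin(original_url, redirect_url):
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() != 'authorization'}
```

For the example above, following the redirect from localhost:3000 to example.com would keep Accept but drop Authorization.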

Mitigation

Since we can’t predict the behavior of every User Agent that will use our API, we can design the API differently on the server side.

Use Token As a Parameter

If we are using OAuth2 (which my example implies, given the use of a Bearer token), the specification allows the token to be passed as a URI query parameter named access_token. Since that makes it part of the original URL, it will certainly not be included by any client that follows the redirect. However, I have seen this usage flagged as risky by several security audits. One of the objections is that parameters in URLs are commonly written to logs, exposing tokens unnecessarily.

The OAuth2 specification also allows a form-encoded body parameter, also named access_token. This keeps the token out of the URL, and it won’t be sent on any redirect. However, the request must have an application/x-www-form-urlencoded content type, which may conflict with the rest of the application wanting application/json or similar.
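Both parameter styles can be built with nothing but the standard library. A sketch in Python, reusing the made-up endpoint and token from the examples above:

```python
from urllib.parse import urlencode

token = {'access_token': 'SECRET_TOKEN'}

# Query parameter: the token rides on the original URL only; the redirect
# Location is a fresh URL, so no client carries the token along.
# Downside: URLs (token included) routinely end up in server logs.
query_url = 'http://localhost:3000/file/12345?' + urlencode(token)

# Form-encoded body parameter: keeps the token out of the URL, but forces
# the application/x-www-form-urlencoded content type on the request.
form_body = urlencode(token).encode()
```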

Use Basic Authentication

Basic Authentication is a method for a User Agent to provide credentials (usually a username and password) to the server. Most User Agents support it well and understand that its use is limited to the original URL.

$ curl --verbose --location --user SECRET_TOKEN: http://localhost:3000/file/12345
> GET /file/12345 HTTP/1.1
> Host: localhost:3000
> Authorization: Basic U0VDUkVUX1RPS0VOOg==
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 302 Found
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Location: https://example.com/some-path
< Content-Type: text/html; charset=utf-8
< Cache-Control: no-cache
< X-Request-Id: f3211d6c-4e77-448b-a44d-7ad080fe5d3f
< X-Runtime: 0.002391
< Transfer-Encoding: chunked
<

> GET /some-path HTTP/1.1
> Host: example.com
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
...

The U0VDUkVUX1RPS0VOOg== in the Authorization header above is the secret, Base64 encoded:

$ echo U0VDUkVUX1RPS0VOOg== | base64 --decode
SECRET_TOKEN:
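The same construction can be done programmatically: the Basic scheme is just "user:password" run through Base64. A sketch in Python, using the made-up token from above:

```python
import base64

# Token as the username, empty password, matching the curl --user invocation.
credentials = 'SECRET_TOKEN:'
header = 'Basic ' + base64.b64encode(credentials.encode()).decode()
# header == 'Basic U0VDUkVUX1RPS0VOOg=='
```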

Don’t Redirect At All

Of course, redirecting is not the only option: your endpoint can act as a proxy, reading the contents from the external server and passing them along to the client. The penalty is that the client’s connection to your server stays open longer, consuming more compute resources and transferring more data than a redirect would.
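The proxying approach can at least stream, so the whole file is never buffered in memory. A framework-agnostic sketch in Python (it assumes your web framework accepts an iterator as a response body; the helper name is mine):

```python
def stream_upstream(source, chunk_size=64 * 1024):
    """Yield the upstream body in fixed-size chunks instead of reading it whole."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

Here, source is the file-like response body obtained from the storage service. The client only ever talks to our origin, so its Authorization header never reaches a third party.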

Conclusion

Be careful when redirecting to external servers while using header-based authentication. Some clients may forward those headers to a third party.


  1. We can ignore the 404 response. This is a made up example, and it’s irrelevant how the external server actually responded.

The REPL: Issue 25 - August 2016

Types

Gary Bernhardt writes a great article on types, type systems, and the differences in typing among programming languages. He clarifies some of the adjectives commonly associated with types: static, dynamic, weak, strong. It’s a very interesting read, as are some of the comments on the gist. Gary has also re-started his Destroy All Software screencast series: I haven’t watched any of the new ones, but I learned a lot from the old ones.

CloudFlare, SSL and unhealthy security absolutism

Troy Hunt explores the services that CloudFlare provides as a content delivery network (CDN), in particular with respect to SSL (or, more properly, TLS). As with most interesting things in life, it’s not black and white: CloudFlare is not evil – as some recent blog posts claim – and provides valuable services, but users need to be aware of what the security guarantees are, or more importantly, what they are not. Security is hard and nuanced. The more you know…

The Log: What every software engineer should know about real-time data’s unifying abstraction

In the last few weeks I have been reading a lot about data pipelines. Many companies have been moving from a centralized database for all their data to distributed systems, which present a new set of challenges. In particular: how to make the data produced in one system available to other systems in a robust and consistent manner. In this article, Jay Kreps explains the Log in detail – the underlying abstraction necessary to understand database systems, replication, transactions, etc. The Log, in this context, refers to a storage abstraction: an append-only, totally-ordered sequence of records, ordered by time. The article is long, but thorough and absolutely worth your time. Many of the concepts are similar to what is described in a post about Apache Samza, also an enlightening read.

The REPL: Issue 24 - July 2016

A Critique of the CAP Theorem

Papers We Love San Diego is having their first meeting later this month, which unfortunately I won’t be able to attend. I was somewhat intimidated about reading Computer Science papers because of my lack of formal training, but Martin Kleppmann’s paper is very approachable. I found it interesting and insightful, and I was familiar with most of the concepts on which it builds. I’m looking forward to the next meeting.

How To Explain Zero-Knowledge Protocols To Your Children

A few hops away from a story on Hacker News, I found this whimsical introduction to Zero-Knowledge protocols, which I was ignorant of. If the topic piques your interest, read another introductory article.

Ten Rules For Negotiating a Job Offer

Salary negotiation can be uncomfortable. It’s a skill that we usually get to practice only once every few years. In the first part of a series, Haseeb gives practical advice on how and why to negotiate your salary. I found the part about exploding offers particularly interesting. As it happens, last year I received an exploding job offer that was good only until the end of the day! I played my cards pretty much as the author suggests: the company never relented; they would not give me any more time. I walked away without any regrets. That particular company’s high-pressure tactics, more than anything else, tell me that I would not have been a good fit.

The REPL: Issue 23 - June 2016

Flirting with Crystal, a Rubyist Perspective

AkitaOnRails writes about his perspective on Crystal – a new programming language that aims to be type-checked, compile to native code, and have a syntax similar to Ruby. I have played with Crystal myself recently and found the discussion thoughtful and interesting. Lately, it seems that Crystal is gathering steam, especially since Mike Perham ported Sidekiq and has been tweeting about it.

My Candidate Description

Erik Dietrich lays out the requirements that companies must meet for him to consider working for them. My list would certainly be different, but that is the point. There is high demand for Software Engineers. The same might not hold for other industries, where people don’t have much choice but to take what is offered. Instead of taking the first option presented, let’s be more mindful of what we want from an employer.

StartEncrypt considered harmful today

Notwithstanding the clichéd title, this article shows how easy it is to get security wrong. The tragic part is that the security flaws come from a Certificate Authority, StartCom. As it happens, it’s the CA used for the certificate of this very blog (at the time of writing). I’ll have to re-consider that decision soon. Also clear from the story is that Let’s Encrypt – whose functionality StartCom was trying to replicate – is putting some pressure on CAs. Some CAs are even trying to steal their brand.

moreutils

Unix tools have been around for a long time and haven’t changed much. Joey Hess took it upon himself to write new, simple tools he thought were missing (rejecting some ideas along the way). I downloaded moreutils as soon as I read the descriptions. I am sure they will come in handy very soon. Kudos.

Secure Dotfiles With Git-crypt

I spend a lot of time on the command line, working with Unix commands. (If you don’t, you can check out my talk Practical Unix for Ruby and Rails). Over the years, I have configured and tweaked my shell to my liking. Most Unix configuration is done in “dotfiles”: configuration files read by Unix utilities, usually residing in your home directory and named with a leading ., from which their name derives.

I keep my dotfiles in a github repository. This lets me track my changes in a familiar way and keep more than one computer in sync. It is especially helpful when setting up a new machine: I check out the repository and link the files. See rcm for a handy utility to manage the linking.

Most dotfiles contain preferences that can be shared publicly with the whole world, like .gitconfig, .zshrc, .bash_profile, etc. However, some are sensitive – like .netrc or .ssh – and should NOT be kept in a public repository, even though many people do. Until today, I kept those in a safe place separately and copied them manually to new machines.

Enter git-crypt

git-crypt enables transparent encryption and decryption of files inside a git repository. It leverages git’s filter mechanism and gpg, a free implementation of the OpenPGP standard, which provides public-private key cryptography. After following the instructions below, you can keep an encrypted version of sensitive files in a public repository without divulging any secrets AND work seamlessly with those files on your local machine.
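Concretely, git-crypt registers itself as a clean/smudge filter: files are encrypted ("cleaned") on the way into git’s object store and decrypted ("smudged") on checkout. The configuration it writes into .git/config looks roughly like this (illustrative, reproduced from memory rather than copied from a live repository):

```ini
[filter "git-crypt"]
	smudge = "git-crypt" smudge
	clean = "git-crypt" clean
	required = true
[diff "git-crypt"]
	textconv = "git-crypt" diff
```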

Nitty Gritty

Setup GPG

The ins and outs of GPG are beyond the scope of this post. On my Mac I use GPGTools, which makes setting up a private key very easy. Be sure you understand how to manage your keys: if you lose your private key or forget its password, you won’t be able to get any of the encrypted information back.

Install git-crypt

git-crypt is available on most package managers. In my case:

$ brew install git-crypt

Configure git repository

Inside the git repository where you want to add the protected files, start by initializing git-crypt:

$ git-crypt init

Edit the .gitattributes file. In it, you direct git to use git-crypt as a filter for specific files (or file patterns). Make sure that you do this before committing the sensitive files to git.

# .gitattributes
secretfile filter=git-crypt diff=git-crypt
*.key filter=git-crypt diff=git-crypt

Next, give access to specific users. This is where the interaction with gpg comes in. Add yourself and any other user that should have access to the encrypted files (you will need their gpg public keys in your keyring).

$ git-crypt add-gpg-user USERID

USERID is a key fingerprint or an email address.

After this, you can continue with business as usual: use git to stage, commit, or diff your files. Locally, the files appear to be in plaintext, but when pushed to a remote repository they are encrypted binaries.

At any time, you can lock the repository, like so:

$ git-crypt lock

This leaves the repository in the same state it would be in when checked out by another user (or by yourself on another machine): all files encrypted and unreadable. To unlock:

$ git-crypt unlock

Conclusion

I am very happy with the new setup: I can work with my dotfiles in a single repository and keep sensitive information secure with strong encryption. As always, security requires some diligence on the user’s part. git-crypt reduces the inconvenience to a minimum and takes only a few minutes to set up, assuming you already know how to use gpg.