Update 5/12/2016: Building token authentication in your single page application? JJWT is a Java library providing end-to-end JWT creation and verification, developed by our very own Les Hazlewood. Forever free and open-source (Apache License, Version 2.0), JJWT is simple to use and understand. We’d love to have you try it out, and let us know what you think! (And, if you’re a Node developer, check out NJWT!)
We talk a lot about Token Authentication, but before diving into the details of how to use tokens, it’s critical for developers to understand the underlying security issues. Why do tokens matter, and what types of vulnerabilities do they protect an application from?
In this post, I will cover some of the best techniques to secure webapps and how to handle the pitfalls with those approaches. This post applies to all modern programming languages.
Buckle up – we’ve got a lot of ground to cover. Let’s get started!
The primary goal of web application security is to prevent malicious code from running in our applications. We want to ensure user credentials are sent to our servers in a secure way. We want to secure our API endpoints. As a bonus, we want to expose access control rules to the client.
The following sections address each of these concerns.
The Open Web Application Security Project (OWASP) pages are an excellent resource for information on XSS attacks, as well as other types of web client vulnerabilities and remedies.
In the clip below, you can see this behavior in action. This is taken from the app security live example page.
I put script tags into the search field of the page form. Since this site is not protected against XSS attacks, it goes ahead and executes that script code, resulting in the alert popup.
The problem here is that on this page, anything that’s put into the input field is sent back to the page and rendered verbatim.
There’s a great cheat sheet on OWASP for how to prevent XSS.
In a nutshell, the remedy for XSS is to escape all user input before it is rendered back into a page. The cheat sheet referenced above links to a number of XSS protection libraries. It’s best to use an existing, trusted, open source library for this. You definitely do not want to “roll your own”: a mature library has had a lot of due diligence done on it, and in writing your own escaping library you could miss attack vectors or even introduce new ones.
A number of popular frameworks, such as AngularJS, have XSS protections out-of-the-box, but you should still understand what’s included in that protection.
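To make the idea of escaping concrete, here is a minimal Java sketch of what an escaping library does at its core. It is for illustration only and is not a substitute for a vetted library: the class and method names are made up for this example, and it only covers HTML element content and quoted attribute values, not JavaScript, CSS, or URL contexts.

```java
final class HtmlEscapeSketch {

    // Replaces the five characters that are significant in HTML element
    // content and quoted attribute values with their entity equivalents.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String userInput = "<script>alert('xss')</script>";
        // prints &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;
        System.out.println(escapeHtml(userInput));
    }
}
```

Rendered back into the page, the escaped string shows up as literal text instead of being executed as a script.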
Traditionally, users enter their authentication information in the form of a username and password and transmit that information up to the application server (hopefully in a secure fashion) as an HTTP POST. Assuming the credentials are correct, the application server creates a unique session id to identify the user and sends it back in the form of a Set-Cookie header on the response. On each subsequent request from the user, that session id is presented in the request in the form of a Cookie header. Here’s what this looks like:
The use of the session id accomplishes a few important goals:
- The userid and password do not need to be sent up to the application server on subsequent requests. The session id can become a proxy to represent the user.
- The application server typically uses the session id to store information about the user, such as name, permissions, and other meta-data.
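The session id exchange described above can be sketched in a few lines of Java. This is a rough illustration under stated assumptions, not how any particular framework does it: the cookie name SESSIONID, the in-memory map, and all class and method names are hypothetical, and credential checking is assumed to have happened before createSession is called.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class SessionStoreSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Maps session id -> username. A real application would keep richer
    // data here (permissions, expiry time, and so on).
    private final Map<String, String> sessions = new ConcurrentHashMap<>();

    // Call only after the username and password have been verified.
    String createSession(String username) {
        byte[] bytes = new byte[32]; // 256 random bits, infeasible to guess
        RANDOM.nextBytes(bytes);
        String sessionId = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        sessions.put(sessionId, username);
        return sessionId;
    }

    // Value for the Set-Cookie response header. Secure keeps the cookie off
    // plain HTTP; HttpOnly hides it from page scripts.
    static String setCookieHeader(String sessionId) {
        return "SESSIONID=" + sessionId + "; Path=/; Secure; HttpOnly";
    }

    // Resolves the Cookie request header back to the user it stands for.
    Optional<String> userFor(String cookieHeader) {
        if (cookieHeader == null) {
            return Optional.empty();
        }
        for (String pair : cookieHeader.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals("SESSIONID")) {
                return Optional.ofNullable(sessions.get(kv[1]));
            }
        }
        return Optional.empty();
    }
}
```

Note that the id itself is just random bytes. Everything the server knows about the user lives in the map on the server side, which is why the id can stand in for the credentials.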
A similar process is used to secure API endpoints with session IDs. Java security frameworks like Apache Shiro and Spring Security make use of annotations to express how web applications and APIs should be secured. They do much of the heavy lifting involved in managing session data as well. This includes storing, retrieving, and expiring sessions and their associated data.
In order to get at the user information identified by the session id, an additional round trip on the network is required. Endpoints such as /profile are commonly used to accomplish this. The session id itself does not contain any information that can be used by a client, such as a browser.
These are among the drawbacks and challenges to this approach that we’ll address in the next post. We think that authentication tokens address these issues better, but more on that later.
Cookies are fine when handled correctly, but they can be compromised in a number of ways. We are going to look at two of these vulnerabilities in detail.
Man-in-the-middle refers to a situation where you believe you are connecting to a particular server, but in reality, there is another “listener” in between you and your intended server. That listener, which you are actually connected to, intercepts your communication and usually will replay it to the server you intended to reach. This is what makes it seem like you are connected to where you intended to go – the listener is ferrying data back and forth between you and the server, all the while saving data or even altering responses from your intended server.
Watch out for the scenario where you establish a secure connection with HTTPS and then downgrade that connection back to HTTP. This is never safe. Once the connection is downgraded, the session id will be passed in the clear on the network – such as that cozy coffee shop you are sitting in – and anyone listening in would be able to use that id. This is a variation on the typical man-in-the-middle attack. The goal is to get a hold of your session id and then use that id to impersonate you on the website to which you are authenticated. This is called hijacking the session.
The remedy here is to use HTTPS everywhere and to use TLS even on internal networks. This last point is important to guard against other attack vectors. For instance, log files and database dumps pose a vulnerability for an out-of-band attack.
If your webserver is very secure, but you log session IDs to a log file and save those log files in a less secure place, attackers can hijack sessions by getting hold of a backed-up log file. Likewise, if your database is very secure, but your dump files are backed up to a less secure location, attackers can brute-force crack passwords at their leisure if they are able to get hold of a database dump file.
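One way to reduce the log file risk is to never write the raw session id at all. The sketch below logs a short SHA-256 fingerprint instead, which is enough to correlate log lines but cannot be replayed as a session. This is just one possible approach, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class LogRedactionSketch {

    // Returns the first 4 bytes of SHA-256(sessionId) as 8 hex characters.
    // The fingerprint identifies a session in logs without revealing its id.
    static String fingerprint(String sessionId) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(sessionId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sessionId = "example-session-id"; // placeholder value
        System.out.println("request handled, session=" + fingerprint(sessionId));
    }
}
```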
“… occurs when a malicious web site, email, blog, instant message or program causes a user’s web browser to perform an unwanted action on a trusted site for which the user is currently authenticated”
from: Cross-Site Request Forgery (CSRF) Prevention Cheat Sheet
CSRF occurs when a malicious site has a link or form that connects to another site that you are already logged in to. Here’s an example scenario:
- You log in to your bank account at https://myficticiousbank.com, as you normally would to review your balance and transactions
- You don’t log out
- You get an email from your buddy that has a link in it that says: “See cute cats”
- Unbeknownst to you, that link connects back to your bank’s website and performs a transaction to send money to Mr. Bad Guy
How is this possible? Firstly, this exact scenario is very unlikely because banks are very familiar with CSRF and protect against it. Assuming that wasn’t the case for the purposes of this example, it’s possible because you have an active session with your bank. The attacker has no knowledge of your session id or any other cookies. The attacker is just counting on the chance that you didn’t log out of your session. When you click the link, the browser happily sends along the cookies representing your session, since there is already a session associated with the domain you are now connecting to. Let’s take a look at what the link in that email might look like:
All you see is the link to click on. Once clicked, you’re not going to see cute cats at all! You will be back at your bank’s website, probably confused as to why you just transferred all your money to Mr. Bad Guy.
There are three primary remedies for CSRF that we will examine now:
- Synchronizer Token
- Double Submit Cookie
- Origin Header Check
With the Synchronizer Token approach, the server embeds a dynamic hidden variable in an input form. When the form is submitted, the server can check to make sure that the hidden variable is present and that it is the correct value. Let’s say you are on a trusted travel site and you are about to book a vacation around the world. Here’s how the “buy” form looks:
Now, let’s say you get an email with a link to book the vacation of your dreams with your trusted travel site. The link actually connects to a hacker site that’s trying to get you to use your trusted vacation site to book travel for them! The form looks the same as your trusted travel site, and you don’t notice that the URL is different. However, since your trusted travel site has implemented a Synchronizer Token approach to defeating CSRF attacks, the transaction fails when you click the Buy button. There’s no way for the hacker site to know what the correct token should be. When the hidden token field is not present on the form submit, the trusted travel site fails the transaction.
With the Synchronizer Token approach, you can use the same token over again, but it’s better to have it be a nonce – that is, a one-time use token. Using nonces prevents replay attacks.
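Here is a minimal Java sketch of the server side of a Synchronizer Token used as a nonce. It assumes one pending token per session held in memory; the class and method names are hypothetical, and rendering the hidden form field is left to whatever templating you use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SynchronizerTokenSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // session id -> token most recently embedded in a form for that session
    private final Map<String, String> pendingTokens = new ConcurrentHashMap<>();

    // Called while rendering the form; the result goes into a hidden input.
    String issueToken(String sessionId) {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        pendingTokens.put(sessionId, token);
        return token;
    }

    // Called when the form is submitted. remove() makes the token a nonce:
    // whether or not the check passes, the same token cannot be used again.
    boolean verifyAndConsume(String sessionId, String submittedToken) {
        String expected = pendingTokens.remove(sessionId);
        if (expected == null || submittedToken == null) {
            return false;
        }
        // Constant-time comparison avoids leaking how many characters matched.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

A spoofed form has no way to learn the value returned by issueToken, so its submission fails the check, and a replayed submission fails because the token has already been consumed.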
One limitation of this approach is that a hidden form field only protects POST requests. This isn’t a real problem as long as you adhere to the safe, idempotent nature of GET requests that is baked into the HTTP spec: GET requests should never modify server state.
With the Double Submit Cookie approach, two cookies are sent back to the browser. One is the session id and the other is a random value (similar to the synchronizer token). There are two keys to this mechanism working. The first is a mechanism built into the browser called the Same Origin Policy. This permits the script code to interact with other server endpoints only if those endpoints have the same origin (base URI) as the endpoint that delivered said script code. You might be asking yourself, “If one cookie isn’t secure on its own, how are two cookies going to be more secure?” The key is in the second enabling mechanism: having the second cookie included in subsequent requests in a custom header. It is up to your client script code to ensure that this is set up properly. Here’s how the interaction works:
When you request the login page, two cookies are sent back by the server. The second cookie is used in a custom header (X-XSRF-Token in this case, but it could be anything) for subsequent requests from the browser. The server checks for the existence of the custom header and checks the value against what was sent for that page.
Similar to the Synchronizer Token approach, an external site trying to spoof a page and trick you into submitting data to an active session would fail, as it would not be able to set the custom header for a site at a different URL.
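The server-side check for Double Submit Cookie can be sketched as follows. The cookie name XSRF-TOKEN is an assumption made for this example (the post only names the X-XSRF-Token header), as are the class and method names; the method takes the raw Cookie header and the raw X-XSRF-Token header value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

final class DoubleSubmitSketch {

    // Pulls one cookie value out of a raw Cookie request header.
    static String cookieValue(String cookieHeader, String name) {
        if (cookieHeader == null) {
            return null;
        }
        for (String pair : cookieHeader.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(name)) {
                return kv[1];
            }
        }
        return null;
    }

    // The request passes only if the custom header is present and carries
    // the same random value as the second cookie.
    static boolean isRequestAllowed(String cookieHeader, String xsrfHeader) {
        String fromCookie = cookieValue(cookieHeader, "XSRF-TOKEN");
        if (fromCookie == null || fromCookie.isEmpty() || xsrfHeader == null) {
            return false;
        }
        return MessageDigest.isEqual(
                fromCookie.getBytes(StandardCharsets.UTF_8),
                xsrfHeader.getBytes(StandardCharsets.UTF_8));
    }
}
```

A forged cross-site request still carries both cookies, because the browser attaches them automatically, but it arrives without the matching header and is rejected.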
All browsers, including Internet Explorer 9 and later, send an Origin header. If the Origin header does not match the expected value, then a third-party site trying to spoof the look of your page would be foiled, as the Origin set by the browser would be different from what was expected.
Here’s what it looks like:
When I submit the form to register for a Stormpath account, the browser automatically includes the Origin: https://api.stormpath.com header. Stormpath’s servers can check for that header and reject the request if the value of the Origin header is something else.
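An Origin check like the one described above comes down to an exact match against a list of origins you trust. The following Java sketch is an illustration only, not Stormpath’s implementation; the class name is hypothetical, and rejecting requests that carry no Origin header at all is a design choice made for this example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class OriginCheckSketch {
    private final Set<String> allowedOrigins;

    OriginCheckSketch(Set<String> allowedOrigins) {
        // Defensive copy so later changes by the caller have no effect.
        this.allowedOrigins = Collections.unmodifiableSet(new HashSet<>(allowedOrigins));
    }

    // Exact string match on scheme, host, and port. Prefix or substring
    // matching would let https://api.stormpath.com.evil.example through.
    boolean isAllowed(String originHeader) {
        return originHeader != null && allowedOrigins.contains(originHeader);
    }

    public static void main(String[] args) {
        OriginCheckSketch check =
                new OriginCheckSketch(Collections.singleton("https://api.stormpath.com"));
        System.out.println(check.isAllowed("https://api.stormpath.com")); // prints true
        System.out.println(check.isAllowed("https://evil.example"));      // prints false
    }
}
```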
This section on the remedies for Cross Site Request Forgery has focused primarily on securing the browser. We are next going to look at session IDs themselves with an eye to the server side of the interactions and how we can secure them.
The session IDs we’ve been looking at so far, usually managed in the form of cookies, have a number of challenges associated with them. Of primary importance is that as your infrastructure grows, you may find it difficult for your session mechanism to grow with you.
Imagine you start out with one application server. It manages sessions by saving them to a local datastore, such as Redis. Your service takes off, and you need three application servers to handle the load. Now, you are in a situation where the application server a user connects with to start their session may not be the same application server that user connects with to continue their session. You may find yourself needing a whole centralized session id de-referencing service. That is, a service to ensure that all sessions are kept in sync across all of your application servers. This is a challenging issue at scale.
Even on a single application server instance, there’s a cost with session IDs. The user and session data associated with that id has to be stored. It also must be referenced on each and every interaction with the application server. This can be costly, either in slower resources, such as persisting sessions to disk, or in the memory required to keep this data cached.
Session IDs have no inherent value other than as a unique identifier. A client, such as your web browser, cannot inspect the session id to find out what you are allowed to do in the application. Separate requests are needed to get that authorization information.
This is where Token Authentication comes in.
In the sequel to this post, we’ll dive into how Token Authentication can be used to address these issues and more. We’ll focus on how a JSON Web Token (JWT) can not only be used as a session identifier but also contain encoded meta-data and be cryptographically signed. We’ll see this in action in a Java code example.
Java developers can see these techniques in action in my tutorial on Token Authentication for Java Web Applications – it covers how your Java app can benefit from token auth, walks through a Java example available in the Stormpath Java SDK repo, and shows you how to use tokens in your own Java application.