Enabling CORS on a node.js server, Same Origin Policy issue

Recently we faced the famous “XMLHttprequest doesn’t allow Cross-Origin Resource Sharing” error.

To overcome the problem a very simple solution was needed.

Below I’ll try to give a quick overview of what is CORS and how we managed to work around the issue.

Cross-Origin Resource Sharing – CORS

In a nutshell CORS is the mechanism that allows a domain to request resources from another domain, for example, if domain http://websiteAAA tries to request resources from http://websiteBBB the browser won’t allow it due to Same Origin Policy restrictions.

The reason for having Same Origin Policy rules applied on the browser is to prevent unauthorized websites accessing content they don’t have permissions for.

I found a great example that emphasizes the need to have Same Origin Policies enforced by the browser: Say you log in to a service, like Google for example, then while logged in you go and visit a shady website that’s running some malware on it. Without Same Origin Policy rules, the shady website would be able to query Google with the authentication cookies saved in the browser from your session, which of course is a huge vulnerability.

Since HTTP is a stateless protocol, the Same Origin Policy rules allow the browser to establish a connection using session cookies and still keep each cookie private to the domain that made the request, encapsulating the privileges of each “service” running in the browser.

With that being said, imagine a situation where you, as a developer, need to communicate with an API sitting on a different domain. In this scenario you don’t want to hit the Same Origin Policy restrictions.

Workaround 1 – Request resources from a server

The most common way to get around this problem is to make the API request from your own server, where Same Origin Policy rules are not applied, and then provide the data back to the browser. However, this can be exploited:

Last semester I created an example of how an attacker would be able to spoof whole websites and apply a phishing attack circumventing Same Origin Policy restrictions.
The attack structure was very similar of how ARP poisoning is done.

A very brief overview of the attack:

  1. The user would land on a infected page
  2. The page would load a legitimate website by making a request from the attacker’s server where Same Origin Policies are not applied.
  3. The attacker would inject some code in the response to monitor the vicitms activity
  4. After the victm’s credentials were stolen he would stop the attack and redirect the user to the original requested page.

By spoofing the victim’s DNS it would make even harder to detect the attack, but even without DNS spoofing this approach would still catch some careless users.

All the code for the example is available on github
The attack was built on top of a nodeJS server and socketIO
The presentation slides for the attack can also be found here

Workaround 2 – JSONP

Another way to circumvent the problem is by using JSONP (JSON Padding). The wikipedia articles summarizes in a clear and simple way how JSONP works.

The magic of JSONP is to use a script tag to load a json file and provide a callback to run when the file finishes loading.

An example of using JSONP with jquery:

[sourcecode language=”javascript”]
url: "http://website.com/file.json
dataType: ‘jsonp’,
success: function (data) {
// Manipulate the response here

Even though making requests from your server or using JSONP can get around the Same Origin Policy restrictions it is not the best solution, which is why CORS started being implemented by the browser vendors.

With CORS a server can set the HTTP headers of the response with the information indicating if the resources can or can’t be loaded from a different origin.

If you are curious and want to snoop around looking into the HTTP response headers of a page, one way to do that is to use the developers tools that come with WebKit.
Below is a screenshot of all the resources loaded by the stack overflow home page.
Screen Shot 2013-02-14 at 6.34.24 PM

As you can see in the screenshot, the script loaded from careers.stackoverflow.com/gethired/js had the following HTTP header options appended to its response:

  • Access-Control-Allow-Headers
  • Access-Control-Allow-Methods
  • Access-Control-Allow-Origin

That means that if you want to make an ajax call to carrers.stackoverflow.com/gethired/js from your own page, the browser will not apply Same Origin Policy restrictions since the careers.stackoverflow server has indicated that the script is allowed to be loaded from different domains.
*An important note to make is that only the http://careers.stackoverflow.com/gethired/js has the Same Origin Rules turned off, however, the careersstackoverflow.com domain still has them enabled on other pages.

This means you can enable the header options on a response level, making sure a few API calls are open to the public without putting your whole server in danger of being exploited.

This lead us to our problem.

The Problem

In the set up we currently have, one computer plays the role of the API server and we were trying to query that API asynchronously from the browser with a page being served from a different domain.

The results, as expected, were that the call was blocked by the browser.


Instead of hacking around and trying to make the requests from a different server or using JSONP techniques we simply added the proper header options to the responses of the API server.

We are building our API on a nodeJS server, and to add extra headers options to the response could not have been easier:

First we added the response headers to one of the API calls and it worked perfectly, however we wanted to add the option to all our API calls, the solution: use a middleware.

We created a middleware which sets the response header options and pass the execution to the next registered function, the code looks like this:

[sourcecode language=”javascript”]
//CORS middleware
var allowCrossDomain = function(req, res, next) {
res.header("Access-Control-Allow-Origin", "*");
res.header("Access-Control-Allow-Headers", "X-Requested-With");

app.configure(function () {
app.set(‘port’, config.interfaceServerPort);
app.set(‘views’, dirname + ‘/views’);
app.set(‘view engine’, ‘jade’);
dirname, ‘public’)));

app.configure(‘development’, function(){

// API routes
app.get(‘/list/vms’, routes.listGroup);
app.get(‘/list/vms/:ip’, routes.listSingle);
app.get(‘/list/daemons’, routes.listDaemons);

That’s it for CORS, later we’ll cover another cool header option, the X-Frame-Options

If you are interested in finding more about Same Origin Policy or CORS check out this links:

Cross-Domain AJAX

This is something I’ve been trying to do for a while, but never took the time to do it.
I was always curios to see if it was possible to get the rendered source code of a web site.

What happens is a lot of websites generate their source code using javascript, so by viewing their source code in the browser won’t reflect what is actually being displayed on the screen.

Recently I was faced with a situation that I needed to get some data from a wesbite. However the website didn’t provide an API. So I thought, let me get the source code parse it and extract the data from it.
A good example is google. If you check their source code is just a bunch of encrypted javascript and css code. But in the screen are just a few buttons and the search results.
What I wanted was to get the rendered source code, the one with the constructed DOM object, not the one served from their server.

What made me realize the task was possible was the inspect element of google chrome.
Using the inspect element it is possible to see the rendered html of any website.

If chrome can do it so do I :)

With that mind set I first started hacking the inspect element tool of chrome.
Because the tool is written in javascript and css, it is possible to see the source code. However is not something pleasant. After trying to read the source I realized the approach wasn’t going to work.

So without any ideas where else should I do? Of course, google it :P

The first results from my search mentioned the use of iframes to load a page, and then access the DOM object of the iframe from the parent page. I gave it a try, and it worked!

The way I did it was I created two files, one file was the parent page, the other was the one would be loaded in the iframe. I was able to access DOM object of the iframe and perform any operation wanted.
However, here it comes the trick part. I can only access the DOM of an iframe, if the page loaded in the iframe belongs to the same domain as the parent page. So when I tested with 411.ca, for example, it threw a permission denied error.

With that option out of the way, I started looking for different ways to do what I wanted.

One idea I had was to make an ajax call, and then load the response(the html source code) in the iframe of my page. So I wouldn’t have the problem of permission denied when trying to access the DOM object of the iframe, since I would loaded the source code from the parent page.
It is easier said then done. The same problem I had trying to access the DOM of the page loaded from another domain in the iframe, I had trying to request the page from another domain. I couldn’t make a XHR to a page on a different domain.

Even tough I hit another wall, I felt that I was getting closer to a solution.
I filtered my searches to how to make an ajax request between different domains. I ended up here:

This library is amazing. It is written in perl, so it uses the LWP::UserAgent and HTTP::Request classes to perform the operations. I haven’t fully understood the library yet, but by looking at the source code, it look like it creates a different header for the request so it matches the one from the url being requested. So by doing that the cross-domain restriction doesn’t apply. Again, I’m not sure, for more information just contact Bart Van der Donck , he is the guy who wrote the library.

Another tool I also found it very cool was the HTML to DOM parser provided by Mozilla.
This tool is very interesting, it uses components classes to perform the parsing. I’m still trying to understand how the safely parsing is actualy done. However I just want to document a few problems I had trying to make the parser work:
The first time I ran the code from the examples I got the message “Permission denied for ‘localhost’ to get property XPCComponents classes”

Again with the help of google I found that by adding this code before trying to load a component it would give access permission.

The only problem is that it only worked on firefox, it didn’t work on chrome or IE

In the end, putting all the pieces together I was able to request a page from a different domain, load it in my website, and manipulate the DOM object of the requested page.
Here a few screenshots:

I’m using google’s page as an example. You can notice that not all the images are loaded correctly. The reason why, is that is not the browser who is making the request. The request is made using the ajax-cross-domain library and the response(html source code) is loaded in the iframe. So when the the browser tries to load the images of the page it resolves relative paths using the localhost domain, and not google’s domain. This could be fixed if before loading the response in the iframe, the html code be parsed and all the relative paths be replaced by absolute paths using the original domain. So if there is an image with the path like this: “/images/img1.jpg” it would be replaced by “htt://www.google.com/images/img1.jpg”. So then the browser would go to google’s server and get the image. In the screen shot below, what happens is the path for the images is “/images/navlogo91.png”. So the browser resolves the relative paht to “http://mch.local/images/navlogo91.png”

Here is a dump of the DOM object.
The result of the first search is highlighted. It shows that it is possible to access all RENDERED html source code

Here is a screenshot of the source code of the same Ajax search seen through the browser “View Source Code”
Highlighted are the only two occurances of Ajax on the page, and they are no from the search result.

It is even possible to load the whole page in the parent page and not even use an iframe.