Setting up WordPress behind Amazon CloudFront

You all know that feeling (well, I hope you do!) when a spike in traffic hits your WordPress site and the miniature server it runs on very quickly runs out of resources. Apache’s good like that, taking up all the resources with its large number of processes, each consuming oodles of memory. How on earth can you fix it, especially when you’re on a tight budget and upgrading the server for a once-in-a-blue-moon spike in traffic is hard to justify? Well, Amazon CloudFront to the rescue!

Most places that talk about integrating Amazon CloudFront with WordPress are simply talking about static assets: install some caching plugin that rewrites the URLs in the page’s source to point at your CloudFront distribution, and that’s about it. They don’t talk about how your server can be (as WP SuperCache puts it) Stephen Fry-proof. No one actually says anything about how to put YOUR ENTIRE WEBSITE behind Amazon CloudFront (and by this, I mean pointing your www.* domain at CloudFront and letting it take care of everything).

It’s not entirely simple, I’ll give them that. And they probably want to avoid the technicalities of taking a website down if they screw something up. Well, I’ll share with you what I’ve learned, and the configuration that this site is running. I’m assuming you have *some* knowledge about Amazon Cloudfront and what it is. I won’t go into that here. If you don’t know what it is or how to use it, I suggest you read up on it before continuing.

This is by no means perfect, and I’ll no doubt be tweaking the configuration for some time to come after this post gets published. But anyway, here goes.

Disclosure: I also have WP SuperCache installed, which sends Expires/Cache-Control headers along with pages. CloudFront will respect whatever expiry headers your webserver sends, so if you don’t have a caching plugin installed in WordPress, you’ll need to install one; otherwise CloudFront will simply pass everything through, as everything will be set to expire in the past (i.e. never cache). I’ve set my TTL to 5 minutes, but you can set yours longer (or shorter) if you so desire. Whatever you set it to, CloudFront will respect that and only hold onto the cached page as long as you tell it to.
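To make the expiry behaviour a bit more concrete, here is a rough Python sketch of how a shared cache could work out a lifetime from the origin’s response headers. The function name `cache_lifetime_seconds` is made up for this post, and the logic is a simplification of ordinary HTTP caching rules, not CloudFront’s actual implementation (which also applies whatever minimum/maximum TTLs you set on the behaviour).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def cache_lifetime_seconds(headers, now=None):
    """Return how long a shared cache may keep a response, in seconds.

    Simplified illustration: honours a few Cache-Control directives and
    falls back to Expires. A result of 0 means "do not cache".
    """
    now = now or datetime.now(timezone.utc)
    # Header names are case-insensitive, so normalise them first.
    headers = {k.lower(): v for k, v in headers.items()}

    # Split "max-age=300, public" into {"max-age": "300", "public": ""}.
    directives = {}
    for part in headers.get("cache-control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Treated here as "never cache" (a simplification for no-cache).
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return 0

    for key in ("s-maxage", "max-age"):  # s-maxage takes priority if present
        if key in directives:
            try:
                return max(0, int(directives[key]))
            except ValueError:
                return 0

    expires = headers.get("expires")
    if expires:
        try:
            when = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return 0  # an unparseable date counts as already expired
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        return max(0, int((when - now).total_seconds()))

    return 0


# A 5 minute TTL, as used on this site:
print(cache_lifetime_seconds({"Cache-Control": "max-age=300"}))
# An Expires date in the past, i.e. never cache:
print(cache_lifetime_seconds({"Expires": "Wed, 11 Jan 1984 05:00:00 GMT"}))
```

The first call reports a 300 second lifetime, while the second reports 0 because the date is in the past, which is the "pass everything through" case described above.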

Right. To business.

First off, create a CloudFront distribution. Point the Origin Domain at your bare domain name, not your www.* domain name: we’ll be changing the DNS for that later, at which point CloudFront would resolve the origin to itself and no longer be able to contact the backend. You can also use your server’s IP address (if you’ve only one), or a load balancer of some description.

Screenshot from 2015-04-13 21:40:42

Now we need to configure our “Default Cache Behaviour”. This is what will match when every other rule we specify (later) doesn’t.

Screenshot from 2015-04-13 21:43:53

The key things to note here:

  • Whitelist Headers: You need to whitelist the “Host” header to be passed through to the origin. This is important, because WordPress is a bit of a pain with its redirects. When you point your www.* domain name at it, your server needs to know where exactly to send your request. This also serves to satisfy WordPress’s desire to always be serving content on the configured domain name. Otherwise, Cloudfront will request content as “http://danneh.org/path/to/post/”, WordPress will attempt to redirect to “http://www.danneh.org/path/to/post/” (which your browser is already requesting), and enter a redirect loop.
  • Whitelist Cookies: The ones listed are the minimum required, for my website at least, to be seen in order for caching to actually work. While cookies usually invalidate caching completely, it’s worth noting that, unless you’ve specifically logged in, the anonymous user will not have any matching cookies to send, so Cloudfront will return a cached, cookie-less page.
  • Forward Query Strings: WordPress uses “?s=” for searching, so this must also be allowed to pass.
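The redirect loop from the Host header point can be reproduced with a toy model. This is only a sketch: `wordpress_response` and `fetch_via_cdn` are hypothetical stand-ins written for this post (not real WordPress or CloudFront APIs), and the hostnames are example values.

```python
CANONICAL_HOST = "www.example.org"  # the site URL WordPress is configured with
ORIGIN_HOST = "example.org"         # the origin domain set on the distribution


def wordpress_response(host, path):
    """Toy origin: redirect to the canonical host unless already on it."""
    if host != CANONICAL_HOST:
        return 301, f"http://{CANONICAL_HOST}{path}"
    return 200, None


def fetch_via_cdn(path, forward_host, max_hops=5):
    """Follow redirects the way a browser sitting behind the CDN would.

    The browser always asks for CANONICAL_HOST. The CDN sends the origin
    either that same Host header (forward_host=True) or the origin's own
    domain name (forward_host=False).
    """
    for hop in range(1, max_hops + 1):
        host_seen_by_origin = CANONICAL_HOST if forward_host else ORIGIN_HOST
        status, _location = wordpress_response(host_seen_by_origin, path)
        if status == 200:
            return f"200 OK after {hop} request(s)"
        # The redirect target is the URL the browser already requested,
        # so the next hop is identical to this one.
    return f"gave up after {max_hops} redirects (loop)"


print(fetch_via_cdn("/path/to/post/", forward_host=True))
print(fetch_via_cdn("/path/to/post/", forward_host=False))
```

With the Host header forwarded, the origin sees the hostname it expects and answers straight away. Without it, every attempt gets bounced back to the URL that was already being requested.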

Finally, we need to set up our distribution.

Screenshot from 2015-04-13 21:50:19

Insert into the CNAME box your final, www.* domain name. All other settings relate to either logging or SSL, neither of which I’m using. If you are, or need to, go ahead and set these up now.

Now go ahead and create your distribution. This will take a while. But while it’s doing so, we can go ahead and add further behaviour rules to allow the admin to function.

Firstly, we need a rule that covers /wp-admin/.

Screenshot from 2015-04-13 21:52:37

This setup basically gets Cloudfront to forward *everything* to your origin. We have no choice with this. The admin simply won’t work without it. Remember though that /wp-content and /wp-includes will all be cached, so eventually the only thing hitting your origin when using the admin will be the individual page requests, GET and POST, whatever other voodoo WordPress uses internally. We MUST forward POST, all headers, all cookies and query strings. As a result, caching gets disabled. Oh well.

We can’t do anything in the admin though without being able to login. This is handled by /wp-login.php, so we need to add a separate rule for that.

Screenshot from 2015-04-13 21:55:43

Add a new rule, set the path pattern as above, and all settings as you have done for wp-admin/*. This functions the same way. It will also send the cookies required to establish a PHP session and to allow WordPress to log you in.

Finally, if you have any pages that accept form submissions, such as a contact page, add another rule for each of those too.

Screenshot from 2015-04-13 21:57:19

Again, set the settings exactly the same as for wp-login.php. We have to allow everything through. For the record, I’m using FormBuilder on my contact page, which works with these settings (though it was a bit of a pig to figure out the right settings to begin with, having to wait ~15-20 minutes for the CloudFront settings to apply each time before I could test it).

You should now have a behaviours table that looks something like this.

Screenshot from 2015-04-13 21:59:07
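If it helps to think of the behaviours table as data, here is a small Python sketch of how an ordered list of path patterns gets matched: rules are checked top to bottom, the first match wins, and the default `*` rule catches everything else. The `/contact/*` pattern is a placeholder for wherever your own form lives, and `pattern_to_regex` and `match_behaviour` are illustrative helpers, not part of any AWS library.

```python
import re

# Checked top to bottom; the default behaviour ("*") always comes last.
# "/contact/*" is a placeholder for whatever path your form lives on.
BEHAVIOURS = [
    ("/wp-admin/*", "forward everything, no caching"),
    ("/wp-login.php", "forward everything, no caching"),
    ("/contact/*", "forward everything, no caching"),
    ("*", "default: cache, forward Host, whitelisted cookies, query strings"),
]


def pattern_to_regex(pattern):
    """Translate a path pattern: * is any run of characters, ? exactly one."""
    body = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lstrip("/")  # the leading slash is optional
    )
    return re.compile(body)


def match_behaviour(path, behaviours=BEHAVIOURS):
    """Return the first (pattern, description) whose pattern matches path."""
    candidate = path.lstrip("/")
    for pattern, description in behaviours:
        if pattern_to_regex(pattern).fullmatch(candidate):
            return pattern, description
    raise LookupError("no default behaviour configured")


for path in ("/wp-admin/post.php", "/wp-login.php", "/2015/04/some-post/"):
    print(path, "->", match_behaviour(path)[0])
```

This is why the order of the rules matters: if the `*` rule were evaluated first, nothing would ever reach the admin rules.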

Wait for settings to apply, then you can do a test. Grab an IP from your Cloudfront distribution’s hostname:

$ dig +short d17ksnl0e6086z.cloudfront.net
54.230.2.131
...

Put that IP address in your hosts file against your main www.* domain name that you configured in the CNAME section.

# Change the IP address for whatever the dig command returns for you.
# This will be different for everybody
54.230.2.131 www.mydomain.com

Once the distribution settings have finished applying, visit your website and check everything works. Specifically, you’re looking for some headers that CloudFront sends to indicate the status of the cache for that particular URL. The first time you load a page, naturally you’ll get a miss. But, if everything works correctly, subsequent loads will give you something like this.

Via: 1.1 339e24d8a32f15f77c28e47b57b2daf8.cloudfront.net (CloudFront)
X-Amz-Cf-Id: eexcmdQVf2t0l1z2BePOM4v42RtXKmWzgaCx8q1b_n6EYWiXFlkoVQ==
X-Cache: Hit from cloudfront
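If you’d rather check this from a script than by eye, something along these lines will pull the cache status out of a block of response headers. It’s a minimal sketch: `cache_status` is a name made up here, and any X-Cache value that doesn’t start with “Hit” or “Miss” is simply lumped together as “other”.

```python
def cache_status(raw_headers):
    """Return 'hit', 'miss', 'other' or 'absent' from a raw header block."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == "x-cache":
            value = value.strip().lower()
            if value.startswith("hit"):
                return "hit"
            if value.startswith("miss"):
                return "miss"
            return "other"
    return "absent"  # no X-Cache header: the response didn't come via the CDN


sample = """\
Via: 1.1 339e24d8a32f15f77c28e47b57b2daf8.cloudfront.net (CloudFront)
X-Amz-Cf-Id: eexcmdQVf2t0l1z2BePOM4v42RtXKmWzgaCx8q1b_n6EYWiXFlkoVQ==
X-Cache: Hit from cloudfront
"""
print(cache_status(sample))  # prints "hit"
```

An “absent” result is worth paying attention to, as it usually means your request went straight to the origin rather than through the distribution (check your hosts entry).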

Change the precedence of the rules if necessary, although the ones I’ve provided *should* cover most scenarios. Naturally, if your install is a bit special, you’ll need to spend a bit more time with it. Once you’re happy, update your DNS records.

www.danneh.org. IN CNAME d17ksnl0e6086z.cloudfront.net.

Note: you cannot use an A record here, as CloudFront frequently changes the IP addresses behind your distribution as it load balances, and the addresses also differ by region so that requests can be sent to the datacenters nearest the user. If you do use an A record, your site WILL stop working eventually, with no warning.

Note 2: Don’t forget to remove the hosts entry you made above to test the site before putting it live. Otherwise you’ll find yourself in a situation where your site doesn’t work for you but works for everybody else on every other computer, because you’ve forgotten all about that little line of text overriding your operating system’s DNS resolution and pointing you at a now-non-existent endpoint!

Once it’s propagated, congratulations: your website is now protected behind CloudFront, and is *truly* Stephen Fry-proof!

If you’re having difficulties, give me a shout. I can perhaps offer some advice, or I can help you out for an hourly fee (shameless freelancing plug, but hey, gotta make a living ;) )

Updated 2018-03-28: Based on emails I’ve received with questions regarding this post, I’ve updated it to clarify a few common issues that keep coming up. Notably, they are:

  1. You cannot use an A record to point your live domain name at the distribution. Your site will eventually stop working, and you won’t get geographic traffic distribution.
  2. Do not point the origin of the distribution at your www.* domain (e.g. www.danneh.org). When you update the DNS record for this to put it live, CloudFront’s origin will be pointing at itself, it will be unable to find the backend server, and it will start raising error messages.
  3. You MUST forward the Host HTTP header to the backend using the behaviour rules (above). This overrides CloudFront’s default of sending the request to your backend using the domain name provided in the origin. It does not matter what URL is being requested: the Host in the HTTP request from CloudFront will be rewritten to be the origin domain name, UNLESS you pass through the Host header. If you omit this step, you’ll end up in a redirect loop.
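For what it’s worth, those three points can be written down as a quick sanity check. This is purely a sketch that encodes the rules exactly as stated in this post: `check_setup` and its parameters are invented here and don’t talk to AWS or DNS at all, you feed it the values from your own configuration by hand.

```python
def check_setup(origin_domain, alternate_names, forwarded_headers, dns_record_type):
    """Return a list of problems, following the three points in this post."""
    problems = []

    # 1. The live domain should be a CNAME to the distribution.
    if dns_record_type.upper() != "CNAME":
        problems.append("live domain should point at the distribution with a CNAME")

    # 2. The origin must not be the same name as the live (www.*) domain.
    if origin_domain.lower() in {name.lower() for name in alternate_names}:
        problems.append("origin is the live domain, CloudFront would point at itself")

    # 3. The Host header has to be forwarded to the origin.
    if "host" not in {header.lower() for header in forwarded_headers}:
        problems.append("Host header not forwarded, expect a redirect loop")

    return problems


# Example values only; substitute your own domain names.
print(check_setup(
    origin_domain="example.org",
    alternate_names=["www.example.org"],
    forwarded_headers=["Host"],
    dns_record_type="CNAME",
))  # prints []
```

An empty list means none of the three common mistakes were spotted. It says nothing about the rest of your configuration.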