Versioned URLs Plugin for Rails
Executive Summary
- Install versioned_urls plugin via
script/plugin install http://svn.lingr.com/plugins/versioned_urls
- Add appropriate rewriting and cache-header-pushing configuration to your web servers, e.g., for lightty:
url.rewrite-once = ( "^/(.*\.(css|js|gif|png|jpg))\.v[0-9.]+$" => "/$1" )
expire.url = ( "/stylesheets/" => "access 10 years",
"/javascripts/" => "access 10 years",
"/images/" => "access 10 years" )
- Enjoy compliments from your users about how responsive your site is
Gritty Details
One of the great "for free" features in Rails is asset timestamping. This feature, built into most of the methods in ActionView::Helpers::AssetTagHelper, automatically appends the timestamp of the referenced asset to generated URLs. So, when you write something like
<%= javascript_include_tag 'application' %>
it actually generates something like this
<script src="/javascripts/application.js?1161807361" type="text/javascript"></script>
where the 1161807361 parameter is the file modification time for /javascripts/application.js.
The theory is that, as long as application.js remains unchanged, the timestamp and URL remain the same, and the browser caches application.js. When application.js changes, the timestamp and URL change, and the browser refetches application.js. But this is just the theory, not the reality.
In reality, the browser only halfway caches application.js. When the browser encounters the next
<script src="/javascripts/application.js?1161807361" type="text/javascript"></script>
it will actually send an HTTP GET request for /javascripts/application.js?1161807361, but it will include an If-Modified-Since header to notify the server that it should only return the requested data if it has changed since it was last requested. If application.js hasn't changed, the server sends a 304 Not Modified response, with no response body.
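To make that exchange concrete, here is a minimal Ruby sketch of the decision the server makes for a conditional GET. The helper name conditional_get_status is made up for illustration; real web servers implement this internally and also consider other headers.

```ruby
require 'time'

# Hypothetical helper: decide how a static file server answers a
# conditional GET. Returns the HTTP status code it would send.
def conditional_get_status(file_mtime, if_modified_since_header)
  # No If-Modified-Since header: always send the full body.
  return 200 if if_modified_since_header.nil?

  since = Time.httpdate(if_modified_since_header)
  # HTTP dates have one-second resolution, so compare whole seconds.
  file_mtime.to_i > since.to_i ? 200 : 304
rescue ArgumentError
  # An unparseable header is ignored and the full body is sent.
  200
end

mtime = Time.at(1161807361).utc
puts conditional_get_status(mtime, nil)                    # prints 200
puts conditional_get_status(mtime, mtime.httpdate)         # prints 304
puts conditional_get_status(mtime, (mtime - 60).httpdate)  # prints 200
```

Note that even the 304 case costs a full request and response, which is the overhead discussed next.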
This is all well and good in that it saves the download time for application.js, but we still pay the TCP setup and teardown time, even when the asset hasn't changed. For many web applications, the number of javascripts, stylesheets, images, etc., included via asset tags is quite large. Thus, the cumulative penalty we pay in TCP setup/teardown to request unchanged assets can grow to a significant (read- noticeable) amount.
What we would really like to do is somehow tell the browser not to even bother asking if the asset is modified- that is, effectively, tell it that the asset will never change. But how can we be sure that an asset will never change? Simple- we just ensure that, when and if it does change, we modify its URL.
Now, we already know that the asset timestamping feature does just this (changing the URL whenever the referenced asset changes), but it does it in a way that isn't completely compatible with some browsers' caching systems. The main issue is that some browsers will not cache resources referenced by URLs that contain parameters (e.g., ?1161807361). What we need is a way to move the asset version token into the path part of the URL. In other words, we need to produce code like this:
<script src="/javascripts/application.js.v1161807361" type="text/javascript"></script>
Now things are getting really tricky. With Rails' default asset timestamping feature, there's no need to make configuration changes to your web server, because it understands that a request for /javascripts/application.js?1161807361 is actually asking for the asset /javascripts/application.js. It can find /javascripts/application.js just fine, so, no problem there.
But, with versioned URLs, the server will receive requests for things like /javascripts/application.js.v1161807361, which it has no idea how to satisfy.
How can we solve this? We can use URL rewriting.
URL rewriting is a feature available at least in Apache and Lighttpd, and probably in just about any widely-used web server. If you use lightty, you'll need to add something like this to your lighttpd.conf:
url.rewrite-once = ( "^/(.*\.(css|js|gif|png|jpg))\.v[0-9.]+$" => "/$1" )
This tells lightty to interpret a request for /javascripts/application.js.v1161807361 as a request for /javascripts/application.js, so, everyone is happy again.
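If you want to check what the rule does without touching a web server, the same pattern can be tried out in Ruby. This is only a sketch: rewrite_versioned is a made-up helper, and the anchors \A and \z stand in for the ^ and $ of the lighttpd rule.

```ruby
# Roughly the same pattern as the lighttpd rule, as a Ruby regexp.
VERSIONED_ASSET = %r{\A/(.*\.(css|js|gif|png|jpg))\.v[0-9.]+\z}

# Hypothetical helper: map a requested path to the file path served.
def rewrite_versioned(path)
  match = VERSIONED_ASSET.match(path)
  match ? "/#{match[1]}" : path
end

puts rewrite_versioned('/javascripts/application.js.v1161807361')
# prints /javascripts/application.js
puts rewrite_versioned('/javascripts/application.js')
# prints /javascripts/application.js (no version suffix, unchanged)
```

Paths with an extension outside the list (say, .html) are left alone, so only static assets are affected.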
Now, the final step- telling the browser that the asset will never change. What we need to do is push back Expires and Cache-Control headers whenever we serve an asset that has a versioned URL. With lightty, you can do this by adding something like the following to your lighttpd.conf:
expire.url = ( "/stylesheets/" => "access 10 years" ,
"/javascripts/" => "access 10 years",
"/images/" => "access 10 years" )
Ten years is effectively "forever" in web terms, but you can use any ridiculously long time period you feel like.
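For a feel of what ends up on the wire, here is a small Ruby sketch that computes far-future Expires and Cache-Control values. The helper far_future_headers and the constant TEN_YEARS are assumptions for illustration; the exact headers your server emits may differ.

```ruby
require 'time'

TEN_YEARS = 10 * 365 * 24 * 60 * 60  # seconds, ignoring leap days

# Hypothetical helper: far-future caching headers for a response
# served at time `now`.
def far_future_headers(now = Time.now)
  {
    'Expires'       => (now + TEN_YEARS).httpdate,
    'Cache-Control' => "max-age=#{TEN_YEARS}"
  }
end

far_future_headers(Time.at(1161807361)).each do |name, value|
  puts "#{name}: #{value}"
end
```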
Now, to the real point of this post :-). Today, we are releasing a versioned_urls plugin for Rails. Install the plugin via:
script/plugin install http://svn.lingr.com/plugins/versioned_urls
and, voila, you've got versioned asset URLs for free, using the file modification time as the file version.
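To give a rough idea of what "file modification time as the file version" means, here is a standalone Ruby sketch. It is not the plugin's code: versioned_asset_path is a hypothetical helper, and the temporary directory only stands in for your app's public/ directory.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper, not the plugin's API: build a versioned URL path
# for an asset by appending ".v<mtime>" to it.
def versioned_asset_path(public_dir, asset_path)
  file = File.join(public_dir, asset_path)
  return asset_path unless File.file?(file)  # unknown asset: leave as-is

  "#{asset_path}.v#{File.mtime(file).to_i}"
end

# Demo against a throwaway directory standing in for public/.
Dir.mktmpdir do |public_dir|
  js = File.join(public_dir, 'javascripts', 'application.js')
  FileUtils.mkdir_p(File.dirname(js))
  File.write(js, '// app code')
  # Pin the modification time so the output is predictable.
  File.utime(Time.at(1161807361), Time.at(1161807361), js)

  puts versioned_asset_path(public_dir, '/javascripts/application.js')
  # prints /javascripts/application.js.v1161807361
end
```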
Note that, by default, URL versioning is only active when RAILS_ENV != 'development', because, if you use WEBrick, you don't have URL rewriting. If you use a rewrite-capable web server in development, just set VersionedUrlsPlugin::enable appropriately.
The other configuration parameter is VersionedUrlsPlugin::version_for_asset. As I mentioned, by default, the version for an asset is its file modification time. If you want to do something more sophisticated, you can set VersionedUrlsPlugin::version_for_asset to a Method, Proc, or anything else that respond_to?(:call) (see versioned_urls.rb for details). For example, at Lingr, we use the subversion revision number of an asset as its URL version, via something like this:
YAML.load(`svn info #{file}`)['Last Changed Rev']
Of course, we use local caching to ensure that we only do svn info the very first time an asset is requested.
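The caching part can be sketched as a small wrapper around any callable. This is an illustration, not what we run at Lingr: memoized_version is a made-up name, the stand-in lookup replaces the svn info call so the example runs anywhere, and it assumes the callable receives the asset's file path (check versioned_urls.rb for the actual arguments).

```ruby
# Hypothetical wrapper: cache the result of an expensive version lookup
# so it runs once per asset. Anything that responds to :call works.
def memoized_version(lookup)
  cache = {}
  lambda do |file|
    cache.fetch(file) { cache[file] = lookup.call(file) }
  end
end

# Stand-in for the `svn info` lookup so the sketch runs without a
# Subversion working copy; it counts how often it is invoked.
calls = 0
slow_lookup = lambda { |file| calls += 1; "r#{file.length}" }

version_for = memoized_version(slow_lookup)
version_for.call('public/javascripts/application.js')
version_for.call('public/javascripts/application.js')
puts calls  # prints 1
```

The returned lambda responds to :call, so it is the kind of object you could hand to VersionedUrlsPlugin::version_for_asset.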
Finally, I really should refer you to Cal Henderson's excellent article at Vitamin. Cal is one of the top technical guys at Flickr, so, when he talks about optimizing HTTP semantics, he's speaking from the top of a big pile of data :-). Flickr has been using versioned urls for quite some time now. We're just the newcomers at this party.
Update - 01 Nov 2006
I should mention one caveat to using versioned URLs with images. If you are serving any images from CSS files, it requires careful planning.
URLs that you list in CSS files are hand-coded- they aren't generated by the ActionView::Helpers::AssetTagHelper methods. Thus, they don't get URL versioning. You typically end up with things like this in your CSS file:
.myStyle
{
margin: 0;
border: 0;
background: transparent url(/images/backgrounds/gradient-blue.gif) no-repeat -2px -20px;
}
It's important that you not push back the Expires and Cache-Control headers for images served from CSS, since the URLs for these images don't carry the versioning information.
A great way to approach this is to segregate your CSS-served images from your HTML-served images, then modify your expire.url statements to only push Expires and Cache-Control headers for the HTML-served images, like so:
expire.url = ( "/stylesheets/" => "access 10 years" ,
"/javascripts/" => "access 10 years",
"/images/html/" => "access 10 years" )
This avoids pushing the Expires and Cache-Control headers for images located in /images/css/.
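A quick way to sanity-check which paths would get the far-future headers under this layout is a prefix test like the Ruby sketch below. The names FAR_FUTURE_PREFIXES and far_future? are invented for the example, and it assumes expire.url matches on path prefixes.

```ruby
# Prefixes that only ever receive versioned URLs (mirrors expire.url).
FAR_FUTURE_PREFIXES = ['/stylesheets/', '/javascripts/', '/images/html/']

# Hypothetical helper: should this path get far-future cache headers?
def far_future?(path)
  FAR_FUTURE_PREFIXES.any? { |prefix| path.start_with?(prefix) }
end

puts far_future?('/images/html/logo.png')                 # prints true
puts far_future?('/images/css/backgrounds/gradient.gif')  # prints false
```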
Comments
-
Cute. We've been using various tricks in our own webapp to accomplish the same thing. I considered a solution similar to this at one point, but for us the issue is not severe enough to warrant the extra complexity. What I'd really like to see is for this to be addressed in HTML itself. There should be an extra attribute on tags which refer to external resources specifying the last change time of the resource. The browser could then compare that against what's in its cache, avoiding the extra round trip which generally results in a 304. For instance:
It's essentially the same idea as what you're doing here, only made explicit, and without the rewrite hackery.
Getting the standards bodies and browser vendors to implement this is, of course, impossible.
-
Second try. Your crappy blog software keeps eating my comments. It also digested the paragraph breaks I put in my original comment, and swallowed the sample img tag example I had in there. Maybe you can find some software with a "preview" button? Anyway, the sample tag I had was: <img src="/foo/bar.jpg" modified="1161807361">
-
Why can't you generate your CSS dynamically too? Why should it be any different than your HTML?
-
Of course we could generate the CSS dynamically, but we don't for performance reasons. One solution we are considering is generating the CSS dynamically at deploy-time, so it can contain versioned URLs, but the web server can serve it as a static resource.
-
Serving dynamic CSS is very simple, just add a StylesheetsController, make it page-cached, set layout to nil, session :off, set headers['Content-Type'] = 'text/css' and move your stylesheets to app/views/stylesheets with a .rhtml extension. route.stylesheets 'stylesheets/:id.css', :controller => 'stylesheets', :action => 'generate'. def generate; render :action => params[:id]; end Sorry for the messy post but I'm not sure how to format it on this blog...
-
Great idea, Tuxie. You've caused one of those 'a-ha' moments :-)
-
Do versioned URL's work with httpd compression ? I suspect that the mimetypes will be fudged
-
John- by "httpd compression", I assume you mean something like mod-gzip? If so, I can't see how this would affect versioned URLs. Versioned URLs are all about addressing the content- once it's addressed and found by the server (via the rewrite rules), it could be served compressed or not. I don't see how either versioned URLs or content compression affects mimetypes at all.
-
Why couldn't you format the filename as : name.version.ext? Such as: application.v1161807361.js
-
Casey- you could- it was pretty much an arbitrary decision to go with name.ext.version rather than name.version.ext. Whatever the naming convention, just make sure that your rewrite rules are constructed accordingly.
-
Is the subversion repo for this plugin up? I've tried quite a few times over the past week but all i'm getting is timeouts doing script/plugin install http://svn.lingr.com/plugins/versioned_urls.
-
Same here, I think the svn is down?
-
svn repo is up, and I just installed via "script/plugin install http://svn.lingr.com/plugins/versioned_urls", so it seems to work fine. Maybe some connectivity issues?
-
I've tried from a couple of boxes (all in the UK), they can all ping svn.lingr.com just fine but it seems like port 80 is closed as "script/plugin install http://svn.lingr.com/plugins/versioned_urls" fails as does "svn co" or even plain old telnet to port 80. All the boxes I tried can connect to www.lingr.com on port 80. traceroute on www.lingr.com and svn.lingr.com both respond up to the same i.p. address, which I assume is a firewall or other traceroute blocking bit of networking hardware. So it seems to be getting through. Only thing i can think of is maybe the firewall config on svn.lingr.com is blocking port 80 for everyone apart from your own i.p addresses?
-
Could you send the traceroute output to dburkes at infoteria.com? svn.lingr.com is not the same ip as www.lingr.com, so, if you are seeing that, I'm at a loss. Anyway, I'd like to help you solve this, so let's continue here or offline via email if you would prefer. Thanks for your help!
-
For telling the browser to cache content, you can do most of what is needed in Apache if you are using that. With an expires header sent by the httpd server, the rails generated query strings *should* be fine, but as mentioned in the update you do need to be selective about what you send the expires header for. At least one solution can be found here http://www.stephensykes.com/blog_perm.html?157
-
interesting
-
Cool!
-
Nice...
-
I'd like to see an update of this for Rails 2.0 and :cache => true, as well as the Apache code in addition to the lightty. Thanks!
