How to make sure your web series site can handle Digg’s traffic (and Reddit’s, and the New York Times’s…)
Heya, and welcome to the final part in our series on creating websites for your web show or series.
This week, we’re going to be talking about running one final test - but it’s a test you really want to be sure your site can pass.
Hoping your series will go viral? Or pitching to sites like Digg or major press outlets? Then you need to make sure that if you get that sort of exposure your site doesn’t let you down.
Checking your site is Digg-proof
Load testing - testing how well your site will cope if a whole load of people suddenly want to read it - used to be a pain in the ass if you weren’t a techie. Does “ab [ -A auth-username:password ] [ -c concurrency ] [ -C cookie-name=value ] [ -d ] [ -e csv-file ] [ -g gnuplot-file ] [ -h ] [ -H custom-header ]” sound like your idea of a fun afternoon?
That was a problem, because normally the first sign you’d have that your site might not be ready for a big traffic surge would be when a large site mentioned your series, and your mood went from “woohoo!” to “Oh, god” in seconds flat as your site became totally unresponsive. It happened to me in the summer of 2006 when my show BloodSpell was mentioned on the huge blog BoingBoing - five minutes later, all that could be seen of the BloodSpell website was an error message.
(There was a happy ending - we managed to dive into the website code and fix it before we lost too many visitors, and the site subsequently stood up under a whole load more press attention as BloodSpell became remarkably successful. But I digress.)
Why does a perfectly functional website suddenly go pear-shaped? Well, there are a number of reasons, and they all boil down to inefficiencies that you might not notice when the site’s not getting many visitors, but that become extremely obvious as soon as the traffic mounts up. Most notably, WordPress websites do a LOT of database calls for every page they serve, and your server can only handle so many of those per second. Much more traffic = many more calls = overloaded server = badness.
(That’s why we added a cache plugin in part 2, for those of you who have been following along.)
Fortunately, these days it’s easily possible to simulate a major traffic surge, thanks to the free service provided by a site called Load Impact. These guys started in the Hacker News community, and the service they provide is very cool. For free, they’ll simulate up to 50 simultaneous users hitting your site, and show you whether it holds up. (They also offer a paid service that will simulate a lot more, but unless you’ve got J.J. Abrams directing your series, you won’t need those sorts of numbers.)
Running a test with Load Impact is simplicity itself - I recommend you start one now if you already have a website. Just open a new tab, load up the Load Impact website, type in your URL, and hit “start free test” on the next page. It may take a little while to get going, but it’ll get there.
What do the results of the test mean?
50 concurrent users might not sound like a lot, but it’s a lot. A few weeks ago Guerilla Showrunner was featured on the front page of Hacker News, one of the bigger traffic sources out there on the Internet right now. We had something like 10k visitors inside a day - a number that is pretty comparable in my experience to the numbers you’ll get from the front page of BoingBoing or Digg, or other major showcase sites. I’ve not been featured on Reddit, unlike the others, but I believe you’ll get about 7k from them. In short, you’re not going to get a lot more than 10k visitors from any single traffic source - unless your series becomes genuinely newsworthy for sites like the New York Times, a 10k bump in a day is about the most you’ll ever see.
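That 10k bump works out at… wait for it… about 20 concurrent visitors at any one time if they’re spending about 3 minutes on your site (10,000 visits x 3 minutes, spread across the 1,440 minutes in a day). In order to get up to more than the 50 concurrent visitors you’re testing through Load Impact, you’d have to either be looking at more like a 25k bump (and there are NOT a lot of places that can drive those numbers), or visitors staying for 7 minutes or more. Real traffic arrives in bursts rather than evenly, so expect peaks above that average - which is exactly the headroom a 50-user test gives you.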
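If you want to run this back-of-the-envelope sum with your own numbers, here’s a minimal Python sketch. It assumes visits are spread evenly across the day, which real surges never are, so treat the answer as an average rather than a peak. The function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def average_concurrent_visitors(visits_per_day, minutes_on_site):
    """Average visitors on the site at once, assuming evenly spread visits.

    Total visitor-minutes in the day divided by the minutes in a day.
    """
    return visits_per_day * minutes_on_site / MINUTES_PER_DAY


def visits_needed(concurrent, minutes_on_site):
    """Daily visits it would take to average `concurrent` visitors at once."""
    return concurrent * MINUTES_PER_DAY / minutes_on_site


# A 10k-visit day with 3-minute visits, then the visits needed to average 50.
print(round(average_concurrent_visitors(10_000, 3), 1))
print(round(visits_needed(50, 3)))
```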
So, in other words, for most series, if your site performs fine at 50 concurrent visitors it’ll be fine for anything.
So, what’s a reasonable load time? According to Jakob Nielsen (who has some critics, but is as good an authority for this particular metric as anyone), less than one second is optimal, and less than 10 seconds is vital. From my personal experience, I’d say that a site should load in under 5 seconds to avoid losing impatient viewers. If your Load Impact test is showing your site as well below that benchmark, you’re golden!
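If you’d like a rough second opinion from your own machine, the sketch below fires a batch of simultaneous requests at a URL and checks the load times against the 5-second rule of thumb. It’s a toy, not a replacement for a hosted test: it only fetches the HTML (no images or scripts), all the requests come from one computer, and `http://localhost:8000/` is a placeholder URL. Only point it at a site you own.

```python
import http.client
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def time_one_request(url, timeout=30):
    """Fetch url once; return the seconds it took, or None if it failed."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        return None
    return time.perf_counter() - start


def load_test(url, users=50):
    """Send `users` requests at roughly the same moment and time each one."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(time_one_request, [url] * users))


def summarise(timings, limit=5.0):
    """Count failures and requests slower than `limit` seconds."""
    ok = [t for t in timings if t is not None]
    return {
        "requests": len(timings),
        "failed": len(timings) - len(ok),
        "slow": sum(1 for t in ok if t > limit),
        "slowest": max(ok) if ok else None,
    }


if __name__ == "__main__":
    # Placeholder URL: swap in your own site before running.
    print(summarise(load_test("http://localhost:8000/", users=50)))
```

Any failed or slow requests in the summary are your cue to work through the fixes that follow.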
What if my site’s not performing?
There are a hell of a lot of reasons why a site can run slow, but here are some of the most obvious fixes you could try:
Install a cache. If your site has any dynamic content (content that’s being built by code on each visit, rather than just a straight HTML site), you really need a cache - it’s a program that stores the finished pages for common requests and hands out the stored copy, which is much faster than rebuilding the page from scratch for every visitor. If you’re running WordPress, this is pretty simple: download the plugin HyperCache and install it. If you’re running a custom site, talk to your developers - it’s very easy to add caching using both Ruby on Rails and PHP, two of the most popular technologies for building websites.
Check your cache is running. This one bit me when I was putting together the test site for this series of articles. If your site seems to be running slowly, check your cache is working. For WordPress, go to the Admin panel, then Settings, then the HyperCache tab, and check there are no error messages there. Most likely you’ll have set your file permissions wrong - when I fixed this, it reduced my load times from upward of 10 seconds to 2 seconds.
Check all the external files you’re serving. WordPress plugins are particularly good at slowing a site down if they have to load a Javascript or other file from an external site. Try disabling all your plugins except your cache, and rerunning the test - if it’s suddenly a lot faster, one of your plugins is causing the slowdown. Re-enable them one at a time, retesting after each, until the slowdown comes back. If you’re not using WordPress, check any Javascript or image files you’re running from a site other than yours, and try temporarily removing them from your page. Suddenly speeds up? You’ve got your smoking gun.
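To get a quick list of what your page pulls in from other sites, a small Python sketch like this one can help. It only looks at the `src` of script, img and iframe tags and the `href` of link tags in the HTML you hand it, so anything a script loads later won’t show up. The sample HTML and domain names are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalFileFinder(HTMLParser):
    """Collect URLs of scripts, images, etc. served from other hosts."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.external = []

    def handle_starttag(self, tag, attrs):
        wanted = {"script": "src", "img": "src", "iframe": "src", "link": "href"}
        attr = wanted.get(tag)
        if attr is None:
            return
        url = dict(attrs).get(attr)
        if not url:
            return
        host = urlparse(url).netloc
        # Relative URLs have no host, so they are served by your own site.
        if host and host != self.own_host:
            self.external.append(url)


def find_external_files(html, own_host):
    finder = ExternalFileFinder(own_host)
    finder.feed(html)
    return finder.external


sample = """
<link rel="stylesheet" href="/style.css">
<script src="http://widgets.example.net/share.js"></script>
<img src="http://www.example.com/logo.png">
"""
print(find_external_files(sample, "www.example.com"))
# ['http://widgets.example.net/share.js']
```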
Check you don’t have any HUUUUUUUGE files. Use the online tool at http://www.websiteoptimization.com/services/analyze/ to check the size of your front page as well as a bunch of other useful info. Frankly, page size doesn’t matter as much as it used to, but if you’re north of 200KB on your page, you might want to try slimming down some big images or big Javascript files. Over 2MB is definitely alarm bells territory - something’s far larger than it should be.
Talk to your hosts. Most webhosts are more than happy to look into speed issues - frankly, if they ain’t interested when you say “erm, my site’s slower than a snail superglued to a tin of molasses”, you should look for another host. There are a lot of tweaks that they can probably implement to speed your site up - ask about your web server config in particular.
If all else fails, prepare a simple static HTML page containing your latest episodes, contact information, and a link to your RSS feed. If you do get a traffic surge, rename whatever your current index file is, rename your static page to “index.html” (all lower case - many servers care about the difference), and stick it in the main folder of your website - it’ll act as a new, simple front page and handle the load better.
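If you’d rather script that swap ahead of time than do it by hand mid-surge, here’s a minimal Python sketch. The file names (`index.php` for the live front page, `surge.html` for the static one) are assumptions, as is the idea that you can run Python on your server at all; check what your site actually uses, and try it on a copy first.

```python
from pathlib import Path


def switch_to_static(site_dir, current_index="index.php", static_page="surge.html"):
    """Park the live index file and put the static page in as index.html."""
    site = Path(site_dir)
    live = site / current_index
    static = site / static_page
    if not static.exists():
        raise FileNotFoundError(f"Prepare {static} before the surge hits")
    if live.exists():
        # Keep the original so the swap is easy to undo later.
        live.rename(site / (current_index + ".parked"))
    static.rename(site / "index.html")


# Example call (the path is a placeholder for your site's main folder):
# switch_to_static("/path/to/your/site")
```

To undo it once the rush is over, rename the “.parked” file back to its original name and move the static page out of the way again.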
And that’s it! Now, you either know that you can handle a Happy Event, or at least you’re on the way to figuring out what’s slowing you up!
Any other tips for making sure your site’s ready for a serious Digging? How have your sites handled major traffic in the past?