Possibly you’ve never heard of it. But if you know anyone who knits or crochets, ask him or (more likely) her about Ravelry and chances are you’re talking to an active member. It’s big. This is an interview with Casey Forbes, who constitutes the whole engineering staff.
I got to know about Ravelry because my wife is an active member; you may know her as Lauren Wood, Ph.D, girl geek and project manager extraordinaire, but she’s also on the air at Rune Designs, check it out. I learned early on that Ravelry was a Rails thingie; it took quite a while to back Casey into a corner long enough for a conversation, but I think it was worth the wait.
Tim: Since most of the readers here probably haven’t heard of you or your work, let’s motivate this story. You built an online community called Ravelry, and you use Rails. We’ll get to what it is and so on later, but start by impressing us with some numbers.
Casey: We’ve got 430,000 registered users, in a month we’ll see 200,000 of those, about 135,000 in a week and about 70,000 in a day.
We peak at 3.6 million pageviews per day. That’s registered users only (doesn’t include the very few pages that are Google accessible) and does not include the usual API calls, RSS feeds, AJAX.
Actual requests that hit Rails per day is 10 million.
900 new users sign up per day.
The forums are very active with about 50,000 new posts being written each day.
Some various numbers — 2.3 million knitting/crochet projects, 19 million forum posts, 13 million private messages, 8 million photos (the majority are hosted by Flickr).
Tim: Who are you?
Casey: My background? Just a regular coder. I’m from NH and went to the University of New Hampshire for Computer Science and graduated in 2000 when everything was a little wacky and, unsurprisingly, went to work doing web stuff. I started out w/Microsoft stuff, moved to Java, and after 7 years (and 4 jobs) in Java with a little sysadmin and network stuff mixed in, I decided that I didn’t really want to write Java anymore. I wasn’t having fun. Ruby seemed like it might bring the fun back, and it hasn’t disappointed.
Tim: What is Ravelry?
Casey: Beyond the “Ravelry is a knit and crochet community”, we usually say that Ravelry is three things:
An organizational tool for knitters and crocheters. A project album, yarn stash album/inventory, needle inventory — everything a knitter/crocheter might want for personal organization.
A yarn and pattern database and research tool. Our community-edited yarn and pattern database is something that has never existed before. If someone else has used a pattern or yarn, no matter how obscure, you can probably find information and project photos on Ravelry. The personal organizational tool is actually entirely public and we were able to create this database by encouraging people who share their projects and information (by using the organizational tools) to contribute to the yarn and pattern directory.
A social site. Forums, groups, friend-related features (like viewing an activity stream of friends’ handspun yarn, projects, etc. being added) all give people ways to interact with other knitters and crocheters.
... there is also a 4th item, which is “a tool for independent designers and yarnies” (we use “yarnies” as a nickname for yarn dyers/spinners). From the very beginning, giving small indie designers/yarnies a way to show off their work and get the word out has been a very important part of Ravelry. We feel that we’ve helped many people find an audience and we’re proud of that.
Tim: [After pausing to appreciate the Douglas-Adams-esque four-part trilogy there] What’s the history? How did you get here?
Casey: I think that this 30 minute podcast might have been the most fun and coherent interview we’ve ever done and it explains the history, etc. pretty well.
In short: My wife Jessica and I started designing the site in January 2007 after caving to months of needling from a friend who thought there was promise in our idea. Here's a 2005 entry from Jessica's blog describing an imaginary site that is much like Ravelry. I started coding on nights and weekends... As soon as we could, we got alpha testers in to try it out...
...4 months later, we had a site that we were ready to announce.
Once the secret was out, all hell broke loose. There was way more interest than expected and it was 5 months before we even had the hardware to support a fraction of the people who were signing up (we were just on a little VPS).
Interest hasn’t slowed since.
I could talk for ages about how awesome and valuable the beta process was. We learned so much during the first year when invitations were going out slowly and we were talking to the users of the site about what they wanted every single day. I would do it all over again in a heartbeat — start with something that works, get people in it, and build it together.
Tim: Does it make you a living?
Casey: Yep. Ravelry employs Jess and me, one full-time employee (Mary-Heather) who has been with us for over a year, and one part-time employee (Sarah) who we hope to hire full time soon.
We started out with just our meager savings and I don’t know how we would have ever made it to the self-sustaining company that we are today without all of the generous support from our users.
Now we have about 2,000 advertisers and a novel (for the industry, at least) approach to advertising that we are proud of — very low rates, self-service, affordable for small indie businesses, and advertising pays for much of Ravelry.
We also have a good-sized merchandise store and we have our own wholesale accounts, printers, and fulfillment company so that we aren’t giving all of the profits to CafePress or whoever.
Ryan Norbauer from RubyRags was a huge help when it came to integrating with a fulfillment center. We use a small Massachusetts company.
Recently we started providing a pattern sales (digital download) service that is especially suited for knitting/crochet designers, and that has been going well.
Tim: Could you give us a walk-through of the setup? Software, hardware, versions, hosting, and so on?
Casey: Sure. We own our servers and host in a datacenter where we rent space. This was the most cost effective for us.
The servers are from Silicon Mechanics, the bandwidth is from our datacenter (Hosted Solutions) and Cogent (dirt cheap!).
We are using Amazon S3 for storage and taking advantage of Cloudfront. We’re currently storing about 5 terabytes with Amazon. Until we got a great deal with Cogent, Amazon made huge sense because the bandwidth was cheap compared to what we were paying.
We have 7 servers running Gentoo Linux and virtualized into a total of 13 virtual servers with Xen.
2 are small, low-powered things. One does backups and the other is a “utility server” which is assorted non-critical processes and a place for me to stage stuff.
2 (more or less) are the master database and the slave database + Sphinx search engine. These have 32 GB of memory each. At the time, 64 GB was too expensive. I’m running a Percona build of MySQL 5.0. I love the Percona builds.
The other 3 (more or less) are application servers. They run Passenger and Ruby Enterprise Edition (Ruby 1.8.6) plus memcached. That’s 6 quad core processors and 40 GB of memory total, with room to spare. Passenger gave me a lot more memory to work with and the Ruby GC patches are great too.
...as always, nearly all of the scaling/tuning/performance related work is database related.
Our front end server is Nginx and Haproxy. So traffic goes a little like this:
nginx ⇒ haproxy ⇒ (load balanced) ⇒ apache + mod_passenger
I love nginx. It is much faster and less memory hungry than Apache.
I love haproxy. It’s a great load balancer. It also gives me a lot of flexibility and really helps me+Capistrano do rolling deploys of new versions of the site without affecting performance/traffic.
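[Editor's sketch: the interview does not say how the rolling deploys are wired up, so the following is only a toy model of the general idea, which is to take one backend out of the load balancer's rotation at a time, upgrade it, and put it back. The method name, the hash keys, the version strings, and the pool of six backends are all hypothetical; nothing here is taken from Ravelry's Capistrano recipes or haproxy configuration.]

```ruby
# Toy model of a rolling deploy. All names are illustrative assumptions.
# Each backend is upgraded in turn while the others keep serving traffic.
def rolling_deploy(backends, new_version)
  backends.each do |backend|
    backend[:in_rotation] = false     # stand-in for draining it at the load balancer
    yield backends                    # let the caller observe the pool mid-deploy
    backend[:version] = new_version   # stand-in for restarting the app on new code
    backend[:in_rotation] = true      # stand-in for passing health checks again
    yield backends
  end
end

pool = (1..6).map { |i| { name: "app#{i}", version: "v1", in_rotation: true } }

# Track the smallest number of backends that were serving at any point.
lowest_in_rotation = pool.size
rolling_deploy(pool, "v2") do |state|
  serving = state.count { |b| b[:in_rotation] }
  lowest_in_rotation = [lowest_in_rotation, serving].min
end

puts "lowest number of backends in rotation: #{lowest_in_rotation}"
```

In this model at most one backend is ever out of rotation, which is why a deploy need not be visible to users as long as the remaining backends can absorb the load.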
We have 6 Apache+passengers running, each capped at a pool size of 20. Munin charts tell me that the number of concurrent sessions actually handled by those 120 available servers ranges between 15 and 90. (We have a global audience but 4 AM eastern is pretty quiet.)
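[Editor's sketch: a back-of-the-envelope check of the figures Casey quotes, namely 6 instances with a pool size of 20, 10 million Rails requests per day, and 15 to 90 concurrent sessions. The average request rate assumes traffic is spread evenly over 24 hours, which the 4 AM remark shows it is not, so real peaks will be higher. The variable names are mine.]

```ruby
# Figures quoted in the interview.
app_instances    = 6            # Apache + Passenger instances
pool_size        = 20           # Passenger pool cap per instance
requests_per_day = 10_000_000   # requests that hit Rails
busy_sessions    = 15..90       # concurrent sessions reported by Munin

total_workers = app_instances * pool_size

# Assumption: an even spread over the day; this is a floor, not a peak.
average_rps = (requests_per_day / 86_400.0).round(1)

low_utilization  = (busy_sessions.first.to_f / total_workers * 100).round(1)
high_utilization = (busy_sessions.last.to_f / total_workers * 100).round(1)

puts "worker slots:     #{total_workers}"
puts "average req/sec:  #{average_rps}"
puts "pool utilization: #{low_utilization}% to #{high_utilization}%"
```

On these numbers the pool never runs above three quarters full, which fits the "room to spare" remark about the application servers.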
I’ve been using Tokyo Cabinet/Tyrant instead of memcached in some places for caching larger objects. We do a lot of rendering Markdown into HTML — pretty much everywhere on the site — it’s a waste of energy to render but it’s also a huge amount of data to store in memcached.
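[Editor's sketch: one common way to cache rendered Markdown, shown only to illustrate the trade-off Casey describes. A plain Hash stands in for the Tokyo Tyrant store, the renderer is a toy lambda rather than a real Markdown library, and keying on a SHA-1 digest of the source text is my assumption, not something the interview states.]

```ruby
require "digest/sha1"

# Return cached HTML for a piece of Markdown, rendering only on a miss.
# Keying on a digest of the source means an edited post gets a new key,
# so stale HTML is never served for changed text.
def cached_html(store, markdown)
  key = "md:" + Digest::SHA1.hexdigest(markdown)
  store[key] ||= yield(markdown)
end

store   = {}   # hypothetical stand-in for a Tokyo Tyrant client
renders = 0

# Toy renderer for illustration only; it is not a Markdown implementation.
render = lambda do |text|
  renders += 1
  "<p>#{text}</p>"
end

first  = cached_html(store, "hello *world*", &render)
second = cached_html(store, "hello *world*", &render)   # served from the store
edited = cached_html(store, "hello *there*", &render)   # new text, new key

puts "renders performed: #{renders}"
puts "entries stored:    #{store.size}"
```

Three lookups cost two renders here. The price is that every distinct version of a post leaves its HTML in the store, which is the "huge amount of data" side of the trade-off.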
Sysadmin tools that are important to us: syslog-ng for log aggregation, nagios for alerts, munin for resource/etc monitoring. NewRelic has been nice for tuning and having a little more visibility. HopToad has been great for exception notifications.
I’m very happy with most of our infrastructure with one exception. We have lots of 10-20 million row tables and MySQL schema changes are so painful if you have large tables and you don’t want any downtime.
Tim: You’re running one of the world’s more successful deployments of Ruby and Rails technologies. What do you see from where you sit?
Casey: Ruby is fun! If you listen to Paul Graham and whoever else, then you’ll be working on your startup while you have a day job. Fun is important.
Ruby is fast to develop and prototype in. We released a new version of the code every day (probably twice a day or more) during the first several months of heavy beta testing when a few thousand users were actively helping us create the site.
What we’ve done only takes 1 not-even-fulltime (I have lots of other duties) programmer/sysadmin and it’s very cool that the software available today makes this possible. It’s important — we can’t be spending money on “professional services” and we only have 3.5 employees.
[Tim’s parting thoughts.] I think there are some major lessons in this story, maybe not new but real important, screaming to be heard. But I’ll leave the exegesis to you, dear reader.