I just read an interview with developers from Twitter, commenting on why they rewrote parts of their Ruby-based system in Scala, and it made for interesting reading for someone interested in both Scala and Ruby. The obvious reason for a high-traffic site like Twitter to switch from Ruby to Scala was to squeeze more performance out of the hardware, but some of their comments puzzled me, and in the end I felt some of the things they said didn’t add up.
Steve Jenson: … Another thing we really like about Scala is static typing that’s not painful. Sometimes it would be really nice in Ruby to say things like, here’s an optional type annotation. This is the type we really expect to see here. And we find that really useful in Scala, to be able to specify the type information.
Alex Payne: I’d definitely want to hammer home what Steve said about typing. As our system has grown, a lot of the logic in our Ruby system sort of replicates a type system, either in our unit tests or as validations on models. I think it may just be a property of large systems in dynamic languages, that eventually you end up rewriting your own type system, and you sort of do it badly. You’re checking for null values all over the place. There’s lots of calls to Ruby’s kind_of? method, which asks, “Is this a kind of User object? Because that’s what we’re expecting. If we don’t get that, this is going to explode.” It is a shame to have to write all that when there is a solution that has existed in the world of programming languages for decades now.
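To make the pattern concrete, here is a hypothetical sketch (the User class and the method are invented for illustration) of the kind of hand-rolled checking Alex describes:

```ruby
class User
  attr_reader :mailbox

  def initialize
    @mailbox = []
  end
end

# Manual checks that replicate, by hand and at runtime, what a static
# type system would verify once at compile time.
def deliver_message(recipient, message)
  raise ArgumentError, "expected a User" unless recipient.kind_of?(User)
  raise ArgumentError, "message may not be nil" if message.nil?
  recipient.mailbox << message
end
```

Every such check is a line of code, and a unit test, that a static type system would have made unnecessary.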
First of all, I never expected Twitter to be a large system when it came to code. I assume they have quite a few boxes, but the problem domain itself only seems to require a fairly small system, even though words like small and large are bad qualifiers when we have no references. The API is small, what you can do is limited (which in itself is the power of Twitter) and the web front-end is not particularly large either, so I have a hard time imagining what made the system large. But that’s not what puzzled me.
When I read their comments on type systems, it indicated to me that the Twitter developers have used a dynamically typed language (Ruby) for several years and still missed the point. If you use a language that has duck typing, you never (or at least extremely rarely) want to check which type something is. The whole point is that you don’t care. If your object walks like a duck and quacks like a duck, and that’s what we need, we’re happy. There is no need to check its genes to see what species it is, or to make sure it’s got feathers and a flat bill, as long as it fulfills the contract of walking and quacking like a duck. If you really do have to check occasionally, you should use respond_to? to ask for the behavior you need rather than kind_of? to ask for a particular class.
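A minimal sketch of the difference, with invented class names — ask for the behavior you need, not the ancestry:

```ruby
class Duck
  def quack
    "quack"
  end
end

class Robot
  def quack
    "beep"
  end
end

# Duck typing: we only care that the argument quacks, not what class it is,
# so a Robot is as welcome as a Duck.
def make_noise(animal)
  animal.quack
end

# If you must check at all, check for the behavior, not the species.
def quacker?(thing)
  thing.respond_to?(:quack)
end
```

A kind_of?(Duck) check here would reject the Robot even though it fulfills the contract perfectly.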
One of my favorite principles when it comes to building stable systems is something the Pragmatic Programmer book describes well: “Dead programs tell no lies”. Unless you build software where people are going to die if your system goes down, don’t try to recover from programming errors (like being passed the wrong types). Let it explode. Loudly. What are you supposed to do anyway? The worst thing that can happen is that you actually manage to keep running in this very unexpected state, because you will likely trash someone’s data instead, and that’s a great recipe for making new enemies. If your system goes down because someone passed you the wrong type, or passed you a null value where one isn’t allowed, that’s a good thing. You’ll find that bug fast and easily, and then it’s gone. It’s not like there will be code that gives you another type every 100,000th call; it will happen early, which means you’ll die early. Trying to cope with such things almost always turns into real problems, like silently corrupting data. Let it explode loudly so you can find it and fix it. Don’t compensate for callers that are abusing you. If the caller doesn’t follow the preconditions in the contract, the deal is off. I don’t know what the Twitter devs tried to do when they checked for type, but if they tried to do anything other than perhaps explode in a different way, there is some kind of design problem going on.
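A hypothetical sketch of the two styles (the pricing example is invented for illustration):

```ruby
# Defensive style: silently "recover" from a caller's bug. The program keeps
# running, and the wrong price quietly propagates into someone's data.
def apply_discount_defensively(price, discount)
  return price if discount.nil?
  price - discount
end

# Fail-fast style: a nil discount is a programming error, so explode loudly,
# right next to the bug, instead of shipping a silently wrong price.
def apply_discount(price, discount)
  raise ArgumentError, "discount must not be nil" if discount.nil?
  price - discount
end
```

The defensive version never crashes, which is exactly the problem: the bug survives long enough to do real damage somewhere far from its cause.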
Later on Alex says:
Alex Payne: … Thinking about static typing for the first time in several years.
Now I’m truly puzzled, since Alex said earlier that they more or less had to write their own static typing in Ruby.
On a tangent, when it comes to null checking, I agree it’s a pain, but most static type systems don’t handle it any differently. Scala’s very strict type system does have an Option type which addresses the problem, but any semi-static type system like Java’s leaves the developer doing null checks. I haven’t written much Scala code yet, but I’m very curious to find out how the Option type works in practice. It’s a very nice language feature, but I worry that you end up in the same situation as you did with C++ consts. In C++, it’s good practice to mark your query methods as const to indicate that the method has no side effects, which means the method can be called on const objects. The problem is that the practice isn’t common enough: when you need to call another method from your const method, you often find that even though it doesn’t change anything, it isn’t marked as const. That means you can’t call it from your const method, since the compiler can’t be sure, and when you run into this often enough you’re likely to give up and stop marking your own methods as const to avoid the pain, unless you have all the code you call, and the code that code calls (and so on), under your control. I do hope I’m wrong about these problems being similar, because it would be a real pity if the same thing happened to Scala’s Option type.
The next statement that puzzled me:
Robey Pointer: I had no functional background prior to learning Scala other than Python. I was pretty familiar with Python. As I’ve learned more Scala I’ve started thinking more functionally than I did before. When I first started I would use the for expression, which is very much like Python’s. Now more often I find myself invoking map or foreach directly on iterators.
I can only come to one conclusion, and that is that the Twitter Ruby code must have looked an awful lot like Java code. If they haven’t been using map and each in Ruby, there must be a lot of hard, unnecessary work being done in that code base.
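For reference, this is the kind of bookkeeping that map makes unnecessary (a trivial, invented example):

```ruby
names = ["alice", "bob", "carol"]

# Java-flavored Ruby: manual index bookkeeping and result accumulation.
upcased = []
i = 0
while i < names.length
  upcased << names[i].upcase
  i += 1
end

# Idiomatic Ruby: map does the iteration and the collecting for you.
upcased_with_map = names.map { |n| n.upcase }
```

Both produce ["ALICE", "BOB", "CAROL"], but the second version says what it does instead of how.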
At the end of the interview, they touched on concurrency and Scala actors, a feature of Scala that appeals to me a lot. At first it seemed like they had put the Scala actors to great use, but then:
Robey Pointer: … And over time as we ran more and more system tests on it, we found that actors weren’t necessarily the ideal concurrency model for all parts of that system. Some parts of the concurrency model of that system are still actor based. For example, it uses a memcache library that Robey wrote, which is actor based. But for other parts we’ve just gone back to a traditional Java threading model. The engineer working on that, John Kalucki, just found it was a little bit easier to test, a bit more predictable.
Since I don’t know exactly what they were doing, it’s hard to comment on the specific case, but I’m really surprised by what they say. Easier testing and more predictable behavior are normally reasons why people want to leave Java-style threading for actors, not the other way around. Either this particular problem was very different from other concurrency problems, or the Twitter developers simply had the urge to get back to where they came from and what felt comfortable.
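For readers who haven’t met the model: an actor is essentially a thread that owns its state and processes messages from a mailbox one at a time, so that state never needs locks. Here is a minimal, invented sketch of the idea in Ruby, using a Thread draining a Queue as the mailbox (Scala’s actual actor library offers far more, such as pattern-matching receives):

```ruby
# Minimal actor sketch: one thread draining a mailbox (a Queue).
# Only this thread ever touches @count, so no locking is needed.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    Thread.new { loop { handle(@mailbox.pop) } }
  end

  def increment
    @mailbox << :increment
  end

  # Ask for the count via a reply queue; blocks until the actor answers.
  def count
    reply = Queue.new
    @mailbox << [:get, reply]
    reply.pop
  end

  private

  def handle(message)
    case message
    when :increment
      @count += 1
    when Array
      _kind, reply = message
      reply << @count
    end
  end
end
```

Because the mailbox serializes all access to the state, you get the predictability that shared-state threading has to earn with careful locking — which is exactly why their retreat to threads surprises me.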
The problem is that comfort zones can be dangerous places. They feel safe because you’re familiar with them, but maybe you just never saw the monsters under your bed.