Niclas Nilsson

Comments on Twitter on Scala

Posted by Niclas Nilsson on 2009-04-06 at 13:20

I just read an interview with developers from Twitter, commenting on why they rewrote parts of their Ruby-based system in Scala, and it was an interesting read for someone interested in both Scala and Ruby. The obvious reason to switch from Ruby to Scala for a high-traffic site like Twitter was to squeeze more performance out of the hardware, but some of their comments puzzled me, and in the end I felt that the things they said didn’t add up.

Steve Jenson: […] Another thing we really like about Scala is static typing that’s not painful. Sometimes it would be really nice in Ruby to say things like, here’s an optional type annotation. This is the type we really expect to see here. And we find that really useful in Scala, to be able to specify the type information.

Alex Payne: I’d definitely want to hammer home what Steve said about typing. As our system has grown, a lot of the logic in our Ruby system sort of replicates a type system, either in our unit tests or as validations on models. I think it may just be a property of large systems in dynamic languages, that eventually you end up rewriting your own type system, and you sort of do it badly. You’re checking for null values all over the place. There’s lots of calls to Ruby’s kind_of? method, which asks, “Is this a kind of User object? Because that’s what we’re expecting. If we don’t get that, this is going to explode.” It is a shame to have to write all that when there is a solution that has existed in the world of programming languages for decades now.

First of all, I never expected Twitter to be a large system when it came to code. I assume they have quite a few boxes, but the problem domain itself only seems to require a fairly small system, even though words like small and large are bad qualifiers when we have no references. The API is small, what you can do is limited (which in itself is the power of Twitter) and the web front-end is not particularly large either, so I have a hard time imagining what made the system large. But that’s not what puzzled me.

When I read their comments on type systems, it indicated to me that the Twitter developers have used a dynamically typed language (Ruby) for several years and still missed the point. If you use a language that has duck typing, you never (or at least, extremely rarely) want to check which type something is. The whole point is that you don’t care. If your object walks like a duck and quacks like a duck and that’s what you need, you’re happy. There is no need to check its genes to see what species it is, or to make sure it’s got feathers and a flat nose, as long as it fulfills the contract you have of walking and quacking like a duck. If you really do have to check occasionally, you should use respond_to?, not kind_of?
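To make the difference concrete, here is a minimal sketch. The classes Duck and Robot and the helper methods are made up for illustration and have nothing to do with Twitter's code base:

```ruby
# Hypothetical classes for illustration; they share no base class.
class Duck
  def quack
    "Quack!"
  end
end

class Robot
  def quack
    "Beep-quack!"
  end
end

# Duck-typed: just call the method. If the caller passes something
# that cannot quack, a NoMethodError surfaces right here.
def make_noise(thing)
  thing.quack
end

# If a check really is needed, ask about behaviour, not ancestry.
def can_quack?(thing)
  thing.respond_to?(:quack)
end

noises = [Duck.new, Robot.new].map { |thing| make_noise(thing) }
```

A kind_of?(Duck) check would reject the Robot even though it fulfills the contract perfectly well, which is exactly the kind of home-made type system the quote describes.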

One of my favorite styles in programming when it comes to creating stable systems is something the Pragmatic Programmer book describes well: “Dead programs tell no lies”. Unless you build software where people are going to die if your system goes down, don’t try to recover from programming errors (like sending in the wrong types). Let it explode. Loudly. What are you supposed to do anyway? The worst thing that can happen is that you actually manage to keep running in this very unexpected state, because you will likely trash someone’s data instead, and that’s a great recipe for making new enemies. If your system goes down because someone passed you the wrong type or passed you a null value when that’s not allowed, that’s a good thing. You’ll find that bug fast and easily and then it’s gone. It’s not like there will be code that gives you another type on every 100,000th call. It will happen early, which means you’ll die early. Trying to cope with such things almost always turns into real problems, like silently corrupting data. Let it explode loudly so you can find it and fix it. Don’t compensate for callers that are abusing you. If the caller doesn’t follow the preconditions in the contract, the deal is off. I don’t know what the Twitter devs tried to do when they checked for type, but if they tried to do anything other than perhaps explode in a different way, there is some kind of design problem going on.
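A small sketch of the two attitudes, with method names invented for this example:

```ruby
# Hypothetical methods, for illustration only.

# Tempting but dangerous: quietly "repair" bad input and carry on.
def store_name_leniently(record, name)
  record[:name] = name.to_s # nil becomes "", and the bug is hidden
  record
end

# Fail fast: a violated precondition stops the caller immediately.
def store_name_strictly(record, name)
  raise ArgumentError, "name must not be nil" if name.nil?
  record[:name] = name
  record
end
```

The lenient version happily stores an empty name and the mistake shows up much later, if ever. The strict version blows up on the very first bad call, with a stack trace that points straight at the offending caller.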

Later on Alex says:

Alex Payne: […] Thinking about static typing for the first time in several years.

Now I’m truly puzzled, since Alex said earlier that they more or less had to write their own static typing in Ruby.

On a tangent, when it comes to null checking, I agree it’s a pain, but most static type systems don’t handle it differently. Scala’s very strict type system does have an Option type which addresses the problem, but any semi-static type system like Java’s results in the developer doing null checks. I haven’t written much Scala code yet, but I’m very curious to find out how the Option type works in practice. It’s a very nice language feature, but I worry that you end up in the same situation as you did with C++ consts. In C++, it’s a good practice to mark your query methods as const to indicate that the method has no side effects, which means you can use the method with const objects. The problem is that since it’s not a common enough practice, when you need to use another method from your const method, you often find that even if it doesn’t change anything it’s not marked as const. This means you can’t use it from your const method since the compiler can’t be sure, and when you run into this enough, you’re likely to give up and stop marking your own methods as const to avoid the pain, unless you have all the code you call and the code the code you call calls (and so on) under your control. I do hope I’m wrong about these problems being similar, because it would be a real pity if the same thing happened to the Scala Option type.

The next statement that puzzled me:

Robey Pointer: I had no functional background prior to learning Scala other than Python. I was pretty familiar with Python. As I’ve learned more Scala I’ve started thinking more functionally than I did before. When I first started I would use the for expression, which is very much like Python’s. Now more often I find myself invoking map or foreach directly on iterators.

I can only come to one conclusion and that is that the Twitter Ruby code must have looked an awful lot like Java code. If they haven’t used map and each in Ruby, there must be a lot of hard, unnecessary work done in the code base.
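To show what I mean by Java-looking Ruby, here is a made-up example (not taken from Twitter's code) doing the same thing both ways:

```ruby
names = %w[alice bob carol]

# Java style: manage the index and the result array by hand.
shouted = []
i = 0
while i < names.length
  shouted << names[i].upcase
  i += 1
end

# Idiomatic Ruby: let map build the new array.
shouted_with_map = names.map { |name| name.upcase }
```

Both produce the same array, but the second version has no index to get wrong and says what it does in one line.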

At the end of the interview, they touched on concurrency and Scala actors, a feature of Scala that appeals to me a lot. At first it seems like they got great use out of the Scala actors, but then:

Robey Pointer: […] And over time as we ran more and more system tests on it, we found that actors weren’t necessarily the ideal concurrency model for all parts of that system. Some parts of the concurrency model of that system are still actor based. For example, it uses a memcache library that Robey wrote, which is actor based. But for other parts we’ve just gone back to a traditional Java threading model. The engineer working on that, John Kalucki, just found it was a little bit easier to test, a bit more predictable.

Since I don’t know exactly what they were doing, it’s hard to comment on the specific case, but I’m really surprised by what they say. These are normally reasons why people want to leave Java style threading for actors, not the other way around. Either this particular problem was very different from other concurrency problems or the Twitter developers simply had the urge to get back to where they came from and what felt comfortable.

The problem is that comfort zones can be dangerous places. They feel safe because you’re familiar with them, but maybe you just never saw the monsters under your bed.


Pair programming desk

Posted by Niclas Nilsson on 2009-03-31 at 23:45

Pair programming is perhaps the most underestimated agile practice. Many organizations that adopt agile are still very sceptical of pair programming. I talked about this as a part of my talk on “Unintuitive parts of agile” at SDC 2009, and I intend to write more about it.

At SDC 2009, I also showed the pair programming desk that we designed at factor10. Over the years we’ve been pairing at various client sites on straight desks, the rounded (convex) ends of conference tables, and on really bad setups on concave desks. To make a long story short, when we furnished the factor10 Gothenburg office, we decided to create a great desk for pair programming and now that we’ve had it for a year, we’ve decided to publish the blueprints under a Creative Commons Attribution-Share Alike license. Any decent wood workshop should be able to use the blueprints if you want one yourself.

More information and the blueprints can be found at the factor10 website. If you decide to build one for yourself, I’ll be interested to hear your feedback, and if you decide to use the power of Creative Commons and modify it to better suit your particular need, I’d be very curious to hear about modifications and how they worked out.

Enjoy!


Speaking at SDC 2009

Posted by Niclas Nilsson on 2009-03-18 at 20:20

Just a short announcement. Next week I’ll be giving two talks at Scandinavian Developer Conference. The first talk is about “The unintuitive parts of agile” and the second one is on “Outside in - black belt TDD/BDD”. I can’t seem to find direct links to the abstracts, so you have to search on the track page to read them in full.


97 Things Every Software Architect Should Know

Posted by Niclas Nilsson on 2009-02-25 at 02:56

Just as my previous book went out of print after being in print for a decade, I’m on paper again. I’ve had the honor to be involved in an O’Reilly project called 97 Things Every Software Architect Should Know: Collective Wisdom from the Experts which I wrote two “things” for. Now, the book is published and it feels good to be part of a project with so many skilled architects with loads and loads of experience, and I’m looking forward to getting the paper book in the mail soon.

As you may expect, the book contains 97 different things that are important when you work as an architect, but not all of them are things you’d expect. There is naturally loads of advice of a technical nature, but there are also a lot of hard-won insights about your role as an architect. Many of the things addressed in the book are mistakes I myself have made in the past and many others are things I’ve seen others do or seen the result of; traps you most likely want to avoid if you can.

All “things” are also available online, so if you’d like a sneak peek of the book, here are the links to my two contributions about Commit’n’Run and Fighting Repetition.


Scraping holidays

Posted by Niclas Nilsson on 2008-12-17 at 18:09

Calendar problems again. Maybe I’m bad at searching, but Twitter didn’t help either. In any case, I couldn’t seem to find a decent iCal calendar with Swedish holidays and observances for 2009. Many of them missed things like Midsummer Eve, which is pretty important to an average Swede, and none of them contained what I wanted. timeanddate.com does, and it can be configured to include different levels of detail for your specific country, but they don’t provide iCal versions. I of course quickly became bored looking through broken calendars, and since programming is fun, it was time for some scraping.

gem install hpricot (a great html parser), and off we go.

Replace the months with whatever they’re called in your language, and replace the url with your configured url at timeanddate.com (and mind the year placeholder in the url if you want to get several years):

require 'rubygems'
require 'open-uri'
require 'hpricot'
require 'icalendar'
require 'date'
require 'active_support'
include Icalendar

cal = Calendar.new

months = %w[jan feb mar apr maj jun jul aug sep okt nov dec]

# Create a map { "jan" => 1, ... }
months = months.zip((1..12).to_a).flatten
months = Hash[*months]

is_date = lambda { |line| line =~ /\d+\s.*/ }
to_text = lambda { |e| e.to_plain_text }

(2009..2015).each do |year|
  url = "http://timeanddate.com/calendar/custom.html?year=#{year}&country=21&lang=sv&hol=825&moon=on"

  doc = Hpricot(open url)

  # scrape the dates and descriptions
  dates, descs = (doc/"td.smtop").
    map(&to_text).
    partition(&is_date)

  # for each date and description pair...
  dates.zip(descs).each do |date, desc|
    day, month = date.split
    month = months[month]
    date = Date.new(year.to_i, month.to_i, day.to_i)

    cal.event do
      dtstart       date
      dtend         date + 1
      summary       desc
    end
  end
end

puts cal.to_ical

and run it with

ruby scraping-holidays.rb > holidays.ics

Here is my file with Swedish holidays 2009-2015, and you can import it in the same way as in the previous post on weeks in iCal.
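If the partition and zip steps in the script look a bit magic, here is a stripped-down sketch that runs without any gems. The cell texts are made-up sample data standing in for what the scraper finds in the td.smtop cells, and I have assumed that every date cell starts with a day number:

```ruby
require 'date'

# Made-up sample data: date cells and description cells, interleaved.
cells = ["1 jan", "Nyårsdagen", "6 jan", "Trettondedag jul",
         "19 jun", "Midsommarafton"]

months = %w[jan feb mar apr maj jun jul aug sep okt nov dec]
month_numbers = months.zip((1..12).to_a).to_h # { "jan" => 1, ... }

# partition keeps the original order within each of the two groups...
dates, descs = cells.partition { |text| text =~ /\A\d+\s/ }

# ...so zip can pair each date with its description by position.
holidays = dates.zip(descs).map do |date, desc|
  day, month = date.split
  [Date.new(2009, month_numbers[month], day.to_i), desc]
end
```

This only works as long as the page has exactly one description per date; a missing cell would shift every pair after it.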

I wish you nice future holidays!
