enjoying salad since 1978.

Monday, April 21, 2008

What are you doing?

reading @biz out me as a Twitter employee.

Tuesday, April 15, 2008

curious what delicious is saying about something?

Here's a bookmarklet that checks the current URL you're visiting with del.icio.us.

Sunday, April 13, 2008

My first Thrift app

When you find yourself working on a big system, a useful technique is to decompose it into services. Moving from a big monolithic server to a bunch of separate services can be a big challenge, but if you had foresight, many of your services were already decoupled in your system from day 1 even though you were deploying it monolithically.

A common technique for decomposing services is using RPC. At Google, we used protocol buffers, which were briefly described in the Sawzall paper.

Basically, you describe your data and the interface that processes the data in a language-independent format (a DDL, essentially) and use code generators to turn that DDL into a set of objects in your target language that can create and send those structures over the wire. This makes it easy to write servers in one language and clients in another, and the generated code deals with serialization.
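To make the "generated code deals with serialization" part a bit more concrete, here's a rough Ruby sketch of the kind of work a generated stub does for a single 64-bit value. The TimestampCodec module is made up for illustration; it is not what Thrift or protocol buffers actually emit, and a real message carries more than the bare value.

```ruby
# Hypothetical stand-in for generated serialization code.
# It only handles one signed 64-bit integer, nothing else.
module TimestampCodec
  # Encode a signed 64-bit integer as 8 big-endian bytes.
  def self.encode(ts)
    [ts].pack('q>')
  end

  # Decode 8 big-endian bytes back into a signed 64-bit integer.
  def self.decode(bytes)
    bytes.unpack('q>').first
  end
end

now_ms = (Time.now.to_f * 1000).to_i
wire = TimestampCodec.encode(now_ms)
puts "#{wire.bytesize} bytes on the wire"
puts TimestampCodec.decode(wire) == now_ms
```

The point is that neither the server author nor the client author writes this by hand; both sides get matching encode and decode code from the same DDL.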

I found that using a DDL to describe your code and services was really nice. When building a new service, you could simply reference your DDL in the design doc and have a meaningful discussion about the service without getting into the details of how it would be written until you had the semantics nailed down.

Facebook, as they were growing, decided to move to a homegrown binary RPC mechanism similar to protocol buffers called Thrift.

Let's say I wanted to write a simple service that would tell the client what time it was on the server. Here would be the DDL file describing both the data and the service plus a little extra to help out the generated code files.

# time.thrift
namespace java tserver.gen
namespace ruby TServer.Gen

typedef i64 Timestamp

service TimeServer {
  // Simply returns the current time.
  Timestamp time()
}

After running thrift --gen java --gen rb time.thrift on the file, I'd have an interface and server that I could implement in Java and a client that I could use in Ruby.

Based on the generated Java code, I could write a short server in Scala:

package tserver

import tserver.gen._
import com.facebook.thrift.TException
import com.facebook.thrift.TProcessor
import com.facebook.thrift.TProcessorFactory
import com.facebook.thrift.protocol.TProtocol
import com.facebook.thrift.protocol.TProtocolFactory
import com.facebook.thrift.transport.TServerTransport
import com.facebook.thrift.transport.TServerSocket
import com.facebook.thrift.transport.TTransport
import com.facebook.thrift.transport.TTransportFactory
import com.facebook.thrift.transport.TTransportException
import com.facebook.thrift.server.TServer
import com.facebook.thrift.server.TThreadPoolServer
import com.facebook.thrift.protocol.TBinaryProtocol

/**
 * TimeServer.time returns the current time according to the server.
 */
class TimeServer extends TimeServer.Iface {
  override def time: Long = {
    val now = System.currentTimeMillis
    println("somebody just asked me what time it is: " + now)
    now
  }
}

object SimpleServer extends Application {
  try {
    val serverTransport = new TServerSocket(7911)
    val processor = new TimeServer.Processor(new TimeServer())
    val protFactory = new TBinaryProtocol.Factory(true, true)
    val server = new TThreadPoolServer(processor, serverTransport,
      protFactory)

    println("starting server")
    server.serve()
  } catch {
    case x: Exception => x.printStackTrace()
  }
}

(Geez, most of that space was taken up in my obsessive need to separate out all my imports. You can thank Google for that bit of OCD.)

The client is even shorter:


require 'thrift/transport/tsocket'
require 'thrift/protocol/tbinaryprotocol'
require 'TimeServer'

transport = TBufferedTransport.new(TSocket.new("localhost", 7911))
protocol = TBinaryProtocol.new(transport)
client = TimeServer::Client.new(protocol)

transport.open()

puts "I wonder what time it is. Let's ask!"
puts client.time()

The Ruby client took about 20ms to get an answer from the Scala server.

Thrift advantages:

  • Pipelined connections mean you spend less time in connection setup/teardown, and TCP likes longer-lived connections.
  • Asynchronous requests. Asynchronous replies would be nice too but would be trickier to use.
  • Binary representation is much more efficient to transmit and process than, say, XML.
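Here's a small Ruby sketch of the connection-reuse and binary-size points, using only the standard library rather than Thrift itself. The one-byte request and the helper names start_time_server and ask_times are invented for this example, and it shows plain request/reply reuse of one socket, not true pipelining.

```ruby
require 'socket'

# Hypothetical toy protocol: the client sends one byte per request and the
# server answers with the current time as 8 big-endian bytes.
def start_time_server
  server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
  thread = Thread.new do
    conn = server.accept
    # Keep answering on the same connection until the client hangs up.
    while conn.read(1)
      conn.write([(Time.now.to_f * 1000).to_i].pack('q>'))
    end
    conn.close
  end
  [server, thread]
end

# Open one connection and reuse it for every request.
def ask_times(port, count)
  sock = TCPSocket.new('127.0.0.1', port)
  replies = Array.new(count) do
    sock.write('t')
    sock.read(8).unpack('q>').first
  end
  sock.close
  replies
end

server, thread = start_time_server
times = ask_times(server.addr[1], 3)
thread.join
server.close

# The same value wrapped in a made-up XML element, for a size comparison.
xml = "<timestamp>#{times.first}</timestamp>"
puts "binary reply: 8 bytes, XML equivalent: #{xml.bytesize} bytes"
puts times.inspect
```

Three requests share one TCP handshake here, and each reply is a fixed 8 bytes instead of a few dozen bytes of markup that would also need parsing.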

Thrift drawbacks:

  • Integrating generated source into your build system can be tricky. Typically, you rarely have to regenerate your stubs but debugging generated code can be a huge pain.
  • Its Java server should move away from ServerSocket to NIO for increased throughput. That's probably not more than a week's work as long as the existing code isn't too tightly coupled.
  • Currently it doesn't build cleanly on the Mac. I did some work and got it working but I don't think it's used extensively on the Mac so if that's your primary platform, you should be prepared to send them patches from time to time.

If you're looking to move towards decoupled services, Thrift is worth a hard look.

Here's a tarball with my time server. It contains all the generated code as well as libthrift.jar and a Makefile to run the example server.

Sunday, April 06, 2008

GVN and gold

Two things popped up on my radar recently:

gvn, Google's wrappers around Subversion to help them work in their code-review heavy workflow. Even if you're not into code reviews, tkdiff integration is a nice improvement over colordiff or FileMerge.

gold, a new ELF linker built with giant binaries in mind. When you're building 900MB+ static binaries routinely, linking speed matters. gold claims to be at least 5x faster currently. Even if you have a massive distcc cluster, linking is still serial. One of gold's future design goals is to be concurrent and that would be pretty awesome. Imagine how fast I could link with a concurrent linker on my 8-core Mac Pro! Not that using an ELF linker under Leopard helps much since OS X uses Mach-O binaries but hey, there's always cross-compiling.

BTW, Ian Lance Taylor, the author of gold, has an excellent series of blog articles on linkers.