When you find yourself working on big systems, a useful technique is to decompose them into services. Moving from a big monolithic server to a bunch of separate services can be a big challenge, but if you had foresight, many of your services were already decoupled in your system from day 1 even though you were deploying them monolithically.
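To make "decoupled from day 1" a bit more concrete, here is a minimal sketch in Ruby. Every name in it (`LocalClock`, `FixedClock`, `Greeter`) is hypothetical and not taken from any real system; the only point is that the caller depends on a narrow interface, so an in-process object could later be swapped for a client that talks to a remote service.

```ruby
# Hypothetical example: every class name here is made up for illustration.

# In-process implementation of a tiny "time service" interface.
class LocalClock
  # Returns the current time in milliseconds since the epoch.
  def time
    (Time.now.to_f * 1000).to_i
  end
end

# Stand-in for a future RPC client; it exposes the same #time method.
class FixedClock
  def initialize(millis)
    @millis = millis
  end

  def time
    @millis
  end
end

# A caller that only knows its collaborator responds to #time.
# It does not care whether the call stays in-process or crosses the network.
class Greeter
  def initialize(clock)
    @clock = clock
  end

  def greeting
    "The time is #{@clock.time}"
  end
end

puts Greeter.new(LocalClock.new).greeting
puts Greeter.new(FixedClock.new(1_200_000_000_000)).greeting
```

Because `Greeter` never names a concrete clock class, moving the clock into its own server later should only change the object passed to the constructor.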
A common technique for decomposing services is using RPC. At Google, we used protocol buffers, which were briefly described in the Sawzall paper.
Basically, you describe your data and the interfaces that process the data in a language-independent format (a DDL, essentially) and use code generators to turn that DDL into a set of objects in your target language that can create and send those structures over the wire. This makes it easy to write servers in one language and clients in another, and the generated code deals with serialization.
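To give a feel for the kind of code a generator saves you from writing, here is a hand-written sketch in Ruby. It is not output from any real generator: `TimeReply`, `to_wire`, and `from_wire` are hypothetical names, and the 8-byte big-endian layout is just an assumption picked for the example.

```ruby
# Hypothetical hand-written stand-in for generated serialization code.
# A real code generator would emit the equivalent of this for you.

# A "message" with a single 64-bit timestamp field.
TimeReply = Struct.new(:timestamp) do
  # Encode the timestamp as 8 bytes: signed 64-bit, big-endian.
  def to_wire
    [timestamp].pack('q>')
  end

  # Decode 8 bytes back into a TimeReply.
  def self.from_wire(bytes)
    new(bytes.unpack1('q>'))
  end
end

reply = TimeReply.new(1_200_000_000_000)
bytes = reply.to_wire
puts bytes.bytesize                        # 8
puts TimeReply.from_wire(bytes).timestamp  # 1200000000000
```

Any language that can read a signed 64-bit big-endian integer could decode those bytes, which is what lets the server and client be written in different languages.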
I found that using a DDL to describe your code and services was really nice. When building a new service, you could simply reference your DDL in the design doc and have a meaningful discussion about the service without getting into the details of how it would be written until you had the semantics nailed down.
As Facebook grew, they decided to move to Thrift, a homegrown binary RPC mechanism similar to protocol buffers.
Let's say I wanted to write a simple service that would tell the client what time it was on the server. Here would be the DDL file describing both the data and the service plus a little extra to help out the generated code files.
# time.thrift
namespace java tserver.gen
namespace ruby TServer.Gen

typedef i64 Timestamp

service TimeServer {
  // Simply returns the current time.
  Timestamp time()
}
After running thrift --gen java --gen rb time.thrift on the file, I'd have an interface and server that I could implement in Java and a client that I could use in Ruby.
Based on the generated Java code, I could write a short server in Scala:
package tserver
import tserver.gen._
import com.facebook.thrift.TException
import com.facebook.thrift.TProcessor
import com.facebook.thrift.TProcessorFactory
import com.facebook.thrift.protocol.TProtocol
import com.facebook.thrift.protocol.TProtocolFactory
import com.facebook.thrift.transport.TServerTransport
import com.facebook.thrift.transport.TServerSocket
import com.facebook.thrift.transport.TTransport
import com.facebook.thrift.transport.TTransportFactory
import com.facebook.thrift.transport.TTransportException
import com.facebook.thrift.server.TServer
import com.facebook.thrift.server.TThreadPoolServer
import com.facebook.thrift.protocol.TBinaryProtocol
/**
 * TimeServer.time returns the current time according to the server.
 */
class TimeServer extends TimeServer.Iface {
  override def time: Long = {
    val now = System.currentTimeMillis
    println("somebody just asked me what time it is: " + now)
    now
  }
}

object SimpleServer extends Application {
  try {
    val serverTransport = new TServerSocket(7911)
    val processor = new TimeServer.Processor(new TimeServer())
    val protFactory = new TBinaryProtocol.Factory(true, true)
    val server = new TThreadPoolServer(processor, serverTransport,
      protFactory)
    println("starting server")
    server.serve();
  } catch {
    case x: Exception => x.printStackTrace();
  }
}
(Geez, most of that space was taken up in my obsessive need to separate out all my imports. You can thank Google for that bit of OCD.)
The client is even shorter:
#!/usr/bin/ruby
$:.push('~/thrift/lib/rb/lib')
$:.push('../gen-rb')
require 'thrift/transport/tsocket'
require 'thrift/protocol/tbinaryprotocol'
require 'TimeServer'
transport = TBufferedTransport.new(TSocket.new("localhost", 7911))
protocol = TBinaryProtocol.new(transport)
client = TimeServer::Client.new(protocol)
transport.open()
puts "I wonder what time it is. Let's ask!"
puts client.time()
The Ruby client took about 20ms to get an answer from the Scala server.
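If you want to take a similar measurement yourself, one option is Ruby's standard `Benchmark` module. This is only a sketch and not necessarily how the number above was obtained: `timed_ms` is a hypothetical helper, and `fake_call` is a stand-in so the snippet runs without a server. With the client above you would time a block that calls `client.time()` instead.

```ruby
require 'benchmark'

# Hypothetical stand-in for the remote call, so this runs without a server.
fake_call = lambda { (Time.now.to_f * 1000).to_i }

# Runs the given block once and returns [result, elapsed_milliseconds].
def timed_ms
  result = nil
  elapsed_seconds = Benchmark.realtime { result = yield }
  [result, elapsed_seconds * 1000.0]
end

answer, ms = timed_ms { fake_call.call }
puts "got #{answer} in #{format('%.3f', ms)}ms"
```

A single call like this includes connection setup and any warm-up on the server side, so timing several calls in a row and looking at the spread would likely give a fairer picture.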
Thrift advantages:
Thrift drawbacks:
If you're looking to move towards decoupled services, Thrift is worth a hard look.
Here's a tarball with my time server. It contains all the generated code as well as libthrift.jar and a Makefile to run the example server.
Two things popped up on my radar recently:
gvn, Google's wrappers around Subversion to help them work in their code-review heavy workflow. Even if you're not into code reviews, tkdiff integration is a nice improvement over colordiff or FileMerge.
gold, a new ELF linker built with giant binaries in mind. When you're building 900MB+ static binaries routinely, linking speed matters. gold claims to be at least 5x faster currently. Even if you have a massive distcc cluster, linking is still serial. One of gold's future design goals is to be concurrent and that would be pretty awesome. Imagine how fast I could link with a concurrent linker on my 8-core Mac Pro! Not that using an ELF linker under Leopard helps much since OS X uses Mach-O binaries but hey, there's always cross-compiling.
BTW, Ian Lance Taylor, the author of gold, has an excellent series of blog articles on linkers.