2010
04.27
Over on the Cloudera blog I’ve written an article that should be of interest to readers of this blog.
I’m no great fan of the ubiquity of the CAP theorem – it’s a solid impossibility result which appeals to the theorist in me, but it doesn’t capture every fundamental tension in a distributed system. For example: we make our systems distributed across more than one machine usually for reasons of performance and to eliminate a single point of failure. Neither of these motivations are captured verbatim by the CAP theorem. There’s more to designing distributed systems!
In this, I agree with Stonebraker; it’s the erroneous representation of ‘partition tolerance’ that I found very strange. I’ve been a good deal more forceful in private about this than I have in public
2010
03.24
Students! Over at Apache ZooKeeper we’re looking for great students with a strong interest in distributed systems to work with us over the summer as part of Google’s Summer of Code, 2010.
Summer of Code is a great program – providing stipends to students and more importantly connecting them with mentors in open source projects. ZooKeeper has a number of interesting projects to get started on.
ZooKeeper is a distributed coordination platform on which you can build the distributed equivalents of many traditional concurrent primitives like locks, queues and barriers. It’s heavily used in the real world – Yahoo! use it extensively, and many other major companies rely on it. The other committers and I are actively looking to increase participation in the project – there is loads of really interesting work left to do. If consensus protocols, distributed systems, scalability, fault-tolerance and performance are your thing, this is certainly the project for you.
If you have any questions at all, drop me a line at henry at apache dot org.
2010
01.29
Category:
Note /
Tags: no tag /
Thanks to those of you who bugged me about putting images back up. They should all (except one that’s causing me a bit of a headache) be back now, after I discovered copies of all in an dusty corner of my hard drive.
Real post with actual content coming in the next week or so!
2009
12.06
Astute readers may have noticed that this blog was unavailable for the past three-and-a-half months. The short reason for this was that the computer on which Paper Trail was hosted was on a boat in the Atlantic and it is surprisingly hard to get a good internet connection out there, let alone a power supply to the shipping crate it was in.
The long reason was that I have now moved across the ocean, from Cambridge to San Francisco, where I am living in order that the commute to Cloudera be a little shorter than 24 hours. My possessions finally arrived on Thursday – over three months since we shipped them – and we are finally getting everything in order, including getting this blog back online. I’m now hosted at Bluehost, and have transferred all the old posts over to the new WordPress installation. I don’t yet have access to the images, so unfortunately those wil have to wait, but most other things are in order. Please excuse the dust as I find out which links are broken.
The old address – http://hnr.dnsalias.net/wordpress – will still work, but the correct link is now our very own domain: http://the-paper-trail.org/. Hopefully we will start showing up in Google again when the crawlers do their thing. Please update your rss readers – those of you who are still following and haven’t deleted my feed in disgust. Does anyone even still use rss readers anymore?
I am in the process of deciding what to write about for the next few posts. I still have the remainder of the theory of computation posts to write, which would be fun to do but is a bit of a departure from the systems focus. I never really fully explained Byzantine Fault Tolerance – at least, I never got as far as describing Zyzzyva and other modern systems. At the same time, some interesting stuff in systems research has happened since the blog went quiet – Google released the Go programming language which is intriguing for writing user-space systems software. SOSP 2009 happened, with some very cool papers which I really want to write about. And I’ve been busy myself – I was recently made a committer on the Apache ZooKeeper project, which is a distributed coordination system written by some engineers and researchers at Yahoo!, and is very cool. My largest contribution was a patch for ‘observers’ – which are listeners, in Paxos terminology – which help maintain the read performance of the cluster as the number of clients scales.
So, lots going on, plenty to write about, and some exciting possibilities coming down the queue. Good to be back.
2009
08.12
This is a really great article. Sean Quinlan talks very openly and critically about the design of the Google File System given ten years of use (ten years!).
What’s interesting is that the general sentiment seems to be that the concessions that GFS made for performance and simplicity (single master, loose consistency model) have turned out to probably be net bad decisions, although they probably weren’t at the time.
There are scaling issues with GFS – the well known many-small-files problem that also plagues HDFS, and a similar huge-files problem. Both increase the amount of metadata that a master has to maintain, and both therefore degrade performance due to some linear costs on metadata operations.
A replacement for GFS is in the works – a complete re-do, not just incremental updates to the design. This will involve, at least, a rethink of metadata storage and clearly a more fault-tolerant and scalable design. Multiple appenders, also a bugbear of HDFS, may be serialised through a single writer process, which would help the consistency issues.
“Our user base has definitely migrated from being a MapReduce-based world to more of an interactive world that relies on things such as BigTable. Gmail is an obvious example of that. Videos aren’t quite as bad where GFS is concerned because you get to stream data, meaning you can buffer. Still, trying to build an interactive database on top of a file system that was designed from the start to support more batch-oriented operations has certainly proved to be a pain point.”
One of those great articles where every paragraph is rich with insight. Worth reading, probably twice.
2009
06.29
The accepted papers for SOSP 2009 are here. As ever, some excellent looking papers. If you search for the titles you can often turn up drafts or even the submitted versions.
The best looking sessions to me are ‘scalability’ and ‘clusters’, but there’s at least one great looking title in every session. I’ll start posting some reviews once I find some bandwidth (and have finished the computation theory series – next one on its way).
Congratulations to all accepted authors!
2009
06.07
(This article represents a brief divergence from our usual focus on distributed systems and algorithms. Normal service will be resumed shortly.)
One of the courses I do some teaching for is the dryly named ‘Theory of Computation’. Although such a course name is unlikely to pique a lot of interest, in actual fact it is probably the most fundamental and important course that you could take about computer science. Computation theory allows us to ask, and answer, questions about the limits of computers and the programs they execute. Is every problem solvable by a computer program? What do we even mean by a ‘problem’?
It turns out that there are profound and beautiful truths to be discovered when considering these questions. In a short series of articles, I’m going to give a brief introduction to the theory of computability, aimed at those who have never studied Turing machines, Cantor diagonalisation or Godel incompleteness. An appreciation for some of these ideas and results should give a perspective on what computers are, and are not, capable of.
This first article presents a bit of background material on the nature of infinity, which is right at the heart of computation theory. In the next article I’ll tie this in with the idea of computability and show how these ideas elegantly allow us to establish limits on what we can do with computers – even in theory, let alone practice.
Read More >>
2009
05.29
I’ve been away from this blog for far longer than I hoped. I’ve got plenty of content queued up though, so in the next few days I should start to get my postings back in order. Rumours of my death are greatly exaggerated; of my gainful employment less so. That, clearly, has been taking up some time, but I’m getting into my groove now.
A small conciliatory nugget that regular readers of this blog might find interesting: a trademark Robinson monster post on Cloudera’s blog describing how to build a simple distributed queue with Apache ZooKeeper.
More on ZooKeeper and our usual distributed systems programming to follow.
2009
04.09
Although I try and keep personal information to a relative minimum on this blog, here’s some news that’s relevant. Recently I accepted an offer to start work at Cloudera, a young company in the San Francisco area. Initially I’ll be working from the UK, with a view to a permanent move out to California when timing and visas allow.
Hadoop is Cloudera’s business. Hadoop is an open-source implementation of Google’s MapReduce Cloudera provides support for Hadoop, and their own fully supported distribution of the Hadoop toolset. Hadoop allows very large scale distributed and parallel processing of huge data sets in a fault-tolerant and efficient manner. Data sets are getting bigger all the time, and there’s a mismatch now between the desire and the ability of companies to handle and process all those data. Cloud computing, at least in the form of dynamic provision of computing resources, and distributed processing technologies such as Hadoop are helping to make this problem tractable. Cloudera provides the expertise and the experience to help businesses make best use of this seriously powerful tech.
I’m joining, as you might expect, as a distributed systems engineer. There are plenty of interesting problems to attack, both in Hadoop and the ecosystem of technologies that support and extend it, such as HDFS, Hbase, Pig, Hive and ZooKeeper. Cloudera already has a killer team (see some of the press from the New York Times), and I’m really looking forward to being a part of it.
2009
03.30
Barbara Liskov has just been announced as the recipient of the 2008 Turing Award, which is one of the most important prizes in computer science, and can be thought of as our field’s equivalent to the various Nobel Prizes. Professor Liskov is a worthy recipient of the award, even if judged alone by her citation which lists a number of the important contributions she has made to operating systems, programming languages and distributed systems.
Professor Liskov seems to be particularly well known for the Liskov substitution principle which says that some property of a supertype ought to hold of its subtypes. I’m not in any position to speak as to the importance of this contribution. However, her more recent work has been regarding the tolerance of Byzantine failures in distributed systems, which is much more close to my heart.
The only work of Liskov’s that I am really familiar with is the late 90s work on Practical Byzantine Fault Tolerance with Miguel Castro and is first published in this OSDI ’99 paper. I’m not going to do a full review, but the topic sits so nicely with my recent focus on consensus protocols that it makes sense to briefly discuss its importance.
Read More >>