- Reasoning about Knowledge
- Toward a Cloud Computing Research Agenda (2009) –
“One of the LADIS attendees commented at some point that Byzantine Consensus could be used to improve Chubby, making it tolerant of faults that could disrupt it as currently implemented. But for our keynote speakers, enhancing Chubby to tolerate such faults turns out to be of purely academic interest.”
- Low-level data structures –
The llds general working thesis is: for large memory applications, virtual memory layers can hurt application performance due to increased memory latency when dealing with large data structures. Specifically, data page tables/directories within the kernel and increased DRAM requests can be avoided to boost application memory access.
- High-Performance Concurrency Control for Main-Memory Databases (via High Scalability) – MVCC is interesting and elegant, and also underpins some datastores with persistence, like HBase. I like this paper as the best survey.
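To put rough numbers on the llds thesis about page tables, here is a back-of-the-envelope sketch. It is not taken from the llds project. It assumes x86-64 style paging with 8-byte page-table entries, and the 64 GiB working set and the helper name `leaf_page_table_bytes` are made up for illustration. It counts only the last-level entries needed to map the working set, once with 4 KiB pages and once with 2 MiB pages.

```python
# Assumptions (illustrative, not from llds): x86-64 style paging,
# 8-byte page-table entries, and a hypothetical 64 GiB working set.
PTE_BYTES = 8
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30


def leaf_page_table_bytes(mapped_bytes: int, page_bytes: int) -> int:
    """Bytes of last-level page-table entries needed to map mapped_bytes."""
    pages = -(-mapped_bytes // page_bytes)  # ceiling division
    return pages * PTE_BYTES


working_set = 64 * GIB
small_pages = leaf_page_table_bytes(working_set, 4 * KIB)  # 4 KiB pages
huge_pages = leaf_page_table_bytes(working_set, 2 * MIB)   # 2 MiB pages

print(f"4 KiB pages: {small_pages // MIB} MiB of leaf entries")
print(f"2 MiB pages: {huge_pages // KIB} KiB of leaf entries")
```

Under these assumptions the 4 KiB mapping needs 128 MiB of leaf entries, far more than any TLB or cache can hold, so a miss can cost several extra memory references for the page walk. That is the kind of overhead the thesis is pointing at.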
During his retirement, my father has been able to spend much time indulging his love of mathematics. This included, amongst other impressive endeavours, attending Cambridge at a more advanced age than average to take (and pass!) the Part III of the Mathematical Tripos, often considered one of the hardest taught courses in maths in the world.
Having completed this monumental piece of work, it seemed only proper to share it a little more widely so that other students might benefit from his efforts – and that’s where I come in, since I’m the one with the website. So if you have any passing interest in 19th / 20th century modern algebra, I encourage you to check out Noel Robinson’s translation of “Theory of Algebraic Functions of One Variable”, hosted on this site.
EuroSys 2012 was last week – one of the premier European systems conferences. Over at the Cambridge Systems Research Group’s blog, various people from the group have written notes on the papers presented. They’re very well-written summaries, and worth checking out for an overview of the research presented.
A smart student asked me a couple of days ago whether I thought taking a 2xx-level reading course in operating systems was a good idea. The student, understandably, was unsure whether talking about these systems was as valuable as actually building them, and also whether, since his primary interest is in ‘distributed’ systems, he stood to benefit from a deep understanding of things like virtual memory.
I’ll be giving a talk at this year’s Strata Conference in Santa Clara, on February 29th. My talk is called “Monitoring Apache Hadoop – A Big Data Problem?” I’d be lying if I said that every slide was fully realised at this point, but you can read the abstract to see what I’ve committed myself to. The general idea is that building large-scale shared-nothing distributed systems is at most half the problem in making them a reality. Managing these systems day-to-day requires understanding and analysing a serious amount of data, so there’s a nice cycle here: you might be able to use the very data processing systems you’re trying to understand in order to understand them. I’ll try to tie the whole thing together with a discussion of failure. The thesis is that partial failure in distributed systems is both to blame for the incidents we’re trying to understand and what makes understanding them so difficult. I believe this is true in a very fundamental sense, so I’ll make that case and also talk about what is to be done.
(And if I’m not a big enough draw – perish the thought – there are many, many other interesting sessions. In particular, Josh will be talking about Crunch, and Sarah will be giving both introductory and advanced Hadoop classes – both people I work with, and both fantastic speakers!)