When thinking about web application design, it doesn’t take long before the realisation hits you: to achieve any kind of efficiency you need to ignore or abuse key principles of relational database design, employing strategies such as sharding your information across multiple databases, abandoning the use of transactions, and the wholesale denormalisation of data.
Taken to the extreme, the logical conclusion is that all you really need is a robust key-value (or tuple) store and a suitable way to query it.
This is the essence of the NoSQL movement: many applications simply don’t suit a traditional relational database. Don’t get me wrong, I’m not a SQL-hater. If you have truly relational data with a fixed schema and complex transaction requirements then you can’t beat a good old RDBMS (e.g. Oracle, SQL Server, MySQL or Postgres). However, it’s rare for the data involved to actually have a fixed structure, especially with ever more data being user-generated. Yes, there may be similarities between groups of records, and sometimes even some common fields, but it doesn’t make sense to try to shoe-horn this freeform data into a relational database.
A multitude of solutions to this problem have emerged, ranging from extremely fast and scalable key-value stores such as Tokyo Cabinet, Amazon’s SimpleDB and Facebook’s Cassandra, through to more feature-rich document-oriented databases like MongoDB and CouchDB, which help to bridge the gap to full-blown relational databases.
As my colleagues will testify, I’ve recently been getting particularly excited about CouchDB, a document-store which has just been released in beta. With Couch and its document-oriented cousins, data is held in schema-less “documents”, which can have any number of arbitrary attributes. This makes them perfect for storing collections of similarly (but not identically) structured items – think business cards, or invoices in the ‘real world’.
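To make the idea of schema-less documents a little more concrete, here is a minimal sketch in Python. It is not CouchDB's API: the `store` dictionary and the `save` and `find` helpers are made up purely for illustration, and a real document database would persist and index the data rather than scan it in memory. The point is just that each document carries whatever attributes it happens to have.

```python
# Hypothetical in-memory "document store": a plain dict mapping ids to documents.
# This only illustrates the data model; it is not how CouchDB is implemented.
store = {}


def save(doc_id, doc):
    """Store a document (a dict of arbitrary key-value pairs) under an id."""
    store[doc_id] = doc


def find(**criteria):
    """Return the sorted ids of documents whose fields equal all the criteria."""
    return sorted(
        doc_id
        for doc_id, doc in store.items()
        if all(doc.get(key) == value for key, value in criteria.items())
    )


# Two business cards with different attributes, plus an unrelated invoice.
save("card-1", {"type": "business_card", "name": "Alice", "email": "alice@example.com"})
save("card-2", {"type": "business_card", "name": "Bob", "phone": "555-0100", "twitter": "@bob"})
save("inv-1", {"type": "invoice", "customer": "Alice", "total": 120.0})

cards = find(type="business_card")   # both cards, despite their different fields
tweeters = find(twitter="@bob")      # only documents that have this attribute
```

Notice that nothing had to be declared up front: the second card simply has a `twitter` field that the first one lacks, and the invoice shares no fields with either beyond `type`.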
Starting to think in terms of documents (which are essentially just groups of related key-value pairs) takes a bit of getting used to if you come from the world of relational databases, but before long it becomes apparent that for many applications it makes life much easier. A set of constraints is enforced upon the programmer, but this inflexibility shapes the design of your software into a scalable, robust, understandable and easy-to-maintain form. This rigidity is balanced by freedom of choice in how the data can be stored and accessed, leading to rich, innovative applications for the end-user.
Ric Roberts – 21st Oct 2009