Wednesday, August 4, 2010

Introduction to Cassandra

Cassandra is an alternative to MySQL, Oracle and other database managers for workloads with very large numbers of queries.

It is suitable for fully distributed and highly scalable databases, and therefore for very large amounts of data and requests.

The distributed model stores information on many different servers. In Cassandra these servers cooperate as peers, with no central master node.
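To make the idea of spreading data over many servers more concrete, here is a minimal sketch of hash-based placement: each row key is hashed to a position on a ring, and the row belongs to the next server on that ring. The `Ring` class, the `token` helper and the server names are all hypothetical, and this is a simplification for illustration, not Cassandra's actual partitioning code.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a large integer position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: every server owns the keys that hash just before its token."""

    def __init__(self, nodes):
        # Sort servers by their position on the ring.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def node_for(self, row_key: str) -> str:
        # Find the first server at or after the key's position, wrapping around.
        i = bisect.bisect_right(self._tokens, token(row_key)) % len(self._ring)
        return self._ring[i][1]


nodes = ["server-a", "server-b", "server-c"]  # hypothetical server names
ring = Ring(nodes)
owner = ring.node_for("user:42")  # the same key always maps to the same server
```

Because placement depends only on the hash of the key, any machine that knows the list of servers can work out where a row lives.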

Cassandra is based on BigTable, the non-relational data model created by Google and used for the index of its search engine, combined with ideas from Dynamo, the storage system from Amazon.

It was open-sourced by Facebook in 2008 and has since been supported by the Apache Foundation.

SQL or not SQL?

Cassandra is part of NoSQL, the movement that wants to simplify databases by removing the relational aspect.

Tables no longer have a predetermined, fixed schema; the structure can change at any time. They can grow horizontally (new columns) as well as vertically (new rows, that is, new records).
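The flexible schema described above can be pictured as a map of maps, where each row carries its own set of columns. The following is a minimal sketch in plain Python: the `users` table, the `insert` helper and the sample values are hypothetical, and this is not Cassandra's client API, only an illustration of the shape of the data.

```python
from collections import defaultdict

# Toy in-memory model: row key -> {column name -> value}.
users = defaultdict(dict)  # hypothetical table named "users"


def insert(table, row_key, column, value):
    """Set one column on one row, creating the row if it does not exist yet."""
    table[row_key][column] = value


insert(users, "alice", "email", "alice@example.com")
insert(users, "alice", "city", "Paris")
insert(users, "bob", "email", "bob@example.com")
insert(users, "bob", "twitter", "@bob")  # a column that alice's row does not have

# Each row lists only the columns it actually holds.
columns_per_row = {key: sorted(cols) for key, cols in users.items()}
```

Adding the `twitter` column for one row requires no change to the other rows, which is the horizontal growth mentioned above.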

NoSQL is usually read as Not Only SQL: the name is about moving beyond the relational model, not about banning a query language. Cassandra itself, however, is not queried with standard SQL.

Cassandra vs. MySQL

Apache has published the following performance figures:

Writing: MySQL: 300 ms. Cassandra: 0.12 ms.

Reading: MySQL: 350 ms. Cassandra: 15 ms.



Facebook is the origin of Cassandra, even though the project was later integrated into Apache.

Software powering Facebook.


Twitter does not yet use Cassandra to manage tweets, because that would mean rewriting its system, but it does use it for statistical data and geolocation.


Complaining of slowness in MySQL, Digg has decided to completely reimplement its data management on Cassandra.

Why Digg replaces MySQL.


Cassandra was originally designed for Facebook. More and more organizations are adopting it, but it is a new product that has not yet been tested against the full variety of uses it could be put to. We can expect setbacks when adapting it to a new application.

