“Ding, dong! The witch is dead.
Which old witch? The wicked witch!
Ding, dong! The wicked witch is dead.
Wake up, sleepy head,
Rub your eyes, get out of bed.
Wake up; the wicked witch is dead.
She’s gone where the goblins go,
Below, below, below; yo ho!
Let’s open up and sing and ring the bells out.
Ding, dong, the merry oh;
Sing it high, sing it low.
Let them know the wicked witch is dead.”
—From Wizard of Oz (YouTube video)
Amazon created an in-house product called Dynamo to meet the performance, scalability, and availability needs of its e-commerce platform after concluding that mainstream database management systems were not capable of satisfying those needs. The most notable aspect of Dynamo was its apparent break with the relational model: there was no mention of relations, relational algebra, or SQL.
Amazon had started out using Oracle Database for its e-commerce platform but later switched to Dynamo, the archetypal NoSQL product; it embodies all the innovations of the NoSQL camp. The Dynamo requirements and assumptions are documented in the paper “Dynamo: Amazon’s Highly Available Key-value Store” (http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf), published in 2007. Here are some excerpts from that paper:
“Customers should be able to view and add items to their shopping cart even if disks are failing, network routes are flapping, or data centers are being destroyed by tornados. Therefore, the service responsible for managing shopping carts requires that it can always write to and read from its data store, and that its data needs to be available across multiple data centers.”
“There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these applications.”
“Experience at Amazon has shown that data stores that provide ACID guarantees tend to have poor availability.”
“Dynamo targets applications that operate with weaker consistency (the “C” in ACID) if this results in high availability.”
“… since each service uses its distinct instance of Dynamo, its initial design targets a scale of up to hundreds of storage hosts.”
To paraphrase, Amazon’s requirements were extreme performance, extreme scalability, and extreme availability.
Amazon’s pivotal design decision was to break its monolithic enterprise-wide database service into simpler component services, such as a best-seller list service, a shopping cart service, a customer preferences service, a sales rank service, and a product catalog service. This avoided a single point of failure. In an interview for the NoCOUG Journal, Amazon’s first database administrator, Jeremiah Wilton, explains the rationale behind Amazon’s approach:
“The best availability in the industry comes from application software that is predicated upon a surprising assumption: The databases upon which the software relies will inevitably fail. The better the software’s ability to continue operating in such a situation, the higher the overall service’s availability will be. But isn’t Oracle unbreakable? At the database level, regardless of the measures taken to improve availability, outages will occur from time to time. An outage may be from a required upgrade or a bug. Knowing this, if you engineer application software to handle this eventuality, then a database outage will have less or no impact on end users. In summary, there are many ways to improve a single database’s availability. But the highest availability comes from thoughtful engineering of the entire application architecture.” (http://www.nocoug.org/Journal/NoCOUG_Journal_200711.pdf#page=4)
As an example, the shopping cart service should not be affected if the checkout service is unavailable or not performing well.
I said that this was the pivotal design decision made by Amazon, and I cannot emphasize it enough: if you resist functional segmentation, you are not ready for NoSQL and will not understand it.
Note that functional segmentation results in simple hierarchical schemas. An example of such a schema comes from Ted Codd’s 1970 paper on the relational model (http://www.nocoug.org/Journal/NoCOUG_Journal_201111.pdf#page=10): an employee table together with dependent jobhistory, salaryhistory, and children tables, which store information about employees, their children, their job histories, and their salary histories.
Functional segmentation is the underpinning of NoSQL technology, but it does not present a conflict with the relational model; it is simply a physical database design decision. Each functional segment is usually assigned its own standalone database. The collection of functional segments could be regarded as a single distributed database; however, distributed transactions are forbidden in the NoSQL world. Functional segmentation can therefore result in temporary inconsistencies if, for example, the shopping cart data is not in the same database as the product catalog. As an Amazon customer, I occasionally leave items in my shopping cart but don’t complete a purchase. When I resume shopping, I sometimes get a notification that an item in my shopping cart is no longer in stock or has been repriced. This technique is called “eventual consistency.” Randy Shoup, one of the architects of eBay’s e-commerce platform, explains:
“At eBay, we allow absolutely no client-side or distributed transactions of any kind – no two-phase commit. In certain well-defined situations, we will combine multiple statements on a single database into a single transactional operation. For the most part, however, individual statements are auto-committed. While this intentional relaxation of orthodox ACID properties does not guarantee immediate consistency everywhere, the reality is that most systems are available the vast majority of the time. Of course, we do employ various techniques to help the system reach eventual consistency: careful ordering of database operations, asynchronous recovery events, and reconciliation or settlement batches. We choose the technique according to the consistency demands of the particular use case.” (http://www.infoq.com/articles/ebay-scalability-best-practices)
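As a sketch of the last of those techniques (with hypothetical table and column names), a reconciliation batch might be a periodic SQL statement that flags shopping-cart items whose catalog entries have changed since they were added:

-- Hypothetical reconciliation batch: flag cart items whose prices no
-- longer match the product catalog, so the application can notify the
-- customer at an opportune time, such as at checkout.
UPDATE cart_items c
SET c.status = 'REPRICED'
WHERE EXISTS (SELECT 1
              FROM product_catalog p
              WHERE p.product_id = c.product_id
              AND p.price <> c.price);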
Amazon’s next design decision was “sharding” or horizontal partitioning of all the tables in a hierarchical schema. Hash-partitioning is typically used. Each table is partitioned in the same way as the other tables in the schema, and each set of partitions is placed in a separate database referred to as a “shard.” The shards are independent of each other; that is, there is no clustering as in Oracle RAC.
Note that the hierarchical schemas that result from functional segmentation are always shardable by definition: every table in the hierarchy carries the root table’s key, so all the tables can be partitioned on that common key.
Returning to the example from Ted Codd’s 1970 paper on the relational model: the jobhistory, salaryhistory, and children tables have composite keys, and in each case the leading column of the composite key is the employee#. Therefore, all four tables can be partitioned using the employee#.
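Here is a minimal sketch in Oracle SQL of how such consistent partitioning might be set up (the column types and the choice of four hash partitions are hypothetical); the same idea underlies placing each set of corresponding partitions in a separate database, i.e., a shard:

-- Every table in the hierarchical schema is hash-partitioned on its
-- leading key column, employee#, in exactly the same way, so all the
-- rows belonging to one employee stay together.
CREATE TABLE employees (
  employee# NUMBER(6),
  name      VARCHAR2(50),
  birthdate DATE
)
PARTITION BY HASH (employee#) PARTITIONS 4;

CREATE TABLE jobhistory (
  employee# NUMBER(6),
  jobdate   DATE,
  title     VARCHAR2(50)
)
PARTITION BY HASH (employee#) PARTITIONS 4;

-- salaryhistory and children would be partitioned identically, so each
-- set of corresponding partitions could be placed in a separate shard.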
Sharding is an essential component of NoSQL designs, but it does not present a conflict with the relational model; it too is simply a physical database design decision. In the relational model, the collection of standalone databases or shards can be logically viewed as a single distributed database.
The Dynamo developers saw that one of the keys to extreme availability was data replication. Multiple copies of the shopping cart are allowed to exist and, if one of the replicas becomes unresponsive, the data can be served by one of the other replicas. However, because of network latencies, the copies may occasionally get out of sync, and the customer may occasionally encounter a stale version of the shopping cart. Once again, this can be handled appropriately by the application tier; the node that falls behind can catch up eventually, or inconsistencies can be detected and resolved at an opportune time, such as at checkout. This technique is called “eventual consistency.”
The inventor of relational theory, Dr. Codd, was acutely aware of the potential overhead of consistency checking. In his 1970 paper, he said:
“There are, of course, several possible ways in which a system can detect inconsistencies and respond to them. In one approach the system checks for possible inconsistency whenever an insertion, deletion, or key update occurs. Naturally, such checking will slow these operations down [emphasis added]. If an inconsistency has been generated, details are logged internally, and if it is not remedied within some reasonable time interval, either the user or someone responsible for the security and integrity of the data is notified. Another approach is to conduct consistency checking as a batch operation once a day or less frequently.”
In other words, the inventor of relational theory would not have found a conflict between his relational model and the “eventual consistency” that is one of the hallmarks of the NoSQL products of today. The Dynamo developers imagined a conflict because they quite understandably conflated the relational model with the ACID guarantees of database management systems. However, ACID has nothing to do with the relational model per se (although relational theory does come in very handy in defining consistency constraints); pre-relational database management systems such as IMS provided ACID guarantees, and so did post-relational object-oriented database management systems.
The tradeoff between consistency and performance is as important in the wired world of today as it was in Dr. Codd’s world. Synchronous replication is rarely used in the relational camp, so we cannot frown at Dynamo for not using it. Application developers in the relational camp are warned about the negative impact of consistency checking, so we cannot frown on Dynamo’s decision to permit temporary inconsistencies between functional segments.
The final hurdle was extreme performance, and that’s where the Dynamo developers went astray. The Dynamo developers believed that the relational model imposes a “join penalty” and therefore chose to store data as “blobs.” This objection to the relational model is colorfully summarized by the following statement attributed to Esther Dyson, the editor of the Release 1.0 newsletter, “Using tables to store objects is like driving your car home and then disassembling it to put it in the garage. It can be assembled again in the morning, but one eventually asks whether this is the most efficient way to park a car.” The statement dates back to 1988 and was much quoted when object-oriented databases were in vogue.
Since the shopping cart is an object, doesn’t disassembling it for storage make subsequent data retrieval and updates inefficient? The belief stems from an unfounded assumption that has found its way into every relational DBMS—that every table should map to physical storage. In reality, the relational model is a logical model and, therefore, it does not concern itself with storage details at all. It would be perfectly legitimate to store the shopping cart in a physical form that resembled a shopping cart while still offering a relational model of the data complete with SQL. In other words, the physical representation could be optimized for the most important use case—retrieving the entire shopping-cart object using its key—without affecting the relational model of the data. It would also be perfectly legitimate to provide a non-relational API for the important use cases. Dr. Codd himself gave conditional blessing to such non-relational APIs in his 1985 Computerworld article, “Is Your DBMS Really Relational?,” in which he says, “If a relational system has a low-level (single-record-at-a-time) language, that low level [should not] be used to subvert or bypass the integrity rules and constraints expressed in the higher level relational language (multiple-records-at-a-time).”
The key-blob or “key-value” approach used by Dynamo and successor products would be called “zeroth” normal form in relational terminology. In his 1970 paper, “A Relational Model of Data for Large Shared Data Banks,” Dr. Codd says: “Nonatomic values can be discussed within the relational framework. Thus, some domains may have relations as elements. These relations may, in turn, be defined on nonsimple domains, and so on. For example, one of the domains on which the relation employee is defined might be salary history. An element of the salary history domain is a binary relation defined on the domain date and the domain salary. The salary history domain is the set of all such binary relations. At any instant of time there are as many instances of the salary history relation in the data bank as there are employees. In contrast, there is only one instance of the employee relation.”
In common parlance, a relation with non-simple domains is said to be in “zeroth” normal form or unnormalized. Dr. Codd suggested that unnormalized relations should be normalized for ease of use. Here again is the unnormalized employee relation from Dr. Codd’s paper:
employee (employee#, name, birthdate,
  jobhistory (jobdate, title, salaryhistory (salarydate, salary)),
  children (childname, birthyear))
The above unnormalized relation can be decomposed into four normalized relations as follows:

employee' (employee#, name, birthdate)
jobhistory' (employee#, jobdate, title)
salaryhistory' (employee#, jobdate, salarydate, salary)
children' (employee#, childname, birthyear)
However, this is not to suggest that these normalized relations must necessarily be mapped to individual buckets of physical storage. Dr. Codd differentiated between the stored set, the named set, and the expressible set. In the above example, we have one unnormalized relation and four normalized relations. If we preferred, the unnormalized employee relation could be the only member of the stored set. Alternatively, if we preferred, all five relations could be part of the stored set; that is, we could legitimately store redundant representations of the data. However, the common belief blessed by current practice is that the normalized relations should be the only members of the stored set.
Even if the stored set contains only normalized relations, they need not map to different buckets of physical storage. Oracle Database is unique among mainstream database management systems in providing a convenient construct called the “table cluster” that is suitable for hierarchical schemas. In Dr. Codd’s example, employee# would be the cluster key, and rows corresponding to the same cluster key from all four tables could be stored in the same physical block on disk, thus avoiding the join penalty. If the cluster were a “hash cluster,” no indexes would be required for the use case of retrieving records belonging to a single cluster key. A demonstration is available at http://iggyfernandez.wordpress.com/2013/12/30/the-twelve-days-of-nosql-day-six-the-false-premise-of-nosql/.
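Here is a minimal sketch of such a hash cluster in Oracle SQL (the column types and the SIZE and HASHKEYS values are hypothetical choices, not recommendations):

-- A hash cluster keyed on employee#: rows from all clustered tables
-- that share an employee# value hash to the same physical block, so a
-- single-employee lookup needs neither joins across scattered blocks
-- nor indexes.
CREATE CLUSTER employee_cluster (employee# NUMBER(6))
  SIZE 8192 HASHKEYS 1000;

CREATE TABLE employees (
  employee# NUMBER(6),
  name      VARCHAR2(50),
  birthdate DATE
) CLUSTER employee_cluster (employee#);

CREATE TABLE jobhistory (
  employee# NUMBER(6),
  jobdate   DATE,
  title     VARCHAR2(50)
) CLUSTER employee_cluster (employee#);

-- salaryhistory and children would be added to the cluster in the same
-- way, completing Dr. Codd’s four-table example.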
The final innovation of the NoSQL camp is “schemaless design.” In database management systems of the NoSQL kind, data is stored in “blobs” or documents; the database management system does not police their structure. In mainstream database management systems on the other hand, doctrinal purity requires that the schema be designed before data is inserted. Let’s do a thought experiment.
Let’s suppose that we don’t have a schema and that the following facts are known:

Iggy Fernandez is an employee with a salary of $1000; he does not earn a commission.
Mogens Norgaard is a commissioned employee with a commission percentage of 25%.
Morten Egan is a commissioned employee whose commission percentage is unknown.
Could we ask the following questions and expect to receive correct answers?
Question: What is the salary of Iggy Fernandez?
Expected answer: $1000.
Question: What is the commission percentage of Iggy Fernandez?
Expected answer: Invalid question.
Question: What is the commission percentage of Mogens Norgaard?
Expected answer: 25%.
Question: What is the commission percentage of Morten Egan?
Expected answer: Unknown.
If we humans can process the above data and correctly answer the above questions, then surely we can program computers to do so.
The above data could be modeled with three relations along the following lines: one listing all employees and their salaries, one listing just those employees who earn commissions, and one listing the commission percentages that are actually known. It is certainly disruptive to suggest that this be done on the fly by the database management system, but it is not outside the realm of possibility.

EMPLOYEES
 EMPLOYEE_ID     NOT NULL NUMBER(6)
 EMPLOYEE_NAME   NOT NULL VARCHAR2(128)
 SALARY          NOT NULL NUMBER(8,2)

COMMISSIONED_EMPLOYEES
 EMPLOYEE_ID     NOT NULL NUMBER(6)

COMMISSION_PERCENTAGES
 EMPLOYEE_ID     NOT NULL NUMBER(6)
 COMMISSION_PCT  NOT NULL NUMBER(4,2)
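As a sketch (using the hypothetical relations above), a single SQL query can distinguish all three kinds of answers, relying only on the presence or absence of rows:

-- 'Invalid question' if the employee is not commissioned at all,
-- 'Unknown' if commissioned but the percentage is not on record,
-- and the actual percentage otherwise.
SELECT e.employee_name,
       CASE
         WHEN ce.employee_id IS NULL THEN 'Invalid question'
         WHEN cp.commission_pct IS NULL THEN 'Unknown'
         ELSE TO_CHAR(cp.commission_pct) || '%'
       END AS commission_percentage
FROM employees e
LEFT OUTER JOIN commissioned_employees ce
  ON ce.employee_id = e.employee_id
LEFT OUTER JOIN commission_percentages cp
  ON cp.employee_id = e.employee_id;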
Hadapt, a SQL-on-Hadoop company, has already stepped forward with such a feature:
“While it is true that SQL requires a schema, it is entirely untrue that the user has to define this schema in advance before query processing. There are many data sets out there, including JSON, XML, and generic key-value data sets that are self-describing — each value is associated with some key that describes what entity attribute this value is associated with [emphasis added]. If these data sets are stored in Hadoop, there is no reason why Hadoop cannot automatically generate a virtual schema against which SQL queries can be issued. And if this is true, users should not be forced to define a schema before using a SQL-on-Hadoop solution — they should be able to effortlessly issue SQL against a schema that was automatically generated for them when data was loaded into Hadoop.” (http://hadapt.com/blog/2013/10/28/all-sql-on-hadoop-solutions-are-missing-the-point-of-hadoop/)
This is not really new ground. Oracle Database provides the ability to convert XML documents into relational tables (http://docs.oracle.com/cd/E11882_01/appdev.112/e23094/xdb01int.htm#ADXDB0120), though it ought to be possible to view XML data as tables while physically storing it in XML format in order to benefit certain use cases. It should also be possible to redundantly store data in both XML and relational formats in order to benefit other use cases.
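For example (a sketch only, assuming a hypothetical emp_xml table with an XMLType column named doc), Oracle’s XMLTABLE function can expose XML documents as relational rows while the documents themselves remain stored as XML:

-- Each row of emp_xml holds one XML document describing an employee.
SELECT x.employee_id,
       x.salary
FROM emp_xml e,
     XMLTABLE('/employee'
              PASSING e.doc
              COLUMNS employee_id NUMBER(6)   PATH '@id',
                      salary      NUMBER(8,2) PATH 'salary') x;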
In “Extending the Database Relational Model to Capture More Meaning,” Dr. Codd explains how a “formatted database” is created from a collection of facts:
“Suppose we think of a database initially as a set of formulas in first-order predicate logic. Further, each formula has no free variables and is in as atomic a form as possible (e.g., A & B would be replaced by the component formulas A, B). Now suppose that most of the formulas are simple assertions of the form Pab…z (where P is a predicate and a, b, … , z are constants), and that the number of distinct predicates in the database is few compared with the number of simple assertions. Such a database is usually called formatted, because the major part of it lends itself to rather regular structuring. One obvious way is to factor out the predicate common to a set of simple assertions and then treat the set as an instance of an n-ary relation and the predicate as the name of the relation.”
In other words, a collection of facts can always be organized into a collection of relations.
NoSQL databases can be classified into the following categories: key-value stores (e.g., Dynamo and Riak), document stores (e.g., MongoDB and CouchDB), column-family stores (e.g., Cassandra and HBase), and graph databases (e.g., Neo4j).
NoSQL products are numerous and rapidly evolving. There is a crying need for a continuously updated encyclopedia of NoSQL products, but none exists. There is a crying need for an independent benchmarking organization, but none exists. My best advice is to do a proof of concept (POC) as well as a PSR (Performance, Scalability, and Reliability) test before committing to using a NoSQL product. Back in 1985, Dr. Codd had words of advice for those who were debating between the new relational products and the established pre-relational products of the time.
“Any buyer confronted with the decision of which DBMS to acquire should weigh three factors heavily. The first factor is the buyer’s performance requirements, often expressed in terms of the number of transactions that must be executed per second. The average complexity of each transaction is also an important consideration. Only if the performance requirements are extremely severe should buyers rule out present relational DBMS products on this basis. Even then buyers should design performance tests of their own, rather than rely on vendor-designed tests or vendor-declared strategies [emphasis added]. The second factor is reduced costs for developing new databases and new application programs … The third factor is protecting future investments in application programs by acquiring a DBMS with a solid theoretical foundation … In every case, a relational DBMS wins on factors two and three. In many cases, it can win on factor one also—in spite of all the myths about performance.”—“An Evaluation Scheme for Database Management Systems that are claimed to be Relational”
The above advice is as solid today as it was in Dr. Codd’s day.
In May 2011, Oracle Corporation published a white paper titled “Debunking the NoSQL Hype,” the final advice being “Go for the tried and true path. Don’t be risking your data on NoSQL databases.” (https://www.google.com/#q=%2B%22Debunking+the+NoSQL+Hype%22) However, in September of the same year, Oracle Corporation released Oracle NoSQL Database. Oracle suggested that the NoSQL approach was well-suited for certain use-cases:
“The Oracle NoSQL Database, with its ‘No Single Point of Failure’ architecture, is the right solution when data access is “simple” in nature and application demands exceed the volume or latency capability of traditional data management solutions. For example, click-stream data from high volume web sites, high-throughput event processing and social networking communications all represent application domains that produce extraordinary volumes of simple keyed data. Monitoring online retail behavior, accessing customer profiles, pulling up appropriate customer ads and storing and forwarding real-time communication are examples of domains requiring the ultimate in low-latency access. Highly distributed applications such as real-time sensor aggregation and scalable authentication also represent domains well-suited to Oracle NoSQL Database.” (http://www.oracle.com/technetwork/products/nosqldb/learnmore/nosql-wp-1436762.pdf)
Oracle NoSQL Database has two features that distinguish it from other key-value stores. First, a key is the concatenation of a “major key path” and a “minor key path,” and all records with the same major key path are colocated on the same storage node. Second, Oracle NoSQL Database provides transactional support for modifying multiple records with the same major key path.
There are already proofs that performance, scalability, and reliability can be achieved without abandoning the relational model. For example, ScaleBase provides sharding and replication on top of MySQL storage nodes. Another good example to study is VoltDB, which claims to be the world’s fastest OLTP database (though it has never published an audited TPC benchmark). A counterexample to Amazon is eBay, which arguably has equal scale and equally demanding performance, scalability, and reliability requirements. eBay uses functional segmentation, sharding, replication, and eventual consistency but continues to use Oracle Database (and SQL) to manage local databases. I asked Randy Shoup, one of the architects of the eBay e-commerce platform, why eBay did not abandon Oracle Database, and he answered in one word: “comfort.” Here are links to some of his presentations and articles on the eBay architecture:
The latest challenge to NoSQL comes from Google, which recently created a new DBMS called F1 for its business-critical AdWords application. Google implemented a version of Oracle table clusters in order to avoid the join penalty. Here is a quote from Google’s paper “F1: A Distributed SQL Database That Scales”:
“In recent years, conventional wisdom in the engineering community has been that if you need a highly scalable, high-throughput data store, the only viable option is to use a NoSQL key/value store, and to work around the lack of ACID transactional guarantees and the lack of conveniences like secondary indexes, SQL, and so on. When we sought a replacement for Google's MySQL data store for the AdWords product, that option was simply not feasible: the complexity of dealing with a non-ACID data store in every part of our business logic would be too great, and there was simply no way our business could function without SQL queries. Instead of going NoSQL, we built F1, a distributed relational database system that combines high availability, the throughput and scalability of NoSQL systems, and the functionality, usability and consistency of traditional relational databases, including ACID transactions and SQL queries. Google’s core AdWords business is now running completely on F1. F1 provides the SQL database functionality that our developers are used to and our business requires. Unlike our MySQL solution, F1 is trivial to scale up by simply adding machines.” (http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/41344.pdf)
The NoSQL camp put performance, scalability, and reliability front and center but lost the opportunity to take the relational model to the next level because—just like the relational camp—it mistakenly believed that normalization dictates physical storage choices, that non-relational APIs are forbidden by the relational model, and that “relational” is synonymous with ACID (Atomicity, Consistency, Isolation, and Durability).
The NoSQL camp created a number of innovations: functional segmentation, sharding, replication, eventual consistency, and schemaless design. Since these innovations are compatible with the relational model, they should eventually be absorbed by mainstream database management systems.
Finally, I should point out that there are very good reasons to criticize current NoSQL products, including a lack of standards; primitive feature sets, security, and management tools; unproven claims; and traps for the unwary. MongoDB uses a database-wide lock for reads and writes (https://web.archive.org/web/20150316063207/http://docs.mongodb.org/manual/faq/concurrency/). #nuffsaid
Excerpted from the February 2015 issue of the NoCOUG Journal. Please leave your comments below or email me at email@example.com.
“As coroner I must aver,
I thoroughly examined her.
And she’s not only merely dead;
She’s really most sincerely dead.
Then this is a day of independence
For all the munchkins and their descendants (if any).
Yes, let the joyous news be spread:
The wicked old witch at last is dead!”
—From Wizard of Oz (YouTube video)
Copyright © 2015 Iggy Fernandez
3/14/2015 update: This article is based on a twelve-part series of blog posts called “The Twelve Days of NoSQL” written in December 2013. Thanks to A. G. for informing me that, in November 2014, MongoDB began supporting document-level locking for writes when using the optional Wired Tiger storage engine and collection-level locking when using the default MMAPv1 storage engine (one MongoDB database can contain multiple document collections). Prior to that, MongoDB only supported database-level locking for writes. Prior to August 2012, MongoDB only supported instance-level locking for writes (one MongoDB instance can control multiple databases).
Also read www.toadworld.com/.../doom-and-gloom-for-nosql.aspx: everything that NoSQL offers has been offered by Oracle since at least 2001. See en.wikipedia.org/.../Oracle_Database.