The purpose of this post is to clarify how to choose the right database for your project.
In data engineering, data pressure is the ability of a system to process a given amount of data at a reasonable cost or in a reasonable time. It therefore plays a crucial role in highly scalable projects.
So, let's look at the major differences between SQL and NoSQL databases:
SQL: Optimized for storage
NoSQL: Optimized for compute/querying

SQL: Normalized/relational
NoSQL: Denormalized (unnormalized)/hierarchical

SQL: Table-based data structure
NoSQL: Depending on the database, the data structure is one of:
★ Key-value (DynamoDB, Redis, Voldemort)
★ Wide-column, i.e. containers for rows (Cassandra, HBase)
★ Collection of documents (MongoDB, CouchDB, DynamoDB)
★ Graph structures (Neo4j, InfiniteGraph)

SQL: Scales vertically, which is expensive. Can scale horizontally, but that is challenging and time-consuming
NoSQL: Scales horizontally and cheaply

SQL: Fixed schema; altering it requires modifying the whole database
NoSQL: Schemas are dynamic

SQL: Good for OLAP
NoSQL: Good for OLTP at scale

SQL: ACID (Atomicity, Consistency, Isolation, Durability) properties
NoSQL: BASE (Basically Available, Soft state, Eventual consistency) properties
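To make the normalized vs. denormalized distinction concrete, here is a minimal sketch in plain Python. It does not use a real database; the `customers` and `orders` data and both function names are made up for illustration. The first shape stores each fact once and joins at read time, the second embeds related data in one document so a single lookup answers the query.

```python
# Normalized (relational style): each fact is stored once and linked by key.
customers = {1: {"name": "Ada"}}
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.0},
]


def orders_with_names(customers, orders):
    """Join orders to customers at read time, as a relational query would."""
    return [
        {
            "order_id": o["order_id"],
            "customer_name": customers[o["customer_id"]]["name"],
            "total": o["total"],
        }
        for o in orders
    ]


def to_documents(customers, orders):
    """Denormalize: build one document per customer with its orders embedded."""
    docs = {}
    for o in orders:
        cid = o["customer_id"]
        doc = docs.setdefault(
            cid,
            {"customer_id": cid, "name": customers[cid]["name"], "orders": []},
        )
        doc["orders"].append({"order_id": o["order_id"], "total": o["total"]})
    return docs


joined = orders_with_names(customers, orders)
documents = to_documents(customers, orders)
```

The document form answers "give me this customer and all their orders" with one read, at the price of copying the customer name into every document that needs it.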
When to choose NoSQL?
NoSQL fits when our application service has:
✔ Well-known and well-understood access patterns
✔ Simple queries
✔ Not much data calculation involved
✔ A common business process
✔ OLTP workloads
If all of the above conditions hold, NoSQL is a good fit and will be the most efficient choice. We then have to structure the data model specifically to support the given access patterns.
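As a rough illustration of modeling around an access pattern, the sketch below uses an in-memory stand-in for a key-value store with a partition key and a sort key. `TinyKeyValueStore`, the `CUSTOMER#`/`ORDER#` key formats, and the sample items are all assumptions made for this example, not the API of any real database. The access pattern assumed here is "fetch all orders for one customer, oldest first".

```python
from collections import defaultdict


class TinyKeyValueStore:
    """Hypothetical in-memory store: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put(self, pk, sk, item):
        # Items that are read together share a partition key.
        self._partitions[pk][sk] = item

    def query(self, pk, sk_prefix=""):
        # One partition lookup, then a filter on the sort-key prefix.
        rows = self._partitions.get(pk, {})
        return [rows[sk] for sk in sorted(rows) if sk.startswith(sk_prefix)]


store = TinyKeyValueStore()
store.put("CUSTOMER#1", "PROFILE", {"name": "Ada"})
store.put("CUSTOMER#1", "ORDER#2024-02-11", {"total": 40.0})
store.put("CUSTOMER#1", "ORDER#2024-01-05", {"total": 25.0})

# The keys were chosen so that this single query serves the access pattern.
customer_orders = store.query("CUSTOMER#1", sk_prefix="ORDER#")
```

The keys carry the query logic: because the date is part of the sort key, orders come back in date order without any sorting or joining in the application. A query the keys were not designed for, such as "all orders above a given total", would need a scan.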
When NOT to choose NoSQL?
If our application service needs to support
✔ Ad-hoc queries, e.g. a BI analytics use case or an OLAP application
✔ "Reshaping" the data
✔ Complex queries, inner joins, outer joins, etc.
✔ Complex value calculations
then we should prefer SQL over NoSQL, as it will be much more efficient than NoSQL.
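For contrast, here is a small sketch of the kind of ad-hoc query that SQL handles naturally, using Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their rows are invented for the example. The question "revenue per country" was not planned for when the tables were designed, yet a join plus an aggregate answers it directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'UK'), (2, 'Linus', 'FI');
    INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 10.0);
    """
)

# Ad-hoc question: total revenue per country, highest first.
revenue_by_country = conn.execute(
    """
    SELECT c.country, SUM(o.total) AS revenue
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.country
    ORDER BY revenue DESC
    """
).fetchall()

conn.close()
```

In a store modeled around one access pattern, the same question would typically mean scanning the data or maintaining a second, pre-aggregated copy.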
So basically, if we know our access patterns and scalability is a major factor for our application, then NoSQL is a strong choice.
Top comments (4)
One issue I've always had with nosql databases is I was never sure how to address "change" effectively. With RDBMS, there's minimal data redundancy as any shared object will be referenced using a key. Want to change that shared object? Add new properties? No problem.
But with nosql it seems the popular thing to do is to denormalize data, so any sort of change would require extra scripts to make sure the changes are propagated across everything?
That's correct. Instead of relying on the data store to manage this, your application needs to do it.
Schema changes can be especially tricky if you have a high record volume that needs updating.
Yeah, I agree with Brandin Chiu. RDBMSs are not designed to handle change. Today, change occurs frequently, and data modeling is a huge challenge because of the time and resources that relational databases require. Unfortunately, when using a relational database, even a simple change like adding or replacing a column in a table might be a million-dollar task. An RDBMS cannot handle "data variety", whereas in Cassandra (a NoSQL database) you can add a column to specific row partitions. For every change you make, you should ensure strict ACID properties.
To be clear: I'm referring to schema changes being tricky in nosql solutions, not relational systems.