It's pronounced Diane. I do data architecture, operations, and backend development. In my spare time I maintain Massive.js, a data mapper for Node.js and PostgreSQL.
How do you see the classical DBA role (robes, incense, occult rites and forbidden languages and all) evolving as development makes inroads on architectural and operational disciplines?
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
I heard you my friend, this is the biggest concern of every DBA nowadays. As the cloud is getting more mature everyday, many of the features are available online, cloud providers like google, oracle, aws azure etc are excelling at their cloud offerings the migration to cloud is inevitable.
Secondly the cost of cloud is way cheaper than maintaining a data center for small and medium sized businesses.
This reduced the administration workload from conventional DBAs so now these people need to find some more work to stay relevant in the company and industry.
Thus the role of "DevOps" is born. DevOps person is responsible for administration, maintenance, optimization, automation, deployment, backups and monitoring. So many DBAs have already adapted this change and became top notch DevOps engineers with vast experience in database administration.
Another route will be to become an architect, design and develop the structure of the database that will be housing terabytes of data yet providing exceptional performance under heavy loads, this commitment is not easy as this requires a deep understanding of RDBMS and the core concepts of databases both relational and NoSQL.
When you transition from SQL to NoSQL, do you need to be quirky enough to execute the desire query and unfollow the conventional RDBMS way? I know a lot of NoSQL(looking at MongoDB) still lacks some querying powers to get the data you want.
In your profile description, you seem to know MongoDB. When dealing with your knowledge in RDBMS, how did you manage to normalize your database using NoSQL?
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
I will try to answer both of your questions in a single comment as the solution will come from the same origin.
When we need to migrate a database from SQL to NoSQL, we start with the transactional systems first. We need to decide where the transactions will be stored, either on the relational database or we are totally dumping SQL database and moving everything to NoSQL.
Once we agree on any one of the aforementioned approaches, we design the NoSQL database solution around it to ingest the data from different sources in the most denormalized form, meaning we need to resolve all the relationships and references to get all the data in one go. Reporting system can be later built upon this NoSQL database deployment.
There is not much support for constraints in NoSQL, however Mongo and other few databases provide the relationship features but I would say they are still primitive as compared to full RDBMS like Oracle, PostgreSQL, MySQL or SQL Server, etc. NoSQL databases provide many APIs and there are language wrappers to ease up development and can sometimes result in very powerful tooling to develop applications quickly.
NoSQL (MongoDB in particular) is extremely fast for reading when we design it correctly, else they can be messy if we not set them up and monitor them correctly.
Just a personal opinion: NoSQL databases are not suitable for smaller projects, they require strong administrative skills to manage multiple instances. Cloud is a good option when choosing to deploy a fully managed database.
What's your opinion on GraphQL, do we need this as part of our toolset or should we look to migrate away from Relational DBs and toward the new world of GraphQL?
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
This is my personal experience, which may be correct or not relative to different circumstances. GraphQL is a technology that compliments the traditional relational databases. They bring advantages for our next generation web platforms and provide a very robust set of tooling for development in large scale web applications.
There is no direct comparison between relational databases and GQL as the relational database are designed to store large sets of data with the capability to identify datasets with relations, whereas GQL is an API that facilitates the development with rapid response methodology with a gigantic backend data warehouse.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Your question is a valid concern as we all see the rise of NoSQL databases.
But let's get the facts straight first.
Most of the enterprise applications around the world are built upon RDBMS and have some bit and pieces that rely on NoSQL. Most of the web is using RDBMS along with NoSQL.
SQL Databases are not dying as of the next 20 years because most of the applications still rely on the relationship mechanism and will continue to work on it. NoSQL databases are designed to serve a specific purpose where they excel (BigData, rapid reads, low latency inserts, etc...) or we can say they bring the features that traditional RDBMS lacks. So consider SQL Databases are not going anywhere in the near future.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Unfortunately i never had a chance to work with healthcare sector in my career, but i have pretty good idea when it comes to data that grows exponentially.
My first tip would be to divide and sanitize data correctly; meaning, dont store data directly into varchar data type or any other text data type per se, instead use numeric data type where possible. Because indexing on text types as compared to numeric type is way slower in terms of index creation, insertions and reading too.
Secondly normalize at atleast 3NF, because as the data grows, lets say, about ~600GB in a single table, your database wont be less than a nightmare with full of garbage and repeated data, so avoid that and normalize as much as possible.
Try to use partitions as they will certainly ease up your database engine and hardware and will give it a room to breath when there is high traffic on your applications.
Lastly implement data archiving and warehousing to ensure you only keep hot usable data into the primary server instead of all historical data which will only be needed for reporting and analytics.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Cloud and Database Lead at iQ Innovation Hub,Pune.
Working actively with AWS,PostgreSQL,Analytics,SQL.
Core volunteer AWS User Group Pune
AWS Certified.
What are the high level design patterns that should be taken into consideration while designing a PostgreSQL for analytical queries? The system would have high reads and seldom writes.
Have more than 10 years of experience in designing, developing and maintaining database systems on diverse platforms such as Oracle, Postgres, MySQL, Mongo, Cassandra, Redis, and SQLServer.
Good question. Designing a database is the most crucial part of any application. One must fully understand the concepts of database management system as well as most of the features of the database engine they have opted for.
As far as the interviews are related, start answering the questions from the most high-level database design concepts then slowly dive into the deeper knowledge you have about the database technology you are comfortable with. Keep it short and explain the design as fluent as possible, don't say anything you don't know about, else the interviewer will ditch you into that rabbit hole and you will be chasing your tail clarifying your statement.
For further actions, you may consider blocking this person and/or reporting abuse
We're a place where coders share, stay up-to-date and grow their careers.
How do you see the classical DBA role (robes, incense, occult rites and forbidden languages and all) evolving as development makes inroads on architectural and operational disciplines?
I heard you my friend, this is the biggest concern of every DBA nowadays. As the cloud is getting more mature everyday, many of the features are available online, cloud providers like google, oracle, aws azure etc are excelling at their cloud offerings the migration to cloud is inevitable.
Secondly the cost of cloud is way cheaper than maintaining a data center for small and medium sized businesses.
This reduced the administration workload from conventional DBAs so now these people need to find some more work to stay relevant in the company and industry.
Thus the role of "DevOps" is born. DevOps person is responsible for administration, maintenance, optimization, automation, deployment, backups and monitoring. So many DBAs have already adapted this change and became top notch DevOps engineers with vast experience in database administration.
Another route will be to become an architect, design and develop the structure of the database that will be housing terabytes of data yet providing exceptional performance under heavy loads, this commitment is not easy as this requires a deep understanding of RDBMS and the core concepts of databases both relational and NoSQL.
That reminds me.. I need to get my robes back from the cleaners before I finish up this migration or the whole thing will go to hell! :)
When you transition from SQL to NoSQL, do you need to be quirky enough to execute the desire query and unfollow the conventional RDBMS way? I know a lot of NoSQL(looking at MongoDB) still lacks some querying powers to get the data you want.
In your profile description, you seem to know MongoDB. When dealing with your knowledge in RDBMS, how did you manage to normalize your database using NoSQL?
I will try to answer both of your questions in a single comment as the solution will come from the same origin.
When we need to migrate a database from SQL to NoSQL, we start with the transactional systems first. We need to decide where the transactions will be stored, either on the relational database or we are totally dumping SQL database and moving everything to NoSQL.
Once we agree on any one of the aforementioned approaches, we design the NoSQL database solution around it to ingest the data from different sources in the most denormalized form, meaning we need to resolve all the relationships and references to get all the data in one go. Reporting system can be later built upon this NoSQL database deployment.
There is not much support for constraints in NoSQL, however Mongo and other few databases provide the relationship features but I would say they are still primitive as compared to full RDBMS like Oracle, PostgreSQL, MySQL or SQL Server, etc. NoSQL databases provide many APIs and there are language wrappers to ease up development and can sometimes result in very powerful tooling to develop applications quickly.
NoSQL (MongoDB in particular) is extremely fast for reading when we design it correctly, else they can be messy if we not set them up and monitor them correctly.
Just a personal opinion: NoSQL databases are not suitable for smaller projects, they require strong administrative skills to manage multiple instances. Cloud is a good option when choosing to deploy a fully managed database.
What's your opinion on GraphQL, do we need this as part of our toolset or should we look to migrate away from Relational DBs and toward the new world of GraphQL?
This is my personal experience, which may be correct or not relative to different circumstances. GraphQL is a technology that compliments the traditional relational databases. They bring advantages for our next generation web platforms and provide a very robust set of tooling for development in large scale web applications.
There is no direct comparison between relational databases and GQL as the relational database are designed to store large sets of data with the capability to identify datasets with relations, whereas GQL is an API that facilitates the development with rapid response methodology with a gigantic backend data warehouse.
What pitfalls to expect of using a distributed database in production?
Are SQL databases dying?
Should we all move to NoSQL databases?
Your question is a valid concern as we all see the rise of NoSQL databases.
But let's get the facts straight first.
Most of the enterprise applications around the world are built upon RDBMS and have some bit and pieces that rely on NoSQL. Most of the web is using RDBMS along with NoSQL.
SQL Databases are not dying as of the next 20 years because most of the applications still rely on the relationship mechanism and will continue to work on it. NoSQL databases are designed to serve a specific purpose where they excel (BigData, rapid reads, low latency inserts, etc...) or we can say they bring the features that traditional RDBMS lacks. So consider SQL Databases are not going anywhere in the near future.
Have you ever dabbled with healthcare/EMR data? What database tips can you give in storing the kind with a really high growth rate?
Unfortunately i never had a chance to work with healthcare sector in my career, but i have pretty good idea when it comes to data that grows exponentially.
My first tip would be to divide and sanitize data correctly; meaning, dont store data directly into varchar data type or any other text data type per se, instead use numeric data type where possible. Because indexing on text types as compared to numeric type is way slower in terms of index creation, insertions and reading too.
Secondly normalize at atleast 3NF, because as the data grows, lets say, about ~600GB in a single table, your database wont be less than a nightmare with full of garbage and repeated data, so avoid that and normalize as much as possible.
Try to use partitions as they will certainly ease up your database engine and hardware and will give it a room to breath when there is high traffic on your applications.
Lastly implement data archiving and warehousing to ensure you only keep hot usable data into the primary server instead of all historical data which will only be needed for reporting and analytics.
This was so insightful and what I needed! Thank you for the very thorough answer :)
No prob! ;)
what does your typical day look like ?
What do you like most about your job?
what is the most difficult part of your job ?
What does your typical day look like?
What do you like most about your job?
What is the most difficult part of your job?
What are the high level design patterns that should be taken into consideration while designing a PostgreSQL for analytical queries? The system would have high reads and seldom writes.
How to answer database design related questions during interviews?
Good question. Designing a database is the most crucial part of any application. One must fully understand the concepts of database management system as well as most of the features of the database engine they have opted for.
As far as the interviews are related, start answering the questions from the most high-level database design concepts then slowly dive into the deeper knowledge you have about the database technology you are comfortable with. Keep it short and explain the design as fluent as possible, don't say anything you don't know about, else the interviewer will ditch you into that rabbit hole and you will be chasing your tail clarifying your statement.