Top comments (4)
Sena,
Nice writeup! Have you run into any issues with RDS's eventually consistent nature? I've been thinking about using it, but I'm not sure how to deal with the lack of a consistency guarantee.
– Evan
Evan,
I might be misunderstanding your question, but let me provide 2 answers:
Granted, these are operational safeguards, and there could be scenarios where they fail us. But this is our first pass. Let me know if I didn't fully answer your question, and I'm happy to jump back on this thread!
Sena,
I was referring to the eventual consistency of replication to the slave DB. This clears things up. Thanks for the detailed response!
– Evan
"Our particular data schema didn't lend itself to easily use the lowest tier Redshift instances. This meant a 40x price increase for a cluster using the next instance size up."
That's... no. I've been using Redshift since it was released, and I'm here to help.
If you're able to run your reporting workload on MySQL at all, you could almost certainly run it on one (1) dense-storage Redshift node for $0.85/hr. If for some reason your data set is smaller but your compute needs are more intense, a small cluster of 4-8 dense-compute nodes (at $0.25/hr each) would work. There is no way you would ever need either of the XL node types.
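To put rough monthly numbers on those hourly prices, here is a quick back-of-the-envelope sketch. The hourly rates are the ones quoted above; the 730 hours per month (a cluster running around the clock) and the `monthly_cost` helper are my own assumptions for illustration, and actual pricing varies by region and over time.

```python
# Rough monthly cost comparison using the hourly prices quoted above.
HOURS_PER_MONTH = 730  # assumption: average hours in a month, running 24/7

DENSE_STORAGE_HOURLY = 0.85  # dollars per node-hour, as quoted in the comment
DENSE_COMPUTE_HOURLY = 0.25  # dollars per node-hour, as quoted in the comment


def monthly_cost(nodes: int, hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the cost in dollars of running `nodes` nodes for `hours` hours."""
    return round(nodes * hourly_rate * hours, 2)


options = {
    "1 dense-storage node": monthly_cost(1, DENSE_STORAGE_HOURLY),
    "4 dense-compute nodes": monthly_cost(4, DENSE_COMPUTE_HOURLY),
    "8 dense-compute nodes": monthly_cost(8, DENSE_COMPUTE_HOURLY),
}

for name, cost in options.items():
    print(f"{name}: ${cost:,.2f}/month")
```

Under those assumptions, every option lands somewhere between roughly $620 and $1,460 a month.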
What you say about the schema not fitting suggests you got the wrong idea about how hard it is to tune data distribution in Redshift. It's not that difficult. For a typical workload involving large fact tables and small dimension tables, just set every table's distribution style to EVEN and you'll be fine.
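As a minimal sketch of what "set everything to EVEN" looks like in practice, the snippet below builds `CREATE TABLE` statements that use Redshift's `DISTSTYLE EVEN` clause. The helper `even_table_ddl` and the table and column names (`fact_sales`, `dim_customer`, and so on) are hypothetical and only there for illustration; they are not from the original post.

```python
def even_table_ddl(table: str, columns: dict[str, str]) -> str:
    """Build a Redshift CREATE TABLE statement with DISTSTYLE EVEN.

    `columns` maps column names to SQL types, in declaration order.
    """
    cols = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns.items())
    return f"CREATE TABLE {table} (\n{cols}\n)\nDISTSTYLE EVEN;"


# Hypothetical schema: one large fact table and one small dimension table.
fact_ddl = even_table_ddl(
    "fact_sales",
    {"sale_id": "BIGINT", "customer_id": "BIGINT", "sold_on": "DATE", "amount": "DECIMAL(12,2)"},
)
dim_ddl = even_table_ddl(
    "dim_customer",
    {"customer_id": "BIGINT", "name": "VARCHAR(256)"},
)

print(fact_ddl)
print(dim_ddl)
```

With EVEN distribution, rows are spread round-robin across the cluster's slices regardless of their values, which is why it needs no per-table tuning.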
It may ultimately be fine to run this workload on MySQL. If that's what your team has experience with, great. And since MySQL finally added hash joins, it's no longer literally impossible to run serious analytic queries on it, the way it used to be. Just recognize that it's not the right tool for the job, and that you're giving up a ton of functionality (and potentially performance) compared to Postgres, Redshift, Greenplum, Vertica ($$), or (blech) Oracle.