I am trying to understand how good Apache Parquet is for
- Data storage format (when you DO NOT have Hadoop, just your local computer; see the sketch after this list)
- How large do the files get on disk?
- How reliable is it?
- Query-able format
- Do I have to index it first? (I assume unique indexes are not possible?)
- Speed?
- Resource usage?
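
To make the local, no-Hadoop use case concrete, this is roughly what I have in mind; just a minimal sketch, assuming pandas and pyarrow are installed (the file name and data are made up):

```python
import pandas as pd

# Write a plain local Parquet file; no Hadoop involved, assuming pyarrow is installed.
df = pd.DataFrame({"id": range(1_000_000), "value": ["x"] * 1_000_000})
df.to_parquet("data.parquet", engine="pyarrow", compression="snappy")

# Read it back; only the requested columns are loaded, which should keep memory usage low.
subset = pd.read_parquet("data.parquet", engine="pyarrow", columns=["id"])
print(subset.head())
```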
As far as I understand, Parquet may not be good for frequent writes or updates, but is it good enough for a static database?
You can compare it to the ever-popular SQLite as a benchmark, disregarding SQLite features such as foreign keys, unique indexes, full-text search, and multiple tables.
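
For the "query-able format" point, my understanding is that an engine like DuckDB can run SQL directly against a Parquet file, which is roughly the SQLite-style usage I mean. A sketch, assuming the duckdb Python package and the data.parquet file from the example above:

```python
import duckdb

# Run SQL directly against the Parquet file; no server, no Hadoop, in-memory connection.
con = duckdb.connect()
result = con.execute(
    "SELECT COUNT(*) AS rows FROM read_parquet('data.parquet') WHERE id < 100"
).fetchdf()
print(result)
```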
BTW, I have seen an SQLite file grow to 700 MB for data that is only a few megabytes as the final CSV, and I am not sure it is reliable as storage anymore...