Quantcast
Channel: Business Intelligence y Big Data: ¡Aprende Gratis sobre Analytics!
Viewing all articles
Browse latest Browse all 866

Citusdata, PostgreSQL Columnar Store for Analytic Workloads

$
0
0
Columnar stores bring notable benefits for analytic workloads where data is loaded in batches

Open source columnar store extension for PostgreSQL and share it with the community! Columnar stores bring notable benefits for analytic workloads, where data is loaded in batches.
This columnar store extension uses the Optimized Row Columnar (ORC) format for its data layout. ORC improves upon the RCFile format developed at Facebook, and brings the following benefits:
  • Compression: Reduces in-memory and on-disk data size by 2-4x. Can be extended to support different codecs.
  • Column projections: Only reads column data relevant to the query. Improves performance for I/O bound queries.
  • Skip indexes: Stores min/max statistics for row groups, and uses them to skip over unrelated rows.
Further, we used the Postgres foreign data wrapper APIs and type representations with this extension. This brings:
  • Support for 40+ Postgres data types. The user can also create new types and use them.
  • Statistics collection. PostgreSQL's query optimizer uses these stats to evaluate different query plans and pick the best one.
  • Simple setup. Create foreign table and copy data. Run SQL.

It's worth noting that the columnar store extension is self-contained. If you're a PostgreSQL user, you can get the entire source code and build using the instructions on our GitHub page (link is external). You can even join columnar store and regular Postgres tables in the same SQL query.

Viewing all articles
Browse latest Browse all 866

Trending Articles