DataWarehousing
WARNING: This page has been migrated to the PostgreSQL Wiki. Please do not edit this page or your changes may be lost!
Data Warehousing Features
- On-disk Bitmap Index (anyone game to finish GP patch?)
- Error Handling in COPY
- Parallel Query
- Windowing Functions
- MERGE
- Parallel Index Build
Note that there is much overlap between this page and Simon Riggs' Development Projects planning.
On-Disk Bitmap Index
Error Handling in COPY
Handled adequately by pgloader, so this is a lower priority.
Parallel Query
The two main kinds are single-node and multi-node parallelism. It is fairly easy to get something working that improves SeqScan performance, but it will be more difficult to get something working that applies further up in the executor. The challenges are:
- Planner changes
- How to manage the pool of query slaves
- Deciding which parts of the data are accessed by which slave
Notably, the last two are not an issue in most multi-node parallel architectures, since there is one query slave per node and each node operates only on the data that has been statically partitioned to it, often using a hash partitioning scheme.
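As a rough illustration of static hash partitioning, the sketch below assigns each row to a node by hashing its key, lets each node aggregate only its own partition, and then combines the partial results. The names (`node_for_key`, `partition_rows`, `parallel_sum`) and the choice of CRC32 as the hash are assumptions made for the example, not anything PostgreSQL provides.

```python
import zlib
from collections import defaultdict

def node_for_key(key, num_nodes):
    """Map a key to a node number with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def partition_rows(rows, num_nodes):
    """Statically partition (key, value) rows: each row lives on exactly one node."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[node_for_key(key, num_nodes)].append((key, value))
    return partitions

def local_sum(partition):
    """Work done by the single query slave on one node: scan only local rows."""
    return sum(value for _, value in partition)

def parallel_sum(rows, num_nodes):
    """Coordinator: combine the per-node partial sums into the final answer."""
    partitions = partition_rows(rows, num_nodes)
    return sum(local_sum(partitions[n]) for n in range(num_nodes))

# Hypothetical sample data: 100 rows keyed by customer id.
rows = [(customer_id, customer_id * 10) for customer_id in range(1, 101)]
total = parallel_sum(rows, num_nodes=4)
```

Because the placement of every row is fixed by the hash, no pool of slaves has to be managed and no run-time decision is needed about which slave reads which data.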
Windowing Functions
SQL:2003 feature
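For readers unfamiliar with the feature, a window function computes a value for each row from a set of related rows without collapsing them the way GROUP BY would. The sketch below mimics what a running total of the form SUM(amount) OVER (PARTITION BY region ORDER BY day) would return; the function name and the sample columns are hypothetical, and it assumes days are unique within a region.

```python
from itertools import groupby
from operator import itemgetter

def running_total(rows):
    """Emulate SUM(amount) OVER (PARTITION BY region ORDER BY day).

    rows are (region, day, amount) tuples; every input row yields one output
    row. Assumes no two rows in a region share the same day.
    """
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # partition key, then order key
    for region, group in groupby(ordered, key=itemgetter(0)):
        total = 0
        for _, day, amount in group:
            total += amount
            result.append((region, day, amount, total))
    return result

# Hypothetical sample data.
sales = [("east", 1, 10), ("west", 1, 5), ("east", 2, 20), ("west", 2, 7)]
windowed = running_total(sales)
```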
MERGE
SQL:2003 feature
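MERGE updates the target rows that match a source row and inserts the source rows that have no match, all in one statement. A minimal sketch of that update-or-insert behaviour on an in-memory table follows; the `merge` helper and the single-column key are assumptions for illustration only.

```python
def merge(target, source):
    """Apply source rows to target: update matched keys, insert unmatched ones.

    Both arguments are dicts mapping a key to a quantity. Returns the counts
    of (updated, inserted) rows; target is modified in place.
    """
    updated = inserted = 0
    for key, qty in source.items():
        if key in target:      # corresponds to WHEN MATCHED THEN UPDATE
            target[key] += qty
            updated += 1
        else:                  # corresponds to WHEN NOT MATCHED THEN INSERT
            target[key] = qty
            inserted += 1
    return updated, inserted

# Hypothetical sample data.
stock = {"widget": 5, "gadget": 2}
counts = merge(stock, {"widget": 3, "gizmo": 7})
```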
Parallel Index Build
Josh Berkus: not sure how this works exactly, but it speeds Oracle up considerably

