Redis was being referenced in the CLI as a cache store.
I foresaw the issue of caching data that needs to be fanned out to many sub-algorithms.
Redis might be overkill for this.
Instead, we can use UNLOGGED TABLES in Postgres as a pseudo KV store. This would enable the kind of ETL flows that Orca should support.
For example:
- The root window triggers an algorithm that streams in lots of data. It streams this data to an UNLOGGED TABLE in PG. Perhaps it does this via a Data Function (#107).
- Then, it fires off lots of child windows that process this data, passing each triggered algorithm a pointer to its specific region of the data.
- These algorithms run in the usual process, but get their data from the PG store.
- Each window DAG can clean up its segment once it has been consumed by child algorithms (complex, requires some thought; might not be possible)
This process replaces the need for a costly KV store when the PG store is probably underutilised anyway.
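To make the flow above more concrete, here is a minimal sketch of the statements such a store might use. Everything named here is an assumption for illustration and not an existing Orca API: the table name `orca_scratch`, the `(window_id, segment)` key acting as the "pointer" handed to child windows, the `BYTEA` payload column, and the helper function names. The helpers only build SQL with `%s` placeholders (the style used by drivers such as psycopg); running them against a real database is left out.

```python
import re

# Hypothetical helpers: none of these names exist in Orca today.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check_table(table: str) -> str:
    """Reject anything that is not a plain lowercase identifier."""
    if not _IDENT.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return table


def create_store_sql(table: str) -> str:
    """DDL for an UNLOGGED table used as a scratch KV store."""
    t = _check_table(table)
    return (
        f"CREATE UNLOGGED TABLE IF NOT EXISTS {t} ("
        "window_id TEXT NOT NULL, "
        "segment INTEGER NOT NULL, "
        "payload BYTEA NOT NULL, "
        "PRIMARY KEY (window_id, segment))"
    )


def put_sql(table: str, window_id: str, segment: int, payload: bytes):
    """Root algorithm: write (or overwrite) one segment of streamed data."""
    t = _check_table(table)
    return (
        f"INSERT INTO {t} (window_id, segment, payload) VALUES (%s, %s, %s) "
        "ON CONFLICT (window_id, segment) DO UPDATE SET payload = EXCLUDED.payload",
        (window_id, segment, payload),
    )


def get_sql(table: str, window_id: str, segment: int):
    """Child algorithm: read the segment its pointer refers to."""
    t = _check_table(table)
    return (
        f"SELECT payload FROM {t} WHERE window_id = %s AND segment = %s",
        (window_id, segment),
    )


def cleanup_sql(table: str, window_id: str):
    """Window DAG: drop all of its segments once children have consumed them."""
    t = _check_table(table)
    return (f"DELETE FROM {t} WHERE window_id = %s", (window_id,))
```

One design point to keep in mind: UNLOGGED tables skip the write-ahead log, so Postgres truncates them after a crash or unclean shutdown and does not replicate them to standbys. That seems acceptable for scratch data that can be re-streamed, but it means the store should not hold anything that cannot be regenerated.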
The load parameters for this problem now become:
- Network traffic between algorithms and store. Typically 0 if the environment is configured properly.
- Read / Write ops per second