How the row-DAG execution engine works inside Hypertab
A deep look at the per-row dependency graph that runs every smart column in topological order. Cross-row parallelism, cascade, and why tables win at scale.
TL;DR. A Hypertab table with smart columns is a directed acyclic graph. Each smart column is a node. Every row flows through the graph in topological order. Columns in the same dependency layer run in parallel. Cross-row parallelism is plan-gated. A single edit cascades only to dependent cells. This post walks through how the engine is built, what trade-offs we made, and why the table shape beats a workflow graph once you cross a few thousand rows.
The problem with workflow graphs at scale
If you have ever built something in n8n, Zapier, or Make, the model is familiar. You draw a graph. Each node is a step. The node fires, passes output to the next node, and the graph walks forward. For one execution that is perfect. For ten thousand it is not.
Every workflow execution pays the full setup cost. Node by node. Retry by retry. Queue by queue. Running fifty thousand rows through a five-step pipeline is fifty thousand separate walks, and each walk has its own error surface. You end up babysitting runs, resuming from the middle, and paying per step even when the work is identical across rows.
The table shape is different. The rows are the iteration. The columns are the steps. The engine plans once, runs many, and the retry surface collapses to a single cell rather than a whole run. That is the core bet behind Hypertab.
Columns that do work
Every column in Hypertab has a kind field. Eight kinds exist today.
- static holds data. Text, number, date, anything.
- ai runs a prompt per row against a model of your choice.
- http makes an HTTP call per row and extracts a field.
- formula computes from other columns.
- integration pushes a row to an external service.
- waterfall tries sources in order and uses the first match.
- lookup does a cross table VLOOKUP.
- extract pulls a field from an upstream JSON result at zero external cost.
The last one matters more than it looks. Once you have an HTTP column that returns a JSON blob, an extract column pulls any field out of that blob without a second API call. Most tables end up with one or two HTTP columns and many extracts. That is how the op cost stays flat even when the schema is wide.
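To make the extract idea concrete, here is a minimal sketch of pulling one field out of an already-fetched JSON body. The function name extractPath and the dot-separated path syntax are assumptions for illustration, not Hypertab's actual config format.

```typescript
// Hypothetical helper: walk a dot-separated path through a parsed JSON value.
// Returns undefined as soon as any segment is missing.
function extractPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    // Arrays are objects too, so a numeric segment such as "1" indexes into them.
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// One HTTP result, many extracts, no second API call.
const httpBody = { company: { name: "Acme", tags: ["b2b", "saas"] } };
const companyName = extractPath(httpBody, "company.name");
const secondTag = extractPath(httpBody, "company.tags.1");
```

Because the lookup runs against a value that is already in memory, adding more extract columns widens the schema without adding external calls.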
The dependency graph
Smart columns reference other columns inside their config. An AI prompt can say Summarize {{website_html}} in one sentence. An HTTP URL can template https://api.crunchbase.com/v4/companies/{{domain}}. A formula can add two numbers. Every time you save a smart column, the engine parses every config field, pulls out the {{column_name}} tokens, and records them as edges.
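A rough sketch of the token scan described above, assuming column names are plain identifiers and that config values can be nested objects or arrays. The names extractTemplateRefs and ColumnConfig are hypothetical; the real parser may accept a wider token syntax.

```typescript
type ColumnConfig = Record<string, unknown>;

// Matches {{column_name}} and tolerates spaces inside the braces.
const TOKEN_PATTERN = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;

// Collect every column name referenced in any string field of a config.
function extractTemplateRefs(config: ColumnConfig): string[] {
  const refs = new Set<string>();
  const visit = (value: unknown): void => {
    if (typeof value === "string") {
      TOKEN_PATTERN.lastIndex = 0; // the regex is global, so reset before each scan
      let match: RegExpExecArray | null;
      while ((match = TOKEN_PATTERN.exec(value)) !== null) {
        const name = match[1];
        if (name !== undefined) refs.add(name);
      }
    } else if (Array.isArray(value)) {
      value.forEach((item) => visit(item));
    } else if (value !== null && typeof value === "object") {
      const record = value as Record<string, unknown>;
      Object.keys(record).forEach((key) => visit(record[key]));
    }
  };
  visit(config);
  return Array.from(refs).sort();
}
```

Each returned name becomes one edge from the referenced column to the column being saved.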
The edges live in a system table called _ht_column_dependencies. One row per edge. Source column, target column, table id, created at. A small table that changes only when columns change, which is rare. We index by target column id because the common read is “what depends on this column”, used during cascade and validation.
Two dep sources feed the graph.
- Implicit. Anything between {{ }} is an edge. You do not have to declare anything. This covers roughly 95% of real columns.
- Explicit. config.depends_on: ["col_name"] lets a formula reference a bare identifier without double braces. Rare, but needed for some formula shapes.
Extract columns are a special case. They always depend on their source_column_id. The DAG builder wires that automatically so you cannot forget.
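The three edge sources can be merged in one pass. The sketch below is illustrative only: the ColumnDef shape, the field names templateRefs, dependsOn and sourceColumn, and the edge direction (upstream to dependent) are assumptions, and sourceColumn stands in for source_column_id by name.

```typescript
interface ColumnDef {
  name: string;
  kind: "static" | "ai" | "http" | "formula" | "integration" | "waterfall" | "lookup" | "extract";
  templateRefs?: string[]; // names found inside {{ }} (implicit deps)
  dependsOn?: string[];    // config.depends_on (explicit deps)
  sourceColumn?: string;   // extract columns only
}

interface Edge {
  upstream: string;
  dependent: string;
}

// Build a de-duplicated edge list from implicit, explicit and extract dependencies.
function collectEdges(columns: ColumnDef[]): Edge[] {
  const edges: Edge[] = [];
  const seen = new Set<string>();
  const add = (upstream: string, dependent: string): void => {
    const key = `${upstream}->${dependent}`;
    if (seen.has(key)) return;
    seen.add(key);
    edges.push({ upstream, dependent });
  };
  for (const col of columns) {
    for (const ref of col.templateRefs ?? []) add(ref, col.name);
    for (const ref of col.dependsOn ?? []) add(ref, col.name);
    // Extract columns are wired to their source automatically.
    if (col.kind === "extract" && col.sourceColumn !== undefined) add(col.sourceColumn, col.name);
  }
  return edges;
}
```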
Cycle detection and the DAG_CYCLE_DETECTED error
The moment a graph lands, we topo sort. Kahn’s algorithm. Build an in-degree map, queue the zero-degree nodes, pop and decrement, repeat. If the sort visits fewer nodes than the graph has, there is a cycle. We find the cycle path by walking back from the unvisited set and return it in the error message.
DAG_CYCLE_DETECTED: Column "score" depends on "rank" which depends on "score".
Cycle: score -> rank -> score.
Suggestion: break the cycle by removing one of the {{template}} references.
We reject the column save. You never ship a broken graph. This matters because a cycle in a smart column table is not just incorrect, it is infinite work.
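Here is a compact sketch of Kahn's algorithm with cycle recovery, under the assumption that the graph is given as a map from each column to the columns it depends on and that every referenced column is itself a key. The names topoSort, DepGraph and SortResult are hypothetical, and the way the cycle path is recovered is one reasonable approach rather than the engine's own.

```typescript
type DepGraph = Record<string, string[]>; // column -> columns it depends on
type SortResult = { ok: true; order: string[] } | { ok: false; cycle: string[] };

function topoSort(graph: DepGraph): SortResult {
  const nodes = Object.keys(graph);
  const depsOf = (node: string): string[] => graph[node] ?? [];
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const node of nodes) {
    inDegree.set(node, depsOf(node).length);
    dependents.set(node, []);
  }
  for (const node of nodes) {
    for (const dep of depsOf(node)) {
      const list = dependents.get(dep);
      if (list === undefined) throw new Error(`Unknown column "${dep}"`);
      list.push(node);
    }
  }
  // Start from columns with no dependencies, then release their dependents.
  const queue = nodes.filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    order.push(node);
    for (const child of dependents.get(node) ?? []) {
      const remaining = (inDegree.get(child) ?? 0) - 1;
      inDegree.set(child, remaining);
      if (remaining === 0) queue.push(child);
    }
  }
  if (order.length === nodes.length) return { ok: true, order };

  // Every unvisited node still has an unvisited dependency, so following
  // unvisited dependencies must eventually revisit a node and close a cycle.
  const visited = new Set<string>(order);
  const path: string[] = [];
  let current = nodes.find((n) => !visited.has(n)) as string;
  while (path.indexOf(current) === -1) {
    path.push(current);
    current = depsOf(current).find((d) => !visited.has(d)) as string;
  }
  return { ok: false, cycle: path.slice(path.indexOf(current)).concat(current) };
}
```

Joining the returned cycle with " -> " gives the same shape as the path in the error message above.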
Unknown references and fuzzy matching
If a column references {{emial}} and no column named emial exists, we do not silently drop it. We compute Levenshtein distance against every other column in the table. If the closest match is within 2 edits, we return
DAG_UNKNOWN_REF: Column "enriched_company" references {{emial}}, which does not exist.
Did you mean "email"? (distance 2)
This is the same pattern we use on the REST API. Errors tell you what was attempted, what went wrong, and what to do. The AI agent on the other end reads the suggestion and fixes the prompt without a human round trip.
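The suggestion step can be sketched with a standard two-row Levenshtein implementation. The function names and the tie-breaking rule (first closest match wins) are assumptions. Note that plain Levenshtein counts the swapped letters in emial as two substitutions, so it reports a distance of 2 for that pair, which still falls within the 2-edit threshold.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      const deletion = (prev[j] as number) + 1;
      const insertion = (curr[j - 1] as number) + 1;
      const substitution = (prev[j - 1] as number) + cost;
      curr.push(Math.min(deletion, insertion, substitution));
    }
    prev = curr;
  }
  return prev[b.length] as number;
}

// Return the closest existing column name, or null if nothing is near enough.
function suggestColumn(
  ref: string,
  columns: string[],
  maxDistance: number = 2,
): { name: string; distance: number } | null {
  let best: { name: string; distance: number } | null = null;
  for (const name of columns) {
    const distance = levenshtein(ref, name);
    if (distance <= maxDistance && (best === null || distance < best.distance)) {
      best = { name, distance };
    }
  }
  return best;
}
```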
Topological layers, not a single queue
Once the graph is valid we compute layers. Layer 0 is every node with no deps. Layer 1 is every node whose deps are all in layer 0. In general, a node sits one layer above its deepest dependency. Layers are the parallelism boundary. Nodes in the same layer can run at the same time. Nodes in later layers must wait.
For one row this is cheap. You walk the layers and run the columns. For ten thousand rows it is also cheap, because the graph is shared. Every row uses the same layers. You do not recompute anything per row.
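A minimal sketch of the layering step, assuming the graph has already passed cycle detection and that every column, static ones included, appears as a key. computeLayers is a hypothetical name. Each column lands one layer above its deepest dependency.

```typescript
type DepGraph = Record<string, string[]>; // column -> columns it depends on

// Assumes an acyclic graph; a cycle would recurse forever here.
function computeLayers(graph: DepGraph): string[][] {
  const layerOf = new Map<string, number>();
  const resolve = (node: string): number => {
    const known = layerOf.get(node);
    if (known !== undefined) return known;
    const deps = graph[node] ?? [];
    let layer = 0;
    for (const dep of deps) layer = Math.max(layer, resolve(dep) + 1);
    layerOf.set(node, layer);
    return layer;
  };
  const layers: string[][] = [];
  for (const node of Object.keys(graph)) {
    const layer = resolve(node);
    while (layers.length <= layer) layers.push([]);
    (layers[layer] as string[]).push(node);
  }
  return layers;
}
```

The result depends only on the table's columns, which is why it can be computed once and reused for every row.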
The engine caches the compiled graph in a Durable Object keyed by table id. When a column mutates we invalidate the cache, rebuild, and push the new version to any active row run. The row run picks up the new graph on the next layer boundary. Mid run changes are rare but safe.
Executing a row
The function that runs a single row lives in smart-columns/row-engine.ts and is called executeRowDAG. Signature is roughly
async function executeRowDAG(
  table: TableRef,
  row: Row,
  ctx: ProcessorContext,
  options: { maxParallelPerRow: number } = { maxParallelPerRow: 6 },
): Promise<RowRunResult>
Inside it walks the layers. For each layer it builds an array of smart column promises. Each promise fetches the column processor (one of ai, http, formula, integration, waterfall, lookup, extract), calls it with the row and the accumulated computed map, writes the result back into computed, and updates the cell state in the database.
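// Layers run strictly in order; columns inside a layer run concurrently.
for (const layer of layers) {
  // Split the layer so at most maxParallelPerRow columns are in flight at once.
  const chunks = chunk(layer, options.maxParallelPerRow)
  for (const group of chunks) {
    // Wait for the whole group before starting the next one.
    await Promise.all(group.map((col) => runOneColumn(col, row, ctx)))
  }
}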
maxParallelPerRow defaults to 6 because Cloudflare Workers cap a single request at 6 concurrent outbound connections. Raising it inside one Worker invocation does nothing useful. For heavier jobs the row runs inside our Fly service, which has no such cap.
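The loop above relies on a chunk helper that the post does not show. A plausible version, written here as an assumption about its behavior, splits a layer into groups of at most the given size:

```typescript
// Hypothetical helper: split items into consecutive groups of at most `size`.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

With a size of 6, a layer of 14 columns would run as groups of 6, 6 and 2, so no more than 6 outbound calls are open at once for that row.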
The computed map threads downstream. If column A is an HTTP call that returns a JSON body, column B is an extract that pulls company.name from that body, and column C is an AI prompt that templates both, the AI column sees the HTTP body on {{website_json}}, the extracted name on {{company_name}}, and all the original row fields at once. One context object, every value resolved.
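The way the computed map reaches a downstream template can be sketched as a merge followed by token substitution. The names buildContext and renderTemplate are hypothetical, and serializing non-string values with JSON.stringify is an assumption about how a JSON body would be inlined into a prompt.

```typescript
type Values = Record<string, unknown>;

// Original row fields plus everything upstream columns have produced so far.
function buildContext(row: Values, computed: Values): Values {
  return { ...row, ...computed };
}

// Replace each {{name}} with its value from the context.
function renderTemplate(template: string, context: Values): string {
  return template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (_match: string, name: string): string => {
      const value = context[name];
      if (value === undefined) throw new Error(`Unresolved reference {{${name}}}`);
      return typeof value === "string" ? value : JSON.stringify(value);
    },
  );
}

const context = buildContext(
  { domain: "acme.com" },
  { website_json: { company: { name: "Acme" } }, company_name: "Acme" },
);
const prompt = renderTemplate("Describe {{company_name}} ({{domain}}) from {{website_json}}", context);
```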
Upstream failure propagation
If column A errors, what happens to B and C that depend on A? They are marked skipped with the reason "Upstream A did not complete". They do not run. They do not bill ops. They do not add noise to your error log.
This is the part most workflow graphs get wrong. A failed step in n8n often halts the whole run. A failed step in Hypertab halts the subgraph rooted at that step, for that row only, and every other independent column on that row continues. One row with a bad field does not poison the whole batch.
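The skip rule can be illustrated with a simplified, synchronous walk over the layers. This is a sketch under assumptions: the real engine is asynchronous, and the CellState shape and runRow name are invented for the example. A column is skipped when any of its dependencies ended in a state other than done.

```typescript
type CellState =
  | { status: "done"; value: unknown }
  | { status: "error"; message: string }
  | { status: "skipped"; reason: string };

function runRow(
  layers: string[][],
  deps: Record<string, string[]>,
  run: (column: string) => unknown,
): Record<string, CellState> {
  const states: Record<string, CellState> = {};
  for (const layer of layers) {
    for (const column of layer) {
      // A dependency with no recorded state (for example a static column) counts as available.
      const blocker = (deps[column] ?? []).find((dep) => {
        const state = states[dep];
        return state !== undefined && state.status !== "done";
      });
      if (blocker !== undefined) {
        states[column] = { status: "skipped", reason: `Upstream ${blocker} did not complete` };
        continue;
      }
      try {
        states[column] = { status: "done", value: run(column) };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        states[column] = { status: "error", message };
      }
    }
  }
  return states;
}
```

Only the subgraph below the failed column is skipped. Independent columns in the same row still run.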
Cross row parallelism
Rows are independent. The engine can run N rows at once, each row walking the graph in topological order. N is plan gated.
- Free: 5 rows in flight per table
- Starter: 10
- Pro: 50
- Scale: 100
- Enterprise: 200
External calls are throttled by a rate-limit bucket that is global per (provider, account). If two tables share an OpenAI key, they share that bucket and the combined row parallelism, so you never exceed the external API limit just because two pipelines both want 50 rows in flight. That detail is what lets us promise “no babysitting” at 50k scale. The rate limiter is a Durable Object, keyed by the provider tuple, with a token bucket and adaptive backoff. On a 429 it shrinks the bucket and slows every consumer. After a quiet interval it grows back.
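A token bucket with adaptive capacity might look like the sketch below. Everything specific here is an assumption made for illustration: the class name, halving the capacity on a 429, growing it back by one per quiet interval, and passing the clock in explicitly. The Durable Object wiring is left out.

```typescript
class AdaptiveTokenBucket {
  private readonly maxCapacity: number;
  private readonly refillPerSecond: number;
  private capacity: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(maxCapacity: number, refillPerSecond: number, nowMs: number) {
    this.maxCapacity = maxCapacity;
    this.refillPerSecond = refillPerSecond;
    this.capacity = maxCapacity;
    this.tokens = maxCapacity;
    this.lastRefillMs = nowMs;
  }

  // Add tokens for the time elapsed since the last refill, capped at the current capacity.
  private refill(nowMs: number): void {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Take one token if available; callers that get false should wait and retry.
  tryTake(nowMs: number): boolean {
    this.refill(nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  // On a 429: halve the capacity (never below 1) and drop any excess tokens.
  onRateLimited(nowMs: number): void {
    this.refill(nowMs);
    this.capacity = Math.max(1, Math.floor(this.capacity / 2));
    this.tokens = Math.min(this.tokens, this.capacity);
  }

  // After a quiet interval: grow back one step toward the configured maximum.
  onQuietInterval(): void {
    this.capacity = Math.min(this.maxCapacity, this.capacity + 1);
  }

  get currentCapacity(): number {
    return this.capacity;
  }
}
```

Keying one such bucket per (provider, account) means every table that shares a key draws from the same supply of tokens.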
Cascade on edit
When a human (or agent) edits one cell, we do not rerun the whole row. We find the transitive dependents of the edited column, and rerun only those. The function is cascadeFromEdit in smart-columns/cascade.ts.
Concretely:
- Diff old vs new row, compute the set of changed column ids.
- For each changed id, look up transitive dependents via _ht_column_dependencies.
- Union the dependents. Topo sort. Run the subgraph for that one row.
A column can opt out with config.recompute_on_upstream_change: false. You would do that for an expensive one shot enrichment that should only fire once even if the upstream name is edited by hand. That is a policy choice. Most columns keep the default and rerun.
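The cascade steps can be sketched as a breadth-first walk over inverted edges, then a filter against a precomputed topological order. The function name columnsToRerun and its inputs are assumptions, and the sketch ignores the recompute_on_upstream_change opt-out for brevity.

```typescript
type Deps = Record<string, string[]>; // column -> columns it depends on

function columnsToRerun(changed: string[], deps: Deps, topoOrder: string[]): string[] {
  // Invert the edges: upstream column -> columns that depend on it.
  const dependents: Record<string, string[]> = {};
  for (const column of Object.keys(deps)) {
    for (const upstream of deps[column] ?? []) {
      const list = dependents[upstream] ?? [];
      list.push(column);
      dependents[upstream] = list;
    }
  }
  // Breadth-first walk collects the union of transitive dependents.
  const affected = new Set<string>();
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const child of dependents[current] ?? []) {
      if (!affected.has(child)) {
        affected.add(child);
        queue.push(child);
      }
    }
  }
  // Keep the table's topological order so upstream columns rerun first.
  return topoOrder.filter((column) => affected.has(column));
}
```

Editing a column that nothing depends on yields an empty list, so the edit costs no smart column runs.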
When the DAG fires
Four triggers exist.
- Row insert. If any smart column on the table has auto_run: true, afterRowsInserted fires the full graph for every new row. This is the default for most pipelines. You drop rows in via webhook or MCP, and they enrich themselves.
- Cell edit. The cascade above.
- Manual column run. hypertab_run_column enqueues a run for one smart column against specified rows. Useful for retries.
- Manual table run. hypertab_run_dag runs the whole graph over specified rows, or every row if none are specified. Useful for full refreshes.
All four go through the same row engine. No separate code paths. No “retry logic” that works differently from “first run logic”. That bought us a lot of correctness.
Why we wrote it this way
We considered three alternatives before landing here.
- Per column queues. Every smart column gets its own queue and workers. Simple. Fails on cross column data flow because downstream workers cannot see upstream output without a database round trip per step.
- Per row workflow engine. Think Temporal. Rich, but overkill for a stateless batch, and every row pays workflow overhead. We measured 100ms to 300ms per row in engine overhead alone. Unacceptable at 50k rows.
- Row DAG with shared graph compile. What we shipped. Compile once per table, walk once per row, cache across runs. Engine overhead is roughly 3ms per row. Most of the per row latency is the external API call, which is the part you actually want to spend time on.
Where we are going
The next step is partial row execution. Today a DAG run touches every smart column layer. Once we ship partial runs you will be able to say “run just this column and its descendants for these rows”, which closes the last gap where a workflow graph beats us: interactive exploration of a single pipeline stage.
After that, branch merges. If two HTTP columns can produce the same field, a merge node picks the first non null. Today you would model that as a waterfall column, which works but is heavy. A lightweight merge will make waterfalls optional for the simple case.
We will write up both when they ship. Until then, the code is source available on GitHub for anyone who wants to read the real thing.
If you have a pipeline that is too big for Zapier and too rigid for a queue, try Hypertab on the free plan. 5,000 ops a month, no credit card. The graph compiles the same way whether you have ten rows or a hundred thousand.