Architecture
Teide JS is a three-layer system: a TypeScript API that developers interact with, a C++17 NAPI addon that bridges JavaScript and native code, and a vendored C17 columnar engine that performs all computation.
Three-Layer Overview
┌─────────────────────────────────────────────┐
│ TypeScript API (lib/) │
│ Context, Table, Query, Expr, Series │
├─────────────────────────────────────────────┤
│ C++17 NAPI Addon (src/) │
│ NativeContext, NativeTable, NativeSeries │
│ TeideThread, SPSC queue, ExprNode compiler │
├─────────────────────────────────────────────┤
│ C17 Core Engine (vendor/teide/) │
│ Columnar storage, DAG executor, optimizer │
│ Buddy allocator, symbol table, graph ops │
└─────────────────────────────────────────────┘
Project Structure
teide-js/
├── lib/ # TypeScript API layer
│ ├── context.ts # Entry point; loads .node addon
│ ├── query.ts # Lazy query builder with operation stack
│ ├── expr.ts # Expression tree (col refs, literals, ops, aggs)
│ ├── table.ts # Table + GroupBy wrappers
│ └── series.ts # Column accessor with dtype-aware TypedArrays
├── src/ # C++17 NAPI addon layer
│ ├── addon.cpp # Module init, exports collectSync/collect
│ ├── context.cpp # NativeContext: CSV I/O, SQL dispatch
│ ├── query.cpp # Expression serialization, plan compilation
│ ├── table.cpp # NativeTable: column access, retain/release
│ ├── series.cpp # NativeSeries: zero-copy TypedArray creation
│ ├── teide_thread.h # Background thread + SPSC work queue
│ └── compat.h # C-atomic shim for C++/C17 interop
├── vendor/teide/ # Vendored C17 core engine
│ └── include/teide/td.h # Public API + type/opcode definitions
├── test/ # Vitest test suite
│ ├── smoke.test.ts
│ ├── table.test.ts
│ ├── expr.test.ts
│ └── fixtures/ # CSV test data
├── CMakeLists.txt # Native build configuration
├── binding.gyp # node-gyp binding
└── tsconfig.json # TypeScript configuration
SQL Pipeline
SQL statements go through a multi-stage pipeline before results reach JavaScript:
SQL string
│
▼
┌──────────────────┐
│ PGQ Pre-parse │ Extracts GRAPH_TABLE / MATCH clauses
└────────┬─────────┘ and rewrites them into internal form
│
▼
┌──────────────────┐
│ Parse │ node-sql-parser produces an AST
└────────┬─────────┘
│
▼
┌──────────────────┐
│ Plan │ AST → PlanStep vector (filter, project,
└────────┬─────────┘ group, sort, limit, join, etc.)
│
▼
┌──────────────────┐
│ Execute │ PlanSteps run against the Teide engine
└────────┬─────────┘ on the dedicated Teide thread
│
▼
Table / null
For the fluent query API (table.filter().sort().collectSync()), the pipeline skips the SQL parsing stage. Instead, the TypeScript Expr objects are serialized directly to C++ ExprNode trees, and the operation stack becomes the PlanStep vector.
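As a rough illustration of that last point, the sketch below shows how chained calls could accumulate an operation stack while expressions build up a plain tree. The ExprNode and PlanStep shapes, and the LazyQuery class, are assumptions made for this example; the real layouts live in lib/ and src/ and are not shown in this document.

```typescript
// Hypothetical shapes: stand-ins for the real ExprNode / PlanStep layouts.
type ExprNode =
  | { kind: "col"; name: string }
  | { kind: "lit"; value: number }
  | { kind: "op"; op: "mul" | "gt"; left: ExprNode; right: ExprNode };

type PlanStep =
  | { step: "filter"; predicate: ExprNode }
  | { step: "sort"; by: string }
  | { step: "limit"; n: number };

class Expr {
  readonly node: ExprNode;
  constructor(node: ExprNode) {
    this.node = node;
  }
  mul(other: Expr): Expr {
    return new Expr({ kind: "op", op: "mul", left: this.node, right: other.node });
  }
  gt(other: Expr): Expr {
    return new Expr({ kind: "op", op: "gt", left: this.node, right: other.node });
  }
}

const col = (columnName: string): Expr => new Expr({ kind: "col", name: columnName });
const lit = (value: number): Expr => new Expr({ kind: "lit", value });

// Each call returns a new query with one more step; nothing executes here.
class LazyQuery {
  private readonly steps: readonly PlanStep[];
  constructor(steps: readonly PlanStep[] = []) {
    this.steps = steps;
  }
  filter(predicate: Expr): LazyQuery {
    return new LazyQuery([...this.steps, { step: "filter", predicate: predicate.node }]);
  }
  sort(by: string): LazyQuery {
    return new LazyQuery([...this.steps, { step: "sort", by }]);
  }
  limit(n: number): LazyQuery {
    return new LazyQuery([...this.steps, { step: "limit", n }]);
  }
  // The accumulated stack is what would be handed to the native layer.
  toPlan(): PlanStep[] {
    return [...this.steps];
  }
}

const examplePlan = new LazyQuery()
  .filter(col("price").mul(lit(1.1)).gt(lit(100)))
  .sort("price")
  .limit(10)
  .toPlan();
```

In this sketch the plan is plain data, so it could be passed across the NAPI boundary in one call rather than one call per operation.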
Threading Model
A key design constraint: the V8 thread must never call Teide C APIs directly. All native operations run on a dedicated Teide thread that owns the C heap. Communication between the two threads uses a lock-free SPSC (single-producer, single-consumer) queue.
┌──────────────┐ SPSC Queue ┌──────────────┐
│ V8 Thread │ ──── work item ──▶ │ Teide Thread │
│ (main) │ │ (background) │
│ │ ◀── result ─────── │ │
└──────────────┘ cond_var / tsfn └──────────────┘
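To make the SPSC idea concrete, here is a minimal ring buffer sketch in TypeScript. It is not the queue from src/teide_thread.h, which is written in C++; the SpscRing class and its integer payload are assumptions chosen to keep the example small. The point it illustrates is that with exactly one producer and one consumer, each index has a single writer, so no lock is needed.

```typescript
// Illustrative SPSC ring over shared memory. Index wraparound past 2^31
// is ignored to keep the sketch short.
class SpscRing {
  private readonly capacity: number;
  private readonly indices: Int32Array; // [0] = head (consumer), [1] = tail (producer)
  private readonly slots: Int32Array;

  constructor(capacity: number) {
    this.capacity = capacity;
    this.indices = new Int32Array(new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT));
    this.slots = new Int32Array(new SharedArrayBuffer(capacity * Int32Array.BYTES_PER_ELEMENT));
  }

  // Producer side only. Returns false when the ring is full.
  push(item: number): boolean {
    const tail = Atomics.load(this.indices, 1);
    const head = Atomics.load(this.indices, 0);
    if (tail - head === this.capacity) return false;
    this.slots[tail % this.capacity] = item;
    Atomics.store(this.indices, 1, tail + 1); // publish only after the slot is written
    return true;
  }

  // Consumer side only. Returns undefined when the ring is empty.
  pop(): number | undefined {
    const head = Atomics.load(this.indices, 0);
    const tail = Atomics.load(this.indices, 1);
    if (head === tail) return undefined;
    const item = this.slots[head % this.capacity];
    Atomics.store(this.indices, 0, head + 1); // hand the slot back to the producer
    return item;
  }
}

const workQueue = new SpscRing(2);
workQueue.push(10);
workQueue.push(20);
const acceptedWhenFull = workQueue.push(30); // the ring holds two items, so this is rejected
```

Only the producer writes the tail index and only the consumer writes the head index, which is what lets both sides proceed without locking each other out.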
Synchronous Path: dispatch_sync()
The V8 thread posts a work item to the SPSC queue, then blocks on a condition variable. The Teide thread picks up the item, executes the operation, writes the result, and signals the condition variable. The V8 thread wakes up and returns the result to JavaScript.
// This call blocks until the Teide thread completes
const table = ctx.readCsvSync('data.csv');
Asynchronous Path: dispatch_async()
The V8 thread posts a work item and immediately returns a Promise. The Teide thread picks up the item, executes the operation, and uses a napi_threadsafe_function to schedule the Promise resolution back on the V8 thread's event loop.
// This returns immediately; work happens on the Teide thread
const table = await ctx.readCsv('data.csv');
Shutdown
When ctx.destroy() is called (or Symbol.dispose triggers), the V8 thread posts a sentinel work item. The Teide thread recognizes the sentinel, cleans up all native resources (td_pool_destroy, td_sym_destroy, td_heap_destroy), and exits. This ensures deterministic cleanup even if JavaScript garbage collection hasn't run.
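The sentinel pattern can be modelled in a few lines. The sketch below is a single-threaded simulation, not the actual thread loop: WorkItem and runWorkLoop are hypothetical names, and the cleanup callbacks merely stand in for td_pool_destroy, td_sym_destroy, and td_heap_destroy.

```typescript
type WorkItem = { kind: "task"; run: () => void } | { kind: "shutdown" };

// Processes items in order until the shutdown sentinel is seen, then runs
// the cleanup steps once and stops. Returns how many tasks were executed.
function runWorkLoop(items: WorkItem[], cleanupSteps: Array<() => void>): number {
  let executed = 0;
  for (const item of items) {
    if (item.kind === "shutdown") {
      for (const step of cleanupSteps) step();
      break; // anything queued after the sentinel is never run
    }
    item.run();
    executed += 1;
  }
  return executed;
}

const trace: string[] = [];
const executedCount = runWorkLoop(
  [
    { kind: "task", run: () => trace.push("read csv") },
    { kind: "shutdown" },
    { kind: "task", run: () => trace.push("never runs") },
  ],
  // Stand-ins for the three native teardown calls, in the order listed above.
  [() => trace.push("pool"), () => trace.push("sym"), () => trace.push("heap")],
);
```

Because the sentinel travels through the same queue as ordinary work, everything posted before it completes before teardown begins.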
Zero-Copy Data Access
When you access a column via table.col('name').data, no data is copied. The NativeSeries C++ class exposes the C heap memory directly as a JavaScript TypedArray backed by an external ArrayBuffer (in Node-API terms, napi_create_external_arraybuffer followed by napi_create_typedarray).
C Heap Memory JavaScript
┌──────────────┐ ┌──────────────────┐
│ float64[1000]│ ───▶ │ Float64Array │
│ (td_col_t) │ │ .buffer points │
│ │ │ to C heap │
└──────────────┘ └──────────────────┘
▲
│ No copy — same memory
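The same aliasing behaviour can be observed in pure JavaScript, without the addon. In the sketch below a plain ArrayBuffer plays the role of the C heap block (an assumption for illustration; in the addon the buffer would be external memory), and two TypedArray views over it see each other's writes.

```typescript
// Stand-in for a column buffer owned by the native heap.
const columnBuffer = new ArrayBuffer(4 * Float64Array.BYTES_PER_ELEMENT);

// Two views over the same bytes: constructing a view does not copy data.
const nativeSideView = new Float64Array(columnBuffer);
const jsSideView = new Float64Array(columnBuffer);

nativeSideView[0] = 42.5;          // a write through one view...
const observed = jsSideView[0];    // ...is visible through the other

// By contrast, slice() allocates a new buffer and copies the contents.
const snapshot = jsSideView.slice();
nativeSideView[0] = 7;             // changes the shared memory, not the snapshot
```

This is also why a zero-copy column must not outlive the memory behind it: the view holds a pointer, not a private copy.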
Use-After-Free Safety
A subtle hazard: JavaScript garbage collection may run Series destructor code after the Teide heap has been torn down during shutdown. The heap_alive_ atomic flag prevents this:
- When the context is alive, heap_alive_ is true. Series destructors call td_release() normally.
- During shutdown, heap_alive_ is set to false before td_heap_destroy() runs.
- If GC triggers a Series destructor after shutdown, it checks heap_alive_ and skips the td_release() call, avoiding a use-after-free crash.
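The ordering in the list above can be simulated as follows. NativeHeap and SeriesHandle are hypothetical stand-ins for the C heap and the NativeSeries wrapper, and finalize() is called by hand here, whereas in the addon it would be triggered by garbage collection.

```typescript
class NativeHeap {
  alive = true;       // plays the role of heap_alive_
  destroyed = false;
  releases = 0;

  release(): void {
    // In native code this would be memory corruption rather than an exception.
    if (this.destroyed) throw new Error("use-after-free: heap already destroyed");
    this.releases += 1;
  }

  shutdown(): void {
    this.alive = false;     // 1. flip the flag first
    this.destroyed = true;  // 2. only then tear the heap down
  }
}

class SeriesHandle {
  private readonly heap: NativeHeap;
  constructor(heap: NativeHeap) {
    this.heap = heap;
  }

  // What the destructor would do when the wrapper is collected.
  finalize(): void {
    if (!this.heap.alive) return; // heap is gone: skip the release
    this.heap.release();
  }
}

const simulatedHeap = new NativeHeap();
const collectedEarly = new SeriesHandle(simulatedHeap);
const collectedLate = new SeriesHandle(simulatedHeap);

collectedEarly.finalize();  // context alive: releases normally
simulatedHeap.shutdown();
collectedLate.finalize();   // after shutdown: skipped, no error
```

Skipping the release after shutdown is safe in this model because tearing down the heap has already reclaimed everything the late handle pointed to.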
Operation Graph
The Teide engine represents computations as a directed acyclic graph (DAG) of operations. Each node in the DAG produces a column of data. The graph structure enables:
- Common subexpression elimination: If two expressions reference the same column with the same transform, the computation runs once.
- Operator fusion: Element-wise operations (add, multiply, compare) on adjacent nodes are fused into a single pass over the data, reducing memory bandwidth.
- Lazy evaluation: The graph is only executed when results are materialized (on collectSync() or collect()).
Example: col('price').mul(lit(1.1)).gt(lit(100))
gt
/ \
mul lit(100)
/ \
col('price') lit(1.1)
Fused into a single vectorized pass:
for each row: result[i] = (price[i] * 1.1) > 100
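A small TypeScript model of that fused pass is sketched below. It compiles the expression tree into one per-row function and then loops over the rows once, so no intermediate column is materialized for the mul node. The OpNode type and the closure-based approach are assumptions for illustration; the engine itself is written in C and does not work through JavaScript closures.

```typescript
type OpNode =
  | { kind: "col"; name: string }
  | { kind: "lit"; value: number }
  | { kind: "mul"; left: OpNode; right: OpNode }
  | { kind: "gt"; left: OpNode; right: OpNode };

type Columns = Record<string, Float64Array>;
type RowFn = (i: number) => number;

// Turns the tree into a single function that evaluates one row.
function compile(node: OpNode, cols: Columns): RowFn {
  switch (node.kind) {
    case "col": {
      const data = cols[node.name];
      if (data === undefined) throw new Error(`unknown column: ${node.name}`);
      return (i) => data[i] ?? NaN;
    }
    case "lit": {
      const value = node.value;
      return () => value;
    }
    case "mul": {
      const l = compile(node.left, cols);
      const r = compile(node.right, cols);
      return (i) => l(i) * r(i);
    }
    case "gt": {
      const l = compile(node.left, cols);
      const r = compile(node.right, cols);
      return (i) => (l(i) > r(i) ? 1 : 0);
    }
  }
}

// One pass over the data: each output row is computed from the inputs directly.
function runFused(node: OpNode, cols: Columns, rows: number): Uint8Array {
  const rowFn = compile(node, cols);
  const out = new Uint8Array(rows);
  for (let i = 0; i < rows; i++) out[i] = rowFn(i);
  return out;
}

// col('price').mul(lit(1.1)).gt(lit(100))
const priceFilter: OpNode = {
  kind: "gt",
  left: { kind: "mul", left: { kind: "col", name: "price" }, right: { kind: "lit", value: 1.1 } },
  right: { kind: "lit", value: 100 },
};

const fusedMask = runFused(priceFilter, { price: Float64Array.of(80, 95, 120) }, 3);
```

An unfused evaluator would instead write a full price * 1.1 column to memory and read it back for the comparison, which is the extra memory traffic that fusion avoids.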
Memory Model
The C17 core engine uses a custom memory allocator designed for columnar workloads:
| Component | Purpose |
|---|---|
| Buddy Allocator | Manages large power-of-2 block allocations for column data. Minimizes external fragmentation. |
| Slab Cache | Fast fixed-size allocations for internal metadata structures (graph nodes, hash entries). |
| Thread-Local Heaps | Each thread gets its own heap for allocation-heavy paths, avoiding contention on a global lock. |
| Reference Counting | td_retain() / td_release() manage object lifetimes. When the refcount hits zero, memory is returned to the allocator. |
The TypeScript layer never allocates or frees native memory directly. The NAPI addon calls td_retain() when wrapping C objects and td_release() when JavaScript garbage-collects the wrapper.
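For readers unfamiliar with buddy allocation, the arithmetic behind the first row of the table is sketched below. This shows the general technique only, not Teide's allocator: the function names and the 64-byte minimum block size are assumptions for illustration.

```typescript
// Smallest power-of-two block that fits `bytes`, never below `minBlock`.
// `minBlock` is assumed to be a power of two.
function blockSizeFor(bytes: number, minBlock = 64): number {
  let size = minBlock;
  while (size < bytes) size *= 2;
  return size;
}

// Offset of a block's buddy: the neighbour it can merge with when both are
// free. Offsets are relative to the start of the arena and aligned to `size`,
// so flipping the single bit equal to `size` moves to the other half of the
// parent block.
function buddyOffset(offset: number, size: number): number {
  return offset ^ size;
}

// A 1000-element float64 column needs 8000 bytes and lands in an 8192-byte block.
const columnBlock = blockSizeFor(1000 * Float64Array.BYTES_PER_ELEMENT);
```

Because every block has exactly one buddy at a computable offset, freed blocks can be merged back into larger ones without searching, which is how this scheme keeps external fragmentation low.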
Build Process
Building teide-js involves three stages: two native compilation steps followed by the TypeScript compile:
┌─────────────────────────────────────────────────┐
│ 1. CMake compiles vendor/teide/ (C17) │
│ → static library libteide.a │
├─────────────────────────────────────────────────┤
│ 2. node-gyp links src/*.cpp (C++17) + libteide │
│ → build/Release/teidedb_addon.node │
├─────────────────────────────────────────────────┤
│ 3. tsc compiles lib/*.ts │
│ → dist/*.js + dist/*.d.ts │
└─────────────────────────────────────────────────┘
C++ Header Inclusion Order
The src/compat.h header provides a C-atomic shim so that the C17 Teide headers, which use _Atomic(T), compile in C++ mode. The shim redefines _Atomic(T) as volatile T and relies on GCC builtins for the atomic operations. Because this redefinition can conflict with the C++ <atomic> header, includes must follow this order:
- NAPI headers (<napi.h>) must be included before compat.h.
- C++ standard headers (<string>, <vector>, etc.) must be included before compat.h.
- compat.h is included last, immediately before any #include <teide/td.h>.
Build Commands
# Full build: native addon (debug) + TypeScript
npm run build
# Native addon only (debug)
npm run build:native
# Native addon with -O3 optimizations
npm run build:native:release
# TypeScript compilation only
npm run build:ts
# Clean build artifacts
npm run clean
Requirements: CMake 3.15 or later, a C17/C++17 capable compiler, and Node.js 18 or later.