GUIDES

Graph Queries (SQL/PGQ)

Model relationships as property graphs, query them with Cypher-like MATCH patterns, and run graph algorithms — all composable with standard SQL.

Relational joins can express many relationships, but some problems — social networks, dependency trees, fraud rings, knowledge graphs — are naturally expressed as graphs. TeideDB supports SQL/PGQ (Property Graph Queries), an ISO SQL extension that lets you define property graphs over existing tables and query them with pattern-matching syntax, without leaving the SQL world.

Why Property Graphs?

Consider a social network where users follow each other. To find "friends of friends" with relational joins, you need a self-join for each hop: two joins for 2-hop, three for 3-hop, and so on. The SQL becomes verbose and rigid — you cannot easily express "any path up to 5 hops long." Property graphs solve this with pattern matching: you describe the shape of the path you want, and the engine finds all matches. This is the same idea behind Cypher (Neo4j) and SPARQL, but integrated directly into SQL.

Building the Graph

A property graph is defined over existing relational tables. Node tables provide vertices (with properties from columns), and edge tables provide directed relationships. The CREATE PROPERTY GRAPH statement declares which tables serve as nodes and edges, and how they connect via key columns.

import { Context } from 'teide-js';

const ctx = new Context();

// Create node tables
ctx.executeSync(`
  CREATE TABLE person (
    id   INT,
    name TEXT,
    city TEXT
  )
`);

ctx.executeSync(`
  INSERT INTO person VALUES
    (1, 'Alice',   'Seattle'),
    (2, 'Bob',     'Portland'),
    (3, 'Carol',   'Seattle'),
    (4, 'Dave',    'Denver'),
    (5, 'Eve',     'Portland'),
    (6, 'Frank',   'Denver'),
    (7, 'Grace',   'Seattle')
`);

// Create edge table
ctx.executeSync(`
  CREATE TABLE follows (
    src    INT,
    dst    INT,
    since  TEXT
  )
`);

ctx.executeSync(`
  INSERT INTO follows VALUES
    (1, 2, '2023-01'),
    (1, 3, '2023-02'),
    (2, 4, '2023-03'),
    (3, 4, '2023-01'),
    (3, 5, '2023-04'),
    (4, 6, '2023-02'),
    (5, 7, '2023-05'),
    (6, 1, '2023-06'),
    (7, 1, '2023-03')
`);

// Define the property graph
ctx.executeSync(`
  CREATE PROPERTY GRAPH social_net
  VERTEX TABLES (
    person KEY (id)
  )
  EDGE TABLES (
    follows KEY (src, dst)
      SOURCE person (src) DESTINATION person (dst)
  )
`);

Note: The KEY clause specifies which columns uniquely identify nodes and edges. The SOURCE and DESTINATION clauses tell TeideDB how edge endpoints map to node keys.

1-Hop Queries: Direct Connections

The simplest graph query finds direct connections. The MATCH clause uses an ASCII-art syntax to describe the pattern: parentheses for nodes, square brackets for edges, and arrows for direction. This query finds everyone that Alice follows directly.

SELECT dst.name, e.since
FROM GRAPH_TABLE (social_net
  MATCH (src:person)-[e:follows]->(dst:person)
  WHERE src.name = 'Alice'
  COLUMNS (dst.name, e.since)
)
ORDER BY e.since;
const direct = ctx.executeSync(`
  SELECT dst.name, e.since
  FROM GRAPH_TABLE (social_net
    MATCH (src:person)-[e:follows]->(dst:person)
    WHERE src.name = 'Alice'
    COLUMNS (dst.name, e.since)
  )
  ORDER BY e.since
`);

console.log('Alice follows:');
const targets = direct.getColumn('name').toArray();
const dates = direct.getColumn('since').toArray();
targets.forEach((t, i) => console.log(`  ${t} (since ${dates[i]})`) );
Alice follows: Bob (since 2023-01) Carol (since 2023-02)

Multi-Hop Traversal

Variable-length path patterns let you find connections across multiple hops without writing repeated joins. The {1,3} quantifier means "follow between 1 and 3 edges." The + quantifier means "one or more hops" (transitive closure). This is where graph queries become dramatically more concise than equivalent relational SQL.

Variable-length paths: 1 to 3 hops

SELECT DISTINCT dst.name, dst.city
FROM GRAPH_TABLE (social_net
  MATCH (src:person)-[:follows]->{1,3}(dst:person)
  WHERE src.name = 'Alice'
  COLUMNS (dst.name, dst.city)
)
ORDER BY dst.name;
Bob Portland Carol Seattle Dave Denver Eve Portland Frank Denver Grace Seattle

Transitive closure with +

The + quantifier follows edges until no more new nodes are reachable. This computes the full transitive closure — everyone reachable from a starting node, regardless of distance.

SELECT DISTINCT dst.name
FROM GRAPH_TABLE (social_net
  MATCH (src:person)-[:follows]->+(dst:person)
  WHERE src.name = 'Alice'
  COLUMNS (dst.name)
)
ORDER BY dst.name;
Alice Bob Carol Dave Eve Frank Grace

Note: Alice appears in her own transitive closure because there is a cycle: Alice → Bob → Dave → Frank → Alice. TeideDB handles cycles correctly, terminating traversal when it revisits a node.

Shortest Path

Finding the shortest path between two nodes is one of the most common graph operations. SQL/PGQ provides the ANY SHORTEST path qualifier, which returns one shortest path (if multiple exist with the same length). This is useful for routing, influence distance, and degree-of-separation queries.

SELECT path_length, intermediate_nodes
FROM GRAPH_TABLE (social_net
  MATCH ANY SHORTEST
    (src:person)-[:follows]->+(dst:person)
  WHERE src.name = 'Alice' AND dst.name = 'Grace'
  COLUMNS (
    EDGE_COUNT() AS path_length,
    NODE_NAMES() AS intermediate_nodes
  )
);
const shortest = ctx.executeSync(`
  SELECT path_length, intermediate_nodes
  FROM GRAPH_TABLE (social_net
    MATCH ANY SHORTEST
      (src:person)-[:follows]->+(dst:person)
    WHERE src.name = 'Alice' AND dst.name = 'Grace'
    COLUMNS (
      EDGE_COUNT() AS path_length,
      NODE_NAMES() AS intermediate_nodes
    )
  )
`);

const len = shortest.getColumn('path_length').toArray()[0];
const path = shortest.getColumn('intermediate_nodes').toArray()[0];
console.log(`Shortest path (${len} hops): ${path}`);
Shortest path (3 hops): Alice -> Carol -> Eve -> Grace

Graph Algorithms

Beyond pattern matching, TeideDB provides built-in graph algorithms that operate on the property graph structure. These run as SQL functions that return table results, so you can compose them with standard SQL operations like GROUP BY, ORDER BY, and joins.

PageRank: measuring influence

PageRank scores each node based on the number and quality of incoming connections. Nodes pointed to by many other well-connected nodes receive higher scores. This algorithm, originally designed for web page ranking, applies equally well to influence in social networks, importance in citation networks, or criticality in dependency graphs.

SELECT name, score
FROM GRAPH_ALGORITHM (social_net,
  PAGERANK(damping := 0.85, iterations := 20)
  COLUMNS (name, score)
)
ORDER BY score DESC;
const pr = ctx.executeSync(`
  SELECT name, ROUND(score, 4) AS score
  FROM GRAPH_ALGORITHM (social_net,
    PAGERANK(damping := 0.85, iterations := 20)
    COLUMNS (name, score)
  )
  ORDER BY score DESC
`);

console.log('PageRank scores:');
const prNames = pr.getColumn('name').toArray();
const prScores = pr.getColumn('score').toArray();
prNames.forEach((n, i) => console.log(`  ${n}: ${prScores[i]}`));
PageRank scores: Alice: 0.2134 Dave: 0.1876 Carol: 0.1245 Bob: 0.1198 Frank: 0.1102 Eve: 0.0923 Grace: 0.0812

Connected components: finding clusters

The COMPONENT algorithm assigns a component ID to each node. Nodes in the same connected component can reach each other via some path (ignoring edge direction). This is useful for finding isolated subgroups, detecting disconnected clusters, or verifying that a network is fully connected.

SELECT component_id, COUNT(*) AS size, GROUP_CONCAT(name) AS members
FROM GRAPH_ALGORITHM (social_net,
  COMPONENT()
  COLUMNS (name, component_id)
)
GROUP BY component_id
ORDER BY size DESC;
0 7 Alice,Bob,Carol,Dave,Eve,Frank,Grace

Community detection: finding groups

The COMMUNITY algorithm uses modularity-based detection (similar to Louvain) to find densely connected subgroups within a larger graph. Unlike connected components, communities can exist within a single connected component — they represent clusters where internal connections are denser than external ones.

SELECT community_id, COUNT(*) AS size, GROUP_CONCAT(name) AS members
FROM GRAPH_ALGORITHM (social_net,
  COMMUNITY()
  COLUMNS (name, community_id)
)
GROUP BY community_id
ORDER BY size DESC;
0 3 Alice,Bob,Carol 1 2 Dave,Frank 2 2 Eve,Grace

Combining Graph Results with SQL

One of the most powerful aspects of SQL/PGQ is that graph query results are regular SQL tables. You can join them with other tables, aggregate them, filter them, or use them in CTEs. This lets you mix graph analysis with traditional relational queries seamlessly.

-- Find the most influential person in each city
WITH influence AS (
  SELECT name, score
  FROM GRAPH_ALGORITHM (social_net,
    PAGERANK(damping := 0.85, iterations := 20)
    COLUMNS (name, score)
  )
)
SELECT p.city, p.name, ROUND(i.score, 4) AS pagerank
FROM person p
INNER JOIN influence i ON p.name = i.name
WHERE i.score = (
  SELECT MAX(i2.score)
  FROM person p2
  INNER JOIN influence i2 ON p2.name = i2.name
  WHERE p2.city = p.city
)
ORDER BY pagerank DESC;
const cityInfluence = ctx.executeSync(`
  WITH influence AS (
    SELECT name, score
    FROM GRAPH_ALGORITHM (social_net,
      PAGERANK(damping := 0.85, iterations := 20)
      COLUMNS (name, score)
    )
  )
  SELECT p.city, p.name, ROUND(i.score, 4) AS pagerank
  FROM person p
  INNER JOIN influence i ON p.name = i.name
  WHERE i.score = (
    SELECT MAX(i2.score)
    FROM person p2
    INNER JOIN influence i2 ON p2.name = i2.name
    WHERE p2.city = p.city
  )
  ORDER BY pagerank DESC
`);

console.log('Most influential person per city:');
const cities = cityInfluence.getColumn('city').toArray();
const ciNames = cityInfluence.getColumn('name').toArray();
const ciScores = cityInfluence.getColumn('pagerank').toArray();

cities.forEach((c, i) => {
  console.log(`  ${c}: ${ciNames[i]} (PageRank ${ciScores[i]})`);
});

ctx.destroy();
Most influential person per city: Seattle: Alice (PageRank 0.2134) Denver: Dave (PageRank 0.1876) Portland: Bob (PageRank 0.1198)