← Back to blog

Find the Shortest Path and Go Home -- Neo4j + Dijkstra

dsp2017 neo4j cypher graphs algorithms database

Originally published on speedydev.pl in 2017 as part of the Daj Się Poznać blogging contest. Republished here for archival purposes.

The “magic command” Neo4j gives you out of the box:

MATCH (s:Point), (e:Point),
  p = shortestPath((s)-[*]-(e))
WHERE s.name = 'JASŁO' AND e.name = 'RZESZÓW'
RETURN p

shortestPath result -- Jasło to Rzeszów by hop count

Returns the shortest path by number of relationships, not actual distance. That’s not what we need.

Building Distance-Weighted Edges Between Points

The railway model has Points (stops) connected through Segments (track sections). To route by distance, I needed direct Point-to-Point connections with a distance property.

MATCH (s:Point)-[r:Stay]-(t:Segment)
WHERE r.lineNo = t.lineNo
WITH DISTINCT s, r.km AS km, t.lineNo AS lineNo
ORDER BY km
WITH
  collect({s: s, km: km, lineNo: lineNo}) AS one,
  collect({s: s, km: km, lineNo: lineNo}) AS two
UNWIND one AS ss
WITH ss, filter(
  next IN two
  WHERE ss.lineNo = next.lineNo
    AND next.km >= ss.km
    AND NOT next.s.name = ss.s.name
)[0] AS nss
WITH ss.s AS fi, nss.km - ss.km AS distance, ss.lineNo AS lineNo, nss.s AS se
WHERE se IS NOT NULL
MERGE (fi)-[:Connect {distance: distance, lineNo: lineNo}]-(se)

Getting Cypher-fluent took a while here — open to better approaches.

Actual Shortest Path with Dijkstra via APOC

With weighted edges in place, sum distances using reduce and pull the minimum:

START startNode=node:node_auto_index(name="Start"),
      endNode=node:node_auto_index(name="Finish")
MATCH p=(startNode)-[:NAVIGATE_TO*]->(endNode)
RETURN p AS shortestPath,
       reduce(distance=0, r IN relationships(p) : distance + r.distance) AS totalDistance
ORDER BY totalDistance ASC LIMIT 1;

Don’t run this on more than ~100 nodes. It’s a Cartesian product under the hood — on 25k Points and 60k Segments it just dies. Lesson learned after 5 minutes of staring at a frozen query wondering what went wrong.

The right tool: neo4j-apoc-procedures GitHub which ships Dijkstra as a plugin:

MATCH (s:Point {name: 'JASŁO'}), (e:Point {name: 'RZESZÓW'})
CALL apoc.algo.dijkstra(s, e, 'Connect', 'distance') YIELD path, weight
RETURN path, weight

That’s the real shortest path.

Dijkstra result in Neo4j

Excluding Stations

What if you can’t route through a specific station (e.g., Czudec is closed)? My quick solution: copy the relevant relationships excluding that node, run Dijkstra on the temp set, then clean up.

MATCH (s:Point)-[c:Connect]-(t:Point)
WHERE s.name <> 'Czudec' AND t.name <> 'Czudec'
MERGE (s)-[:Connect_tempWOCzud {distance: c.distance, lineNo: c.lineNo}]-(t)

Copying ~2k relationships (minus the ~2 Czudec had) took ~450ms — acceptable. The fun part: the alternative route around Czudec adds 100km to the journey. That’s just the map we have.

MATCH (s:Point {name: 'JASŁO'}), (e:Point {name: 'RZESZÓW'})
CALL apoc.algo.dijkstra(s, e, 'Connect_tempWOCzud', 'distance') YIELD path, weight
RETURN path, weight

Graph without Czudec

Data Quality Fun

Railway graph with data issues visible

While exploring alternative paths, I discovered a gem: a node called “Granica Państwa” (State Border) with distance 0 that effectively teleports you from Zgorzelec to Cieszyn. The dataset has its quirks.

Next step: build a proper .NET service to handle transactions, exclusions, and routing in one go. Then — actually go home.