$ cat blog/neo4j-shortest-path-dijkstra.md
Find the Shortest Path and Go Home -- Neo4j + Dijkstra
Originally published on speedydev.pl in 2017 as part of the Daj Się Poznać blogging contest. Republished here for archival purposes.
The “magic command” Neo4j gives you out of the box:
MATCH (s:Point), (e:Point),
p = shortestPath((s)-[*]-(e))
WHERE s.name = 'JASŁO' AND e.name = 'RZESZÓW'
RETURN p
Returns the shortest path by number of relationships, not actual distance. That’s not what we need.
Building Distance-Weighted Edges Between Points
The railway model has Points (stops) connected through Segments (track sections). To route by distance, I needed direct Point-to-Point connections with a distance property.
MATCH (s:Point)-[r:Stay]-(t:Segment)
WHERE r.lineNo = t.lineNo
WITH DISTINCT s, r.km AS km, t.lineNo AS lineNo
ORDER BY km
WITH
collect({s: s, km: km, lineNo: lineNo}) AS one,
collect({s: s, km: km, lineNo: lineNo}) AS two
UNWIND one AS ss
WITH ss, filter(
next IN two
WHERE ss.lineNo = next.lineNo
AND next.km >= ss.km
AND NOT next.s.name = ss.s.name
)[0] AS nss
WITH ss.s AS fi, nss.km - ss.km AS distance, ss.lineNo AS lineNo, nss.s AS se
WHERE se IS NOT NULL
MERGE (fi)-[:Connect {distance: distance, lineNo: lineNo}]-(se)
Getting Cypher-fluent took a while here — open to better approaches.
Actual Shortest Path with Dijkstra via APOC
With weighted edges in place, sum distances using reduce and pull the minimum:
START startNode=node:node_auto_index(name="Start"),
endNode=node:node_auto_index(name="Finish")
MATCH p=(startNode)-[:NAVIGATE_TO*]->(endNode)
RETURN p AS shortestPath,
reduce(distance=0, r IN relationships(p) : distance + r.distance) AS totalDistance
ORDER BY totalDistance ASC LIMIT 1;
Don’t run this on more than ~100 nodes. It’s a Cartesian product under the hood — on 25k Points and 60k Segments it just dies. Lesson learned after 5 minutes of staring at a frozen query wondering what went wrong.
The right tool: neo4j-apoc-procedures
which ships Dijkstra as a plugin:
MATCH (s:Point {name: 'JASŁO'}), (e:Point {name: 'RZESZÓW'})
CALL apoc.algo.dijkstra(s, e, 'Connect', 'distance') YIELD path, weight
RETURN path, weight
That’s the real shortest path.
Excluding Stations
What if you can’t route through a specific station (e.g., Czudec is closed)? My quick solution: copy the relevant relationships excluding that node, run Dijkstra on the temp set, then clean up.
MATCH (s:Point)-[c:Connect]-(t:Point)
WHERE s.name <> 'Czudec' AND t.name <> 'Czudec'
MERGE (s)-[:Connect_tempWOCzud {distance: c.distance, lineNo: c.lineNo}]-(t)
Copying ~2k relationships (minus the ~2 Czudec had) took ~450ms — acceptable. The fun part: the alternative route around Czudec adds 100km to the journey. That’s just the map we have.
MATCH (s:Point {name: 'JASŁO'}), (e:Point {name: 'RZESZÓW'})
CALL apoc.algo.dijkstra(s, e, 'Connect_tempWOCzud', 'distance') YIELD path, weight
RETURN path, weight

Data Quality Fun

While exploring alternative paths, I discovered a gem: a node called “Granica Państwa” (State Border) with distance 0 that effectively teleports you from Zgorzelec to Cieszyn. The dataset has its quirks.
Next step: build a proper .NET service to handle transactions, exclusions, and routing in one go. Then — actually go home.