Use the APOC Generate Function to Build a Neo4j Graph of a Small Galaxy

May 25: Towel Day, One Month Graph Challenge

May 25, 2019

Welcome word

In this series of short posts I build one simple graph a day. The domain model of each graph is somehow related to the history of the day: a historical event, a celebration, or a person. I do this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired, and next month will be your own One Month Graph Challenge. #OMGChallenge

Domain model

Towel Day is a tribute to Douglas Adams. On this day all Douglas Adams fans are encouraged to carry a towel with them. Let the towel be visible (make sure it catches the eye) and use it as a conversation starter, so that even those who have never read “The Hitchhiker’s Guide to the Galaxy” go and find a copy. The towel can be wrapped around the head, used as a weapon, soaked with nutrients: anything!

What I want to try today must be fun. My plan: generate a small graph of a Galaxy with a modest number of stars and a random number of highways between them. Then, starting from the center of this small Galaxy, ride to the farthest star at its edge. I hope that someone driving the same way can help us with transportation.

Graph

Let’s generate a small Galaxy and name it Towel Galaxy, for example. Is it enough to put 1 million stars into it? For comparison, the Milky Way contains between 100 and 400 billion stars. OK, our Towel Galaxy is super-duper small.

MATCH (n) DETACH DELETE n;
CALL apoc.generate.ba(1000000, 1, 'Star', 'HIGHWAY');
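After generation, a quick sanity check can confirm the size of the Galaxy. This is a minimal sketch that uses only the Star label and the HIGHWAY relationship type from the call above. With one highway per new star, the two numbers should come out close to each other, although the exact counts depend on how the generator seeds the graph.

```cypher
// How many stars were generated?
MATCH (s:Star)
RETURN count(s) AS stars;

// How many highways connect them?
MATCH (:Star)-[h:HIGHWAY]->(:Star)
RETURN count(h) AS highways;
```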

This small picture includes only about 1000 stars, but even this small number looks beautiful. Now I need to decide which star is the center of Towel Galaxy and label it as the Core.

Before you run the next query, please make sure that your machine is ready to handle a Galaxy of 1 million stars.

CALL algo.betweenness.stream('Star', 'HIGHWAY') YIELD nodeId, centrality
MATCH (s:Star) WHERE id(s) = nodeId
RETURN s.uuid AS starId, centrality
ORDER BY centrality DESC
LIMIT 1

This query is a bad idea. It tries to handle tons of data. I don’t even have results from this Betweenness Centrality run: the computation took an infinity and a little bit more, so I simply cancelled it.
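One way to see what the algorithm returns is to try it on a much smaller galaxy first. The sketch below is an experiment for a scratch database, not part of the main walkthrough, and the size of 1000 stars is arbitrary. It also passes a direction setting, which is an assumption taken from my reading of the Graph Algorithms 3.5 documentation. The reasoning: if highways are treated as directed and all point away from the center, no shortest path uses the central star as an intermediate stop, so its score would probably be 0.

```cypher
// Scratch database only: this statement deletes everything.
MATCH (n) DETACH DELETE n;

// 1000 stars instead of 1000000 (arbitrary size for the experiment).
CALL apoc.generate.ba(1000, 1, 'Star', 'HIGHWAY');

// Same query as before, but treating highways as two-way roads.
// Assumption: the config map accepts direction: 'both' in this library version.
CALL algo.betweenness.stream('Star', 'HIGHWAY', {direction: 'both'})
YIELD nodeId, centrality
MATCH (s:Star) WHERE id(s) = nodeId
RETURN s.uuid AS starId, centrality
ORDER BY centrality DESC
LIMIT 5;
```

Even on a small graph the top-scoring star is not guaranteed to be the geometric center. It is simply the star that the most shortest routes pass through.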

But I have an idea for another approach.

Let’s identify all the Edge stars first: all stars with only 1 neighbour.

MATCH (:Star)-[:HIGHWAY]->(s:Star)
WHERE NOT (s)-[:HIGHWAY]->(:Star)
SET s:Edge
RETURN count(s) as numberOfEdges
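To double-check the "only 1 neighbour" idea, the stars can be grouped by how many highways touch them, ignoring direction. This is a sketch under the assumption that every star has at least one highway. The row with neighbours = 1 should then be close to the numberOfEdges value returned above.

```cypher
// Degree distribution: how many stars have 1, 2, 3, ... neighbours?
MATCH (s:Star)
RETURN size((s)-[:HIGHWAY]-()) AS neighbours, count(*) AS stars
ORDER BY neighbours
LIMIT 10;
```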

Our Galaxy has 667076 “edge” stars, but not all of them are our destination points. Let’s calculate the longest paths that exist in the Galaxy. Because the query follows the direction of the arrows, all roads start from the center.

MATCH p = ((s:Star)-[*]->(e:Edge))
RETURN length(p) as depth
ORDER BY depth DESC
LIMIT 10

Now I know that the longest trip from the center of the Galaxy to the Edge takes 20 steps.
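A variation of the same idea counts how many trips exist at each depth, instead of listing only the ten longest. This sketch assumes that the Edge label has already been set by the earlier query. It also assumes that the center is a star with no incoming highway, which follows from the arrows pointing outward but is not something the generator guarantees.

```cypher
// Trips from stars without incoming highways to Edge stars, grouped by length.
MATCH p = (s:Star)-[:HIGHWAY*]->(e:Edge)
WHERE NOT ()-[:HIGHWAY]->(s)
RETURN length(p) AS depth, count(*) AS trips
ORDER BY depth DESC
LIMIT 5;
```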

Time to find the Core of Towel Galaxy. I already know the max depth, and I know that an Edge is an endpoint, so all I need is the one star exactly 20 steps away from an Edge. Let’s mark the central star as the Core and the star at the far edge of the Galaxy as the End.

MATCH (s:Star)-[*20]->(e:Edge)
SET s:Core, e:End
RETURN s
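A small cross-check of the result could look like this. It is a sketch that relies on the Core label set by the query above and on the uuid property used earlier in this post. If the Core really is the center, it should have no incoming highways and at least one outgoing highway.

```cypher
// Inspect every star labelled Core: ideally exactly one row comes back.
MATCH (c:Core)
RETURN c.uuid AS starId,
       size(()-[:HIGHWAY]->(c)) AS incomingHighways,
       size((c)-[:HIGHWAY]->()) AS outgoingHighways;
```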

Now I know the Core and the End. Let’s find the longest trips from the Core to the Edges. As you remember, one of them has a depth of 20 steps, and some others have a depth of 19. Let’s print out up to 100 of these trips from the Core to the Edges of Towel Galaxy.

MATCH (e:Edge)
MATCH trip = ((c:Core)-[*19..20]->(e))
RETURN trip
LIMIT 100
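If a table is easier to read than a picture, the same trips can be returned as lists of star identifiers. This is a sketch assuming the Core and Edge labels from the previous queries and the uuid property on each star. The limit of 10 is arbitrary.

```cypher
// Longest trips from the Core, one row per trip, as an ordered list of stars.
MATCH trip = (c:Core)-[:HIGHWAY*19..20]->(e:Edge)
RETURN length(trip) AS steps,
       [star IN nodes(trip) | star.uuid] AS route
ORDER BY steps DESC
LIMIT 10;
```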

Now we are ready for hitchhiking! Have a nice trip!

Summary

I would like to hear your suggestions about the proper usage of Centrality algorithms in cases like this. How should they be applied? Any examples are welcome.

By the way, can you now answer the Ultimate Question of Life, the Universe, and Everything?

Resources

APOC User Guide 3.5

apoc.path.expandConfig(startNode |Node|list…

neo4j-contrib.github.io

5.3. The Betweenness Centrality algorithm - Chapter 5. Centrality algorithms

This section describes the Betweenness Centrality algorithm in the Neo4j Graph Algorithms library. Betweenness…

neo4j.com

2.9. Patterns - Chapter 2. Syntax

Patterns and pattern-matching are at the very heart of Cypher, so being effective with Cypher requires a good…

neo4j.com


Written by Vlad Batushkov
