Neo4j Random Walk Algorithm to Traverse Graph of Pubs

May 21: FIFA, One Month Graph Challenge

Vlad Batushkov

--

Welcome word

In this series of small posts I do one simple graph daily. Domain model of graph somehow related to day’s history, some historical event, celebration or person. I do this challenge to learn Neo4j Data Modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired and next month would be yours One Month Graph Challenge. #OMGChallenge

Domain model

On May 21, 1904, (115 years ago) the International Football Federation was founded in Paris (fr. Federation Internationale de Football Association, abbr. FIFA), the main purpose of which was to unite the national football federations and facilitate the holding of world-class competitions.

Same as the biggest part of men on this planet, I am a football fan. This is why today my small graph would be somehow dedicated to my favorite Premier League.

I am fan of Liverpool. You might know, that one of the most famous song in football “You’ll Never Walk Alone” belongs to this famous football club fans. Plus, I like to watch games in sport pubs. According to all of this, I decide to combine all this facts together and don’t see better opportunaty to try Random Walk algorithm. As described in documentation, it is similar to how a drunk person (football fan) traverses a city.

I think, it would be much more fun, than recalculate how many games Manchester City won in this season. Hate them.

Graph

Welcome to my small town of Graphville, full of pubs. Again. Previous mentioning about Graphville, you can find in post about Pioneriya.

Some pubs in this town belongs to the footballs clubs or just neutral to football. I have never been in England, but I guess, it will match to map of any nearby town of Merseyside.

MATCH (p) DETACH DELETE p;
CALL apoc.generate.er(20, 30, 'Pub', 'ROAD');

I don’t want to see generated map. I will put it here only in the end of article. Visualize the map or at least some map features, based on Random Walk algorithm results — this is the main goal for today.

So, for now, we have 20 pubs connected by 30 roads.

Now, let’s set a property “club” in 10 of them, as supporters of 10 different football clubs, rest 10 gonna be neutral.

WITH ["Liverpool", "Everton", "Tranmere", "Southport", "Marine", "PrescotCables", "Bootle", "CammellLaird1907", "Litherland", "StHelensTown"] as clubs
MATCH (p:Pub)
SET p.club = "neutral"
WITH apoc.coll.randomItems(collect(p), 10, false) as pubs, clubs, range(0, 9) as arr
UNWIND arr as idx
SET (pubs[idx]).club = clubs[idx]
RETURN clubs[idx] as processedClubs

Totally no idea how the town road map look like right now. So, let’s try to ask drunk football fan to walk alone a little bit. Not far. Just 5 times walk to next nearest 2 pubs.

MATCH (p:Pub { club: "Liverpool" })
CALL algo.randomWalk.stream(id(p), 2, 5) YIELD nodeIds
UNWIND nodeIds AS nodeId
RETURN algo.asNode(nodeId).club AS pubOfClub

This is the output of the algorithm. Already explains something about pubs location, but I don’t want to draw manually, so I need to automate my query.

Let’s generate a new DrunkPub node for each next visited pub and connect two DrunkPubs with a DRUNK_ROAD relationship. I also set more pubs to visit and more attempts to walk — it must gives me better picture of pubs location around.

MATCH (p:Pub { club: "Liverpool" })
CALL algo.randomWalk.stream(id(p), 5, 10) YIELD nodeIds
UNWIND nodeIds AS nodeId
WITH { id: nodeId, club: algo.asNode(nodeId).club } AS pubOfClub
WITH collect(pubOfClub) as clubs
CALL apoc.coll.partition(clubs, 6) YIELD value
WITH [value[0], value[1], value[2], value[3], value[4], value[5]] as route
WITH apoc.coll.pairsMin(route) as pubsPairs
UNWIND pubsPairs as pubsPair
MERGE (p1:DrunkPub { id: pubsPair[0].id, club: pubsPair[0].club })
MERGE (p2:DrunkPub { id: pubsPair[1].id, club: pubsPair[1].club })
MERGE (p1)-[:DRUNK_ROAD]-(p2)
Drunk Map (red) VS (green) Original Map

Great job, buddy! Pint of your best ale to this guy with red scarf!

Resume

Interesting topic, is not it? I can image, that based on this approach we can discover graph structure and detect something important. For example, some finding of vulnerabilities or detection of changes between node relationships. Hope you have cool ideas on utilize of this algorithm. Share your thoughts in comments.

Maybe someone want to build the whole city map? What about define proper directions on relationships here? Not sure, is it possible or not… Need to learn more.

And before you leave this page, remember, you will never walk alone.

Similar topics

Unusual use of Neo4j Common Neighbors Algorithm on people community graph

Neo4j Betweenness Centrality Algorithm detailed and fun example

Neo4j kShortestPaths Algorithm apply to City Map

Neo4j Cosine Similarity Algorithm example

Analyse Neo4j Graph of Books domain model with Jaccard Similarity Algorithm

Resources

Sign up to discover human stories that deepen your understanding of the world.

--

--

Vlad Batushkov
Vlad Batushkov

Written by Vlad Batushkov

Engineering Manager @ Agoda. Neo4j Ninja. Articles brewed on modern tech, hops and indie rock’n’roll.

No responses yet

Write a response