Photo by Alex Kremer from Pexels

Neo4j graph Data Modeling of Star Wars Universe with APOC load json

May 4: Star Wars, One Month Graph Challenge

Vlad Batushkov
May 4, 2019

Welcome word

In this series of short posts I build one simple graph a day. The domain model of each graph is somehow related to that day in history: a historical event, a celebration, or a person. I am doing this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired, and next month will be your own One Month Graph Challenge. #OMGChallenge

Domain model

May the 4th be with you. May the Force be with you. This epic phrase connects millions of people around the world. Everybody likes Star Wars. Seriously. So today I will try to build a small graph of Star Wars. I hope the public Star Wars API will help me with my challenge duty.

I ran into a parsing issue when using https://swapi.co/api/ directly, but this unexpected problem did not stop me. Thanks to Florent Georges, who imported all the data into a public GitHub repo: https://github.com/fgeorges/star-wars-dataset. Man, you saved my day! Finally, I published this JSON file as a standalone resource here: https://vbatushkov.bitbucket.io/swapi.json.

The structure of the JSON file looks like this:

{
  "root": {
    "people": "http://swapi.co/api/people/",
    "planets": "http://swapi.co/api/planets/",
    "films": "http://swapi.co/api/films/",
    "species": "http://swapi.co/api/species/",
    "vehicles": "http://swapi.co/api/vehicles/",
    "starships": "http://swapi.co/api/starships/"
  },
  "people": [{
    "url": "http://swapi.co/api/people/1/",
    "name": "Luke Skywalker",
    "homeworld": "",
    "films": [],
    "species": [],
    "vehicles": [],
    "starships": []
  }],
  "planets": [{
    "url": "http://swapi.co/api/planets/3/",
    "name": "Alderaan",
    "residents": [],
    "films": []
  }],
  "films": [{
    "url": "http://swapi.co/api/films/1/",
    "title": "A New Hope",
    "characters": [],
    "planets": [],
    "starships": [],
    "vehicles": [],
    "species": []
  }],
  "species": [{
    "url": "http://swapi.co/api/species/5/",
    "name": "Hutt",
    "homeworld": "",
    "people": [],
    "films": []
  }],
  "vehicles": [{
    "url": "http://swapi.co/api/vehicles/4/",
    "name": "Sand Crawler",
    "pilots": [],
    "films": []
  }],
  "starships": [{
    "url": "http://swapi.co/api/starships/5/",
    "name": "Sentinel-class landing craft",
    "pilots": [],
    "films": []
  }]
}

I kept only the important properties of each entity: url is used as a unique id, name (title for films) is used as the display field, and the arrays help to connect each entity to the others. Let’s look at the labeled nodes that come out of this data and all the relationships between them.

Labels: Film, Character, Planet, Species, Vehicle, Starship.

Relationships: Films have links to all other types, so every node potentially has a relationship to a film: (*)-[:APPEARED_IN]->(:Film). A Planet is a homeworld for Species and Character, so the relationships are (:Species)-[:HOMEWORLD]->(:Planet) and (:Character)-[:HOMEWORLD]->(:Planet). A Character belongs to some Species, which leads to (:Character)-[:OF]->(:Species). A Character can also pilot different kinds of transport, so (:Character)-[:PILOT]->(:Starship) and (:Character)-[:PILOT]->(:Vehicle). This is the schema that I expect to build.

Graph

As a small warm-up, I list all the characters involved in the Star Wars saga.
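A query along these lines would produce that list. It is a minimal sketch, assuming the APOC plugin is installed, the JSON file is still served at the URL above, and the file has the shape shown earlier.

```cypher
// Load the whole JSON document; apoc.load.json yields it as a map named "value".
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
// Turn the "people" array into one row per character.
UNWIND value.people AS person
RETURN person.name AS name
ORDER BY name
```

Nothing is written to the database here; the query only reads the file and returns names.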

To simplify the flow, I first create only the nodes, without relationships. The main nodes of the saga come first: films, characters and planets.
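One way to write this step is sketched below. It is not necessarily the exact query behind this post, and it assumes APOC is available and the file keeps the structure shown above.

```cypher
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
// MERGE on url (the unique id) so that re-running the import does not duplicate nodes.
FOREACH (film IN value.films |
  MERGE (f:Film {url: film.url}) SET f.title = film.title)
FOREACH (person IN value.people |
  MERGE (c:Character {url: person.url}) SET c.name = person.name)
FOREACH (planet IN value.planets |
  MERGE (p:Planet {url: planet.url}) SET p.name = planet.name)
```

Films carry a title property while the other nodes carry name, mirroring the JSON fields.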

Then the rest of the world: species, vehicles and starships.
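The same pattern can cover the remaining three arrays. Again this is only a sketch under the same assumptions (APOC installed, file reachable, fields named as in the sample).

```cypher
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
// One node per entry, keyed by url, with name as the display property.
FOREACH (species IN value.species |
  MERGE (s:Species {url: species.url}) SET s.name = species.name)
FOREACH (vehicle IN value.vehicles |
  MERGE (v:Vehicle {url: vehicle.url}) SET v.name = vehicle.name)
FOREACH (starship IN value.starships |
  MERGE (s:Starship {url: starship.url}) SET s.name = starship.name)
```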

Cool! But nodes without relationships are still not a graph. Time to connect all the nodes, based on the links they have.

Connect characters, planets and species with the films they appeared in.
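A possible query for this step walks each film's link arrays and matches the already created nodes by url. Treat it as a sketch: it assumes the nodes exist with a url property and that every film entry has characters, planets and species arrays.

```cypher
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.films AS film
MATCH (f:Film {url: film.url})
// Concatenate the three URL arrays and handle them in one pass.
UNWIND film.characters + film.planets + film.species AS link
MATCH (n {url: link})
WHERE n:Character OR n:Planet OR n:Species
MERGE (n)-[:APPEARED_IN]->(f)
```

Matching on url without a label scans all nodes, which is acceptable for a dataset this small.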

Connect vehicles and starships with the films they appeared in.
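The transport side can be linked the same way. This sketch assumes Vehicle and Starship nodes already exist with a url property and that each film entry has vehicles and starships arrays.

```cypher
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.films AS film
MATCH (f:Film {url: film.url})
// Both arrays hold plain URLs, so they can be processed together.
UNWIND film.vehicles + film.starships AS link
MATCH (n {url: link})
WHERE n:Vehicle OR n:Starship
MERGE (n)-[:APPEARED_IN]->(f)
```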

Some nodes are still not connected. Why? Actually, the JSON file contains more things than appeared in the movies. It is easy to see with the example of the planet Ojom. The planet does not appear in any film, but it is the homeworld of Dexter Jettster. This relationship will be added a bit later.

Planet
{
  "name": "Ojom",
  "residents": ["http://swapi.co/api/people/71/"],
  "films": [],
  "url": "http://swapi.co/api/planets/55/"
}

Character
{
  "name": "Dexter Jettster",
  "homeworld": "http://swapi.co/api/planets/55/",
  "films": ["http://swapi.co/api/films/5/"],
  "species": ["http://swapi.co/api/species/31/"],
  "vehicles": [],
  "starships": [],
  "url": "http://swapi.co/api/people/71/"
}
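To see which nodes are still isolated at this point, a small check like the following could be run. It is a sketch that relies only on the labels and properties described above.

```cypher
// Find nodes that have no relationship in either direction.
MATCH (n)
WHERE NOT (n)--()
// Films use title, everything else uses name.
RETURN labels(n) AS labels, coalesce(n.name, n.title) AS name
ORDER BY name
```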

First, build all character-related relationships.
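One way to sketch this is with three separate statements, one per relationship type, run one after another. The assumptions are the same as before: APOC is installed, the nodes already exist, and each person entry has homeworld, species, vehicles and starships fields.

```cypher
// (:Character)-[:HOMEWORLD]->(:Planet); homeworld is a single URL.
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.people AS person
MATCH (c:Character {url: person.url})
MATCH (p:Planet {url: person.homeworld})
MERGE (c)-[:HOMEWORLD]->(p);

// (:Character)-[:OF]->(:Species); species is an array of URLs.
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.people AS person
MATCH (c:Character {url: person.url})
UNWIND person.species AS speciesUrl
MATCH (s:Species {url: speciesUrl})
MERGE (c)-[:OF]->(s);

// (:Character)-[:PILOT]->(:Vehicle) and (:Character)-[:PILOT]->(:Starship).
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.people AS person
MATCH (c:Character {url: person.url})
UNWIND person.vehicles + person.starships AS transportUrl
MATCH (t {url: transportUrl})
WHERE t:Vehicle OR t:Starship
MERGE (c)-[:PILOT]->(t);
```

Characters whose homeworld is missing or does not match any planet simply produce no row in the first statement, so no relationship is created for them.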

What is left? Right, the species-to-planet homeworld relationship.
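This last link could look like the sketch below, assuming each species entry carries a single homeworld URL as in the sample.

```cypher
CALL apoc.load.json("https://vbatushkov.bitbucket.io/swapi.json") YIELD value
UNWIND value.species AS species
MATCH (s:Species {url: species.url})
// Species without a matching homeworld planet are skipped by this MATCH.
MATCH (p:Planet {url: species.homeworld})
MERGE (s)-[:HOMEWORLD]->(p)
```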

All Humans of Star Wars with some other relations.
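A view like this could be produced with a query such as the one below. It is a sketch that assumes the dataset contains a species named "Human" and that the relationships above have been created.

```cypher
// All characters of the Human species, plus whatever else they are connected to.
MATCH (c:Character)-[:OF]->(:Species {name: "Human"})
OPTIONAL MATCH (c)-[r]-(other)
RETURN c, r, other
```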

I thought a bit about the data and have some ideas: add a label like Transport to all vehicle and starship nodes, and a label Hero to only the main characters of the saga. Suggest your ideas in the comments. Maybe we can improve this and make something really fun. But for now, I will leave it unchanged.
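For illustration only, the Transport idea could be sketched like this. It is not applied to the graph in this post. The Hero label is left out because the data has no field that marks main characters.

```cypher
// Add a shared Transport label on top of the existing Vehicle and Starship labels.
MATCH (n)
WHERE n:Vehicle OR n:Starship
SET n:Transport
```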
