
Using Neo4j Cypher queries to analyse Oscar trends based on film reviews
May 16: Oscar, One Month Graph Challenge
Welcome word
In this series of short posts I build one simple graph every day. The domain model of each graph is somehow related to the day's history: a historical event, a celebration or a person. I do this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired, and next month will be your own One Month Graph Challenge. #OMGChallenge
Domain model
Today is the birthday of one of the most famous cinema awards in the world, the Oscar. The first Oscars ceremony happened exactly on this day 90 years ago. May in general is a very important month for the US film industry, because 92 years ago, on 11 May, the Academy of Motion Picture Arts and Sciences was founded.
I believe that the Oscar nominees are a mirror of the world's trends, so today I want to analyse the last ceremony to find out what kind of topics were raised. Neo4j, armed with Cypher, APOC and the graph algorithms library, should help me again.
Graph
Build a graph of movies based on Oscar categories such as best picture, best director, best actor in a leading role, best actress in a leading role, best actor and best actress in a supporting role, and many others.
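To make the shape of this graph concrete, here is a minimal sketch in plain Python rather than Cypher. The movie titles, the category list and the `NOMINATED_IN` relationship name are placeholders I assume for illustration, not the data or the schema of the real graph.

```python
from collections import defaultdict

# Hypothetical nominations: (movie title, Oscar category). Placeholder data only.
nominations = [
    ("Movie A", "Best Picture"),
    ("Movie A", "Best Director"),
    ("Movie B", "Best Picture"),
    ("Movie C", "Best Actress in a Leading Role"),
]

# Node sets: one node per distinct movie and one per distinct category.
movies = {title for title, _ in nominations}
categories = {category for _, category in nominations}

# Edges (movie)-[NOMINATED_IN]->(category), stored as an adjacency map.
# The relationship name is an assumption, not taken from the original graph.
nominated_in = defaultdict(set)
for title, category in nominations:
    nominated_in[title].add(category)
```

In Neo4j the same idea would be a set of movie nodes and category nodes connected by one relationship per nomination.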
The next step is to fill each movie node with a review, so that we can understand the topic of the movie. I also want to play with keywords, so I can use this property to experiment with similarity and to accumulate some trends.
I want to experiment with the intersection of keywords, and if I am lucky, I will find the same topic raised in several movies.
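The keyword intersection idea can be sketched outside the database as well. This is a rough Python illustration, not the Cypher I ran: the reviews are invented placeholders, and `extract_keywords` with its `min_length` cut-off is a hypothetical helper that simply treats every distinct word of four or more letters as a keyword.

```python
import re
from itertools import combinations

def extract_keywords(review, min_length=4):
    """Lowercase the review and keep distinct words of at least min_length letters."""
    words = re.findall(r"[a-z]+", review.lower())
    return {w for w in words if len(w) >= min_length}

# Hypothetical reviews, invented for this example.
reviews = {
    "Movie A": "A lonely woman finds love and freedom during the cold war.",
    "Movie B": "A family fights for freedom in a brutal war.",
    "Movie C": "A musician chases fame and love.",
}

keywords = {title: extract_keywords(text) for title, text in reviews.items()}

# For every pair of movies, keep the keywords that appear in both reviews.
shared = {}
for a, b in combinations(sorted(keywords), 2):
    common = keywords[a] & keywords[b]
    if common:
        shared[(a, b)] = common
```

Each entry of `shared` corresponds to a pair of movie nodes whose keyword properties overlap, which is the pattern I am looking for in the graph.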

It turns out that some movies share the same keywords in their reviews. Let's check the top 20 movies that share the same keywords. To prevent a lot of useless word matching, I try to filter some words out with my own blacklist.
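The ranking with a blacklist could look roughly like the sketch below. It is plain Python under stated assumptions: the `BLACKLIST` contents, the `top_shared` helper and the keyword sets are all hypothetical, and ranking movie pairs by the number of shared keywords is just one possible reading of "top 20".

```python
from itertools import combinations

# Hypothetical blacklist of words that are too generic to signal a topic.
BLACKLIST = {"film", "movie", "story", "director"}

def top_shared(keywords_by_movie, blacklist=BLACKLIST, limit=20):
    """Rank movie pairs by the number of non-blacklisted keywords they share."""
    rows = []
    for a, b in combinations(sorted(keywords_by_movie), 2):
        common = (keywords_by_movie[a] & keywords_by_movie[b]) - blacklist
        if common:
            rows.append((a, b, sorted(common)))
    # Most shared keywords first; ties are broken by title for a stable order.
    rows.sort(key=lambda row: (-len(row[2]), row[0], row[1]))
    return rows[:limit]

# Hypothetical keyword sets, invented for this example.
keywords_by_movie = {
    "Movie A": {"film", "war", "freedom", "love"},
    "Movie B": {"film", "war", "freedom", "family"},
    "Movie C": {"film", "love", "music"},
}

ranking = top_shared(keywords_by_movie)
```

Without the blacklist, every pair here would match on "film", which is exactly the kind of useless match the filter is meant to remove.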

The picture is not very clear, but we can still see some things about the movies and topics in this year's Oscars. It is not as cool as I wanted, though. All this filtering of “valid” words confuses me too.
Summary
Text analysis is a bit out of scope for plain querying. Do I need more keywords? I don't think so. The problem is the quality of the keywords, not the size of the word list. I need a movie plot, not a review, and then maybe I will see better results.