Oscar Statuette Free Image from https://libreshot.com/oscar-statuette/

Use Neo4j Cypher query to analyse Oscar trends based on film reviews

May 16: Oscar, One Month Graph Challenge

Vlad Batushkov

Welcome word

In this series of short posts I build one simple graph every day. The domain model of each graph is related to the day's history: a historical event, a celebration or a person. I am doing this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired and next month will be your own One Month Graph Challenge. #OMGChallenge

Domain model

Today is the birthday of one of the most famous cinema awards in the world: the Oscar. The first Oscars ceremony took place exactly on this day 90 years ago. In general, May is a very important month for the US film industry, because 92 years ago, on May 11, the Academy of Motion Picture Arts and Sciences was founded.

I believe that the Oscar nominees mirror the world's trends, so today I want to analyse the last ceremony to find out which topics were raised. Neo4j, armed with Cypher, APOC and the algorithms library, should help me again.

Graph

Build a graph of movies based on Oscar categories such as best picture, best director, best actor in a leading role, best actress in a leading role, best actor and best actress in a supporting role, and many others.
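To make the shape of this graph concrete, here is a minimal sketch in plain Python rather than Cypher. The movie titles are placeholders, and the idea of a single "nominated in" link between a movie and a category is an assumption about the model, not the exact schema used in the post.

```python
from collections import defaultdict

# Placeholder data: (movie, category) pairs standing in for real nominations.
nominations = [
    ("Movie A", "Best Picture"),
    ("Movie A", "Best Director"),
    ("Movie B", "Best Picture"),
]


def build_graph(pairs):
    """Index the nomination links in both directions."""
    categories_by_movie = defaultdict(set)
    movies_by_category = defaultdict(set)
    for movie, category in pairs:
        categories_by_movie[movie].add(category)
        movies_by_category[category].add(movie)
    return dict(categories_by_movie), dict(movies_by_category)


categories_by_movie, movies_by_category = build_graph(nominations)
```

In a graph database each pair would become one relationship between a movie node and a category node; the two dictionaries here only imitate walking that relationship in either direction.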

The next step is to fill each movie node with a review, so that we can understand the topic of the movie. I also play with keywords, so I can use this property to experiment with similarity and to accumulate some trends.
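As a rough illustration of turning a review into a keywords property, here is a minimal Python sketch. The `extract_keywords` helper, the minimum word length and the sample review text are all assumptions made for this example; they are not the method used in the post.

```python
import re


def extract_keywords(review, min_length=4):
    """Lower-case the review, split it into words and keep the longer ones."""
    words = re.findall(r"[a-z']+", review.lower())
    cleaned = (word.strip("'") for word in words)
    return {word for word in cleaned if len(word) >= min_length}


# Placeholder review text, written only for this example.
review = "A quiet drama about family, memory and the price of ambition."
keywords = extract_keywords(review)
```

Dropping very short words is a crude stand-in for real stop-word handling, which is one reason the keyword quality stays low with an approach like this.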

I want to experiment with the intersection of keywords, and if I am lucky, I will find the same topic raised in several movies.

As you can see, some movies share the same keywords in their reviews. Let's check the top 20 movies that share keywords. To prevent a lot of useless word matching, I filter some words out with my own blacklist.
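The intersection-with-blacklist idea can be sketched in plain Python as follows. The movie titles, keyword sets and blacklist entries are placeholders, and `top_shared_keywords` is a hypothetical helper, not the Cypher query from the post.

```python
from itertools import combinations

# Hypothetical blacklist of words that match everywhere but say nothing about a topic.
BLACKLIST = {"film", "movie", "story"}


def top_shared_keywords(movies, limit=20):
    """Return movie pairs ranked by how many non-blacklisted keywords they share."""
    pairs = []
    for (title_a, kw_a), (title_b, kw_b) in combinations(sorted(movies.items()), 2):
        shared = (kw_a & kw_b) - BLACKLIST
        if shared:
            pairs.append((title_a, title_b, sorted(shared)))
    # Most shared keywords first; titles break ties so the order is stable.
    pairs.sort(key=lambda pair: (-len(pair[2]), pair[0], pair[1]))
    return pairs[:limit]


# Placeholder keyword sets, one per movie.
movies = {
    "Movie A": {"war", "family", "film"},
    "Movie B": {"war", "family", "love"},
    "Movie C": {"love", "film"},
}
result = top_shared_keywords(movies)
```

Here "Movie A" and "Movie C" share only the blacklisted word "film", so that pair is dropped entirely, which shows how much the result depends on the hand-made blacklist.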

The picture is not clear, but we can see some things about the movies and topics at this year's Oscars. Still, it is not as cool as I wanted. Also, all this filtering of “valid” words confuses me.

Summary

Text analysis is a bit out of scope for plain querying. Do I need more keywords? I don't think so. The problem is the quality of the keywords, not the size of the word list. I need a movie plot, not a review, and then maybe I will see better results.
