Photo by Manasvita S on Unsplash

One Month Graph Challenge Results

May 31: Results

Vlad Batushkov
8 min read · May 31, 2019

Welcome word

In this series of short posts I build one simple graph every day. The domain model of each graph is somehow related to that day in history: a historical event, a celebration or a person. I do this challenge to learn Neo4j data modeling and Cypher. Every day. One month. Follow me. Maybe you will be inspired, and next month will be your own One Month Graph Challenge. #OMGChallenge

How and why

The idea of the One Month Graph Challenge came into my head just before I went to sleep on April 30. You guessed it: the very next day I opened my laptop and started the challenge. It was the first small step of a long, month-long journey.

I just realized that it might be cool to build one small graph a day, each related to some historical event. I could learn Cypher querying and try many other things. My wife also liked the idea. But I got the “green” light and her support only after I promised that it would take just one or two hours per day. Not more. (Later you will see the real numbers.)

I didn’t have a clear vision or a predefined concept of what exactly I should do every day. At the same time, it was not so “sudden” that I fell in love with Neo4j and started blogging about it the next day. I had already read a few Neo4j books (partially). I had finished one small online training course on the official website and created one “pet” project with this database. But I had never tried to write a blog about it or to write a lot of Cypher daily for such a long period. It felt like a real challenge for me. And it really was.

Today is my last day. I want to wrap up all the graphs, queries, posts and topics covered and learned during this month of Neo4j.

Graph of Graphs

I collected some interesting information from all my graphs and put it in a CSV file here: https://vbatushkov.bitbucket.io/results.csv

LOAD CSV WITH HEADERS FROM 'https://vbatushkov.bitbucket.io/results.csv' AS line FIELDTERMINATOR ';' 
MERGE (g:Graph { day: date({ year: 2019, month: 5, day: toInteger(line.id) }), topic: line.topic, read: toInteger(line.read), lines: toInteger(line.lines), nodes: toInteger(line.nodes), rels: toInteger(line.rels), build: line.build, algs: coalesce(line.algs, "") })
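A side note on the load query above: MERGE matches on the whole property map, so if any value in the CSV changes and the query is run again, a second node is created for the same day instead of the existing one being updated. A minimal sketch of one way around this, assuming that `day` alone identifies a graph (an assumption, since the challenge has one post per day), is to MERGE on the key only and SET the rest:

```cypher
// Sketch, not the original load query.
// Assumption: `day` alone identifies a graph (one post per day).
LOAD CSV WITH HEADERS FROM 'https://vbatushkov.bitbucket.io/results.csv' AS line FIELDTERMINATOR ';'
// Match or create the node by its key property only
MERGE (g:Graph { day: date({ year: 2019, month: 5, day: toInteger(line.id) }) })
// Then (re)write the remaining properties, so a re-run updates instead of duplicating
SET g.topic = line.topic,
    g.read  = toInteger(line.read),
    g.lines = toInteger(line.lines),
    g.nodes = toInteger(line.nodes),
    g.rels  = toInteger(line.rels),
    g.build = line.build,
    g.algs  = coalesce(line.algs, "")
```

With this shape the query can be re-run after editing the CSV without leaving stale copies of a day behind.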

Don’t rely too much on Wikipedia or any other website. HTML goes stale very fast. I already found that the load query for May 9: Immortal Regiment no longer works at all. And I don’t really want to fix it.

My treasure.

Everything is in node properties right now. Let’s extract the build approaches and algorithms into their own nodes.

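MATCH (g:Graph)
// build and algs are comma-separated strings; split them into lists
WITH g, split(g.build, ",") AS builds, split(g.algs, ",") AS algorithms
// drop the empty string left when algs is blank (list comprehension; filter() was removed in Neo4j 4.0)
WITH g, builds, [x IN algorithms WHERE x <> ""] AS algs
FOREACH (x IN builds |
  MERGE (b:Build { name: x })
  MERGE (g)-[:CREATED_WITH]->(b)
)
FOREACH (x IN algs |
  MERGE (a:Algorithm { name: x })
  MERGE (g)-[:USE]->(a)
)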

Now it looks much better.

What do I want to know? Totals and averages, first of all.

MATCH (g:Graph)
WITH sum(g.lines) as totalCypherLines, apoc.math.round(avg(g.lines), 1) as avgCypherLinesPerTopic, sum(g.nodes) as totalNodes, sum(g.rels) as totalRelationships, apoc.math.round(avg(g.read), 1) as avgMinutesToRead
RETURN avgMinutesToRead, avgCypherLinesPerTopic, totalCypherLines, totalNodes, totalRelationships

I clearly remember that once I generated 1 million nodes and 999,999 relationships for the stars of the Towel Galaxy. But, as you can see, even without these 2 huge numbers, the sums of nodes and relationships look very impressive. More than 1200 lines of Cypher. Wow!

Let’s take a closer look at our champions.

MATCH (g:Graph)
OPTIONAL MATCH (g)-[:USE]->(a:Algorithm)
WITH g, collect(a.name) as algorithms
MATCH (g)-[:CREATED_WITH]->(b:Build)
WITH g, algorithms, collect(b.name) as builds
RETURN g.topic as topic, g.read as read, g.lines as lines, algorithms, builds
ORDER BY read DESC, lines DESC

How many algorithms and data build methods were used? Easy.

MATCH (:Graph)-[u:USE]->(a:Algorithm)
WITH a.name as name, count(u) as num
WITH collect({ name: name, num: num }) as algs
MATCH (:Graph)-[c:CREATED_WITH]->(b:Build)
WITH b.name as name, count(c) as num, algs
WITH collect({ name: name, num: num }) as builds, algs
WITH apoc.coll.union(algs, builds) as items
UNWIND items as item
RETURN item.name as name, item.num as num
ORDER BY item.num DESC

Great!
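The same counts can probably be obtained without collecting and unioning two lists. The sketch below is an alternative, not the query used for the result above: it matches both relationship types in one pattern and adds a `kind` column (a name introduced here for illustration) so that an algorithm and a build method are never mixed up. It needs no APOC.

```cypher
// Sketch: count how often each algorithm and build method is used, in one pass.
// `kind` is an illustrative column holding the node label (Algorithm or Build).
MATCH (:Graph)-[r:USE|CREATED_WITH]->(n)
RETURN labels(n)[0] AS kind, n.name AS name, count(r) AS num
ORDER BY num DESC
```

Grouping by label and name keeps the two kinds of items apart, while the single ORDER BY still gives one combined ranking.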

Experience

Neo4j

First of all, I want to talk about the main goal: learning Neo4j, Cypher, APOC and how to use graph algorithms.

The pages that helped me most every day were the Cypher Manual, the APOC Guide and the Algorithms Docs.

To understand the flow and scopes of Cypher you need to write many, many queries. Any kind: complex and easy, short and long. This is the lesson I learned this month. Books are good, but real practice is a game changer. Try something and enjoy the results.

Timing

A month is a lot. But, as one of my friends said, if it were less, like a week or 10 days, you probably couldn’t call it a challenge. A month definitely is a challenge.

It is also important to mention the time per day spent on a topic. On weekends there was no fixed schedule, but on weekdays, after work and dinner, I started at 9 PM. And then, absolutely not as I had promised my wife, it took from 2 to 5 hours. Yes, it is true. If I finished a topic by midnight, that was great. But sometimes it took until 1 AM or even 2 AM.

Data

Things went especially slowly when I needed to load data from HTML. It was a nightmare sometimes. You can check some of my crazy jQuery selectors and the hell of regular expressions that I sometimes had to write just to build the initial data. Before the analysis even started, it had already taken me a few hours.

Goals

I can say that even 4 hours is not enough time to take a complex thing from scratch to some output: to build rich relationships in a graph, apply deep analysis, apply more algorithms and so on. I tried to balance achievements against time, rejecting many ideas and complex scenarios.

Tools

The next block is about the tools that helped bring my idea to life. I want to share them because maybe someone will be interested or even decide to try something.

Neo4j Community Edition 3.4

It goes without saying, but anyway. When I started, I used the Neo4j Desktop application. But somewhere in the middle I switched to a standalone instance of Neo4j Community Edition and ran it out of the box as a service. Plus I installed the “must have” plugins, APOC and the algorithms library, and configured them with a few lines in the config file.

Notes

I used my phone to keep a list of upcoming days and topics. On my way to and from work I thought about ideas for future graphs. Sometimes I couldn’t decide what kind of graph to build or which topic to choose until the deadline. As I mentioned before, I usually started building a graph at 9 PM.

Calend.ru

I used this website to track events related to each day in history. It gave me options and some brief details for every day of the month, so I could choose and plan a bit ahead.

Canva + Unsplash + Pexels + …

I like it when a good thing has good wrapping. It means that the creator took care of many aspects. And he or she had better do so because, let’s say, your “product” is a part of you.

The attraction points of my posts are the numerous screenshots of query results and, of course, the cover images. Mostly I used just two services (Unsplash and Pexels) to find cool and attractive images. Good quality and a free license guaranteed. After finding an image I added some text in a common style. Canva is a great online service that I used to “decorate” all the images for my topics.

Medium

Great audience and a modern service. Medium is a platform for writing articles: serious, long and readable content. The service does not really fit your needs if you decide to have a “blog” here. You can feel it when you try to connect your posts together. Right now every post is isolated from the others. No cross links, no tools to combine posts or group them by time, category and so on. But anyway, I don’t have my own web page or anything like that. This is why I am here, and I am pretty happy with it.

Gist

I used Gist for my scripts at the beginning, but then I realized it was quite useless and just stopped. Ctrl + Alt + 6 is more than enough.

Conclusion

I am asking myself: what next? And I have a few answers.

Certification. Sorry. T-shirtification. It would be good to have one. Even if it takes 30 tries. I am skilled in 30-day-long challenges now.

Workshops. Right now there is not much going on in Bangkok related to Neo4j. After one month I have an initial base of many topics and ideas, so I can make something bigger from it and share it with others.

Use and share at work. I work at Agoda. Agoda is a great tech company and we really love to share knowledge in our Bangkok-based TechAtAgoda community.

Feedback

Thank you all for the support. Not many people read and clapped. But to all who liked these posts: thank you, guys.

Andrei for daily advice and for correcting my mistakes. I know there are still a lot of them, but far fewer thanks to your help.

Zoltan Bíró for huge claps on epic topics. Yeah, I also have some similar favorites: Pioneria, FIFA, Sherlock.

David Allen for great help in the Neo4j Community. William_Lyon for the cool advice on what I should actually do on the last day of the challenge. The graph of graphs is actually his idea.

That’s it! See you soon with a new indie experience. After a small but noticeable rest. I hope I deserve at least a few days of silence, or weeks, or months? Let’s see how long I can keep silent after daily posts.

I will miss you, One Month Graph Challenge. Bye!

P.S.

You can always try something similar. Ping me if you have enough courage to start. I will definitely support you until the final day!

Written by Vlad Batushkov

Engineering Manager @ Agoda. Neo4j Ninja. Articles brewed on modern tech, hops and indie rock’n’roll.
