Call for Papers Tool
Making it easier to find relevant conference calls for papers
Useful links:
Getting up and running
The Jupyter notebook will get the initial Neo4j graph up and running. You will need to insert the credentials for the Neo4j instance you are using. The code is a work in progress and still to be optimised, so I apologise in advance for the terrible code :)
If you’ve not already done so, I would suggest you get yourself a (free!) instance of Neo4j Sandbox up and running first.
Querying the CfP graph
Here are the queries that we will run against the graph during the session. You will need to have Neo4j Browser up and running for this.
What are the most common tags? These hint at trends.
MATCH (e:Event)-->(t:Tag)
RETURN t.value, count(e) AS size ORDER BY size DESC
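What CfPs are closing within the next month? The cut-off date is hard-coded, so adjust it to suit; as there is no lower bound, CfPs that have already closed are returned too. This query casts the cfpClosing string to a Date on the fly. We could instead import/update this property as a temporal type. Will do that later.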
MATCH (e:Event)-->(t:Tag)
WHERE date(e.cfpClosing) < date("2021-11-01")
RETURN e.title, e.cfpClosing, e.description, collect(t.value) AS Tags
ORDER BY e.cfpClosing
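The same filter can be sketched in plain Python with a rolling window instead of a fixed cut-off. This is only an illustration: the sample events are made up, the `closing_soon` helper is hypothetical, and it assumes `cfpClosing` holds an ISO 8601 date string (which is what the `date()` call in the query suggests).

```python
from datetime import date, timedelta


def closing_soon(events, today, days=30):
    """Return events whose CfP closes in [today, today + days), soonest first."""
    cutoff = today + timedelta(days=days)
    upcoming = [
        e for e in events
        if today <= date.fromisoformat(e["cfpClosing"]) < cutoff
    ]
    return sorted(upcoming, key=lambda e: date.fromisoformat(e["cfpClosing"]))


# Hypothetical sample data, not taken from the real graph.
events = [
    {"title": "Conf A", "cfpClosing": "2021-10-15"},
    {"title": "Conf B", "cfpClosing": "2021-12-01"},
    {"title": "Conf C", "cfpClosing": "2021-09-01"},
]

soon = closing_soon(events, today=date(2021, 10, 1))
print([e["title"] for e in soon])
```

Passing `today` in explicitly keeps the sketch repeatable; in practice you would likely use `date.today()`.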
Find some data science-esque conferences. This query starts by providing an array of terms we’d commonly associate with data science, and then checks whether any of the tags linked to an event exactly match one of those terms. As more than one of these terms could map to the same event, we use DISTINCT to bring back just a single copy.
WITH ['data science', 'machine learning', 'artificial intelligence', 'ai', 'ml', 'deep learning'] AS p
MATCH (l:Location)<--(e:Event)-->(t:Tag)
WHERE t.value IN p
RETURN DISTINCT e.title, e.description, l.value
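To see why DISTINCT is needed, here is a small Python sketch of the same idea. The (event, tag) rows are invented for illustration and stand in for what the MATCH would produce.

```python
terms = {"data science", "machine learning", "ai"}

# Hypothetical (event, tag) rows, one per Event-->Tag relationship.
rows = [
    ("PyData", "data science"),
    ("PyData", "machine learning"),
    ("GopherCon", "go"),
    ("AI Summit", "ai"),
]

# Without de-duplication, an event appears once per matching tag.
matches = [event for event, tag in rows if tag in terms]

# dict.fromkeys drops duplicates while keeping first-seen order.
distinct_matches = list(dict.fromkeys(matches))

print(matches)
print(distinct_matches)
```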
Find conferences that have tags of data science and python. Here we’re again using an array, but this time around we want the tags associated with an event to include all of the terms.
WITH ['data science', 'python'] AS p
MATCH (e:Event)
WHERE ALL (i in p WHERE exists((e)--(:Tag {value:i})))
RETURN e.title
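The difference between "any of the terms" and "all of the terms" maps neatly onto set operations. A minimal Python sketch, using made-up events and tags:

```python
required = {"data science", "python"}

# Hypothetical events and their tags.
event_tags = {
    "PyData": {"data science", "python", "machine learning"},
    "DataConf": {"data science", "r"},
    "PyCon": {"python"},
}

# ALL: every required term must be among the event's tags (subset test).
has_all = sorted(title for title, tags in event_tags.items() if required <= tags)

# ANY: at least one required term is among the event's tags (non-empty intersection).
has_any = sorted(title for title, tags in event_tags.items() if required & tags)

print(has_all)
print(has_any)
```

Only the event carrying both tags survives the ALL check, which is the behaviour the query above is after.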
Time for similarity: create a graph projection. Think of a graph projection as an in-memory 'view' of our original graph. The algorithm we are using is expecting a monopartite graph, so we are creating a view where the Tag node is used as a bridge to show which events are connected to each other. The procedure name below is the GDS 1.x one; in GDS 2.x it was renamed gds.graph.project.cypher.
CALL gds.graph.create.cypher("similar",
"MATCH (e:Event) RETURN id(e) AS id",
"MATCH (e1:Event)-->(t:Tag)<--(e2:Event) RETURN id(e1) AS source, id(e2) AS target")
Let’s have a look at those similar events. We are going to run the node similarity algorithm across our graph projection, and then return a stream of event pairs, ordered by which ones are most similar.
CALL gds.nodeSimilarity.stream("similar")
YIELD node1, node2, similarity
RETURN gds.util.asNode(node1).title, gds.util.asNode(node2).title, similarity
ORDER BY similarity desc
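As far as I understand it, node similarity in GDS compares the sets of nodes each node is connected to, using the Jaccard metric by default. In our projection, an event's neighbours are the other events it shares a tag with. Here is a rough Python sketch of that scoring, with invented neighbour sets; it is an illustration of the metric, not the library's implementation.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical neighbour sets: for each event, the events it is linked to in the projection.
neighbours = {
    "Conf A": {"Conf C", "Conf D", "Conf E"},
    "Conf B": {"Conf C", "Conf D"},
    "Conf F": {"Conf G"},
}

# Score every pair once, then order by similarity, highest first.
scored = [
    (a, b, jaccard(neighbours[a], neighbours[b]))
    for a, b in combinations(sorted(neighbours), 2)
]
ranked = sorted(scored, key=lambda row: row[2], reverse=True)

for a, b, score in ranked:
    print(f"{a} ~ {b}: {score:.2f}")
```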
Let’s now connect similar events together, so that we can have a look at them. This is a similar query to the one above, but now we are going to write to our graph. We are going to create a SIMILAR_TO relationship between our Event nodes where they are at least 80% similar according to the algorithm. To prevent having relationships in both directions between Event nodes, we’re using the greater-than 'trick' on the node ids to write just one relationship per pair.
CALL gds.nodeSimilarity.stream("similar")
YIELD node1, node2, similarity
WITH gds.util.asNode(node1) AS n1, gds.util.asNode(node2) AS n2,
similarity WHERE similarity >= 0.8 AND id(n1)>id(n2)
CREATE (n1)-[:SIMILAR_TO]->(n2)
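The greater-than 'trick' is easy to see outside the graph. In this sketch the (id1, id2, similarity) rows are made up, and each pair appears in both orders, as a symmetric similarity result would give.

```python
# Hypothetical stream rows: (node id 1, node id 2, similarity).
rows = [
    (1, 2, 0.90), (2, 1, 0.90),
    (1, 3, 0.50), (3, 1, 0.50),
    (2, 3, 0.85), (3, 2, 0.85),
]

# Keep pairs at or above the threshold, and only the ordering where id1 > id2,
# so each qualifying pair yields exactly one directed relationship.
to_create = [(a, b) for a, b, score in rows if score >= 0.8 and a > b]

print(to_create)
```

Any strict ordering on the ids would do; the point is simply that exactly one of the two mirrored rows passes the test.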
And let’s look at the result! Here we are using the fact that there is only one relationship between the Event nodes, so we find the 'head node' of the similar group, and then do an unbounded directional traversal to pick up all the other similar events, collect them and then return them. We also show the tags for the 'head node' so we get an idea of the conference theme.
MATCH (e:Event)-[:SIMILAR_TO]->(e2)
WHERE NOT ()-->(e)
WITH e
MATCH (e)-[:SIMILAR_TO*]->(e1), (e)-->(t:Tag)
RETURN DISTINCT e.title, collect(distinct e1.title), collect(distinct t.value)
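For intuition, the 'head node' lookup and the unbounded traversal can be sketched in Python as below. The edge data is invented, the `similar_groups` helper is hypothetical, and the sketch assumes SIMILAR_TO is the only kind of relationship that points into an Event, which is what the WHERE NOT ()-->(e) check relies on.

```python
def similar_groups(similar_to):
    """Map each head node to every node reachable from it via directed edges."""
    targets = {t for ts in similar_to.values() for t in ts}
    # A head has outgoing SIMILAR_TO edges but nothing pointing at it.
    heads = [node for node in similar_to if node not in targets]

    groups = {}
    for head in heads:
        seen, stack = set(), list(similar_to[head])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(similar_to.get(node, []))
        groups[head] = sorted(seen)
    return groups


# Hypothetical SIMILAR_TO edges: source -> list of targets.
edges = {"A": ["B"], "B": ["C"], "D": ["E"]}
groups = similar_groups(edges)
print(groups)
```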