Apoc was the technician and driver aboard the Nebuchadnezzar in the movie The Matrix. He was killed by Cypher.
APOC was also the name of the first bundled "A Package Of Components" for Neo4j, back in 2009.
APOC also stands for "Awesome Procedures On Cypher".
Go to http://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/latest to find the latest release and download the binary jar to place into your $NEO4J_HOME/plugins folder.
git clone http://github.com/neo4j-contrib/neo4j-apoc-procedures
cd neo4j-apoc-procedures
mvn clean install
cp target/apoc-1.0.0-SNAPSHOT.jar $NEO4J_HOME/plugins/
$NEO4J_HOME/bin/neo4j restart
If you want to run embedded or use neo4j-shell on a disk store, configure your plugins directory in conf/neo4j.conf with dbms.plugin.directory=path/to/plugins.
Procedures can be called stand-alone with CALL procedure.name();
But you can also integrate them into your Cypher statements, which makes them much more powerful.
WITH 'https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/master/src/test/resources/person.json' AS url
CALL apoc.load.json(url) YIELD value as person
MERGE (p:Person {name:person.name})
ON CREATE SET p.age = person.age, p.children = size(person.children)
apoc.help lists the name, the description text and whether the procedure performs writes (descriptions are a work in progress); the search string is checked against the beginning (package) or end (name) of the procedure name.
CALL apoc.help("apoc") YIELD name, text
WITH * WHERE text IS null
RETURN name AS undocumented
To find the procedure count per package in Neo4j:
CALL dbms.procedures() YIELD name
RETURN head(split(name,".")) as package, count(*), collect(name) as procedures;
Procedures to add to and query manual indexes
Note: Please note that there are (case-sensitive) automatic schema indexes, for equality, non-equality, existence, range queries, starts with, ends-with and contains!
- add all nodes to this full text index with the given fields; additionally populates a 'search' index field with all of them in one place
- add node to an index for each label it has
- add node to an index for the given label
- add relationship to an index for its type
- search for the first 100 nodes in the given full text index matching the given lucene query, returned by relevance
- lucene query on node index with the given label name
- lucene query on relationship index with the given type name
- lucene query on relationship index with the given type name, bound by either or both sides (each node parameter can be null)
- lucene query on relationship index with the given type name for outgoing relationships of the given node, returns end-nodes
- lucene query on relationship index with the given type name for incoming relationships of the given node, returns start-nodes
- lists all manual indexes
- removes manual indexes
- gets or creates a manual node index
- gets or creates a manual relationship index
match (p:Person) call apoc.index.addNode(p,["name","age"]) RETURN count(*);
// 129s for 1M People
call apoc.index.nodes('Person','name:name100*') YIELD node, weight return * limit 2
Schema Index lookups that keep order and can apply limits
- schema range scan which keeps index order and adds limit; values can be null, boundaries are inclusive
- schema string search which keeps index order and adds limit; operator is 'STARTS WITH' or 'CONTAINS'
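A minimal sketch of such an ordered, limited lookup; the procedure name and signature (apoc.index.orderedRange(label, key, min, max, relevance, limit)) are assumptions here, check apoc.help('index') for the exact ones:

// hypothetical call: range scan on :Person(age) keeping index order, open upper bound (null), limit 100
CALL apoc.index.orderedRange('Person','age',18,null,false,100) YIELD node
RETURN node.name, node.age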
Returns a virtual graph that represents the labels and relationship-types available in your database and how they are connected.
- examines the full graph to create the meta-graph
- examines a sample of the graph to create the meta-graph, default sampleSize is 100
- examines a sample sub-graph to create the meta-graph, default sampleSize is 100
- examines a subset of the graph to provide tabular meta information
- returns the information stored in the transactional database statistics
- type name of a value
- returns a row if the type name matches, none if not
MATCH (n:Person)
CALL apoc.meta.isType(n.age,"INTEGER")
RETURN n LIMIT 5
- asserts that at the end of the operation the given indexes and unique constraints are there; each label:key pair is considered one constraint/label
- acquires a write lock on the given nodes
- acquires a write lock on the given relationship
- acquires a write lock on the given nodes and relationships
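A short sketch of how these might be used together (a hedged example; the names apoc.schema.assert and apoc.lock.nodes follow the descriptions above, verify with apoc.help):

// assert an index on :Person(name) and a unique constraint on :Movie(title)
CALL apoc.schema.assert({Person:['name']},{Movie:['title']});

// take a write lock on a node before a read-modify-write update (hypothetical usage)
MATCH (n:Person {name:'Keanu'})
CALL apoc.lock.nodes([n])
SET n.visits = coalesce(n.visits,0) + 1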
- converts value to a json string
- converts value to a json map
- converts json list to a Cypher list
- converts json map to a Cypher map
- creates a stream of nested documents representing at least one root of these paths
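For example, a hedged sketch of turning paths into one nested document with apoc.convert.toTree (collecting the paths first is an assumption about the expected input):

// collect all paths and stream them back as a nested JSON-like document
MATCH path = (p:Person)-[:ACTED_IN]->(m:Movie)
WITH collect(path) AS paths
CALL apoc.convert.toTree(paths) YIELD value
RETURN value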
Data is exported as Cypher statements (for neo4j-shell, and partly apoc.cypher.runFile) to the given file.
- exports the whole database incl. indexes as Cypher statements to the provided file
- exports the given nodes and relationships incl. indexes as Cypher statements to the provided file
- exports nodes and relationships from the Cypher statement incl. indexes as Cypher statements to the provided file
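A minimal sketch, assuming the whole-database variant is named apoc.export.cypherAll(file, config) (verify the exact name with apoc.help('export')):

// write the whole database, including indexes, as Cypher statements to a file
CALL apoc.export.cypherAll('/tmp/graph.cypher',{})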
- load from relational database, either a full table or a sql statement
- load from relational database, either a full table or a sql statement
- register JDBC driver of source database
- load from JSON URL (e.g. web-api) to import JSON as a stream of values if the JSON was an array, or a single value if it was a map
- load from XML URL (e.g. web-api) to import XML as a single nested map with attributes and child elements
- load from XML URL (e.g. web-api) to import XML as a single nested map with attributes and child elements
- load CSV from URL as a stream of values
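For instance, a hedged JDBC sketch (the driver class, URL and table name are placeholders; apoc.load.driver and apoc.load.jdbc match the driver-registration and table-loading entries above):

// register the source database's JDBC driver, then stream a full table
CALL apoc.load.driver('com.mysql.jdbc.Driver');

CALL apoc.load.jdbc('jdbc:mysql://localhost:3306/northwind?user=root','customers') YIELD row
CREATE (c:Customer) SET c = row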
- elastic search statistics
- perform a GET operation
- perform a SEARCH operation
- perform a raw GET operation
- perform a raw POST operation
- perform a POST operation
- perform a PUT operation
- perform a find operation on a mongodb collection
- perform a find operation on a mongodb collection
- perform a first operation on a mongodb collection
- perform a find,project,sort operation on a mongodb collection
- inserts the given documents into the mongodb collection
- inserts the given documents into the mongodb collection
- inserts the given documents into the mongodb collection
Copy these jars into the plugins directory:
mvn dependency:copy-dependencies
cp target/dependency/mongodb*.jar target/dependency/bson*.jar $NEO4J_HOME/plugins/
CALL apoc.mongodb.first('mongodb://localhost:27017','test','test',{name:'testDocument'})
- streams provided data to Gephi

Gephi has a streaming plugin that can provide and accept JSON-graph-data in a streaming fashion.
Make sure to install the plugin first and activate it for your workspace (there is a new "Streaming" tab besides "Layout"); right-click "Master" → "Start" to start the server.
You can provide your workspace name (you might want to rename it before you start the streaming), otherwise it defaults to workspace0.
The default Gephi URL is http://localhost:8080, resulting in http://localhost:8080/workspace0?operation=updateGraph.
You can also configure it in conf/neo4j.conf via apoc.gephi.url=url or apoc.gephi.<key>.url=url.
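A small sketch of streaming paths to Gephi, assuming the procedure is apoc.gephi.add(url-or-key, workspace, data):

// send up to 1000 :KNOWS paths to the Gephi streaming workspace
MATCH path = (:Person)-[:KNOWS]->(:Person)
WITH path LIMIT 1000
CALL apoc.gephi.add(null,'workspace0',path) YIELD nodes, relationships
RETURN sum(nodes) AS nodes, sum(relationships) AS relationships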
- create node with dynamic labels
- create multiple nodes with dynamic labels
- adds the given labels to the node or nodes
- removes the given labels from the node or nodes
- create relationship with dynamic rel-type
- creates a UUID
- creates count UUIDs
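For example (apoc.create.node per the dynamic-labels entry above; labels and property values are just placeholders):

// create a node with labels and properties decided at runtime
CALL apoc.create.node(['Person','Actor'],{name:'Keanu Reeves'}) YIELD node
RETURN node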
Virtual nodes and relationships don’t exist in the graph; they are only returned to the UI/user to represent a graph projection. They can be visualized or processed otherwise. Please note that they have negative ids.
- returns a virtual node
- returns virtual nodes
- returns a virtual relationship
- returns a virtual pattern
- returns a virtual pattern
Example
MATCH (a)-[r]->(b)
WITH head(labels(a)) AS l, head(labels(b)) AS l2, type(r) AS rel_type, count(*) as count
CALL apoc.create.vNode(['Meta_Node'],{name:l}) yield node as a
CALL apoc.create.vNode(['Meta_Node'],{name:l2}) yield node as b
CALL apoc.create.vRelationship(a,'META_RELATIONSHIP',{name:rel_type, count:count},b) yield rel
RETURN *;
Create a graph object (map) from the information that’s passed in.
Its basic structure is: {name:"Name", properties:{properties}, nodes:[nodes], relationships:[relationships]}
- creates a virtual graph object for later processing; it tries its best to extract the graph information from the data you pass in
- creates a virtual graph object for later processing
- creates a virtual graph object for later processing
- creates a virtual graph object for later processing
- creates a virtual graph object for later processing
- creates a virtual graph object for later processing
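A hedged sketch, assuming one of the variants above is apoc.graph.fromData(nodes, relationships, name, properties):

// wrap a set of nodes and relationships into a named graph object
MATCH (n:Person)-[r:KNOWS]->(m:Person)
WITH collect(DISTINCT n) + collect(DISTINCT m) AS nodes, collect(r) AS rels
CALL apoc.graph.fromData(nodes, rels, 'knows-graph', {answer:42}) YIELD graph
RETURN graph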
(thanks @SaschaPeukert)

- warm up the node and relationship page-caches by loading one page at a time

(thanks @ikwattro)

- node and relationship ids, in total and in use
- store information such as kernel version, start time, read-only, database-name, store-log-version etc.
- store size information for the different types of stores
- number of transactions: total, opened, committed, concurrent, rolled-back, last-tx-id
- db locking information such as avertedDeadLocks, lockCount, contendedLockCount and contendedLocks etc. (enterprise)
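For example (assuming the procedure names are apoc.monitor.kernel and apoc.monitor.store; check apoc.help('monitor') for the actual ones):

// kernel version, start time, database name etc.
CALL apoc.monitor.kernel();
// sizes of the different store files
CALL apoc.monitor.store();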
- executes reading fragment with the given parameters
- runs each statement in the file (semicolon separated) - currently no schema operations
- runs each semicolon-separated statement and returns summary - currently no schema operations
- executes fragment in parallel batches with the list segments being assigned to _
- executes writing fragment with the given parameters

TODO runFile: begin/commit/schema await/constraints/indexes
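A minimal sketch of running a reading fragment with parameters, assuming the procedure is apoc.cypher.run(fragment, params):

// execute a Cypher fragment with a parameter map and consume its result
CALL apoc.cypher.run('MATCH (n:Person) WHERE n.age > {minAge} RETURN count(*) AS c', {minAge:18}) YIELD value
RETURN value.c AS adults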
- repeats a batch update statement until it returns 0; this procedure is blocking
- list all jobs
- submit a one-off background statement
- submit a repeatedly-called background statement
- submit a repeatedly-called background statement until it returns 0
- iterate over the first statement and apply the action statement with the given transaction batch size; returns two numeric values holding the number of batches and the number of total processed rows
- run the second statement for each item returned by the first statement; returns number of batches and total processed rows
- there are also static methods Jobs.submit and Jobs.schedule to be used from other procedures
- the jobs list is checked / cleared every 10s for finished jobs
CALL apoc.periodic.rock_n_roll('match (p:Person) return id(p) as id_p', 'MATCH (p) where id(p)={id_p} SET p.lastname =p.name', 20000)
copies over the name property of each person to lastname.
- clone nodes with their labels and properties
- clone nodes with their labels, properties and relationships
- merge nodes onto the first one in the list
- redirect relationship to use the new end-node
- redirect relationship to use the new start-node
- change relationship-type
- extract node from relationships
- collapse node to relationship; a node with one rel becomes a self-relationship
- normalize/convert a property to be boolean
- turn each unique propertyKey into a category node and connect to it
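For example, a hedged merge-nodes sketch (assuming the procedure is apoc.refactor.mergeNodes(nodes), per the 'merge nodes onto the first one in the list' entry above):

// collapse duplicate :Person nodes with the same name onto the first one
MATCH (p:Person {name:'Joe'})
WITH collect(p) AS duplicates
CALL apoc.refactor.mergeNodes(duplicates) YIELD node
RETURN node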
TODO:

- merge nodes by label + property
- merge relationships
- look up the geographic location of a place from the OpenStreetMap geocoding service
- sort a given collection of paths by geographic distance based on lat/long properties on the path nodes
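For example (assuming the geocoding procedure is apoc.spatial.geocodeOnce(address), yielding a location map; the address is a placeholder):

// resolve a free-text address to latitude/longitude via OpenStreetMap
CALL apoc.spatial.geocodeOnce('21 rue Paul Bellamy 44000 NANTES FRANCE') YIELD location
RETURN location.latitude, location.longitude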
- returns statically stored value from config (apoc.static.<key>) or server lifetime storage
- returns statically stored values from config (apoc.static.<prefix>) or server lifetime storage
- stores value under key for server lifetime storage, returns previously stored or configured value
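A small sketch, assuming the procedures are apoc.static.set(key, value) and apoc.static.get(key):

// store a value for the lifetime of the server, then read it back
CALL apoc.static.set('my.key','hello world');

CALL apoc.static.get('my.key') YIELD value
RETURN value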
Sometimes type information gets lost; these functions help you coerce an "Any" value to the concrete type:

- tries its best to convert the value to a string
- tries its best to convert the value to a map
- tries its best to convert the value to a list
- tries its best to convert the value to a boolean
- tries its best to convert the value to a node
- tries its best to convert the value to a relationship
- tries its best to convert the value to a set
- creates map from list with key-value pairs
- creates map from a list of keys and a list of values
- creates map from alternating keys and values in a list
- returns the map with the value for this key added or replaced
- returns the map with the key removed
- returns the map with the keys removed
- removes the keys and values (e.g. null-placeholders) contained in those lists, good for data cleaning from CSV/JSON
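For example (assuming apoc.map.fromPairs is the list-of-pairs variant and yields a value map):

// build a map from key-value pairs
CALL apoc.map.fromPairs([['a',1],['b',2]]) YIELD value
RETURN value // {a: 1, b: 2}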
- sum of all values in a list
- avg of all values in a list
- minimum of all values in a list
- maximum of all values in a list
- sums all numeric values in a list
- partitions a list into sublists of the given size
- all values in a list
- returns [first,second],[second,third], …
- returns a unique list backed by a set
- sort on Collections
- sort nodes by property
- optimized contains operation (using a HashSet), returns single row or not
- optimized contains-all operation (using a HashSet), returns single row or not
- optimized contains on a sorted list (Collections.binarySearch), returns single row or not
- optimized contains-all on a sorted list (Collections.binarySearch), returns single row or not
- creates the distinct union of the two lists
- returns the unique set of the first list with all elements of the second list removed
- returns the first list with all elements of the second list removed
- returns the unique intersection of the two lists
- returns the disjunct set of the two lists
- creates the full union, with duplicates, of the two lists
- splits collection on given values into rows of lists; the value itself will not be part of the resulting lists
- position of value in the list
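For instance, a hedged sketch of the partition operation (assuming it is apoc.coll.partition(list, batchSize), yielding one row per sublist):

// split a list into sublists of two elements each
CALL apoc.coll.partition([1,2,3,4,5],2) YIELD value
RETURN value // [1,2], [3,4], [5]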
- quickly returns all nodes with these ids
- quickly returns all relationships with these ids
- compute the US_ENGLISH phonetic soundex encoding of all words of the text value, which can be a single string or a list of strings
- compute the US_ENGLISH soundex character difference between two given strings
- join the given strings with the given delimiter
- strip the given string of everything except alphanumeric characters and convert it to lower case
- compare the given strings, stripped of everything except alphanumeric characters and converted to lower case
- filter out non-matches of the given strings, stripped of everything except alphanumeric characters and converted to lower case
- returns the domain part of the value
- computes the sha1 of the concatenation of all string values of the list
- computes the md5 of the concatenation of all string values of the list
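For example (assuming the join operation above is apoc.text.join(strings, delimiter)):

// concatenate strings with a delimiter
CALL apoc.text.join(['Hello','World'],' ') YIELD value
RETURN value // "Hello World"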
- sleeps for <duration> millis; transaction termination is honored
(thanks @tkroman)
- get Unix time equivalent of given date (in seconds)
- same as previous, but accepts custom datetime format
- get string representation of date corresponding to given Unix time (in seconds)
- the same as previous, but accepts custom datetime format
- get Unix time equivalent of given date (in milliseconds)
- same as previous, but accepts custom datetime format
- get string representation of date corresponding to given time in milliseconds in UTC time zone
- the same as previous, but accepts custom datetime format
- the same as previous, but accepts custom time zone
- possible unit values: ms,s,m,h,d and their long forms millis,milliseconds,seconds,minutes,hours,days.
- possible time zone values: either an abbreviation such as PST, a full name such as America/Los_Angeles, or a custom ID such as GMT-8:00. Full names are recommended. You can view a list of full names on this Wikipedia page.
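A hedged sketch of a round trip, assuming the format-accepting variants are apoc.date.parse(text, unit, format) and apoc.date.format(ts, unit, format):

// parse a date string to Unix seconds, then format it back
CALL apoc.date.parse('2016/03/25 03:15:59','s','yyyy/MM/dd HH:mm:ss') YIELD value AS ts
CALL apoc.date.format(ts,'s','yyyy-MM-dd') YIELD value AS formatted
RETURN ts, formatted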
Provides a wrapper around the java bitwise operations.
call apoc.bitwise.op(a long, "operation", b long) yield value as <identifier>
Examples:

operator | name | example | result
---|---|---|---
a & b | AND | call apoc.bitwise.op(60,"&",13) | 12
a \| b | OR | call apoc.bitwise.op(60,"\|",13) | 61
a ^ b | XOR | call apoc.bitwise.op(60,"^",13) | 49
~a | NOT | call apoc.bitwise.op(60,"~",0) | -61
a << b | LEFT SHIFT | call apoc.bitwise.op(60,"<<",2) | 240
a >> b | RIGHT SHIFT | call apoc.bitwise.op(60,">>",2) | 15
a >>> b | UNSIGNED RIGHT SHIFT | call apoc.bitwise.op(60,">>>",2) | 15
(thanks @keesvegter)
The apoc.path.expand procedure makes it possible to do variable-length path traversals where you can specify the direction of the relationship per relationship type, plus a list of label names which act as a "whitelist" or a "blacklist". The procedure returns a list of paths in a variable called "path".
- expand from the given node(s) taking the provided restrictions into account
Syntax: [<]RELATIONSHIP_TYPE1[>]|[<]RELATIONSHIP_TYPE2[>]|…

input | type | direction
---|---|---
TYPE> | TYPE | OUTGOING
<TYPE | TYPE | INCOMING
TYPE | TYPE | BOTH
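For example, a hedged expansion (assuming the signature apoc.path.expand(startNode, relationshipFilter, labelFilter, minDepth, maxDepth); '+Movie' as a whitelist entry is an assumption about the label-filter syntax):

// outgoing ACTED_IN or incoming DIRECTED, 1..2 hops, only through :Movie nodes
MATCH (u:Person {name:'Keanu Reeves'})
CALL apoc.path.expand(u,'ACTED_IN>|<DIRECTED','+Movie',1,2) YIELD path
RETURN path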
Utility to find nodes in parallel (if possible). These procedures return a single list of nodes or a list of 'reduced' records with node id, labels, and the properties the search was executed upon.
- a distinct set of nodes will be returned
- all the found nodes will be returned
- a merged set of 'minimal' node information will be returned; one record per node (id)
- all the found 'minimal' node information will be returned; one record per label and property
- (JSON or Map) for every label-property combination a search will be executed in parallel (if possible): Label1.propertyOne, label2.propOne and label2.propTwo
- 'exact', 'contains', 'starts with' or 'ends with': case-insensitive string search operators
- "<", ">", "=", "<>", "<=", ">=", "=~": comparison operators
- 'Keanu': the actual search term (string, number, etc.)
CALL apoc.search.nodeAll('{Person: "name",Movie: ["title","tagline"]}','contains','her') YIELD node AS n RETURN n
call apoc.search.nodeReduced({Person: 'born', Movie: ['released']},'>',2000) yield id, labels, properties RETURN *
Provides some graph algorithms (not very optimized yet)
- run dijkstra with relationship property name as cost function
- run dijkstra with relationship property name as cost function and a default weight if the property does not exist
- run A* with relationship property name as cost function
- run A* with relationship property name as cost function
- run allSimplePaths with relationships given and maxNodes
- calculate betweenness centrality for given nodes
- calculate closeness centrality for given nodes
- return relationships between this set of nodes
- calculates page rank for given nodes
- calculates page rank for given nodes
- simple label propagation kernel
- search the graph and return all maximal cliques at least as large as the minimum size argument
- search the graph and return all maximal cliques that are at least as large as the minimum size argument and contain this node
Example: find the weighted shortest path based on relationship property d from A to D, following just :ROAD relationships

MATCH (from:Loc{name:'A'}), (to:Loc{name:'D'})
CALL apoc.algo.dijkstra(from, to, 'ROAD', 'd') yield path as path, weight as weight
RETURN path, weight
- move apoc.get to apoc.nodes and apoc.rels
- add apoc.nodes.delete(id|ids|node|nodes)
- (√) add weight/score to manual index operations, expose it, TODO add Sort.RELEVANCE sorter conditionally or unconditionally
- pass in last count to rundown so you can also do batch-creates
- in browser guide as apoc-help-page
- (√) optimized collection functions (WIP)
- Time Conversion Functions (ISO <→ ts, padded long representation)
- ordered, limited retrieval from index (both manual and schema index)
- json to graph (mapping)
- virtual graph from collection of nodes and rels, handle node-uniqueness with pk
- RDF / Ontology loader
- Encryption / decryption of single properties or a subset or all properties (provide decryption key as param or config)
- (in progress) Graph Algorithms (Stefan, Max?)
- custom expanders, e.g. with dynamic rel-type suffixes and prefixes
- (√) Graph Refactorings (WIP)
- (√) Job Queue (WIP) See BatchedWriter from Jake/Max
- run/load shell scripts apoc.load.shell(path)
- apox.save.dump() whole database, dump("statement"), dump("", "data/import/file"), dump("", "URL TO PUT"), formats - binary(packstream), human readable(graphml, graphjson), compression
- store arbitrary objects in properties with kryo/packstream or similar serialization
- variable path length on patterns instead of single relationships. Don’t have a syntax for this to suggest, but assume you want to search for ()-[:TYPE_A]→()-[:TYPE_B]→() e.g. 2..5 times
- match (a)-[r*]→(b) where all rels in the path are this pattern ()-[:Foo]→()-[:Bar]→()
- all unique pairs of a list
- TopK select
- apoc.schema.create(indexConfig) - {unique:[{Label:keys}], index:[{Label:keys}], existence:[{Label:keys}]}
- Procedures in other languages (e.g. JS, JSR-223 scripting → apoc-unsafe project)
- eval javascript
- apoc.meta.validate(metagraph) validate a metagraph against the current graph and report violations
- apoc.run.register(name, query[,params]), apoc.run.named(name,[params])
- apoc.create.graph(nodes,rels,data-map) → {nodes:[], rels:[], data:{}} a graph data structure, e.g. for rendering, export, validation, …
- auto-increment ids (per label? → graph properties)
- query neo4j databases
- find relationships within a set of nodes
- summary for graphs (a bit like apoc.meta.stats but for a named subgraph)
- graph operations (union, intersection etc., see gradoop), but also graph summarization and FSM
- path expander config for node and rel-properties, both equals with value as well as comparisons with operator → value, e.g. { name: "John", weight: { ">" : 10, "<" : 100 } }
- represent storage-records virtually as a graph
- run cypher query and return query plan as a graph
- add list of values support to parallel node search, add support for HAS_LABEL (OR, AND, ALL)
- run export in parallel
- demonstrate how to run export in parallel just with the built-in procs in periodic/cypher
- give a flag to rock_n_roll that makes it run concurrently
- allow to attach cypher queries to tx handler (like triggers)