thelonelyvulpes / neo4j-http

PoC for an external HTTP API using Bolt.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Neo4j-HTTP

Neo4j-HTTP provides an HTTP API for Neo4j that runs externally to the core product and uses the Neo4j Java Driver exclusively for communicating with a database system.

Main areas of work

Those are the topics evaluated

Architecture

This PoC uses the reactive Neo4j-Java-Driver to connect to any Neo4j instance, doesn’t matter if on-prem or cloud (AuraDB), single instance or cluster. All versions that the currently used driver supports are supported too.

Authentication

All endpoints are protected via HTTP Basic Auth, analog to the current existing Neo4j-HTTP API. The username / password is the same as your Neo4j instance, exactly as it is today with an on-prem Neo4j instance.

Warning
If you ever should think putting this PoC in protoduction, make sure the traffic to it is over HTTPS and properly encrypted! You have been warned.
Warning
This application keeps credentials in memory for the duration of a request. This is necessary so that the credentials can be passed to the driver where authentication occurs.
Note
Right now Neo4j is the authority for authentication, but this is completely swappable, for example for an Octa instance or similar.

Readers or writers?

When using Neo4j 5 or instances with dbms.routing.enabled

During startup the application checks if it is connected against an instance with Server Side Routing (aka SSR). If this is the case Server Side Routing will be used. This requires the connection to be made via a neo4j:// URI.

Client side query evaluation

When connected via bolt:// or against instances that don’t support SSR, a best effort estimation on queries will be done whether they can be routed to readers or require writers to be executed. When in doubt, we will route things to a writer. The application uses an EXPLAIN call to do this. This will happen only once per query.

Enforcing SSR routing

Warning
You can’t enforce SSR over a bolt:// connection.

All checks can be skipped by running this application with the ssr profile on. This can be specified as a JVM paramter with -Dspring.profiles.active=ssr or as an environment variable export spring_profiles_active=ssr.

Specifying the default value for SSR in case of no connectivity on startup

Use org.neo4j.http.default-to-ssr=true either as property or environment variable to default to SSR in case the connection to an instance can’t be made during startup of this application.

Managed or implicit transactions?

By default, the PoC will make use of managed transaction (aka transactional functions aka retry-functions), so that all queries should be eventually succeed according to Neo4js definition of retryable. In cases we figure that a query uses USING PERIODIC COMMIT LOAD CSV … or CALL {…} IN TRANSACTION we fallback to implicit (aka server managed transactions) and no retries will be attempt

Compiling and running

You need Java 17 to create an executable Jar file and additional, Docker to run all tests and create a docker image.

Just create packages

./mvnw -Dfast clean package

Execute all tests and create a package

./mvnw clean verify

Execute all tests and create a docker image

./mvnw clean spring-boot:build-image -Dspring-boot.build-image.imageName=neo4j/neo4j-http

Creating a native image with GraalVM

This requires at least GraalVM 22.3 with native image installed

./mvnw -Pnative clean package native:compile

Creating a native image inside docker

This requires at least GraalVM 22.3 with native image installed

./mvnw -Pnative clean spring-boot:build-image -Dspring-boot.build-image.imageName=neo4j/neo4j-http-native
Note
Native image docker containers are not currently supported on ARM chipsets (see: https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-with-GraalVM#building-container-images)

Running the Jar-File

All Neo4j related data properties - those are all that start with spring.neo4j.* - can be used to configure the Bolt connection. These can come from application.properties or application.yml files. Basically, all features of externalized configuration can be used.

The following connects the application to a locally running instance, using the password secret. We are using locally exported environment variables here:

SPRING_NEO4J_URI=neo4j://localhost:7687 \
SPRING_NEO4J_AUTHENTICATION_USERNAME=neo4j \
SPRING_NEO4J_AUTHENTICATION_PASSWORD=verysecret \
java -jar neo4j-http/target/neo4j-http-0.0.1-SNAPSHOT-runnable.jar

You can also use a credentials file downloaded from Aura likes this:

set -o allexport # (1)
(source ~/Downloads/credentials-XXX.env;  java -jar neo4j-http/target/neo4j-http-0.0.1-SNAPSHOT-runnable.jar)
set +o allexport
  1. Might not be needed in your shell

Ensuring connectivity

By default, no checks are done during startup whether Neo4j is reachable. You can set org.neo4j.http.verify-connectivity=true via any of the available means and the application will try to reach Neo4j during startup and fail hard if it won’t reach any.

Usually this is not necessary, as the driver is able to heal when Neo4j becomes available. To monitory the status, you can use either of the following endpoints:

Listing 1. Checking the application health
curl -X GET --location "http://localhost:8080/actuator/health/" \
    --basic --user neo4j:secret

It will return the full status, similar to this when authenticated, status only without authentication:

{
  "status": "UP",
  "components": {
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 994662584320,
        "free": 744871899136,
        "threshold": 10485760,
        "exists": true
      }
    },
    "livenessState": {
      "status": "UP"
    },
    "neo4j": {
      "status": "UP",
      "details": {
        "server": "4.4.11@localhost:7687",
        "edition": "community",
        "database": "neo4j"
      }
    },
    "ping": {
      "status": "UP"
    },
    "readinessState": {
      "status": "UP"
    }
  },
  "groups": [
    "liveness",
    "readiness"
  ]
}

There are reduced endpoints for liveness and readiness:

Listing 2. Checking for readiness only
curl -X GET --location "http://localhost:8080/actuator/health/readiness"

Running the Docker image

docker run \
-e 'SPRING_NEO4J_URI=neo4j://yourhost:7687' \
-e 'SPRING_NEO4J_AUTHENTICATION_USERNAME=neo4j' \
-e 'SPRING_NEO4J_AUTHENTICATION_PASSWORD=secret' \
-p 8080:8080 \
neo4j/neo4j-http:latest

Running the Docker image in Kubernetes or similar

You might want to configure appropriate probes for your setup somewhat similar to this

Listing 3. Configuring K8s to use the built-in probes
livenessProbe:
  httpGet:
    path: "/actuator/health/liveness"
    port: <actuator-port>
  failureThreshold: ...
  periodSeconds: ...

readinessProbe:
  httpGet:
    path: "/actuator/health/readiness"
    port: <actuator-port>
  failureThreshold: ...
  periodSeconds: ...

Available endpoints

Running queries

Parameter types

Aligning with Neo4j Java Driver types, we support types that cannot and should not automatically get derived from a String by Jackson. To use define those types in a request define the parameter in the list of parameters as follows:

{
  "statement": "RETURN $aDateValue as dateInput, $aStringValue as stringInput",
  "parameters": {
    "aDateValue": {
      "$type": "Date",
      "_value": "2022-10-31"
    },
    "aStringValue": "somthing"
  }
}

Type name

example value

Date

"2022-10-23"

Time

"13:37:11+02:00"

LocalTime

"13:37:11"

DateTime

"2022-10-18T13:37:11+02:00[Europe/Paris]"

LocalDateTime

"2022-10-18T13:37:11"

Duration

"PT23H21M"

Period

"P20D"

Point

"SRID=4979;POINT(12.994823 55.612191 2)"

Byte[]

"00 01 02 03 04 05 06 07" (whitespaces are optional)

All other parameters can be specified by default JSON types, such as literal null, Strings, boolean and numbers.

Running one or more queries and get one or more result

This endpoint behaves just like the current Neo4j-HTTP and also supports its current parameters and options. As a matter of fact, the Neo4j-OGM-HTTP driver tests successfully against it. For the basic format used, see Cypher transaction API.

Note
This PoC only allows "Beginning and committing a transaction in one request" as defined here to keep the API stateless.

An example call taken straight from the above documentation looks like this:

curl -X POST --location "http://localhost:8080/db/neo4j/tx/commit" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -d "{
          \"statements\": [
            {
              \"statement\": \"CREATE (n:Hello {name: 'World', createdAt: datetime()}) RETURN n\",
              \"includeStats\": true,
              \"resultDataContents\": [\"row\", \"graph\"]
            }
          ]
        }" \
    --basic --user neo4j:secret

The result will look like this

{
  "results": [
    {
      "columns": [
        "n"
      ],
      "data": [
        {
          "row": [
            {
              "name": "World",
              "createdAt": "2022-10-26T07:16:54.078Z"
            }
          ],
          "meta": [
            {
              "id": 7,
              "type": "node"
            }
          ],
          "graph": {
            "relationships": [],
            "nodes": [
              {
                "id": 7,
                "properties": {
                  "name": "World",
                  "createdAt": "2022-10-26T07:16:54.078Z"
                },
                "labels": [
                  "Hello"
                ]
              }
            ]
          }
        }
      ],
      "stats": {
        "contains_updates": true,
        "nodes_created": 1,
        "nodes_deleted": 0,
        "properties_set": 2,
        "relationships_created": 0,
        "relationship_deleted": 0,
        "labels_added": 1,
        "labels_removed": 0,
        "indexes_added": 0,
        "indexes_removed": 0,
        "constraints_added": 0,
        "constraints_removed": 0,
        "contains_system_updates": false,
        "system_updates": 0
      }
    }
  ],
  "notifications": [],
  "errors": []
}

Streaming the results of one query

This endpoint is different to the existing API. It allows only one query to be executed and does not allow to specify the format. In addition, it will render complex data types as shown in Parameter types while streaming each record returned:

curl -X POST --location "http://localhost:8080/db/neo4j/tx/commit" \
    -H "Content-Type: application/json" \
    -H "Accept: application/x-ndjson" \
    -d "{
          \"statement\": \"WITH range(1,10) AS r UNWIND r as i CREATE (n:Hello {name: 'World ' + i, createdAt: datetime()}) RETURN n\"
        }" \
    --basic --user neo4j:secret
Important
Note the accepted content type, it is application/x-ndjson and that the query is not wrapped in a list of statements.

The result are 10 chunks of json looking like this:

{
  "n": {
    "name": "World 1",
    "createdAt": {
      "$type": "DateTime",
      "_value": "2022-10-26T07:20:21.239Z"
    }
  }
}

Getting metrics

Metrics are available via Spring Boot actuator at this endpoint:

curl -X GET --location "http://localhost:8080/actuator/metrics/" \
    --basic --user neo4j:secret

Relevant driver metrics start with neo4j.driver.*, the connection usage for example can be retrieved like this:

curl -X GET --location "http://localhost:8080/actuator/metrics/neo4j.driver.connections.usage" \
    --basic --user neo4j:secret

And has this format:

{
  "name": "neo4j.driver.connections.usage",
  "baseUnit": "seconds",
  "measurements": [
    {
      "statistic": "COUNT",
      "value": 10.0
    },
    {
      "statistic": "TOTAL_TIME",
      "value": 0.296342917
    },
    {
      "statistic": "MAX",
      "value": 0.0
    }
  ],
  "availableTags": [
    {
      "tag": "address",
      "values": [
        "localhost:7687"
      ]
    }
  ]
}

All metrics can be exported as described in the official Spring Boot Manual towards a plethora of different tools.

About

PoC for an external HTTP API using Bolt.


Languages

Language:Java 100.0%