neo4j-contrib / neo4j-apoc-procedures

Awesome Procedures On Cypher for Neo4j - codenamed "apoc"                     If you like it, please ★ above ⇧            

Home Page:https://neo4j.com/labs/apoc

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

apoc.load.jdbcUpdate inside apoc.periodic.iterate leaves idle connections in 5.19.0

nielsjansendk opened this issue · comments

Expected Behavior (Mandatory)

When running a apoc.load.jdbcUpdate operation as the second statement in an apoc.periodic.iterate, the connections to the database (postgres) should be closed after they have been executed.

Actual Behavior (Mandatory)

It does not close the connections, and it leaves an idle connection for each statement. When the postgres database reaches max connections the statement fails.

How to Reproduce the Problem

Steps (Mandatory)

  1. Create a table in a postgres database:
    CREATE TABLE nodes ( id serial PRIMARY KEY, my_id integer );

  2. In you neo4j database, create some nodes:
    WITH range(0, 100) as list UNWIND list as l CREATE (n:MyNode{id: l})

  3. Load the postgres driver:
    CALL apoc.load.driver("org.postgresql.Driver")

  4. Try to use apoc.period.iterate to insert into the postgres table:

CALL apoc.periodic.iterate("MATCH (n:MyNode) return n", "WITH n, apoc.text.format('insert into nodes (my_id) values (\\\'%d\\\')',[n.id]) AS sql CALL apoc.load.jdbcUpdate('jdbc:postgresql://my_user:my_password@localhost:5432/my_database',sql) YIELD row AS row2 return row2,n", {batchsize: 10,parallel: true}) yield batches,total,timeTaken,committedOperations,failedOperations,failedBatches,retries,errorMessages, operations, failedParams, updateStatistics return batches,total,timeTaken, committedOperations,failedOperations,failedBatches,retries,errorMessages,operations,failedParams, updateStatistics

  1. Check postgres for connections:

select client_addr, state from pg_stat_activity to see the hanging connections from the neo4j server. They can be killed with select pg_terminate_backend(pid) from pg_stat_activity where state = 'idle' but a pg_cancel_backend will not work.

NOTE: running the same experiment on neo4j 5.18 works fine, it 5.19 that introduces this problem.

Specifications (Mandatory)

Currently used versions

Versions

  • OS: Ubuntu 22.04.4
  • Neo4j: 5.19.0
  • Neo4j-Apoc: 5.19.0
  • pg driver: postgresql-42.7.3.jar
  • Postgres version: 16.2

Also happens for apoc.load.csv (holds the lock on windows) not sure if this is related.

https://community.neo4j.com/t/database-is-holding-onto-file-after-apoc-load-csv-creating-a-permission-error/67943

The problem does not seem to be caused by apoc.periodic.iterate, in fact even doing apoc.load.jdbcUpdate directly many times, idle connections are left.

However it would seem not to be an apoc problem, as PreparedStatement.close() executes correctly without errors, rather it could depend on Postgres itself leaving connections idle for a certain amount of time, as mentioned here and here.

Even with the latest JDBC driver version, the problem still seems to be present.

Please feel free to reopen the issue, if you have more info to share.

The problem isn't related to the apoc.load.csv one.

@vga91 The problem does not exist with neo4j version 5.18. So, exact same process, create the nodes, write to postgres: no idle connections. So something must have changed from 5.18 to 5.19. I can confirm that the problem persist in 5.21.

Changing the version of the postgres driver does not make any difference.