jneight / django-db-geventpool

Another DB pool using gevent

Cannot destroy test db

askholme opened this issue · comments

Hi

Using this pool with gevent and psycogreen patching, I constantly get errors when running "manage.py test".
The problem seems to be that when django closes a db connection, it actually just goes back to the pool (instead of being closed). This means the connections are still active, so postgresql blocks the "DROP DATABASE" statement that is issued.

I do understand that this is not a production problem (you would never use drop database there), but it still hampers testing quite a bit (my issue is that I'm using gevent explicitly in celery tasks which I need to test, so it would not help to simply turn off gevent when environment==DEV).

Do you have any clue about a good fix for this?

Hello, I have some projects that spawn greenlets explicitly and have never run into this problem.

Which version of Django are you using? Maybe there is some problem with Django 1.7.

Have you added these params to DATABASES['default'] in your settings? (from the README)

```python
'ATOMIC_REQUESTS': False,
'AUTOCOMMIT': True,
'CONN_MAX_AGE': 0,
```

The persistent database connection support that was added in Django 1.6 needs to be disabled, or the connections will never go back to the pool and so will never be closed.
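For context, here is a minimal sketch of the full DATABASES setting with those three flags in place. The engine path is the pooled PostgreSQL backend from this project's README; the name/user/password/host values are placeholders:

```python
# settings.py sketch; connection values are placeholders
DATABASES = {
    'default': {
        # pooled PostgreSQL backend shipped by django-db-geventpool
        'ENGINE': 'django_db_geventpool.backends.postgresql_psycopg2',
        'NAME': 'app_db',
        'USER': 'app_user',
        'PASSWORD': 'secret',
        'HOST': 'localhost',
        'ATOMIC_REQUESTS': False,  # no implicit transaction per request
        'AUTOCOMMIT': True,
        'CONN_MAX_AGE': 0,  # disable Django 1.6+ persistent connections
    }
}
```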

How are you handling the greenlets spawned inside the celery task? Are you calling join() or joinall() and waiting for all greenlets to finish their processing before ending the task?
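For reference, this is the kind of pattern the question assumes, sketched with gevent (the work function here is a placeholder; a real task body would hit the database):

```python
import gevent

def work(n):
    # placeholder for the real task body that would touch the DB
    gevent.sleep(0)  # yield to the hub, as real I/O would
    return n * n

# spawn the greenlets explicitly, then block until all have finished
# before the celery task returns
jobs = [gevent.spawn(work, n) for n in range(4)]
gevent.joinall(jobs)
results = [job.value for job in jobs]
```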

regards.

Ok, I was able to reproduce this problem; you are correct, this bug only happens while running the tests. The problem is that the connections are never closed. The tests themselves are correctly checked (the bug doesn't change the results, and they are correctly displayed), but an OperationalError is raised at the end of the logs. I will try to find a clean way to fix it.

Thanks!

Good, and you are correct about what happens; it's not a problem that impacts test results, it just makes running the tests rather slow.

I also noticed that the problem only occurs when the gevent-specific test is run last.
I.e. if I run only that test I get the problem; however, if I do a full test run where the gevent-specific test is placed first or in the middle, I don't get the problem.

So somehow a command triggered by later test runs clears up the problem.

As for your question about the use of gevent: I'm spawning a pool and calling join to block the thread until the work has completed. In practice, however, the test only places one item in the queue, meaning that only one greenlet is spawned and nothing really happens concurrently, but the bug is triggered nonetheless.

Let me know if you need anything else

Br
ask

The bug should be fixed now. If you want, you can try the latest version (I will update PyPI later):

pip install https://github.com/jneight/django-db-geventpool/archive/master.zip

Thanks.

In my installation the problem is still present when using the latest version.
(Actually it's gotten worse, since now it also blocks DROP DATABASE at the beginning of a test session, meaning that I need to manually drop the DB to rerun the tests.)

But it might be some other stuff causing headaches; I will try to move the code to a standalone example in its own virtualenv to see if that also has trouble.

On the newest version this works in a plain environment, so I guess I just have to figure out what causes trouble in my main app!

Thanks for the help

Okay, I narrowed the error down a bit, and it can now be reproduced with the "small" example at https://github.com/kvanque/geventpooltest
My successful test earlier was because I didn't monkey patch in manage.py.

So here is what causes the error:

The pool code relies on CONN_MAX_AGE to return unused connections to the pool.
Handling of that is done by django.db.close_old_connections(), which is triggered by the request_started and request_finished signals.
The trick, however, is that close_old_connections() only checks connections registered with its ConnectionHandler, and the ConnectionHandler uses a local() to keep track of connections in a thread-safe way.

But if one calls monkey.patch_all(), gevent will change local() to be greenlet-local instead of thread-local, and thus close_old_connections() needs to be called from each greenlet that uses the DB (since the ConnectionHandler will create a DatabaseWrapper, and thus a connection, per greenlet).
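A small self-contained sketch of the local() behaviour described above: after monkey.patch_all(), attributes set on a threading.local in the main greenlet are invisible inside spawned greenlets, which is why each greenlet ends up with its own DatabaseWrapper (and connection):

```python
from gevent import monkey
monkey.patch_all()  # threading.local is now greenlet-local

import threading
import gevent

store = threading.local()
store.conn = 'main-connection'

seen = []

def worker():
    # each greenlet starts with an empty local, so the ConnectionHandler
    # would create a fresh "connection" here
    seen.append(hasattr(store, 'conn'))

gevent.joinall([gevent.spawn(worker), gevent.spawn(worker)])
# seen is [False, False]: neither greenlet saw the main greenlet's attribute
```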

My example confirms this, in that I can fix the problem by either:

  1. not monkey patching threads
  2. manually calling close_old_connections inside each greenlet

If I understand the code correctly, the problem actually has two bad effects:

  1. Cases where django does not fire the request start/end signals must ensure connections are manually returned/closed before a greenlet ends (this covers everything running outside a wsgi handler, e.g. celery workers, tests, management commands, and greenlets spawned manually inside request handling, plus possibly other stuff)
  2. Even when running inside a wsgi handler, DB pooling will not actually happen (each greenlet = one connection), which means more connection overhead than we would like

There are two solutions to this:

  1. Simply document that monkey patching the thread module is not compatible with db-geventpool (i.e. one should call monkey.patch_all(thread=False)), and that close_old_connections needs to be triggered when running anything other than a wsgi server
  2. Apply some extra monkey patching to turn the ConnectionHandler back into a thread-local thing instead of a greenlet-local thing, plus some signal handling to periodically trigger close_old_connections()

I would probably go for the first one in terms of the monkey patching. However, a general solution for calling close_old_connections() would be nice; I might put in a pull request if I figure out a nice way to make one.

P.S.: The general bug of not closing all connections actually also exists with create_test_db (cases where a previous test DB still exists); I will submit a pull request for fixing that one.

Okay, stupid me, it's obviously not a good idea to turn off the thread patching.
That would cause the ORM to share a database wrapper among greenlets, which in effect means they would try to use the same connection, something that naturally does not play well.

The correct solution is simply to patch threading and ensure that connections are returned to the pool, i.e. by signaling request_finished or calling close_old_connections() before a greenlet ends.

You are correct: when a greenlet ends, the connection is not returned to the pool. Maybe there is a way to detect a greenlet exit; I will look into the gevent code. I will also add a decorator with this behaviour, since I will always forget to call close_old_connections() :)
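A minimal, self-contained sketch of what such a decorator could look like. The names here are hypothetical, not the project's eventual API, and the cleanup callable stands in for django.db.close_old_connections (which needs a configured Django project to run):

```python
import functools

def with_connection_cleanup(cleanup):
    """Decorator factory: run `cleanup` when the wrapped function exits,
    even on error. In a Django project, pass django.db.close_old_connections
    as `cleanup` and wrap the function each greenlet runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                cleanup()  # return this greenlet's connection to the pool
        return wrapper
    return decorator

# demo with a stand-in cleanup so the sketch runs on its own
closed = []

@with_connection_cleanup(lambda: closed.append(True))
def task(x):
    return x + 1

result = task(41)  # cleanup fires after the function body returns
```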

Thanks.