giampaolo / pyftpdlib

Extremely fast and scalable Python FTP server library

Add support for kqueue() and epoll() to event loop

giampaolo opened this issue · comments

From g.rodola on January 25, 2012 18:35:26

Right now the internal poller depends on the asyncore module; as such it can 
only use the select() and poll() system calls, which don't scale/perform well 
with thousands of concurrent clients.
This is a benchmark using poll():

pyftpdlib 0.7.0:

2000 concurrent clients (connect, login)      36.63 secs
2000 concurrent clients (RETR 10M file)      128.07 secs
2000 concurrent clients (STOR 10M file)      189.73 secs
2000 concurrent clients (quit)                 0.39 secs


proftpd 1.3.4rc2:

2000 concurrent clients (connect, login)      44.59 secs
2000 concurrent clients (RETR 10M file)       33.90 secs
2000 concurrent clients (STOR 10M file)      138.94 secs
2000 concurrent clients (quit)                 2.28 secs


2000 clients here actually means 4000 concurrent connections (control + data).
As the numbers show, poll() suffers a serious performance degradation at this 
scale. select(), on the other hand, wouldn't have been able to work at all as 
it has a limit of 1024 fds.

epoll() (Linux) and kqueue() (BSD / OSX) are supposed to fix these problems 
altogether.

What I have in mind (for version 1.0.0) is to add a "lib" package with a 
modified version of asyncore.dispatcher and an asyncore.loop supporting 
kqueue()/epoll().
A partial patch I wrote some time ago is here: http://bugs.python.org/issue6692 
Also, tornado ( http://www.tornadoweb.org/ ) can be used as an example for the 
epoll() implementation.

Original issue: http://code.google.com/p/pyftpdlib/issues/detail?id=203

From g.rodola on January 28, 2012 08:23:42

A preliminary patch is in attachment.

=== before patch (poll()) ===

giampaolo@ubuntu:~/svn/pyftpdlib$ python test/bench.py -u giampaolo -p XXX -b concurrence -s 1K -n 2000
2000 concurrent clients (connect, login)      34.98 secs
2000 concurrent clients (RETR 1K file)        61.02 secs
2000 concurrent clients (STOR 1K file)       169.42 secs
2000 concurrent clients (quit)                 0.11 secs


=== after patch (epoll()) ===

giampaolo@ubuntu:~/svn/pyftpdlib$ python test/bench.py -u giampaolo -p XXX -b concurrence -s 1K -n 2000
2000 concurrent clients (connect, login)      19.46 secs
2000 concurrent clients (RETR 1K file)        24.29 secs
2000 concurrent clients (STOR 1K file)       122.09 secs
2000 concurrent clients (quit)                 0.10 secs
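
For reference, the speedup implied by the two runs above can be worked out 
with a few lines of Python. This only divides the timings quoted in this 
comment; the variable names are made up for the example and are not part of 
bench.py.

```python
# Timings (seconds) copied from the two benchmark runs above.
before_poll = {
    "connect, login": 34.98,
    "RETR 1K file": 61.02,
    "STOR 1K file": 169.42,
    "quit": 0.11,
}
after_epoll = {
    "connect, login": 19.46,
    "RETR 1K file": 24.29,
    "STOR 1K file": 122.09,
    "quit": 0.10,
}

# Speedup factor = time before the patch / time after the patch.
speedup = {name: before_poll[name] / after_epoll[name] for name in before_poll}

for name, ratio in speedup.items():
    print(f"{name:<16} {ratio:.2f}x faster")
```

The largest relative gain is on RETR; STOR improves the least of the three 
transfer-related steps.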

Attachment: ioloop.patch

From g.rodola on February 17, 2012 11:45:58

Patch in attachment adds kqueue() support (BSD and OSX systems).

Attachment: kqueue.patch

From g.rodola on February 18, 2012 08:06:36

Updated patch.

Attachment: ioloop.patch

From g.rodola on February 28, 2012 08:50:49

Updated patch in attachment.

CHANGES:
- got rid of serve_forever()'s "use_poll" and "count" arguments; replaced with 
a new "blocking" argument defaulting to True

TODO:
- kqueue() uses a hack for accepting sockets
- epoll()/poll() currently checks for error fds in order to detect closed 
connections but this might not be necessary (twisted doesn't do that)
- on the other hand, select() on Windows might need to do that

Attachment: ioloop.patch

From g.rodola on March 02, 2012 14:23:39

Ok, I think this is done.
Here's a summary to clarify what I've done.

Before the patch
================

- The IO loop was based on asyncore stdlib module which only supports select() 
and poll().

- These are known to scale/perform reasonably well under a thousand concurrent 
connections, then they start to show performance degradation (poll()) or don't 
work at all (select()).

- asyncore's IO poller is also particularly naive in that every registered file 
descriptor is checked for both read and write operations, even for idle 
connections.

- That means that with 200 connected clients we iterate over a list of 400 (200 
* 2) elements on every loop.
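
A rough sketch of the behaviour described above, to make the "200 * 2" count 
concrete. This is not asyncore's code: FakeChannel and naive_scan are 
hypothetical stand-ins that only mimic the pattern of asking every channel 
whether it is readable and writable on every loop iteration.

```python
class FakeChannel:
    """Hypothetical stand-in for an asyncore-style dispatcher."""

    def __init__(self, wants_read=False, wants_write=False):
        self.wants_read = wants_read
        self.wants_write = wants_write
        self.checks = 0  # how many times the loop asked us something

    def readable(self):
        self.checks += 1
        return self.wants_read

    def writable(self):
        self.checks += 1
        return self.wants_write


def naive_scan(socket_map):
    """Build the read/write fd lists by querying every channel twice,
    regardless of whether the connection is idle."""
    r, w = [], []
    for fd, channel in socket_map.items():
        if channel.readable():
            r.append(fd)
        if channel.writable():
            w.append(fd)
    return r, w


# 199 idle clients plus one client that has data to read.
socket_map = {fd: FakeChannel() for fd in range(199)}
socket_map[199] = FakeChannel(wants_read=True)

r, w = naive_scan(socket_map)
total_checks = sum(ch.checks for ch in socket_map.values())
print(total_checks, r, w)
```

A single pass performs 400 checks even though only one fd ends up in the 
read list.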


After the patch
===============

- The IO loop has been rewritten from scratch and now supports epoll() and 
kqueue() on Linux and OSX/BSD.

- epoll() and kqueue() scale/perform better with thousands of connections.

- asyncore's original select() and poll() implementations were rewritten.

- The poller is smarter in that it only iterates on fds which are actually 
interested in either reading or writing.

- That means that with 200 clients, all idle except one, we will iterate over 
a list of 1 element instead of 400.

- This is valid for all pollers, including select().

- By default we use the best poller available for the platform:
    - Linux: epoll()
    - OSX/BSD: kqueue()
    - all other POSIX: poll()
    - Windows: select()

- FTPServer.serve_forever() signature has changed.
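
The two ideas in the list above (choosing a poller per platform, and polling 
only fds that declared interest) can be sketched as follows. This is a 
minimal illustration under my own assumptions, not the code from 
ioloop.patch: best_poller_name and SelectPoller are hypothetical names, and 
platform detection is done here simply by probing the stdlib select module.

```python
import select
import socket


def best_poller_name():
    """Return the preferred poller that the select module exposes here,
    using the preference order epoll, kqueue, poll, select."""
    for name in ("epoll", "kqueue", "poll"):
        if hasattr(select, name):
            return name
    return "select"


class SelectPoller:
    """Hypothetical minimal poller: only fds that registered interest
    in reading or writing are ever handed to select()."""

    READ, WRITE = 1, 2

    def __init__(self):
        self._readers = set()
        self._writers = set()

    def register(self, fd, events):
        if events & self.READ:
            self._readers.add(fd)
        if events & self.WRITE:
            self._writers.add(fd)

    def unregister(self, fd):
        self._readers.discard(fd)
        self._writers.discard(fd)

    def poll(self, timeout):
        # Nothing registered: skip the system call entirely.
        if not self._readers and not self._writers:
            return [], []
        r, w, _ = select.select(list(self._readers), list(self._writers),
                                [], timeout)
        return r, w


# Demo: one connected pair; only the side expecting data is registered.
a, b = socket.socketpair()
fd = a.fileno()
poller = SelectPoller()
poller.register(fd, SelectPoller.READ)
b.send(b"x")
r, w = poller.poll(1.0)
print(best_poller_name(), r, w)
a.close()
b.close()
```

Idle connections that never call register() cost nothing per loop 
iteration, which is the point made above about iterating over 1 element 
instead of 400.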


Final benchmark
===============

=== old select() implementation ===

200 concurrent clients (connect, login)                0.96 secs
STOR (1 file with 200 idle clients)                   81.94 MB/sec
RETR (1 file with 200 idle clients)                   89.01 MB/sec
200 concurrent clients (RETR 10M file)                 2.80 secs
200 concurrent clients (STOR 10M file)                 6.65 secs
200 concurrent clients (QUIT)                          0.02 secs


=== new select() implementation ===

200 concurrent clients (connect, login)                0.78 secs
STOR (1 file with 200 idle clients)                  399.46 MB/sec
RETR (1 file with 200 idle clients)                  761.53 MB/sec
200 concurrent clients (RETR 10M file)                 2.22 secs
200 concurrent clients (STOR 10M file)                 5.79 secs
200 concurrent clients (QUIT)                          0.01 secs


=== epoll() implementation ===

200 concurrent clients (connect, login)                0.77 secs
STOR (1 file with 200 idle clients)                  535.83 MB/sec
RETR (1 file with 200 idle clients)                 1632.50 MB/sec
200 concurrent clients (RETR 10M file)                 2.24 secs
200 concurrent clients (STOR 10M file)                 5.82 secs
200 concurrent clients (QUIT)                          0.02 secs


Further note
============

A patch which can be applied to the current 0.7.0 version is in attachment.

Attachment: ioloop.patch

From g.rodola on May 11, 2012 08:31:44

Patch including updated docstrings.

Attachment: ioloop.patch

From g.rodola on May 23, 2012 08:18:13

This is now committed in r1049.

Status: FixedInSVN
Labels: Milestone-1.0.0

From g.rodola on May 23, 2012 08:30:58

Final patch attached.

Attachment: ioloop.patch

From nagy.att...@gmail.com on July 16, 2012 11:08:42

Thank you very much for this! I've just begun to port my SMTP server from 
Python's default asyncore to your lib, and using exactly the same code shows a 
substantial speedup.
Previously a dummy SMTP sink could do around 70 MiBps (std asyncore with poll), 
with your io loop (FreeBSD, kqueue) it does around 110.
The same logic in twisted can do about 20...

From g.rodola on February 19, 2013 04:49:26

Releasing 1.0.0 just now. Closing.

Status: Fixed
Labels: Version-0.7.0

From g.rodola on February 19, 2013 04:58:50

Final benchmarks: https://code.google.com/p/pyftpdlib/wiki/Benchmarks