denisenkom / pytds

Python DBAPI driver for MSSQL using pure Python TDS (Tabular Data Stream) protocol implementation

pytds cannot handle large query results

jaj42 opened this issue · comments

I am working with a large database of high-frequency data, and I regularly run queries that return large results. Above a certain threshold (somewhere above 120 kB), pytds fails with: pytds.tds_base.ClosedConnectionError: Server closed connection
By querying different tables, I found that the failure does not seem to depend on the number of rows returned but on the size of the data.

Here is a gist to reproduce the error:
https://gist.github.com/jaj42/e334457459095a325ebaed4e88461b05

With pyodbc, the same query returns more than 4 MB without any error.
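
To separate "number of rows" from "size of the data" when comparing drivers, a small helper along the following lines may be useful. This is only a sketch: `measure_fetch` is a hypothetical name, it relies on nothing beyond the standard DB-API `fetchone()` method, and the byte count is a rough estimate (text is counted as UTF-16-LE, binary values by length, everything else ignored), not the actual size on the wire.

```python
import sqlite3  # only used for the self-contained demo below


def measure_fetch(cursor):
    """Fetch rows one by one from a DB-API cursor.

    Returns (rows_fetched, approx_bytes, error), where error is None
    if the result set was read to the end, or the exception that
    interrupted the fetch.
    """
    rows = 0
    approx_bytes = 0
    try:
        while True:
            row = cursor.fetchone()
            if row is None:
                break
            rows += 1
            for value in row:
                if isinstance(value, str):
                    # Rough estimate: two bytes per UTF-16 code unit.
                    approx_bytes += len(value.encode("utf-16-le"))
                elif isinstance(value, (bytes, bytearray)):
                    approx_bytes += len(value)
    except Exception as exc:  # e.g. a driver-specific connection error
        return rows, approx_bytes, exc
    return rows, approx_bytes, None


# Demo against an in-memory SQLite database (stand-in for the real server).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (payload TEXT)")
cur.executemany("INSERT INTO t VALUES (?)", [("x" * 1000,)] * 50)
cur.execute("SELECT payload FROM t")
print(measure_fetch(cur))
```

Running the same helper on a pytds cursor and on a pyodbc cursor for the same query would show how many rows and roughly how many bytes each driver delivered before any failure.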

Here is a traceback:

Traceback (most recent call last):
  File "extract_stuff_with_pytds.py", line 22, in <module>
    df = pd.DataFrame.from_records(res, columns=colnames, nrows=4000)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 1240, in from_records
    values.extend(itertools.islice(data, nrows - 1))
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/__init__.py", line 879, in __next__
    row = self.fetchone()
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/__init__.py", line 852, in fetchone
    row = self._session.fetchone()
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 1581, in fetchone
    if not self.next_row():
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 1592, in next_row
    self.process_token(marker)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 1542, in process_token
    return handler(self)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 1659, in <lambda>
    tds_base.TDS_ROW_TOKEN: lambda self: self.process_row(),
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 639, in process_row
    curcol.value = self.row[i] = curcol.serializer.read(r)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds_types.py", line 804, in read
    return r.read_str(size, ucs2_codec)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 212, in read_str
    return codec.decode(readall(self, size))[0]
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds_base.py", line 564, in readall
    return join_bytearrays(read_chunks(stm, size))
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds_base.py", line 376, in join_bytearrays
    return b''.join(ba)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds_base.py", line 543, in read_chunks
    buf = stm.recv(left)
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 153, in recv
    self._read_packet()
  File "/home/ml/anaconda3/lib/python3.7/site-packages/pytds/tds.py", line 232, in _read_packet
    raise tds_base.ClosedConnectionError()
pytds.tds_base.ClosedConnectionError: Server closed connection
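
The last frames of the traceback show a value being read in chunks until the expected number of bytes has arrived, with the error raised when the stream ends early. The following is a minimal, self-contained sketch of that general pattern, not the pytds source: `ConnectionClosed`, `ChunkedStream`, and `read_all` are hypothetical names, and `read_chunks` only mirrors the function name that appears in the traceback.

```python
class ConnectionClosed(Exception):
    """Hypothetical stand-in for a driver's closed-connection error."""


class ChunkedStream:
    """Test double: returns at most `chunk` bytes per recv(), then b''."""

    def __init__(self, data, chunk):
        self._data = data
        self._chunk = chunk

    def recv(self, size):
        n = min(size, self._chunk)
        out, self._data = self._data[:n], self._data[n:]
        return out


def read_chunks(stream, size):
    """Yield chunks until exactly `size` bytes have been read."""
    left = size
    while left > 0:
        buf = stream.recv(left)
        if not buf:
            # The peer stopped sending before the value was complete.
            raise ConnectionClosed(
                "stream ended with %d bytes outstanding" % left
            )
        yield buf
        left -= len(buf)


def read_all(stream, size):
    """Read exactly `size` bytes and return them as one bytes object."""
    return b"".join(read_chunks(stream, size))
```

With this shape, a large value is more likely to hit the error path simply because it needs more `recv` calls, which would be consistent with the failure depending on data size rather than row count.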

I ran into the same problem; see #115 for my fix (but no promises that my fix is correct!)