PostgreSQL - Python

The default is psycopg2 which is quite fast and intuitive. The other package is asyncpg for the brave developers that are not afraid of concurrency.

For psycopg2 it is usually easier to PIP install the binary package to avoid errors due to missing dependencies because of in-place compiling.

pip3 install --user psycopg2-binary

Issues

If your pip3 is too old, it pip wont select the binary packet and tries to compile it from source (psql01, psql02). In this case, you need to upgrade pip:

pip3 install --upgrade pip

Beware of executemany

Since it is not faster than execute, execute_batch and friends should be used: https://www.psycopg.org/docs/extras.html#fast-exec (Fast execution helpers)

Named cursors

To perform

select * from dingenskirchen

efficiently without limit and stuff, psycopg2 supports named clusters that lift the burden of keeping everything in the client memory

cur = conn.cursor(name='foo') # magic!
cur.execute("SELECT picalike_id, x1, x2, x3, y1, y2, y3 FROM sim_clusters")
while True:
 rows = cur.fetchmany(5000)
 if not rows: break

Auto reconnect

There is no functionality in psycopg2. This caused a lot of pain and trouble. In the git repo https://git.picalike.corpex-kunden.de/incubator/swiss-army-knife/-/tree/master/db there are scripts with solution hacks(!). The code is not generic but tested.

References

https://www.psycopg.org/docs/extras.html#fast-exec

Keywords: psql python postgres connect database