Some of my random thoughts and insights with sqlite and python
getting started:
from sqlite3 import Connection
db = Connection("/tmp/test.db")
cur = db.cursor()
cur.execute("CREATE TABLE test(col1 int, col2 text)")
db.commit()
Einstellungen per PRAGMA (https://www.sqlite.org/pragma.html):
cur.execute("PRAGMA journal_mode=WAL") # probably only useful at the beginning of the programm
cursor.executemany:
data = [(1, "a"), (2, "b")]
cur.executemany("INSERT INTO test (col1, col2) VALUES (?, ?)", data)
db.commit() # <-- this seems to influence performance a lot (in a positive way, during feed updates)!
# http://codereview.stackexchange.com/questions/26822/myth-busting-sqlite3-performance-w-pysqlite#comment41680_26822
Journal Modes:
* OFF: Änderungen direkt in die Datenbank
* DELETE: Default: nach einer Transaktion wird das Journal gelöscht
* WAL: Write Ahead Log: Erlaubt mehrere Leser und Schreiber
* MEMORY: Journal wird im RAM gehalten
Ein Test hat ergeben, dass journal_mode=WAL für die LillyX am besten geeignet ist.
python + sqlite3 mit der aktuellsten sqlite3-Version
* Download pysqlite sources: wget https://pypi.python.org/packages/42/02/981b6703e3c83c5b25a829c6e77aad059f9481b0bbacb47e6e8ca12bd731/pysqlite-2.8.3.tar.gz#md5=033f17b8644577715aee55e8832ac9fc
* extract sources: tar xfvz pysqlite-2.8.3.tar.gz
* cd pysqlite-2.8.3
* Download most recent sqlite version from https://www.sqlite.org/download.html: wget https://www.sqlite.org/2017/sqlite-amalgamation-3160200.zip
* unzip sqlite-amalgamation-3160200.zip
* Die sqlite sources in das gleiche Verzeichnis wie die setup.py bewegen: mv sqlite-amalgamation-3160200/* .
* sqlite bauen: python setup.py build_static
* pip install --user .
* cp build/lib.linux-x86_64-2.7/pysqlite2/_sqlite.so ~/.local/lib/python2.7/site_packages/pysqlite2/
Prüfen, ob es funktioniert hat:
from pysqlite2 import dbapi2 as sqlite3
print "SQLITE version:", sqlite3.sqlite_version # -> 'SQLITE version: 3.16.2'
====== Corrupted Data / Applying Journals ======
In case the database is inconsistent due to an exception, there is an easy way to 'repair' the database and apply existing journals.
''%%sqlite3 v5_store.db vacuum%%''
This might take a while, but is the easiest way.
===== Read Only =====
''%%sqlite3.connect('file:{}?mode=ro'.format(feat_db), uri=True) # read-only%%''
In case you just want to fire read-only queries. This hopefully avoid unnecessary locks
===== Journal Mode: WAL =====
To avoid that SELECT queries are blocked while there is a running insert query for the same database, Write-Ahead-Logs should be used. A high-level overview is given in [1].\\
A few things should be noted:
* The option is remembered, meaning if you open a DB with 'wal' any connect will use it, even if the option is not given
* There are additional files for each db: *-shm, *-wal which means more space is required
To enable the write-ahead log mode
pragma journal_mode=wal
To switch back to the default journal mode, use
pragma journal_mode=delete
[1] https://charlesleifer.com/blog/going-fast-with-sqlite-and-python/
===== Inserts + Ignore =====
If one do not care if the primary key constraint is violated, the following construct can be used:
INSERT OR IGNORE INTO
This is like 'ON CONFLICT DO NOTHING' for PostgreSQL
==== Datetime functions ====
SELECT date(timestamp, 'unixepoch'), count(*) from qa group by date(timestamp, 'unixepoch');
convert unix epoch into a date, with time it is datetime(timestamp, 'unixepoch');