Some of my random thoughts and insights with sqlite and python

getting started:

<code>
from sqlite3 import Connection
db = Connection("/tmp/test.db")
cur = db.cursor()
cur.execute("CREATE TABLE test(col1 int, col2 text)")
db.commit()
</code>
Einstellungen per PRAGMA (https://www.sqlite.org/pragma.html):

<code>
cur.execute("PRAGMA journal_mode=WAL")  # probably only useful at the beginning of the programm
</code>
cursor.executemany:

<code>
data = [(1, "a"), (2, "b")]
cur.executemany("INSERT INTO test (col1, col2) VALUES (?, ?)", data)
db.commit()  # <-- this seems to influence performance a lot (in a positive way, during feed updates)!
# http://codereview.stackexchange.com/questions/26822/myth-busting-sqlite3-performance-w-pysqlite#comment41680_26822
</code>
Journal Modes:


  * OFF: Änderungen direkt in die Datenbank
  * DELETE: Default: nach einer Transaktion wird das Journal gelöscht
  * WAL: Write Ahead Log: Erlaubt mehrere Leser und Schreiber
  * MEMORY: Journal wird im RAM gehalten

Ein Test hat ergeben, dass journal_mode=WAL für die LillyX am besten geeignet ist.

python + sqlite3 mit der aktuellsten sqlite3-Version


  * Download pysqlite sources: wget https://pypi.python.org/packages/42/02/981b6703e3c83c5b25a829c6e77aad059f9481b0bbacb47e6e8ca12bd731/pysqlite-2.8.3.tar.gz#md5=033f17b8644577715aee55e8832ac9fc
  * extract sources: tar xfvz pysqlite-2.8.3.tar.gz
  * cd pysqlite-2.8.3
  * Download most recent sqlite version from https://www.sqlite.org/download.html: wget https://www.sqlite.org/2017/sqlite-amalgamation-3160200.zip
  * unzip sqlite-amalgamation-3160200.zip
  * Die sqlite sources in das gleiche Verzeichnis wie die setup.py bewegen: mv sqlite-amalgamation-3160200/* .
  * sqlite bauen: python setup.py build_static
  * pip install --user .
  * cp build/lib.linux-x86_64-2.7/pysqlite2/_sqlite.so ~/.local/lib/python2.7/site_packages/pysqlite2/

Prüfen, ob es funktioniert hat:

<code>
from pysqlite2 import dbapi2 as sqlite3
print "SQLITE version:", sqlite3.sqlite_version  # -> 'SQLITE version: 3.16.2'
</code>
====== Corrupted Data / Applying Journals ======

In case the database is inconsistent due to an exception, there is an easy way to 'repair' the database and apply existing journals.

''%%sqlite3 v5_store.db vacuum%%''

This might take a while, but is the easiest way.


===== Read Only =====

''%%sqlite3.connect('file:{}?mode=ro'.format(feat_db), uri=True) # read-only%%''

In case you just want to fire read-only queries. This hopefully avoid unnecessary locks


===== Journal Mode: WAL =====

To avoid that SELECT queries are blocked while there is a running insert query for the same database, Write-Ahead-Logs should be used. A high-level overview is given in [1].\\


A few things should be noted:


  * The option is remembered, meaning if you open a DB with 'wal' any connect will use it, even if the option is not given
  * There are additional files for each db: *-shm, *-wal which means more space is required

To enable the write-ahead log mode

<code>
pragma journal_mode=wal
</code>
To switch back to the default journal mode, use

<code>
pragma journal_mode=delete
</code>
[1] https://charlesleifer.com/blog/going-fast-with-sqlite-and-python/


===== Inserts + Ignore =====

If one do not care if the primary key constraint is violated, the following construct can be used:

<code>
INSERT OR IGNORE INTO
</code>
This is like 'ON CONFLICT DO NOTHING' for PostgreSQL


==== Datetime functions ====

<code>
SELECT date(timestamp, 'unixepoch'), count(*) from qa group by date(timestamp, 'unixepoch');
</code>
convert unix epoch into a date, with time it is datetime(timestamp, 'unixepoch');