Registering custom VFS for SQLite3 Link to heading
SQLite3 uses virtual file systems in order to be portable across operating systems. There are multiple built-in VFSes, supporting OS operations on Unix-like systems, Windows systems etc.
It is possible to register a custom VFS, thus allowing us to run custom code at the time the SQLite driver does file operations (not unlike custom streams in GNU/Linux). In this blog post series, I’ll explore how this feature of SQLite can be used in different ways.
Registering a SQLite3 VFS in Python using apsw
Link to heading
In this first blog post I’ll show how we can register the most basic, read-only VFS using the Python
package apsw
. To give it some kind of purpose, we’ll add functionality
to count the number of reads to the underlying file system, along with the number of bytes read.
First we’ll need to subclass apsw.VFS
:
|
|
Next comes the definition of CountingVFSFile
(we can optionally inherit from apsw.VFSFile
), where we need to
implement at least xRead
and xFileSize
:
|
|
In the constructor we simply open a file descriptor at the requested location1. To support
reading from the file, we implement xRead
. This function takes in the amount of bytes + the offset (from the
beginning of the file) requested by the SQLite driver, and must return that amount of bytes. Since we’re simply reading
from a local file, we can just read
from the
already opened file.
To implement the byte counting functionality, we just sum of the amount of bytes requested in the function call of
xRead
. Note that xWrite
is not implemented, making this VFS purely read-only.
With the above implementations, we can go ahead and register the VFS with the SQLite driver, and use it when creating a new connection:
In a more complete example, we can use the counting VFS in a command line interface, outputting the number of reads/number of bytes read while performing a query:
|
|
I happen to have a database extract of the danish parliament data (extracted late ‘23), stored in the file ft.db
(~
200MB):
|
|
There it is. To extract these 10 rows, SQLite makes 914 reads to ft.db
, reading a total of 3.6M
bytes. This is quite
a lot of bytes to trawl through, but makes sense given a largely un-indexed database.
Conclusion Link to heading
While the example given here is not particularly useful in itself, having the option to interject code at the time of OS operations turns out to be quite useful.
In part 2 of this series, I show how we can use another custom VFS to allow querying data stored in compressed files.
A proper implementation should check the file exists, that we have sufficient permissions etc. ↩︎