35% Faster Than The Filesystem
SQLite reads and writes small blobs (for example, thumbnail images)
35% faster¹ than the same blobs
can be read from or written to individual files on disk using
fread() or fwrite().
Furthermore, a single SQLite database holding
10-kilobyte blobs uses about 20% less disk space than
storing the blobs in individual files.
The performance difference arises (we believe) because when
working from an SQLite database, the open() and close() system calls
are invoked only once, whereas
open() and close() are invoked once for each blob
when using blobs stored in individual files. It appears that the
overhead of calling open() and close() is greater than the overhead
of using the database. The size reduction arises from the fact that
individual files are padded out to the next multiple of the filesystem
block size, whereas the blobs are packed more tightly into an SQLite
database.
The measurements in this article were made during the week of 2017-06-05
using a version of SQLite in between 3.19.2 and 3.20.0. You may expect
future versions of SQLite to perform even better.
1.1. Caveats
¹The 35% figure above is approximate. Actual timings vary
depending on hardware, operating system, and the
details of the experiment, and due to random performance fluctuations
on real-world hardware. See the text below for more detail.
Try the experiments yourself. Report significant deviations on
the SQLite forum.
The 35% figure is based on running tests on every machine
that the author has easily at hand.
Some reviewers of this article report that SQLite has higher
latency than direct I/O on their systems. We do not yet understand
the difference. We also see indications that SQLite does not
perform as well as direct I/O when experiments are run using
a cold filesystem cache.
So let your take-away be this: read/write latency for
SQLite is competitive with read/write latency of individual files on
disk. Often SQLite is faster. Sometimes SQLite is almost
as fast. Either way, this article disproves the common
assumption that a relational database must be slower than direct
filesystem I/O.
Jim Gray
and others studied the read performance of BLOBs
versus file I/O for Microsoft SQL Server and found that reading BLOBs
out of the
database was faster for BLOBs smaller than a threshold somewhere between 250KiB and 1MiB.
(Paper).
In that study, the database still stores the filename of the content even
if the content is held in a separate file. So the database is consulted
for every BLOB, even if it is only to extract the filename. In this
article, the key for the BLOB is the filename, so no preliminary database
access is required. Because the database is never used at all when
reading content from individual files in this article, the threshold
at which direct file I/O becomes faster is smaller than it is in Gray’s
paper.
The Internal Versus External BLOBs article on this website is an
earlier investigation (circa 2011) that uses the same approach as the
Jim Gray paper — storing the blob filenames as entries in the
database — but for SQLite instead of SQL Server.
I/O performance is measured using the
kvtest.c program
from the SQLite source tree.
To compile this test program, first gather the kvtest.c source file
into a directory with the SQLite amalgamation source
files “sqlite3.c” and “sqlite3.h”. Then on unix, run a command like
the following:
gcc -Os -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c -o kvtest -ldl -lpthread
Or on Windows with MSVC:
cl -I. -DSQLITE_DIRECT_OVERFLOW_READ kvtest.c sqlite3.c
Instructions for compiling for Android
are shown below.
Use the resulting “kvtest” program to
generate a test database with 100,000 random uncompressible
blobs, each with a random
size between 8,000 and 12,000 bytes
using a command like this:
./kvtest init test1.db --count 100k --size 10k --variance 2k
If desired, you can verify the new database by running this command:
./kvtest stat test1.db
Next, make copies of all the blobs into individual files in a directory
using a command like this:
./kvtest export test1.db test1.dir
At this point, you can measure the amount of disk space used by
the test1.db database and the space used by the test1.dir directory
and all of its content. On a standard Ubuntu Linux desktop, the
database file will be 1,024,512,000 bytes in size and the test1.dir
directory will use 1,228,800,000 bytes of space (according to “du -k”),
about 20% more than the database.
The “test1.dir” directory created above puts all the blobs into a single
folder. It was conjectured that some operating systems would perform
poorly when a single directory contains 100,000 objects. To test this,
the kvtest program can also store the blobs in a hierarchy of folders with no
more than 100 files and/or subdirectories per folder. The alternative
on-disk representation of the blobs can be created using the --tree
command-line option to the “export” command, like this:
./kvtest export test1.db test1.tree --tree
The test1.dir directory will contain 100,000 files
with names like “000000”, “0000