Database Microbenchmarks
Symas Corp., September 2012
This page follows on from Google's LevelDB benchmarks, published in July 2011 at
LevelDB. (A snapshot of that document is available here for reference.)
In addition to the databases tested there, we add the venerable BerkeleyDB as well as the OpenLDAP MDB database. For this test, we compare
LevelDB version 1.5 (git rev dd0d562b4d4fbd07db6a44f9e221f8d368fee8e4), SQLite3
(version 3.7.7.1), Kyoto Cabinet's (version 1.2.76) TreeDB (a B+Tree based key-value store), Berkeley DB 5.3.21, and OpenLDAP MDB (git rev 2e677bcb995b63d36461eea254f2134ebfe29da2). We would like to acknowledge the LevelDB project for the original benchmark code.
This page shows updated results for MDB write performance when using its
writable mmap option. The previous benchmark report is available here. Using the memory map
in read/write mode allows much faster writes and even lower memory overhead, but also opens
the possibility of undetected database corruption if bugs in the application cause overwrites
of arbitrary ranges of the map.
The writable mmap option was implemented after discussions at LinuxCon in August. The results
here supersede the results in the August LinuxCon presentation.
Benchmarks were all performed on a Dell Precision M4400 laptop with a quad-core Intel(R) Core(TM)2 Extreme CPU Q9300 @ 2.53GHz, with 6144 KB of total L3 cache and 8 GB of DDR2 RAM at 800 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single threaded: one to run the benchmark, and one for background compactions.)
The benchmarks were run on three different filesystems: tmpfs, reiserfs on an SSD, and ext2 on an HDD.
(See our previous report for performance tests on btrfs, ext2, ext3, ext4, jfs,
ntfs, reiserfs, xfs, and zfs.)
The SSD is a Crucial M4 512GB, purchased new in August 2012. The HDD is a Seagate ST9500420AS Momentus 7200.4
500GB notebook hard drive.
The system had Ubuntu 12.04 installed, with
kernel 3.5.0-030500. Tests were all run in single-user mode to prevent variations due to other system activity.
CPU performance scaling was disabled (scaling_governor = performance), to ensure a consistent CPU clock speed
for all tests. The numbers reported below are the median of three measurements. The databases are completely deleted
between each of the three measurements.
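As a small illustration of the reporting rule above, the following Python sketch takes three throughput measurements and returns their median. The function name `report` and the sample figures are hypothetical and are not taken from the raw data.

```python
from statistics import median

def report(runs_ops_per_sec):
    """Return the median of exactly three throughput measurements."""
    if len(runs_ops_per_sec) != 3:
        raise ValueError("expected exactly three measurements")
    return median(runs_ops_per_sec)

# Hypothetical figures for one test, in ops/sec (not from the raw data).
runs = [498_753, 501_002, 495_110]
reported = report(runs)
```

Using the median rather than the mean keeps a single unusually slow or fast run from shifting the reported figure.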
Benchmark Source Code
We wrote benchmark tools for SQLite, BerkeleyDB, MDB, and Kyoto TreeDB based on LevelDB's db_bench.
The LevelDB, SQLite3, and TreeDB benchmark programs were originally provided in the LevelDB source distribution
but we've made additional fixes to the versions used here.
The code for each of the benchmarks resides here:
Google's original benchmark code had a number of flaws, as we noted in
this post
to the LevelDB Discussion Group. We've fixed these flaws for these tests:
- The Random write tests now use a shuffled list of all the keys, so no duplicate keys are used.
Thus the resulting databases should have the same content as from the Sequential tests.
- The synchronous tests on tmpfs are identical in length to the asynch tests, to ensure
that the results are comparable. (We still shortened them on SSD and HDD due to their lengthy runtimes.)
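The first fix can be sketched in a few lines of Python. This is only an illustration of the idea, not the benchmark code itself: the zero-padded 16-byte key format, the function names, and the seed are assumptions made for the example.

```python
import random

def sequential_keys(n):
    # 16-byte zero-padded decimal keys; the exact key format is an assumption.
    return [b"%016d" % i for i in range(n)]

def random_order_keys(n, seed=301):
    # Shuffle the full key list so every key appears exactly once.
    keys = sequential_keys(n)
    random.Random(seed).shuffle(keys)
    return keys

seq = sequential_keys(1000)
rnd = random_order_keys(1000)
```

Because the random sequence is a permutation of the sequential one, both tests insert the same set of keys and should leave databases with the same content.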
In addition, we found that the Kyoto TreeDB results were being unfairly penalized
relative to the LevelDB results, because all of its data items were undergoing an extra
alloc/copy before being sent to the database. The data items for LevelDB are now similarly
copied before use, to compensate for this issue. It's not clear how relevant this flaw
is in real world applications but it definitely skewed write speeds in LevelDB's favor
in these benchmarks.
Custom Build Specifications
Compression support was disabled in the libraries that support it. No special malloc library was used in the build. All
of the benchmark programs were linked to their respective static libraries, to show the actual size needed for a
minimal program using each library.
- LevelDB: Assertions were disabled.
- TreeDB: We enabled the TSMALL and TLINEAR options when opening the database in order to reduce the footprint of each record.
- SQLite: We tuned SQLite's performance by setting its locking mode to exclusive. We also enabled SQLite's write-ahead logging.
- BerkeleyDB: We configured with --with-mutex=POSIX/pthreads to avoid using the default hybrid mutex implementation.
- MDB: Assertions were disabled.
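For readers who want to try the SQLite settings listed above, here is a minimal sketch using Python's standard sqlite3 module. The benchmark itself uses the C API; the function name `open_tuned` and the file name are hypothetical, and only the two PRAGMA statements correspond to the tuning described.

```python
import os
import sqlite3
import tempfile

def open_tuned(path):
    # isolation_level=None puts the connection in autocommit mode,
    # so the PRAGMAs run outside any open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    locking = conn.execute("PRAGMA locking_mode = EXCLUSIVE").fetchone()[0]
    journal = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    return conn, locking, journal

with tempfile.TemporaryDirectory() as tmpdir:
    conn, locking, journal = open_tuned(os.path.join(tmpdir, "bench.db"))
    conn.close()
```

Each PRAGMA returns the mode that is actually in effect, so checking the returned values is a simple way to confirm that the settings were accepted.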
1. Relative Footprint
Most database vendors claim their product is fast and lightweight. Looking at the total size
of each application gives some insight into the footprint of each database implementation.
size db_bench*
   text    data     bss     dec     hex filename
 272247    1456     328  274031   42e6f db_bench
1675911    2288     304 1678503  199ca7 db_bench_bdb
  90423    1508     304   92235   1684b db_bench_mdb
 653480    7768    1688  662936   a1d98 db_bench_sqlite3
 296572    4808    1096  302476   49d8c db_bench_tree_db
The core of the MDB code is barely 32K of x86-64 object code. It fits entirely within most modern CPUs' on-chip caches.
All of the other libraries are several times larger.
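To put numbers on "several times larger", the sketch below takes the text, data, and bss columns from the size output above and expresses each program's total as a multiple of the MDB program's total. The variable names are illustrative only.

```python
# (text, data, bss) in bytes, copied from the size output above.
SIZES = {
    "db_bench":         (272247, 1456, 328),
    "db_bench_bdb":     (1675911, 2288, 304),
    "db_bench_mdb":     (90423, 1508, 304),
    "db_bench_sqlite3": (653480, 7768, 1688),
    "db_bench_tree_db": (296572, 4808, 1096),
}

totals = {name: sum(parts) for name, parts in SIZES.items()}
mdb_total = totals["db_bench_mdb"]

# Size of each benchmark program relative to the MDB one.
relative = {name: total / mdb_total for name, total in totals.items()}

for name, ratio in sorted(relative.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} {totals[name]:8d} bytes  {ratio:5.2f}x MDB")
```

Note that these totals cover the whole statically linked benchmark program, not only the database library inside it.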
2. Baseline Performance
This section gives the baseline performance of all the
databases. Following sections show how performance changes as various
parameters are varied. For the baseline:
- All operations are running on tmpfs. This shows the pure CPU time each
database requires, independent of I/O speed.
- Each database is allowed 4 MB of cache memory. (MDB has no cache, so this is irrelevant.)
- Databases are opened in asynchronous write mode.
(LevelDB's sync option, TreeDB's OAUTOSYNC option,
SQLite3's synchronous options are all turned off, MDB uses the
MDB_NOSYNC option, and BerkeleyDB uses the DB_TXN_WRITE_NOSYNC option). I.e.,
every write is pushed to the operating system, but the
benchmark does not wait for the write to reach the disk.
- Keys are 16 bytes each.
- Values are 100 bytes each.
- Sequential reads/writes traverse the key space in increasing order.
- Random reads/writes traverse the key space in random order. The entire key space is shuffled so that every key is visited once; there are no duplicates or collisions in the random sequence.
A. Sequential Reads
LevelDB |
4,587,156 ops/sec |
|
Kyoto TreeDB |
836,820 ops/sec |
|
SQLite3 |
313,283 ops/sec |
|
MDB |
14,705,882 ops/sec |
|
BerkeleyDB |
827,130 ops/sec |
|
B. Random Reads
LevelDB |
174,246 ops/sec |
|
Kyoto TreeDB |
103,252 ops/sec |
|
SQLite3 |
82,994 ops/sec |
|
MDB |
751,315 ops/sec |
|
BerkeleyDB |
101,958 ops/sec |
|
C. Sequential Writes
LevelDB |
498,753 ops/sec |
|
Kyoto TreeDB |
336,474 ops/sec |
|
SQLite3 |
56,693 ops/sec |
|
MDB |
461,255 ops/sec |
|
BerkeleyDB |
90,959 ops/sec |
|
D. Random Writes
LevelDB |
317,662 ops/sec |
|
Kyoto TreeDB |
96,984 ops/sec |
|
SQLite3 |
42,199 ops/sec |
|
MDB |
240,154 ops/sec |
|
BerkeleyDB |
45,928 ops/sec |
|
MDB has the fastest read operations by a huge margin, due to its single-level-store
architecture. MDB's write performance is good but not the fastest, due to the overhead of its
copy-on-write design. This copy overhead is most significant on small records like those
used in this test.
E. Batch Writes
A batch write is a set of writes that are applied atomically to the underlying database. A single batch of N writes may be significantly faster than N individual writes. The following benchmark writes one thousand batches where each batch contains one thousand 100-byte values. TreeDB does not support batch writes so its baseline numbers are
repeated here for reference.
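The atomicity of a batch can be illustrated with Python's sqlite3 module, where one transaction plays the role of one batch. This is a sketch of the concept rather than the benchmark code; the table layout, the key format, and the helper name `write_batch` are assumptions for the example.

```python
import sqlite3

def write_batch(conn, items):
    """Apply all (key, value) pairs in one transaction: all or nothing."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany("INSERT INTO kv (key, value) VALUES (?, ?)", items)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key BLOB PRIMARY KEY, value BLOB)")

# One batch of one thousand 100-byte values.
batch = [(b"%016d" % i, b"x" * 100) for i in range(1000)]
write_batch(conn, batch)
count = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]

# A batch that contains a duplicate key fails as a whole;
# its first (valid) row is rolled back along with the rest.
bad = [(b"%016d" % 1000, b"y" * 100), (b"%016d" % 0, b"y" * 100)]
try:
    write_batch(conn, bad)
except sqlite3.IntegrityError:
    pass
count_after = conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
```

Either every write in a batch becomes visible or none does, which is also why the per-commit overhead is paid once per batch instead of once per record.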
Sequential Writes
LevelDB |
677,048 entries/sec |
|
(1.36x baseline) |
Kyoto TreeDB |
336,474 entries/sec |
|
(baseline) |
SQLite3 |
109,302 entries/sec |
|
(1.93x baseline) |
MDB |
2,481,390 entries/sec |
|
(5.38x baseline) |
BerkeleyDB |
187,829 entries/sec |
|
(2.06x baseline) |
Random Writes
LevelDB |
432,152 entries/sec |
|
(1.36x baseline) |
Kyoto TreeDB |
96,984 entries/sec |
|
(baseline) |
SQLite3 |
58,432 entries/sec |
|
(1.38x baseline) |
MDB |
294,898 entries/sec |
|
(1.23x baseline) |
BerkeleyDB |
57,680 entries/sec |
|
(1.26x baseline) |
Batching allows some of MDB's copy-on-write overhead to be amortized across
multiple records. MDB's Append Mode for sequential writes is
most effective in batched operation.
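The "x baseline" figures in the batched sequential-write results are the batched throughput divided by the unbatched throughput from the baseline Sequential Writes test. The following sketch recomputes them from the numbers reported above; the dictionary and function names are illustrative.

```python
# Throughput in entries/sec for sequential writes, copied from the results above.
BASELINE = {"LevelDB": 498_753, "SQLite3": 56_693, "MDB": 461_255, "BerkeleyDB": 90_959}
BATCHED = {"LevelDB": 677_048, "SQLite3": 109_302, "MDB": 2_481_390, "BerkeleyDB": 187_829}

def speedups(batched, baseline):
    # Ratio of batched to unbatched throughput, rounded as in the report.
    return {name: round(batched[name] / baseline[name], 2) for name in batched}

ratios = speedups(BATCHED, BASELINE)
```

TreeDB is left out because it does not support batch writes, so its batched figure is simply its baseline figure.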
F. Synchronous Writes
In the following benchmark, we enable the synchronous writing modes
of all of the databases. Otherwise the test is identical to the Baseline.
- For LevelDB, we set WriteOptions.sync = true.
- In TreeDB, we enabled TreeDB's OAUTOSYNC option.
- For SQLite3, we set "PRAGMA synchronous = FULL".
- For MDB, we set no options since full sync is its default mode.
- For BerkeleyDB, we set no options since full sync is its default mode.
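The difference between the asynchronous and synchronous modes can be sketched at the system-call level. The Python example below is a generic illustration and not how any of the tested databases is implemented: in the asynchronous case each record is handed to the operating system with a write, and in the synchronous case the writer also waits for an fsync after every record. The function name and record layout are assumptions.

```python
import os
import tempfile

def write_records(path, records, sync):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        for rec in records:
            os.write(fd, rec)     # hand the record to the operating system
            if sync:
                os.fsync(fd)      # wait until the OS reports it is on stable storage
    finally:
        os.close(fd)
    return os.path.getsize(path)

# 100 records, each a 16-byte key followed by a 100-byte value.
records = [b"k" * 16 + b"v" * 100 for _ in range(100)]
with tempfile.TemporaryDirectory() as d:
    async_size = write_records(os.path.join(d, "async.dat"), records, sync=False)
    sync_size = write_records(os.path.join(d, "sync.dat"), records, sync=True)
```

Both modes leave the same bytes in the file. The only difference is how long the writer waits, which is why the cost of synchronous mode depends so heavily on the storage device.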
Sequential Writes
LevelDB |
444,247 ops/sec |
|
(0.89x baseline) |
Kyoto TreeDB |
5,730 ops/sec |
|
(0.017x baseline) |
SQLite3 |
51,308 ops/sec |
|
(0.91x baseline) |
MDB |
404,694 ops/sec |
|
(0.88x baseline) |
BerkeleyDB |
86,745 ops/sec |
|
(0.95x baseline) |
Random Writes
LevelDB |
304,692 ops/sec |
|
(0.96x baseline) |
Kyoto TreeDB |
5,698 ops/sec |
|
(0.058x baseline) |
SQLite3 |
38,713 ops/sec |
|
(0.92x baseline) |
MDB |
224,871 ops/sec |
|
(0.94x baseline) |
BerkeleyDB |
44,893 ops/sec |
|
(0.98x baseline) |
While most of the databases are only slowed down a little,
TreeDB performs extremely poorly in synchronous mode.
G. Disk Usage
The amount of disk space each database consumed is also a factor when
considering the footprint of each database. The results for each write test
are presented here, in KBytes used per test. The results are obtained using
du on the test directory at the completion of each test, after all data has
been flushed to disk. It includes space used by log files as well as actual
data files, if any.
Database | Plain Writes: Sequential | Plain Writes: Random | Batched: Sequential | Batched: Random | Synchronous: Sequential | Synchronous: Random |
LevelDB | 112692 | 114960 | 112348 | 115412 | 112692 | 114040 |
Kyoto TreeDB | 133104 | 229644 | - | - | 261980 | 300676 |
SQLite3 | 159128 | 158772 | 159232 | 160092 | 159136 | 158792 |
MDB | 126660 | 180716 | 126660 | 192560 | 126660 | 180840 |
BerkeleyDB | 570400 | 683100 | 527436 | 639728 | 570400 | 682636 |
LevelDB is the most efficient for disk usage, even without compression. We chose to
omit the built in compression from these tests because it is clearly not a core feature
of a database; any compression library could easily be used to handle compression for
any database using simple wrappers around their put and get APIs.
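As a sketch of the kind of wrapper meant here, the Python class below adds zlib compression around any store that exposes put and get. The `CompressedStore` and `DictBackend` classes are hypothetical stand-ins written for this example; they are not part of any of the tested libraries.

```python
import zlib

class DictBackend:
    """Stand-in key-value store; a real database binding would go here."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

class CompressedStore:
    """Compress values on put and decompress them on get."""
    def __init__(self, backend):
        self._backend = backend

    def put(self, key, value):
        self._backend.put(key, zlib.compress(value))

    def get(self, key):
        raw = self._backend.get(key)
        return None if raw is None else zlib.decompress(raw)

backend = DictBackend()
store = CompressedStore(backend)
store.put(b"key", b"x" * 100_000)
stored_len = len(backend.get(b"key"))   # size actually held by the backend
roundtrip = store.get(b"key")
```

The wrapper needs nothing from the database beyond its put and get calls, so the same few lines apply to any of the libraries tested here.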
MDB's sequential write packing gives the most deterministic file sizes; the DB size
is the same regardless of batched or unbatched or synchronous writing. All of the other
databases and modes yield different sizes due to varying overheads in each mode.
BerkeleyDB's transaction log files form the bulk of its disk usage. We configured it
with DB_LOG_AUTO_REMOVE to run these tests, otherwise the consumption would be even higher.
In regular deployments it may be unsafe to use this setting, since removal of the log files
may prevent database recovery from working after a system crash. BerkeleyDB's transaction
log management has been a continual inconvenience in real world deployments.
3. Performance Using More Memory
We increased the overall cache size for each database to 128 MB.
For SQLite3, we kept the page size at 1024 bytes, but increased the number of pages to 131,072 (up from 4096). For TreeDB, we also kept the page size at 1024 bytes, but increased the cache size to 128 MB (up from 4 MB).
For MDB there is no cache, so the numbers are simply a copy of the baseline. Both MDB and BerkeleyDB use
the default system page size (4096 bytes).
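The SQLite page counts above follow directly from the cache size divided by the page size. The sketch below checks the arithmetic and applies the resulting page count through Python's sqlite3 module; the helper name `pages_for` is an assumption, and the benchmark itself sets this through the C API.

```python
import sqlite3

PAGE_SIZE = 1024            # bytes per page, as used for SQLite3 in these tests
MIB = 1024 * 1024

def pages_for(cache_bytes, page_size=PAGE_SIZE):
    # Number of cache pages needed to hold cache_bytes.
    return cache_bytes // page_size

baseline_pages = pages_for(4 * MIB)     # the 4 MB baseline cache
large_pages = pages_for(128 * MIB)      # the 128 MB cache used in this section

conn = sqlite3.connect(":memory:")
conn.execute(f"PRAGMA cache_size = {large_pages}")
cache_pages = conn.execute("PRAGMA cache_size").fetchone()[0]
conn.close()
```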
A. Sequential Reads
LevelDB |
4,566,210 ops/sec |
|
(0.99x baseline) |
Kyoto TreeDB |
1,242,236 ops/sec |
|
(1.48x baseline) |
SQLite3 |
441,501 ops/sec |
|
(1.41x baseline) |
MDB |
14,705,882 ops/sec |
|
(baseline) |
BerkeleyDB |
860,585 ops/sec |
|
(1.04x baseline) |
B. Random Reads
LevelDB |
177,557 ops/sec |
|
(1.02x baseline) |
Kyoto TreeDB |
195,198 ops/sec |
|
(1.89x baseline) |
SQLite3 |
105,697 ops/sec |
|
(1.27x baseline) |
MDB |
751,315 ops/sec |
|
(baseline) |
BerkeleyDB |
169,348 ops/sec |
|
(1.66x baseline) |
C. Sequential Writes
LevelDB |
488,520 ops/sec |
|
(0.98x baseline) |
Kyoto TreeDB |
323,939 ops/sec |
|
(0.96x baseline) |
SQLite3 |
54,660 ops/sec |
|
(0.96x baseline) |
MDB |
461,255 ops/sec |
|
(baseline) |
BerkeleyDB |
91,525 ops/sec |
|
(1.006x baseline) |
D. Random Writes
LevelDB |
318,370 ops/sec |
|
(1.002x baseline) |
Kyoto TreeDB |
182,983 ops/sec |
|
(1.89x baseline) |
SQLite3 |
43,687 ops/sec |
|
(1.04x baseline) |
MDB |
240,154 ops/sec |
|
(baseline) |
BerkeleyDB |
67,838 ops/sec |
|
(1.48x baseline) |
E. Batch Writes
Sequential Writes
LevelDB |
683,995 entries/sec |
|
(1.40x non-batched) |
Kyoto TreeDB |
324,675 entries/sec |
|
(non-batched) |
SQLite3 |
105,119 entries/sec |
|
(1.92x non-batched) |
MDB |
2,481,390 entries/sec |
|
(5.38x non-batched) |
BerkeleyDB |
191,534 entries/sec |
|
(2.09x non-batched) |
Random Writes
LevelDB |
441,501 entries/sec |
|
(1.39x non-batched) |
Kyoto TreeDB |
182,983 entries/sec |
|
(non-batched) |
SQLite3 |
64,078 entries/sec |
|
(1.47x non-batched) |
MDB |
294,898 entries/sec |
|
(1.23x non-batched) |
BerkeleyDB |
104,613 entries/sec |
|
(1.54x non-batched) |
F. Synchronous Writes
Sequential Writes
LevelDB |
441,696 ops/sec |
|
(0.89x baseline) |
Kyoto TreeDB |
5,730 ops/sec |
|
(0.017x baseline) |
SQLite3 |
51,243 ops/sec |
|
(0.90x baseline) |
MDB |
404,694 ops/sec |
|
(0.88x baseline) |
BerkeleyDB |
86,957 ops/sec |
|
(0.96x baseline) |
Random Writes
LevelDB |
304,599 ops/sec |
|
(0.96x baseline) |
Kyoto TreeDB |
5,698 ops/sec |
|
(0.058x baseline) |
SQLite3 |
40,510 ops/sec |
|
(0.96x baseline) |
MDB |
224,871 ops/sec |
|
(0.94x baseline) |
BerkeleyDB |
65,789 ops/sec |
|
(1.43x baseline) |
Performance gains from the increased cache are minimal at best; some of the
tests are slower with the increased caches. Overall, application-level caching exacts
a great cost in terms of software and administrative complexity, and yields little
(if any) benefit in return.
4. Performance Using Large Values
For this benchmark, we use 100,000 byte values. To keep the benchmark running time reasonable, we stop after writing 10,000 values. Otherwise, all of the same tests as for the Baseline are run.
A. Sequential Reads
LevelDB |
299,133 ops/sec |
|
Kyoto TreeDB |
16,514 ops/sec |
|
SQLite3 |
7,402 ops/sec |
|
MDB |
30,303,030 ops/sec |
|
BerkeleyDB |
9,133 ops/sec |
|
B. Random Reads
LevelDB |
15,183 ops/sec |
|
Kyoto TreeDB |
14,518 ops/sec |
|
SQLite3 |
7,047 ops/sec |
|
MDB |
1,718,213 ops/sec |
|
BerkeleyDB |
8,646 ops/sec |
|
MDB's single-level-store architecture clearly outclasses all of the other designs; the others
barely even register on the results. MDB's zero-memcpy reads mean its read rate is
essentially independent of the size of the data items being fetched; it is only affected by the
total number of keys in the database.
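The effect of reading through a memory map without copying can be illustrated in Python. This is a generic sketch of the zero-copy idea, not MDB's implementation: the file layout, the offset index, and the function name are assumptions for the example.

```python
import mmap
import os
import tempfile

def write_values(path, values):
    """Write values back to back and return their (offset, length) pairs."""
    index = []
    with open(path, "wb") as f:
        for v in values:
            index.append((f.tell(), len(v)))
            f.write(v)
    return index

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "data.bin")
    index = write_values(path, [b"a" * 100, b"b" * 100_000])
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        view = memoryview(m)
        off, length = index[1]
        value = view[off:off + length]   # a window onto the map; no bytes are copied
        first, size = value[0], len(value)
        # Release the views before the map is closed.
        value.release()
        view.release()
```

Handing out the 100,000-byte value costs the same as handing out the 100-byte one, since only an offset and a length change hands. A design that copies each value into a caller's buffer pays in proportion to the value size instead.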
C. Sequential Writes
LevelDB |
3,366 ops/sec |
|
Kyoto TreeDB |
5,860 ops/sec |
|
SQLite3 |
2,029 ops/sec |
|
MDB |
12,905 ops/sec |
|
BerkeleyDB |
1,920 ops/sec |
|
D. Random Writes
LevelDB |
742 ops/sec |
|
Kyoto TreeDB |
5,709 ops/sec |
|
SQLite3 |
2,004 ops/sec |
|
MDB |
12,735 ops/sec |
|
BerkeleyDB |
1,902 ops/sec |
|
E. Batch Writes
Sequential Writes
LevelDB |
3,138 entries/sec |
|
Kyoto TreeDB |
5,860 entries/sec |
|
SQLite3 |
2,068 entries/sec |
|
MDB |
13,215 entries/sec |
|
BerkeleyDB |
1,952 entries/sec |
|
Random Writes
LevelDB |
3,079 entries/sec |
|
Kyoto TreeDB |
5,709 entries/sec |
|
SQLite3 |
2,041 entries/sec |
|
MDB |
13,099 entries/sec |
|
BerkeleyDB |
1,939 entries/sec |
|
Unlike the other databases, MDB handles large value writes with ease.
In Google's original test only 1,000 entries are used, but that is too few
to characterize each database. With 10,000 entries, we see that LevelDB's
performance drops off a cliff in the random write tests. This is one of the
most common problems with microbenchmarks - very often the data sets are too
small to yield any meaningful results.
F. Synchronous Writes
Sequential Writes
LevelDB |
3,368 ops/sec |
|
Kyoto TreeDB |
3,121 ops/sec |
|
SQLite3 |
2,026 ops/sec |
|
MDB |
12,916 ops/sec |
|
BerkeleyDB |
1,913 ops/sec |
|
Random Writes
LevelDB |
745 ops/sec |
|
Kyoto TreeDB |
2,162 ops/sec |
|
SQLite3 |
1,996 ops/sec |
|
MDB |
12,665 ops/sec |
|
BerkeleyDB |
1,893 ops/sec |
|
None of the other databases even begin to approach MDB's performance when using
large data values.
G. Disk Usage
Database | Plain Writes: Sequential | Plain Writes: Random | Batched: Sequential | Batched: Random | Synchronous: Sequential | Synchronous: Random |
LevelDB | 979580 | 993188 | 977336 | 1109824 | 979580 | 990068 |
Kyoto TreeDB | 978076 | 978024 | - | - | 978232 | 978172 |
SQLite3 | 985812 | 985828 | 1082016 | 1083112 | 985832 | 985836 |
MDB | 1000380 | 1000516 | 1000380 | 1001352 | 1000380 | 1000540 |
BerkeleyDB | 2028472 | 2028936 | 2028072 | 2028572 | 2028472 | 2028996 |
5. Performance On SSD
The same tests as in Section 2 are performed again, this time using the Crucial M4 SSD with reiserfs.
This drive has been in light use for only a few weeks, so it should still be near its
original performance levels.
A. Sequential Reads
LevelDB |
4,629,630 ops/sec |
|
Kyoto TreeDB |
834,028 ops/sec |
|
SQLite3 |
315,259 ops/sec |
|
MDB |
14,084,507 ops/sec |
|
BerkeleyDB |
843,882 ops/sec |
|
B. Random Reads
LevelDB |
143,947 ops/sec |
|
Kyoto TreeDB |
99,285 ops/sec |
|
SQLite3 |
83,668 ops/sec |
|
MDB |
723,589 ops/sec |
|
BerkeleyDB |
102,669 ops/sec |
|
Read performance is essentially the same as for tmpfs since all of the data is
present in the filesystem cache.
C. Sequential Writes
LevelDB |
477,555 ops/sec |
|
Kyoto TreeDB |
335,683 ops/sec |
|
SQLite3 |
51,419 ops/sec |
|
MDB |
494,315 ops/sec |
|
BerkeleyDB |
59,637 ops/sec |
|
D. Random Writes
LevelDB |
164,962 ops/sec |
|
Kyoto TreeDB |
93,266 ops/sec |
|
SQLite3 |
38,619 ops/sec |
|
MDB |
236,798 ops/sec |
|
BerkeleyDB |
16,783 ops/sec |
|
Most of the databases perform at close to their tmpfs speeds, which is expected
since these are asynchronous writes. However, BerkeleyDB shows a large reduction in
throughput. Also, surprisingly, MDB beats LevelDB's write performance here.
Even though the writes are fully cached, the underlying storage device
still has an impact on write throughput.
E. Batch Writes
Sequential Writes
LevelDB |
619,963 entries/sec |
|
(1.30x non-batched) |
Kyoto TreeDB |
335,683 entries/sec |
|
(non-batched) |
SQLite3 |
99,850 entries/sec |
|
(1.94x non-batched) |
MDB |
2,386,635 entries/sec |
|
(4.83x non-batched) |
BerkeleyDB |
111,297 entries/sec |
|
(1.87x non-batched) |
Random Writes
LevelDB |
228,154 entries/sec |
|
(1.38x non-batched) |
Kyoto TreeDB |
93,266 entries/sec |
|
(non-batched) |
SQLite3 |
53,700 entries/sec |
|
(1.39x non-batched) |
MDB |
286,205 entries/sec |
|
(1.21x non-batched) |
BerkeleyDB |
16,254 entries/sec |
|
(0.97x non-batched) |
F. Synchronous Writes
Here the difference between SSD and tmpfs is made obvious. Note that due
to the overall slowness of these operations, they were only performed 1000 times each.
Sequential Writes
LevelDB |
342 ops/sec |
|
(0.0007x asynch) |
Kyoto TreeDB |
69 ops/sec |
|
(0.0002x asynch) |
SQLite3 |
114 ops/sec |
|
(0.0022x asynch) |
MDB |
149 ops/sec |
|
(0.0003x asynch) |
MDB, no MetaSync |
328 ops/sec |
|
(0.0007x asynch) |
BerkeleyDB |
300 ops/sec |
|
(0.0050x asynch) |
Random Writes
LevelDB |
342 ops/sec |
|
(0.0021x asynch) |
Kyoto TreeDB |
67 ops/sec |
|
(0.0007x asynch) |
SQLite3 |
114 ops/sec |
|
(0.0029x asynch) |
MDB |
148 ops/sec |
|
(0.0006x asynch) |
MDB, no MetaSync |
322 ops/sec |
|
(0.0013x asynch) |
BerkeleyDB |
291 ops/sec |
|
(0.0173x asynch) |
The slowness of the SSD overshadows any difference between sequential and random
write performance here.
MDB's synchronous writes are slower because by default it does two separate syncs per commit -
one for the data, and one for the meta page update. Generally there is a large seek penalty for
the sync of the meta page, although that should not be an issue with an SSD.
Using MDB's NOMETASYNC option skips the explicit sync of the
meta page update. The meta page update for one transaction may then not be
sync'd along with that transaction, but it will definitely be
sync'd by the commit of the next transaction.
In this case, if the system crashes, the last committed transaction may be lost. Some apps
can tolerate this reduced level of reliability. Also, on filesystems which support
ordered writes, no transactions will be lost.
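The two-sync commit sequence described above can be modelled with a toy Python function. This is only a model of the ordering, written under the assumption that a commit is a data write followed by a meta page write; it is not MDB's code, and the function name, page contents, and file name are hypothetical.

```python
import os
import tempfile

def commit(fd, data_pages, meta_page, meta_sync=True):
    """Toy commit; returns the number of fsync calls it issued."""
    syncs = 0
    os.write(fd, data_pages)
    os.fsync(fd)              # first sync: the transaction's data pages
    syncs += 1
    os.write(fd, meta_page)
    if meta_sync:
        os.fsync(fd)          # second sync: the meta page update
        syncs += 1
    return syncs

with tempfile.TemporaryDirectory() as d:
    fd = os.open(os.path.join(d, "toy.db"), os.O_WRONLY | os.O_CREAT)
    try:
        full = commit(fd, b"D" * 4096, b"M" * 4096)
        relaxed = commit(fd, b"D" * 4096, b"M" * 4096, meta_sync=False)
        # The data sync of this next commit also flushes the previous,
        # not yet synced, meta page write.
        following = commit(fd, b"D" * 4096, b"M" * 4096, meta_sync=False)
    finally:
        os.close(fd)
```

In the relaxed case the most recent meta page write is only guaranteed to be durable once a later sync on the same file completes, which corresponds to the window in which the last committed transaction may be lost.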
6. Performance Using More Memory
We increased the overall cache size for each database to 128 MB, as in Section 3.
The "baseline" in these tests refers to the values from Section 5.
A. Sequential Reads
LevelDB |
4,672,897 ops/sec |
|
(1.01x baseline) |
Kyoto TreeDB |
1,246,883 ops/sec |
|
(1.50x baseline) |
SQLite3 |
438,982 ops/sec |
|
(1.39x baseline) |
MDB |
14,084,507 ops/sec |
|
(baseline) |
BerkeleyDB |
819,672 ops/sec |
|
(0.97x baseline) |
B. Random Reads
LevelDB |
142,857 ops/sec |
|
(0.99x baseline) |
Kyoto TreeDB |
193,386 ops/sec |
|
(1.95x baseline) |
SQLite3 |
104,690 ops/sec |
|
(1.25x baseline) |
MDB |
723,589 ops/sec |
|
(baseline) |
BerkeleyDB |
160,077 ops/sec |
|
(1.56x baseline) |
C. Sequential Writes
LevelDB |
483,793 ops/sec |
|
(0.92x baseline) |
Kyoto TreeDB |
323,625 ops/sec |
|
(0.96x baseline) |
SQLite3 |
49,925 ops/sec |
|
(0.97x baseline) |
MDB |
494,315 ops/sec |
|
(baseline) |
BerkeleyDB |
71,726 ops/sec |
|
(1.20x baseline) |
D. Random Writes
LevelDB |
134,608 ops/sec |
|
(0.82x baseline) |
Kyoto TreeDB |
180,636 ops/sec |
|
(1.94x baseline) |
SQLite3 |
40,050 ops/sec |
|
(1.04x baseline) |
MDB |
236,798 ops/sec |
|
(baseline) |
BerkeleyDB |
53,333 ops/sec |
|
(3.18x baseline) |
E. Batch Writes
Sequential Writes
LevelDB |
610,128 entries/sec |
|
(1.26x non-batched) |
Kyoto TreeDB |
323,625 entries/sec |
|
(non-batched) |
SQLite3 |
94,171 entries/sec |
|
(1.89x non-batched) |
MDB |
2,386,635 entries/sec |
|
(4.83x non-batched) |
BerkeleyDB |
108,108 entries/sec |
|
(1.51x non-batched) |
Random Writes
LevelDB |
228,624 entries/sec |
|
(1.70x non-batched) |
Kyoto TreeDB |
180,636 entries/sec |
|
(non-batched) |
SQLite3 |
56,145 entries/sec |
|
(1.40x non-batched) |
MDB |
286,205 entries/sec |
|
(1.21x non-batched) |
BerkeleyDB |
76,622 entries/sec |
|
(1.44x non-batched) |
F. Synchronous Writes
Sequential Writes
LevelDB |
348 ops/sec |
|
(0.0007x asynch) |
Kyoto TreeDB |
69 ops/sec |
|
(0.0002x asynch) |
SQLite3 |
117 ops/sec |
|
(0.0023x asynch) |
MDB |
149 ops/sec |
|
(0.0003x asynch) |
MDB, no MetaSync |
328 ops/sec |
|
(0.0007x asynch) |
BerkeleyDB |
299 ops/sec |
|
(0.0042x asynch) |
Random Writes
LevelDB |
349 ops/sec |
|
(0.0026x asynch) |
Kyoto TreeDB |
67 ops/sec |
|
(0.0004x asynch) |
SQLite3 |
118 ops/sec |
|
(0.0029x asynch) |
MDB |
148 ops/sec |
|
(0.0006x asynch) |
MDB, no MetaSync |
322 ops/sec |
|
(0.0013x asynch) |
BerkeleyDB |
297 ops/sec |
|
(0.0056x asynch) |
The impact of the increased cache is inconsistent at best.
7. Performance Using Large Values
This is the same as the test in Section 4, using the SSD.
A. Sequential Reads
LevelDB |
355,240 ops/sec |
|
Kyoto TreeDB |
16,384 ops/sec |
|
SQLite3 |
7,163 ops/sec |
|
MDB |
30,303,030 ops/sec |
|
BerkeleyDB |
9,144 ops/sec |
|
B. Random Reads
LevelDB |
19,245 ops/sec |
|
Kyoto TreeDB |
14,639 ops/sec |
|
SQLite3 |
6,943 ops/sec |
|
MDB |
1,697,793 ops/sec |
|
BerkeleyDB |
8,652 ops/sec |
|
The read results are about the same as for tmpfs.
C. Sequential Writes
LevelDB |
761 ops/sec |
|
Kyoto TreeDB |
3,449 ops/sec |
|
SQLite3 |
1,080 ops/sec |
|
MDB |
9,762 ops/sec |
|
BerkeleyDB |
450 ops/sec |
|
D. Random Writes
LevelDB |
126 ops/sec |
|
Kyoto TreeDB |
2,578 ops/sec |
|
SQLite3 |
948 ops/sec |
|
MDB |
9,687 ops/sec |
|
BerkeleyDB |
435 ops/sec |
|
E. Batch Writes
Sequential Writes
LevelDB |
1,575 entries/sec |
|
Kyoto TreeDB |
3,449 entries/sec |
|
SQLite3 |
1,107 entries/sec |
|
MDB |
9,996 entries/sec |
|
BerkeleyDB |
454 entries/sec |
|
Random Writes
LevelDB |
1,574 entries/sec |
|
Kyoto TreeDB |
2,578 entries/sec |
|
SQLite3 |
970 entries/sec |
|
MDB |
9,922 entries/sec |
|
BerkeleyDB |
441 entries/sec |
|
MDB dominates the asynchronous write tests.
F. Synchronous Writes
Sequential Writes
LevelDB |
110 ops/sec |
|
Kyoto TreeDB |
34 ops/sec |
|
SQLite3 |
83 ops/sec |
|
MDB |
91 ops/sec |
|
MDB, no MetaSync |
93 ops/sec |
|
BerkeleyDB |
100 ops/sec |
|
Random Writes
LevelDB |
92 ops/sec |
|
Kyoto TreeDB |
30 ops/sec |
|
SQLite3 |
80 ops/sec |
|
MDB |
66 ops/sec |
|
MDB, no MetaSync |
97 ops/sec |
|
BerkeleyDB |
107 ops/sec |
|
The benefit of MDB's NOMETASYNC option is marginal here; most of the time
required for these operations is simply in writing the 100,000 byte data values.
8. Performance On HDD
The same tests as in Section 2 are performed again,
this time using the Seagate ST9500420AS HDD with EXT2 fs. The EXT2 filesystem
was chosen based on our
previous survey of multiple filesystems.
A. Sequential Reads
LevelDB |
4,385,965 ops/sec |
|
Kyoto TreeDB |
823,723 ops/sec |
|
SQLite3 |
306,937 ops/sec |
|
MDB |
14,084,507 ops/sec |
|
BerkeleyDB |
825,083 ops/sec |
|
B. Random Reads
LevelDB |
135,925 ops/sec |
|
Kyoto TreeDB |
101,041 ops/sec |
|
SQLite3 |
81,579 ops/sec |
|
MDB |
746,826 ops/sec |
|
BerkeleyDB |
102,323 ops/sec |
|
Read performance is essentially the same as the previous tests since all of the data is
present in the filesystem cache.
C. Sequential Writes
LevelDB |
497,265 ops/sec |
|
Kyoto TreeDB |
337,952 ops/sec |
|
SQLite3 |
49,383 ops/sec |
|
MDB |
488,281 ops/sec |
|
BerkeleyDB |
68,507 ops/sec |
|
D. Random Writes
LevelDB |
172,058 ops/sec |
|
Kyoto TreeDB |
90,827 ops/sec |
|
SQLite3 |
19,623 ops/sec |
|
MDB |
233,155 ops/sec |
|
BerkeleyDB |
27,625 ops/sec |
|
E. Batch Writes
Sequential Writes
LevelDB |
517,063 entries/sec |
|
(1.04x non-batched) |
Kyoto TreeDB |
337,952 entries/sec |
|
(non-batched) |
SQLite3 |
97,267 entries/sec |
|
(1.97x non-batched) |
MDB |
2,487,562 entries/sec |
|
(5.09x non-batched) |
BerkeleyDB |
118,329 entries/sec |
|
(1.73x non-batched) |
Random Writes
LevelDB |
226,244 entries/sec |
|
(1.31x non-batched) |
Kyoto TreeDB |
90,827 entries/sec |
|
(non-batched) |
SQLite3 |
25,825 entries/sec |
|
(1.32x non-batched) |
MDB |
286,451 entries/sec |
|
(1.23x non-batched) |
BerkeleyDB |
26,190 entries/sec |
|
(0.95x non-batched) |
F. Synchronous Writes
As before, only 1000 operations are performed, due to the slowness of these tests.
The HDD's cache was not disabled for these tests.
Sequential Writes
LevelDB |
1,260 ops/sec |
|
(0.0025x asynch) |
Kyoto TreeDB |
30 ops/sec |
|
(0.0001x asynch) |
SQLite3 |
114 ops/sec |
|
(0.0023x asynch) |
MDB |
364 ops/sec |
|
(0.0007x asynch) |
BerkeleyDB |
733 ops/sec |
|
(0.0107x asynch) |
Random Writes
LevelDB |
1,291 ops/sec |
|
(0.0075x asynch) |
Kyoto TreeDB |
28 ops/sec |
|
(0.0003x asynch) |
SQLite3 |
112 ops/sec |
|
(0.0057x asynch) |
MDB |
297 ops/sec |
|
(0.0013x asynch) |
BerkeleyDB |
704 ops/sec |
|
(0.0255x asynch) |
The slowness of the HDD overshadows any difference between sequential and random
write performance here. LevelDB seems to benefit more from the HDD's caching than
any of the other databases.
MDB's NOMETASYNC option made no discernible difference here so those results
are omitted. Ultimately the seek
overhead cannot be avoided using a single storage device. Ideally the meta pages
should be isolated in a separate file that can be stored on a separate spindle to
avoid the seek overhead.
9. Performance Using More Memory
We increased the overall cache size for each database to 128 MB, as in Section 3.
The "baseline" in these tests refers to the values from Section 8.
A. Sequential Reads
LevelDB |
4,524,887 ops/sec |
|
(1.03x baseline) |
Kyoto TreeDB |
1,246,883 ops/sec |
|
(1.51x baseline) |
SQLite3 |
430,108 ops/sec |
|
(1.40x baseline) |
MDB |
14,084,507 ops/sec |
|
(baseline) |
BerkeleyDB |
572,082 ops/sec |
|
(0.69x baseline) |
B. Random Reads
LevelDB |
137,231 ops/sec |
|
(1.01x baseline) |
Kyoto TreeDB |
192,456 ops/sec |
|
(1.90x baseline) |
SQLite3 |
104,102 ops/sec |
|
(1.28x baseline) |
MDB |
746,826 ops/sec |
|
(baseline) |
BerkeleyDB |
134,626 ops/sec |
|
(1.32x baseline) |
C. Sequential Writes
LevelDB |
492,854 ops/sec |
|
(0.99x baseline) |
Kyoto TreeDB |
327,225 ops/sec |
|
(0.97x baseline) |
SQLite3 |
48,065 ops/sec |
|
(0.97x baseline) |
MDB |
488,281 ops/sec |
|
(baseline) |
BerkeleyDB |
58,258 ops/sec |
|
(0.85x baseline) |
D. Random Writes
LevelDB |
176,929 ops/sec |
|
(1.03x baseline) |
Kyoto TreeDB |
181,127 ops/sec |
|
(1.99x baseline) |
SQLite3 |
19,085 ops/sec |
|
(0.97x baseline) |
MDB |
233,155 ops/sec |
|
(baseline) |
BerkeleyDB |
46,990 ops/sec |
|
(1.70x baseline) |
E. Batch Writes
Sequential Writes
LevelDB |
553,097 entries/sec |
|
(1.12x non-batched) |
Kyoto TreeDB |
327,225 entries/sec |
|
(non-batched) |
SQLite3 |
94,697 entries/sec |
|
(1.97x non-batched) |
MDB |
2,487,562 entries/sec |
|
(5.09x non-batched) |
BerkeleyDB |
93,266 entries/sec |
|
(1.60x non-batched) |
Random Writes
LevelDB |
229,095 entries/sec |
|
(1.29x non-batched) |
Kyoto TreeDB |
181,127 entries/sec |
|
(non-batched) |
SQLite3 |
28,129 entries/sec |
|
(1.47x non-batched) |
MDB |
286,451 entries/sec |
|
(1.23x non-batched) |
BerkeleyDB |
64,708 entries/sec |
|
(1.38x non-batched) |
F. Synchronous Writes
Sequential Writes
LevelDB |
1,298 ops/sec |
|
(0.0026x asynch) |
Kyoto TreeDB |
29 ops/sec |
|
(0.0001x asynch) |
SQLite3 |
115 ops/sec |
|
(0.0024x asynch) |
MDB |
364 ops/sec |
|
(0.0007x asynch) |
BerkeleyDB |
733 ops/sec |
|
(0.0126x asynch) |
Random Writes
LevelDB |
1,323 ops/sec |
|
(0.0075x asynch) |
Kyoto TreeDB |
29 ops/sec |
|
(0.0002x asynch) |
SQLite3 |
113 ops/sec |
|
(0.0059x asynch) |
MDB |
297 ops/sec |
|
(0.0013x asynch) |
BerkeleyDB |
649 ops/sec |
|
(0.0138x asynch) |
10. Performance Using Large Values
This is the same as the test in Section 4, using the HDD.
A. Sequential Reads
LevelDB |
373,413 ops/sec |
|
Kyoto TreeDB |
15,910 ops/sec |
|
SQLite3 |
7,634 ops/sec |
|
MDB |
28,571,429 ops/sec |
|
BerkeleyDB |
9,443 ops/sec |
|
B. Random Reads
LevelDB |
20,831 ops/sec |
|
Kyoto TreeDB |
14,167 ops/sec |
|
SQLite3 |
7,210 ops/sec |
|
MDB |
1,700,680 ops/sec |
|
BerkeleyDB |
8,786 ops/sec |
|
Again, the read results are about the same as for tmpfs.
C. Sequential Writes
LevelDB |
651 ops/sec |
|
Kyoto TreeDB |
5,124 ops/sec |
|
SQLite3 |
1,525 ops/sec |
|
MDB |
11,865 ops/sec |
|
BerkeleyDB |
470 ops/sec |
|
D. Random Writes
LevelDB |
99 ops/sec |
|
Kyoto TreeDB |
5,000 ops/sec |
|
SQLite3 |
553 ops/sec |
|
MDB |
11,716 ops/sec |
|
BerkeleyDB |
519 ops/sec |
|
E. Batch Writes
Sequential Writes
LevelDB |
759 entries/sec |
|
Kyoto TreeDB |
5,124 entries/sec |
|
SQLite3 |
1,828 entries/sec |
|
MDB |
12,135 entries/sec |
|
BerkeleyDB |
482 entries/sec |
|
Random Writes
LevelDB |
754 entries/sec |
|
Kyoto TreeDB |
5,000 entries/sec |
|
SQLite3 |
1,820 entries/sec |
|
MDB |
12,037 entries/sec |
|
BerkeleyDB |
474 entries/sec |
|
F. Synchronous Writes
Sequential Writes
LevelDB |
119 ops/sec |
|
Kyoto TreeDB |
21 ops/sec |
|
SQLite3 |
107 ops/sec |
|
MDB |
119 ops/sec |
|
BerkeleyDB |
136 ops/sec |
|
Random Writes
LevelDB |
117 ops/sec |
|
Kyoto TreeDB |
18 ops/sec |
|
SQLite3 |
107 ops/sec |
|
MDB |
113 ops/sec |
|
BerkeleyDB |
136 ops/sec |
|
The slowness of the HDD makes most of the database implementations perform about
the same. As before, Kyoto Cabinet is much slower than the rest.
The raw data for all of these tests is also available for tmpfs, SSD, and HDD.
The results are also tabulated in an OpenOffice spreadsheet for further analysis,
available here.
The raw data includes additional tests (e.g. reverse sequential read) which were
omitted from this already lengthy report for space reasons.