Database Microbenchmarks
Symas Corp., July 2012
This page follows on from Google's LevelDB benchmarks published in July 2011 on the LevelDB site. (A snapshot of that document is available here for reference.)
In addition to the databases tested there, we add the venerable BerkeleyDB as well as the OpenLDAP MDB database. For this test, we compare LevelDB version 1.5 (git rev dd0d562b4d4fbd07db6a44f9e221f8d368fee8e4), SQLite3 (version 3.7.7.1), Kyoto Cabinet's TreeDB (version 1.2.76; a B+Tree based key-value store), BerkeleyDB 5.3.21, and OpenLDAP MDB (git rev a0993354a603a970889ad5c160c289ecca316f81). We would like to acknowledge the LevelDB project for the original benchmark code.
Benchmarks were all performed on a Dell Precision M4400 laptop with a quad-core Intel(R) Core(TM)2 Extreme CPU Q9300 @ 2.53GHz, with 6144 KB of cache and 8 GB of DDR2 RAM at 800 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single-threaded: one to run the benchmark, and one for background compactions.) The benchmarks were run on two different filesystems: tmpfs, and reiserfs on an SSD.
The SSD is a relatively old model, a Samsung PM800 Series 256GB. The system had Ubuntu 12.04 installed, with
kernel 3.2.0-26. Tests were all run in single-user mode to prevent variations due to other system activity.
CPU frequency scaling was disabled (scaling_governor = performance) to ensure a consistent CPU clock speed
for all tests. The numbers reported below are the median of three measurements. The databases are completely deleted
between each of the three measurements.
Update: Additional tests were run on a Western Digital WD20EARX 2TB SATA
hard drive. The HDD results start in Section 8. The results across
multiple filesystems are in Section 11.
Benchmark Source Code
We wrote benchmark tools for SQLite, BerkeleyDB, MDB, and Kyoto TreeDB based on LevelDB's db_bench.
The LevelDB, SQLite3, and TreeDB benchmark programs were originally provided in the LevelDB source distribution,
but we've made additional fixes to the versions used here.
The code for each of the benchmarks is available here.
Custom Build Specifications
Compression support was disabled in the libraries that support it. No special malloc library was used in the build. All
of the benchmark programs were linked to their respective static libraries, to show the actual size needed for a
minimal program using each library.
- LevelDB: Assertions were disabled.
- TreeDB: We enabled the TSMALL and TLINEAR options when opening the database in order to reduce the footprint of each record.
- SQLite: We tuned SQLite's performance by setting its locking mode to exclusive and enabling its write-ahead logging (see the sketch after this list).
- BerkeleyDB: We configured with --with-mutex=POSIX/pthreads to avoid using the default hybrid mutex implementation.
- MDB: Assertions were disabled.
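To make the SQLite tuning concrete, the two settings above can be applied with pragmas right after opening the database. The following is a minimal sketch, not the actual db_bench_sqlite3 code; the file name and error handling are illustrative:

    #include <sqlite3.h>
    #include <cstdio>

    int main() {
        sqlite3 *db;
        if (sqlite3_open("bench.db", &db) != SQLITE_OK) {
            std::fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }
        // Hold the file lock for the life of the connection instead of
        // re-acquiring it on every transaction.
        sqlite3_exec(db, "PRAGMA locking_mode = EXCLUSIVE;", nullptr, nullptr, nullptr);
        // Use write-ahead logging instead of the default rollback journal.
        sqlite3_exec(db, "PRAGMA journal_mode = WAL;", nullptr, nullptr, nullptr);
        sqlite3_close(db);
        return 0;
    }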
1. Relative Footprint
Most database vendors claim their product is fast and lightweight. Looking at the total size
of each application gives some insight into the footprint of each database implementation.
size db_bench*
   text  data   bss      dec     hex  filename
 271991  1456   320   273767   42d67  db_bench
1682579  2288   296  1685163  19b6ab  db_bench_bdb
  96879  1500   296    98675   18173  db_bench_mdb
 655988  7768  1688   665444   a2764  db_bench_sqlite3
 296244  4808  1080   302132   49c34  db_bench_tree_db
The core of the MDB code is barely 32K of x86-64 object code. It fits entirely within most modern CPUs' on-chip caches.
All of the other libraries are several times larger.
2. Baseline Performance
This section gives the baseline performance of all the
databases. Following sections show how performance changes as various
parameters are varied. For the baseline:
- All operations are running on tmpfs. This shows the pure CPU time each
database requires, independent of I/O speed.
- Each database is allowed 4 MB of cache memory. (MDB has no application-level cache of its own, so this setting does not apply to it.)
- Databases are opened in asynchronous write mode
(LevelDB's sync option, TreeDB's OAUTOSYNC option, and
SQLite3's synchronous option are all turned off; MDB uses the
MDB_NOSYNC option, and BerkeleyDB uses the DB_TXN_WRITE_NOSYNC option;
see the sketch after this list). I.e.,
every write is pushed to the operating system, but the
benchmark does not wait for the write to reach the disk.
- Keys are 16 bytes each.
- Values are 100 bytes each.
- Sequential reads/writes traverse the key space in increasing order.
- Random reads/writes traverse the key space in random order.
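As an example of the asynchronous setup, opening the MDB environment with MDB_NOSYNC looks roughly like this (a sketch, not the exact db_bench_mdb code; the map size and file mode are illustrative, and the header is named lmdb.h in current releases):

    #include <lmdb.h>

    // Open an MDB environment in asynchronous mode: writes are handed to
    // the OS, but the library never calls fsync on commit.
    int open_async_env(MDB_env **env, const char *path) {
        int rc = mdb_env_create(env);
        if (rc) return rc;
        // MDB maps the whole database up front; pick a generous map size.
        mdb_env_set_mapsize(*env, 1UL << 30);   // 1 GB, illustrative
        return mdb_env_open(*env, path, MDB_NOSYNC, 0664);
    }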
A. Sequential Reads
LevelDB        4,566,210 ops/sec
Kyoto TreeDB     851,788 ops/sec
SQLite3          265,816 ops/sec
MDB           14,492,754 ops/sec
BerkeleyDB       834,029 ops/sec
B. Random Reads
LevelDB        186,289 ops/sec
Kyoto TreeDB   107,631 ops/sec
SQLite3         82,706 ops/sec
MDB            768,640 ops/sec
BerkeleyDB     101,647 ops/sec
C. Sequential Writes
LevelDB        562,114 ops/sec
Kyoto TreeDB   345,423 ops/sec
SQLite3         55,860 ops/sec
MDB            113,161 ops/sec
BerkeleyDB      90,531 ops/sec
D. Random Writes
LevelDB        363,504 ops/sec
Kyoto TreeDB   106,134 ops/sec
SQLite3         37,074 ops/sec
MDB             93,835 ops/sec
BerkeleyDB      47,950 ops/sec
LevelDB has the fastest write operations. MDB has the fastest read operations by a huge
margin, due to its single-level-store architecture. MDB was written for OpenLDAP; LDAP directory workloads tend to
be many reads/few writes, so read optimization is more critical for that workload than writes. LevelDB is oriented
towards many writes/few reads, so write optimization is emphasized there.
E. Batch Writes
A batch write is a set of writes that are applied atomically to the underlying database. A single batch of N writes may be significantly faster than N individual writes. The following benchmark writes one thousand batches, where each batch contains one thousand 100-byte values. TreeDB does not support batch writes, so its baseline numbers are
repeated here for reference.
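In LevelDB a batch is a WriteBatch object; in MDB, SQLite3, and BerkeleyDB the equivalent is grouping many puts into a single transaction. A sketch of the LevelDB form, assuming the baseline's 16-byte keys and 100-byte values (not the exact benchmark code):

    #include <leveldb/db.h>
    #include <leveldb/write_batch.h>
    #include <cstdio>
    #include <cstring>

    // Apply n writes atomically as one LevelDB batch.
    void batch_put(leveldb::DB *db, int n) {
        leveldb::WriteBatch batch;
        char val[100];
        std::memset(val, 'x', sizeof(val));
        for (int i = 0; i < n; ++i) {
            char key[17];
            std::snprintf(key, sizeof(key), "%016d", i);       // 16-byte key
            batch.Put(key, leveldb::Slice(val, sizeof(val)));  // buffered only
        }
        leveldb::WriteOptions wo;   // sync stays false: asynchronous mode
        db->Write(wo, &batch);      // all n entries commit together
    }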
Sequential Writes
LevelDB          745,712 entries/sec   (1.33x baseline)
Kyoto TreeDB     345,423 entries/sec   (baseline)
SQLite3          111,161 entries/sec   (1.99x baseline)
MDB            2,493,766 entries/sec   (22.0x baseline)
BerkeleyDB       182,216 entries/sec   (2.01x baseline)
Random Writes
LevelDB        469,263 entries/sec   (1.29x baseline)
Kyoto TreeDB   106,135 entries/sec   (baseline)
SQLite3         49,803 entries/sec   (1.34x baseline)
MDB            155,521 entries/sec   (1.66x baseline)
BerkeleyDB      61,248 entries/sec   (1.28x baseline)
Because of the way LevelDB persistent storage is organized, batches of
random writes are not much slower (only a factor of 1.6x) than batches
of sequential writes. MDB has a special optimization for sequential writes,
which is most effective in batched operation.
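The sequential-write optimization just mentioned is presumably exercised through MDB's MDB_APPEND flag to mdb_put, which lets MDB place each new key at the end of the B+tree without a page search, provided the keys arrive in sorted order. A sketch (again assuming baseline-sized keys and values):

    #include <lmdb.h>
    #include <cstdio>
    #include <cstring>

    // Write keys that are already in sorted order using MDB_APPEND.
    int append_writes(MDB_env *env, MDB_dbi dbi, int n) {
        MDB_txn *txn;
        int rc = mdb_txn_begin(env, nullptr, 0, &txn);
        if (rc) return rc;
        char kbuf[17], vbuf[100];
        std::memset(vbuf, 'x', sizeof(vbuf));
        for (int i = 0; i < n && rc == 0; ++i) {
            std::snprintf(kbuf, sizeof(kbuf), "%016d", i);  // increasing keys
            MDB_val key = {16, kbuf};
            MDB_val val = {sizeof(vbuf), vbuf};
            rc = mdb_put(txn, dbi, &key, &val, MDB_APPEND);
        }
        if (rc) { mdb_txn_abort(txn); return rc; }
        return mdb_txn_commit(txn);
    }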
F. Synchronous Writes
In the following benchmark, we enable the synchronous writing modes
of all of the databases. Since this change significantly slows down the
benchmark, we stop after 10,000 writes. Unfortunately the resulting
numbers are not directly comparable to the async numbers, since
overall database size is also a factor in write performance and
the resulting databases here are much smaller than the baseline.
- For LevelDB, we set WriteOptions.sync = true.
- In TreeDB, we enabled TreeDB's OAUTOSYNC option.
- For SQLite3, we set "PRAGMA synchronous = FULL".
- For MDB, we set no options since full sync is its default mode.
- For BerkeleyDB, we set no options since full sync is its default mode.
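For reference, the LevelDB and SQLite3 settings from that list look like this in code (a sketch; the open database handles are assumed):

    #include <leveldb/db.h>
    #include <sqlite3.h>

    void enable_sync_writes(leveldb::DB *db, sqlite3 *sdb) {
        // LevelDB: wait for the data to reach stable storage before
        // the write call returns.
        leveldb::WriteOptions wo;
        wo.sync = true;
        db->Put(wo, "some-key", "some-value");

        // SQLite3: fsync at every critical moment.
        sqlite3_exec(sdb, "PRAGMA synchronous = FULL;", nullptr, nullptr, nullptr);
    }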
Sequential Writes
LevelDB        372,024 ops/sec   (0.661x baseline)
Kyoto TreeDB     6,889 ops/sec   (0.0199x baseline)
SQLite3         51,970 ops/sec   (0.93x baseline)
MDB            157,332 ops/sec   (1.39x baseline)
BerkeleyDB      86,468 ops/sec   (0.95x baseline)
Random Writes
LevelDB        349,895 ops/sec   (0.96x baseline)
Kyoto TreeDB     7,080 ops/sec   (0.067x baseline)
SQLite3         45,851 ops/sec   (1.23x baseline)
MDB            147,776 ops/sec   (1.57x baseline)
BerkeleyDB      78,296 ops/sec   (1.63x baseline)
For both LevelDB and TreeDB, the cost of synchronous operation outweighs any benefit from the much smaller database. TreeDB in particular performs extremely
poorly in synchronous mode. For SQLite3, MDB, and BerkeleyDB on random writes,
the smaller database size completely negates the cost of the synchronous writes.
3. Performance Using More Memory
We increased the overall cache size for each database to 128 MB.
For SQLite3, we kept the page size at 1024 bytes, but increased the number of pages to 131,072 (up from 4096). For TreeDB, we also kept the page size at 1024 bytes, but increased the cache size to 128 MB (up from 4 MB).
For MDB there is no cache, so the numbers are simply a copy of the baseline. Both MDB and BerkeleyDB use
the default system page size (4096 bytes).
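For reference, the 128 MB settings for LevelDB and SQLite3 amount to roughly the following (a sketch; TreeDB and BerkeleyDB have analogous cache-tuning calls that are omitted here):

    #include <leveldb/db.h>
    #include <leveldb/cache.h>
    #include <sqlite3.h>

    void use_128mb_cache(leveldb::Options *opts, sqlite3 *sdb) {
        // LevelDB: replace the default (8 MB) block cache with a 128 MB one.
        // The caller owns the cache and must delete it after closing the DB.
        opts->block_cache = leveldb::NewLRUCache(128 * 1048576);
        // SQLite3: 131,072 pages of 1024 bytes each = 128 MB.
        sqlite3_exec(sdb, "PRAGMA cache_size = 131072;", nullptr, nullptr, nullptr);
    }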
A. Sequential Reads
LevelDB        4,504,505 ops/sec   (0.99x baseline)
Kyoto TreeDB   1,282,051 ops/sec   (1.50x baseline)
SQLite3          339,328 ops/sec   (1.27x baseline)
MDB           14,084,507 ops/sec   (baseline)
BerkeleyDB       879,507 ops/sec   (1.05x baseline)
B. Random Reads
LevelDB        187,196 ops/sec   (1.005x baseline)
Kyoto TreeDB   218,675 ops/sec   (2.03x baseline)
SQLite3        101,276 ops/sec   (1.22x baseline)
MDB            765,697 ops/sec   (baseline)
BerkeleyDB     173,641 ops/sec   (1.70x baseline)
C. Sequential Writes
LevelDB        564,653 ops/sec   (1.005x baseline)
Kyoto TreeDB   469,043 ops/sec   (1.36x baseline)
SQLite3         53,642 ops/sec   (0.96x baseline)
MDB             99,771 ops/sec   (baseline)
BerkeleyDB      91,819 ops/sec   (1.014x baseline)
D. Random Writes
LevelDB        362,450 ops/sec   (0.997x baseline)
Kyoto TreeDB   227,324 ops/sec   (2.14x baseline)
SQLite3         39,485 ops/sec   (1.07x baseline)
MDB             87,040 ops/sec   (baseline)
BerkeleyDB      72,643 ops/sec   (1.51x baseline)
E. Batch Writes
Sequential Writes
LevelDB          744,602 entries/sec   (1.32x non-batched)
Kyoto TreeDB     469,403 entries/sec   (non-batched)
SQLite3          105,619 entries/sec   (1.97x non-batched)
MDB            1,157,407 entries/sec   (11.6x non-batched)
BerkeleyDB       184,502 entries/sec   (2.01x non-batched)
Random Writes
LevelDB        478,240 entries/sec   (1.32x non-batched)
Kyoto TreeDB   227,324 entries/sec   (non-batched)
SQLite3         55,157 entries/sec   (1.40x non-batched)
MDB            140,766 entries/sec   (1.62x non-batched)
BerkeleyDB     115,969 entries/sec   (1.59x non-batched)
F. Synchronous Writes
Sequential Writes
LevelDB        371,195 ops/sec   (0.661x baseline)
Kyoto TreeDB     6,886 ops/sec   (0.0199x baseline)
SQLite3         51,945 ops/sec   (0.93x baseline)
MDB            125,188 ops/sec   (1.25x baseline)
BerkeleyDB      86,236 ops/sec   (0.95x baseline)
Random Writes
LevelDB        346,741 ops/sec   (0.96x baseline)
Kyoto TreeDB     7,070 ops/sec   (0.067x baseline)
SQLite3         45,880 ops/sec   (1.23x baseline)
MDB            119,918 ops/sec   (1.38x baseline)
BerkeleyDB      77,894 ops/sec   (1.63x baseline)
4. Performance Using Large Values
For this benchmark, we use 100,000-byte values. To keep the benchmark running time reasonable, we stop after writing 1,000 values. Otherwise, all of the same tests as for the baseline are run.
A. Sequential Reads
LevelDB          194,628 ops/sec
Kyoto TreeDB      18,536 ops/sec
SQLite3            7,476 ops/sec
MDB           33,333,333 ops/sec
BerkeleyDB         9,174 ops/sec
B. Random Reads
LevelDB           17,115 ops/sec
Kyoto TreeDB      17,207 ops/sec
SQLite3            7,690 ops/sec
MDB            2,012,072 ops/sec
BerkeleyDB         9,347 ops/sec
MDB's single-level-store architecture clearly outclasses all of the other designs; the others
barely register by comparison. MDB's zero-memcpy reads mean its read rate is
essentially independent of the size of the data items being fetched; it is affected only by the
total number of keys in the database.
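This is easy to see in MDB's API: mdb_get fills in a pointer directly into the memory-mapped file, so nothing is copied regardless of the value's size. A sketch:

    #include <lmdb.h>
    #include <cstdio>

    // Fetch one value; data.mv_data points straight into MDB's memory map,
    // so nothing is memcpy'd no matter how large the value is.
    int print_value(MDB_env *env, MDB_dbi dbi, MDB_val key) {
        MDB_txn *txn;
        int rc = mdb_txn_begin(env, nullptr, MDB_RDONLY, &txn);
        if (rc) return rc;
        MDB_val data;
        rc = mdb_get(txn, dbi, &key, &data);
        if (rc == 0)   // the pointer is valid only while the txn is open
            std::printf("%zu-byte value at %p\n", data.mv_size, data.mv_data);
        mdb_txn_abort(txn);   // the normal way to end a read-only txn
        return rc;
    }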
C. Sequential Writes
LevelDB         3,422 ops/sec
Kyoto TreeDB   12,415 ops/sec
SQLite3         1,936 ops/sec
MDB            11,758 ops/sec
BerkeleyDB      1,869 ops/sec
D. Random Writes
LevelDB         2,178 ops/sec
Kyoto TreeDB    5,612 ops/sec
SQLite3         1,820 ops/sec
MDB            10,278 ops/sec
BerkeleyDB      1,543 ops/sec
E. Batch Writes
Sequential Writes
LevelDB         2,327 entries/sec
Kyoto TreeDB   12,416 entries/sec
SQLite3         1,908 entries/sec
MDB             6,828 entries/sec
BerkeleyDB      1,901 entries/sec
Random Writes
LevelDB        2,332 entries/sec
Kyoto TreeDB   5,612 entries/sec
SQLite3        1,957 entries/sec
MDB            9,032 entries/sec
BerkeleyDB     1,563 entries/sec
TreeDB has very good performance with large values using asynchronous writes.
It has much worse performance in synchronous mode. Batch mode appears to have
no benefit with large values; the work of writing the values cancels out
the efficiency gained from batching. MDB has additional features for handling
large values, but the current benchmark code doesn't exercise them.
F. Synchronous Writes
Sequential Writes
LevelDB        1,090 ops/sec
Kyoto TreeDB   3,115 ops/sec
SQLite3        1,886 ops/sec
MDB            9,747 ops/sec
BerkeleyDB     2,167 ops/sec
Random Writes
LevelDB         1,064 ops/sec
Kyoto TreeDB    3,247 ops/sec
SQLite3         2,137 ops/sec
MDB            10,001 ops/sec
BerkeleyDB      1,882 ops/sec
5. Performance On SSD
The same tests as in Section 2 are performed again, this time using the Samsung SSD with reiserfs. This
drive has been in regular use over the past several years and was not reformatted for the tests. It has very
poor random write speed as a result.
A. Sequential Reads
LevelDB        4,366,812 ops/sec
Kyoto TreeDB     851,789 ops/sec
SQLite3          274,650 ops/sec
MDB           14,925,373 ops/sec
BerkeleyDB       804,505 ops/sec
B. Random Reads
LevelDB        154,321 ops/sec
Kyoto TreeDB   105,641 ops/sec
SQLite3         82,905 ops/sec
MDB            772,797 ops/sec
BerkeleyDB     103,875 ops/sec
Read performance is essentially the same as for tmpfs since all of the data is
present in the filesystem cache.
C. Sequential Writes
LevelDB        414,079 ops/sec
Kyoto TreeDB   342,700 ops/sec
SQLite3         51,464 ops/sec
MDB             93,231 ops/sec
BerkeleyDB      52,048 ops/sec
D. Random Writes
LevelDB        150,399 ops/sec
Kyoto TreeDB   103,928 ops/sec
SQLite3         32,186 ops/sec
MDB             77,851 ops/sec
BerkeleyDB      15,959 ops/sec
Most of the databases perform at close to their tmpfs speeds, which is expected
since these are asynchronous writes. However, BerkeleyDB shows a large reduction in
throughput.
E. Batch Writes
Sequential Writes
LevelDB        509,165 entries/sec   (1.23x non-batched)
Kyoto TreeDB   342,700 entries/sec   (non-batched)
SQLite3        101,010 entries/sec   (1.96x non-batched)
MDB            953,289 entries/sec   (10.2x non-batched)
BerkeleyDB      79,618 entries/sec   (1.52x non-batched)
Random Writes
LevelDB        202,799 entries/sec   (1.35x non-batched)
Kyoto TreeDB   103,928 entries/sec   (non-batched)
SQLite3         41,530 entries/sec   (1.29x non-batched)
MDB            119,976 entries/sec   (1.54x non-batched)
BerkeleyDB      15,261 entries/sec   (0.96x non-batched)
F. Synchronous Writes
Here the difference between SSD and tmpfs is made obvious.
Sequential Writes
LevelDB        461 ops/sec   (0.0011x asynch)
Kyoto TreeDB    60 ops/sec   (0.0001x asynch)
SQLite3        357 ops/sec   (0.0069x asynch)
MDB            198 ops/sec   (0.0021x asynch)
BerkeleyDB     417 ops/sec   (0.0080x asynch)
Random Writes
LevelDB        460 ops/sec   (0.0031x asynch)
Kyoto TreeDB    67 ops/sec   (0.0006x asynch)
SQLite3        361 ops/sec   (0.0112x asynch)
MDB            194 ops/sec   (0.0025x asynch)
BerkeleyDB     391 ops/sec   (0.0245x asynch)
The slowness of the SSD overshadows any difference between sequential and random
write performance here.
6. Performance Using More Memory
We increased the overall cache size for each database to 128 MB, as in Section 3.
The "baseline" in these tests refers to the values from Section 5.
A. Sequential Reads
LevelDB        4,566,210 ops/sec   (1.05x baseline)
Kyoto TreeDB   1,298,701 ops/sec   (1.52x baseline)
SQLite3          343,289 ops/sec   (1.25x baseline)
MDB           14,925,373 ops/sec   (baseline)
BerkeleyDB       887,311 ops/sec   (1.10x baseline)
B. Random Reads
LevelDB        154,655 ops/sec   (1.002x baseline)
Kyoto TreeDB   219,154 ops/sec   (2.07x baseline)
SQLite3         98,931 ops/sec   (1.19x baseline)
MDB            772,798 ops/sec   (baseline)
BerkeleyDB     171,644 ops/sec   (1.65x baseline)
C. Sequential Writes
LevelDB        417,885 ops/sec   (1.009x baseline)
Kyoto TreeDB   462,321 ops/sec   (1.35x baseline)
SQLite3         51,575 ops/sec   (1.002x baseline)
MDB             93,231 ops/sec   (baseline)
BerkeleyDB      59,481 ops/sec   (1.14x baseline)
D. Random Writes
LevelDB        150,466 ops/sec   (1.000x baseline)
Kyoto TreeDB   225,073 ops/sec   (2.17x baseline)
SQLite3         35,030 ops/sec   (1.09x baseline)
MDB             77,851 ops/sec   (baseline)
BerkeleyDB      53,502 ops/sec   (3.35x baseline)
E. Batch Writes
Sequential Writes
LevelDB        505,051 entries/sec   (1.21x non-batched)
Kyoto TreeDB   462,321 entries/sec   (non-batched)
SQLite3        100,634 entries/sec   (1.95x non-batched)
MDB            953,289 entries/sec   (10.2x non-batched)
BerkeleyDB      97,714 entries/sec   (1.64x non-batched)
Random Writes
LevelDB        205,212 entries/sec   (1.36x non-batched)
Kyoto TreeDB   225,073 entries/sec   (non-batched)
SQLite3         46,637 entries/sec   (1.33x non-batched)
MDB            119,976 entries/sec   (1.54x non-batched)
BerkeleyDB      77,119 entries/sec   (1.44x non-batched)
F. Synchronous Writes
Sequential Writes
LevelDB        467 ops/sec   (0.0011x asynch)
Kyoto TreeDB    61 ops/sec   (0.0001x asynch)
SQLite3        369 ops/sec   (0.0072x asynch)
MDB            199 ops/sec   (0.0021x asynch)
BerkeleyDB     406 ops/sec   (0.0068x asynch)
Random Writes
LevelDB        466 ops/sec   (0.0031x asynch)
Kyoto TreeDB    70 ops/sec   (0.0003x asynch)
SQLite3        366 ops/sec   (0.0104x asynch)
MDB            194 ops/sec   (0.0025x asynch)
BerkeleyDB     379 ops/sec   (0.0071x asynch)
7. Performance Using Large Values
This is the same as the test in Section 4, using the SSD.
A. Sequential Reads
LevelDB          149,992 ops/sec
Kyoto TreeDB      18,776 ops/sec
SQLite3            7,845 ops/sec
MDB           32,258,064 ops/sec
BerkeleyDB         9,414 ops/sec
B. Random Reads
LevelDB           21,607 ops/sec
Kyoto TreeDB      17,390 ops/sec
SQLite3            8,033 ops/sec
MDB            1,976,285 ops/sec
BerkeleyDB         5,653 ops/sec
The read results are about the same as for tmpfs.
C. Sequential Writes
LevelDB           712 ops/sec
Kyoto TreeDB   12,425 ops/sec
SQLite3         1,184 ops/sec
MDB             4,403 ops/sec
BerkeleyDB        190 ops/sec
D. Random Writes
LevelDB          405 ops/sec
Kyoto TreeDB   5,089 ops/sec
SQLite3        1,311 ops/sec
MDB            4,165 ops/sec
BerkeleyDB       247 ops/sec
E. Batch Writes
Sequential Writes
LevelDB         2,194 entries/sec
Kyoto TreeDB   12,425 entries/sec
SQLite3           694 entries/sec
MDB             3,391 entries/sec
BerkeleyDB        306 entries/sec
Random Writes
LevelDB        2,184 entries/sec
Kyoto TreeDB   5,089 entries/sec
SQLite3          790 entries/sec
MDB            4,901 entries/sec
BerkeleyDB       291 entries/sec
F. Synchronous Writes
Sequential Writes
LevelDB        106 ops/sec
Kyoto TreeDB    32 ops/sec
SQLite3         92 ops/sec
MDB             91 ops/sec
BerkeleyDB     126 ops/sec
Random Writes
LevelDB        106 ops/sec
Kyoto TreeDB    38 ops/sec
SQLite3        104 ops/sec
MDB             88 ops/sec
BerkeleyDB     114 ops/sec
As before, TreeDB's write performance is good on asynchronous writes. BerkeleyDB's performance
degrades the least in synchronous mode.
8. Performance On HDD
The same tests as in Section 2 are performed again,
this time using the Western Digital WD20EARX HDD with an ext3 filesystem. The drive
was attached to the laptop's eSATA port, so interface bottlenecks are not
an issue. The MDB library used here is a little newer
than in the previous tests, using revision
5da67968afb599697d7557c13b65fb961ec408dd,
which yields faster sequential write rates than before; those numbers
are therefore not directly comparable.
Note that this data does not represent the maximum performance
that the drive is capable of.
For completeness, the tests were repeated on multiple other filesystems
including EXT2, EXT3, EXT4, JFS, XFS, NTFS, ReiserFS, BTRFS, and ZFS. Those
results are summarized in Section 11.
This drive uses 4KB physical sectors. The drive was partitioned into two
1TB partitions, 4KB aligned. The first partition was formatted with NTFS;
the second partition was reused for each of the other filesystems.
A. Sequential Reads
LevelDB        4,504,504 ops/sec
Kyoto TreeDB     851,789 ops/sec
SQLite3          272,554 ops/sec
MDB           14,705,882 ops/sec
BerkeleyDB       805,152 ops/sec
B. Random Reads
LevelDB         99,010 ops/sec
Kyoto TreeDB   106,315 ops/sec
SQLite3         82,034 ops/sec
MDB            772,200 ops/sec
BerkeleyDB      98,795 ops/sec
Read performance is essentially the same as the previous tests since all of the data is
present in the filesystem cache. LevelDB and BerkeleyDB are slightly slower than before.
C. Sequential Writes
LevelDB        205,550 ops/sec
Kyoto TreeDB   344,828 ops/sec
SQLite3         46,164 ops/sec
MDB             78,021 ops/sec
BerkeleyDB      43,977 ops/sec
D. Random Writes
LevelDB         63,259 ops/sec
Kyoto TreeDB   101,194 ops/sec
SQLite3         28,581 ops/sec
MDB             61,335 ops/sec
BerkeleyDB       4,978 ops/sec
Kyoto Cabinet performs close to its tmpfs speed, while the other databases show
more of a reduction in throughput. BerkeleyDB slows down the most.
E. Batch Writes
Sequential Writes
LevelDB          213,904 entries/sec   (1.04x non-batched)
Kyoto TreeDB     344,828 entries/sec   (non-batched)
SQLite3           91,291 entries/sec   (1.98x non-batched)
MDB            1,602,564 entries/sec   (20.5x non-batched)
BerkeleyDB        56,085 entries/sec   (1.27x non-batched)
Random Writes
LevelDB         85,230 entries/sec   (1.35x non-batched)
Kyoto TreeDB   101,194 entries/sec   (non-batched)
SQLite3         35,791 entries/sec   (1.25x non-batched)
MDB            109,866 entries/sec   (1.79x non-batched)
BerkeleyDB       4,928 entries/sec   (0.99x non-batched)
F. Synchronous Writes
As slow as the SSD was, the HDD results are even slower.
Note, however, that further investigation showed these results are
nowhere near the maximum performance of the HDD. More details on
this are in Section 11.
Sequential Writes
LevelDB        68 ops/sec   (0.0003x asynch)
Kyoto TreeDB    5 ops/sec   (0.00001x asynch)
SQLite3        62 ops/sec   (0.0013x asynch)
MDB            35 ops/sec   (0.0004x asynch)
BerkeleyDB     60 ops/sec   (0.0014x asynch)
Random Writes
LevelDB        68 ops/sec   (0.0011x asynch)
Kyoto TreeDB    5 ops/sec   (0.00005x asynch)
SQLite3        62 ops/sec   (0.0222x asynch)
MDB            43 ops/sec   (0.0007x asynch)
BerkeleyDB     60 ops/sec   (0.0121x asynch)
The slowness of the HDD overshadows any difference between sequential and random
write performance here. None of these systems are suitable for real-world use in
this configuration, but Kyoto Cabinet is by far the worst. If an application
demands full ACID transactions, Kyoto Cabinet should definitely be avoided.
9. Performance Using More Memory
We increased the overall cache size for each database to 128 MB, as in Section 3.
The "baseline" in these tests refers to the values from Section 8.
A. Sequential Reads
LevelDB        4,464,286 ops/sec   (0.99x baseline)
Kyoto TreeDB   1,236,094 ops/sec   (1.45x baseline)
SQLite3          341,880 ops/sec   (1.25x baseline)
MDB           14,705,882 ops/sec   (baseline)
BerkeleyDB       548,546 ops/sec   (0.68x baseline)
B. Random Reads
LevelDB        100,675 ops/sec   (1.017x baseline)
Kyoto TreeDB   219,491 ops/sec   (2.06x baseline)
SQLite3         98,830 ops/sec   (1.20x baseline)
MDB            772,201 ops/sec   (baseline)
BerkeleyDB     149,343 ops/sec   (1.51x baseline)
C. Sequential Writes
LevelDB        206,228 ops/sec   (1.003x baseline)
Kyoto TreeDB   320,616 ops/sec   (0.93x baseline)
SQLite3         43,925 ops/sec   (0.95x baseline)
MDB             78,021 ops/sec   (baseline)
BerkeleyDB      49,993 ops/sec   (1.14x baseline)
D. Random Writes
LevelDB         61,931 ops/sec   (0.98x baseline)
Kyoto TreeDB   222,816 ops/sec   (2.20x baseline)
SQLite3         29,996 ops/sec   (1.05x baseline)
MDB             61,335 ops/sec   (baseline)
BerkeleyDB      44,256 ops/sec   (8.89x baseline)
E. Batch Writes
Sequential Writes
LevelDB          206,271 entries/sec   (1.00x non-batched)
Kyoto TreeDB     320,616 entries/sec   (non-batched)
SQLite3           91,458 entries/sec   (1.98x non-batched)
MDB            1,602,564 entries/sec   (20.5x non-batched)
BerkeleyDB        76,476 entries/sec   (15.36x non-batched)
Random Writes
LevelDB         85,346 entries/sec   (1.35x non-batched)
Kyoto TreeDB   222,816 entries/sec   (non-batched)
SQLite3         41,658 entries/sec   (1.46x non-batched)
MDB            109,866 entries/sec   (1.79x non-batched)
BerkeleyDB      61,958 entries/sec   (12.44x non-batched)
F. Synchronous Writes
Sequential Writes
LevelDB        67 ops/sec   (0.0003x asynch)
Kyoto TreeDB    5 ops/sec   (0.00001x asynch)
SQLite3        61 ops/sec   (0.0013x asynch)
MDB            35 ops/sec   (0.0004x asynch)
BerkeleyDB     58 ops/sec   (0.0013x asynch)
Random Writes
LevelDB        67 ops/sec   (0.001x asynch)
Kyoto TreeDB    5 ops/sec   (0.00005x asynch)
SQLite3        61 ops/sec   (0.0021x asynch)
MDB            43 ops/sec   (0.0007x asynch)
BerkeleyDB     59 ops/sec   (0.012x asynch)
10. Performance Using Large Values
This is the same as the test in Section 4, using the HDD.
A. Sequential Reads
LevelDB          139,276 ops/sec
Kyoto TreeDB      18,612 ops/sec
SQLite3            7,672 ops/sec
MDB            9,345,794 ops/sec
BerkeleyDB         9,273 ops/sec
B. Random Reads
LevelDB           23,064 ops/sec
Kyoto TreeDB      17,337 ops/sec
SQLite3            7,870 ops/sec
MDB            1,436,782 ops/sec
BerkeleyDB         4,423 ops/sec
Again, the read results are about the same as for tmpfs.
C. Sequential Writes
LevelDB          279 ops/sec
Kyoto TreeDB   4,861 ops/sec
SQLite3        1,343 ops/sec
MDB            5,643 ops/sec
BerkeleyDB       191 ops/sec
D. Random Writes
LevelDB          149 ops/sec
Kyoto TreeDB   5,278 ops/sec
SQLite3        1,376 ops/sec
MDB            5,237 ops/sec
BerkeleyDB       152 ops/sec
E. Batch Writes
Sequential Writes
LevelDB        2,174 entries/sec
Kyoto TreeDB   4,861 entries/sec
SQLite3        1,007 entries/sec
MDB            4,069 entries/sec
BerkeleyDB       187 entries/sec
Random Writes
LevelDB        2,166 entries/sec
Kyoto TreeDB   5,279 entries/sec
SQLite3        1,108 entries/sec
MDB            5,734 entries/sec
BerkeleyDB       142 entries/sec
F. Synchronous Writes
Sequential Writes
LevelDB        20 ops/sec
Kyoto TreeDB    3 ops/sec
SQLite3        18 ops/sec
MDB            15 ops/sec
BerkeleyDB     20 ops/sec
Random Writes
LevelDB        17 ops/sec
Kyoto TreeDB    4 ops/sec
SQLite3        18 ops/sec
MDB            15 ops/sec
BerkeleyDB     18 ops/sec
The slowness of the HDD makes most of the database implementations perform about
the same. As before, Kyoto Cabinet is much slower than the rest.
11. Performance Using Different Filesystems
The baseline test was repeated on the same HDD, but using a different filesystem each
time. The filesystems tested are btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs,
and zfs. In addition, the journaling filesystems that support using an external
journal were retested with their journal stored on a tmpfs file. These were ext3,
ext4, jfs, reiserfs, and xfs. Testing in this second configuration shows how much
overhead the filesystem's journaling mechanism imposes, and how much performance
is lost by using the default internal journal configuration.
Note: storing the journal on tmpfs was just for the purposes of this test.
In a real deployment the journal would need to be stored on an actual storage device,
such as a separate disk; otherwise the filesystem would be lost after a reboot.
The filesystems are created fresh for each test. The tests are only run once each
due to the great length of time needed to collect all of the data. (It takes several
minutes just to run mkfs for some of these filesystems.) The full results are not presented
in HTML here; you will have to download the Spreadsheet to
view the results.
You can display the results for a specific benchmark operation across all the
filesystem types using the selector in cell B23 of the sheet. Likewise, you can
display the results for a specific filesystem across all the benchmark operations
using the selector in cell B1, but because the results are so dominated by
MDB read performance, this view isn't quite as informative.
To summarize: jfs with an external journal is the fastest for synchronous
writes. If your workload demands fully synchronous transactions, it is clearly the
best choice. Otherwise, the original ext2 filesystem is fastest for asynchronous
writes.
The raw data for all of these tests is also available: tmpfs, SSD, and HDD.
The results are also tabulated in an OpenOffice spreadsheet for further analysis,
available here. The raw filesystem test results are in out.hdd.tar.gz.