Database Microbenchmarks

Symas Corp., September 2012


This page follows on from Google's LevelDB benchmarks published in July 2011 on the LevelDB site. (A snapshot of that document is available here for reference.) In addition to the databases tested there, we add the venerable BerkeleyDB as well as the OpenLDAP MDB database. For this test, we compare LevelDB version 1.5 (git rev dd0d562b4d4fbd07db6a44f9e221f8d368fee8e4), SQLite3 (version 3.7.7.1), Kyoto Cabinet's TreeDB (version 1.2.76; a B+Tree-based key-value store), Berkeley DB 5.3.21, and OpenLDAP MDB (git rev 2e677bcb995b63d36461eea254f2134ebfe29da2). We would like to acknowledge the LevelDB project for the original benchmark code.

This page shows updated results for MDB write performance when using its writable mmap option. The previous benchmark report is available here. Using the memory map in read/write mode allows much faster writes and even lower memory overhead, but also opens the possibility of undetected database corruption if bugs in the application cause overwrites of arbitrary ranges of the map.
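The risk can be illustrated with Python's standard-library mmap module (a stand-in for MDB's memory map, not MDB's actual API): a read-only map traps a stray write, while a read/write map lets the same bug silently land in the backing file.

```python
import mmap
import os
import tempfile

# A small file standing in for a database file.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)

# Read-only map: a stray write raises an error instead of touching the file.
ro = mmap.mmap(fd, 4096, access=mmap.ACCESS_READ)
try:
    ro[0:5] = b"oops!"
except TypeError:
    print("read-only map rejected the stray write")
ro.close()

# Read/write map: the same stray write silently corrupts the file.
rw = mmap.mmap(fd, 4096, access=mmap.ACCESS_WRITE)
rw[0:5] = b"oops!"
rw.flush()
rw.close()

with open(path, "rb") as f:
    print(f.read(5))  # the file now begins with b'oops!'
os.close(fd)
os.unlink(path)
```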

The writable mmap option was implemented after discussions at LinuxCon in August. The results here supersede those in the August LinuxCon presentation.

Benchmarks were all performed on a Dell Precision M4400 laptop with a quad-core Intel(R) Core(TM)2 Extreme CPU Q9300 @ 2.53GHz, with 6144 KB of total L2 cache and 8 GB of DDR2 RAM at 800 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single threaded: one to run the benchmark, and one for background compactions.) The benchmarks were run on three different filesystems: tmpfs, reiserfs on an SSD, and ext2 on an HDD. (See our previous report for performance tests on btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, and zfs.) The SSD is a Crucial M4 512GB, purchased new in August 2012. The HDD is a Seagate ST9500420AS Momentus 7200.4 500GB notebook hard drive. The system had Ubuntu 12.04 installed, with kernel 3.5.0-030500. Tests were all run in single-user mode to prevent variations due to other system activity. CPU performance scaling was disabled (scaling_governor = performance) to ensure a consistent CPU clock speed for all tests. The numbers reported below are the median of three measurements. The databases are completely deleted between each of the three measurements.

Benchmark Source Code

We wrote benchmark tools for SQLite, BerkeleyDB, MDB, and Kyoto TreeDB based on LevelDB's db_bench. The LevelDB, SQLite3, and TreeDB benchmark programs were originally provided in the LevelDB source distribution but we've made additional fixes to the versions used here. The code for each of the benchmarks resides here:

Google's original benchmark code had a number of flaws, as we noted in this post to the LevelDB Discussion Group. We've fixed these flaws for these tests.

In addition, we found that the Kyoto TreeDB results were being unfairly penalized relative to the LevelDB results, because all of its data items were undergoing an extra alloc/copy before being sent to the database. The data items for LevelDB are now similarly copied before use, to compensate for this issue. It's not clear how relevant this flaw is in real world applications but it definitely skewed write speeds in LevelDB's favor in these benchmarks.

Custom Build Specifications

Compression support was disabled in the libraries that support it. No special malloc library was used in the build. All of the benchmark programs were linked to their respective static libraries, to show the actual size needed for a minimal program using each library.

1. Relative Footprint

Most database vendors claim their product is fast and lightweight. Looking at the total size of each application gives some insight into the footprint of each database implementation.

size db_bench*
   text	   data	    bss	    dec	    hex	filename
 272247	   1456	    328	 274031	  42e6f	db_bench
1675911	   2288	    304	1678503	 199ca7	db_bench_bdb
  90423	   1508	    304	  92235	  1684b	db_bench_mdb
 653480	   7768	   1688	 662936	  a1d98	db_bench_sqlite3
 296572	   4808	   1096	 302476	  49d8c	db_bench_tree_db
The core of the MDB code is barely 32K of x86-64 object code. It fits entirely within most modern CPUs' on-chip caches. All of the other libraries are several times larger.
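As a sanity check on the table above, the size(1) columns are related arithmetically: dec is the sum of text, data, and bss, and hex is dec rendered in base 16. A quick script confirms this and computes the relative footprints:

```python
# (text, data, bss, dec, hex) as reported by size(1) for each benchmark binary.
rows = {
    "db_bench":         (272247, 1456, 328, 274031, "42e6f"),
    "db_bench_bdb":     (1675911, 2288, 304, 1678503, "199ca7"),
    "db_bench_mdb":     (90423, 1508, 304, 92235, "1684b"),
    "db_bench_sqlite3": (653480, 7768, 1688, 662936, "a1d98"),
    "db_bench_tree_db": (296572, 4808, 1096, 302476, "49d8c"),
}
for name, (text, data, bss, dec, hx) in rows.items():
    assert text + data + bss == dec          # dec is the total
    assert format(dec, "x") == hx            # hex is dec in base 16

# The MDB benchmark binary is roughly 18x smaller than the BerkeleyDB one.
print(rows["db_bench_bdb"][3] / rows["db_bench_mdb"][3])
```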

2. Baseline Performance

This section gives the baseline performance of all the databases. Following sections show how performance changes as various parameters are varied. The baseline tests were run on tmpfs using each database's default settings.

A. Sequential Reads

LevelDB 4,587,156 ops/sec
Kyoto TreeDB 836,820 ops/sec
SQLite3 313,283 ops/sec
MDB 14,705,882 ops/sec
BerkeleyDB 827,130 ops/sec

B. Random Reads

LevelDB 174,246 ops/sec
Kyoto TreeDB 103,252 ops/sec
SQLite3 82,994 ops/sec
MDB 751,315 ops/sec
BerkeleyDB 101,958 ops/sec

C. Sequential Writes

LevelDB 498,753 ops/sec
Kyoto TreeDB 336,474 ops/sec
SQLite3 56,693 ops/sec
MDB 461,255 ops/sec
BerkeleyDB 90,959 ops/sec

D. Random Writes

LevelDB 317,662 ops/sec
Kyoto TreeDB 96,984 ops/sec
SQLite3 42,199 ops/sec
MDB 240,154 ops/sec
BerkeleyDB 45,928 ops/sec

MDB has the fastest read operations by a huge margin, due to its single-level-store architecture. MDB's write performance is good but not the fastest, due to the overhead of its copy-on-write design. This copy overhead is most significant on small records like those used in this test.
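To see why copy-on-write hurts most on small records, consider the write amplification: each commit rewrites every page on the root-to-leaf path, regardless of record size. A rough model (our illustration, not MDB's internals):

```python
PAGE_SIZE = 4096  # MDB uses the system page size

def write_amplification(tree_height, record_size):
    """Bytes physically written per record update, divided by the record size.
    In a copy-on-write tree, every page on the root-to-leaf path is copied."""
    return (tree_height * PAGE_SIZE) / record_size

# For a 3-level tree: tiny records pay a huge relative cost, large ones almost none.
for size in (16, 100, 100_000):
    print(f"{size:>7}-byte record: {write_amplification(3, size):,.1f}x amplification")
```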

E. Batch Writes

A batch write is a set of writes that are applied atomically to the underlying database. A single batch of N writes may be significantly faster than N individual writes. The following benchmark writes one thousand batches where each batch contains one thousand 100-byte values. TreeDB does not support batch writes so its baseline numbers are repeated here for reference.
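The effect is easy to reproduce with Python's built-in sqlite3 module (a sketch of the pattern, not the db_bench code): one transaction per write versus one transaction per 1,000-entry batch.

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")
value = b"x" * 100  # 100-byte values, as in the benchmark

# Non-batched baseline: one transaction (and one commit) per write.
t0 = time.perf_counter()
for i in range(1000):
    with db:  # each `with` block commits on exit
        db.execute("INSERT INTO kv VALUES (?, ?)", (i, value))
t_single = time.perf_counter() - t0

# Batched: 1,000 writes applied atomically in a single transaction.
t0 = time.perf_counter()
with db:
    db.executemany("INSERT INTO kv VALUES (?, ?)",
                   ((i, value) for i in range(1000, 2000)))
t_batch = time.perf_counter() - t0

print(f"per-write commits: {t_single:.4f}s  one batch: {t_batch:.4f}s")
```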

Sequential Writes

LevelDB 677,048 entries/sec (1.36x baseline)
Kyoto TreeDB 336,474 entries/sec (baseline)
SQLite3 109,302 entries/sec (1.93x baseline)
MDB 2,481,390 entries/sec (5.38x baseline)
BerkeleyDB 187,829 entries/sec (2.06x baseline)

Random Writes

LevelDB 432,152 entries/sec (1.36x baseline)
Kyoto TreeDB 96,984 entries/sec (baseline)
SQLite3 58,432 entries/sec (1.38x baseline)
MDB 294,898 entries/sec (1.23x baseline)
BerkeleyDB 57,680 entries/sec (1.26x baseline)

Batching allows some of MDB's copy-on-write overhead to be amortized across multiple records. MDB's Append Mode for sequential writes is most effective in batched operation.
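The speedup figures above follow directly from the baseline numbers in Section 2; for example, for sequential writes:

```python
# Batched rate divided by the non-batched baseline rate (sequential writes).
baseline = {"LevelDB": 498_753, "SQLite3": 56_693, "MDB": 461_255, "BerkeleyDB": 90_959}
batched  = {"LevelDB": 677_048, "SQLite3": 109_302, "MDB": 2_481_390, "BerkeleyDB": 187_829}
for name, rate in batched.items():
    print(f"{name}: {rate / baseline[name]:.2f}x baseline")
```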

F. Synchronous Writes

In the following benchmark, we enable the synchronous writing modes of all of the databases. Otherwise the test is identical to the Baseline.

Sequential Writes

LevelDB 444,247 ops/sec (0.89x baseline)
Kyoto TreeDB 5,730 ops/sec (0.017x baseline)
SQLite3 51,308 ops/sec (0.91x baseline)
MDB 404,694 ops/sec (0.88x baseline)
BerkeleyDB 86,745 ops/sec (0.95x baseline)

Random Writes

LevelDB 304,692 ops/sec (0.96x baseline)
Kyoto TreeDB 5,698 ops/sec (0.058x baseline)
SQLite3 38,713 ops/sec (0.92x baseline)
MDB 224,871 ops/sec (0.94x baseline)
BerkeleyDB 44,893 ops/sec (0.98x baseline)

While most of the databases are only slowed down a little, TreeDB performs extremely poorly in synchronous mode.

G. Disk Usage

The amount of disk space each database consumed is also a factor when considering the footprint of each database. The results for each write test are presented here, in KBytes used per test. The results are obtained using du on the test directory at the completion of each test, after all data has been flushed to disk. It includes space used by log files as well as actual data files, if any.

              Plain Writes        Batched             Synchronous
              Sequential Random   Sequential Random   Sequential Random
LevelDB           112692 114960       112348 115412       112692 114040
Kyoto TreeDB      133104 229644            -      -       261980 300676
SQLite3           159128 158772       159232 160092       159136 158792
MDB               126660 180716       126660 192560       126660 180840
BerkeleyDB        570400 683100       527436 639728       570400 682636

(Kyoto TreeDB has no batched entries because it does not support batch writes.)

LevelDB is the most efficient for disk usage, even without compression. We chose to omit the built-in compression from these tests because it is not a core feature of a database; any compression library could easily be wrapped around any database's put and get APIs.

MDB's sequential write packing gives the most deterministic file sizes; the DB size is the same regardless of batched or unbatched or synchronous writing. All of the other databases and modes yield different sizes due to varying overheads in each mode.

BerkeleyDB's transaction log files form the bulk of its disk usage. We configured it with DB_LOG_AUTO_REMOVE to run these tests, otherwise the consumption would be even higher. In regular deployments it may be unsafe to use this setting, since removal of the log files may prevent database recovery from working after a system crash. BerkeleyDB's transaction log management has been a continual inconvenience in real world deployments.

3. Performance Using More Memory

We increased the overall cache size for each database to 128 MB. For SQLite3, we kept the page size at 1024 bytes, but increased the number of pages to 131,072 (up from 4096). For TreeDB, we also kept the page size at 1024 bytes, but increased the cache size to 128 MB (up from 4 MB). For MDB there is no cache, so the numbers are simply a copy of the baseline. Both MDB and BerkeleyDB use the default system page size (4096 bytes).

A. Sequential Reads

LevelDB 4,566,210 ops/sec (0.99x baseline)
Kyoto TreeDB 1,242,236 ops/sec (1.48x baseline)
SQLite3 441,501 ops/sec (1.41x baseline)
MDB 14,705,882 ops/sec (baseline)
BerkeleyDB 860,585 ops/sec (1.04x baseline)

B. Random Reads

LevelDB 177,557 ops/sec (1.02x baseline)
Kyoto TreeDB 195,198 ops/sec (1.89x baseline)
SQLite3 105,697 ops/sec (1.27x baseline)
MDB 751,315 ops/sec (baseline)
BerkeleyDB 169,348 ops/sec (1.66x baseline)

C. Sequential Writes

LevelDB 488,520 ops/sec (0.98x baseline)
Kyoto TreeDB 323,939 ops/sec (0.96x baseline)
SQLite3 54,660 ops/sec (0.96x baseline)
MDB 461,255 ops/sec (baseline)
BerkeleyDB 91,525 ops/sec (1.006x baseline)

D. Random Writes

LevelDB 318,370 ops/sec (1.002x baseline)
Kyoto TreeDB 182,983 ops/sec (1.89x baseline)
SQLite3 43,687 ops/sec (1.04x baseline)
MDB 240,154 ops/sec (baseline)
BerkeleyDB 67,838 ops/sec (1.48x baseline)

E. Batch Writes

Sequential Writes

LevelDB 683,995 entries/sec (1.40x non-batched)
Kyoto TreeDB 324,675 entries/sec (non-batched)
SQLite3 105,119 entries/sec (1.92x non-batched)
MDB 2,481,390 entries/sec (5.38x non-batched)
BerkeleyDB 191,534 entries/sec (2.09x non-batched)

Random Writes

LevelDB 441,501 entries/sec (1.39x non-batched)
Kyoto TreeDB 182,983 entries/sec (non-batched)
SQLite3 64,078 entries/sec (1.47x non-batched)
MDB 294,898 entries/sec (1.23x non-batched)
BerkeleyDB 104,613 entries/sec (1.54x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 441,696 ops/sec (0.89x baseline)
Kyoto TreeDB 5,730 ops/sec (0.017x baseline)
SQLite3 51,243 ops/sec (0.90x baseline)
MDB 404,694 ops/sec (0.88x baseline)
BerkeleyDB 86,957 ops/sec (0.96x baseline)

Random Writes

LevelDB 304,599 ops/sec (0.96x baseline)
Kyoto TreeDB 5,698 ops/sec (0.058x baseline)
SQLite3 40,510 ops/sec (0.96x baseline)
MDB 224,871 ops/sec (0.94x baseline)
BerkeleyDB 65,789 ops/sec (1.43x baseline)

Performance gains from the increased cache are minimal at best; some of the tests are slower with the increased caches. Overall, application-level caching exacts a great cost in terms of software and administrative complexity, and yields little (if any) benefit in return.

4. Performance Using Large Values

For this benchmark, we use 100,000 byte values. To keep the benchmark running time reasonable, we stop after writing 10,000 values. Otherwise, all of the same tests as for the Baseline are run.

A. Sequential Reads

LevelDB 299,133 ops/sec
Kyoto TreeDB 16,514 ops/sec
SQLite3 7,402 ops/sec
MDB 30,303,030 ops/sec
BerkeleyDB 9,133 ops/sec

B. Random Reads

LevelDB 15,183 ops/sec
Kyoto TreeDB 14,518 ops/sec
SQLite3 7,047 ops/sec
MDB 1,718,213 ops/sec
BerkeleyDB 8,646 ops/sec

MDB's single-level-store architecture clearly outclasses all of the other designs; the others barely even register on the results. MDB's zero-memcpy reads mean its read rate is essentially independent of the size of the data items being fetched; it is only affected by the total number of keys in the database.
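The zero-copy idea can be illustrated with Python's standard mmap (again an analogy for MDB's behavior, not its API): taking a view into the map costs the same no matter how large the value, while a conventional read must copy every byte.

```python
import mmap
import os
import tempfile

# A 100,000-byte "value" living in a memory-mapped file.
fd, path = tempfile.mkstemp()
os.write(fd, b"v" * 100_000)
m = mmap.mmap(fd, 100_000, access=mmap.ACCESS_READ)

view = memoryview(m)    # zero-copy: just a pointer into the map
copied = m[0:100_000]   # copying read: materializes all 100,000 bytes

assert bytes(view) == copied  # same contents, very different cost

view.release()  # must release the view before closing the map
m.close()
os.close(fd)
os.unlink(path)
```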

C. Sequential Writes

LevelDB 3,366 ops/sec
Kyoto TreeDB 5,860 ops/sec
SQLite3 2,029 ops/sec
MDB 12,905 ops/sec
BerkeleyDB 1,920 ops/sec

D. Random Writes

LevelDB 742 ops/sec
Kyoto TreeDB 5,709 ops/sec
SQLite3 2,004 ops/sec
MDB 12,735 ops/sec
BerkeleyDB 1,902 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 3,138 entries/sec
Kyoto TreeDB 5,860 entries/sec
SQLite3 2,068 entries/sec
MDB 13,215 entries/sec
BerkeleyDB 1,952 entries/sec

Random Writes

LevelDB 3,079 entries/sec
Kyoto TreeDB 5,709 entries/sec
SQLite3 2,041 entries/sec
MDB 13,099 entries/sec
BerkeleyDB 1,939 entries/sec

Unlike the other databases, MDB handles large value writes with ease.

In Google's original test only 1,000 entries were used, but that is too few to characterize each database. With 10,000 entries, we see that LevelDB's performance drops off a cliff in the random write tests. This is one of the most common problems with microbenchmarks: very often the data sets are too small to yield meaningful results.

F. Synchronous Writes

Sequential Writes

LevelDB 3,368 ops/sec
Kyoto TreeDB 3,121 ops/sec
SQLite3 2,026 ops/sec
MDB 12,916 ops/sec
BerkeleyDB 1,913 ops/sec

Random Writes

LevelDB 745 ops/sec
Kyoto TreeDB 2,162 ops/sec
SQLite3 1,996 ops/sec
MDB 12,665 ops/sec
BerkeleyDB 1,893 ops/sec

None of the other databases even begin to approach MDB's performance when using large data values.

G. Disk Usage

              Plain Writes          Batched               Synchronous
              Sequential Random     Sequential Random     Sequential Random
LevelDB           979580  993188        977336 1109824        979580  990068
Kyoto TreeDB      978076  978024             -       -        978232  978172
SQLite3           985812  985828       1082016 1083112        985832  985836
MDB              1000380 1000516       1000380 1001352       1000380 1000540
BerkeleyDB       2028472 2028936       2028072 2028572       2028472 2028996

(Kyoto TreeDB has no batched entries because it does not support batch writes.)

5. Performance On SSD

The same tests as in Section 2 are performed again, this time using the Crucial M4 SSD with reiserfs. This drive has been in light use for only a few weeks, so it should still be near its original performance levels.

A. Sequential Reads

LevelDB 4,629,630 ops/sec
Kyoto TreeDB 834,028 ops/sec
SQLite3 315,259 ops/sec
MDB 14,084,507 ops/sec
BerkeleyDB 843,882 ops/sec

B. Random Reads

LevelDB 143,947 ops/sec
Kyoto TreeDB 99,285 ops/sec
SQLite3 83,668 ops/sec
MDB 723,589 ops/sec
BerkeleyDB 102,669 ops/sec

Read performance is essentially the same as for tmpfs since all of the data is present in the filesystem cache.

C. Sequential Writes

LevelDB 477,555 ops/sec
Kyoto TreeDB 335,683 ops/sec
SQLite3 51,419 ops/sec
MDB 494,315 ops/sec
BerkeleyDB 59,637 ops/sec

D. Random Writes

LevelDB 164,962 ops/sec
Kyoto TreeDB 93,266 ops/sec
SQLite3 38,619 ops/sec
MDB 236,798 ops/sec
BerkeleyDB 16,783 ops/sec

Most of the databases perform at close to their tmpfs speeds, which is expected since these are asynchronous writes. However, BerkeleyDB shows a large reduction in throughput. Also, surprisingly, MDB beats LevelDB's write performance here. Even though the writes are fully cached, the underlying storage device still has an impact on write throughput.

E. Batch Writes

Sequential Writes

LevelDB 619,963 entries/sec (1.30x non-batched)
Kyoto TreeDB 335,683 entries/sec (non-batched)
SQLite3 99,850 entries/sec (1.94x non-batched)
MDB 2,386,635 entries/sec (4.83x non-batched)
BerkeleyDB 111,297 entries/sec (1.87x non-batched)

Random Writes

LevelDB 228,154 entries/sec (1.38x non-batched)
Kyoto TreeDB 93,266 entries/sec (non-batched)
SQLite3 53,700 entries/sec (1.39x non-batched)
MDB 286,205 entries/sec (1.21x non-batched)
BerkeleyDB 16,254 entries/sec (0.97x non-batched)

F. Synchronous Writes

Here the difference between SSD and tmpfs is made obvious. Note that due to the overall slowness of these operations, they were only performed 1000 times each.

Sequential Writes

LevelDB 342 ops/sec (0.0007x asynch)
Kyoto TreeDB 69 ops/sec (0.0002x asynch)
SQLite3 114 ops/sec (0.0022x asynch)
MDB 149 ops/sec (0.0003x asynch)
MDB, no MetaSync 328 ops/sec (0.0007x asynch)
BerkeleyDB 300 ops/sec (0.0050x asynch)

Random Writes

LevelDB 342 ops/sec (0.0021x asynch)
Kyoto TreeDB 67 ops/sec (0.0007x asynch)
SQLite3 114 ops/sec (0.0029x asynch)
MDB 148 ops/sec (0.0006x asynch)
MDB, no MetaSync 322 ops/sec (0.0013x asynch)
BerkeleyDB 291 ops/sec (0.0173x asynch)

The slowness of the SSD overshadows any difference between sequential and random write performance here.

MDB's synchronous writes are slower because by default it performs two separate syncs per commit: one for the data pages and one for the meta page update. There is generally a large seek penalty for the sync of the meta page, although that should not be an issue on an SSD. MDB's NOMETASYNC option skips the explicit sync of the meta page update. The meta page update for one transaction may then not be sync'd along with that transaction itself, but it will definitely be sync'd by the commit of the next transaction. If the system crashes in that window, the last committed transaction may be lost. Some applications can tolerate this reduced level of durability, and on filesystems which support ordered writes, no transactions will be lost.

6. Performance Using More Memory

We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 5.

A. Sequential Reads

LevelDB 4,672,897 ops/sec (1.01x baseline)
Kyoto TreeDB 1,246,883 ops/sec (1.50x baseline)
SQLite3 438,982 ops/sec (1.39x baseline)
MDB 14,084,507 ops/sec (baseline)
BerkeleyDB 819,672 ops/sec (0.97x baseline)

B. Random Reads

LevelDB 142,857 ops/sec (0.99x baseline)
Kyoto TreeDB 193,386 ops/sec (1.95x baseline)
SQLite3 104,690 ops/sec (1.25x baseline)
MDB 723,589 ops/sec (baseline)
BerkeleyDB 160,077 ops/sec (1.56x baseline)

C. Sequential Writes

LevelDB 483,793 ops/sec (0.92x baseline)
Kyoto TreeDB 323,625 ops/sec (0.96x baseline)
SQLite3 49,925 ops/sec (0.97x baseline)
MDB 494,315 ops/sec (baseline)
BerkeleyDB 71,726 ops/sec (1.20x baseline)

D. Random Writes

LevelDB 134,608 ops/sec (0.82x baseline)
Kyoto TreeDB 180,636 ops/sec (1.94x baseline)
SQLite3 40,050 ops/sec (1.04x baseline)
MDB 236,798 ops/sec (baseline)
BerkeleyDB 53,333 ops/sec (3.18x baseline)

E. Batch Writes

Sequential Writes

LevelDB 610,128 entries/sec (1.26x non-batched)
Kyoto TreeDB 323,625 entries/sec (non-batched)
SQLite3 94,171 entries/sec (1.89x non-batched)
MDB 2,386,635 entries/sec (4.83x non-batched)
BerkeleyDB 108,108 entries/sec (1.51x non-batched)

Random Writes

LevelDB 228,624 entries/sec (1.70x non-batched)
Kyoto TreeDB 180,636 entries/sec (non-batched)
SQLite3 56,145 entries/sec (1.40x non-batched)
MDB 286,205 entries/sec (1.21x non-batched)
BerkeleyDB 76,622 entries/sec (1.44x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 348 ops/sec (0.0007x asynch)
Kyoto TreeDB 69 ops/sec (0.0002x asynch)
SQLite3 117 ops/sec (0.0023x asynch)
MDB 149 ops/sec (0.0003x asynch)
MDB, no MetaSync 328 ops/sec (0.0007x asynch)
BerkeleyDB 299 ops/sec (0.0042x asynch)

Random Writes

LevelDB 349 ops/sec (0.0026x asynch)
Kyoto TreeDB 67 ops/sec (0.0004x asynch)
SQLite3 118 ops/sec (0.0029x asynch)
MDB 148 ops/sec (0.0006x asynch)
MDB, no MetaSync 322 ops/sec (0.0013x asynch)
BerkeleyDB 297 ops/sec (0.0056x asynch)

The impact of the increased cache is inconsistent at best.

7. Performance Using Large Values

This is the same as the test in Section 4, using the SSD.

A. Sequential Reads

LevelDB 355,240 ops/sec
Kyoto TreeDB 16,384 ops/sec
SQLite3 7,163 ops/sec
MDB 30,303,030 ops/sec
BerkeleyDB 9,144 ops/sec

B. Random Reads

LevelDB 19,245 ops/sec
Kyoto TreeDB 14,639 ops/sec
SQLite3 6,943 ops/sec
MDB 1,697,793 ops/sec
BerkeleyDB 8,652 ops/sec

The read results are about the same as for tmpfs.

C. Sequential Writes

LevelDB 761 ops/sec
Kyoto TreeDB 3,449 ops/sec
SQLite3 1,080 ops/sec
MDB 9,762 ops/sec
BerkeleyDB 450 ops/sec

D. Random Writes

LevelDB 126 ops/sec
Kyoto TreeDB 2,578 ops/sec
SQLite3 948 ops/sec
MDB 9,687 ops/sec
BerkeleyDB 435 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 1,575 entries/sec
Kyoto TreeDB 3,449 entries/sec
SQLite3 1,107 entries/sec
MDB 9,996 entries/sec
BerkeleyDB 454 entries/sec

Random Writes

LevelDB 1,574 entries/sec
Kyoto TreeDB 2,578 entries/sec
SQLite3 970 entries/sec
MDB 9,922 entries/sec
BerkeleyDB 441 entries/sec

MDB dominates the asynchronous write tests.

F. Synchronous Writes

Sequential Writes

LevelDB 110 ops/sec
Kyoto TreeDB 34 ops/sec
SQLite3 83 ops/sec
MDB 91 ops/sec
MDB, no MetaSync 93 ops/sec
BerkeleyDB 100 ops/sec

Random Writes

LevelDB 92 ops/sec
Kyoto TreeDB 30 ops/sec
SQLite3 80 ops/sec
MDB 66 ops/sec
MDB, no MetaSync 97 ops/sec
BerkeleyDB 107 ops/sec

The benefit of MDB's NOMETASYNC option is marginal here; most of the time required for these operations is simply in writing the 100,000 byte data values.

8. Performance On HDD

The same tests as in Section 2 are performed again, this time using the Seagate ST9500420AS HDD with an ext2 filesystem, which was chosen based on our previous survey of multiple filesystems.

A. Sequential Reads

LevelDB 4,385,965 ops/sec
Kyoto TreeDB 823,723 ops/sec
SQLite3 306,937 ops/sec
MDB 14,084,507 ops/sec
BerkeleyDB 825,083 ops/sec

B. Random Reads

LevelDB 135,925 ops/sec
Kyoto TreeDB 101,041 ops/sec
SQLite3 81,579 ops/sec
MDB 746,826 ops/sec
BerkeleyDB 102,323 ops/sec

Read performance is essentially the same as the previous tests since all of the data is present in the filesystem cache.

C. Sequential Writes

LevelDB 497,265 ops/sec
Kyoto TreeDB 337,952 ops/sec
SQLite3 49,383 ops/sec
MDB 488,281 ops/sec
BerkeleyDB 68,507 ops/sec

D. Random Writes

LevelDB 172,058 ops/sec
Kyoto TreeDB 90,827 ops/sec
SQLite3 19,623 ops/sec
MDB 233,155 ops/sec
BerkeleyDB 27,625 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 517,063 entries/sec (1.04x non-batched)
Kyoto TreeDB 337,952 entries/sec (non-batched)
SQLite3 97,267 entries/sec (1.97x non-batched)
MDB 2,487,562 entries/sec (5.09x non-batched)
BerkeleyDB 118,329 entries/sec (1.73x non-batched)

Random Writes

LevelDB 226,244 entries/sec (1.31x non-batched)
Kyoto TreeDB 90,827 entries/sec (non-batched)
SQLite3 25,825 entries/sec (1.32x non-batched)
MDB 286,451 entries/sec (1.23x non-batched)
BerkeleyDB 26,190 entries/sec (0.95x non-batched)

F. Synchronous Writes

As before, only 1000 operations are performed, due to the slowness of these tests. The HDD's cache was not disabled for these tests.

Sequential Writes

LevelDB 1,260 ops/sec (0.0025x asynch)
Kyoto TreeDB 30 ops/sec (0.0001x asynch)
SQLite3 114 ops/sec (0.0023x asynch)
MDB 364 ops/sec (0.0007x asynch)
BerkeleyDB 733 ops/sec (0.0107x asynch)

Random Writes

LevelDB 1,291 ops/sec (0.0075x asynch)
Kyoto TreeDB 28 ops/sec (0.0003x asynch)
SQLite3 112 ops/sec (0.0057x asynch)
MDB 297 ops/sec (0.0013x asynch)
BerkeleyDB 704 ops/sec (0.0255x asynch)

The slowness of the HDD overshadows any difference between sequential and random write performance here. LevelDB seems to benefit more from the HDD's caching than any of the other databases.

MDB's NOMETASYNC option made no discernible difference here so those results are omitted. Ultimately the seek overhead cannot be avoided using a single storage device. Ideally the meta pages should be isolated in a separate file that can be stored on a separate spindle to avoid the seek overhead.

9. Performance Using More Memory

We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 8.

A. Sequential Reads

LevelDB 4,524,887 ops/sec (1.03x baseline)
Kyoto TreeDB 1,246,883 ops/sec (1.51x baseline)
SQLite3 430,108 ops/sec (1.40x baseline)
MDB 14,084,507 ops/sec (baseline)
BerkeleyDB 572,082 ops/sec (0.69x baseline)

B. Random Reads

LevelDB 137,231 ops/sec (1.01x baseline)
Kyoto TreeDB 192,456 ops/sec (1.90x baseline)
SQLite3 104,102 ops/sec (1.28x baseline)
MDB 746,826 ops/sec (baseline)
BerkeleyDB 134,626 ops/sec (1.32x baseline)

C. Sequential Writes

LevelDB 492,854 ops/sec (0.99x baseline)
Kyoto TreeDB 327,225 ops/sec (0.97x baseline)
SQLite3 48,065 ops/sec (0.97x baseline)
MDB 488,281 ops/sec (baseline)
BerkeleyDB 58,258 ops/sec (0.85x baseline)

D. Random Writes

LevelDB 176,929 ops/sec (1.03x baseline)
Kyoto TreeDB 181,127 ops/sec (1.99x baseline)
SQLite3 19,085 ops/sec (0.97x baseline)
MDB 233,155 ops/sec (baseline)
BerkeleyDB 46,990 ops/sec (1.70x baseline)

E. Batch Writes

Sequential Writes

LevelDB 553,097 entries/sec (1.12x non-batched)
Kyoto TreeDB 327,225 entries/sec (non-batched)
SQLite3 94,697 entries/sec (1.97x non-batched)
MDB 2,487,562 entries/sec (5.09x non-batched)
BerkeleyDB 93,266 entries/sec (1.60x non-batched)

Random Writes

LevelDB 229,095 entries/sec (1.29x non-batched)
Kyoto TreeDB 181,127 entries/sec (non-batched)
SQLite3 28,129 entries/sec (1.47x non-batched)
MDB 286,451 entries/sec (1.23x non-batched)
BerkeleyDB 64,708 entries/sec (1.38x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 1,298 ops/sec (0.0026x asynch)
Kyoto TreeDB 29 ops/sec (0.0001x asynch)
SQLite3 115 ops/sec (0.0024x asynch)
MDB 364 ops/sec (0.0007x asynch)
BerkeleyDB 733 ops/sec (0.0126x asynch)

Random Writes

LevelDB 1,323 ops/sec (0.0075x asynch)
Kyoto TreeDB 29 ops/sec (0.0002x asynch)
SQLite3 113 ops/sec (0.0059x asynch)
MDB 297 ops/sec (0.0013x asynch)
BerkeleyDB 649 ops/sec (0.0138x asynch)

10. Performance Using Large Values

This is the same as the test in Section 4, using the HDD.

A. Sequential Reads

LevelDB 373,413 ops/sec
Kyoto TreeDB 15,910 ops/sec
SQLite3 7,634 ops/sec
MDB 28,571,429 ops/sec
BerkeleyDB 9,443 ops/sec

B. Random Reads

LevelDB 20,831 ops/sec
Kyoto TreeDB 14,167 ops/sec
SQLite3 7,210 ops/sec
MDB 1,700,680 ops/sec
BerkeleyDB 8,786 ops/sec

Again, the read results are about the same as for tmpfs.

C. Sequential Writes

LevelDB 651 ops/sec
Kyoto TreeDB 5,124 ops/sec
SQLite3 1,525 ops/sec
MDB 11,865 ops/sec
BerkeleyDB 470 ops/sec

D. Random Writes

LevelDB 99 ops/sec
Kyoto TreeDB 5,000 ops/sec
SQLite3 553 ops/sec
MDB 11,716 ops/sec
BerkeleyDB 519 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 759 entries/sec
Kyoto TreeDB 5,124 entries/sec
SQLite3 1,828 entries/sec
MDB 12,135 entries/sec
BerkeleyDB 482 entries/sec

Random Writes

LevelDB 754 entries/sec
Kyoto TreeDB 5,000 entries/sec
SQLite3 1,820 entries/sec
MDB 12,037 entries/sec
BerkeleyDB 474 entries/sec

F. Synchronous Writes

Sequential Writes

LevelDB 119 ops/sec
Kyoto TreeDB 21 ops/sec
SQLite3 107 ops/sec
MDB 119 ops/sec
BerkeleyDB 136 ops/sec

Random Writes

LevelDB 117 ops/sec
Kyoto TreeDB 18 ops/sec
SQLite3 107 ops/sec
MDB 113 ops/sec
BerkeleyDB 136 ops/sec

The slowness of the HDD makes most of the database implementations perform about the same. As before, Kyoto Cabinet is much slower than the rest.

The raw data for all of these tests is also available (tmpfs, SSD, and HDD), and the results are tabulated in an OpenOffice spreadsheet for further analysis. The raw data includes additional tests (e.g. reverse sequential reads) which were omitted from this already lengthy report for space reasons.