HyperDex Benchmark, Part 2

Symas Corp., September 2013


This page shows additional performance results of LMDB vs HyperLevelDB as used in HyperDex. It follows on from the work previously reported in August 2013. The same software and hardware are used in these tests.

Setup

First we pick up where we left off from the previous test, using the database with 100 million 4000-byte records. The following tests then use smaller records of 1000 bytes, and finally 32 bytes.

Results

100M 4K Records, 10M Ops

The results for LMDB are a little slow to start, due to the database being recreated for this test; the HyperLevelDB run simply reused the database left over from the August 2013 test. The overall result nonetheless clearly shows that LMDB retains its performance lead over HyperLevelDB even over much longer runtimes: LMDB completed the work in 27.8 hours, vs 33.3 hours for HyperLevelDB. The LMDB results in this test are somewhat contaminated by an indexer daemon fired off by cron early in the run; unfortunately there wasn't sufficient time to restart the machine and get a clean run.
100M records, 10M ops
MinLatency(us)  AvgLatency(us)  95th%ile(ms)  99th%ile(ms)  MaxLatency(ms)  Runtime(sec)  Throughput(ops/sec)  CPUtime(mm:ss)
LMDB update246456801973151342610081099.218:27
LMDB read2143894215326713266
LevelDB update2254872217228617554412000183.34412:21
LevelDB read202477501712842940

The raw test output is available in this tar archive. It is also tabulated in this OpenOffice spreadsheet.

100M 1K Records, Sequential Insert

We repeated the tests using 100 million records of 1000 bytes each. In this and the following test with 32-byte records, HyperLevelDB is significantly faster than LMDB. This is as expected; even in the microbenchmark results it was clear that LMDB's write performance led in only two cases: batched sequential writes and large value writes. With smaller records and non-batched writes, the write amplification of LMDB's copy-on-write approach becomes too expensive. The LMDB HyperDex daemon was started with "-s 512000" to set a 500GB map size.
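For reference, here is a minimal sketch (in C, against the public LMDB API) of how a map size of this magnitude is set; the database path is a placeholder and error handling is abbreviated. The daemon's "-s 512000" option presumably ends up in an equivalent mdb_env_set_mapsize() call, since 512000 x 2^20 bytes equals the 536870912000-byte map size reported by mdb_stat below.

    /* Sketch: open an LMDB environment with a 500GB map size.
     * "./testdb" is a placeholder directory and must already exist. */
    #include <stdio.h>
    #include <lmdb.h>

    int main(void)
    {
        MDB_env *env;
        int rc;

        rc = mdb_env_create(&env);
        if (rc) { fprintf(stderr, "create: %s\n", mdb_strerror(rc)); return 1; }

        /* 512000 MiB = 500 GiB = 536870912000 bytes; must be set before open */
        rc = mdb_env_set_mapsize(env, (size_t)512000 * 1024 * 1024);
        if (rc) { fprintf(stderr, "set_mapsize: %s\n", mdb_strerror(rc)); return 1; }

        rc = mdb_env_open(env, "./testdb", 0, 0664);
        if (rc) { fprintf(stderr, "open: %s\n", mdb_strerror(rc)); return 1; }

        mdb_env_close(env);
        return 0;
    }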

With this record size LMDB is half the speed of HyperLevelDB. There is also a distinct cliff at around 4550 seconds into the LMDB load, where throughput drops by half; this is most likely the point where the majority of the pages no longer fit in RAM.
100M 1K records, Sequential Insert
MinLatency(us)  AvgLatency(us)  95th%ile(ms)  99th%ile(ms)  MaxLatency(ms)  Runtime(sec)  Throughput(ops/sec)  CPUtime(mm:ss)
LMDB12312585171960315773167195:56
LevelDB1287050151603177405637674:57

100M 1K Records, 10M Ops


HyperLevelDB maintains its lead over LMDB in this test.
100M records, 10M ops
MinLatency(us)  AvgLatency(us)  95th%ile(ms)  99th%ile(ms)  MaxLatency(ms)  Runtime(sec)  Throughput(ops/sec)  CPUtime(mm:ss)
LMDB update2034165418631046118953111216:29
LMDB read168343201382426391
LevelDB update1952543090151649186304415963:31
LevelDB read17125122901514618

Here's the mdb_stat output for the resulting LMDB database:

Environment Info
  Map address: (nil)
  Map size: 536870912000
  Page size: 4096
  Max pages: 131072000
  Number of pages used: 84119625
  Last transaction ID: 102007830
  Max readers: 126
  Number of readers used: 0
Freelist Status
  Tree depth: 2
  Branch pages: 1
  Leaf pages: 3
  Overflow pages: 0
  Entries: 54
  Free pages: 600
Status of Main DB
  Tree depth: 6
  Branch pages: 1235134
  Leaf pages: 82883885
  Overflow pages: 0
  Entries: 201454920
The 1235134 branch pages consume 5GB, while the 82883885 leaf pages consume 339GB. With the data items being stored inline in the leaf pages instead of in overflow pages, the leaf page volume has grown drastically compared to the test with 4K records. The number of branch pages has also increased proportionately to track all the leaf pages. While it's possible that most of the branch pages will still be RAM-resident, just about all of the leaf pages will require a disk I/O to access.
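As a sanity check, those figures follow directly from the page counts above and the 4096-byte page size; a trivial calculation (values copied from the mdb_stat output) reproduces them:

    /* Reproduce the space figures quoted above from the mdb_stat
     * page counts and the 4096-byte page size. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned long long page_size    = 4096;
        const unsigned long long branch_pages = 1235134;
        const unsigned long long leaf_pages   = 82883885;

        printf("branch pages: %.1f GB\n", branch_pages * page_size / 1e9); /* ~5.1 GB   */
        printf("leaf pages:   %.1f GB\n", leaf_pages   * page_size / 1e9); /* ~339.5 GB */
        return 0;
    }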

The raw test output is available in this tar archive. It is also tabulated in this OpenOffice spreadsheet.


500M 32B Records, Sequential Insert


For this loading phase LMDB is significantly faster than HyperLevelDB. The Y axis of the latency graph was switched to a logarithmic scale; otherwise the LMDB latency would have been invisible. CPU times were not recorded for these tests; the trend has not changed.
500M 32B records, Sequential Insert
MinLatency(us)  AvgLatency(us)  95th%ile(ms)  99th%ile(ms)  MaxLatency(ms)  Runtime(sec)  Throughput(ops/sec)
LMDB932640015033345314946
LevelDB86434091631548149122

500M 32B Records, 50M Ops


500M records, 50M ops
MinLatency(us)  AvgLatency(us)  95th%ile(ms)  99th%ile(ms)  MaxLatency(ms)  Runtime(sec)  Throughput(ops/sec)
LMDB update1432638212621912999274193182
LMDB read107208008716812973
LevelDB update111161256410844529206737242
LevelDB read10616216641081469172

Here's the mdb_stat output for the resulting LMDB database:

Environment Info
  Map address: (nil)
  Map size: 536870912000
  Page size: 4096
  Max pages: 131072000
  Number of pages used: 27554782
  Last transaction ID: 510031188
  Max readers: 126
  Number of readers used: 4
Freelist Status
  Tree depth: 2
  Branch pages: 1
  Leaf pages: 18
  Overflow pages: 0
  Entries: 337
  Free pages: 3423
Status of Main DB
  Tree depth: 5
  Branch pages: 434130
  Leaf pages: 27117208
  Overflow pages: 0
  Entries: 1008466992
The 434130 branch pages consume 1.7GB, while the 27117208 leaf pages consume 111GB. All of the branch pages will be RAM-resident, but most leaf pages will require disk I/O for access.

The raw output for these tests is available in this tar archive. It is also tabulated in this OpenOffice spreadsheet.

Conclusion

Taken together with the August 2013 results, these tests give a clear picture of where LMDB's strengths and weaknesses lie. For larger records, LMDB's use of overflow pages improves lookup performance because only keys are stored in the leaf pages, keeping those pages small and keeping more keys in RAM. For smaller records, the volume of leaf pages becomes an issue. The write amplification of LMDB's copy-on-write strategy also carries a higher cost with smaller records, since every update must still write a number of whole pages. Additionally, LMDB tends to perform best when the database is no more than 10 times the size of available RAM.
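As a rough illustration of that write-amplification cost, a random update rewrites roughly one full page per B-tree level on the copy-on-write path regardless of how small the record is. The estimate below is a simplified lower bound (it ignores meta-page and freelist writes) using the tree depths reported by mdb_stat above:

    /* Rough lower-bound estimate of write amplification for random
     * LMDB updates: copy-on-write rewrites about one full page per
     * B-tree level touched, regardless of record size. Meta-page and
     * freelist writes are ignored here. */
    #include <stdio.h>

    static void estimate(const char *label, int tree_depth, int record_size)
    {
        const int page_size = 4096;
        int bytes_written = tree_depth * page_size;
        printf("%s: ~%d bytes written per %d-byte update (~%dx amplification)\n",
               label, bytes_written, record_size, bytes_written / record_size);
    }

    int main(void)
    {
        /* Tree depths taken from the mdb_stat outputs above. */
        estimate("1000-byte records, depth 6", 6, 1000);  /* ~24x  */
        estimate("32-byte records, depth 5",   5, 32);    /* ~640x */
        return 0;
    }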

These results also highlight potential areas for future work: