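【Posted】: 2020-10-02 15:31:57
【Problem description】:
First I need to explain our setup, because it is quite different from a typical database server with RAID groups and so on. Our customers usually buy a server such as an HP DL 380 Gen10 with two 300 GB hard disks (not SSDs), configured as RAID 1 and running Windows.
We only manage the metadata of other storage systems here, so that customers can query us and find their information on those large storage systems.
Because our old database kept getting corrupted, we looked for a new, more stable database with less overhead and found RocksDB, currently at version 6.12.0.
Unfortunately, after running for a few hours, it appears to block my program for many minutes because of a write stall: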
2020/10/01-15:58:44.678646 1a64 [WARN] [db\column_family.cc:876] [default]
Stopping writes because we have 2 immutable memtables (waiting for flush),
max_write_buffer_number is set to 2
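Am I right that the write workload is simply too much for the machine?
Our software is a service that receives an update from each of up to 2000 different servers at least once per second (the limit may be raised in the future, if possible). Most of the time the same database entries are just updated/written again, because one piece of information in each entry is the current time of the corresponding server. Of course I could try to write the data to the hard disk less often, but then our information would not be up to date when a customer asks us for it.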
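So my questions are:
1. I assume that currently every write request really is written to disk. Is there a way to enable some kind of cache (or increase its size, if it is too small) so that data is written to the hard disk less often while read requests still return the correct data from memory?
2. I also saw that there is a merge operator, but I am not sure when this merge happens. Is there already a cache like the one mentioned in 1., where data is collected for a while, then merged, and then written to the hard disk?
3. Are there any other optimizations that could help me in this situation?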
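To make the first question concrete, here is a minimal Python sketch of the behavior being asked about. It is independent of RocksDB and only illustrates the idea: repeated writes to the same key are coalesced in memory, reads always see the newest value, and the disk is touched only on a periodic flush. The class name `CoalescingStore`, the `flush_interval` parameter, and the dict standing in for the disk are all hypothetical.

```python
import time


class CoalescingStore:
    """Hypothetical write-back cache: the last write per key wins, flushed in batches."""

    def __init__(self, flush_interval=5.0, clock=time.monotonic):
        self.flush_interval = flush_interval  # seconds between flushes (assumed value)
        self.clock = clock
        self.pending = {}      # newest unflushed value per key, held in memory
        self.disk = {}         # stand-in for the on-disk database
        self.disk_writes = 0   # number of entries actually written to "disk"
        self.last_flush = clock()

    def put(self, key, value):
        # Overwrite any older pending value for the same key.
        self.pending[key] = value
        if self.clock() - self.last_flush >= self.flush_interval:
            self.flush()

    def get(self, key):
        # Reads check memory first, so they always return the latest value.
        if key in self.pending:
            return self.pending[key]
        return self.disk.get(key)

    def flush(self):
        # One disk write per key, no matter how often it was updated.
        for key, value in self.pending.items():
            self.disk[key] = value
            self.disk_writes += 1
        self.pending.clear()
        self.last_flush = self.clock()


store = CoalescingStore(flush_interval=5.0)
for second in range(3):
    store.put("server-0001", {"current_time": second})
latest = store.get("server-0001")  # served from memory before any flush
store.flush()
```

In this sketch three updates to the same entry cost a single disk write. The trade-off is that anything still in `pending` is lost if the process dies before the next flush.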
Any help is welcome. Thanks in advance.
Here are some more logs, in case they are of interest:
** File Read Latency Histogram By Level [default] **
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
L0 2/0 17.12 MB 0.5 0.0 0.0 0.0 0.4 0.4 0.0 1.0 0.0 81.5 5.15 0.00 52 0.099 0 0
L1 3/0 192.76 MB 0.8 3.3 0.4 2.9 3.2 0.3 0.0 8.0 333.3 327.6 10.11 0.00 13 0.778 4733K 119K
L2 20/0 1.02 GB 0.4 1.6 0.4 1.2 1.2 -0.0 0.0 3.1 387.5 290.0 4.30 0.00 7 0.614 2331K 581K
Sum 25/0 1.22 GB 0.0 4.9 0.8 4.1 4.9 0.7 0.0 11.8 257.4 254.5 19.56 0.00 72 0.272 7064K 700K
Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0
** Compaction Stats [default] **
Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
Low 0/0 0.00 KB 0.0 4.9 0.8 4.1 4.5 0.3 0.0 0.0 349.5 316.4 14.40 0.00 20 0.720 7064K 700K
High 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.4 0.4 0.0 0.0 0.0 81.7 5.12 0.00 51 0.100 0 0
User 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 50.9 0.03 0.00 1 0.030 0 0
Uptime(secs): 16170.9 total, 0.0 interval
Flush(GB): cumulative 0.410, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 4.86 GB write, 0.31 MB/s write, 4.92 GB read, 0.31 MB/s read, 19.6 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
** File Read Latency Histogram By Level [default] **
2020/10/01-15:53:21.248110 1a64 [db\db_impl\db_impl_write.cc:1701] [default] New memtable created with log file: #10465. Immutable memtables: 0.
2020/10/01-15:58:44.678596 1a64 [db\db_impl\db_impl_write.cc:1701] [default] New memtable created with log file: #10466. Immutable memtables: 1.
2020/10/01-15:58:44.678646 1a64 [WARN] [db\column_family.cc:876] [default] Stopping writes because we have 2 immutable memtables (waiting for flush), max_write_buffer_number is set to 2
2020/10/01-16:02:57.448977 2328 [db\db_impl\db_impl.cc:900] ------- DUMPING STATS -------
2020/10/01-16:02:57.449034 2328 [db\db_impl\db_impl.cc:901]
** DB Stats **
Uptime(secs): 16836.8 total, 665.9 interval
Cumulative writes: 20M writes, 20M keys, 20M commit groups, 1.0 writes per commit group, ingest: 3.00 GB, 0.18 MB/s
Cumulative WAL: 20M writes, 0 syncs, 20944372.00 writes per sync, written: 3.00 GB, 0.18 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 517K writes, 517K keys, 517K commit groups, 1.0 writes per commit group, ingest: 73.63 MB, 0.11 MB/s
Interval WAL: 517K writes, 0 syncs, 517059.00 writes per sync, written: 0.07 MB, 0.11 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
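As a rough sanity check on the workload, the figures in the DB Stats dump above can be converted into per-second rates. This is only a sketch: it assumes that the "writes per sync" value reported with 0 syncs (20944372) equals the total number of writes, and that 1 GB = 1024 MB in these logs.

```python
# Figures copied from the "DB Stats" dump above.
uptime_s = 16836.8            # Uptime(secs), total
total_writes = 20_944_372     # assumption: "writes per sync" with 0 syncs is the total write count
ingest_gb = 3.00              # cumulative ingest
interval_writes = 517_059     # same assumption, applied to the interval WAL line
interval_s = 665.9            # Uptime(secs), interval

writes_per_s = total_writes / uptime_s
ingest_mb_per_s = ingest_gb * 1024 / uptime_s          # assumes 1 GB = 1024 MB
bytes_per_write = ingest_gb * 1024**3 / total_writes
interval_writes_per_s = interval_writes / interval_s

print(f"cumulative: {writes_per_s:.0f} writes/s, {ingest_mb_per_s:.2f} MB/s")
print(f"average size: {bytes_per_write:.0f} bytes per write")
print(f"last interval: {interval_writes_per_s:.0f} writes/s")
```

Under these assumptions the cumulative average is on the order of 1200 small writes per second, and the computed throughput agrees with the 0.18 MB/s printed in the log.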
Our watchdog triggered a breakpoint after a mutex had been blocked for several minutes; after that, this appeared in the log file:
2020/10/02-17:44:18.602776 2328 [db\db_impl\db_impl.cc:900] ------- DUMPING STATS -------
2020/10/02-17:44:18.602990 2328 [db\db_impl\db_impl.cc:901]
** DB Stats **
Uptime(secs): 109318.0 total, 92481.2 interval
Cumulative writes: 20M writes, 20M keys, 20M commit groups, 1.0 writes per commit group, ingest: 3.00 GB, 0.03 MB/s
Cumulative WAL: 20M writes, 0 syncs, 20944372.00 writes per sync, written: 3.00 GB, 0.03 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 0 writes, 0 keys, 0 commit groups, 0.0 writes per commit group, ingest: 0.00 MB, 0.00 MB/s
Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
L0 2/0 17.12 MB 0.5 0.0 0.0 0.0 0.4 0.4 0.0 1.0 0.0 81.5 5.15 0.00 52 0.099 0 0
L1 3/0 192.76 MB 0.8 3.3 0.4 2.9 3.2 0.3 0.0 8.0 333.3 327.6 10.11 0.00 13 0.778 4733K 119K
L2 20/0 1.02 GB 0.4 1.6 0.4 1.2 1.2 -0.0 0.0 3.1 387.5 290.0 4.30 0.00 7 0.614 2331K 581K
Sum 25/0 1.22 GB 0.0 4.9 0.8 4.1 4.9 0.7 0.0 11.8 257.4 254.5 19.56 0.00 72 0.272 7064K 700K
Int 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00 0.00 0 0.000 0 0
** Compaction Stats [default] **
Priority Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
Low 0/0 0.00 KB 0.0 4.9 0.8 4.1 4.5 0.3 0.0 0.0 349.5 316.4 14.40 0.00 20 0.720 7064K 700K
High 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.4 0.4 0.0 0.0 0.0 81.7 5.12 0.00 51 0.100 0 0
User 0/0 0.00 KB 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 50.9 0.03 0.00 1 0.030 0 0
Uptime(secs): 109318.0 total, 92481.2 interval
Flush(GB): cumulative 0.410, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 4.86 GB write, 0.05 MB/s write, 4.92 GB read, 0.05 MB/s read, 19.6 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 1 memtable_compaction, 0 memtable_slowdown, interval 0 total count
** File Read Latency Histogram By Level [default] **
【Discussion】: