【Posted】: 2016-04-14 02:11:59
【Question】:
I am running into a write performance problem with Cassandra 3.
I am trying out Cassandra 3.3 using the official Docker image from here: https://github.com/docker-library/cassandra
I start it like this:
docker run --net=host --rm cassandra:3.3
Then I run cassandra-stress against it:
cassandra-stress write
With four threads, this gives me the following results:
op rate : 1913 [WRITE:1913]
partition rate : 1913 [WRITE:1913]
row rate : 1913 [WRITE:1913]
latency mean : 2.1 [WRITE:2.1]
latency median : 1.6 [WRITE:1.6]
latency 95th percentile : 4.1 [WRITE:4.1]
latency 99th percentile : 8.4 [WRITE:8.4]
latency 99.9th percentile : 20.5 [WRITE:20.5]
latency max : 155.4 [WRITE:155.4]
Total partitions : 154607 [WRITE:154607]
Total errors : 0 [WRITE:0]
total gc count : 13
total gc mb : 1951
total gc time (s) : 1
avg gc time(ms) : 59
stdev gc time(ms) : 28
Total operation time : 00:01:20
Doing the same for Cassandra 2.2 with the official image:
docker run --net=host --rm cassandra:2.2
gives me the following results with four threads:
op rate : 2248 [WRITE:2248]
partition rate : 2248 [WRITE:2248]
row rate : 2248 [WRITE:2248]
latency mean : 1.8 [WRITE:1.8]
latency median : 1.4 [WRITE:1.4]
latency 95th percentile : 3.5 [WRITE:3.5]
latency 99th percentile : 7.2 [WRITE:7.2]
latency 99.9th percentile : 16.4 [WRITE:16.4]
latency max : 129.5 [WRITE:129.5]
Total partitions : 195461 [WRITE:195461]
Total errors : 0 [WRITE:0]
total gc count : 11
total gc mb : 1612
total gc time (s) : 1
avg gc time(ms) : 62
stdev gc time(ms) : 21
Total operation time : 00:01:26
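Compared with 2.2, writes per second on 3.3 are almost 15% lower (1913 vs. 2248 op/s). What could be the reason? I have tried changing various parameters, including running Cassandra on the Oracle JVM (rather than OpenJDK, as above), but saw no significant difference. I also tried different Cassandra 3 versions and found no real difference. I know this is a single node and that the results of this basic test do not carry over to a production environment. Still, I would like to know whether anyone has an explanation or can reproduce this behavior.
Any input is welcome!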
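For reference, the "almost 15%" figure can be checked directly from the two op rates reported above. This is only a minimal sketch; the helper name `relative_drop` is made up for illustration and is not part of cassandra-stress.

```python
# Op rates (writes per second) reported by cassandra-stress in the two runs.
op_rate_22 = 2248  # Cassandra 2.2
op_rate_33 = 1913  # Cassandra 3.3


def relative_drop(baseline, observed):
    """Fractional decrease of `observed` relative to `baseline`."""
    return (baseline - observed) / baseline


drop = relative_drop(op_rate_22, op_rate_33)
print(f"3.3 writes/s are {drop:.1%} lower than 2.2")  # prints 14.9%
```

Note that this compares a single run of each version, so some of the gap may be run-to-run noise.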
Update 2016-04-13, JVM parameter differences:
diff jvm-param22-sorted jvm-param33-sorted
> -XX:+AlwaysPreTouch
# Before removing this from the 3.3 config:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# cassand+ 1 19.0 30.3 2876304 1229136 ? Ssl 04:18 0:17 java
#
# After removing it:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# cassand+ 177 17.0 9.4 2885596 383972 ? Sl+ 04:59 0:17 java -
#
# The above could actually explain some of the issues that I've observed in
# low memory environments with multiple containers running since I guess it
# means Cassandra will grab the whole heap at startup from the OS
# regardless of the actual need.

# This was present twice in the 22 config. Should not matter.
< -XX:CMSWaitDuration=10000

# Removing this from the 3.3 config did not have any significant impact
> -XX:+ResizeTLAB

# Removing from the 3.3 config did not have any significant impact
> -XX:-UseBiasedLocking

# Adding this one to the 3.3 config did not have a significant impact
< -XX:+UseCondCardMark
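
The same kind of comparison can be reproduced without sorted files by treating each flag list as a multiset. The sketch below is only an illustration: the lists are shortened and hypothetical, `-Xss256k` is a placeholder for the flags both configs share, and it assumes that 3.3 lists `-XX:CMSWaitDuration=10000` once (the post only says that 2.2 lists it twice).

```python
from collections import Counter


def diff_flags(old_flags, new_flags):
    """Return (only_in_old, only_in_new) as sorted lists, counting duplicates."""
    old_counts = Counter(old_flags)
    new_counts = Counter(new_flags)
    # Counter subtraction keeps only positive counts, so a flag listed twice
    # on one side and once on the other shows up exactly once in the result.
    only_old = sorted((old_counts - new_counts).elements())
    only_new = sorted((new_counts - old_counts).elements())
    return only_old, only_new


# Hypothetical, shortened flag lists (not the full configs from the images).
flags_22 = [
    "-Xss256k",                    # placeholder shared flag, not from the post
    "-XX:CMSWaitDuration=10000",
    "-XX:CMSWaitDuration=10000",   # present twice in the 2.2 config
    "-XX:+UseCondCardMark",
]
flags_33 = [
    "-Xss256k",                    # placeholder shared flag, not from the post
    "-XX:CMSWaitDuration=10000",   # assumed to appear once in 3.3
    "-XX:+AlwaysPreTouch",
    "-XX:+ResizeTLAB",
    "-XX:-UseBiasedLocking",
]

only_22, only_33 = diff_flags(flags_22, flags_33)
for flag in only_22:
    print("<", flag)
for flag in only_33:
    print(">", flag)
```

Using a `Counter` instead of a plain `set` is what makes the duplicated `CMSWaitDuration` entry visible, as it is in the `diff` output above.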
【Discussion】: