【Title】: Bluemix hyperledger block height/blocks out of sync
【Posted】: 2017-02-21 00:17:02
【Question】:

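I have a blockchain implementation on Bluemix with 4 peers, and I have been deploying new chaincode to it. Recently, however, peer 3 started taking a very long time to deploy. Eventually I thought that stopping and restarting peer 3 would help. It didn't.

So while I have kept deploying and invoking various chaincodes, peer 3 has fallen out of date. It looks like new chaincode is being run by only 3 of the 4 peers.

I see the errors in the sample log below. How do I get peer 3 resynced with the other peers?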

OUT - 18:34:30.273 [consensus/pbft] execDoneSync -> INFO 06b[0m Replica 3 finished execution 28, trying next
OUT - 18:48:07.588 [consensus/pbft] executeOne -> INFO 06c[0m Replica 3 executing/committing request batch for view=0/seqNo=29 and digest 5trDGesTKJPWIWy/RKjTq5vY2tIQZ/L/a7C7LvYurk/H2zYorDAN7zsTnbqq2kcR1HcqPcnpXK1Gqu8q1ItgFA==
OUT - 2017/02/20 18:54:10 transport: http2Client.notifyError got notified that the client transport was broken EOF.
OUT - [31m18:54:10.162 [peer] handleChat -> ERRO 06d[0m Error during Chat, stopping handler: stream error: code = 1 desc = "context canceled"
OUT - [31m18:54:10.162 [peer] handleChat -> ERRO 06e[0m Error during Chat, stopping handler: rpc error: code = 13 desc = transport is closing
OUT - [31m18:54:10.162 [peer] chatWithPeer -> ERRO 06f[0m Ending Chat with peer address 5cc24f88bbcc414a96962ea1c37c3aea-vp2.us.blockchain.ibm.com:30001 due to error: Error during Chat, stopping handler: rpc error: code = 13 desc = transport is closing
OUT - 2017/02/20 18:54:11 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 172.16.6.8:30001: getsockopt: connection refused"; Reconnecting to {"5cc24f88bbcc414a96962ea1c37c3aea-vp2.us.blockchain.ibm.com:30001" <nil>}
OUT - [31m18:54:11.668 [peer] handleChat -> ERRO 070[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp2"  5cc24f88bbcc414a96962ea1c37c3aea-vp2.us.blockchain.ibm.com:30001 VALIDATOR `�ބ��M�U�d,��������9(ˑ(����}
OUT - [35m18:54:11.806 [consensus/pbft] recvCheckpoint -> CRIT 071[0m Network unable to find stable certificate for seqNo 30 (3 different values observed already)
OUT - panic: Network unable to find stable certificate for seqNo 30 (3 different values observed already)
OUT - 
OUT - goroutine 71 [running]:
OUT - panic(0xc137a0, 0xc82032f9e0)
OUT -   /opt/go/src/runtime/panic.go:464 +0x3e6
OUT - github.com/hyperledger/fabric/vendor/github.com/op/go-logging.(*Logger).Panicf(0xc8201ae4e0, 0x103cd40, 0x5d, 0xc8206863e0, 0x2, 0x2)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/op/go-logging/logger.go:194 +0x11e
OUT - github.com/hyperledger/fabric/consensus/pbft.(*pbftCore).recvCheckpoint(0xc820069d40, 0xc8206863a0, 0x0, 0x0)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/pbft/pbft-core.go:1185 +0xcc7
OUT - github.com/hyperledger/fabric/consensus/pbft.(*pbftCore).ProcessEvent(0xc820069d40, 0xdf2b40, 0xc8206863a0, 0x0, 0x0)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/pbft/pbft-core.go:349 +0x571
OUT - github.com/hyperledger/fabric/consensus/pbft.(*obcBatch).ProcessEvent(0xc820220600, 0xdf2b40, 0xc8206863a0, 0x0, 0x0)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/pbft/batch.go:429 +0x6b4
OUT - github.com/hyperledger/fabric/consensus/util/events.SendEvent(0x7f0e948fdbe0, 0xc820220600, 0xda32e0, 0xc82032f760)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/util/events/events.go:113 +0x45
OUT - github.com/hyperledger/fabric/consensus/util/events.(*managerImpl).Inject(0xc820331920, 0xda32e0, 0xc82032f760)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/util/events/events.go:123 +0x4f
OUT - github.com/hyperledger/fabric/consensus/util/events.(*managerImpl).eventLoop(0xc820331920)
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/util/events/events.go:132 +0xdb
OUT - created by github.com/hyperledger/fabric/consensus/util/events.(*managerImpl).Start
OUT -   /opt/gopath/src/github.com/hyperledger/fabric/consensus/util/events/events.go:100 +0x35
OUT - 2017-02-20 18:54:11,817 INFO exited: start_peer (exit status 2; expected)
OUT - 2017-02-20 18:54:12,819 INFO spawned: 'start_peer' with pid 37
OUT - 18:54:12.869 [nodeCmd] serve -> INFO 001[0m Security enabled status: true
OUT - 18:54:12.869 [nodeCmd] serve -> INFO 002[0m Privacy enabled status: false
OUT - 18:54:12.869 [eventhub_producer] start -> INFO 003[0m event processor started
OUT - 18:54:12.869 [db] open -> INFO 004[0m Setting rocksdb maxLogFileSize to 10485760
OUT - 18:54:12.869 [db] open -> INFO 005[0m Setting rocksdb keepLogFileNum to 10
OUT - 18:54:12.960 [crypto] RegisterValidator -> INFO 006[0m Registering validator [peer3] with name [peer3]...
OUT - 18:54:12.961 [crypto] RegisterValidator -> INFO 007[0m Registering validator [peer3] with name [peer3]...done!
OUT - 18:54:12.962 [crypto] InitValidator -> INFO 008[0m Initializing validator [peer3]...
OUT - 18:54:12.964 [crypto] InitValidator -> INFO 009[0m Initializing validator [peer3]...done!
OUT - 18:54:12.965 [chaincode] NewChaincodeSupport -> INFO 00a[0m Chaincode support using peerAddress: 5cc24f88bbcc414a96962ea1c37c3aea-vp3.us.blockchain.ibm.com:30001
OUT - [33m18:54:12.965 [sysccapi] RegisterSysCC -> WARN 00b[0m Currently system chaincode does support security(noop,github.com/hyperledger/fabric/bddtests/syschaincode/noop)
OUT - 18:54:12.965 [state] loadConfig -> INFO 00c[0m Loading configurations...
OUT - 18:54:12.965 [state] loadConfig -> INFO 00d[0m Configurations loaded. stateImplName=[buckettree], stateImplConfigs=map[maxGroupingAtEachLevel:%!s(int=5) bucketCacheSize:%!s(int=100) numBuckets:%!s(int=1000003)], deltaHistorySize=[500]
OUT - 18:54:12.965 [state] NewState -> INFO 00e[0m Initializing state implementation [buckettree]
OUT - 18:54:12.965 [buckettree] initConfig -> INFO 00f[0m configs passed during initialization = map[string]interface {}{"numBuckets":1000003, "maxGroupingAtEachLevel":5, "bucketCacheSize":100}
OUT - 18:54:12.965 [buckettree] initConfig -> INFO 010[0m Initializing bucket tree state implemetation with configurations &{maxGroupingAtEachLevel:5 lowestLevel:9 levelToNumBucketsMap:map[6:8001 0:1 9:1000003 3:65 2:13 8:200001 7:40001 4:321 1:3 5:1601] hashFunc:0xab4dc0}
OUT - 18:54:12.966 [buckettree] newBucketCache -> INFO 011[0m Constructing bucket-cache with max bucket cache size = [100] MBs
OUT - 18:54:12.966 [buckettree] loadAllBucketNodesFromDB -> INFO 012[0m Loaded buckets data in cache. Total buckets in DB = [72]. Total cache size:=10240
OUT - 18:54:12.967 [consensus/controller] NewConsenter -> INFO 013[0m Creating consensus plugin pbft
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 014[0m PBFT type = *pbft.obcBatch
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 015[0m PBFT Max number of validating peers (N) = 4
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 016[0m PBFT Max number of failing peers (f) = 1
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 017[0m PBFT byzantine flag = false
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 018[0m PBFT request timeout = 30s
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 019[0m PBFT view change timeout = 30s
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01a[0m PBFT Checkpoint period (K) = 10
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01b[0m PBFT broadcast timeout = 1s
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01c[0m PBFT Log multiplier = 4
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01d[0m PBFT log size (L) = 40
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01e[0m PBFT null requests disabled
OUT - 18:54:12.967 [consensus/pbft] newPbftCore -> INFO 01f[0m PBFT automatic view change disabled
OUT - 18:54:13.088 [consensus/pbft] restoreLastSeqNo -> INFO 020[0m Replica 3 restored lastExec: 28
OUT - 18:54:13.101 [consensus/pbft] restoreState -> INFO 021[0m Replica 3 restored state: view: 0, seqNo: 30, pset: 10, qset: 10, reqBatches: 10, chkpts: 1 h: 20
OUT - 18:54:13.101 [consensus/pbft] newObcBatch -> INFO 022[0m PBFT Batch size = 1000
OUT - 18:54:13.102 [consensus/pbft] newObcBatch -> INFO 023[0m PBFT Batch timeout = 1s
OUT - 18:54:13.102 [nodeCmd] serve -> INFO 024[0m Starting peer with ID=name:"vp3" , network ID=5cc24f88bbcc414a96962ea1c37c3aea, address=5cc24f88bbcc414a96962ea1c37c3aea-vp3.us.blockchain.ibm.com:30001, rootnodes=5cc24f88bbcc414a96962ea1c37c3aea-vp0.us.blockchain.ibm.com:30001,5cc24f88bbcc414a96962ea1c37c3aea-vp1.us.blockchain.ibm.com:30001,5cc24f88bbcc414a96962ea1c37c3aea-vp2.us.blockchain.ibm.com:30001, validator=true
OUT - 18:54:13.108 [rest] StartOpenchainRESTServer -> INFO 025[0m Initializing the REST service on 0.0.0.0:5001, TLS is enabled.
OUT - 18:54:13.109 [consensus/statetransfer] SyncToTarget -> INFO 026[0m Syncing to target 7f9573db0cae463b3f02b37312525e8f128d1415e05357d04751a88c01f831ff35e631a732c01c917aa9991a3c122a6e4be48ff50cf28f8e82b73729a4851087 for block number 28 with peers []
OUT - [31m18:54:13.180 [peer] handleChat -> ERRO 027[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp2"  5cc24f88bbcc414a96962ea1c37c3aea-vp2.us.blockchain.ibm.com:30001 VALIDATOR `�ބ��M�U�d,��������9(ˑ(����}
OUT - [31m18:54:13.414 [peer] handleChat -> ERRO 028[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp0"  5cc24f88bbcc414a96962ea1c37c3aea-vp0.us.blockchain.ibm.com:30001 VALIDATOR 2�)���J��;B���C��6U&�~ᑀ�A�  }
OUT - [31m18:54:13.415 [peer] handleChat -> ERRO 029[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp0"  5cc24f88bbcc414a96962ea1c37c3aea-vp0.us.blockchain.ibm.com:30001 VALIDATOR 2�)���J��;B���C��6U&�~ᑀ�A�  }
OUT - [31m18:54:13.415 [peer] handleChat -> ERRO 02a[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp0"  5cc24f88bbcc414a96962ea1c37c3aea-vp0.us.blockchain.ibm.com:30001 VALIDATOR 2�)���J��;B���C��6U&�~ᑀ�A�  }
OUT - 18:54:13.478 [consensus/statetransfer] blockThread -> INFO 02b[0m Validated blockchain to the genesis block
OUT - 18:54:13.478 [consensus/pbft] ProcessEvent -> INFO 02c[0m Replica 3 application caught up via state transfer, lastExec now 28
OUT - [31m18:54:13.478 [consensus/pbft] Checkpoint -> ERRO 02d[0m Attempted to checkpoint a sequence number (28) which is not a multiple of the checkpoint interval (10)
OUT - [31m18:54:13.502 [peer] handleChat -> ERRO 02e[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1"  5cc24f88bbcc414a96962ea1c37c3aea-vp1.us.blockchain.ibm.com:30001 VALIDATOR �7��$iAG��zr-����8���f��8�q�<}
OUT - [31m18:54:13.526 [peer] handleChat -> ERRO 02f[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1"  5cc24f88bbcc414a96962ea1c37c3aea-vp1.us.blockchain.ibm.com:30001 VALIDATOR �7��$iAG��zr-����8���f��8�q�<}
OUT - [31m18:54:13.537 [peer] handleChat -> ERRO 030[0m Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1"  5cc24f88bbcc414a96962ea1c37c3aea-vp1.us.blockchain.ibm.com:30001 VALIDATOR �7��$iAG��zr-����8���f��8�q�<}
OUT - 2017-02-20 18:54:28,551 INFO success: start_peer entered RUNNING state, process has stayed up for > than 15 seconds (startsecs)
OUT - /scripts/start.sh -network_id 5cc24f88bbcc414a96962ea1c37c3aea -peer_id vp3 -chaincode_host prod-us-01-chaincode-swarm-vp3.us.blockchain.ibm.com -chaincode_port 3383 -network_name us.blockchain.ibm.com -port_discovery 30001 -port_rest 5001 -port_event 31001 -peer_enrollid peer3 -chaincode_tls true -peer_tls true -num_peers 4
OUT - Enrollment secret is not passed calculating the default

【Comments】:

    Tags: ibm-cloud blockchain hyperledger hyperledger-fabric


    【Answer 1】:

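    This is certainly the behavior I would expect given your description of the steps above. How the peers actually sync requires a little explanation, and it depends on some of the configuration parameters set on the blockchain.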

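    By stopping vp3 you have in effect taken vp3 out of consensus and caused vp3 to advance its view. The blockchain can progress normally with only 3 nodes participating in consensus, which is what is happening right now. The other three peers are participating and progressing normally, and they are happy with the state and view they are in. You will probably see some messages from vp3 to the other peers asking for a view change, but since they work perfectly well without it, they will ignore those requests for now.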
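The claim that three of four peers can keep going follows from the usual PBFT arithmetic. The sketch below uses the textbook formulas f = (N - 1) / 3 and quorum = 2f + 1; it is an illustration only, and the exact quorum formula inside Fabric may differ for network sizes other than N = 3f + 1. The values N = 4 and f = 1 match the startup log in the question. The function name is hypothetical.

```python
def pbft_fault_tolerance(n):
    """Return (f, quorum) for a textbook PBFT network of n validating peers."""
    if n < 4:
        raise ValueError("PBFT needs at least 4 validating peers to tolerate a fault")
    f = (n - 1) // 3      # maximum number of faulty or absent peers tolerated
    quorum = 2 * f + 1    # matching votes needed for the network to make progress
    return f, quorum


f, quorum = pbft_fault_tolerance(4)
print(f"N=4 tolerates f={f} faulty peer(s); quorum is {quorum}")

# With vp3 out of consensus, vp0, vp1 and vp2 remain, which still meets the quorum.
remaining = 3
print("remaining peers can proceed:", remaining >= quorum)
```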

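    From vp3's perspective, it knows it is out of sync and therefore not in consensus. If the network stays in its current state (vp3 out of consensus and one view ahead; vp0, vp1 and vp2 in consensus and all on the same view, one behind vp3), then, based on some of the PBFT configuration variables on vp3 (on the Starter Network you are using, that is a 40-block window and 10-block checkpoints), it will not worry about syncing up. Once it is 40 blocks behind it will initiate catch-up via state transfer, but it uses the next two 10-block checkpoints to do so. So with the current configuration you will see vp3 advance its chain only when it is 60 blocks behind the others. Note that this only makes sure vp3 does not fall too far behind. It will not necessarily bring it back into consensus.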
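The 40 and 60 block figures can be reproduced from the values printed in the startup log ("PBFT Checkpoint period (K) = 10", "PBFT Log multiplier = 4", "PBFT log size (L) = 40"). The sketch below is only arithmetic over those numbers; the helper name and the `checkpoints_needed` parameter are hypothetical, and the "two further checkpoints" figure is taken from the answer rather than from the Fabric source.

```python
CHECKPOINT_PERIOD_K = 10   # "PBFT Checkpoint period (K) = 10" in the log
LOG_MULTIPLIER = 4         # "PBFT Log multiplier = 4" in the log
CHECKPOINTS_NEEDED = 2     # per the answer: two more checkpoints are used to catch up


def catch_up_thresholds(k, multiplier, checkpoints_needed):
    """Return (log_size, lag_at_which_the_chain_advances), both in blocks."""
    log_size = k * multiplier                       # the window L reported in the log
    advance_at = log_size + checkpoints_needed * k  # window plus the extra checkpoints
    return log_size, advance_at


log_size, advance_at = catch_up_thresholds(
    CHECKPOINT_PERIOD_K, LOG_MULTIPLIER, CHECKPOINTS_NEEDED
)
print(f"state transfer is initiated about {log_size} blocks behind")
print(f"the lagging peer's chain advances about {advance_at} blocks behind")
```

Changing K or the log multiplier on a self-hosted network would shift both thresholds proportionally under this reading.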

    You can find more information about PBFT and how it is implemented on the Starter Network plan here ==> https://console.ng.bluemix.net/docs/services/blockchain/etn_pbft.html

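    Now, as for resyncing, it can happen in a few different ways.

    1) The other peers decide to change their view for some reason. This can happen for many reasons: network or communication problems between nodes, heavy load that leads the participating nodes to vote for a new leader they think may be faster (and so change view), and a few others. When they vote to change their view, they advance to where vp3 is already waiting. vp3 will then quickly sync with the blockchain and start participating in consensus again. At that point all peers will be in sync. This can happen at any time.

    2) You can try to "force" the resync, that is, try to force the other peers to advance their view to meet vp3. One way to do this is to stop vp3, then stop another peer (for example vp2). Advance the chain by issuing an invoke against one of the peers that is still up. Then restart vp2, and then restart vp3. In most cases this realigns the peers, although timing can be a factor. It is possible that all 4 peers advance their view (still leaving vp3 one view ahead), or that the 3 peers advance their view past vp3, leaving vp3 one view behind. If you just want to experiment and see how the blockchain reacts in these situations, you can give this a try.

    3) If you have your own local blockchain, using the docker images published here ==> https://hub.docker.com/r/ibmblockchain/fabric-peer/ you can set some configuration settings that force automatic view changes at a specific interval, which brings out-of-sync peers back in line on a more consistent basis. That is not something you can do on the Starter Network on Bluemix, which you appear to be using (judging from your screenshot).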
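Whichever of the routes above you try, it helps to check whether vp3 has actually caught up. The sketch below compares chain heights across peers. It assumes the Fabric v0.6 REST API, where `GET /chain` returns a JSON object with a `height` field; the base URL passed to `fetch_height` and the sample heights are placeholders, not values from this network.

```python
import json
from urllib.request import urlopen


def fetch_height(base_url, timeout=10):
    """Read the chain height from one peer's REST endpoint (assumed: GET /chain)."""
    with urlopen(f"{base_url}/chain", timeout=timeout) as resp:
        return json.load(resp)["height"]


def lagging_peers(heights):
    """Map each peer that is behind the tallest chain to its block deficit."""
    top = max(heights.values())
    return {peer: top - h for peer, h in heights.items() if h < top}


# Offline example with made-up heights. On a live network you would build
# this dict with fetch_height() calls against each peer's REST address.
sample_heights = {"vp0": 95, "vp1": 95, "vp2": 95, "vp3": 29}
print(lagging_peers(sample_heights))
```

An empty result means every peer reports the same height. Equal heights show that the chains match in length, not that the peer is back in consensus, so it is worth checking the peer log as well.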

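    For a real solution environment, how this behaves and how it should be configured will depend heavily on your application and your intended use case. Peer syncing can be achieved at the cost of increased communication between the peers, but the idea is not to keep all peers in sync at all times; it is to make sure that what is written to the blockchain has been agreed on through the consensus process.

    Hope this helps!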

    【Comments】:

    • Thanks! I'll give it a try in a few days. I'll probably go the route of sending 60 blocks to get ahead of vp3.