[Question title]: Why can't a CyclicBarrier be acquired right after the barrier action has executed?
[Posted]: 2020-02-23 00:58:41
[Question description]:

Consider the following code:

public static void main(String[] args) throws InterruptedException {
    CyclicBarrier cb = new CyclicBarrier(3, () -> {
        logger.info("Barrier action starting");
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        logger.info("Barrier action finishing");
    });
    for (int i = 0; i < 6; i++) {
        int counter = i;
        Thread.sleep(100);
        new Thread(() -> {
            try {
                logger.info("Try to acquire barrier for {}", counter);
                cb.await();
                logger.info("barrier acquired for {}", counter);

            } catch (Exception e) {
                e.printStackTrace();
            }

        }).start();
    }
}

I created a barrier of size 3 with a barrier action that takes 5 seconds.

I see the following output:

2019-10-27 15:23:09.393  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : Try to acquire barrier for 0
2019-10-27 15:23:09.492  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : Try to acquire barrier for 1
2019-10-27 15:23:09.593  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Try to acquire barrier for 2
2019-10-27 15:23:09.594  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 15:23:09.693  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : Try to acquire barrier for 3
2019-10-27 15:23:09.794  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : Try to acquire barrier for 4
2019-10-27 15:23:09.897  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Try to acquire barrier for 5
2019-10-27 15:23:14.594  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 15:23:14.595  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2
2019-10-27 15:23:14.595  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 15:23:19.596  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 15:23:19.597  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : barrier acquired for 0
2019-10-27 15:23:19.597  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : barrier acquired for 4
2019-10-27 15:23:19.597  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : barrier acquired for 3
2019-10-27 15:23:19.597  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : barrier acquired for 1
2019-10-27 15:23:19.597  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : barrier acquired for 5

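So we can see that:

  1. The first barrier action runs from 15:23:09 to 15:23:14
  2. The second barrier action runs from 15:23:14 to 15:23:19

But after the first barrier action finished, only one thread was able to log: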

2019-10-27 15:23:14.595  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2

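I expected 3 threads to get past the barrier at about 15:23:14, because the CyclicBarrier size is 3.

Could you explain this behaviour?

P.S.

I ran this code many times and the result is always similar.

P.S. 2

I tried changing the timing a little: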

public static void main(String[] args) throws InterruptedException {
    CyclicBarrier cb = new CyclicBarrier(3, () -> {
        logger.info("Barrier action starting");
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        logger.info("Barrier action finishing");
    });
    for (int i = 0; i < 6; i++) {
        int counter = i;
        Thread.sleep(1000);
        new Thread(() -> {
            try {
                logger.info("Try to acquire barrier for {}", counter);
                cb.await();
                logger.info("barrier acquired for {}", counter);

            } catch (Exception e) {
                e.printStackTrace();
            }

        }).start();
    }
}

And I see the expected result:

2019-10-27 23:22:14.497  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : Try to acquire barrier for 0
2019-10-27 23:22:15.495  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : Try to acquire barrier for 1
2019-10-27 23:22:16.495  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Try to acquire barrier for 2
2019-10-27 23:22:16.496  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 23:22:16.998  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 23:22:16.998  INFO   --- [       Thread-0] my.playground.RemoteServiceFacade        : barrier acquired for 0
2019-10-27 23:22:16.998  INFO   --- [       Thread-2] my.playground.RemoteServiceFacade        : barrier acquired for 2
2019-10-27 23:22:16.998  INFO   --- [       Thread-1] my.playground.RemoteServiceFacade        : barrier acquired for 1
2019-10-27 23:22:17.495  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : Try to acquire barrier for 3
2019-10-27 23:22:18.495  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : Try to acquire barrier for 4
2019-10-27 23:22:19.496  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Try to acquire barrier for 5
2019-10-27 23:22:19.499  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action starting
2019-10-27 23:22:20.002  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : Barrier action finishing
2019-10-27 23:22:20.003  INFO   --- [       Thread-5] my.playground.RemoteServiceFacade        : barrier acquired for 5
2019-10-27 23:22:20.003  INFO   --- [       Thread-3] my.playground.RemoteServiceFacade        : barrier acquired for 3
2019-10-27 23:22:20.003  INFO   --- [       Thread-4] my.playground.RemoteServiceFacade        : barrier acquired for 4

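[Discussion]:

    Tags: java multithreading concurrency cyclicbarrier


    [Solution 1]:

    An interesting question. It isn't trivial, but I'll try to explain it as succinctly as I can.

    Although multithreading doesn't guarantee any kind of execution order, for this answer let's assume that two things happen first:

    1. All threads start at the same time
    2. All threads call barrier.await() at the same time

    In that case you would see output similar to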

    Try to acquire barrier for 0
    Try to acquire barrier for 1
    Try to acquire barrier for 2
    Try to acquire barrier for 3
    Try to acquire barrier for 4
    Try to acquire barrier for 5
    

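    The current state of your 6 threads is as follows:

    1. Threads 0 & 1 are awaiting on the shared Condition, because three threads have not "yet" arrived at the barrier

    2. Thread 2 is still runnable as the barrier's "tripping" thread

    3. Threads 3, 4 & 5 are waiting in the barrier's lock.lock call (the same Lock instance that created the Condition).

    When the barrier sees Thread 2, it has already recorded that 0 & 1 reached the barrier, so it knows this cycle is complete and will release 0 & 1. But before it releases those two threads it has to run the barrierAction, which you defined as sleeping for 5 seconds, so it does that.

    You will then see the output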

    Barrier action starting
    Barrier action finishing
    

    Thread 2 still holds the lock, is RUNNABLE and is ready to exit the barrier, so it does, and you see this output.

    barrier acquired for 2
    

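    But before Thread 2 exits, it signals all the other threads waiting on the current barrier. This is where it gets tricky: the await for 0 & 1 is done on a shared Condition, and that Condition is shared across all barrier "generations". A signalled thread must also re-acquire the lock before its await returns. So even though the first two threads were awaiting before the second set of threads were locking, when a signalAll is done the first-generation threads still have to wait their turn to wake up.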
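To make the lock and Condition structure described above easier to picture, here is a heavily simplified model of a cyclic barrier. It is a sketch only, not the JDK source: the class name `TinyBarrier`, its fields and the `demo` method are hypothetical, and broken-barrier, timeout and interrupt handling are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified model of a cyclic barrier (not the JDK implementation).
public class TinyBarrier {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition trip = lock.newCondition(); // one Condition shared by every generation
    private final int parties;
    private final Runnable barrierAction;
    private int remaining;                    // parties still missing in the current generation
    private Object generation = new Object(); // identity token for the current generation

    public TinyBarrier(int parties, Runnable barrierAction) {
        this.parties = parties;
        this.remaining = parties;
        this.barrierAction = barrierAction;
    }

    public void await() throws InterruptedException {
        lock.lock(); // threads that arrive during the barrier action queue up here
        try {
            Object arrivedIn = generation;
            if (--remaining == 0) {        // the last arrival trips the barrier
                barrierAction.run();       // runs while the lock is still held
                remaining = parties;
                generation = new Object(); // the next generation starts
                trip.signalAll();          // waiters still need the lock before they return
                return;
            }
            while (arrivedIn == generation) {
                trip.await();              // releases the lock while parked
            }
        } finally {
            lock.unlock();
        }
    }

    // Small self-check: three threads meet once, so the action runs once.
    public static int[] demo() throws InterruptedException {
        AtomicInteger actions = new AtomicInteger();
        AtomicInteger passed = new AtomicInteger();
        TinyBarrier barrier = new TinyBarrier(3, actions::incrementAndGet);
        Thread[] threads = new Thread[3];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await();
                    passed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return new int[] {passed.get(), actions.get()};
    }
}
```

The point to notice is that the barrier action runs inside the locked region, so any thread calling `await()` in the meantime parks in `lock.lock()` ahead of the waiters that `signalAll()` later wakes.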

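    At this point we have 5 threads that are either BLOCKED (3, 4 & 5) or TIMED_WAITING (0, 1). (Strictly speaking, Thread.getState() reports WAITING for threads parked in ReentrantLock.lock() or in an untimed Condition.await(); the distinction that matters here is which threads are queued on the Lock and which are waiting on the Condition.) In this example, the time at which they block/wait on the Lock is important. If they all happen in order, the queue for the critical section will be: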

    Thread-0 -> Thread-1 -> Thread-5 -> Thread-4 -> Thread-3 
       |                                               |
      TAIL                                            HEAD
    

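    So the next thread to be released will be Thread-3, then 4, then 5. The queue looks like this because all the threads reached the lock at the same time and all of them queued up. Threads 0 & 1 obviously got there first and so made it into the barrier's critical section, but then they await while Thread 2 comes in to wake them up. By that time 0 & 1 are placed at the end of the queue, and 3, 4 and 5 will be triggered next.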
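The re-queueing described above can be reproduced in isolation with a plain ReentrantLock and Condition. The following is a minimal sketch under stated assumptions: the class name `RequeueDemo` and the labels "early waiter" and "late arrival" are made up for illustration, and the main thread plays the role of the tripping thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: a signalled waiter re-acquires the lock after a thread
// that was already queued on lock.lock() when signalAll() was called.
public class RequeueDemo {
    public static List<String> run() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Condition trip = lock.newCondition();
        List<String> order = new ArrayList<>(); // only modified while holding the lock
        CountDownLatch waiterHoldsLock = new CountDownLatch(1);
        boolean[] released = {false};           // guarded by the lock

        Thread earlyWaiter = new Thread(() -> {
            lock.lock();
            try {
                waiterHoldsLock.countDown();
                while (!released[0]) {
                    trip.awaitUninterruptibly(); // releases the lock while parked
                }
                order.add("early waiter");
            } finally {
                lock.unlock();
            }
        });
        Thread lateArrival = new Thread(() -> {
            lock.lock(); // queues behind the main thread, which holds the lock
            try {
                order.add("late arrival");
            } finally {
                lock.unlock();
            }
        });

        earlyWaiter.start();
        waiterHoldsLock.await();
        lock.lock(); // can only succeed once earlyWaiter has released the lock in await
        try {
            lateArrival.start();
            while (!lock.hasQueuedThread(lateArrival)) {
                Thread.sleep(1); // wait until lateArrival is queued on the lock
            }
            released[0] = true;
            trip.signalAll(); // moves earlyWaiter to the back of the lock queue
        } finally {
            lock.unlock();
        }
        earlyWaiter.join();
        lateArrival.join();
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints [late arrival, early waiter]
    }
}
```

Although the early waiter started waiting first, the late arrival gets the lock first, which mirrors how threads 3, 4 and 5 get ahead of threads 0 and 1 in the question.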

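    When Thread 2 has called signalAll and left the barrier, threads 3 & 4 will run and, because they are part of the second generation, will wait until Thread 5 comes through and triggers the barrier action. This then prints out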

    Barrier action starting
    Barrier action finishing
    

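    Finally, Thread 5 will signalAll again and the rest of the threads will finish waking up.

    In this case you will see Thread 5 complete first, and the rest will follow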

    barrier acquired for 5
    barrier acquired for 0
    barrier acquired for 1
    barrier acquired for 3
    barrier acquired for 4
    

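    [Comments]:

    • Oh, that's a bit complicated. As far as I understand (see my update), the reason for this unexpected behaviour is that some threads arrived during the barrier action
    • @gstackoverflow Yes, by slowing down the threads you let the first generation (threads 0, 1 & 2) reach and exit the barrier before the second-generation threads (3, 4, 5) get queued up on the Lock.
    • So if at least one thread arrives during the barrier action, it means that all minus 1 threads waiting for lock will wait one more iteration for unlock. I'm not sure that this is correct behaviour
    • I agree the behaviour is a bit surprising. "it means that all minus 1 threads waiting for lock will wait one more iteration for unlock": that's right. Technically they are all unlocked, but they have to wait for the generation 2 threads to reach the barrier before the generation 1 threads can exit.
    • @gstackoverflow This behaviour is actually solved by using Phaser. You tell the Phaser "I want to wait for this specific phase (generation)". Threads of one phase are not affected by threads of another phase. It also supports parent and child phases/generations, but that is less important for this question.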
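As a rough illustration of the Phaser idea from the last comment, the sketch below has three threads arrive at a Phaser and then wait for the specific phase they arrived in. The class name `PhaserSketch` and the counters are hypothetical, and overriding `onAdvance` stands in for the barrier action. It only shows the basic calls and does not reproduce the six-thread scenario from the question.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: waiting for one specific phase of a Phaser.
public class PhaserSketch {
    public static int[] run() throws InterruptedException {
        AtomicInteger actions = new AtomicInteger();
        AtomicInteger passed = new AtomicInteger();
        Phaser phaser = new Phaser(3) { // three registered parties
            @Override
            protected boolean onAdvance(int phase, int registeredParties) {
                actions.incrementAndGet(); // plays the role of the barrier action
                return false;              // false keeps the phaser usable for later phases
            }
        };
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                int myPhase = phaser.arrive(); // record arrival and note the phase number
                phaser.awaitAdvance(myPhase);  // wait until exactly that phase has advanced
                passed.incrementAndGet();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return new int[] {passed.get(), actions.get(), phaser.getPhase()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run();
        System.out.println("passed=" + r[0] + " actions=" + r[1] + " phase=" + r[2]);
    }
}
```

Because `awaitAdvance` takes the phase number as an argument, a waiter returns as soon as its own phase has advanced.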