【发布时间】:2021-03-19 16:42:07
【问题描述】:
我创建了一个 Java 类,它每分钟检查是否有心跳信号通过某个队列到达。虽然到目前为止效果很好,但生产中报告了一个奇怪的案例,其中每毫秒或什至在一毫秒内两次在日志中都有一个条目。这持续了大约 40 分钟,直到管理员检测到问题并重新启动应用程序。之后,心跳队列每分钟根据日志再次检查一次,这是预期的行为。
有问题的日志条目看起来(简化)如下:
16-03-2021 01:20:20.098 HEARTBEAT_NOT_OK. No initial heartbeat from server after application startup at 01.01.1970-00:01:09.444 for 18702 day(s) 01:19:10 (days hh:mm:ss). Heartbeat interval 60000 ms.
16-03-2021 01:20:20.099 HEARTBEAT_NOT_OK. No initial heartbeat from server after application startup at 01.01.1970-00:01:09.444 for 18702 day(s) 01:19:10 (days hh:mm:ss). Heartbeat interval 60000 ms.
16-03-2021 01:20:20.099 HEARTBEAT_NOT_OK. No initial heartbeat from server after application startup at 01.01.1970-00:01:09.444 for 18702 day(s) 01:19:10 (days hh:mm:ss). Heartbeat interval 60000 ms.
16-03-2021 01:20:20.100 HEARTBEAT_NOT_OK. No initial heartbeat from server after application startup at 01.01.1970-00:01:09.444 for 18702 day(s) 01:19:10 (days hh:mm:ss). Heartbeat interval 60000 ms.
日志文本总是相同的,我找不到解释。这是应该每分钟检查一次心跳队列的那个 Java 类的简化代码:
@Slf4j
@Component(service = ServiceProvider.class)
public class HeartbeatReceiver implements ServiceProvider {
private static DateFormat DateFormatter = new SimpleDateFormat("dd.MM.yyyy-HH:mm:ss.SSS");
private static long DefaultHeartbeatPeriodInMillis = 60_000;
private Timer timer;
private ReceivedHeartbeat lastHeartbeat = null;
private long heartbeatPeriod = DefaultHeartbeatPeriodInMillis;
private Object heartbeatPeriodLock = new Object();
private volatile long watchdogStartTime;
private Object heartbeatAliveLock = new Object();
@Activate
public void activate() {
startWatchdog();
}
private void startWatchdog() {
watchdogStartTime = System.currentTimeMillis();
startWatchdogTimer();
}
private void startWatchdogTimer() {
timer = new Timer();
timer.scheduleAtFixedRate(new TimerTask() {
@Override
public void run() {
checkHeartbeat();
}
}, delayTillFullMinute(), getHeartbeatPeriod());
}
private long delayTillFullMinute() {
return 0; // omitted for brevity
}
public long getHeartbeatPeriod() {
synchronized (heartbeatPeriodLock) {
return heartbeatPeriod;
}
}
private void checkHeartbeat() {
synchronized (heartbeatAliveLock) {
if (lastHeartbeat == null) { // just started up
long duration = System.currentTimeMillis() - watchdogStartTime;
if(duration >= heartbeatPeriod) {
logNoInitialHeartbeatReceivedAfterStartup(duration);
}
// omitted for brevity
return;
}
// omitted for brevity
}
}
private void logNoInitialHeartbeatReceivedAfterStartup(long duration) {
log.error("HEARTBEAT_NOT_OK. No initial heartbeat from server after application startup at {} for {}. Heartbeat interval {} ms.",
DateFormatter.format(watchdogStartTime), parseTime(duration), heartbeatPeriod);
}
public static String parseTime(long time) {
return "foo"; // omitted for brevity
}
// other code omitted for brevity
}
我的猜测是 OSGi @Activate 注释会导致一些类初始化问题,因为没有正确初始化 var heartbeatPeriod。但随后日志输出应显示日期 01.01.1970-00:00:00.000 而不是 01.01.1970-00:01:09.444。这个奇怪的日期 01.01.1970-00:01:09.444 也让我感到困惑。
【问题讨论】: