【Posted】: 2021-11-16 22:21:35
【Problem description】:
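I want to run a self-managed Thingsboard on AWS (t2.micro).
I installed Thingsboard CE on a t2.micro AWS instance running Ubuntu 20.04 Server.
I followed the AWS setup and Ubuntu install guides (PostgreSQL + built-in queue service).
I also set up HAProxy using this guide.
I was able to log in to my Thingsboard successfully. I only changed the passwords and checked basic functionality; I did not create any new dashboards or make any other modifications.
After that I left the machine on with Thingsboard running. The next day I could not reach Thingsboard, and although the AWS instance was shown as running, I could no longer ssh into it. After stopping and starting the instance (rebooting did not work), everything was fine again (both ssh and Thingsboard were reachable).
I can reproduce the failure by leaving the instance on: after a few hours (5-8 hours) Thingsboard (or something else, I am not sure) seems to fail, and that freezes the whole machine.
I checked two things: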
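-
I checked CPU utilization in AWS monitoring. After a few hours the CPU load climbs sharply and then drops to almost zero. While Thingsboard is running normally, the load is constant. See the screenshot from AWS monitoring.
-
I checked the Thingsboard logs (in /var/log/thingsboard). There are some errors, but unfortunately most of them do not tell me enough to guess what could go wrong on a fresh install. Here are some lines from the log: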
2021-11-12 00:21:59,626 [http-nio-0.0.0.0-8080-exec-13] INFO o.a.coyote.http11.Http11Processor - Error parsing HTTP request header
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
java.lang.IllegalArgumentException: Invalid character found in method name[0x160x030x010x00{0x010x000x00w0x030x030x170xb80xb80xe50xef0x000xb50x0a&0x930x020x00:0xde0xd70xa00xab0xb70x8bU0xc00x92r0x9330x10O0x8c HTTP method names must be tokens
at org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:417)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:261)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1707)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Thread.java:829)
2021-11-12 00:22:01,486 [sql-queue-2-ts-4-thread-1] WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@4393afd0 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2021-11-12 00:22:01,487 [sql-queue-2-ts latest-8-thread-1] WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@75b9496b (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2021-11-12 00:22:01,487 [sql-queue-0-ts latest-6-thread-1] WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@31849eec (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
2021-11-12 00:22:01,487 [sql-queue-0-ts-2-thread-1] WARN com.zaxxer.hikari.pool.PoolBase - HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection@725fafe3 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
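One possible reading of the first error, offered as a guess rather than a diagnosis: the rejected "method name" begins with the bytes 0x16 0x03 0x01, which is how a TLS handshake record starts (content type 22, protocol version 3.1). That would mean some client spoke HTTPS to the plain-HTTP port 8080, something Internet scanners do routinely, and it may be unrelated to the freeze. The Python sketch below decodes the leading `0xNN` escapes from the Tomcat message and checks for that prefix; the function names are invented for this illustration.

```python
import re

# Tomcat prints non-printable request bytes as 0xNN escapes.
ESCAPE_RE = re.compile(r"0x([0-9a-fA-F]{2})")


def leading_bytes(dump: str, limit: int = 3) -> bytes:
    """Decode up to `limit` consecutive 0xNN escapes at the start of `dump`."""
    out = bytearray()
    pos = 0
    while len(out) < limit:
        match = ESCAPE_RE.match(dump, pos)
        if match is None:
            break  # next character is a literal, stop decoding
        out.append(int(match.group(1), 16))
        pos = match.end()
    return bytes(out)


def looks_like_tls_record(prefix: bytes) -> bool:
    """True if the bytes start like a TLS handshake record header."""
    return (
        len(prefix) >= 3
        and prefix[0] == 0x16   # content type 22 = handshake
        and prefix[1] == 0x03   # major version 3 (SSL 3.0 / TLS 1.x)
        and prefix[2] <= 0x04   # minor version 0..4
    )


if __name__ == "__main__":
    # Start of the byte dump copied from the log line above.
    dump = "0x160x030x010x00{0x010x000x00w0x030x03"
    print(looks_like_tls_record(leading_bytes(dump)))
```

If the check is positive, the message is most likely noise from a client using the wrong protocol on port 8080 and not a Thingsboard fault in itself.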
More:
2021-11-12 00:23:46,205 [sql-log-1-thread-1] INFO o.t.s.dao.sql.TbSqlBlockingQueue - Queue-2 [TS Latest] queueSize [9] totalAdded [0] totalSaved [0] totalFailed [0]
2021-11-12 00:23:47,741 [sql-queue-0-ts-2-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08003
2021-11-12 00:23:47,742 [sql-queue-2-ts-4-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08003
2021-11-12 00:23:47,742 [sql-queue-2-ts latest-8-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08003
2021-11-12 00:23:47,742 [sql-queue-0-ts latest-6-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08003
2021-11-12 00:23:48,022 [sql-queue-0-ts-2-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 634223ms.
2021-11-12 00:23:48,058 [sql-queue-0-ts-2-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - This connection has been closed.
2021-11-12 00:23:48,022 [sql-queue-0-ts latest-6-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 634223ms.
2021-11-12 00:23:48,059 [sql-queue-0-ts latest-6-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - This connection has been closed.
2021-11-12 00:23:48,022 [sql-queue-2-ts latest-8-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 624177ms.
2021-11-12 00:23:48,059 [sql-queue-2-ts latest-8-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - This connection has been closed.
2021-11-12 00:23:48,023 [sql-queue-2-ts-4-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - HikariPool-1 - Connection is not available, request timed out after 627819ms.
2021-11-12 00:23:48,059 [sql-queue-2-ts-4-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - This connection has been closed.
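The "request timed out after ...ms" values are easier to judge once converted to minutes. The sketch below pulls them out of log text and converts them; it assumes only the message wording visible in the lines above, and `hikari_wait_times_ms` is a hypothetical helper name.

```python
import re
from typing import List

# Matches the wait time that HikariCP reports in its timeout message.
TIMEOUT_RE = re.compile(r"request timed out after (\d+)ms")


def hikari_wait_times_ms(log_text: str) -> List[int]:
    """Return every reported connection wait time, in milliseconds."""
    return [int(ms) for ms in TIMEOUT_RE.findall(log_text)]


if __name__ == "__main__":
    # Two of the messages quoted above, shortened to the relevant part.
    sample = (
        "HikariPool-1 - Connection is not available, request timed out after 634223ms.\n"
        "HikariPool-1 - Connection is not available, request timed out after 624177ms.\n"
    )
    for ms in hikari_wait_times_ms(sample):
        print(f"{ms} ms is about {ms / 60000:.1f} minutes")
```

Waits of roughly ten minutes suggest the worker threads themselves were stalled for a long time, which would fit a machine that was already close to frozen, though the log alone does not prove that.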
Finally:
2021-11-12 00:33:10,919 [sql-queue-0-ts latest-6-thread-1] ERROR o.t.s.dao.sql.TbSqlBlockingQueue - [TS Latest] Failed to save 1 entities
org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:448)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManager.java:400)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:373)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:574)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:361)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:692)
at org.thingsboard.server.dao.sqlts.insert.latest.psql.PsqlLatestInsertTsRepository$$EnhancerBySpringCGLIB$$381b448c.saveOrUpdate(<generated>)
at org.thingsboard.server.dao.sqlts.SqlTimeseriesLatestDao.lambda$init$3(SqlTimeseriesLatestDao.java:133)
at org.thingsboard.server.dao.sql.TbSqlBlockingQueue.lambda$init$2(TbSqlBlockingQueue.java:71)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:111)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:138)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:276)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:284)
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:246)
at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:83)
at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:184)
at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:402)
... 16 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 634223ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:695)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:108)
... 23 common frames omitted
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:877)
at org.postgresql.jdbc.PgConnection.setNetworkTimeout(PgConnection.java:1610)
at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:560)
at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
... 28 common frames omitted
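Interestingly, the timestamp of the CPU peak does not line up exactly with the error messages in the log. I apologize for the lengthy error messages, but at this point I have no idea what the root cause is.
I have not tried reinstalling the whole machine yet.
My question is: how should I proceed? Has anyone run into a similar problem? Which logs, services, etc. should I check to find the root cause?
Should I try a machine with more resources? Should I try a different database or queue service?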
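On the question of which logs to check, one candidate is the kernel log: if the machine ran out of memory, the kernel's OOM killer usually leaves a trace there. The Python sketch below scans kernel log lines for such entries. It is only a starting point and rests on assumptions: the message wording ("Out of memory: Killed process ...") varies between kernel versions, and the function name is made up for this example.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Assumed wording of the kernel's OOM-killer message; older kernels
# print "Kill process" instead of "Killed process".
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (pid, process name) for every OOM-killer entry found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__" and not sys.stdin.isatty():
    # Reads kernel log lines from standard input.
    for pid, name in find_oom_kills(sys.stdin):
        print(f"OOM killer terminated pid {pid} ({name})")
```

It could be fed with something like `journalctl -k -b -1 | python3 oom_scan.py` (kernel messages from the previous boot; the script file name is arbitrary). Note that after a stop/start the previous boot's journal is only available if persistent journaling is enabled, so `/var/log/kern.log` may be the more reliable input.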
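In its current form this Thingsboard instance is not stable enough even for testing.
Edit: sorry, I could not format the first part of the error log properly. Edit 2: the first link was wrong.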
【Discussion】:
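-
Maybe someone broke in and is mining cryptocurrency on your machine. Which ports does your SG (security group) open to the Internet?
-
I opened all the ports recommended for Thingsboard: TCP 22, 80, 443, 1883 and UDP 5683, plus 8080 so that I can reach the TB web UI. Do you have any suggestion for how to determine whether someone has broken into my machine and whether that is the root cause? I had planned to use a VPN later, but I will set it up as soon as possible.
-
Are your ssh keys anywhere on the Internet? I am talking about GitHub repos, company SharePoint, team files, or an email attachment sent to someone. I have also had problems with Thingsboard freezing, but these error logs are unlike anything I have seen. Other things to check: is this a fresh install, or do you have devices publishing data to the platform? Which database are you using? How large is the volume attached to your instance?
-
@JacksonB The ssh keys are not on the Internet afaik, but I am going to change them anyway. This is a fresh install, and I have not started sending anything to TB yet. I have 1 GB of RAM (I already added the setting described in the user guide for machines with less than 1 GB of memory) and 16 GB of disk space.
-
@Glanthor If the problem persists, I suggest filing a bug report on the Thingsboard GitHub issue tracker: github.com/thingsboard/thingsboard/issues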
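Since the instance has only 1 GB of RAM shared by the Thingsboard JVM, PostgreSQL and HAProxy, one hypothesis worth ruling out is memory exhaustion, which can make a small instance unresponsive even to ssh. The Python sketch below reads `/proc/meminfo` and reports total memory, available memory and configured swap. The helper names are illustrative, and the script only shows the state at the moment it runs, so it would need to be run periodically (for example from cron) to catch a slow decline.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field name: numeric value}."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # memory fields are in kB
    return values


def summarize(info: Dict[str, int]) -> Dict[str, int]:
    """Convert the fields of interest from kB to MiB."""
    return {
        "total_mib": info.get("MemTotal", 0) // 1024,
        "available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_mib": info.get("SwapTotal", 0) // 1024,
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            print(summarize(parse_meminfo(handle.read())))
    except OSError as exc:
        print(f"could not read /proc/meminfo: {exc}")
```

A swap value of 0 together with a small and shrinking available figure would make the out-of-memory explanation more plausible; healthy numbers would point back toward the other suspects raised in the comments.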
Tags: postgresql amazon-ec2 thingsboard