【Title】:Error "GC life time is shorter than transaction duration" while writing to TiDB using Spark
【Posted】:2018-11-13 04:28:11
【Question】:

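I am using Apache Spark to write data in batches, where each batch covers one day. The Spark job fails with the error below. I connect to the TiDB cluster through the MySQL Java connector, and Spark creates 144 parallel write tasks.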

java.sql.SQLException: GC life time is shorter than transaction duration
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3536)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3468)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1957)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2107)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2642)
    at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1610)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.mysql.jdbc.LoadBalancingConnectionProxy.invoke(LoadBalancingConnectionProxy.java:359)
    at com.sun.proxy.$Proxy13.commit(Unknown Source)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:665)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2067)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

【Discussion】:

    Tags: tidb


    【Answer 1】:

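    This error means that the Spark task's transaction ran longer than TiDB's GC life time. In other words:

    the data the transaction is reading has already been removed by TiDB's garbage collection, because it is older than the configured GC life time.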
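
The condition described above can be pictured with a small sketch. This is a simplified model for illustration only, not TiDB's actual implementation: the function name `at_risk_of_gc` and the 10-minute GC life time are assumptions, and the real check also depends on when the garbage collector last ran.

```python
from datetime import datetime, timedelta


def at_risk_of_gc(txn_start: datetime, now: datetime, gc_life_time: timedelta) -> bool:
    """Return True if the transaction has been open longer than the GC life time."""
    return now - txn_start > gc_life_time


start = datetime(2018, 11, 13, 4, 0, 0)
gc_life_time = timedelta(minutes=10)  # assumed value, for illustration only

# A 5-minute transaction is still inside the GC life time.
print(at_risk_of_gc(start, start + timedelta(minutes=5), gc_life_time))
# A 25-minute transaction has outlived it and may hit the error on commit.
print(at_risk_of_gc(start, start + timedelta(minutes=25), gc_life_time))
```

In this model a Spark write task becomes vulnerable as soon as the time between opening its transaction and committing exceeds the GC life time.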

    So one possible fix is to increase tikv_gc_life_time, for example:

    update mysql.tidb set variable_value='30m' where variable_name='tikv_gc_life_time';
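
To choose a value rather than guess one, a rough sketch like the following compares the current setting with the longest transaction you have observed and builds the statement only when the setting is too short. The helper names `parse_go_duration` and `gc_life_time_update`, the 1.5x safety margin, and the restriction to whole `h`/`m`/`s` units are all assumptions made for illustration; they are not part of TiDB or Spark.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_go_duration(text: str) -> timedelta:
    """Parse a simple duration such as '10m0s', '30m' or '1h30m'.

    Only whole-number h/m/s components are handled (a simplification).
    """
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def gc_life_time_update(current: str, longest_txn: timedelta, margin: float = 1.5):
    """Return an UPDATE statement if `current` is too short, otherwise None."""
    needed = longest_txn * margin
    if parse_go_duration(current) >= needed:
        return None
    minutes = -(-int(needed.total_seconds()) // 60)  # round up to whole minutes
    return (
        "update mysql.tidb set variable_value='%dm' "
        "where variable_name='tikv_gc_life_time';" % minutes
    )


# Current setting of 10 minutes, slowest write task observed at 20 minutes.
print(gc_life_time_update("10m0s", timedelta(minutes=20)))
```

A longer GC life time makes TiDB keep old versions around for longer, so it is probably worth raising it only as far as the slowest write task needs.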
    

    See also:

    https://github.com/pingcap/docs/blob/master/op-guide/gc.md#configuration-and-monitor

    【Comments】:
