【问题标题】:interpreting jstack output解释 jstack 输出
【发布时间】:2013-08-10 12:00:52
【问题描述】:

我有一个 java 进程使用 BatchInserter 将大量数据从一堆 .csv 文件加载到 Neo4j 数据库中。我正在使用:

  • OpenJDK 7
  • Ubuntu 12.04
  • Neo4j 2.0 M3

加载前 164 GB(根据ls -lh)后,文件夹大小停止增加,但进程继续运行,没有释放内存,CPU 仍处于 100%(均根据htop)。

加载过程是单线程的,只有 JVM 使用超过 1 个线程 - 我猜是 ParallelGC

我不确定如何诊断此类问题,但被指示尝试jstack,因此将其输出包含在下面。

有人知道出了什么问题,或者对我如何诊断这个问题有建议吗?

Full thread dump OpenJDK 64-Bit Server VM (22.0-b10 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007fc3a4001000 nid=0x5636 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" daemon prio=10 tid=0x00007fcf58123000 nid=0x4545 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007fcf58120800 nid=0x4544 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007fcf5811d800 nid=0x4543 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007fcf5811b800 nid=0x4542 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007fcf580c4800 nid=0x4541 in Object.wait() [0x00007fc3f3cfb000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007fc4165fc708> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00007fc4165fc708> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007fcf580c2000 nid=0x4540 in Object.wait() [0x00007fc3f3dfc000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007fc4165ffe08> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00007fc4165ffe08> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007fcf58007800 nid=0x452c runnable [0x00007fcf606b7000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.String.substring(String.java:1960)
        at java.lang.String.subSequence(String.java:1993)
        at java.util.regex.Pattern.split(Pattern.java:1202)
        at com.ldbc.socialnet.neo4j.CsvFileReader.parseLine(CsvFileReader.java:73)
        at com.ldbc.socialnet.neo4j.CsvFileReader.nextLine(CsvFileReader.java:61)
        at com.ldbc.socialnet.neo4j.CsvFileReader.hasNext(CsvFileReader.java:33)
        at com.ldbc.socialnet.neo4j.CsvFileInserter.bufferLines(CsvFileInserter.java:62)
        at com.ldbc.socialnet.neo4j.CsvFileInserter.insertAllBuffered(CsvFileInserter.java:93)
        at com.ldbc.socialnet.neo4j.LdbcSocialNeworkNeo4jImporter.load(LdbcSocialNeworkNeo4jImporter.java:79)
        at com.ldbc.socialnet.neo4j.LdbcSocialNeworkNeo4jImporter.main(LdbcSocialNeworkNeo4jImporter.java:49)

"VM Thread" prio=10 tid=0x00007fcf580ba000 nid=0x453f runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fcf58012800 nid=0x452d runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fcf58014800 nid=0x452e runnable 

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fcf58016000 nid=0x452f runnable 

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fcf58018000 nid=0x4530 runnable 

"GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fcf5801a000 nid=0x4531 runnable 

"GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fcf5801b800 nid=0x4532 runnable 

"GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fcf5801d800 nid=0x4533 runnable 

"GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fcf5801f800 nid=0x4534 runnable 

"GC task thread#8 (ParallelGC)" prio=10 tid=0x00007fcf58021000 nid=0x4535 runnable 

"GC task thread#9 (ParallelGC)" prio=10 tid=0x00007fcf58023000 nid=0x4536 runnable 

"GC task thread#10 (ParallelGC)" prio=10 tid=0x00007fcf58025000 nid=0x4537 runnable 

"GC task thread#11 (ParallelGC)" prio=10 tid=0x00007fcf58026800 nid=0x4538 runnable 

"GC task thread#12 (ParallelGC)" prio=10 tid=0x00007fcf58028800 nid=0x4539 runnable 

"GC task thread#13 (ParallelGC)" prio=10 tid=0x00007fcf5802a800 nid=0x453a runnable 

"GC task thread#14 (ParallelGC)" prio=10 tid=0x00007fcf5802c800 nid=0x453b runnable 

"GC task thread#15 (ParallelGC)" prio=10 tid=0x00007fcf5802e000 nid=0x453c runnable 

"GC task thread#16 (ParallelGC)" prio=10 tid=0x00007fcf58030000 nid=0x453d runnable 

"GC task thread#17 (ParallelGC)" prio=10 tid=0x00007fcf58032000 nid=0x453e runnable 

"VM Periodic Task Thread" prio=10 tid=0x00007fcf5812d800 nid=0x4546 waiting on condition 

JNI global references: 175

Heap
 PSYoungGen      total 11211840K, used 7045440K [0x00007fcb95000000, 0x00007fcf3d160000, 0x00007fcf55000000)
  eden space 7045440K, 100% used [0x00007fcb95000000,0x00007fcd43050000,0x00007fcd43050000)
  from space 4166400K, 0% used [0x00007fce39510000,0x00007fce39510000,0x00007fcf379d0000)
  to   space 4035328K, 0% used [0x00007fcd43050000,0x00007fcd43050000,0x00007fce39510000)
 PSOldGen        total 31457280K, used 31300002K [0x00007fc415000000, 0x00007fcb95000000, 0x00007fcb95000000)
  object space 31457280K, 99% used [0x00007fc415000000,0x00007fcb8b668b98,0x00007fcb95000000)
 PSPermGen       total 21248K, used 7487K [0x00007fc40aa00000, 0x00007fc40bec0000, 0x00007fc415000000)
  object space 21248K, 35% used [0x00007fc40aa00000,0x00007fc40b14ff48,0x00007fc40bec0000)

特别是,任何人都可以对这些堆值提供一些见解或建议阅读它们吗?

Heap
 PSYoungGen      total 16348096K, used 12660905K [0x00007f833a560000, 0x00007f87639c0000, 0x00007f8765000000)
  eden space 15282432K, 82% used [0x00007f833a560000,0x00007f863f18a688,0x00007f86df1a0000)
  from space 1065664K, 0% used [0x00007f86df1a0000,0x00007f86df1a0000,0x00007f8720250000)
  to   space 1036288K, 0% used [0x00007f87245c0000,0x00007f87245c0000,0x00007f87639c0000)
 ParOldGen       total 34952576K, used 34903343K [0x00007f7ae5000000, 0x00007f833a560000, 0x00007f833a560000)
  object space 34952576K, 99% used [0x00007f7ae5000000,0x00007f833754bd90,0x00007f833a560000)
 PSPermGen       total 21248K, used 7130K [0x00007f7adfe00000, 0x00007f7ae12c0000, 0x00007f7ae5000000)
  object space 21248K, 33% used [0x00007f7adfe00000,0x00007f7ae04f6860,0x00007f7ae12c0000)

【问题讨论】:

    标签: java neo4j deadlock openjdk jstack


    【解决方案1】:

    看起来您的进程几乎内存不足:

    eden space 7045440K, 100% used
    object space 31457280K, 99% used
    

    在这种情况下,GC 可能会持续运行,试图释放一些内存,消耗掉所有 CPU。因此应用程序逻辑线程处于严重饥饿状态。您可以通过 -Xmx JVM 选项添加更多内存或分析应用程序。通过 JVisualVM、Jprofiler 或其他工具进行分析将为您提供重要的运行时数据:

    • 内存分配动态。如果内存消耗一直在增加,您可能会发生内存泄漏
    • 堆内容以找出内存占用的内容
    • GC 统计数据

    根据这些数据,您肯定能够解决问题。

    【讨论】:

    • 谢谢,以后会继续关注的。您知道以下哪些是最需要监控的吗?堆 *** PSYoungGen(总计 16348096K,已使用 12660905K)——伊甸园空间 15282432K,已使用 82%——从空间 1065664K,已使用 0%——到空间 1036288K,已使用 0% *** ParOldGen——总计 34952576K,已使用 34903343K对象空间 34952576K,已使用 99% *** PSPermGen 总计 21248K,已使用 7130K 对象空间 21248K,已使用 33%
    • ParOldGen 对于长期内存消耗监控来说是最重要的。如果进程没有内存泄漏,那么从长远来看,它应该或多或少是稳定的。
    • 再次感谢。仅供参考,我认为这不是内存泄漏。该过程维护了许多“大” 映射(总共约 450,000,000 个条目)。我在选择 Xmx 时只考虑了这些地图,并没有留下太多喘息的空间。现在我想起来了,我还维护了一些缓冲区(String[][])和其他临时状态。现在已经用增加的堆重新启动了进程,并将密切关注 jstack 堆信息。
    猜你喜欢
    • 1970-01-01
    • 2012-09-25
    • 2012-08-28
    • 2023-03-18
    • 2012-05-19
    • 2014-02-17
    • 2011-05-03
    • 2012-02-17
    • 2021-12-18
    相关资源
    最近更新 更多