Extract Text Box Off of Shape in PowerPoint Using Apache POI XSLF: Answers

【Question title】: Extract Text Box Off of Shape in Powerpoint Using Apache POI XSLF
【Posted】: 2022-09-24 23:07:47
【Question】:

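I am using Java and the Apache POI library to parse slides. I can extract shapes and connectors, but I am having trouble extracting the "text" inside each shape. Here is the sample code that gets the shapes, and it works fine.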

    // Needs java.io.FileInputStream, java.awt.geom.Rectangle2D, java.util.List
    // and org.apache.poi.xslf.usermodel.* on the import list.
    XMLSlideShow ppt = new XMLSlideShow(new FileInputStream(file));
    List<XSLFSlide> slide = ppt.getSlides();
    System.out.println("These are the shapes in the presentation: ");
    for (int i = 0; i < slide.size(); i++) {
        List<XSLFShape> listOfShapes = slide.get(i).getShapes();
        for (int j = 0; j < listOfShapes.size(); j++) {
            XSLFShape thisShape = listOfShapes.get(j);
            String thisShapeName = thisShape.getShapeName();
            int thisShapeID = thisShape.getShapeId();
            XSLFShapeContainer thisShapeParent = thisShape.getParent();
            Rectangle2D thisAnchor = thisShape.getAnchor();
            // String textBody = thisShape.???;  // the call I cannot work out
            System.out.println("Name: " + thisShapeName + " ID: " + thisShapeID + " Anchor: " + thisAnchor.toString());
        }
    }

From what I have read about the XSLFTextShape class and elsewhere, I thought I could get the text on each shape simply by writing:

String textOnShape = thisShape.getTextBody();

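But getTextBody does not appear to be an accepted method. I have read the questions and answers on this same problem for Apache POI HSLF, but I am using XSLF (the newer version). I am missing something obvious in the syntax, so if anyone has done this before and has ideas, that would be much appreciated.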

Tags: java apache-poi powerpoint xslf

【Solution 1】:

I eventually figured it out. While iterating, you need to recast the shape object a couple of times, like this:

XSLFShape thisShape = listOfShapes.get(j);
XSLFSimpleShape thisSimpleShape = (XSLFSimpleShape) thisShape;
XSLFTextShape thisTextShape = (XSLFTextShape) thisSimpleShape;
System.out.println(thisTextShape.getText());

This gives you the text that sits on the shape itself.
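One caveat with the unconditional casts: they only succeed when the shape really is an XSLFTextShape. Connectors and pictures are not, so on a deck that contains them the cast would throw a ClassCastException. A minimal sketch of a safer variant follows, assuming a reasonably recent POI release where `getShapes()` returns a `List<XSLFShape>`. The class name `ShapeTextExtractor` and its two helper methods are made up for illustration and are not part of POI; the file path is taken from the first command-line argument as a stand-in for the asker's `file` variable.

```java
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFGroupShape;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

// Hypothetical helper class, not part of Apache POI.
public class ShapeTextExtractor {

    // Adds the text of every text-bearing shape to 'out', descending into groups.
    static void collectText(List<? extends XSLFShape> shapes, List<String> out) {
        for (XSLFShape shape : shapes) {
            if (shape instanceof XSLFTextShape) {
                // Text boxes, auto shapes and placeholders carry text.
                out.add(((XSLFTextShape) shape).getText());
            } else if (shape instanceof XSLFGroupShape) {
                // A group holds child shapes, so recurse into it.
                collectText(((XSLFGroupShape) shape).getShapes(), out);
            }
            // Anything else (connectors, pictures, ...) is skipped here.
        }
    }

    // Returns the text of all text-bearing shapes, in slide order.
    static List<String> extractText(XMLSlideShow ppt) {
        List<String> out = new ArrayList<>();
        for (XSLFSlide slide : ppt.getSlides()) {
            collectText(slide.getShapes(), out);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is assumed to be the path to a .pptx file.
        try (FileInputStream in = new FileInputStream(args[0]);
             XMLSlideShow ppt = new XMLSlideShow(in)) {
            for (String text : extractText(ppt)) {
                System.out.println(text);
            }
        }
    }
}
```

The `instanceof` check replaces the intermediate cast to XSLFSimpleShape, which is not needed to reach `getText()`. Shapes inside tables are not covered by this sketch.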

【Discussion】: