【Question title】: Lightweight XLSX reader in Java
【Posted】: 2018-09-10 16:21:18
【Question】:

Apache POI does solve this problem, but the total size of all its transitive dependencies is close to 13 MB.

  • poi-ooxml-schemas-3.17.jar (6MB)
  • xmlbeans-2.6.0.jar (3MB)
  • poi-3.17.jar (3MB)
  • poi-ooxml-3.17.jar (1.5MB)

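So far I have only found questions about writing to XLSX without Apache POI, plus libraries that are no longer maintained. Any pointers on reading would be really helpful.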

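【Comments】:

  • Downvoters, please leave a comment.
  • The problem is simply that the format is really complex... inside the file it is XML... What is wrong with 13 MiB?
  • The question is what functionality you actually need, and unfortunately you have not said what you use it for... If the size is too large for you, you could help reduce it (by contributing to the project)... The other option is a specialised library that does what you need and perhaps nothing more...
  • @khmarbaise: Reducing the size is not the goal. Apache POI is a long-running project that is not yet finished, so its size will keep growing. But it could be more modular. It would need modules for handling the file systems, one for binary (OLE2) and one for OOXML, and then a separate module for each Office part (Word, Excel, PowerPoint) in each form (binary or OOXML). These could be connected through interfaces across binary and OOXML and/or across the parts, although that too should be modular. But contributors cannot decide this. It is a decision for the project leads.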

Tags: java excel maven apache-poi


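【Answer 1】:

Thanks to Axel Richter's answer, I adapted it into a utility class. It has worked well for me so far and may help anyone who only needs to read simple XLSX files (with nothing complex in them).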

import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;


public class XLSXUtil {

    public static void main(String args[]) throws Exception {
        fetchDataFromExcel();
    }

    public static List<List<String>> fetchDataFromExcel() throws Exception {
        List<List<String>> result = new ArrayList<>();
        XMLEventReader reader = null;
        XMLEvent event = null;
        Attribute attribute = null;
        StartElement startElement = null;
        EndElement endElement = null;
        String characters = null;

        StringBuilder stringValue = new StringBuilder(); //for collecting the characters to complete values
        List<String> sharedStrings = new ArrayList<String>(); //list of shared strings
        Map<String, String> numberFormats = new HashMap<String, String>(); //map of number formats
        List<String> cellNumberFormats = new ArrayList<String>(); //list of cell number formats
        Path source = Paths.get("Sample.xlsx"); //path to the Excel file
        //open the Excel file as a zip filesystem; the cast avoids an ambiguous overload on Java 13+
        FileSystem fs = FileSystems.newFileSystem(source, (ClassLoader) null);

        //get shared strings ==============================================================================
        Path sharedStringsTable = fs.getPath("/xl/sharedStrings.xml");
        reader = XMLInputFactory.newInstance().createXMLEventReader(Files.newInputStream(sharedStringsTable));
        boolean siFound = false;
        while (reader.hasNext()) {
            event = (XMLEvent) reader.next();
            if (event.isStartElement()) {
                startElement = (StartElement) event;
                if (startElement.getName().getLocalPart().equalsIgnoreCase("si")) {
                    //start element of shared string item
                    siFound = true;
                    stringValue = new StringBuilder();
                }
            } else if (event.isCharacters() && siFound) {
                //chars of the shared string item
                characters = event.asCharacters().getData();
                stringValue.append(characters);
            } else if (event.isEndElement()) {
                endElement = (EndElement) event;
                if (endElement.getName().getLocalPart().equalsIgnoreCase("si")) {
                    //end element of shared string item
                    siFound = false;
                    sharedStrings.add(stringValue.toString());
                }
            }
        }
        reader.close();
        //shared strings ==================================================================================

        //get styles, number formats are essential for detecting date / time values =======================
        Path styles = fs.getPath("/xl/styles.xml");
        reader = XMLInputFactory.newInstance().createXMLEventReader(Files.newInputStream(styles));
        boolean cellXfsFound = false;
        while (reader.hasNext()) {
            event = (XMLEvent) reader.next();
            if (event.isStartElement()) {
                startElement = (StartElement) event;
                if (startElement.getName().getLocalPart().equalsIgnoreCase("numFmt")) {
                    //start element of number format
                    attribute = startElement.getAttributeByName(new QName("numFmtId"));
                    String numFmtId = attribute.getValue();
                    attribute = startElement.getAttributeByName(new QName("formatCode"));
                    numberFormats.put(numFmtId, ((attribute != null) ? attribute.getValue() : "null"));
                } else if (startElement.getName().getLocalPart().equalsIgnoreCase("cellXfs")) {
                    //start element of cell format setting
                    cellXfsFound = true;
                } else if (startElement.getName().getLocalPart().equalsIgnoreCase("xf") && cellXfsFound) {
                    //start element of format setting in cell format setting
                    attribute = startElement.getAttributeByName(new QName("numFmtId"));
                    cellNumberFormats.add(((attribute != null) ? attribute.getValue() : "null"));
                }
            } else if (event.isEndElement()) {
                endElement = (EndElement) event;
                if (endElement.getName().getLocalPart().equalsIgnoreCase("cellXfs")) {
                    //end element of cell format setting
                    cellXfsFound = false;
                }
            }
        }
        reader.close();
        //styles ==========================================================================================

        //get sheet data of first sheet ===================================================================
        Path sheet1 = fs.getPath("/xl/worksheets/sheet1.xml");
        reader = XMLInputFactory.newInstance().createXMLEventReader(Files.newInputStream(sheet1));
        boolean rowFound = false;
        boolean cellFound = false;
        boolean cellValueFound = false;
        boolean inlineStringFound = false;
        String cellStyle = null;
        String cellType = null;
        List<String> row = null;
        while (reader.hasNext()) {
            event = (XMLEvent) reader.next();
            if (event.isStartElement()) {
                startElement = (StartElement) event;
                if (startElement.getName().getLocalPart().equalsIgnoreCase("row")) {
                    //start element of row
                    rowFound = true;
                    if (row != null) {
                        result.add(row);
                    }
                    row = new ArrayList<>();
                } else if (startElement.getName().getLocalPart().equalsIgnoreCase("c") && rowFound) {
                    //start element of cell in row
                    cellFound = true;
                    attribute = startElement.getAttributeByName(new QName("t"));
                    cellType = ((attribute != null) ? attribute.getValue() : null);
                    attribute = startElement.getAttributeByName(new QName("s"));
                    cellStyle = ((attribute != null) ? attribute.getValue() : null);
                } else if (startElement.getName().getLocalPart().equalsIgnoreCase("v") && cellFound) {
                    //start element of value in cell
                    cellValueFound = true;
                    stringValue = new StringBuilder();
                } else if (startElement.getName().getLocalPart().equalsIgnoreCase("is") && cellFound) {
                    //start element of inline string in cell
                    inlineStringFound = true;
                    stringValue = new StringBuilder();
                }
            } else if (event.isCharacters() && cellFound && (cellValueFound || inlineStringFound)) {
                //chars of the cell value or the inline string
                characters = event.asCharacters().getData();
                stringValue.append(characters);
            } else if (event.isEndElement()) {
                endElement = (EndElement) event;
                if (endElement.getName().getLocalPart().equalsIgnoreCase("row")) {
                    //end element of row
                    rowFound = false;
                } else if (endElement.getName().getLocalPart().equalsIgnoreCase("c")) {
                    //end element of cell
                    cellFound = false;
                } else if (endElement.getName().getLocalPart().equalsIgnoreCase("v")) {
                    //end element of value
                    cellValueFound = false;
                    String cellValue = stringValue.toString();
                    if ("s".equals(cellType)) {
                        cellValue = sharedStrings.get(Integer.parseInt(cellValue));
                    }
                    row.add(cellValue);
                } else if (endElement.getName().getLocalPart().equalsIgnoreCase("is")) {
                    //end element of inline string
                    inlineStringFound = false;
                    String cellValue = stringValue.toString();
                    row.add(cellValue);
                }
            }
        }
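        if (row != null) {
            //rows are only added when the next row starts, so add the last one here
            result.add(row);
        }
        reader.close();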
        //sheet data ======================================================================================
        fs.close();
        return result;
    }
}
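
The class above collects `numberFormats` and `cellNumberFormats` but never consults them, so date cells come back as raw serial numbers such as 43353.5. Below is a minimal sketch of how those two collections could be put to use. The class `XlsxDates` and its methods are hypothetical names, not part of the answer. It assumes the workbook uses the default 1900 date system, treats built-in format IDs 14-22 and 45-47 as date/time formats, and falls back to a rough heuristic for custom format codes.

```java
import java.time.LocalDateTime;

public class XlsxDates {

    // Day 0 of the 1900 date system. Starting at 1899-12-30 absorbs Excel's
    // fictitious 1900-02-29, so serials from 61 (1900-03-01) onward map correctly.
    private static final LocalDateTime EPOCH = LocalDateTime.of(1899, 12, 30, 0, 0);

    /** Heuristic: does this number format describe a date or a time? */
    public static boolean isDateFormat(int numFmtId, String formatCode) {
        // Built-in date/time formats.
        if ((numFmtId >= 14 && numFmtId <= 22) || (numFmtId >= 45 && numFmtId <= 47)) {
            return true;
        }
        if (formatCode == null) {
            return false;
        }
        // Drop quoted literals and bracketed sections such as [Red] or [$-409],
        // then look for any date/time letter in what remains.
        String stripped = formatCode
                .replaceAll("\"[^\"]*\"", "")
                .replaceAll("\\[[^\\]]*\\]", "");
        return stripped.matches(".*[dmyhsDMYHS].*");
    }

    /** Converts an Excel serial number (days plus day fraction) to a date-time. */
    public static LocalDateTime toDateTime(double serial) {
        long days = (long) Math.floor(serial);
        long seconds = Math.round((serial - days) * 86400);
        return EPOCH.plusDays(days).plusSeconds(seconds);
    }

    public static void main(String[] args) {
        System.out.println(isDateFormat(14, null));                  // built-in date format
        System.out.println(isDateFormat(164, "yyyy-mm-dd hh:mm"));   // custom date format
        System.out.println(isDateFormat(164, "[Red]#,##0.00"));      // plain number format
        System.out.println(toDateTime(43353.5));
    }
}
```

Inside the sheet loop, a numeric cell (no `t` attribute) with a non-null `cellStyle` could look up its format ID via `cellNumberFormats.get(Integer.parseInt(cellStyle))`, fetch the matching code from `numberFormats`, and call `toDateTime` only when `isDateFormat` returns true. Note that the original class stores the literal string "null" when an attribute is missing, so that case would need a guard before parsing.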

【Comments】:
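
One caveat with the class above: Excel omits empty cells from the sheet XML, and the reader appends values in document order, so a row with a blank in column B returns its column C value at index 1. Each `c` element normally carries an `r` attribute holding its reference (for example `C5`), which can be turned into a zero-based column index and used to pad the row. A minimal sketch, where `CellRefs`, `columnIndex`, and `putAt` are hypothetical names that are not part of the answer:

```java
import java.util.ArrayList;
import java.util.List;

public class CellRefs {

    /** Converts the column letters of a reference such as "C5" or "AA10" to a zero-based index. */
    public static int columnIndex(String cellRef) {
        int index = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char ch = Character.toUpperCase(cellRef.charAt(i));
            if (ch < 'A' || ch > 'Z') {
                break; // reached the row number
            }
            index = index * 26 + (ch - 'A' + 1);
        }
        return index - 1;
    }

    /** Stores value at the given column, padding skipped columns with empty strings. */
    public static void putAt(List<String> row, int column, String value) {
        while (row.size() < column) {
            row.add("");
        }
        if (row.size() == column) {
            row.add(value);
        } else {
            row.set(column, value);
        }
    }

    public static void main(String[] args) {
        List<String> row = new ArrayList<>();
        putAt(row, columnIndex("A1"), "id");
        putAt(row, columnIndex("C1"), "name"); // B1 was empty in the sheet
        System.out.println(row); // prints [id, , name]
    }
}
```

In the reader, the `r` attribute could be read next to `t` and `s` at the start of each `c` element, and `row.add(cellValue)` replaced by a call like `putAt(row, columnIndex(ref), cellValue)`.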
