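【Posted】: 2019-06-28 09:40:42
【Question】:
I have a Gradle project in which I use Tika's AutoDetectParser to extract content. When the project is built into a fat jar, AutoDetectParser returns an empty string. I can see that this is because the Parser is not on the runtime classpath. How do I add the Parser to the runtime classpath?
Gradle build file: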
dependencies {
    compile 'org.apache.tika:tika-parsers:1.20'
    testImplementation 'junit:junit:4.12'
}
jar {
    manifest {
        attributes(
            'Main-Class': 'com.superna.tikatest.TikaTestApp'
        )
    }
    from {
        configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
    } {
        exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
    }
}
Code snippet:
Metadata metadata = new Metadata();
AutoDetectParser parser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler();
try (FileInputStream fis = new FileInputStream(localPath.toString());
     BufferedInputStream bis = new BufferedInputStream(fis);
     TikaInputStream stream = TikaInputStream.get(bis)) {
    parser.parse(stream, handler, metadata);
    System.out.println(handler.toString());
}
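To confirm which parsers are actually visible at runtime, a small diagnostic can help. Tika's AutoDetectParser discovers parser implementations through service files named META-INF/services/org.apache.tika.parser.Parser, so one thing worth checking is whether the fat jar still exposes those entries. The sketch below uses only the JDK; the class name ServiceFileCheck and the method listProviders are hypothetical names chosen for illustration and are not part of Tika.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper (not part of Tika): lists the provider class names
// declared in every META-INF/services/<serviceInterface> file on the classpath.
public class ServiceFileCheck {

    static List<String> listProviders(String serviceInterface) {
        List<String> providers = new ArrayList<>();
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ServiceFileCheck.class.getClassLoader();
        }
        try {
            Enumeration<URL> urls = cl.getResources("META-INF/services/" + serviceInterface);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files allow '#' comments; strip them.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            providers.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        List<String> parsers = listProviders("org.apache.tika.parser.Parser");
        System.out.println("Parser service entries found: " + parsers.size());
        parsers.forEach(System.out::println);
    }
}
```

Running this once from the IDE or Gradle classpath and once from the fat jar (for example by calling listProviders from TikaTestApp) allows the two lists to be compared. A much shorter or empty list in the fat jar would suggest that the service files were lost or overwritten while the dependency jars were merged.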
【Discussion】:
Tags: java gradle apache-tika