使用Apache Tika运行mvn clean软件包时,超出了GC开销限制

时间:2020-08-31 20:17:47

标签: java maven out-of-memory apache-tika

我正在使用Apache Tika将MultiPart文件解析为纯文本,并且我认为运行mvn clean package时会引起问题。我不断遇到内存不足错误。 我使用的是Tika 1.24.1版,但我尝试使其与较旧的版本一起使用,但无法。 我也曾尝试关闭fileStream之类的fileStream.close(),但遇到相同的错误。

错误的堆栈跟踪为

[ERROR] GC overhead limit exceeded -> [Help 1]
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.StringCoding$StringEncoder.encode (StringCoding.java:300)
    at java.lang.StringCoding.encode (StringCoding.java:344)
    at java.lang.String.getBytes (String.java:918)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.FallbackZipEncoding.encode (FallbackZipEncoding.java:80)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ZipArchiveOutputStream.getName (ZipArchiveOutputStream.java:1523)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ZipArchiveOutputStream.writeLocalFileHeader (ZipArchiveOutputStream.java:994)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ZipArchiveOutputStream.putArchiveEntry (ZipArchiveOutputStream.java:754)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ZipArchiveOutputStream.addRawArchiveEntry (ZipArchiveOutputStream.java:555)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ScatterZipOutputStream.writeTo (ScatterZipOutputStream.java:116)
    at org.codehaus.plexus.archiver.commonscompress.archivers.zip.ParallelScatterZipCreator.writeTo (ParallelScatterZipCreator.java:209)
    at org.codehaus.plexus.archiver.zip.AbstractZipArchiver.close (AbstractZipArchiver.java:736)
    at org.codehaus.plexus.archiver.AbstractArchiver.createArchive (AbstractArchiver.java:920)
    at org.apache.maven.plugin.assembly.archive.archiver.AssemblyProxyArchiver.createArchive (AssemblyProxyArchiver.java:445)
    at org.apache.maven.plugin.assembly.archive.DefaultAssemblyArchiver.createArchive (DefaultAssemblyArchiver.java:181)
    at org.apache.maven.plugin.assembly.mojos.AbstractAssemblyMojo.execute (AbstractAssemblyMojo.java:484)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)

这是我的pom

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.github.proyeception</groupId>
    <artifactId>benito</artifactId>
    <version>0.0.0</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.6.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <kotlin.version>1.3.72</kotlin.version>
        <junit4.version>4.13</junit4.version>
        <junit5.version>5.6.2</junit5.version>
        <jackson.version>2.10.1</jackson.version>
        <poi.version>3.10-FINAL</poi.version>

        <!-- compiler -->
        <java.compiler.version>1.8</java.compiler.version>
        <kotlin.compiler.incremental>true</kotlin.compiler.incremental>
        <jetty.version>9.4.18.v20190429</jetty.version>
    </properties>

    <dependencies>
        <!-- Kotlin test -->
        <dependency>
            <groupId>io.kotlintest</groupId>
            <artifactId>kotlintest</artifactId>
            <version>2.0.5</version>
            <scope>test</scope>
        </dependency>
        <!-- Kotlin test -->

        <!-- Kotlin -->
        <dependency>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-stdlib</artifactId>
            <version>${kotlin.version}</version>
        </dependency>

        <dependency>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-test</artifactId>
            <version>${kotlin.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-reflect</artifactId>
            <version>${kotlin.version}</version>
        </dependency>

        <dependency>
            <groupId>org.jetbrains.kotlinx</groupId>
            <artifactId>kotlinx-coroutines-core</artifactId>
            <version>1.3.8</version>
        </dependency>
        <!-- Kotlin -->

        <!-- Mockito -->
        <dependency>
            <groupId>com.nhaarman</groupId>
            <artifactId>mockito-kotlin</artifactId>
            <version>1.6.0</version>
            <scope>test</scope>
        </dependency>
        <!-- Mockito -->

        <!-- Logging -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.21</version>
        </dependency>

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.2.3</version>
        </dependency>
        <!-- Logging -->

        <!-- Spring -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-starter-tomcat</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jetty</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-freemarker</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-web</artifactId>
        </dependency>
        <!-- Spring -->

        <!-- Jetty -->
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-server</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-servlet</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-servlets</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-webapp</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-util</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.jetty</groupId>
            <artifactId>jetty-jmx</artifactId>
            <version>${jetty.version}</version>
        </dependency>
        <!-- Jetty -->

        <!-- Servlet -->
        <dependency>
            <groupId>javax.servlet</groupId>
            <artifactId>javax.servlet-api</artifactId>
            <version>3.1.0</version>
        </dependency>
        <!-- Servlet -->

        <!-- Joda -->
        <dependency>
            <groupId>joda-time</groupId>
            <artifactId>joda-time</artifactId>
            <version>2.9.4</version>
        </dependency>
        <!-- Joda -->

        <!-- Guava -->
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>19.0</version>
        </dependency>
        <!-- Guava -->

        <!-- Apache Commons -->
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.4</version>
        </dependency>
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-collections4</artifactId>
            <version>4.1</version>
        </dependency>
        <dependency>
            <groupId>commons-collections</groupId>
            <artifactId>commons-collections</artifactId>
            <version>3.2.2</version>
        </dependency>
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.5</version>
        </dependency>
        <dependency>
            <groupId>commons-validator</groupId>
            <artifactId>commons-validator</artifactId>
            <version>1.6</version>
        </dependency>
        <!-- Apache Commons -->

        <!-- Apache POI -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-scratchpad</artifactId>
            <version>4.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml-schemas</artifactId>
            <version>4.0.0</version>
        </dependency>
        <!-- Apache POI -->

        <!-- Apache TIKA -->
        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-core</artifactId>
            <version>1.20</version>
        </dependency>

        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-parsers</artifactId>
            <version>1.20</version>
        </dependency>
        <!-- Apache TIKA -->

        <!-- Lombok -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>1.18.12</version>
            <scope>provided</scope>
        </dependency>
        <!-- Lombok -->

        <!-- Jackson -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-afterburner</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.datatype</groupId>
            <artifactId>jackson-datatype-joda</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-annotations</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.dataformat</groupId>
            <artifactId>jackson-dataformat-smile</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.datatype</groupId>
            <artifactId>jackson-datatype-jsr310</artifactId>
            <version>${jackson.version}</version>
        </dependency>

        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-kotlin</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <!-- Jackson -->

        <!-- Apache -->
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.2</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-api</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-text</artifactId>
            <version>1.3</version>
        </dependency>
        <!-- Apache -->

        <!-- Arrow -->
        <dependency>
            <groupId>io.arrow-kt</groupId>
            <artifactId>arrow-core</artifactId>
            <version>0.10.5</version>
        </dependency>
        <!-- Arrow -->

        <!-- Typesafe Config -->
        <dependency>
            <groupId>com.typesafe</groupId>
            <artifactId>config</artifactId>
            <version>1.4.0</version>
        </dependency>
        <!-- Typesafe Config -->

        <!-- File upload -->
        <dependency>
            <groupId>commons-fileupload</groupId>
            <artifactId>commons-fileupload</artifactId>
            <version>1.4</version>
        </dependency>
        <!-- File upload -->

        <!-- Oauth library -->
        <dependency>
            <groupId>com.github.scribejava</groupId>
            <artifactId>scribejava-core</artifactId>
            <version>7.0.0</version>
        </dependency>

        <dependency>
            <groupId>com.github.scribejava</groupId>
            <artifactId>scribejava-apis</artifactId>
            <version>7.0.0</version>
        </dependency>
        <!-- Oauth library -->
    </dependencies>

    <build>
        <finalName>${artifactId}</finalName>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.22.2</version>
                <configuration>
                    <argLine>-Xmx256m -Xms128m -XX:MaxPermSize=500M -XX:MaxPermSize=256m -XX:PermSize=128m</argLine>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.6</version>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals> <goal>single</goal> </goals>
                        <configuration>
                            <archive>
                                <manifest>
                                    <mainClass>com.github.proyeception.benito.Benito</mainClass>
                                </manifest>
                            </archive>
                            <descriptorRefs>
                                <descriptorRef>jar-with-dependencies</descriptorRef>
                            </descriptorRefs>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.jetbrains.kotlin</groupId>
                <artifactId>kotlin-maven-plugin</artifactId>
                <version>${kotlin.version}</version>
                <executions>
                    <execution>
                        <id>compile</id>
                        <goals> <goal>compile</goal> </goals>
                        <configuration>
                            <sourceDirs>
                                <sourceDir>${project.basedir}/src/main/kotlin</sourceDir>
                                <sourceDir>${project.basedir}/src/main/java</sourceDir>
                            </sourceDirs>
                        </configuration>
                    </execution>
                    <execution>
                        <id>test-compile</id>
                        <goals> <goal>test-compile</goal> </goals>
                        <configuration>
                            <sourceDirs>
                                <sourceDir>${project.basedir}/src/test/kotlin</sourceDir>
                                <sourceDir>${project.basedir}/src/test/java</sourceDir>
                            </sourceDirs>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <executions>
                    <!-- Replacing default-compile as it is treated specially by maven -->
                    <execution>
                        <id>default-compile</id>
                        <phase>none</phase>
                    </execution>
                    <!-- Replacing default-testCompile as it is treated specially by maven -->
                    <execution>
                        <id>default-testCompile</id>
                        <phase>none</phase>
                    </execution>
                    <execution>
                        <id>java-compile</id>
                        <phase>compile</phase>
                        <goals> <goal>compile</goal> </goals>
                    </execution>
                    <execution>
                        <id>java-test-compile</id>
                        <phase>test-compile</phase>
                        <goals> <goal>testCompile</goal> </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.2.0</version>
                <configuration>
                    <excludes>
                        <exclude>**/logback.xml</exclude>
                        <exclude>**/*.properties</exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

这是我调用解析器的部分

fun saveFile(projectId: String, name: String, file: MultipartFile) {
        var fileStream = file.inputStream
        val content = documentParser.parse(fileStream)

        println(content)

        val driveId = documentService.saveFile(projectId = projectId, name = name, file = file)
        medusaClient.saveFile(projectId, name, driveId, content)
    }

这是使用Tika的解析器

    open fun parse(input: InputStream): String {

        val handler = BodyContentHandler(-1)
        val metadata = Metadata()
        parser.parse(input, handler, metadata)
        return handler.toString()

    }

我已经以各种可能的方式在这个问题上进行了搜索,但是找不到答案。另外,如果我没有提供足够的信息,我也很抱歉,这是我在网站上遇到的第一个问题,请告诉我是否应该显示更多代码。

0 个答案:

没有答案