【Question title】: PDF Clown: highlighting multiple search words fails for PDFs containing images, colored text, and complex diagrams
【Posted】: 2018-01-11 21:36:40
【Question description】:

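I am using PDF Clown to highlight multiple search words in PDF documents. For many PDFs that contain color images, complex diagrams, or colored text, PDF Clown throws an exception and fails to highlight the matched words. The code below works for plain or simple PDFs.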

This is the PDF I used for testing: https://drive.google.com/file/d/0B-nuOO6Zsa4rXy1DS2JjX1RnYmM/view?usp=sharing

    public void searchWordInPdf(DocumentMetadata documentMetadata , String searchWord,
                                HttpServletResponse response)throws IOException{
        try {
            byte[] bytes = null;
            org.pdfclown.files.File file =null;
            if (documentMetadata.getProject().getFtpId() != null && documentMetadata.getProject().getFtpId() > 0) {
                FtpServer ftpServer = ftpServerService.getFtpServer(documentMetadata.getProject().getFtpId());

                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                retrievePdfFile(ftpServer, bos, documentMetadata.getFilePath());
                bytes = bos.toByteArray();
                 file = new org.pdfclown.files.File(bytes);
            }else{
                 file = new org.pdfclown.files.File(documentMetadata.getFilePath());
            }
            List<String> matchList = new ArrayList<String>();
            //Pattern regex = Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");
            Pattern regex = Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");
            Matcher regexMatcher = regex.matcher(searchWord);
            while (regexMatcher.find()) {
                if (regexMatcher.group(1) != null) {
                    // Add double-quoted string without the quotes
                    matchList.add(regexMatcher.group(1));
                } else if (regexMatcher.group(2) != null) {
                    // Add single-quoted string without the quotes
                    matchList.add(regexMatcher.group(2));
                } else {
                    // Add unquoted word
                    matchList.add(regexMatcher.group());
                }
            }

            for (String key : matchList){
            Pattern pattern = Pattern.compile(key, Pattern.CASE_INSENSITIVE);
            // 2. Iterating through the document pages...
            TextExtractor textExtractor = new TextExtractor(true, true);
            for (final Page page : file.getDocument().getPages()) {
                System.out.println("\nScanning page " + (page.getIndex() + 1) + "...\n");

                // 2.1. Extract the page text!
                Map<Rectangle2D, List<ITextString>> textStrings = textExtractor.extract(page);
                // 2.2. Find the text pattern matches!
                final Matcher matcher = pattern.matcher(TextExtractor.toString(textStrings));

                // 2.3. Highlight the text pattern matches!
                textExtractor.filter(
                    textStrings,
                    new TextExtractor.IIntervalFilter() {
                        @Override
                        public boolean hasNext() {
                            if (matcher.find()) {
                                //count++;
                                return true;
                            }
                            return false;
                        }

                        @Override
                        public Interval<Integer> next() {
                            return new Interval<Integer>(matcher.start(), matcher.end());
                        }

                        @Override
                        public void process(
                            Interval<Integer> interval,
                            ITextString match
                        ) {
                            Rectangle2D textBox = null;
                            // Defining the highlight box of the text pattern match...
                            List<Quad> highlightQuads = new ArrayList<Quad>();
                            {
                        /*
                            NOTE: A text pattern match may be split across multiple contiguous lines,
                             so we have to define a distinct highlight box for each text chunk.
                        */

                                for (TextChar textChar : match.getTextChars()) {
                                    Rectangle2D textCharBox = textChar.getBox();
                                    if (textBox == null) {
                                        textBox = (Rectangle2D) textCharBox.clone();
                                    } else {
                                        if (textCharBox.getY() > textBox.getMaxY()) {
                                            highlightQuads.add(Quad.get(textBox));
                                            textBox = (Rectangle2D) textCharBox.clone();
                                        } else {
                                            textBox.add(textCharBox);
                                        }
                                    }
                                }
                                highlightQuads.add(Quad.get(textBox));
                            }
                            // Highlight the text pattern match!
                            new TextMarkup(page, highlightQuads, null, MarkupTypeEnum.Highlight);
                        }

                        @Override
                        public void remove() {
                            throw new UnsupportedOperationException();
                        }
                    }
                );
            }
        }
                String contentType = getContentType(documentMetadata.getFileName());
                if (contentType == null) {
                    contentType = "binary/octet-stream";
                }
            response.setStatus(HttpStatus.OK.value());
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            if(output != null){
                file.save(output, SerializationModeEnum.Standard );
                bytes =  org.springframework.security.crypto.codec.Base64.encode(output.toByteArray());
                response.addHeader("Content-Disposition", "attachment; filename=" + documentMetadata.getFileName());
                response.addHeader("Content-Type", contentType);
                response.getOutputStream().write(bytes);
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Here is the stack trace:

 java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(TimSort.java:777)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:439)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at org.pdfclown.tools.TextExtractor.sort(TextExtractor.java:675)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:306)
at nu.optimise.projectweb.service.DocumentMetadataService.searchWordInPdf(DocumentMetadataService.java:2669)
at nu.optimise.projectweb.service.DocumentMetadataService$$FastClassBySpringCGLIB$$fc6434c2.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at nu.optimise.projectweb.aop.logging.LoggingAspect.logAround(LoggingAspect.java:51)
at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
at nu.optimise.projectweb.service.DocumentMetadataService$$EnhancerBySpringCGLIB$$c3a15a18.searchWordInPdf(<generated>)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource.searchContentPDF(DocumentMetadataResource.java:1026)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource$$FastClassBySpringCGLIB$$bb12eea8.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at nu.optimise.projectweb.aop.logging.LoggingAspect.logAround(LoggingAspect.java:51)
at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at com.ryantenney.metrics.spring.TimedMethodInterceptor.invoke(TimedMethodInterceptor.java:48)
at com.ryantenney.metrics.spring.TimedMethodInterceptor.invoke(TimedMethodInterceptor.java:34)
at com.ryantenney.metrics.spring.AbstractMetricMethodInterceptor.invoke(AbstractMetricMethodInterceptor.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource$$EnhancerBySpringCGLIB$$bfe48b3d.searchContentPDF(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:832)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:743)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:961)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:895)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:967)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:858)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:843)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at com.codahale.metrics.servlet.AbstractInstrumentedFilter.doFilter(AbstractInstrumentedFilter.java:104)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.boot.actuate.autoconfigure.EndpointWebMvcAutoConfiguration$ApplicationContextHeaderFilter.doFilterInternal(EndpointWebMvcAutoConfiguration.java:281)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.boot.actuate.trace.WebRequestTraceFilter.doFilterInternal(WebRequestTraceFilter.java:115)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:115)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:112)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:169)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at nu.optimise.projectweb.security.jwt.JWTFilter.doFilter(JWTFilter.java:43)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:121)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:66)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:106)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:87)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:528)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1099)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:670)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1520)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1476)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)

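【Comments】:

  • If it throws an exception, why not include it along with its stack trace? And if this does not happen for all PDFs, why not share a sample so the issue can be reproduced?
  • I have shared the sample PDF file I used.
  • I have added the exception that occurs when searching for a word in the attached PDF.
  • Great! I'll look into it later; I currently only have a smartphone at hand.

Tags: java pdf pdfclown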


【Solution 1】:

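This is a bug in PDF Clown: during text extraction it uses a custom Comparator implementation that does not fully honor the Comparator contract. In older Java versions this went unnoticed, but in Java 8 this results in the exception at hand. If Java is instructed to use the legacy sorting algorithm, the program runs without an exception.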

The comparator

This is the comparator in question:

  /**
    Text string position comparator.
   */
  private static class TextStringPositionComparator
    implements Comparator<ITextString>
  {
    /**
      Gets whether the specified boxes lay on the same text line.
    */
    public static boolean isOnTheSameLine(
      Rectangle2D box1,
      Rectangle2D box2
      )
    {
      /*
        NOTE: In order to consider the two boxes being on the same line,
        we apply a simple rule of thumb: at least 25% of a box's height MUST
        lay on the horizontal projection of the other one.
      */
      double minHeight = Math.min(box1.getHeight(), box2.getHeight());
      double yThreshold = minHeight * .75;
      return ((box1.getY() > box2.getY() - yThreshold
          && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
        || (box2.getY() > box1.getY() - yThreshold
          && box2.getY() < box1.getMaxY() + yThreshold - minHeight));
    }

    @Override
    public int compare(
      ITextString textString1,
      ITextString textString2
      )
    {
      Rectangle2D box1 = textString1.getBox();
      Rectangle2D box2 = textString2.getBox();
      if(isOnTheSameLine(box1,box2))
      {
        /*
          [FIX:55:0.1.3] In order not to violate the transitive condition, equivalence on x-axis
          MUST fall back on y-axis comparison.
        */
        int xCompare = Double.compare(box1.getX(), box2.getX());
        if(xCompare != 0)
          return xCompare;
      }
      return Double.compare(box1.getY(), box2.getY());
    }
  }

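As the comment [FIX:55:0.1.3] ... indicates, the author had already run into sorting problems. Unfortunately, he fixed only one of the troublesome cases.

The isOnTheSameLine test used in compare can, in general, lead to non-transitivity. Consider three ITextString instances A, B, and C placed from left to right, where B sits slightly higher than A and C slightly higher than B.

(This can happen in regular text, e.g. a line that starts with some text in subscript, continues in normal script, and ends in superscript.)

A and B, as well as B and C, will be considered to be on the same line, but A and C will not. The first two pairs are therefore compared by their x coordinates, while the last pair is compared by its y coordinates, which results in non-transitivity: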

  • A < B
  • B < C
  • A > C (PDF Clown uses y coordinates that increase downwards).
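
The three comparisons above can be reproduced outside PDF Clown. The following is a minimal sketch that copies the comparison rule from the quoted comparator onto plain `Rectangle2D` boxes; the class name `NonTransitiveDemo` and the box coordinates are made up for illustration and are not taken from the sample PDF.

```java
import java.awt.geom.Rectangle2D;

/** Applies the quoted comparison rule to plain boxes (no PDF Clown types involved). */
final class NonTransitiveDemo {
  // Same rule of thumb as the quoted isOnTheSameLine method.
  static boolean isOnTheSameLine(Rectangle2D box1, Rectangle2D box2) {
    double minHeight = Math.min(box1.getHeight(), box2.getHeight());
    double yThreshold = minHeight * .75;
    return (box1.getY() > box2.getY() - yThreshold
          && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
        || (box2.getY() > box1.getY() - yThreshold
          && box2.getY() < box1.getMaxY() + yThreshold - minHeight);
  }

  // Same ordering as the quoted compare method, applied to the boxes directly.
  static int compare(Rectangle2D box1, Rectangle2D box2) {
    if (isOnTheSameLine(box1, box2)) {
      int xCompare = Double.compare(box1.getX(), box2.getX());
      if (xCompare != 0) {
        return xCompare;
      }
    }
    return Double.compare(box1.getY(), box2.getY());
  }

  // Hypothetical boxes (x, y, width, height); y grows downwards.
  static final Rectangle2D A = new Rectangle2D.Double(0, 10, 15, 10);  // subscript, lowest
  static final Rectangle2D B = new Rectangle2D.Double(20, 5, 15, 10);  // normal script
  static final Rectangle2D C = new Rectangle2D.Double(40, 0, 15, 10);  // superscript, highest

  public static void main(String[] args) {
    // A/B and B/C count as "same line" and are ordered by x; A/C is ordered by y.
    System.out.println("A vs B: " + Integer.signum(compare(A, B))); // negative: A < B
    System.out.println("B vs C: " + Integer.signum(compare(B, C))); // negative: B < C
    System.out.println("A vs C: " + Integer.signum(compare(A, C))); // positive: A > C
  }
}
```

With these boxes, each neighbouring pair is 5 units apart vertically (inside the 7.5 unit threshold), while A and C are 10 units apart, so the third comparison falls through to the y axis and contradicts the first two.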

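The identity condition may be violated as well. Consider two ITextString instances A and B that have the same box, i.e. the same dimensions and the same position (for example, overlapping letters used to build a symbol). compare will return 0 for them, which should only happen when an object is compared with an object equal to it ("should", because this is only a recommendation, not a strict requirement).

In most cases, though, the comparator does sort the text pieces in an order that is considered correct.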

Workaround

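Older built-in Java sorting algorithms (the legacy merge sort) did not test whether a Comparator implementation fulfilled its contract. The result might not be properly sorted, but sorting did not throw an exception. (Routines called later that assume the array is sorted might fail, though.)

Java 8, however, uses a different default sorting algorithm (TimSort, the default since Java 7), which performs some sanity checks that detect certain effects of an unfulfilled Comparator contract on the sorting process.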

But by using the command-line JRE argument

-Djava.util.Arrays.useLegacyMergeSort=true

you can tell Java 8 to use the legacy sorting method, which does not fail with an exception.
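
If changing the JVM command line is inconvenient, the same system property can in principle be set from code. The following is only a sketch under an assumption: as far as I know, the JDK reads this property once, the first time an object array is sorted, so the call has to run before any such sort happens in the JVM (for example as the very first statement of `main`). The class name `LegacySortSwitch` is hypothetical. In a container-managed web application like the one in the question, the command-line argument is likely the more reliable choice.

```java
/** Sketch: opt into the legacy merge sort from code instead of the command line. */
final class LegacySortSwitch {
  static final String PROPERTY = "java.util.Arrays.useLegacyMergeSort";

  /**
   * Sets the system property and reports whether it now reads as true.
   * Assumption: this only affects sorting if it runs before the JDK
   * first looks the property up.
   */
  static boolean enableLegacyMergeSort() {
    System.setProperty(PROPERTY, "true");
    return Boolean.getBoolean(PROPERTY);
  }

  public static void main(String[] args) {
    System.out.println(PROPERTY + " = " + enableLegacyMergeSort());
  }
}
```

Either way, this only suppresses the exception; the comparator itself stays inconsistent, so the extracted text may still come out in an unexpected order in the cases described above.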
