【Question title】: Struggling with logic to split text into an array
【Posted】: 2020-07-16 01:47:33
【Question】:

After some processing, I have text like this:

18:50 - 13:10(+1 day) Trip duration 12h20 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) KL0646 Inflight services: Lowest fare USD 862 View flight details 17:27 - 13:10(+1 day) Trip duration 13h43 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) Lowest fare USD 862 View flight details  12:00 - 13:10(+1 day) Trip duration 19h10 New York (JFK) 2 x transfer Includes travel by bus Maastricht (ZYT) Inflight services: FromUSD 864 View flight details

I want to split the text like this:

18:50 - 13:10(+1 day) Trip duration 12h20 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) KL0646 Inflight services: Lowest fare USD 862 View flight details 
17:27 - 13:10(+1 day) Trip duration 13h43 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) Lowest fare USD 862 View flight details 
12:00 - 13:10(+1 day) Trip duration 19h10 New York (JFK) 2 x transfer Includes travel by bus Maastricht (ZYT) Inflight services: FromUSD 864 View flight details

where each line is one element of the array.

I know that methods such as string.split(); can split a string on a delimiter, but I don't see how to apply that here. Using string.split(" - "); does not split it in the right places.

Suppose the string looked like this:

18:50 - 13:10(+1 day) Some Text # 17:27 - 13:10(+1 day) Some Text # 12:00 - 13:10(+1 day) Some Text......

Then string.split("#"); would do the job.

So how can I get the format I want from the given text?

【Comments】:

    Tags: java regex string text format


    【Solution 1】:

    You can split on whitespace followed by a positive lookahead that asserts the start of the next time-range substring:

    String input = "18:50 - 13:10(+1 day) Trip duration 12h20 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) KL0646 Inflight services: Lowest fare USD 862 View flight details 17:27 - 13:10(+1 day) Trip duration 13h43 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) Lowest fare USD 862 View flight details  12:00 - 13:10(+1 day) Trip duration 19h10 New York (JFK) 2 x transfer Includes travel by bus Maastricht (ZYT) Inflight services: FromUSD 864 View flight details";
    String[] durations = input.split("\\s+(?=\\d{2}:\\d{2} - \\d{2}:\\d{2})");
    for (String duration : durations) {
        System.out.println(duration);
    }
    

    This prints:

    18:50 - 13:10(+1 day) Trip duration 12h20 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) KL0646 Inflight services: Lowest fare USD 862 View flight details
    17:27 - 13:10(+1 day) Trip duration 13h43 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) Lowest fare USD 862 View flight details
    12:00 - 13:10(+1 day) Trip duration 19h10 New York (JFK) 2 x transfer Includes travel by bus Maastricht (ZYT) Inflight services: FromUSD 864 View flight details
    

    As long as a time range appears only at the very start of each substring (and every substring always starts with one), this approach should work.
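The same split can be wrapped in a small reusable helper. The following is only a sketch under stated assumptions: the class name `FlightTextSplitter` and the method `splitRecords` are hypothetical and not part of the original answer, and the trimming and empty-element filtering are additions for convenience.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; names are illustrative and not from the original answer.
class FlightTextSplitter {
    // One or more whitespace characters, followed (via lookahead, so not consumed)
    // by "HH:mm - HH:mm", which marks the start of the next record.
    private static final Pattern RECORD_BOUNDARY =
            Pattern.compile("\\s+(?=\\d{2}:\\d{2} - \\d{2}:\\d{2})");

    static List<String> splitRecords(String text) {
        return Arrays.stream(RECORD_BOUNDARY.split(text.trim()))
                .map(String::trim)            // drop stray whitespace around each record
                .filter(s -> !s.isEmpty())    // an empty input yields an empty list
                .collect(Collectors.toList());
    }
}
```

Calling `FlightTextSplitter.splitRecords(input)` with the string from the question should return a list of three records. Precompiling the pattern avoids recompiling the regex on every call, which `String.split` would otherwise do for a pattern like this.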

    【Comments】:

      【Solution 2】:

      You can use a regular expression:

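      // Requires: import java.util.Arrays;
      // Zero-width lookahead: split right before each "HH:mm - HH:mm" time range.
      String regex = "(?=\\d\\d:\\d\\d - \\d\\d:\\d\\d)";
      
      // `text` holds the input string from the question.
      String[] split = text.split(regex);
      Arrays.asList(split).forEach(System.out::println);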
      

      Output:

      18:50 - 13:10(+1 day) Trip duration 12h20 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) KL0646 Inflight services: Lowest fare USD 862 View flight details 
      17:27 - 13:10(+1 day) Trip duration 13h43 New York (JFK) 1 x transfer Includes travel by bus Maastricht (ZYT) Lowest fare USD 862 View flight details  
      12:00 - 13:10(+1 day) Trip duration 19h10 New York (JFK) 2 x transfer Includes travel by bus Maastricht (ZYT) Inflight services: FromUSD 864 View flight details
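
Because the lookahead consumes no characters, each element above keeps the whitespace that preceded the next time range. As an alternative to `split`, the record starts can be located with a `Matcher` and the text sliced between them. This is a different technique from the answer's, shown only as a sketch: the names `FlightRecordScanner` and `scanRecords` are hypothetical, and unlike the split it drops any text that appears before the first time range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not from the original answer.
class FlightRecordScanner {
    // Matches the "HH:mm - HH:mm" time range that opens each record.
    private static final Pattern RECORD_START =
            Pattern.compile("\\d{2}:\\d{2} - \\d{2}:\\d{2}");

    static List<String> scanRecords(String text) {
        // Collect the index at which each record begins.
        List<Integer> starts = new ArrayList<>();
        Matcher m = RECORD_START.matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        // Slice from each start to the next start (or to the end of the text).
        // Text before the first match is ignored.
        List<String> records = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : text.length();
            records.add(text.substring(starts.get(i), end).trim());
        }
        return records;
    }
}
```

`FlightRecordScanner.scanRecords(text)` returns the records already trimmed, which may be convenient if the elements are later compared or stored.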
      

      【Comments】:
