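【Question title】: Split and add the string based on length
【Posted】: 2019-08-16 12:59:58
【Question】:

I have a paragraph as an input string. I am trying to split the paragraph into an array of strings, where each element contains only whole sentences and is no longer than 250 characters.

I tried to split the string on a delimiter (the period) and collected all the resulting sentences in a list. Using a StringBuilder, I am now trying to append the sentences to one another based on length (250 characters).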

    List<String> list = new ArrayList<String>();

    String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

    Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
            Pattern.MULTILINE | Pattern.COMMENTS);

    Matcher reMatcher = re.matcher(text);
    while (reMatcher.find()) {
        list.add(reMatcher.group());
    }
    String textDelimted[] = new String[list.size()];
    textDelimted = list.toArray(textDelimted);

    StringBuilder stringB = new StringBuilder(100);

    for (int i = 0; i < textDelimted.length; i++) {
        while (stringB.length() + textDelimted[i].length() < 250)
            stringB.append(textDelimted[i]);

        System.out.println("!#@#$%" +stringB.toString());
    }
}

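Expected result:

[0] : Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance.

[1] : Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant.

[2] : Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle.

[3] : Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows.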

【Comments】:

Tags: java string list stringbuilder


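【Solution 1】:

I think you need to modify the loop slightly: use an if/else instead of the while, and start a new StringBuilder whenever the next sentence no longer fits. With that change, my results match your expected output.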

import java.util.List;
import java.util.ArrayList;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MyClass {
    public static void main(String args[]) {

        List<String> list = new ArrayList<String>();

        String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

        Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
                Pattern.MULTILINE | Pattern.COMMENTS);

        Matcher reMatcher = re.matcher(text);
        while (reMatcher.find()) {
            list.add(reMatcher.group());
        }
        String textDelimted[] = new String[list.size()];
        textDelimted = list.toArray(textDelimted);

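        StringBuilder stringB = new StringBuilder(300);

        for (int i = 0; i < textDelimted.length; i++) {
            // +1 leaves room for the space that separates sentences within a chunk
            if (stringB.length() + textDelimted[i].length() + 1 <= 250) {
                if (stringB.length() > 0) {
                    stringB.append(' ');
                }
                stringB.append(textDelimted[i]);
            } else {
                // The chunk is full: emit it and start a new one with the current sentence
                System.out.println("!#@#$%" + stringB.toString());
                stringB = new StringBuilder(300);
                stringB.append(textDelimted[i]);
            }

        }
        // Emit the last, partially filled chunk
        System.out.println("!#@#$%" + stringB.toString());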
    }
}

To collect the results in a list, replace the println calls with this code:

ArrayList<String> arrlist = new ArrayList<String>(5);
..
arrlist.add(stringB.toString());
..
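
As a rough, self-contained sketch of the same idea, the grouping can be wrapped in a method that returns the list directly. The class name SentenceChunker and the method chunkSentences are hypothetical names chosen for illustration, and the sentence splitter is a simplified stand-in (split after ., ! or ? followed by whitespace) rather than the regex from the question. It also assumes that a single sentence longer than maxLen is kept whole in its own chunk.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceChunker {

    // Hypothetical helper: packs whole sentences into chunks of at most maxLen characters.
    static List<String> chunkSentences(String text, int maxLen) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        // Simplified splitter (assumption): break after '.', '!' or '?' followed by whitespace.
        for (String sentence : text.trim().split("(?<=[.!?])\\s+")) {
            // One extra character for the space that rejoins sentences inside a chunk.
            int extra = current.length() == 0 ? 0 : 1;
            if (current.length() > 0 && current.length() + extra + sentence.length() > maxLen) {
                // The sentence does not fit: close the current chunk and start a new one.
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(sentence);
        }
        // Keep the last, partially filled chunk.
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "One two three. Four five six! Seven eight? Nine ten.";
        for (String chunk : chunkSentences(text, 30)) {
            System.out.println(chunk.length() + ": " + chunk);
        }
    }
}
```

If an array is needed instead of a list, chunkSentences(text, 250).toArray(new String[0]) converts the result. Applied to the paragraph from the question with maxLen = 250, this should group the sentences 4, 5, 4 and 2 to a chunk, as in the expected result.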

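【Comments】:

  • How can I add the output to an array or a list?
【Solution 2】:

Your question is unclear; please try rephrasing it to make clear what you are asking.

That said, I assume that "I tried to split the string based on a delimiter (as .). Converted all the strings to a list" means you want to split the String wherever "." occurs and convert the result to a List<String>. That can be done like this: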

String input = "hello.world.with.delimiters";
String[] words = input.split("\\.");  // String[] with contents {"hello", "world", "with", "delimiters"}
List<String> list = Arrays.asList(words);  // Identical contents, just in a List<String>


// if you want to append to a StringBuilder based on length
StringBuilder sb = new StringBuilder();
for (String s : list) {
    if (someLengthCondition(s.length())) sb.append(s);  // append the current element, not the whole list
}

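Of course, your implementation of someLengthCondition() will depend on what you want. I can't provide one, because it is hard to understand what you are trying to do.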
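
For illustration only, here is one way the snippet above could be made runnable. The class SplitOnDot, the helper fits and the limit of 12 characters are all assumptions made up for this sketch; they stand in for someLengthCondition() and are not something the question specifies. The sketch also shows why the dot has to be escaped: String.split takes a regular expression, and an unescaped . matches every character.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitOnDot {

    // Hypothetical length condition: accept a piece only if the builder stays within maxLen.
    static boolean fits(StringBuilder sb, String piece, int maxLen) {
        return sb.length() + piece.length() <= maxLen;
    }

    public static void main(String[] args) {
        String input = "hello.world.with.delimiters";

        // An unescaped "." matches every character, so no tokens survive the split.
        System.out.println(input.split(".").length);  // prints 0

        // Escaping the dot, or quoting it with Pattern.quote, splits on literal periods.
        List<String> words = Arrays.asList(input.split("\\."));
        List<String> same = Arrays.asList(input.split(Pattern.quote(".")));
        System.out.println(words.equals(same));  // prints true

        // Append pieces only while the assumed limit of 12 characters is respected.
        StringBuilder sb = new StringBuilder();
        for (String s : words) {
            if (fits(sb, s, 12)) {
                sb.append(s);
            }
        }
        System.out.println(sb);
    }
}
```

Note that split removes the delimiter, so the periods would have to be appended again if the pieces are meant to remain sentences.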

【Comments】:

  • Did you test this code yourself? It won't produce the result you expect, because . is a special character in a regular-expression context.