【Question title】: Preserving the punctuation after converting it to pig latin
【Posted】: 2021-09-16 12:25:09
【Question description】:

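Currently, I can translate English words into pig latin. My lab assignment says that punctuation appearing before a word should be removed, stored, and added to the front of the pig-latinized word. Punctuation appearing after a word should be removed, stored, and appended to the pig-latinized word. Any punctuation in the middle of a word is treated as an ordinary letter.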

For example:

  • What? -> atWhay?
  • Oh!!! -> Ohway!!!
  • "hello" -> "ellohay"
  • don't -> on'tday
  • "pell-mell" -> "ell-mellpay"

This is what I have right now to find and store the punctuation:

public static final String punct = ",./;:'\"?<>[]{}|`~!@#$%^&*()";

String startPunct = "";
String endPunct = "";

for (int c = 0; c < s.length(); c++) {
   for (int i = 0; i < punct.length(); i++) {
      if (s.charAt(c) == punct.charAt(i)) {
         startPunct = startPunct + s.charAt(c);
      }
   }  
}

In case it is needed, here is the basic idea of how I build the translated word:

s = s.substring(i) + s.substring(0, i) + "ay";

return s;

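So the question is: how do I preserve the punctuation so that it appears at the start and end of the translated word (preferably with recursion, but a regular expression is fine)?

Any help is much appreciated. Thanks in advance.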

【Comments】:

    Tags: java regex recursion


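    【Solution 1】:

    In my opinion, some problems are suited to recursion, but your task is not one of them. Hence the code below uses regular expressions.

    Notes follow the code.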

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    /**
     * Converts an English word into pig latin. Algorithm follows.
     * <ol>
     * <li>All initial consonants are moved to the end of the word and <i>ay</i> is appended, for
     * example <i>what</i> becomes <i>atwhay</i></li>
     * <li>For words that begin with a vowel, <i>way</i> is appended to the word for example <i>oh</i>
     * becomes <i>ohway</i>.</li>
     * </ol>
     * Additional stipulations include the following.
     * <ol>
     * <li>Initial punctuation and terminal punctuation are unchanged in the converted word, for
     * example if the original word ends with a question mark then the converted word also ends with a
     * question mark meaning that <i>what?</i> becomes <i>atwhay?</i></li>
     * <li>Case sensitivity is preserved.</li> 
     * </ol>
     */
    public class PigLatin {
        private static final String  VOWELS = "aeiou";
    
        private static int getIndexOfFirstVowelInWord(String word) {
            int index = -1;
            if (word != null  &&  !word.isBlank()) {
                word = word.strip();
                word = word.toLowerCase();
                char[] letters = word.toCharArray();
                for (int i = 0; i < letters.length; i++) {
                    if (VOWELS.indexOf(letters[i]) >= 0) {
                        index = i;
                        break;
                    }
                }
            }
            return index;
        }
    
        /**
         * First method invoked when this class launched via <tt>java</tt> command. Recognizes a single
         * command argument which is the word to be converted into pig latin.
         * 
         * @param args - <tt>java</tt> command arguments.
         */
        public static void main(String[] args) {
            if (args.length == 0) {
                System.out.println("ARGS: word");
            }
            else {
                System.out.printf("Word: ^%s^%n", args[0]);
                Pattern pattern = Pattern.compile("^([!?\"'():;,.-]*)(\\w+[!?\"'():;,.-]*\\w+)([!?\"'():;,.-]*)$");
                Matcher matcher = pattern.matcher(args[0]);
                if (matcher.matches()) {
                    String initial = matcher.group(1);
                    String word = matcher.group(2);
                    word = word.strip();
                    String terminal = matcher.group(3);
                    int index = getIndexOfFirstVowelInWord(word);
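                    if (index == 0) {
                        // Word starts with a vowel: append "way".
                        word += "way";
                    }
                    else if (index > 0) {
                        // Move the leading consonants to the end and append "ay".
                        String suffix = word.substring(0, index);
                        word = word.substring(index);
                        word += suffix;
                        word += "ay";
                    }
                    else {
                        // No vowel found (index == -1), e.g. "hmm": substring(0, -1)
                        // would throw, so just append "ay".
                        word += "ay";
                    }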
                    String result = initial + word + terminal;
                    System.out.println("Result: " + result);
                }
                else {
                    System.out.println("No match.");
                }
            }
        }
    }
    

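    I test for the punctuation marks that commonly appear in prose, namely the following.

    • Exclamation mark
    • Question mark
    • Double quote
    • Single quote
    • Parentheses
    • Colon
    • Semicolon
    • Comma
    • Period
    • Dash (hyphen)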
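    The list above is narrower than the punct constant in the question. As a rough check, the sketch below lists the characters from the question's constant that the pattern's character class does not cover. The class name PunctuationCoverage and the constant names are made up for illustration; the two strings are copied from the question and from the pattern in the code above.

```java
// Hypothetical helper, not part of the PigLatin class above.
class PunctuationCoverage {
    // Copied from the question's punct constant.
    static final String QUESTION_PUNCT = ",./;:'\"?<>[]{}|`~!@#$%^&*()";
    // The marks inside the character class [!?"'():;,.-] used by the pattern.
    static final String ANSWER_PUNCT = "!?\"'():;,.-";

    /** Returns the characters of QUESTION_PUNCT that the pattern does not treat as punctuation. */
    static String uncovered() {
        StringBuilder sb = new StringBuilder();
        for (char c : QUESTION_PUNCT.toCharArray()) {
            if (ANSWER_PUNCT.indexOf(c) < 0) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Not covered by the pattern: " + uncovered());
    }
}
```

    If the assignment needs those extra marks as well, they would have to be added to the character class (escaping the ones that are special inside a class, such as the square brackets and the caret).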

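    The regular expression contains three groups.

    • The first group is the leading punctuation.
    • The second group is the actual word, which may contain embedded punctuation.
    • The third group is the trailing punctuation.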
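    To see the three groups in isolation, here is a minimal sketch that applies the same pattern and returns the groups as an array. The class PunctuationGroups and the method split are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PigLatin class above.
class PunctuationGroups {
    // Same character class as in the pattern above, factored out for readability.
    private static final String PUNCT = "[!?\"'():;,.-]*";
    private static final Pattern TOKEN =
            Pattern.compile("^(" + PUNCT + ")(\\w+" + PUNCT + "\\w+)(" + PUNCT + ")$");

    /** Returns {leading, word, trailing}, or null when the token does not match. */
    static String[] split(String token) {
        Matcher m = TOKEN.matcher(token);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        for (String token : new String[] { "What?", "\"pell-mell\"", "don't" }) {
            System.out.println(token + " -> " + Arrays.toString(split(token)));
        }
    }
}
```

    One thing this makes visible: the second group requires at least two word characters, so a single-letter word such as "a" does not match the pattern at all and split returns null for it.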

    We only need to process the second group. The processing algorithm is described in the class comments in the code above.
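    As a sketch of that processing step on its own, the conversion of the second group could be pulled out into a small method. The class PigLatinCore and the method convert are hypothetical names; the handling of a word without any vowel (just append "ay") is an assumption, since the class comments do not cover that case.

```java
// Hypothetical helper; mirrors the conversion logic used in main above.
class PigLatinCore {
    private static final String VOWELS = "aeiou";

    /** Converts a bare word (leading and trailing punctuation already removed) to pig latin. */
    static String convert(String word) {
        int index = firstVowel(word);
        if (index == 0) {
            return word + "way";   // vowel-initial word: append "way"
        }
        if (index < 0) {
            return word + "ay";    // assumption: no vowel at all, e.g. "hmm"
        }
        // Move the leading consonants to the end, then append "ay".
        return word.substring(index) + word.substring(0, index) + "ay";
    }

    /** Index of the first vowel (case-insensitive), or -1 if there is none. */
    private static int firstVowel(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (VOWELS.indexOf(Character.toLowerCase(word.charAt(i))) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(convert("what"));      // atwhay
        System.out.println(convert("pell-mell")); // ell-mellpay
    }
}
```

    Embedded punctuation needs no special handling here: it is simply carried along with the letters around it.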

    I tested the code with all the sample words in your question and got the expected result for each of them.
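    To repeat that check without launching the program once per word, the same logic can be wrapped in a method and run over several inputs. This is only a sketch: the class PigLatinExamples and the method translate are hypothetical names, and returning an unmatched token unchanged is an assumption (the program above prints "No match." instead).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the same pattern and conversion rules as above.
class PigLatinExamples {
    private static final Pattern TOKEN = Pattern.compile(
            "^([!?\"'():;,.-]*)(\\w+[!?\"'():;,.-]*\\w+)([!?\"'():;,.-]*)$");

    static String translate(String token) {
        Matcher m = TOKEN.matcher(token);
        if (!m.matches()) {
            return token; // assumption: leave unmatched tokens unchanged
        }
        String word = m.group(2);
        // Find the first vowel, ignoring case.
        int i = 0;
        while (i < word.length()
                && "aeiou".indexOf(Character.toLowerCase(word.charAt(i))) < 0) {
            i++;
        }
        String core;
        if (i == 0) {
            core = word + "way";                                  // starts with a vowel
        } else if (i == word.length()) {
            core = word + "ay";                                   // assumption: no vowel found
        } else {
            core = word.substring(i) + word.substring(0, i) + "ay";
        }
        // Re-attach the stored leading and trailing punctuation.
        return m.group(1) + core + m.group(3);
    }

    public static void main(String[] args) {
        String[] samples = { "What?", "Oh!!!", "\"hello\"", "don't", "\"pell-mell\"" };
        for (String s : samples) {
            System.out.println(s + " -> " + translate(s));
        }
    }
}
```

    Splitting a whole sentence on whitespace and passing each token to translate would extend this to more than one word, although that goes beyond what the code above was tested with.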

    【Comments】:
