Preserving the punctuation after converting it to pig latin: answers

【Question title】: Preserving the punctuation after converting it to pig latin
【Posted】: 2021-09-16 12:25:09
【Question description】:

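Currently, I can translate English words into pig latin. My lab assignment says that punctuation appearing before a word should be removed, stored, and added to the front of the piglatinized word. Punctuation appearing after a word should be removed, stored, and appended to the piglatinized word. Any punctuation in the middle of a word is treated as an ordinary letter.

For example:

What? -> atWhay?
Oh!!! -> Ohway!!!
"hello" -> "ellohay"
Don't -> on'tDay
"pell-mell" -> "ell-mellpay"

This is what I have so far for finding and storing the punctuation: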

public static final String punct = ",./;:'\"?<>[]{}|`~!@#$%^&*()";

String startPunct = "";
String endPunct = "";

for (int c = 0; c < s.length(); c++) {
   for (int i = 0; i < punct.length(); i++) {
      if (s.charAt(c) == punct.charAt(i)) {
         startPunct = startPunct + s.charAt(c);
      }
   }  
}

In case it is needed, this is the basic idea of how I build the translated word:

s = s.substring(i) + s.substring(0, i) + "ay";

return s;

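So the question is: how do I preserve the punctuation so that it appears at the beginning and end of the translated word (preferably with recursion, but a regular expression is fine)?

Any help is greatly appreciated. Thanks in advance.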

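【Question discussion】:

Tags: java regex recursion

【Solution 1】:

In my opinion, some problems lend themselves to recursion, but your task is not one of them. The code below therefore uses regular expressions.

Notes follow the code.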

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Converts an English word into pig latin. Algorithm follows.
 * <ol>
 * <li>All initial consonants are moved to the end of the word and <i>ay</i> is appended, for
 * example <i>what</i> becomes <i>atwhay</i></li>
 * <li>For words that begin with a vowel, <i>way</i> is appended to the word for example <i>oh</i>
 * becomes <i>ohway</i>.</li>
 * </ol>
 * Additional stipulations include the following.
 * <ol>
 * <li>Initial punctuation and terminal punctuation are unchanged in the converted word, for
 * example if the original word ends with a question mark then the converted word also ends with a
 * question mark meaning that <i>what?</i> becomes <i>atwhay?</i></li>
 * <li>Case sensitivity is preserved.</li> 
 * </ol>
 */
public class PigLatin {
    private static final String  VOWELS = "aeiou";

    private static int getIndexOfFirstVowelInWord(String word) {
        int index = -1;
        if (word != null  &&  !word.isBlank()) {
            word = word.strip();
            word = word.toLowerCase();
            char[] letters = word.toCharArray();
            for (int i = 0; i < letters.length; i++) {
                if (VOWELS.indexOf(letters[i]) >= 0) {
                    index = i;
                    break;
                }
            }
        }
        return index;
    }

    /**
     * First method invoked when this class is launched via the {@code java} command. Recognizes a
     * single command argument, which is the word to be converted into pig latin.
     * 
     * @param args - {@code java} command arguments.
     */
    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("ARGS: word");
        }
        else {
            System.out.printf("Word: ^%s^%n", args[0]);
            Pattern pattern = Pattern.compile("^([!?\"'():;,.-]*)(\\w+[!?\"'():;,.-]*\\w+)([!?\"'():;,.-]*)$");
            Matcher matcher = pattern.matcher(args[0]);
            if (matcher.matches()) {
                String initial = matcher.group(1);
                String word = matcher.group(2);
                word = word.strip();
                String terminal = matcher.group(3);
                int index = getIndexOfFirstVowelInWord(word);
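                if (index == 0) {
                    word += "way";
                }
                else if (index < 0) {
                    // No a, e, i, o or u found (for example "why"). Assumption: keep the
                    // letters in place and append "ay", instead of calling substring(0, -1),
                    // which would throw StringIndexOutOfBoundsException.
                    word += "ay";
                }
                else {
                    String suffix = word.substring(0, index);
                    word = word.substring(index);
                    word += suffix;
                    word += "ay";
                }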
                String result = initial + word + terminal;
                System.out.println("Result: " + result);
            }
            else {
                System.out.println("No match.");
            }
        }
    }
}

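I tested for the punctuation marks commonly found in prose, including the following.

exclamation mark
question mark
double quote
single quote
parentheses
colon
semicolon
comma
period
dash (hyphen)

The regular expression contains three groups.

The first group is the leading punctuation.
The second group is the actual word, which may contain embedded punctuation.
The third group is the trailing punctuation.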
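
To make the three groups concrete, here is a minimal sketch that reuses the same pattern and prints what each group captures. The class name `PunctuationGroups` and the helper `split` are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PunctuationGroups {
    // Same punctuation set as the pattern in the answer; the hyphen comes last
    // inside the character class so that it is taken literally.
    private static final String PUNCT = "[!?\"'():;,.-]*";
    private static final Pattern WORD =
            Pattern.compile("^(" + PUNCT + ")(\\w+" + PUNCT + "\\w+)(" + PUNCT + ")$");

    /** Returns {leading, word, trailing}, or null when the input does not match. */
    static String[] split(String input) {
        Matcher m = WORD.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        for (String s : new String[] { "What?", "\"pell-mell\"", "Don't", "Oh!!!" }) {
            System.out.println(s + " -> " + Arrays.toString(split(s)));
        }
    }
}
```

Note that the second group requires at least two word characters and allows only one embedded run of punctuation, so a single-letter input such as `a`, or a word such as `mother-in-law`, does not match this pattern.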

Only the second group needs processing. The processing algorithm is described in the class comment in the code above.
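
The processing of the second group can also be isolated into a small method, which makes it easier to test on its own. The following is only a sketch: the class name `PigLatinWord` and the method `convert` are hypothetical, and the handling of words that contain none of a, e, i, o, u (append "ay" without moving any letters) is an assumption that is not stated in the assignment.

```java
class PigLatinWord {
    private static final String VOWELS = "aeiou";

    /** Converts a bare word (no surrounding punctuation) to pig latin. */
    static String convert(String word) {
        // Find the first vowel, ignoring case; embedded punctuation is skipped
        // over like any other non-vowel character.
        int index = -1;
        for (int i = 0; i < word.length(); i++) {
            if (VOWELS.indexOf(Character.toLowerCase(word.charAt(i))) >= 0) {
                index = i;
                break;
            }
        }
        if (index == 0) {
            return word + "way";   // word starts with a vowel
        }
        if (index < 0) {
            return word + "ay";    // assumption: no vowel at all, nothing to move
        }
        // Move everything before the first vowel to the end, then append "ay".
        return word.substring(index) + word.substring(0, index) + "ay";
    }

    public static void main(String[] args) {
        for (String w : new String[] { "what", "oh", "Don't", "pell-mell" }) {
            System.out.println(w + " -> " + convert(w));
        }
    }
}
```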

I tested the code with all of the example words from your question and got the expected result for each one.
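
Since the question mentions recursion as the preferred approach, here is a rough sketch of how the punctuation could be peeled off recursively, for comparison only. It is not part of the tested code above. The class `RecursivePigLatin` and its methods are hypothetical names, the punctuation string is the `punct` constant from the question, and the no-vowel case is again handled by an assumed rule.

```java
class RecursivePigLatin {
    // Punctuation set taken from the question. It has no hyphen, which is
    // harmless here because only the two ends of the string are inspected.
    private static final String PUNCT = ",./;:'\"?<>[]{}|`~!@#$%^&*()";
    private static final String VOWELS = "aeiou";

    static String translate(String s) {
        if (s.isEmpty()) {
            return s;                                   // nothing left to convert
        }
        char first = s.charAt(0);
        if (PUNCT.indexOf(first) >= 0) {
            // Peel one leading mark, convert the rest, put the mark back in front.
            return first + translate(s.substring(1));
        }
        char last = s.charAt(s.length() - 1);
        if (PUNCT.indexOf(last) >= 0) {
            // Peel one trailing mark, convert the rest, append the mark again.
            return translate(s.substring(0, s.length() - 1)) + last;
        }
        return convertWord(s);                          // both ends are clean
    }

    private static String convertWord(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (VOWELS.indexOf(Character.toLowerCase(word.charAt(i))) >= 0) {
                return i == 0
                        ? word + "way"
                        : word.substring(i) + word.substring(0, i) + "ay";
            }
        }
        return word + "ay";                             // assumption: no vowel found
    }

    public static void main(String[] args) {
        for (String s : new String[] { "What?", "Oh!!!", "\"hello\"", "Don't", "\"pell-mell\"" }) {
            System.out.println(s + " -> " + translate(s));
        }
    }
}
```

Each recursive call removes exactly one character, so the recursion depth is bounded by the number of leading and trailing punctuation marks. Punctuation in the middle of the word is never examined and therefore stays where it is.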

【Discussion】: