You can use the java.util.regex.Pattern class with a regular expression that matches every single character, and replace each match with an asterisk.
// Requires: import java.util.ArrayList; import java.util.regex.Pattern;
ArrayList<String> wordList = new ArrayList<String>();
wordList.add("foo");
wordList.add("carrots");
String message = "The foo bar message about carrots";
// Matches every single character (DOTALL lets the dot match line terminators too)
Pattern p = Pattern.compile(".", Pattern.DOTALL);
// Builds the new message from the words (banned ones replaced with asterisks)
StringBuilder newMessage = new StringBuilder();
// Loop through each space-separated word
for (String word : message.split(" ")) {
    // If it is in your list...
    if (wordList.contains(word)) {
        // ...append it with every character replaced by an asterisk
        newMessage.append(p.matcher(word).replaceAll("*"));
    } else {
        // Otherwise append the word unmodified
        newMessage.append(word);
    }
    // Add a space before we loop to the next word
    newMessage.append(" ");
}
// Trim the trailing space and store the message with the banned words replaced
message = newMessage.toString().trim();
System.out.println(message);
When run on the message "The foo bar message about carrots", it prints the following text:
The *** bar message about *******
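To isolate the masking step: `p.matcher(word).replaceAll("*")` emits one asterisk per matched character. The sketch below puts that call next to an alternative that gives the same result for plain ASCII words. The class and method names (`MaskDemo`, `maskWithRegex`, `maskWithRepeat`) are hypothetical and only for illustration, and `String.repeat` assumes Java 11 or later.

```java
import java.util.regex.Pattern;

class MaskDemo {
    // Matches every single character, including line terminators (DOTALL).
    private static final Pattern ANY_CHAR = Pattern.compile(".", Pattern.DOTALL);

    // Hypothetical helper: replaces each matched character with one asterisk.
    static String maskWithRegex(String word) {
        return ANY_CHAR.matcher(word).replaceAll("*");
    }

    // Hypothetical helper: builds the mask from the word length (Java 11+).
    static String maskWithRepeat(String word) {
        return "*".repeat(word.length());
    }

    public static void main(String[] args) {
        System.out.println(maskWithRegex("carrots"));
        System.out.println(maskWithRepeat("carrots"));
    }
}
```

Both calls should print seven asterisks for "carrots". The regex version is the one used in the answer; the length-based version avoids the regex engine when only the length matters.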
Update - sample code that replaces banned words with asterisks
// Requires imports: java.util.ArrayList, java.util.HashMap, java.util.HashSet, java.util.Map,
// java.util.Set, java.util.regex.Matcher, java.util.regex.Pattern
public static void main(String[] args) {
    // Your input string
    String message = "The foo bar message about carrots. Carrots suck so do parrots. Parrotsucker is partially masked. Carrots was already replaced.";
    System.out.println(message);
    // The list of words you want to mask
    ArrayList<String> wordList = new ArrayList<String>();
    wordList.add("foo");
    wordList.add("carrots");
    wordList.add("parrots");
    // Create a regex to match the banned words; in this case it will be "foo|carrots|parrots", case insensitive
    String regex = String.join("|", wordList);
    Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    System.out.println("Regex: " + p);
    // Keep track of the asterisk strings by length so we don't generate any of them more than once
    Map<Integer, String> maskMap = new HashMap<Integer, String>();
    // replace() handles every occurrence of a match at once, so the same text can come up again;
    // track the ones that have already been handled so we can skip them
    Set<String> replaced = new HashSet<String>();
    // Find the banned words in the input message
    Matcher m = p.matcher(message);
    // Loop over each of the matches
    while (m.find()) {
        // Get the text of each match
        String match = m.group();
        // Have we already replaced it in the message?
        if (!replaced.contains(match)) {
            // This is what we will replace it with
            String mask = null;
            // See if we have a mask the same length as the current match
            if (maskMap.containsKey(match.length())) {
                // If so, get it out of the map
                mask = maskMap.get(match.length());
                System.out.println("Got mask from maskMap: " + mask);
            } else {
                // No mask yet, so generate one and save it in the map
                StringBuilder maskBuffer = new StringBuilder("*");
                while (maskBuffer.length() < match.length()) {
                    maskBuffer.append("*");
                }
                mask = maskBuffer.toString();
                maskMap.put(mask.length(), mask);
                System.out.println("Generated new entry for maskMap: " + mask);
            }
            // Replace every literal occurrence of the matched banned word with the mask
            // (replace() does not interpret the match as a regex, unlike replaceAll())
            message = message.replace(match, mask);
            // Track that we already replaced this word
            replaced.add(match);
            System.out.println("Replaced '" + match + "' with '" + mask + "'");
        } else {
            System.out.println("Already replaced: " + match);
        }
    }
    // The message with banned words masked
    System.out.println(message);
}
This produces the following output:
The foo bar message about carrots. Carrots suck so do parrots. Parrotsucker is partially masked. Carrots was already replaced.
Regex: foo|carrots|parrots
Generated new entry for maskMap: ***
Replaced 'foo' with '***'
Generated new entry for maskMap: *******
Replaced 'carrots' with '*******'
Got mask from maskMap: *******
Replaced 'Carrots' with '*******'
Got mask from maskMap: *******
Replaced 'parrots' with '*******'
Got mask from maskMap: *******
Replaced 'Parrots' with '*******'
Already replaced: Carrots
The *** bar message about *******. ******* suck so do *******. *******ucker is partially masked. ******* was already replaced.
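As the output shows, "Parrotsucker" is partially masked because the pattern matches banned words anywhere, including inside longer words. If that is not what you want, one possible variation is to anchor the alternation with `\b` word boundaries and do the replacement in a single pass. This is a minimal sketch, not part of the original code: `WordMasker` and `maskWholeWords` are hypothetical names, `Matcher.replaceAll` with a function argument assumes Java 9 or later, and `String.repeat` assumes Java 11 or later.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordMasker {
    // Hypothetical helper: masks banned words only when they appear as whole words.
    static String maskWholeWords(String message, List<String> bannedWords) {
        // Quote each word so any regex metacharacters in it are matched literally.
        String alternation = bannedWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // \b on both sides restricts matches to whole words.
        Pattern p = Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        // Replace each match with a run of asterisks of the same length.
        return p.matcher(message).replaceAll(mr -> "*".repeat(mr.group().length()));
    }

    public static void main(String[] args) {
        String message = "The foo bar message about carrots. Carrots suck so do parrots. Parrotsucker is left alone.";
        System.out.println(maskWholeWords(message, List.of("foo", "carrots", "parrots")));
    }
}
```

With this input, "Parrotsucker" should stay intact while "foo", "carrots", "Carrots" and "parrots" are masked. Note that `\b` only behaves as expected for banned words that start and end with a letter, digit, or underscore.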