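【Posted】: 2015-03-24 04:48:52
【Question】:
I'm trying to get my program to display the frequency of the letters in a text file, but at the moment it displays the frequencies separately for each word in the file. For example, if the text file contains "i am a man", it outputs the letter frequencies four times, once for each of the words "i", "am", "a", and "man". I need it to analyse the whole text as a single word, i.e. remove the spaces and treat it as "iamaman".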
【Question discussion】:
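The spaces in the text are not the problem. In fact, you already ignore spaces, because you check Character.isLetter() before adding to the count.
Mainly, you just need to move the for loops that compute the total and print the results outside the main while loop that iterates over the tokens.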
import java.util.*;
import java.io.*;

public class J_ {
    public static void main(String[] args) throws Exception {
        // open the file
        Scanner console = new Scanner(System.in);
        System.out.print("What is the name of the text file? ");
        String fileName = console.nextLine();
        Scanner input = new Scanner(new File(fileName));
        // initialize array with 26 elements, one per letter
        int[] letterArray = new int[26];
        while (input.hasNext()) {
            String next = input.next().toLowerCase();
            // loop over each character of the current token
            for (int i = 0; i < next.length(); i++) {
                char characters = next.charAt(i);
                // ignore all characters which aren't alphabetic
                // (assumes the file only contains the ASCII letters a-z)
                if (Character.isLetter(characters)) {
                    // if character is uppercase then convert to lowercase
                    characters = Character.toLowerCase(characters);
                    // populate array
                    int index = characters - 'a';
                    letterArray[index]++;
                }
            }
        }
        // total number of letters, computed once after the whole file is read
        int total = 0;
        for (int i = 0; i < letterArray.length; i++) {
            total += letterArray[i];
        }
        // print out the analysis once, for the file as a whole
        for (char characters = 'a'; characters <= 'z'; characters++) {
            int index = characters - 'a';
            System.out.println("'" + characters + "' entered "
                    + (((double) letterArray[index] / (double) total) * 100)
                    + " percent");
        }
    }
}
$ cat abc.txt
a b c
$ java J_
What is the name of the text file? abc.txt
'a' entered 33.33333333333333 percent
'b' entered 33.33333333333333 percent
'c' entered 33.33333333333333 percent
'd' entered 0.0 percent
'e' entered 0.0 percent
'f' entered 0.0 percent
'g' entered 0.0 percent
'h' entered 0.0 percent
'i' entered 0.0 percent
'j' entered 0.0 percent
'k' entered 0.0 percent
'l' entered 0.0 percent
'm' entered 0.0 percent
'n' entered 0.0 percent
'o' entered 0.0 percent
'p' entered 0.0 percent
'q' entered 0.0 percent
'r' entered 0.0 percent
's' entered 0.0 percent
't' entered 0.0 percent
'u' entered 0.0 percent
'v' entered 0.0 percent
'w' entered 0.0 percent
'x' entered 0.0 percent
'y' entered 0.0 percent
'z' entered 0.0 percent
【Discussion】:
If I understand correctly, all you need to do is move the last for loop outside the while loop, like so:
import java.io.File;
import java.util.Scanner;

public class JCountlettersfilereader {
    public static void main(String[] args) throws Exception {
        // open the file (the name is hard-coded here instead of prompting)
        // Scanner console = new Scanner(System.in);
        // System.out.print("What is the name of the text file? ");
        String fileName = "file.txt";
        Scanner input = new Scanner(new File(fileName));
        // initialize array with 26 elements
        int[] letterArray = new int[26];
        int totalLetters = 0;
        while (input.hasNext()) {
            String next = input.next().toLowerCase();
            // loop over each character of the current token
            for (int i = 0; i < next.length(); i++) {
                char characters = next.charAt(i);
                // ignore all characters which aren't alphabetic
                if (Character.isLetter(characters)) {
                    totalLetters++;
                    // if character is uppercase then convert to lowercase
                    characters = Character.toLowerCase(characters);
                    // populate array
                    int index = characters - 'a';
                    letterArray[index]++;
                }
            }
        }
        // print out the analysis once, after the whole file has been read
        for (char characters = 'a'; characters <= 'z'; characters++) {
            int index = characters - 'a';
            System.out.println("'" + characters + "' entered "
                    + (((double) letterArray[index] / (double) totalLetters) * 100)
                    + " percent" + "(" + letterArray[index] + " /" + totalLetters + ")");
        }
    }
}
Returns:
'a' entered 42.857142857142854 percent(3 /7)
...
'i' entered 14.285714285714285 percent(1 /7)
...
'm' entered 28.57142857142857 percent(2 /7)
'n' entered 14.285714285714285 percent(1 /7)
Is that what you expected?
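The figures above can be checked by hand with a small sketch. The class name FrequencyCheck and the helpers countLetters and percent are made up for this illustration and do not come from the original program; the input string stands in for the contents of the file.

```java
class FrequencyCheck {
    // Counts occurrences of the letters a-z; every other character is skipped.
    static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (char c : text.toLowerCase().toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    // Share of one letter among all counted letters, as a percentage.
    static double percent(int count, int total) {
        return total == 0 ? 0.0 : 100.0 * count / total;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("i am a man");
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        // Only letters that actually occur are printed here.
        for (char c = 'a'; c <= 'z'; c++) {
            int count = counts[c - 'a'];
            if (count > 0) {
                System.out.println("'" + c + "' " + count + " /" + total
                        + " = " + percent(count, total) + " percent");
            }
        }
    }
}
```

For "i am a man" this counts 7 letters in total: 3 for 'a', 2 for 'm', and 1 each for 'i' and 'n', which matches the fractions shown in the output above.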
【Discussion】:
One way to remove the spaces is:
"i am a man".replaceAll(" ", "");
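【Discussion】:
Move the code that prints the results outside the while loop. It only needs to run once, not once for every word in the file.
Also, you don't need to convert to lowercase on two different lines.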
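A minimal sketch of that shape, assuming the input is read token by token with a Scanner as in the question. The class name CountOncePrintOnce and the method countLetters are hypothetical, and the Scanner reads from a string here rather than from a file so the example is self-contained.

```java
import java.util.Scanner;

class CountOncePrintOnce {
    // Reads every token and accumulates letter counts; nothing is printed here.
    static int[] countLetters(Scanner input) {
        int[] counts = new int[26];
        while (input.hasNext()) {
            // The only place where the text is converted to lowercase.
            String word = input.next().toLowerCase();
            for (int i = 0; i < word.length(); i++) {
                char c = word.charAt(i);
                if (c >= 'a' && c <= 'z') {
                    counts[c - 'a']++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts;
        try (Scanner input = new Scanner("I am a man")) {
            counts = countLetters(input);
        }
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        // Printing happens once, after the while loop has consumed all tokens.
        for (char c = 'a'; c <= 'z'; c++) {
            double share = total == 0 ? 0.0 : 100.0 * counts[c - 'a'] / total;
            System.out.println("'" + c + "' entered " + share + " percent");
        }
    }
}
```

Keeping the counting in its own method makes it hard to print inside the loop by accident, since the method only returns the array.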
【Discussion】:
Use replaceAll("\\s", "");
This removes all whitespace (newlines, tabs, spaces).
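A short sketch comparing the two patterns. In a Java string literal the regex class \s has to be written with a doubled backslash, "\\s". The class name StripWhitespaceDemo and the method stripWhitespace are made up for this illustration.

```java
class StripWhitespaceDemo {
    // Removes every whitespace character matched by the regex class \s.
    static String stripWhitespace(String text) {
        return text.replaceAll("\\s", "");
    }

    public static void main(String[] args) {
        String text = "i am\ta\nman";
        // Replacing only " " leaves the tab and the newline in place.
        String spacesOnly = text.replaceAll(" ", "");
        System.out.println(spacesOnly.length());   // 9
        // \s also matches tabs and newlines.
        System.out.println(stripWhitespace(text)); // iamaman
    }
}
```

If the file is read with Scanner.next(), the tokens already contain no whitespace, so this step mainly matters when the text is read a line at a time or as a whole.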
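【Discussion】:
You can set the delimiter to \\W+ (one or more non-word characters), so that tokens never contain whitespace.
Call
input.useDelimiter("\\W+");
before the while loop.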
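A small sketch of changing a Scanner's delimiter. It assumes the method meant is Scanner.useDelimiter, and it uses the pattern \\W+ (runs of non-word characters) as an illustrative choice; a lowercase \\w would instead treat the letters themselves as separators. The class name DelimiterDemo and the method tokens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Splits text into tokens separated by runs of non-word characters.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner input = new Scanner(text)) {
            // Set the delimiter once, before the loop that reads tokens.
            input.useDelimiter("\\W+");
            while (input.hasNext()) {
                result.add(input.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("i am, a man!")); // [i, am, a, man]
    }
}
```

Scanner's default delimiter already splits on whitespace, so a custom delimiter only changes things when punctuation should be dropped as well. Note that \W does not match digits or underscores, so those remain inside tokens and still have to be filtered out when counting letters.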
【Discussion】: