【Question Title】: How to search a CSV file based on an input field?
【Posted】: 2019-01-21 07:17:28
【Question】:

I can only access a few lines (41 lines, to be exact). After that I cannot read any more.

import java.io.File;
import java.util.Scanner;

public class FileReader {

    public static void main(String[] args) {

        String filePath = "qwe.csv";
        System.out.println("Enter the City name to be Searched\n");

        Scanner in = new Scanner(System.in);          
        String searchTerm = in.nextLine(); 
        readRecord(filePath, searchTerm);
    }

    public static void readRecord( String filePath, String searchTerm ) {

        boolean found = false;
        String City = ""; String City_Asciis = ""; String  Lattitude = "";
        String longitude = ""; String Country = "";
        String iso_2 = ""; String iso_3 = ""; String Admin_Name = "";
        String Capital = ""; String Population = ""; String Id = "";

        try {
            File file = new File(filePath);
            Scanner x = new Scanner (file);
            x.useDelimiter("[,\n]"); //to separate the data items 

            //hasNext - Returns true if the scanner has another token/value in its input

            while(x.hasNext() && !found) {

                City = x.next();
                City_Asciis = x.next();
                Lattitude = x.next();
                longitude = x.next();
                Country = x.next();
                iso_2 = x.next();
                iso_3 = x.next();
                Admin_Name = x.next();
                Capital = x.next();
                Population = x.next();
                Id = x.next();

                if (City.equals(searchTerm)) {
                    found = true;
                }
            }       

            if (found) {

                System.out.println(" The following details are of city : " + City +"\n The Ascii string would be : "
                    + City_Asciis +"\n Its having the lattitude around : "
                    + Lattitude + "\n and Longitude of : "+ longitude +"\n It is situated in : "
                    + Country +"\n These have iso code like  : "+ iso_2 +" and : "+ iso_3 +"\n It comes under  : "
                    + Admin_Name +" State \n Capital of this city is : "+ Capital +"\n The population is around : "
                    + Population +"\n ZIP code is : "+Id+"");
            }                       
            else {                      
                System.out.print("Enter the Correct City Name");
            }
        }
        catch(Exception e1){
            System.out.print("file not found \n");
            e1.printStackTrace();
        }  
    }       
}

This code loads the searched city from the given file path so that, given a specific city name, the details of that city are printed.

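【Comments】:

  • Please configure your editor or IDE to format your code when you save it. Right now the code looks messier than it needs to.
  • Could you also provide the file you are reading?
  • You need to provide some sample data (two or three records) from the file you use in your program.
  • The convention in Java is to use camelCase (first letter lowercase) for variable names, with no underscores. For example, City and Admin_Name would be named city and adminName respectively.
  • Thanks for the suggestions and answers, I have corrected my mistake. The file actually contains about 12,000 lines but only about 41 of them were being read. My mistake was that I did not tell the Scanner which encoding the file uses: Scanner x = new Scanner (file,"UTF-8");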
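
The fix described in the last comment can be sketched as follows. This is a minimal sketch, assuming the data file really is UTF-8 encoded; the class name EncodingAwareReader and the method countLines are hypothetical names used only for illustration. A plausible explanation for the silent stop at line 41 is that Scanner does not propagate I/O errors from its underlying source (it simply reports that no more input is available), so checking Scanner#ioException() after the loop can reveal whether reading ended early.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

class EncodingAwareReader {

    // Counts the lines in a file, decoding it with the given charset name.
    // Rethrows any I/O error that the Scanner recorded while reading.
    public static int countLines(File file, String charsetName) throws IOException {
        int count = 0;
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
            // Scanner hides IOExceptions; surface the last one, if any.
            IOException hidden = scanner.ioException();
            if (hidden != null) {
                throw hidden;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "qwe.csv" is the file name used in the question.
        System.out.println(countLines(new File("qwe.csv"), "UTF-8"));
    }
}
```

If the count printed here matches the number of lines in the file, the encoding was the likely culprit; if an exception surfaces instead, its message should point at the real cause.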

Tags: java file csv


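【Solution 1】:

Who knows?

Without too much head scratching, the code itself looks like it should work. Personally, I can't see why your read would stop after only 41 lines without running a series of experiments against the actual data, and not many people really want to do that, which is why you were asked to supply some fictitious sample data.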

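It could be as simple as this: the boolean found variable in the while loop condition becomes true, so the loop exits and reading stops. I suspect this because you do state that the "code will load the searched city from the file path given". I would think that is not really what you want, since some countries contain the same city names. In fact, several states, provinces, or territories within the same country can contain the same city name. For example, did you know that in the United States alone there are 88 cities and towns named Washington? I know, weird, especially considering there are only 50 states and 2 territories. Benjamin Franklin was also one of the Founding Fathers of the United States, and 35 cities/towns within that country are named Franklin.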

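If your data file or database is large enough, I'm sure you will want to display all the cities that match your particular search criteria. That said, perhaps what you need to do is get rid of the && !found condition in the while loop. Personally, I also would not use the Scanner#hasNext() method in the while loop condition. It is an invitation to disaster because, paired with Scanner#next(), it checks for the availability of the next token rather than the next line of the file. Use the Scanner#hasNextLine() method together with Scanner#nextLine(), then parse each comma (,) delimited CSV line with the String#split() method, one line at a time.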
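
The line-by-line approach described above can be sketched in a few lines. This is only a minimal sketch: the class name LineBasedSearch, the method findByCity, and the sample rows are made up for illustration, and it assumes the city name is the first column and that no field contains a quoted comma (a plain split cannot handle those).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineBasedSearch {

    // Returns every row whose first column equals searchTerm, ignoring case.
    public static List<String[]> findByCity(Scanner scanner, String searchTerm) {
        List<String[]> matches = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // A limit of -1 keeps trailing empty fields so column positions stay stable.
            String[] fields = line.split("\\s*,\\s*", -1);
            if (fields[0].equalsIgnoreCase(searchTerm)) {
                matches.add(fields); // no early exit: collect all matching rows
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Placeholder rows, not real data.
        String sample = "Washington,Washington,1.0,2.0,Country A\n"
                      + "Franklin,Franklin,3.0,4.0,Country A\n"
                      + "Washington,Washington,5.0,6.0,Country B\n";
        try (Scanner scanner = new Scanner(sample)) {
            for (String[] row : findByCity(scanner, "washington")) {
                System.out.println(String.join(" | ", row));
            }
        }
    }
}
```

Because each record is read as a whole line, a short or malformed row affects only itself instead of shifting every token that follows, which is the risk with repeated next() calls.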

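Below I provide a runnable Java code example that demonstrates the approach described above. Your readRecord() method is used, but it has been heavily modified to accommodate the following options:

  • Returns a List Interface (List<String>) of the city information found for the supplied search criteria.
  • Ignores (skips) blank lines or comment lines in the CSV file. Comment lines can start with # or ;.
  • An option to ignore letter case while searching.
  • Allows you to choose the city information field the search criteria is applied to. The city information fields are:

    City, CityAscii, Latitude, Longitude, Country, ISO2, ISO3, AdminName, Capital, Population, and ID

    Wildcards (? and *) can be used when supplying the desired search field so that the whole field name does not need to be supplied, for example: lat* for Latitude. This means you can search city information by, say, population only instead of by city name.

  • Allows wildcards (? and *) in the supplied search criteria, for example: wash*. This tells the method to search for any city whose name starts with Wash, such as Washington, Washougal, or Washtucna.

  • Allows you to limit the number of found city instances that are returned.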
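
Separately from the answer's full example, the wildcard handling in the list above can be sketched in isolation. This is a hedged sketch, not the answer's implementation: the class name WildcardMatcher and its methods are hypothetical, ? is taken to mean exactly one character and * zero or more, and every other character is quoted with Pattern.quote so that input such as a dot is matched literally.

```java
import java.util.regex.Pattern;

class WildcardMatcher {

    // Builds a regex from a pattern in which ? and * are wildcards
    // and every other character is treated literally.
    public static Pattern toPattern(String wildcard, boolean ignoreCase) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?' || c == '*') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '?' ? "." : ".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        int flags = ignoreCase ? (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE) : 0;
        return Pattern.compile(regex.toString(), flags);
    }

    // True when the whole value matches the wildcard pattern.
    public static boolean matches(String wildcard, String value, boolean ignoreCase) {
        return toPattern(wildcard, ignoreCase).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("wash*", "Washington", true));  // true
        System.out.println(matches("wash*", "Washington", false)); // false
        System.out.println(matches("iso?", "ISO2", true));         // true
    }
}
```

Passing the case-insensitivity flag to Pattern.compile avoids lower-casing both the pattern and the data by hand.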

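Here is the runnable code demonstrating the concepts above. The code is well commented. Regular Expressions are used in the code; if you want those expressions explained, copy/paste them into regex101.com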

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;


public class CityInfoRecords {

    public static void main(String[] args) {
        /* The application is started this way so that there
           is no need for static methods or variables.
        */
        new CityInfoRecords().startApp(args);
    }

    // Application Start method.
    private void startApp(String[] args) {
        String ls = System.lineSeparator(); // Not all OS Consoles work well with "\n" 
        String filePath = "qwe.csv";        // Path and file name of the data file.
        Scanner in = new Scanner(System.in);
        // Provide the City Info Field to base search from...
        System.out.println("Enter the Data Field you want to search by:" + ls
                + "[City, CityAscii, Lattitude, Longitude, Country" + ls
                + "ISO2, ISO3, AdminName, Capital, Population, ID]" + ls
                + "Wildcards (? and *) can be used:");
        String searchField = in.nextLine();

        // Provide the Search Criteria to find within the supplied City Info Field.
        System.out.println(ls + "Enter the search criteria you are looking for" + ls
                + "in " + searchField + ". Wildcards (? and *) are permitted:");
        String searchCriteria = in.nextLine();

        // Declare a List Interface of String and fill it 
        // with the call to the readRecord method.
        List<String> cityInfoList = readRecord(filePath, searchField, searchCriteria, 0, "N/A");

        // Display the returned List to console window.
        for (int i = 0; i < cityInfoList.size(); i++) {
            System.out.println(cityInfoList.get(i));
        }
    }

    /**
     * Returns a List Interface of the City Information found based on the supplied 
     * search criteria.<br><br>
     * 
     * @param filePath (String) The full path and file name of the data file to read
     * containing City information.<br>
     * 
     * @param searchField (String) The City Information Field to base the supplied 
     * Search Criteria from. Any City Information Field can be supplied here and 
     * letter case is optional. The wildcard characters (? and *) can also be used 
     * here so that the entire field name does not need to be supplied, for example:
     * <pre>
     *          lat*    for the Latitude field or
     *          *asc*   for the CityAscii field or
     *          iso?    for either the ISO2 or ISO3 fields or simply
     *          City    for the City field.</pre><br>
     * 
     * The <b>?</b> wildcard character specifies any single alphanumeric character, 
     * as in ?an, which locates "ran", "pan", "can", and "ban".<br><br>
     * 
     * The <b>*</b> wildcard character specifies zero or more of any alphanumeric 
     * character, as in corp*, which locates "corp", "corporate", "corporation", 
     * "corporal", and "corpulent".<br> 
     * 
     * @param searchCriteria (String) The search criteria string. This can be any 
     * string you would like to search for within the supplied City Information 
     * Field. By default letter case is ignored during searches therefore the 
     * supplied search criteria string does not need to be letter case specific 
     * however if you want the search to be case specific then set this methods
     * optional ignoreLetterCase parameter to false.<br><br>
     * 
     * Wildcard characters (? and *) can also be used within the Search Criteria 
     * string so as to expand the search to other possibilities, for example if 
     * the "City" field is supplied and a criteria string like: "wash*" is supplied
     * then any city which name starts with "Wash" will have their city information 
     * returned.<br><br>
     * 
     * The <b>?</b> wildcard character specifies any single alphanumeric character, 
     * as in ?an, which locates "ran", "pan", "can", and "ban".<br><br>
     * 
     * The <b>*</b> wildcard character specifies zero or more of any alphanumeric 
     * character, as in corp*, which locates "corp", "corporate", "corporation", 
     * "corporal", and "corpulent".<br> 
     * 
     * @param numberOfFoundToReturn (int) The number of cities whose information 
     * should be returned. If 0 is supplied then all cities found will be returned.<br>
     * 
     * @param noDataReplacement (String) Sometimes there is no data supplied for a 
     * specific field within the data file or the file data line may not contain 
     * the same amount of delimited data. Rather than returning NULL or Null String 
     * ("") for empty data fields you can supply here what to actually return in 
     * such a case. "N/A" is a good choice or perhaps: "Nothing Supplied". Whatever 
     * you like to use can be supplied here.<br>
     * 
     * @param ignoreLetterCase (Optional - Boolean - Default is true) By default 
     * searches ignore letter case but if you want your search to be letter case 
     * specific then you can supply boolean false to this optional parameter.<br>
     * 
     * @return (String List Collection) Information for every City found within the 
     * supplied data file which matches the supplied field and search criteria.
     */
    public List<String> readRecord(String filePath, String searchField,
                            String searchCriteria, int numberOfFoundToReturn, 
                            String noDataReplacement, boolean... ignoreLetterCase) {
        String ls = System.lineSeparator(); // Not all OS Consoles work well with "\n" (property)
        boolean ignoreCase = true;          // Ignore letter case when searching (Default - property)
        if (ignoreLetterCase.length > 0) {
            ignoreCase = ignoreLetterCase[0];
        }
        boolean found = false;              // Flag to indicate data was found (toggles)
        int foundCounter = 0;               // Indicates number of same data found (increments)

        List<String> returnableList = // The List of found city information that will be returned (collection)
                new ArrayList<>();

        // City Information Variables (data fields)
        String city;
        String cityAscii;
        String latitude;
        String longitude;
        String country;
        String iso2;
        String iso3;
        String adminName;
        String capital;
        String population;
        String id;

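        // Open Scanner to read data file, decoding it as UTF-8 (the asker
        // reported that specifying the encoding fixed the truncated read).
        // Try With Resources is used here to auto close the reader.
        try (Scanner fileReader = new Scanner(new File(filePath), "UTF-8")) {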
            // Iterate through data file...
            while (fileReader.hasNextLine()) {
                // Read file line by line and remove leading or 
                // trailing whitespaces, tabs, line breaks, etc.
                String cityData = fileReader.nextLine().trim();
                // Skip blank or comment lines (comment lines can be lines that start with # or ;)
                if (cityData.equals("") || cityData.startsWith("#") || cityData.startsWith(";")) {
                    continue;   // Get next file line
                }
                // Split the line on commas, ignoring any whitespace around each comma.
                String[] cityInfo = cityData.split("\\s*,\\s*");
                // The number of data pieces split from data line.
                // Not all lines may contain the same amount of data.
                int i = cityInfo.length;
                /* Ternary is used to fill city information variables
                   so that data not provided will not be null or null string.
                   As an Example for the city variable this is the same as:
                        if (i >= 1 && !cityInfo[0].equals("")) {
                            city = cityInfo[0].trim();
                        }
                        else {
                            city = noDataReplacement;
                        }
                 */
                city = (i >= 1 && !cityInfo[0].equals("")) ? cityInfo[0].trim() : noDataReplacement;
                cityAscii = (i >= 2 && !cityInfo[1].equals("")) ? cityInfo[1].trim() : noDataReplacement;
                latitude = (i >= 3 && !cityInfo[2].equals("")) ? cityInfo[2].trim() : noDataReplacement;
                longitude = (i >= 4 && !cityInfo[3].equals("")) ? cityInfo[3].trim() : noDataReplacement;
                country = (i >= 5 && !cityInfo[4].equals("")) ? cityInfo[4].trim() : noDataReplacement;
                iso2 = (i >= 6 && !cityInfo[5].equals("")) ? cityInfo[5].trim() : noDataReplacement;
                iso3 = (i >= 7 && !cityInfo[6].equals("")) ? cityInfo[6].trim() : noDataReplacement;
                adminName = (i >= 8 && !cityInfo[7].equals("")) ? cityInfo[7].trim() : noDataReplacement;
                capital = (i >= 9 && !cityInfo[8].equals("")) ? cityInfo[8].trim() : noDataReplacement;
                population = (i >= 10 && !cityInfo[9].equals("")) ? cityInfo[9].trim() : noDataReplacement;
                id = (i >= 11 && !cityInfo[10].equals("")) ? cityInfo[10].trim() : noDataReplacement;

                // Determine the city data field we want to search in
                String regex;
                // Were wildcards used in the supplied Search Field string?
                if (searchField.contains("?") || searchField.contains("*")) {
                    // Yes... Prep regex to get proper search field
                    regex = searchField.replace("?", ".?").replace("*", ".*?").toLowerCase();
                }
                else {
                    regex = "(?i)(" + searchField + ")";
                }

                // Get proper search field data
                String field = "";
                if ("city".toLowerCase().matches(regex)) {
                    field = city;
                }
                else if ("cityascii".matches(regex)) {
                    field = cityAscii;
                }
                // Accept the correct spelling as well as the "Lattitude" spelling shown in the prompt.
                else if ("latitude".matches(regex) || "lattitude".matches(regex)) {
                    field = latitude;
                }
                else if ("longitude".toLowerCase().matches(regex)) {
                    field = longitude;
                }
                else if ("country".toLowerCase().matches(regex)) {
                    field = country;
                }
                else if ("iso2".toLowerCase().matches(regex)) {
                    field = iso2;
                }
                else if ("iso3".toLowerCase().matches(regex)) {
                    field = iso3;
                }
                else if ("adminName".toLowerCase().matches(regex)) {
                    field = adminName;
                }
                else if ("capital".toLowerCase().matches(regex)) {
                    field = capital;
                }
                else if ("population".toLowerCase().matches(regex)) {
                    field = population;
                }
                else if ("id".toLowerCase().matches(regex)) {
                    field = id;
                }
                if (field.equals("")) {
                    System.err.println("Invalid Search Field Name Provided! (" + searchField + ")");
                    return returnableList;
                }

                // See if the search criteria contains wildcard characters
                // A search can be carried out using wildcards in this method.
                if (searchCriteria.contains("?") || searchCriteria.contains("*")) {
                    // There is...build the required Regular Expression (RegEx) to use.
                    regex = searchCriteria.replace("?", ".?").replace("*", ".*?");
                    // See if the data item matches the search criteria ignoring letter case if desired.
                    // The String.matches() method is used for this and ternary for ignoring letter case.
                    if (ignoreCase ? field.toLowerCase().matches(regex.toLowerCase()) : field.matches(regex)) {
                        found = true;   // toggle flag to true if there is a match.
                    }
                }
                // No wildcard characters in search criteria...
                // Ternary is used in condition to handle ignore letter case if desired.
                else if (ignoreCase ? field.equalsIgnoreCase(searchCriteria) : field.equals(searchCriteria)) {
                    found = true;   // toggle flag to true if there is a match.
                }
                // If the 'found' flag has been set to true...
                if (found) {
                    // Add City information to returnable ArrayList
                    String info = ls + "The following details are of city: " + city + ls
                            + "The Ascii string would be: " + cityAscii + ls
                            + "It has the approximate Latitude of: " + latitude + ls
                            + "And the approximate Longitude of: " + longitude + ls
                            + "It is situated in the country of: " + country + ls
                            + "The city has iso codes like: " + iso2 + " and: " + iso3 + ls
                            + "The State/Province/Region is: " + adminName + ls
                            + "Capital of this city is: " + capital + ls
                            + // Didn't know cities had capitals
                            "The population is approximately: " + population + ls
                            + "City general ZIP code is: " + id;
                    returnableList.add(info);   // Add to list
                    found = false;              // Toggle found flag back to false in prep to locate more city data.
                    foundCounter++;             // increment the found counter.

                    // If a result limit was supplied and has now been reached then...
                    if (numberOfFoundToReturn > 0 && foundCounter == numberOfFoundToReturn) {
                        // Break out of the 'while' loop. We don't need anymore cities.
                        break;
                    }
                }
            }

            // If the Found Counter was not incremented then
            // we didn't find any data in file... Inform User.
            if (foundCounter == 0) {
                System.err.print(ls + "Can not find City Name (" + searchCriteria
                        + ") in data file!" + ls);
            }
        }
        catch (FileNotFoundException ex) {
            System.err.print("City Data file not found! (" + filePath + ")" + ls);
        }

        // Return the List of found data.
        return returnableList;
    }
}

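Create a new Java application project and name it CityInfoRecords. Copy the code above and paste it over the main startup class. Run the application, read the console prompts carefully, and enter the proper data.

The first prompt asks for a city information field name... enter: city. The second prompt asks for the search criteria for city... enter a city name in upper or lower case (it doesn't matter). The city's information will be displayed in the console, but only if that city name is contained in the City field of the data file.

Now run the code again and enter the same data, except this time, for the city name, supply only the first three letters of the city name followed by an asterisk (*), then press Enter. The information for every city in your data file whose name starts with those three letters will now be displayed in the console window.

Play with it: try different fields to search by, and use wildcards in both the field name and the search criteria you supply.

Now make readRecord a class instead of a method, which would be better.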
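
One way to read that last suggestion is to give each CSV row its own type, so the parsing lives in one place and the search code works with named fields. The sketch below is an assumption about what such a class might look like, not the answer author's design: CityRecord and fromCsvLine are hypothetical names, and the column order is taken from the question's code.

```java
final class CityRecord {

    // Number of columns, in the order used by the question's code.
    private static final int FIELD_COUNT = 11;

    public final String city, cityAscii, latitude, longitude, country,
            iso2, iso3, adminName, capital, population, id;

    private CityRecord(String[] f) {
        city = f[0];
        cityAscii = f[1];
        latitude = f[2];
        longitude = f[3];
        country = f[4];
        iso2 = f[5];
        iso3 = f[6];
        adminName = f[7];
        capital = f[8];
        population = f[9];
        id = f[10];
    }

    // Parses one comma-delimited line; missing or empty columns are
    // filled with noDataReplacement (for example "N/A").
    public static CityRecord fromCsvLine(String line, String noDataReplacement) {
        String[] raw = line.trim().split("\\s*,\\s*", -1);
        String[] f = new String[FIELD_COUNT];
        for (int i = 0; i < FIELD_COUNT; i++) {
            f[i] = (i < raw.length && !raw[i].isEmpty()) ? raw[i] : noDataReplacement;
        }
        return new CityRecord(f);
    }

    @Override
    public String toString() {
        return city + " (" + country + "), population: " + population;
    }

    public static void main(String[] args) {
        // Placeholder row, not real data.
        System.out.println(fromCsvLine("Alpha,Alpha,1.0,2.0,Country A", "N/A"));
    }
}
```

With a type like this, a search method can return a List<CityRecord> and leave the formatting of the output to the caller.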

【Comments】:
