【Question title】: Regex/Java - How to capture the five digits after a string in a dynamic table
【Posted】: 2017-09-13 03:26:33
【Question】:

I have the table below. I read the entire row if it matches the mobile number "00955555555555", and I need to get the code "89721" from that row.

<table border="2" cellspacing="0" width="100%" cellpadding="0">

 Mobile number  Date  Message

<td align="left" nowrap><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;00955555555555</font></td>
<td align="left" nowrap><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;2017-04-17 17:34:06.72</font></td>
<td align="left"><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;Your authentication code is  89721 to add name as beneficiary for payment from your account. If you have not requested to add this beneficiary, please contact the bank 

immediately on 00971 600 54 0000.

<td align="left" nowrap><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;955111111111</font></td>
<td align="left" nowrap><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;2017-04-17 17:31:13.893</font></td>
<td align="left"><font face="times new roman" size=3 >&nbsp;&nbsp;&nbsp;Your authentication code is: 91518. Please do not share this code with any person.</font></td>
<tr>

I tried the code below, but it returns the mobile number instead of the five-digit code.

Code:

public String entire_row_is_read_which_matches_with_the_Mobile_number() throws Throwable {
    String mobilenumber = "00955555555555";
    String valueis = null;

    List<WebElement> rows = driver1.findElements(By.cssSelector("tr"));
    for (WebElement row : rows) {
        String text = row.getText();
        if (text.contains(mobilenumber)) {
            // Meant to capture the digits after "Your authentication code is"
            String regex = " (\\d+)";
            System.out.println(regex);

            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(text);

            if (matcher.find()) {
                valueis = matcher.group(1);
                System.out.println(valueis);
                break;
            }
        }
    }
    return valueis;
}

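【Comments】:

  • Don't parse HTML with regex.
  • Then what? How do I get it?
  • Use this as a reference: stackoverflow.com/a/2170950/2792713
  • @vallentin He isn't. He's using Selenium and trying to parse the resulting text, which has no HTML tags.
  • What is this data? This isn't the first time this question has been asked: stackoverflow.com/questions/42787481/…, and your code is my answer with slight changes.

Tags: java html regex selenium-webdriver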


【Solution 1】:

You can use jsoup.jar to get the data you want. https://jsoup.org/

Demo:

    String html = " <table border=\"2\" cellspacing=\"0\" width=\"100%\" cellpadding=\"0\">" + "<tr>"
            + "<td align=\"left\" nowrap><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;00955555555555</font></td>"
            + "<td align=\"left\" nowrap><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;2017-04-17 17:34:06.72</font></td>"
            + "<td align=\"left\"><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;Your authentication code is  89721 to add name as beneficiary for payment from your account. If you have not requested to add this beneficiary, please contact the bank</td>"
            + "</tr>" + "<tr>"
            + "<td align=\"left\" nowrap><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;955111111111</font></td>"
            + "<td align=\"left\" nowrap><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;2017-04-17 17:31:13.893</font></td>"
            + "<td align=\"left\"><font face=\"times new roman\" size=3 >&nbsp;&nbsp;&nbsp;Your authentication code is: 91518. Please do not share this code with any person.</font></td>"
            + "</tr>" + "</table>";

    Document doc = Jsoup.parse(html);
    String text = doc.select("tr >td:nth-child(2n+1)").text();
    Matcher m = Pattern.compile("\\d+").matcher(text);
    List<String> result = new ArrayList<String>();
    while (m.find()) {
        result.add(m.group());
    }
    System.out.println(result);

Output:

[00955555555555, 89721, 955111111111, 91518]
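The list above holds every digit run, alternating mobile number and code, so the code for one specific number still has to be picked out. A minimal sketch of one way to do that, assuming the list always alternates number, code, number, code. The class name `CodeLookupDemo` and the helper `pairUp` are hypothetical, and the input string is a simplified stand-in for the text extracted in the demo. Note that a message containing other digits (for example a contact phone number) would break the alternation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeLookupDemo {
    // Pair up a flat [number, code, number, code, ...] list into number -> code.
    // Assumes the list strictly alternates; a trailing unpaired entry is ignored.
    static Map<String, String> pairUp(List<String> flat) {
        Map<String, String> codes = new LinkedHashMap<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            codes.put(flat.get(i), flat.get(i + 1));
        }
        return codes;
    }

    public static void main(String[] args) {
        // Simplified stand-in for the text of the first and third cells of each row.
        String text = "00955555555555 Your authentication code is  89721 to add name "
                + "955111111111 Your authentication code is: 91518. Please do not share";

        // Collect every digit run, as in the demo above.
        Matcher m = Pattern.compile("\\d+").matcher(text);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }

        // Look up the code that belongs to one mobile number.
        System.out.println(pairUp(result).get("00955555555555"));
    }
}
```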

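【Comments】:

【Solution 2】:

I like to write functions for things like this, since they are likely to be reused. The function below takes the mobile number you are searching for and returns the authentication code.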

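    public String GetAuthCode(String number)
    {
        // Find a cell containing the mobile number, then its sibling cell holding the message.
        String code = driver
                .findElement(
                        By.xpath("//tr/td[contains(.,'" + number + "')]/following-sibling::td[contains(.,'Your authentication code')]"))
                .getText();
        // The sample messages use both "is  89721" and "is: 91518",
        // so the colon is optional and any amount of whitespace is allowed.
        String regex = "Your authentication code is:?\\s*(\\d+)";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(code);

        if (matcher.find())
        {
            return matcher.group(1);
        }

        // No code found in the message.
        return "";
    }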
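
The regex part can be tried without a browser. The following is a minimal sketch, not the answerer's code: it anchors the pattern on the phrase "authentication code is" so the mobile number and timestamp in the row text are never matched, and it captures exactly five digits. The class name `AuthCodeRegexDemo`, the helper `extractCode`, and the sample row strings are assumptions modelled on the table in the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuthCodeRegexDemo {
    // ":?" makes the colon optional, "\\s*" tolerates any run of whitespace,
    // and "(\\d{5})\\b" captures exactly five digits followed by a word boundary.
    private static final Pattern CODE =
            Pattern.compile("authentication code is:?\\s*(\\d{5})\\b");

    // Returns the five-digit code from a row's text, or empty if none is present.
    static Optional<String> extractCode(String rowText) {
        Matcher m = CODE.matcher(rowText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample row texts shaped like the two rows in the question.
        String row1 = "00955555555555 2017-04-17 17:34:06.72 "
                + "Your authentication code is  89721 to add name as beneficiary";
        String row2 = "955111111111 2017-04-17 17:31:13.893 "
                + "Your authentication code is: 91518. Please do not share this code";

        System.out.println(extractCode(row1).orElse("not found"));
        System.out.println(extractCode(row2).orElse("not found"));
    }
}
```

Returning an `Optional` instead of an empty string makes the "no code in this row" case explicit to the caller; that is a design choice of this sketch only.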
    

【Comments】:

  • Thanks, it works. It is the same thing I asked about before, but the table I was dealing with earlier was different.