【Question title】: Find the attribute value of a tag when specific text matches in its content
【Posted】: 2016-07-02 20:50:23
【Question】:

I want to get the value of the id attribute of a tag if one of its child tags contains specific text.

Input:

<base>
  <parent id="101" txt="hello">
    <child1>
       <data> search </data>
    </child1>
     <child2>
       <data> send</data>
    </child2>
  </parent>
  <parent id="102" txt="hello">
    <child1>
       <data> hai </data>
    </child1>
     <child2>
       <data> hey </data>
    </child2>
  </parent>
</base>

Output:

I am searching the whole file for the text "hey", so it should return id="102".

The code snippet I tried:

if(line.indexOf("<Parent")>= 0)
{
 String output="";
 Pattern pat = Pattern.compile("id=\".*?\"");
 Matcher mat = pat.matcher(line);
 if(mat.find())
    {
     int start=mat.start();
     int end=mat.end();
     output = line.substring(start+4,end-1);
    }

    Pattern pat1 = Pattern.compile("<parent"[A-Z](?i)[^.?!]*?\\b(hey)\\b[^.?!]*[.?!]")</parent>");
    Matcher mat1 = pat.matcher(line);
    if(mat.find())
    {
     bw.write(output);
     }
    }

【Comments】:

  • @PritamBanerjee - updated with the script I tried
  • Why all the downvotes? The question looks legitimate to me.

Tags: java xml parsing dom sax


【Solution 1】:

Try this; it should print the attributes of each parent element:

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadXML {

    public static void main(String[] args) {
        try {
            File fXmlFile = new File("Path to your xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);

            doc.getDocumentElement().normalize();

            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());

            NodeList nList = doc.getElementsByTagName("parent");

            System.out.println("----------------------------");

            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);

                System.out.println("\nCurrent Element :" + nNode.getNodeName());

                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;

                    System.out.println("Parent id : " + eElement.getAttribute("id"));
                    System.out.println("Parent txt : " + eElement.getAttribute("txt"));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

【Comments】:

  • If I'm not mistaken, this prints every id value, but I want the id only when specific text matches in a child tag.
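
The filtering asked for in the comment above could be added to the DOM approach roughly as follows. This is only a sketch, not part of the original answer: the class name `ParentIdFinder` and the method `findParentIds` are hypothetical, the XML is passed in as a string to keep the example self-contained, and the match is a plain substring test, so "hey" would also match "they".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class ParentIdFinder {

    /** Returns the id of every <parent> that has a <data> descendant containing needle. */
    static List<String> findParentIds(String xml, String needle) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> ids = new ArrayList<>();
            NodeList parents = doc.getElementsByTagName("parent");
            for (int i = 0; i < parents.getLength(); i++) {
                Element parent = (Element) parents.item(i);
                // Called on an Element, getElementsByTagName searches only its descendants.
                NodeList data = parent.getElementsByTagName("data");
                for (int j = 0; j < data.getLength(); j++) {
                    if (data.item(j).getTextContent().contains(needle)) {
                        ids.add(parent.getAttribute("id"));
                        break; // one matching <data> is enough for this parent
                    }
                }
            }
            return ids;
        } catch (Exception e) {
            // Malformed input is reported as an unchecked exception in this sketch.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<base>"
            + "<parent id=\"101\" txt=\"hello\"><child1><data> search </data></child1>"
            + "<child2><data> send</data></child2></parent>"
            + "<parent id=\"102\" txt=\"hello\"><child1><data> hai </data></child1>"
            + "<child2><data> hey </data></child2></parent>"
            + "</base>";
        System.out.println(findParentIds(xml, "hey")); // prints [102]
    }
}
```

To read from a file instead, `builder.parse(new File(...))` can replace the `InputSource`, as in the answer above.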
【Solution 2】:
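import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The wrapper class and method names are illustrative; the original snippet had none.
public class KeyExtractor {

    // Compiled once instead of on every line; group 1 is the attribute value.
    private static final Pattern KEY = Pattern.compile("KEY=\"(.*?)\"");
    private static final String NEEDLE = "Reference dimensions do not require inspection";

    static void extract(String infilename, String outfilename) throws IOException {
        // try-with-resources closes both streams even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(new FileReader(infilename));
             BufferedWriter bw = new BufferedWriter(new FileWriter(outfilename))) {
            String output = "";
            String line;
            while ((line = br.readLine()) != null) {
                // Remember the KEY of the most recent <PGBLK ...> start tag.
                if (line.contains("<PGBLK")) {
                    Matcher mat = KEY.matcher(line);
                    if (mat.find()) {
                        output = mat.group(1);
                    }
                }
                // When the search text appears on a line, write the remembered KEY.
                if (line.contains(NEEDLE)) {
                    bw.write(output);
                    bw.newLine();
                }
            }
        }
    }
}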
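
The line-by-line code above relies on each start tag fitting on a single line. Since the question is also tagged sax, the same "remember the enclosing id, then test the text" idea could be sketched with the JDK's SAX parser, which does not depend on line breaks. This is an assumption-laden sketch using the question's element names (`parent`, `data`, `id`) and not the `PGBLK`/`KEY` names; `SaxParentIdFinder` and `findParentIds` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the class and method names are illustrative only.
class SaxParentIdFinder {

    static List<String> findParentIds(String xml, String needle) {
        List<String> ids = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String currentId;   // id of the enclosing <parent>, if any
            private boolean matched;    // true once this parent has been reported
            private StringBuilder text; // non-null only while inside <data>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals("parent")) {
                    currentId = attrs.getValue("id");
                    matched = false;
                } else if (qName.equals("data")) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text node in several chunks, so accumulate.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("data")) {
                    if (currentId != null && !matched && text.toString().contains(needle)) {
                        ids.add(currentId);
                        matched = true;
                    }
                    text = null;
                } else if (qName.equals("parent")) {
                    currentId = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Malformed input is reported as an unchecked exception in this sketch.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return ids;
    }
}
```

Because nothing but the current id and the current text buffer is kept in memory, this style should suit large files better than building a full DOM tree.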

【Comments】:

    【Solution 3】:

    Below is the code using VTD-XML:

    import com.ximpleware.*;
    
    public class xpathSearch {
    
        public static void main(String s[])throws VTDException{
            VTDGen vg = new VTDGen();
            if (!vg.parseFile("d:\\xml\\input.txt", false))
                    return;
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            ap.selectXPath("/base/parent/*/data[contains(.,'hey')]/../../@id");
            int i;
            while ((i=ap.evalXPath())!=-1)
                System.out.println("attr id has the value of  "+vn.toString(i+1));
    
        }
    }
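
For readers without VTD-XML on the classpath, a similar XPath query can be run with the JDK's built-in `javax.xml.xpath` API. The sketch below is not from the original answer: `XPathParentIdFinder` and `findParentIds` are hypothetical names, the XML is passed as a string for self-containment, and the expression is rewritten with a predicate on `parent` rather than stepping back up with `../..`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class XPathParentIdFinder {

    /** Returns the id of every <parent> with a <data> descendant containing "hey". */
    static List<String> findParentIds(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select the id attribute of each parent that has a matching data descendant.
            NodeList attrs = (NodeList) xpath.evaluate(
                "/base/parent[.//data[contains(., 'hey')]]/@id",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                ids.add(attrs.item(i).getNodeValue());
            }
            return ids;
        } catch (Exception e) {
            // Malformed input is reported as an unchecked exception in this sketch.
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }
}
```

The search text is hard-coded here, as in the VTD-XML query; splicing user input into an XPath string would need quoting care, so a variable resolver may be the safer route if the text has to be dynamic.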
    

    【Comments】:
