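【Posted】: 2017-11-08 05:31:05
【Question】:
I want to use Beautiful Soup to find the tags whose child tags (gains or losses) have a value greater than 0. For each of those I then want to print the contents of the inner tags "gains", "losses", and "band.textualrepresentation". This is basically the script I'm after (although this one doesn't work).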
import sys
from BeautifulSoup import BeautifulSoup as Soup
def parseLog(file):
    file = sys.argv[1]
    handler = open(file).read()
    soup = Soup(handler)
    for anytype in soup('anytype', 'gains'.string>0 || 'losses'.string>0):
        gain = anytype.gains.string
        loss = anytype.losses.string
        band = anytype.band.textualrepresentation.string
        print gain loss band
parseLog(sys.argv[1])
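I'm running into trouble right at the start: I can't even print the contents of gains, let alone print only the entries that meet a condition. My current script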
def parseLog(file):
    file = sys.argv[1]
    handler = open(file).read()
    soup = Soup(handler)
    for anytype in soup.findall('anytype'):
        gain = anytype.fetch('gains')
        print gain
parseLog(sys.argv[1])
returns
Traceback (most recent call last):
  File "./soup.py", line 13, in <module>
    parseLog(sys.argv[1])
  File "./soup.py", line 9, in parseLog
    for anytype in soup.findall('anytype'):
TypeError: 'NoneType' object is not callable
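A note on this traceback: as far as I can tell, BeautifulSoup 3 spells the method `findAll`, and a tag treats any attribute name it does not recognise as a child-tag lookup. So `soup.findall` looks for a `<findall>` child, finds none, and evaluates to `None`, which is then called. The sketch below uses a hypothetical `FakeTag` class (not part of BeautifulSoup) to mimic that fallback; it only illustrates where the error message comes from, and it is written for Python 3.

```python
class FakeTag(object):
    """Hypothetical stand-in that mimics BeautifulSoup 3's attribute fallback."""

    def __init__(self, children):
        # children: dict mapping a child tag name to its value
        self.children = children

    def find(self, name):
        # Return the named child, or None when there is no such child.
        return self.children.get(name)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.find(name)


tag = FakeTag({"gains": "2"})
print(tag.gains)               # the existing child
print(tag.findall)             # no "findall" child, so this is None
print(hasattr(tag, "losses"))  # True: the fallback returns None instead of raising

try:
    tag.findall("anytype")     # this calls None
except TypeError as exc:
    print(exc)
```

The same fallback means `hasattr(tag, name)` is true for any name, so it cannot be used to test whether a child tag exists; comparing the looked-up value with `None` is the safer check.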
.
Sample input
<anytype xsi:type="GainLossStruct">
  <band>
    <textualrepresentation>
      22q11.1
    </textualrepresentation>
  </band>
  <gains>
    2
  </gains>
  <losses>
    1
  </losses>
  <structs>
    0
  </structs>
</anytype>
<anytype xsi:type="GainLossStruct">
  <band>
    <textualrepresentation>
      22q11.2
    </textualrepresentation>
  </band>
  <gains>
    0
  </gains>
  <losses>
    1
  </losses>
  <structs>
    0
  </structs>
</anytype>
<anytype xsi:type="GainLossStruct">
  <band>
    <textualrepresentation>
      22q12
    </textualrepresentation>
  </band>
  <gains>
    0
  </gains>
  <losses>
    0
  </losses>
  <structs>
    0
  </structs>
</anytype>
Sample output
2 1 22q11.1
0 1 22q11.2
.
.
UPDATE: current solution
import sys
from BeautifulSoup import BeautifulSoup as Soup
def parseLog(file):
    file = sys.argv[1]
    handler = open(file).read()
    soup = Soup(handler)
    for anytype in soup(lambda x: x.name=='anytype' and (hasattr(x, 'gains') and int(x.gains.string) > 0 or hasattr(x, 'losses') and int(x.losses.string) > 0)):
        gain = anytype.gains.string
        loss = anytype.losses.string
        band = anytype.band.textualrepresentation.string
        print gain, loss, band
parseLog(sys.argv[1])
This still returns an error
Traceback (most recent call last):
  File "./soup.py", line 15, in <module>
    parseLog(sys.argv[1])
  File "./soup.py", line 9, in parseLog
    for anytype in soup(lambda x: x.name=='anytype' and (hasattr(x, 'gains') and int(x.gains.string) > 0 or hasattr(x, 'losses') and int(x.losses.string) > 0)):
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 659, in __call__
    return apply(self.findAll, args, kwargs)
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 849, in findAll
    return self._findAll(name, attrs, text, limit, generator, **kwargs)
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 377, in _findAll
    found = strainer.search(i)
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 966, in search
    found = self.searchTag(markup)
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 924, in searchTag
    or (markup and self._matches(markup, self.name)) \
  File "/Users/jacob/homebrew/lib/python2.7/site-packages/BeautifulSoup.py", line 983, in _matches
    result = matchAgainst(markup)
  File "./soup.py", line 9, in <lambda>
    for anytype in soup(lambda x: x.name=='anytype' and (hasattr(x, 'gains') and int(x.gains.string) > 0 or hasattr(x, 'losses') and int(x.losses.string) > 0)):
AttributeError: 'NoneType' object has no attribute 'string'
Even if I reduce the for loop to
for anytype in soup(lambda x: x.name=='anytype' and (hasattr(x, 'gains'))):
    gain = anytype.gains.string
    print gain
I still get
Traceback (most recent call last):
  File "./soup.py", line 13, in <module>
    parseLog(sys.argv[1])
  File "./soup.py", line 10, in parseLog
    gain = anytype.gains.string
AttributeError: 'NoneType' object has no attribute 'string'
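For reference, the filtering described in this question can be sketched without BeautifulSoup at all. The version below swaps in the standard-library `xml.etree.ElementTree` parser and targets Python 3, so it is not the script from the question. It rests on a few assumptions: the fragment is wrapped in a hypothetical `<root>` element that declares the `xsi` prefix (the sample has several top-level elements and never declares it), a missing or empty `gains`/`losses` value is treated as 0, and the names `parse_log`, `_to_int`, and `SAMPLE` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: xsi is the usual XML Schema instance namespace.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def _to_int(text):
    """Convert tag text to int, treating missing or empty text as 0."""
    text = (text or "").strip()
    return int(text) if text else 0


def parse_log(fragment):
    """Return (gains, losses, band) for each anytype with gains > 0 or losses > 0."""
    # Wrap the fragment so it has a single root and a declared xsi prefix.
    wrapped = '<root xmlns:xsi="%s">%s</root>' % (XSI_NS, fragment)
    root = ET.fromstring(wrapped)
    rows = []
    for anytype in root.iter("anytype"):
        gains = _to_int(anytype.findtext("gains"))
        losses = _to_int(anytype.findtext("losses"))
        band = (anytype.findtext("band/textualrepresentation") or "").strip()
        if gains > 0 or losses > 0:
            rows.append((gains, losses, band))
    return rows


# Condensed copy of the three sample records from the question.
SAMPLE = """
<anytype xsi:type="GainLossStruct">
  <band><textualrepresentation> 22q11.1 </textualrepresentation></band>
  <gains> 2 </gains><losses> 1 </losses><structs> 0 </structs>
</anytype>
<anytype xsi:type="GainLossStruct">
  <band><textualrepresentation> 22q11.2 </textualrepresentation></band>
  <gains> 0 </gains><losses> 1 </losses><structs> 0 </structs>
</anytype>
<anytype xsi:type="GainLossStruct">
  <band><textualrepresentation> 22q12 </textualrepresentation></band>
  <gains> 0 </gains><losses> 0 </losses><structs> 0 </structs>
</anytype>
"""

for gains, losses, band in parse_log(SAMPLE):
    print(gains, losses, band)
```

Converting the text to `int` before comparing matters here: the tag contents are strings padded with whitespace, so comparing them with 0 directly would not give a numeric test.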
【Comments】:
Tags: python xml beautifulsoup