[Posted]: 2021-12-28 10:29:08
[Question]:
I want to use a regular expression to get the labels and the data out of this function. I tried this:
pattern = re.compile(r'/blabels: ],/b')
print(pattern)
result = soup.find("script", text=pattern)
But I get None, even without the boundaries.
Here is the soup:
<script>
Chart.defaults.LineWithLine = Chart.defaults.line;
new Chart(document.getElementById("chart-overall-mentions"), {
type: 'LineWithLine',
data: {
labels: [1637005508000,1637006108000,1637006708000,1637007308000,1637007908000,1637008508000,1637009108000,1637009708000,1637010308000,1637010908000,1637011508000,1637012108000,1637012708000,1637013308000,1637013908000,1637014508000,1637015108000,1637015708000,1637016308000,1637016908000,1637017508000,1637018108000,1637018708000,1637019308000,1637019908000,1637020508000,1637021108000,1637021708000,1637022308000,1637022908000,1637023508000,1637024108000,1637024708000,1637025308000,1637025908000,1637026508000,1637027108000,1637027708000,1637028308000,1637028908000,1637029508000,1637030108000,1637030708000,1637031308000,1637031908000,1637032508000,1637033108000,1637033708000,1637034308000,1637034908000,1637035508000,1637036108000,1637036708000,1637037308000,1637037908000,1637038508000,1637039108000,1637039708000,1637040308000,1637040908000,1637041508000,1637042108000,1637042708000,1637043308000,1637043908000,1637044508000,1637045108000,1637045708000,1637046308000,1637046908000,1637047508000,1637048108000,1637048708000,1637049308000,1637049908000,1637050508000,1637051108000,1637051708000,1637052308000,1637052908000,1637053508000,1637054108000,1637054708000,1637055308000,1637055908000,1637056508000,1637057108000,1637057708000,1637058308000,1637058908000,1637059508000,1637060108000,1637060708000,1637061308000,1637061908000,1637062508000,1637063108000,1637063708000,1637064308000,1637064908000,1637065508000,1637066108000,1637066708000,1637067308000,1637067908000,1637068508000,1637069108000,1637069708000,1637070308000,1637070908000,1637071508000,1637072108000,1637072708000,1637073308000,1637073908000,1637074508000,1637075108000,1637075708000,1637076308000,1637076908000,1637077508000,1637078108000,1637078708000,1637079308000,1637079908000,1637080508000,1637081108000,1637081708000,1637082308000,1637082908000,1637083508000,1637084108000,1637084708000,1637085308000,1637085908000,1637086508000,1637087108000,1637087708000,1637088308000,1637088908000,1637089508000,1637090108000,163
7090708000,1637091308000],
datasets: [{
data: [13,10,20,26,21,23,24,21,24,35,25,31,42,24,24,20,23,22,17,23,30,11,16,20,9,10,22,10,19,16,15,16,17,19,10,20,24,14,19,15,13,9,13,17,20,16,15,21,18,25,15,14,16,15,16,14,14,21,10,9,5,9,9,13,14,9,9,18,15,11,11,6,12,14,19,17,16,11,20,14,21,13,15,12,14,10,20,16,25,17,17,11,23,11,13,11,19,10,17,19,10,20,22,19,19,27,28,18,20,22,18,16,17,18,14,17,19,18,20,11,13,20,15,15,18,14,13,14,14,11,19,14,14,11,11,15,26,12,15,15,11,4,3,6],
pointRadius: 0,
borderColor: "#666",
fill: true,
yAxisID:'yAxis1'
},
]
},
options: {
tooltips: {
mode: 'index',
bodyFontSize: 18,
intersect: false,
titleFontSize: 16,
},
.
.
.
</script>
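[Comments]:
- The word-boundary syntax is `\b`. Also, `,\b` only matches when a word character immediately follows the `,`.
- Do you want to find the script tag, or extract all the numbers in the `labels` field?
- I want the values. I thought the expression (r'\blabels: ],\b') would return the data that starts with labels: and ends with ],
- No, neither the code nor the expression does that.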
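For what it's worth, here is a minimal sketch of one way to get both arrays once you have the script text. It uses only `re`; `script_text` is a shortened stand-in for the script shown in the question (in practice it would come from the tag's `.string`), and `parse_int_array` is a hypothetical helper, not part of any library. It assumes the arrays hold only comma-separated integers, and it drops whitespace inside the array because the pasted sample wraps in the middle of a number.

```python
import re

# Shortened stand-in for the <script> text from the question (assumption:
# on the real page this string would come from the script tag's .string).
script_text = """
new Chart(document.getElementById("chart-overall-mentions"), {
type: 'LineWithLine',
data: {
labels: [1637005508000,1637006108000,1637006708000],
datasets: [{
data: [13,10,20],
pointRadius: 0,
}]
}
"""

# The pattern from the question: "/b" is a literal slash plus "b", and the
# text "labels: ]," never occurs because the numbers sit between the brackets.
print(re.search(r'/blabels: ],/b', script_text))  # None

# \b is the word boundary; \[(.*?)\] lazily captures what is between the
# brackets, and re.DOTALL lets that capture run across line breaks.
LABELS_RE = re.compile(r'\blabels:\s*\[(.*?)\]', re.DOTALL)
# "data: {" is skipped automatically because the pattern requires "[".
DATA_RE = re.compile(r'\bdata:\s*\[(.*?)\]', re.DOTALL)


def parse_int_array(body):
    """Hypothetical helper: turn '1, 2,3' into [1, 2, 3]."""
    cleaned = re.sub(r'\s+', '', body)  # drop spaces and stray line wraps
    return [int(token) for token in cleaned.split(',') if token]


labels = parse_int_array(LABELS_RE.search(script_text).group(1))
data = parse_int_array(DATA_RE.search(script_text).group(1))

print(labels)
print(data)
```

The same compiled pattern (for example `LABELS_RE`) can also be passed to `soup.find("script", string=...)` to locate the tag first, since a pattern that really occurs in the script text is what makes that call return something other than None.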
Tags: python python-3.x regex re