You can use a for loop to iterate over the values in the "Molecular Formula" column. For example:
import requests
import pandas as pd
from bs4 import BeautifulSoup as BS

col_list = ["Molecular Formula"]  # this is a column title in my csv file
Chem = pd.read_csv("data.csv", usecols=col_list)

for c in Chem["Molecular Formula"]:
    res = requests.get(
        "https://hmdb.ca/unearth/q?utf8=✓&query="
        + c
        + "&searcher=metabolites&button="
    )
    html_page = res.content
    soup = BS(html_page, "html.parser")
    body = soup.find_all("div", attrs={"class": "hit-name"})
    for div in body:
        print(div.text)
    print("-" * 80)
Prints:
Succinylcholine
2-Ethyl-4,5-dimethylthiazole
Water
--------------------------------------------------------------------------------
Licoricesaponin C2
Illudin C2
Eremopetasitenin C2
Cinncassiol C2
Gladiatoside C2
Prostaglandin-c2
Capsicoside C2
Schidigerasaponin C2
Ganoderic acid C2
Ginsenoside C
Diethyl sulfide
Mangiferin
4-Nitrophenol
L-Acetylcarnitine
Malonic acid
11-trans-Leukotriene C4
(-)-Epigallocatechin
Tryptophan 2-C-mannoside
--------------------------------------------------------------------------------
Contents of data.csv:
Molecular Formula
H2O
C2
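The loop above builds the URL by string concatenation, which works for simple formulas such as H2O but could misbehave if a formula contains characters that are special in a query string (for example the "+" in an ion such as Na+). A minimal sketch of one way to avoid that, using only the standard library; `build_search_url` and `BASE_URL` are hypothetical names introduced here for illustration and are not part of the original code:

```python
from urllib.parse import urlencode

# Assumption: same endpoint and query parameters as in the loop above.
BASE_URL = "https://hmdb.ca/unearth/q"


def build_search_url(formula: str) -> str:
    """Return the search URL for one formula (hypothetical helper)."""
    params = {
        "utf8": "✓",
        "query": formula,
        "searcher": "metabolites",
        "button": "",
    }
    # urlencode percent-encodes every value, so "+" or spaces inside a
    # formula cannot change the meaning of the query string.
    return BASE_URL + "?" + urlencode(params)


print(build_search_url("H2O"))
print(build_search_url("Na+"))
```

The returned string can be passed to `requests.get(...)` in place of the concatenated URL. Alternatively, `requests.get` accepts the same dictionary directly through its `params` argument and does the encoding itself.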
EDIT: To save the results to CSV:
import requests
import pandas as pd
from bs4 import BeautifulSoup as BS

col_list = ["Molecular Formula"]  # this is a column title in my csv file
Chem = pd.read_csv("data.csv", usecols=col_list)

all_data = []
for c in Chem["Molecular Formula"]:
    print(f"Getting {c=}")
    res = requests.get(
        "https://hmdb.ca/unearth/q?utf8=✓&query="
        + c
        + "&searcher=metabolites&button="
    )
    html_page = res.content
    soup = BS(html_page, "html.parser")
    body = soup.find_all("div", attrs={"class": "hit-name"})
    for div in body:
        all_data.append([c, div.text])

df = pd.DataFrame(all_data, columns=["Molecular Formula", "Value"])
print(df)
df.to_csv("result.csv", index=False)
Prints:
Getting c='H2O'
Getting c='C2'
    Molecular Formula                         Value
0                 H2O               Succinylcholine
1                 H2O  2-Ethyl-4,5-dimethylthiazole
2                 H2O                         Water
3                  C2            Licoricesaponin C2
4                  C2                    Illudin C2
5                  C2           Eremopetasitenin C2
6                  C2                Cinncassiol C2
7                  C2               Gladiatoside C2
8                  C2              Prostaglandin-c2
9                  C2                Capsicoside C2
10                 C2          Schidigerasaponin C2
11                 C2             Ganoderic acid C2
12                 C2                 Ginsenoside C
13                 C2               Diethyl sulfide
14                 C2                    Mangiferin
15                 C2                 4-Nitrophenol
16                 C2             L-Acetylcarnitine
17                 C2                  Malonic acid
18                 C2       11-trans-Leukotriene C4
19                 C2           (-)-Epigallocatechin
20                 C2      Tryptophan 2-C-mannoside
and saves the results to result.csv.
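One case the loops above do not guard against is a blank or repeated cell in the "Molecular Formula" column. If data.csv has other columns and a row leaves this one empty, pandas reads the cell as NaN (a float), and concatenating it to the URL string raises a TypeError; a repeated formula simply triggers the same request twice. A minimal sketch of a pre-filtering step, assuming it is acceptable to skip blanks and query each formula only once; `clean_formulas` is a hypothetical helper, not part of the original code:

```python
def clean_formulas(values):
    """Return unique, non-empty formula strings in first-seen order.

    Hypothetical helper: accepts any iterable of cell values, for example
    the pandas column Chem["Molecular Formula"].
    """
    seen = set()
    cleaned = []
    for value in values:
        # Empty CSV cells arrive from pandas as float NaN, so anything
        # that is not a string is skipped.
        if not isinstance(value, str):
            continue
        formula = value.strip()
        if formula and formula not in seen:
            seen.add(formula)
            cleaned.append(formula)
    return cleaned


print(clean_formulas(["H2O", " C2 ", float("nan"), "H2O", ""]))
# prints ['H2O', 'C2']
```

With this in place, the loop header would become `for c in clean_formulas(Chem["Molecular Formula"]):` and the rest of the code stays the same.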