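[Posted]: 2021-08-17 14:23:39
[Question]:
I'm working on a project that you can see on Colab. In short, I'm running a Google search on the values in a specific column of an Excel file. I've linked both in case you'd like to take a look.
So basically, my code searches Google for the values in column F and returns the URLs I need in columns G, H, I, J and K.
The code is as follows: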
import pandas as pd
# Assumed import (not shown in the original post): the `google` package,
# whose googlesearch.search() accepts the keyword arguments used below.
from googlesearch import search

FILE_NAME = "planilha.xlsx"
QUERY_LIST = ("Site Oficial", "Linkedin", "Facebook", "Instagram", "Twitter")
TAB_NAME = "Sheet1"


def _get_company_information(company_name):
    """Retrieve the information based on the Query List on the given company name."""
    list_links = []
    # for query_item, query_validators in QUERY_LIST.items():
    for query_item in QUERY_LIST:
        # Keep only the first search result for each query term
        for query_result in search(
            f"{company_name} {query_item}",
            tld='com.br', lang='pt-br', num=1, start=0, stop=1, pause=1.0
        ):
            list_links.append(query_result)
    return list_links


if __name__ == '__main__':
    xl = pd.ExcelFile(FILE_NAME)
    with pd.ExcelWriter("output_" + FILE_NAME, mode="w", engine="openpyxl") as writer:
        print("- Parsing Excel file")
        df1 = xl.parse(TAB_NAME)
        # Get single row by iteration
        for row_number, row_data in df1.iterrows():
            company_name = row_data.get("Organização - Nome fantasia")
            print(f"-- Getting info for company: {company_name} . . .")
            df_company_info = _get_company_information(company_name=company_name)
            # Write the five URLs into the columns named after the query terms
            df1.loc[row_number, QUERY_LIST] = df_company_info
            print(f"-- Got info: {df_company_info} !!!")
            print()
        print("- Updating DF . . .")
        df1.to_excel(writer, index=False)
    print("- Completed!!!")
Running this code produces the following output and error:
- Parsing Excel file
-- Getting info for company: NORDEA DO BRASIL REPRESENTACOES LTDA . . .
-- Got info: ['https://www.emis.com/php/company-profile/BR/Nordea_do_Brasil_Representacoes_Ltda_en_2321032.html', 'https://no.linkedin.com/company/nordea', 'https://www.facebook.com/Nordea/', 'https://www.instagram.com/nordea_sverige/', 'https://twitter.com/nordea'] !!!
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-41-7f6b4a1d3574> in <module>()
AttributeError: 'str' object has no attribute 'sheet_state'
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
9 frames
/usr/local/lib/python3.7/dist-packages/openpyxl/writer/workbook.py in get_active_sheet(wb)
59 visible_sheets = [idx for idx, sheet in enumerate(wb._sheets) if sheet.sheet_state == "visible"]
60 if not visible_sheets:
---> 61 raise IndexError("At least one sheet must be visible")
62
63 idx = wb._active_sheet_index
IndexError: At least one sheet must be visible
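One possible reading of the two chained exceptions, offered as a guess rather than a diagnosis: something fails inside the `with` block before `df1.to_excel(writer, ...)` runs, and the writer then tries to save a workbook with no sheets while closing, which raises the `IndexError` on top of the earlier error. The sketch below imitates that pattern with a hypothetical `FakeWriter` class (not part of pandas or openpyxl) and a made-up `ValueError`, only to show how an error raised on exit gets chained onto the error from the block body.

```python
class FakeWriter:
    """Hypothetical stand-in for a writer that saves its sheets on exit."""

    def __init__(self):
        self.sheets = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # The "save" step runs on exit even when the body raised.
        if not self.sheets:
            raise IndexError("At least one sheet must be visible")
        return False  # do not suppress exceptions from the body


def write_report(fail_in_loop):
    """Write one sheet, optionally failing before anything is written."""
    with FakeWriter() as writer:
        if fail_in_loop:
            # Stands in for whatever goes wrong while processing rows.
            raise ValueError("row processing failed")
        writer.sheets.append("Sheet1")
    return writer.sheets


if __name__ == "__main__":
    try:
        write_report(fail_in_loop=True)
    except IndexError as err:
        print("raised on exit:", repr(err))
        # The original error is still reachable through __context__.
        print("raised in body:", repr(err.__context__))
```

If this is what is happening in the real script, the first exception in the traceback (the one above "During handling of the above exception") is the one worth reading in full.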
Strangely, it seems to work up to the fifth row of my sheet and then breaks with the error above.
Any idea what went wrong?
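If the failing step turns out to be the assignment of the results to the five columns, one possible cause (an assumption, since the traceback is cut off) is that a query returned no result, so the list has fewer than five entries. Below is a minimal sketch that always returns one entry per query, using `None` as a placeholder. `search_fn`, `first_result` and `get_company_links` are hypothetical names, and `search_fn` stands in for the real `search` call so the example runs without network access.

```python
from typing import Callable, Iterable, List, Optional

QUERY_LIST = ("Site Oficial", "Linkedin", "Facebook", "Instagram", "Twitter")


def first_result(results: Iterable[str]) -> Optional[str]:
    """Return the first URL from an iterable, or None if it is empty."""
    return next(iter(results), None)


def get_company_links(
    company_name: str,
    search_fn: Callable[[str], Iterable[str]],
) -> List[Optional[str]]:
    """Return exactly one entry (URL or None) per item in QUERY_LIST."""
    links = []
    for query_item in QUERY_LIST:
        try:
            links.append(first_result(search_fn(f"{company_name} {query_item}")))
        except Exception as exc:  # broad on purpose in this sketch
            print(f"search failed for {query_item!r}: {exc}")
            links.append(None)
    return links


if __name__ == "__main__":
    # Fake search: pretends that the Facebook query finds nothing.
    def fake_search(query: str) -> List[str]:
        return [] if "Facebook" in query else ["https://example.com/result"]

    print(get_company_links("ACME LTDA", fake_search))
```

With the real library, `search_fn` could be something like `lambda q: search(q, tld='com.br', lang='pt-br', num=1, start=0, stop=1, pause=1.0)`, and the returned list then always matches the five target columns in length.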
[Comments]:
Tags: python xlsxwriter google-search-api pandas.excelwriter