【Posted】: 2017-10-04 12:06:03
【Question】:
I have a CSV file, Decoded.csv:
Query,Doc,article_id,data_source
5000,how to get rid of serve burn acne,1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars.
2 Leave the paste on your skin overnight then wash it with cold water the next morning.
3 Do this regularly together with other natural treatments for acne scars to get rid of the scars as quickly as possible.,459,random
5001,what is hypospadia,A birth defect of the male urethra.,409,dummy
5002,difference between alimentary canal and accessory organs,The alimentary canal is the tube going from the mouth to the anus. The accessory organs are the organs located along that canal which produce enzymes to aid the digestion process.,461,nytimes
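It contains three queries: 5000, 5001, and 5002. The Doc value for query 5000 spans multiple lines, which confuses pandas. (1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars. 2 Leave the paste on your skin overnight then wash it with cold water the next morning. 3 Do this regularly together with other natural treatments for acne scars to get rid of the scars as quickly as possible.)
My Python code is as follows: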
def main():
    import pandas as pd
    dataframe = pd.read_csv("Decoded.csv")
    queries, docs = dataframe['Query'], dataframe['Doc']
    for idx in range(len(queries)):
        print("idx: ", idx, " ", queries[idx], " <-> ", docs[idx])
        query_doc_appended = (queries[idx] + " " + docs[idx])
        print(query_doc_appended)

if __name__ == '__main__':
    main()
It fails. Please show me how to remove the line breaks so that query 5000 gets its complete set of Doc sentences.
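One possible way to rejoin the broken lines before pandas reads the file is sketched below. It is only a sketch under several assumptions that the question does not confirm: every real record starts with digits followed by a comma, the query text contains no commas, and the last two fields (article_id, data_source) contain no commas. The sample rows also carry five fields while the header names four, so the sketch adds a hypothetical `query_id` column name for the leading number. The helper name `repair_csv` and the shortened sample text are illustrative, not part of the original file.

```python
import csv
import io
import re

# Shortened sample modelled on Decoded.csv from the question (Doc text abbreviated).
RAW = """Query,Doc,article_id,data_source
5000,how to get rid of serve burn acne,1 Rose water and sandalwood.
2 Leave the paste on overnight.
3 Do this regularly.,459,random
5001,what is hypospadia,A birth defect of the male urethra.,409,dummy
"""

# Assumption: a new logical record always begins with "<digits>,".
RECORD_START = re.compile(r"^\d+,")


def repair_csv(text):
    """Return CSV text in which each record sits on one properly quoted line."""
    header, *lines = text.splitlines()
    records = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if RECORD_START.match(line) or not records:
            records.append(line.strip())  # start of a new record
        else:
            records[-1] += " " + line.strip()  # continuation of the previous Doc

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # "query_id" is a hypothetical name for the unnamed leading column.
    writer.writerow(["query_id"] + header.split(","))
    for rec in records:
        # First two commas from the left, last two from the right;
        # whatever remains in the middle is the Doc text.
        query_id, query, rest = rec.split(",", 2)
        doc, article_id, data_source = rest.rsplit(",", 2)
        writer.writerow([query_id, query, doc, article_id, data_source])
    return out.getvalue()
```

The repaired text could then be handed to pandas with `pd.read_csv(io.StringIO(repair_csv(raw_text)))`, where `raw_text` is the content of Decoded.csv. If the file can be regenerated at its source, writing it with quoted Doc fields in the first place is probably the more robust fix.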
【Discussion】:
-
Is there any error message? What does your data file look like? It's unclear.
-
The question itself gives the data file Decoded.csv (Query,Doc,article_id,data_source...), and the error is:
Traceback (most recent call last):
line 53, in
main()
line 49, in main
query_doc_appended = (queries[idx] + " " + docs[idx])
TypeError: unsupported operand type(s) for +: 'float' and 'str'
idx: 0 how to get rid of serve burn acne 1 Rose water and sandalwood: Make a paste of rose water and sandalwood and gently apply it on your acne scars.
idx: 1 nan nan
-
What error message do you get when you run this program?
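-
The TypeError quoted in the comments above is consistent with a missing value: pandas represents an empty cell as NaN, which is a float, so `nan + " "` fails exactly as reported for idx 1. A minimal sketch of a guard follows; the helper names `is_missing` and `join_query_doc` are hypothetical, and skipping incomplete rows is only one possible policy.

```python
import math


def is_missing(value):
    """True for the float NaN that pandas uses for an empty cell."""
    return isinstance(value, float) and math.isnan(value)


def join_query_doc(query, doc):
    """Concatenate query and doc, or return None when either one is missing."""
    if is_missing(query) or is_missing(doc):
        return None
    return str(query) + " " + str(doc)
```

Inside the loop from the question this could be called as `join_query_doc(queries[idx], docs[idx])`. It only hides the symptom, though: the rows stay broken until the multi-line Doc value itself is repaired.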
Tags: python python-2.7 python-3.x pandas csv