【Posted】: 2019-02-22 18:07:55
【Question】:
I am learning SQLAlchemy and I am stuck. I have a SQL table (table1) with two fields: 'name' and 'other_names'.
I have an Excel file with two columns:
first_name alias
paul patrick
john joe
simon simone
john joey
john jo
I want to read the Excel file into table1 so that it looks like this (i.e. all aliases for the same name end up on one row):
paul patrick
john joe,joey,jo
simon simone
This is the idea of what I am trying to do. The code I have tried (with comments):
for line in open('file.txt', 'r'):  # for each line in the Excel file
    line = line.strip().split('\t')  # split each line into a name and an alias
    first_name = line[0]  # first name is the name before the tab
    alias = line[1]  # alias is the name after the tab
    instance = Session.query(session, tbs['table1'].name).filter_by(name=first_name)  # look through the database table, by name field, and see if the first name is there
    list_instance = [x[0] for x in instance]  # make a list of first names already in the database table
    if first_name not in list_instance:  # if the Excel first name is not in the database table
        alias_list = []  # make an empty list
        alias_list.append(alias)  # append the alias
        name_obj = lib.get_or_create(  # small function to make db object
            session,
            tbs["table1"],
            name=first_name,  # add first name to the name field
            other_names=alias_list  # add alias list to the other_names field
        )
    elif first_name in list_instance:  # elif first name already in db
        alias_list.append(alias)  # append the alias to the alias list made above
        name_obj = lib.get_or_create(
            session,
            tbs["table1"],
            name=first_name,
            other_names=alias_list  # create object as before, but use updated alias list
        )
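The problem is that the code above runs without errors, but the output is not an appended list; it is just a database table that looks like the Excel file, i.e.: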
name alias
paul patrick
john joe
simon simone
john joey
john jo
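Could someone point out where I am going wrong and, specifically, how I should modify this code? If the question is unclear, please let me know; I have tried to keep it a simple example. Specifically, how do I initialise a list and add to it as a field entry in a SQLAlchemy database table?
Update 1: I have updated my code based on the suggestions below, but I still have the problem. Here are the full goal, code, and test file.
Goal:
I have a table in a database (see the test file below for the data going into the table). The table has two fields: name (the Latin name, e.g. homo sapiens) and other_names (common names, e.g. human, man). I want to update one field of the table (other_names), so that instead of: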
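One way to get every alias for a name onto a single row is to group the file's rows in memory first and only then write one row per name. The following is a minimal sketch under that assumption, not the asker's method: `group_aliases` is a hypothetical helper (it is not part of SQLAlchemy or of the `lib` module used above), and the rows are assumed to have already been read as (first_name, alias) pairs.

```python
from collections import defaultdict


def group_aliases(rows):
    """Collect aliases per first name, keeping first-seen order.

    `rows` is an iterable of (first_name, alias) pairs. Hypothetical helper
    for illustration only.
    """
    grouped = defaultdict(list)
    for first_name, alias in rows:
        # Skip an alias that was already recorded for this name.
        if alias not in grouped[first_name]:
            grouped[first_name].append(alias)
    # Join each list into the comma-separated string the table should hold.
    return {name: ",".join(aliases) for name, aliases in grouped.items()}


# The sample rows from the question.
rows = [
    ("paul", "patrick"),
    ("john", "joe"),
    ("simon", "simone"),
    ("john", "joey"),
    ("john", "jo"),
]
result = group_aliases(rows)
print(result)
```

Each key/value pair in `result` can then be written as a single database row, so the loop never has to remember an `alias_list` from a previous iteration.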
Rana rugosa human
Rana rugosa man
Rana rugosa frog
Rana rugosa cow
I have:
Rana rugosa human,man,frog,cow
The test_data file looks like this:
origin_organism common_name tested_organism
Rana rugosa human -
Rana rugosa man -
Rana rugosa frog homo sapiens
Rana rugosa cow Rana rugosa
Rana rugosa frog Rana rugosa
Rana rugosa frog -
Rana rugosa frog -
Rana rugosa frog homo sapiens
- - -
- - homo sapiens
- - -
- - -
- - -
- - -
streptococcus pneumoniae - -
Code:
import sys
from sqlalchemy.orm import *
from sqlalchemy import *
from dbn.sqlalchemy_module import lib
import pandas as pd

engine = lib.get_engine(user="user", psw="pwd", db="db", db_host="111.111.111.11")
Base = lib.get_automapped_base(engine)
session = Session(engine)
tbs = lib.get_mapped_classes(Base)
session.rollback()
df = pd.read_excel('test_data.xlsx', sheet_name='test2')
for index, row in df.iterrows():
    origin_latin_name = row['origin_organism'].strip().lower()
    other_names_name = row['common_name'].strip().lower()
    tested_species = row['tested_organism'].strip().lower()
    if origin_latin_name not in [None, "None", "", "-"]:
        instance = [x[0] for x in Session.query(session, tbs['species'].name).filter_by(name=origin_latin_name).all()]
        if origin_latin_name not in instance:
            origin_species = lib.get_or_create(
                session,
                tbs["species"],
                name=origin_latin_name,
                other_names=other_names_name
            )
        elif origin_latin_name in instance:
            other_names_query = Session.query(session, tbs['species'].other_names).filter_by(name=origin_latin_name)
            other_names_query_list = [x for x in other_names_query]
            original_list2 = list(set([y for y in x[0].split(',') for x in other_names_query_list]))
            if other_names_name not in original_list2:
                original_list2.append(other_names_name)
                new_list = ','.join(original_list2)
                new_names = {'other_names': ','.join(original_list2)}
            origin_species = lib.get_or_create(
                session,
                tbs["species"],
                name=origin_latin_name,
                other_names=new_list
            )
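The part inside the elif statement does not work. I have run into two problems:
(1) The latest error I get: NameError: name 'new_list' is not defined
(2) The other error comes from another table that I also populate: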
map1 = lib.get_or_create(
    session,
    tbs["map1"],
    age_id_id=age,
    name_id_id=origin_species.id
)
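...it says origin_species cannot be found, but I think this is related to the elif statement, i.e. somehow the origin_species object is not being updated correctly.
I would be grateful if someone could help.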
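Both errors look like control-flow problems. `new_list` is only assigned when the common name is not yet in the stored list, and `origin_species` is never assigned for rows whose name is "-". A further guess, since the source of `lib.get_or_create` is not shown: if it looks up a row matching all the given fields and inserts one when nothing matches, each new alias would produce a new row rather than update the existing one. The sketch below shows one possible alternative: fetch the row by name, modify it in place, and return it from every branch. `Species` and `add_other_name` are hypothetical stand-ins for the automapped `tbs["species"]` class and the loop body, and an in-memory SQLite database replaces the real server (assumes SQLAlchemy 1.4 or later).

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Species(Base):
    """Hypothetical stand-in for the automapped tbs["species"] class."""
    __tablename__ = "species"
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True, nullable=False)
    other_names = Column(String, nullable=False, default="")


def add_other_name(session, latin_name, common_name):
    """Return the Species row for latin_name, inserting or updating it."""
    species = session.query(Species).filter_by(name=latin_name).one_or_none()
    if species is None:
        # First time this name is seen: insert a new row.
        species = Species(name=latin_name, other_names=common_name)
        session.add(species)
    else:
        # Row already exists: change it in place instead of adding a row.
        names = species.other_names.split(",") if species.other_names else []
        if common_name not in names:
            names.append(common_name)
        species.other_names = ",".join(names)
    session.flush()  # populates species.id so it can serve as a foreign key
    return species


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Common names taken from the question's test data for "Rana rugosa".
    for common in ["human", "man", "frog", "cow", "frog"]:
        origin_species = add_other_name(session, "rana rugosa", common)
    session.commit()
    species_id = origin_species.id
    rows = [(s.name, s.other_names) for s in session.query(Species).all()]

print(rows)
```

Because the function returns a row on every path, a later statement such as the `map1` insert always has an object to read `.id` from, provided the call is skipped for rows whose name is "-".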
【Comments】:
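- See ericlippert.com/2014/03/05/how-to-debug-small-programs for tips on how to debug your code.
- I think this will be useful to you: stackoverflow.com/questions/48799232/…. You may need pd.to_sql() to push the DataFrame to your database.
- If the condition does not execute, nothing under the elif gets assigned. So elif origin_latin_name in instance
- Not everything you want to do has to be done with SQLAlchemy. You are importing pandas. Just load your table into pandas and use a groupby statement. If you are only using SQLAlchemy to save the data, I am willing to show you.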
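The pandas route suggested in the last comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the commenter's code: the DataFrame is built inline from a few rows of the question's test data instead of being read with `pd.read_excel`, and the output column names `name` and `other_names` are chosen to match the table fields described in the question.

```python
import pandas as pd

# A few rows of the question's test data, built inline for the sketch.
df = pd.DataFrame({
    "origin_organism": ["Rana rugosa"] * 5 + ["-"],
    "common_name": ["human", "man", "frog", "cow", "frog", "-"],
})

# Normalise the text the same way the question's loop does.
df["origin_organism"] = df["origin_organism"].str.strip().str.lower()
df["common_name"] = df["common_name"].str.strip().str.lower()

# Drop placeholder rows that have no organism name.
df = df[df["origin_organism"] != "-"]

grouped = (
    df.drop_duplicates(["origin_organism", "common_name"])  # one row per pair
      .groupby("origin_organism", sort=False)["common_name"]
      .agg(",".join)                                        # join names per organism
      .reset_index()
      .rename(columns={"origin_organism": "name", "common_name": "other_names"})
)

print(grouped)
```

The resulting frame has one row per organism, so it could then be written out in a single step, for example with `DataFrame.to_sql`, rather than row by row.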
Tags: python sql sqlalchemy