【Question title】: Can I import data and table structure from CSV into MySQL using Python?
【Posted】: 2020-11-12 16:43:51
【Question】:

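I have a CSV file with a header row. My task is to use this file to create the schema and table, and then import the data into a MySQL/MSSQL database. I found a good article on creating a schema dynamically from a CSV, but I ran into two problems: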

  1. The function detects boolean columns as varchar(5):
cm_mac varchar(0),
partnerid varchar(0),
version varchar(0),
accountid varchar(0),
securityedgeenabled varchar(5)); <-- should be boolean
  2. When I try to import the data, I get:
"Exception has occurred: AttributeError
'Cursor' object has no attribute 'cursor'"
"Exception has occurred: ProgrammingError
not all arguments converted during bytes formatting"

Can anyone help me solve these problems? (Link to the article I mentioned.)

The code I am trying to run:

from flask import Flask, request, jsonify
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import Integer, Enum
from flask_marshmallow import Marshmallow
import os
import enum
import csv
import ast
from sqlalchemy.sql.sqltypes import Boolean
import mysql.connector
import MySQLdb

f = open('c:\Projects\python\splunk\secedge072120.csv', 'r')
reader = csv.reader(f)

longest = []
type_list = []
headers = []


def dataType(val, current_type):
    try:
        # Evaluates numbers to an appropriate type, and strings an error
        ###!!!Here needs to add boolean!!!###
        t = ast.literal_eval(val)
    except ValueError:
        return 'varchar'
    except SyntaxError:
        return 'varchar'
    if type(t) in [int, float]:
        if (type(t) in [int]) and current_type not in ['float', 'varchar']:
            # Use smallest possible int type
            if (-32768 < t < 32767) and current_type not in ['int', 'bigint']:
                return 'smallint'
            elif (-2147483648 < t < 2147483647) and current_type not in ['bigint']:
                return 'int'
            else:
                return 'bigint'
        if type(t) is float and current_type not in ['varchar']:
            return 'decimal'
    elif (type(t) is Boolean or bool):
        return 'boolean'
    else:
        return 'varchar'


for row in reader:
    if len(headers) == 0:
        headers = row
        for col in row:
            longest.append(0)
            type_list.append('')
    else:
        for i in range(len(row)):
            # NA is the csv null value
            if type_list[i] == 'varchar' or row[i] == 'NA':
                pass
            else:
                var_type = dataType(row[i], type_list[i])
                type_list[i] = var_type
        if len(row[i]) > longest[i]:
            longest[i] = len(row[i])
f.close()


insert_headers = (tuple(headers))
insert_values = ()
insert_values = tuple('?' for header in headers)
statement = 'create table if not exists stack_overflow_survey ('

for i in range(len(headers)):
    if type_list[i] == 'varchar':
        statement = (
            statement + '\n{} varchar({}),').format(headers[i].lower(), str(longest[i]))
    else:
        statement = (statement + '\n' + '{} {}' +
                     ',').format(headers[i].lower(), type_list[i])

statement = statement[:-1] + ');'


print(statement)


mydb = MySQLdb.connect(user='root', password='1234',
                       host='127.0.0.1',
                       database='employees')


cur = mydb.cursor()

csv_data = open(r'c:\Projects\python\splunk\secedge072120.csv', 'r')
reader = csv.reader(csv_data)
print(type(reader))
for row in reader:
    print(row)
    ###!!!Here error is happening !!!###
    cur.execute(f'INSERT INTO stack_overflow_survey({headers})' <--Error
                f'VALUES({insert_values})', row)
mydb.commit()
mydb.close()


# cur.execute(statement)
# csv_data = csv.reader('c:\Projects\python\splunk\secedge072120.csv')

Thanks for your help.

【Comments】:

    Tags: python mysql csv sqlalchemy


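    【Answer 1】:
    • For type detection, the function identifies a type by trying to evaluate the value as a Python constant. For booleans, that means it accepts only True and False, not true or TRUE. You may want to add an extra clause at the top that recognizes booleans regardless of case: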

      if val.lower() in ('true', 'false'):
        return 'boolean'
      
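    • I'm not sure about the errors you pasted, but the INSERT statement definitely has a formatting problem where it uses ({headers}) and ({insert_values}). Print the statement and adjust it so that there are no extra brackets or quotes. For example, you could use ({", ".join(headers)}) and ({", ".join(insert_values)}). It gets more complicated if your headers can contain spaces or otherwise need quoting in SQL.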
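The join-based approach described above can be sketched as follows. This is a minimal illustration, not the answerer's code: `build_insert` is a hypothetical helper, the backtick quoting assumes MySQL, and the `%s` placeholders assume a driver such as MySQLdb or mysql.connector (drivers such as pyodbc and sqlite3 use `?` instead).

```python
def build_insert(table, headers):
    """Build a parameterized INSERT statement (hypothetical helper)."""
    # Backtick-quote each column so names with spaces or reserved words work.
    # An embedded backtick is doubled, which is how MySQL escapes it.
    columns = ", ".join("`{}`".format(h.replace("`", "``")) for h in headers)
    # One %s placeholder per column; the driver substitutes the row values.
    placeholders = ", ".join(["%s"] * len(headers))
    # The table name is assumed to come from trusted code, not from user input.
    return "INSERT INTO `{}` ({}) VALUES ({})".format(table, columns, placeholders)


headers = ["cm_mac", "PartnerId", "Version", "AccountId", "SecurityEdgeEnabled"]
sql = build_insert("stack_overflow_survey", headers)
print(sql)
```

With a statement built this way, each data row can be passed as the second argument, as in `cur.execute(sql, row)`. The header row has to be skipped before inserting, and the column list and placeholders are joined as plain strings rather than formatted as a tuple, which is what avoids the stray quotes and parentheses.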

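    【Comments】:

    • Thanks, after adding this code almost all of my fields are detected as boolean:
    【Answer 2】:

    Sabik, thanks. After adding this code, almost all of my fields are detected as boolean: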

    try:
            # Evaluates numbers to an appropriate type, and strings an error
            ###!!!Here needs to add boolean!!!###
            t = ast.literal_eval(val)
        except ValueError:
            if val.lower() in ('true', 'false') or ('True', 'False'):
                return 'boolean'
            else:
                return 'varchar'
        except SyntaxError:
            if val.lower() in ('true', 'false') or ('True', 'False'):
                return 'boolean'
            else:
                return 'varchar'
        if type(t) in ('true', 'false') or ('True', 'False'):
            return 'boolean'
    

    Checking my query, I got this: query_result = (f'INSERT INTO stack_overflow_survey {insert_headers}' f'VALUES {insert_values}')

    INSERT INTO stack_overflow_survey ('cm_mac', 'PartnerId', 'Version', 'AccountId', 'SecurityEdgeEnabled')VALUES ('?', '?', '?', '?', '?')
    It seems my field names should not be quoted. How can I achieve that with a tuple or a list? Thanks.

    【Comments】:

    • Every if statement that ends in or ('True', 'False') will always fire, because that part evaluates whether ('True', 'False') is non-empty, and it is (it has two elements).
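The point in the comment above can be checked directly, and the check can be written without the trailing `or`. The sketch below is a simplified stand-in, not the asker's full function: `is_boolean` and `detect_type` are hypothetical names, and the smallint/int/bigint narrowing from the question is deliberately left out.

```python
import ast

# A non-empty tuple is truthy, so the `or` branch makes this True for any value.
val = "hello"
always_true = bool(val.lower() in ("true", "false") or ("True", "False"))


def is_boolean(val):
    """Return True only when the cell spells true/false in any letter case."""
    return val.strip().lower() in ("true", "false")


def detect_type(val):
    """Simplified type detection: check for booleans first, then numbers."""
    if is_boolean(val):
        return "boolean"
    try:
        t = ast.literal_eval(val)
    except (ValueError, SyntaxError):
        # Plain text is not a valid Python literal.
        return "varchar"
    if type(t) is int:
        return "int"
    if type(t) is float:
        return "decimal"
    return "varchar"


print(always_true)
print(detect_type("FALSE"), detect_type("12"), detect_type("abc"))
```

Lower-casing the value once already covers 'True' and 'False', so the second tuple is not needed at all. Doing the boolean check before `ast.literal_eval` also means that lowercase or uppercase spellings never reach the exception handlers.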