SQLAlchemy read_sql() into Pandas dataframe - large column value gets truncated

【Question title】: SQLAlchemy read_sql() into Pandas dataframe - large column value gets truncated
【Posted】: 2021-12-06 12:50:39
【Question】:

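I am trying to read data from a MySQL table in which one column contains large varchar values, for example of length 49085. When I read the query result into a dataframe, the column value appears truncated to 87 characters. See the code and output below. Does anyone know how I can read the whole string without truncation?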

In the code below, the table (myTable in the query) contains a column description, and one of its rows holds a string of length 49085.

Code:

import sys
import os
from sqlalchemy import create_engine
import pandas as pd

db_connection_str = 'mysql+pymysql://username:password@host/db_name'
db_connection = create_engine(db_connection_str)

#this returns 1 row where the value in the description field is of length 49085
df = pd.read_sql("select id, description, length(description) as len from myTable where length(description) = 49085", con=db_connection)

#this returns the truncated value of length 87
print(df)
len(str(df['description']))

Output:

   id                                             description    len
0  1  This document is for the testing Team.\n\nThe attach...  49085
87

【Comments】:

Have you tried a different driver?
I haven't, and I don't know much about that. Do you mean trying something other than sqlalchemy?
Yes, try mysql.connector

Tags: python mysql sql pandas sqlalchemy

【Solution 1】:

You are being misled by len(str(df['description'])). df['description'] returns a <class 'pandas.core.series.Series'> object, and if we call str() on it we get

'0    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...\nName: description, dtype: object'

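However large the string stored in the Series is, this display string has a length of 87 here, because pandas abbreviates each value with "..." when it renders the Series. To test the actual length of the string, use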

print(len(df['description'][0]))

or something similar.
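The difference between the two measurements can be reproduced without a database. The following is a minimal sketch, assuming a DataFrame built by hand as a stand-in for the query result (the data is made up for illustration); it also shows the pandas display option display.max_colwidth, which controls how much of each value is printed.

```python
import pandas as pd

# Hypothetical stand-in for the query result: one row with a 49085-character string.
long_text = "x" * 49085
df = pd.DataFrame({"id": [1], "description": [long_text]})

# str() on the Series yields its abbreviated display form, not the stored value.
repr_length = len(str(df["description"]))

# The stored value itself is intact; .iloc[0] selects the first row by position.
value_length = len(df["description"].iloc[0])

# .str.len() reports the length of every value in the column at once.
all_lengths = df["description"].str.len().tolist()

# Lifting the display limit makes the printed form include the whole value.
with pd.option_context("display.max_colwidth", None):
    full_repr_length = len(str(df["description"]))

print(repr_length, value_length, all_lengths, full_repr_length)
```

The short display length only describes how pandas prints the Series; the value held in the DataFrame is never shortened.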

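【Comments】:

Thanks! That is good to know. I tried print(len(df['description'][0])) and it does show the correct length of 49085. But when I write df to a .txt file, I still get the truncated value. Below is the code I use to write it to the .txt file.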
writePath = r'sample_data.txt'
with open(writePath, 'a') as f:
    dfAsString = df.to_string(index=False)
    f.writelines(dfAsString)
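If you want to dump the DataFrame to a text file, you would probably be better off using something like df.to_csv()
Correct me if I'm wrong, but wouldn't that have the same problem? To open the csv I would need to do it in Excel, and the maximum length of a cell in Excel is 30k characters.
Many applications other than Excel can work with CSV files. What exactly do you intend to do with that text file?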
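The df.to_csv() suggestion from the comments can be checked with a round trip. This is only a sketch under stated assumptions: the DataFrame is hand-built to mimic the row from the question, and the file name sample_data.csv and the temporary directory are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the query result, with embedded newlines as in the question.
prefix = "This document is for the testing Team.\n\n"
long_text = prefix + "x" * (49085 - len(prefix))
df = pd.DataFrame({"id": [1], "description": [long_text]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumed file name; any writable path would do.
    csv_path = os.path.join(tmp_dir, "sample_data.csv")

    # to_csv writes the full cell value, quoting fields that contain newlines.
    df.to_csv(csv_path, index=False)

    # Read the file back with pandas instead of Excel to inspect the stored value.
    restored = pd.read_csv(csv_path)

restored_length = len(restored["description"].iloc[0])
print(restored_length)
```

Reading the file back with pandas, or with any other CSV-aware tool, sidesteps the Excel cell limit raised in the comments, since that limit belongs to Excel rather than to the CSV file.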