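【Posted】:2021-06-03 06:01:36
【Question】:
I am fairly new to TensorFlow and Stack Overflow, so please bear with me. My question is: "How do I load a custom dataset spreadsheet into TensorFlow using the .get_file() method and the pandas read methods?" I have searched the TensorFlow website, Stack Overflow, and other sites, but they all seem either to use publicly available online data or to do some odd imports with different methods that I don't understand. Here is what I have so far: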
import tensorflow as tf
import pandas as pd
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
# This is just some flower data online
train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")
# I have a spreadsheet on my machine with the exact same data. I want to use those files instead
train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
# Here I am reading a csv file inputting the data, labels, and defining header. Should I use pd.read_excel instead because the files on my machine are excel files?
train_y = train.pop('Species')
test_y = test.pop('Species') # removes answers/thing to predict and test against
Thank you very much for reading!
【Comments】:
- Or, if there is a way to work with the data online through Google datasets, that would work too.
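- A possible direction, offered as a rough sketch rather than a definitive answer: tf.keras.utils.get_file() only downloads a URL into a local cache and returns the path of the cached file, so for a spreadsheet that already sits on your machine you can probably skip it and hand the path straight to pandas. The helper name load_local_table, the file name my_iris_training.csv, and the three sample rows below are made up for illustration; the temporary file only stands in for your own spreadsheet so the snippet runs on its own. pd.read_excel accepts the same names= and header= arguments as pd.read_csv, but it needs an Excel engine such as openpyxl installed.

```python
import os
import tempfile

import pandas as pd

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']


def load_local_table(path):
    """Hypothetical helper: read a local CSV or Excel file with the Iris column names."""
    ext = os.path.splitext(path)[1].lower()
    if ext in ('.xlsx', '.xls'):
        # Excel files: pd.read_excel requires an engine such as openpyxl.
        return pd.read_excel(path, names=CSV_COLUMN_NAMES, header=0)
    # Anything else is treated as CSV.
    return pd.read_csv(path, names=CSV_COLUMN_NAMES, header=0)


# Stand-in for a spreadsheet already on disk (made-up rows, written only so the sketch runs).
tmp_dir = tempfile.mkdtemp()
local_train_path = os.path.join(tmp_dir, "my_iris_training.csv")
with open(local_train_path, "w") as f:
    f.write("sepal_length,sepal_width,petal_length,petal_width,species\n")  # header row, replaced by names=
    f.write("6.4,2.8,5.6,2.2,2\n")
    f.write("5.0,2.3,3.3,1.0,1\n")
    f.write("4.9,3.1,1.5,0.1,0\n")

# No get_file() call: the local path goes directly to pandas.
train = load_local_table(local_train_path)
train_y = train.pop('Species')  # split off the label column, as in the question
```

With your own data you would replace local_train_path with the real location of the file, for example a path ending in .xlsx, and the rest of the code from the question (the pop('Species') step and whatever follows it) should work unchanged on the resulting DataFrame.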
Tags: pandas tensorflow dataset tensorflow2.0