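【Posted】: 2021-03-08 04:10:12
【Question】:
I would like to import the data found at https://www.asx.com.au/data/shortsell.txt so that it becomes a 691x7 table. How do I tell the import to recognize the different columns?
Thanks in advance.
【Discussion】: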
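You have most likely solved your problem by now. If not, here is a suggestion: you could write some regular expressions to read the lines, but in my view that is too much trouble. As I see it, what defines the columns is their width. So you have to count characters, and the rest is simple. Do a first count, then visualize your result with the following (this is code you will need later anyway):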
with open('shortsell.txt', 'r') as file:
    for _ in range(5):  # Skip the first 5 rows
        next(file)
    for line in file:
        # Mark the assumed column boundaries with '|' to check them by eye
        print(line[:8] + '|' + line[8:42] + '|' + line[42:56] + '|'
              + line[56:71] + '|' + line[71:90] + '|' + line[90:].rstrip())
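If counting by hand gets tedious, a first guess at the boundaries can be made by looking for character positions that are blank on every line. This is only a sketch and not part of the approach above: `blank_columns` is a hypothetical helper name, and a space inside a field that happens to line up on every row will show up too, so the result still needs checking by eye.

```python
def blank_columns(lines):
    """Return the character positions that are a space on every line.

    Hypothetical helper: runs of consecutive positions hint at the
    gaps between fixed-width columns.
    """
    rows = [line.rstrip('\n') for line in lines]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short lines so every row can be indexed up to the full width
    padded = [row.ljust(width) for row in rows]
    return [i for i in range(width)
            if all(row[i] == ' ' for row in padded)]
```

Feeding it a handful of data lines (after the header rows) is usually enough to see where the gaps sit.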
Once you have the right widths:
with open('shortsell.txt', 'r') as file:
    for _ in range(5):  # Skip the first 5 rows
        next(file)
    # Read the three header lines and slice each into its column parts
    columns = [[line[:8].strip(), line[8:42].strip(), line[42:56].strip(),
                line[56:71].strip(), line[71:90].strip(), line[90:].strip()]
               for line in (next(file), next(file), next(file))]
    # Join the parts of each column name across the three header lines
    columns = [' '.join(columns[i][j] for i in range(3)).strip()
               for j in range(6)]
    # Read the data rows and cast each field to a fitting type
    data = [[line[:8].strip(),
             line[8:42].strip(),
             line[42:56].strip(),
             int(line[56:71].strip().replace(',', '')),
             int(line[71:90].strip().replace(',', '')),
             float(line[90:].strip().replace(',', ''))]
            for line in file if line.strip()]  # Ignore blank lines
Result:
['ASX Code',
'Company Name',
'Product/ Class',
'Reported Gross Short Sales (a) ASX + CHI-X',
'Issued Capital (b)',
'% of issued capital reported as short sold (a)/(b)']
[['360', 'LIFE360 INC.', 'CDI FORUS', 8999, 148866201, 0.0],
['3DP', 'POINTERRA LIMITED', 'FPO', 15213, 670733112, 0.0],
['4DS', '4DS MEMORY LIMITED', 'FPO', 15000, 1310693486, 0.0],
...
['ZEL', 'Z ENERGY LIMITED.', 'FPO NZX', 23255, 520476853, 0.0],
['ZLD', 'ZELIRA THERAPEUTICS LIMITED', 'FPO', 101860, 1185322966, 0.0],
['ZNO', 'ZOONO GROUP LIMITED', 'FPO', 67213, 163612707, 0.04]]
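The repeated slices can also be driven from a single list of boundaries, which makes later adjustments less error-prone. The following is a minimal sketch, assuming the same offsets as in the code above (8, 42, 56, 71, 90); `BOUNDS`, `split_fixed_width`, `parse_row` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Column boundaries, assumed to match the slices used above.
# The trailing None means "to the end of the line".
BOUNDS = [0, 8, 42, 56, 71, 90, None]


def split_fixed_width(line, bounds=BOUNDS):
    """Cut one line into stripped fields at the given boundaries."""
    return [line[start:end].strip()
            for start, end in zip(bounds, bounds[1:])]


def parse_row(line):
    """Turn one data line into a dict with numeric fields cast."""
    code, name, product, short_sales, issued, pct = split_fixed_width(line)
    return {
        'code': code,
        'name': name,
        'product': product,
        # Thousands separators are removed before casting
        'short_sales': int(short_sales.replace(',', '')),
        'issued_capital': int(issued.replace(',', '')),
        'pct_short': float(pct.replace(',', '')),
    }
```

With this in place, the data rows could be read as `[parse_row(line) for line in file if line.strip()]` once the header lines have been consumed.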
【Comments】: