【Question title】: Not getting txt into Python
【Posted】: 2015-12-30 09:10:02
【Question】:

I am trying to fetch key financial data for a specific company (stock in the code below) with this code:

        netIncomeAr = []

        endLink = 'order=asc'   # order=asc&
        try:

            netIncome = urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read()

            splitNI = netIncome.split('\n')
            print('Net Income:')
            for eachNI in splitNI[1:-1]:
                print(eachNI)
                netIncomeAr.append(eachNI)


            incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                            converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

        except Exception as e:
            print('failed in the Quandl grab')
            print(str(e))
            time.sleep(555)

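But I get the 'failed in the Quandl grab' error message that I wrote myself. I know the error must be in the first line, where urllib.request calls Quandl.

Does anyone know why this code is not working?

OK, thanks Roland,

I have changed my code to this limited proof-of-concept snippet: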

import urllib.request, urllib.error, urllib.parse
import time
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates

evenBetter = ['GOOG','AAPL']


def graphData(stock, MA1, MA2):
    #######################################
    #######################################
    '''
        Use this to dynamically pull a stock from Quandl:
    '''
    print('Currently Pulling',stock)

    netIncomeAr = []
#    revAr = []
#    ROCAr = []

    endLink = 'order=asc'

    netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
    # convert to string, remove leading "b'" and trailing "'" characters.
    # netIncome = 'head\\ndata\\ndata\\n...'


    splitNI = netIncome.split('\\')[1:-1]
    # data segments still have leading 'n' character.
    # the [1:-1] is more pythonic and releases memory.
    for i in range (len(splitNI)):
        splitNI[i] = splitNI[i][1:]
    # data segments are now converted.

    print('Net Income:')
    for eachNI in splitNI:
        print(eachNI)
        netIncomeAr.append(eachNI)


    incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

for stock in evenBetter:
    graphData(stock,25,50)

Now the urllib.request problem has turned into a different problem... the error below:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-3-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 57, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 54, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in <listcomp>
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\matplotlib\dates.py", line 261, in __call__
    return date2num(datetime.datetime(*time.strptime(s, self.fmt)[:6]))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 494, in _strptime_time
    tt = _strptime(data_string, format)[0]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 306, in _strptime
    raise TypeError(msg.format(index, type(arg)))

TypeError: strptime() argument 0 must be str, not <class 'bytes'>

Following Davse Bamse's suggestion, I see the traceback below (it is a tough one):

Currently Pulling GOOG
Net Income:
Traceback (most recent call last):

  File "<ipython-input-3-c3f1db0f3995>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 59, in <module>
    graphData(stock)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 56, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 845, in loadtxt
    converters[i] = conv

IndexError: list assignment index out of range

With Davse Bamse's new suggestion, adding a list at the converter line like this:

[incomeDate, income] = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

I see this error:

Currently Pulling GOOG
Net Income:
C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py:823: UserWarning: loadtxt: Empty input file: "[]"
  warnings.warn('loadtxt: Empty input file: "%s"' % fname)
Traceback (most recent call last):

  File "<ipython-input-1-c3f1db0f3995>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 60, in <module>
    graphData(stock)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 56, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 845, in loadtxt
    converters[i] = conv

IndexError: list assignment index out of range

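Thanks for your input on October 12, 2015, Davse Bamse,

but I am not sure where to insert the .join you mentioned...

Could you please copy this snippet and post your (edited) proposal? I need to see the light! This is what I have after all the edits up to October 12.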

import urllib.request, urllib.error, urllib.parse
import numpy as np
import matplotlib.dates as mdates

stocklist = ['GOOG']


def graphData(stock, MA1, MA2):
    #######################################
    #######################################
    '''
        Use this to dynamically pull a stock from Quandl:
    '''
    print('Currently Pulling',stock)

    netIncomeAr = []

    endLink = 'order=asc'   # order=asc&

    netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
    # convert to string, remove leading "b'" and trailing "'" characters.
    # netIncome = 'head\\ndata\\ndata\\n...'


    splitNI = netIncome.split('\\')[1:-1]
    # data segments still have leading 'n' character.
    # the [1:-1] is more pythonic and releases memory.
    for i in range (len(splitNI)):
        splitNI[i] = splitNI[i][1:]
    # data segments are now converted.

    print('Net Income:')
    for eachNI in splitNI:
        print(eachNI)
        netIncomeAr.append(eachNI)


    incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

for stock in stocklist:
    graphData(stock,25,50)

With today's (13-10-2015) input from Davse Bamse, I get the following error:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-13-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 54, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 51, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 740, in loadtxt
    fh = iter(open(fname))

OSError: [Errno 22] Invalid argument: '2009-12-31,6520448000.0\n2010-12-31,8505000000.0\n2011-12-31,9737000000.0\n2012-12-31,10737000000.0\n2013-12-31,12920000000.0'

Davse Bamse suggested that I use io.StringIO like this:

incomeDate, income = StringIO(np.loadtxt('\n'.join(netIncomeAr), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')}))

But this gives me the same error as before... any ideas?

Changing the converter line to this:

incomeDate, income = np.loadtxt(StringIO('\n'.join(netIncomeAr)), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

gives the following stack trace:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-26-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 60, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 57, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in <listcomp>
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\matplotlib\dates.py", line 261, in __call__
    return date2num(datetime.datetime(*time.strptime(s, self.fmt)[:6]))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 494, in _strptime_time
    tt = _strptime(data_string, format)[0]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 306, in _strptime
    raise TypeError(msg.format(index, type(arg)))

TypeError: strptime() argument 0 must be str, not <class 'bytes'>

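I found another approach: np.genfromtxt instead of NumPy's loadtxt (I am on NumPy 1.9.2), which apparently works, as described in the solution to "numpy.loadtxt does not read file with complex numbers".

So, using this converter line instead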

incomeDate, income = np.genfromtxt('\n'.join(netIncomeAr), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

outputs

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-10-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 50, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 47, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 1366, in genfromtxt
    fhd = iter(np.lib._datasource.open(fname, 'rb'))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\_datasource.py", line 151, in open
    return ds.open(path, mode)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\_datasource.py", line 501, in open
    raise IOError("%s not found." % path)

OSError: 2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0 not found.

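I don't know whether this is better or worse...

【Comments】:

  • What does print(str(e)) print?
  • Remove the try/except around the code block, run it again, and post the stack trace. The "except Exception as e:" line is masking the error.
  • Thanks Roland, the stack trace is now included in the question above.

Tags: python csv python-3.x data-retrieval quandl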


【Solution 1】:

As Roland pointed out, the problem is that read() returns a bytes object rather than a string.

The code should look like this:

netIncomeBytes = urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read()
netIncome = netIncomeBytes.decode("utf-8")

This decodes the bytes as UTF-8 into a string.
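As a minimal sketch of the decode step, here is the same idea with a hard-coded bytes literal standing in for the Quandl response. The payload and its `Date,Value` header are assumptions for illustration, modeled on the rows printed in the question, not the actual API output.

```python
# Hypothetical payload standing in for urlopen(...).read();
# the real response is not reproduced here.
raw = b"Date,Value\n2009-12-31,6520448000.0\n2010-12-31,8505000000.0\n"

text = raw.decode("utf-8")   # bytes -> str
lines = text.splitlines()    # split on real newlines, no empty trailing item
data_lines = lines[1:]       # drop the header row

for line in data_lines:
    print(line)
```

Once the text is a real str, splitting on newlines works directly and no manual stripping of leading or trailing characters is needed.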

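【Comments】:

  • Thanks David, I guess this is not an easy one to crack. Now I get a different error and see no data at all. Please see my edit in the question above.
  • That did solve the first problem. Now the problem is in the line where you create the converters. Try adding it as a list (surrounded by [ and ]). It says it cannot get the i-th converter, which is why I guess it has to be a list. Can you find the documentation for the function you are calling?
  • It seems to work here without the list [ ]: stackoverflow.com/questions/22582691/… Or I am not sure where you would put the list...?
  • OK, I added a list at the converter line. See the edit in my original question. Still seeing an error. ???
  • I have read the documentation for the loadtxt function. converters is a dictionary, not a list as I thought. There is no need to use [ and ].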
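For reference, the two columns can also be parsed without np.loadtxt, which sidesteps both the file-name interpretation of a plain string and the bytes-versus-str converter error seen in the question. This is a sketch using only the standard library, not the approach taken in the thread; `parse_rows` is a hypothetical helper name, and the sample lines are copied from the question's output.

```python
import csv
from datetime import datetime

def parse_rows(lines):
    """Parse 'YYYY-MM-DD,value' lines into (dates, values) lists."""
    dates, values = [], []
    for date_text, value_text in csv.reader(lines):
        dates.append(datetime.strptime(date_text, "%Y-%m-%d"))
        values.append(float(value_text))
    return dates, values

# Sample rows taken from the output shown in the question.
sample = [
    "2009-12-31,6520448000.0",
    "2010-12-31,8505000000.0",
]
income_dates, income = parse_rows(sample)
print(income_dates[0].date(), income[0])
```

If Matplotlib date numbers are needed for plotting, the resulting datetime objects can be passed to matplotlib.dates.date2num.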
【Solution 2】:

In Python 3.x, urllib.request.urlopen(...).read() returns a bytes object on success, not a str object.

A workaround for turning the bytes into a string is as follows:

...
netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
# convert to string, remove leading "b'" and trailing "'" characters.
# netIncome = 'head\\ndata\\ndata\\n...'
...

splitNI = netIncome.split('\\')[1:-1]
# data segments still have leading 'n' character.
# the [1:-1] is more pythonic and releases memory.
for i in range (len(splitNI)):
    splitNI[i] = splitNI[i][1:]
# data segments are now converted.

print('Net Income:')
for eachNI in splitNI:
    print(eachNI)
    netIncomeAr.append(eachNI)
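To see why the code above splits on a backslash rather than on a newline: calling str() on a bytes object produces its repr, so each newline becomes the two characters backslash and n. A small sketch with a made-up bytes literal (an assumption for illustration) comparing that with decode():

```python
raw = b"head\ndata1\ndata2\n"   # hypothetical stand-in for the response body

as_repr = str(raw)[2:-1]        # repr text with the leading b' and trailing ' stripped
as_text = raw.decode("utf-8")   # the actual decoded text

print("\\n" in as_repr)         # is backslash + 'n' present as two characters?
print("\n" in as_repr)          # is there a real newline in the repr text?
print(as_text.splitlines())     # decoded text splits on real newlines
```

With decode(), the manual removal of the leading 'n' from each segment is no longer needed.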

【Comments】:
