A NumPy equivalent of pandas read_clipboard? Answers

【Question title】: A NumPy equivalent of pandas read_clipboard?
【Posted】: 2018-02-02 13:28:57
【Question】:

For example, suppose you come across a question or answer that posts an array like this:

[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]
 [24 25 26 27 28 29 30 31]
 [32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47]
 [48 49 50 51 52 53 54 55]
 [56 57 58 59 60 61 62 63]]

How would you load it into a variable in a REPL session without having to add commas everywhere?

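【Comments】:

I don't think there is a direct equivalent.
@juanpa.arrivillaga Well, there should be. Super-convenient methods ftw.
Ehhh, I'm more inclined to think people should be responsible for providing reproducible examples. How hard is print(repr(arr))? Still, it's a good question.
@juanpa.arrivillaga I'm 100% with you. Sadly, questions with reproducible examples are an endangered species, and I cherish them whenever I can.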
stackoverflow.com/questions/101128/…, pypi.python.org/pypi/pyperclip, pypi.python.org/pypi/clipboard/0.0.4

Tags: python numpy clipboard

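【Solution 1】:

For a one-off, I would probably do this:

1. Copy the text containing the array to the clipboard.
2. In an ipython shell, type s = """ but do not press Enter.
3. Paste the text from the clipboard.
4. Type the closing triple quote.

This gives me: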

In [16]: s = """[[ 0  1  2  3  4  5  6  7]
    ...:  [ 8  9 10 11 12 13 14 15]
    ...:  [16 17 18 19 20 21 22 23]
    ...:  [24 25 26 27 28 29 30 31]
    ...:  [32 33 34 35 36 37 38 39]
    ...:  [40 41 42 43 44 45 46 47]
    ...:  [48 49 50 51 52 53 54 55]
    ...:  [56 57 58 59 60 61 62 63]]"""

Then use np.loadtxt() as follows:

In [17]: a = np.loadtxt([line.lstrip(' [').rstrip(']') for line in s.splitlines()], dtype=int)

In [18]: a
Out[18]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47],
       [48, 49, 50, 51, 52, 53, 54, 55],
       [56, 57, 58, 59, 60, 61, 62, 63]])
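
If you do this more than once, the same idea can be wrapped in a small helper. The following is only a sketch of the np.loadtxt() approach above; the name array_from_text is made up for illustration, and it assumes a 2-D array whose rows each fit on a single line.

```python
import numpy as np

def array_from_text(s, dtype=int):
    """Parse the str() form of a 2-D array, one row per line (hypothetical helper)."""
    # Strip spaces and square brackets from both ends of every line,
    # and skip lines that end up empty.
    lines = [line.strip(' []') for line in s.splitlines()]
    lines = [line for line in lines if line]
    # np.loadtxt accepts a list of strings and splits on whitespace by default.
    return np.loadtxt(lines, dtype=dtype)

s = """[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]"""
a = array_from_text(s)
```

Pass dtype=float for arrays of floating-point values.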

【Comments】:

【Solution 2】:

If you have pandas, pyperclip, or something else that can read from the clipboard, you could use something like this:

from pandas.io.clipboard import clipboard_get
# import pyperclip
import numpy as np
import re
import ast

def numpy_from_clipboard():
    inp = clipboard_get()
    # inp = pyperclip.paste()
    inp = inp.strip()
    # if it starts with "array(" we just need to remove the
    # leading "array(" and remove the optional ", dtype=xxx)"
    if inp.startswith('array('):
        inp = re.sub(r'^array\(', '', inp)
        dtype = re.search(r', dtype=(\w+)\)$', inp)
        if dtype:
            return np.array(ast.literal_eval(inp[:dtype.start()]), dtype=dtype.group(1))
        else:
            return np.array(ast.literal_eval(inp[:-1]))
    else:
        # In case it's the string representation it's a bit harder.
        # We need to remove all spaces between closing and opening brackets
        inp = re.sub(r'\]\s+\[', '],[', inp)
        # We need to remove all whitespaces following an opening bracket
        inp = re.sub(r'\[\s+', '[', inp)
        # and all leading whitespaces before closing brackets
        inp = re.sub(r'\s+\]', ']', inp)
        # replace all remaining whitespaces with ","
        inp = re.sub(r'\s+', ',', inp)
        return np.array(ast.literal_eval(inp))

Then read whatever you have copied to the clipboard:

>>> numpy_from_clipboard()
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47],
       [48, 49, 50, 51, 52, 53, 54, 55],
       [56, 57, 58, 59, 60, 61, 62, 63]])

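This should be able to parse (most) arrays from the clipboard, in both their str and their repr form. It should even work for arrays whose rows wrap across several lines (where np.loadtxt fails):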

[[ 0.34866207  0.38494993  0.7053722   0.64586156  0.27607369  0.34850162
   0.20530567  0.46583039  0.52982216  0.92062115]
 [ 0.06973858  0.13249867  0.52419149  0.94707951  0.868956    0.72904737
   0.51666421  0.95239542  0.98487436  0.40597835]
 [ 0.66246734  0.85333546  0.072423    0.76936201  0.40067016  0.83163118
   0.45404714  0.0151064   0.14140024  0.12029861]
 [ 0.2189936   0.36662076  0.90078913  0.39249484  0.82844509  0.63609079
   0.18102383  0.05339892  0.3243505   0.64685352]
 [ 0.803504    0.57531309  0.0372428   0.8308381   0.89134864  0.39525473
   0.84138386  0.32848746  0.76247531  0.99299639]]

>>> numpy_from_clipboard()
array([[ 0.34866207,  0.38494993,  0.7053722 ,  0.64586156,  0.27607369,
         0.34850162,  0.20530567,  0.46583039,  0.52982216,  0.92062115],
       [ 0.06973858,  0.13249867,  0.52419149,  0.94707951,  0.868956  ,
         0.72904737,  0.51666421,  0.95239542,  0.98487436,  0.40597835],
       [ 0.66246734,  0.85333546,  0.072423  ,  0.76936201,  0.40067016,
         0.83163118,  0.45404714,  0.0151064 ,  0.14140024,  0.12029861],
       [ 0.2189936 ,  0.36662076,  0.90078913,  0.39249484,  0.82844509,
         0.63609079,  0.18102383,  0.05339892,  0.3243505 ,  0.64685352],
       [ 0.803504  ,  0.57531309,  0.0372428 ,  0.8308381 ,  0.89134864,
         0.39525473,  0.84138386,  0.32848746,  0.76247531,  0.99299639]])
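
To try the parsing logic without touching the clipboard, it can help to separate it from the clipboard read. The sketch below restates the same regular-expression steps as a function that takes the text as an argument; the name parse_array_text is hypothetical, and the same caveats about the regexes apply.

```python
import ast
import re

import numpy as np

def parse_array_text(text):
    """Parse the str() or repr() text of a numeric array (hypothetical helper)."""
    text = text.strip()
    if text.startswith('array('):
        # repr form: drop the leading "array(" and the optional ", dtype=xxx)".
        body = text[len('array('):]
        match = re.search(r',\s*dtype=(\w+)\)$', body)
        if match:
            return np.array(ast.literal_eval(body[:match.start()]),
                            dtype=match.group(1))
        return np.array(ast.literal_eval(body[:-1]))
    # str form: turn the whitespace-separated text into a nested list literal.
    text = re.sub(r'\]\s+\[', '],[', text)  # "] [" between rows -> "],["
    text = re.sub(r'\[\s+', '[', text)      # no whitespace after "["
    text = re.sub(r'\s+\]', ']', text)      # no whitespace before "]"
    text = re.sub(r'\s+', ',', text)        # remaining whitespace -> ","
    return np.array(ast.literal_eval(text))

wrapped = "[[ 0.5  1.5\n   2.5]\n [ 3.5  4.5\n   5.5]]"
from_str = parse_array_text(wrapped)
from_repr = parse_array_text("array([[1, 2],\n       [3, 4]], dtype=int64)")
```

With this split, numpy_from_clipboard() would reduce to calling parse_array_text(clipboard_get()).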

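However, I'm not very good with regular expressions, so this is probably not foolproof, and using ast.literal_eval feels a bit awkward (though it avoids having to do the parsing yourself).

Suggestions for improvement are welcome.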
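
On the ast.literal_eval point, one possible alternative for the str form is to split the text on the closing brackets and convert the tokens directly. This is a rough sketch, not a drop-in replacement: the name parse_2d_array_str is hypothetical, it assumes a 2-D array of plain numbers, and it always returns floats.

```python
import numpy as np

def parse_2d_array_str(text):
    """Parse the str() form of a 2-D numeric array without ast.literal_eval."""
    rows = []
    # Every row ends with "]", so splitting on it keeps the pieces of a
    # row that wraps across several lines together in one chunk.
    for chunk in text.split(']'):
        tokens = chunk.replace('[', ' ').split()
        if tokens:
            rows.append([float(tok) for tok in tokens])
    return np.array(rows)

wrapped = "[[ 0.5  1.5\n   2.5]\n [ 3.5  4.5\n   5.5]]"
arr = parse_2d_array_str(wrapped)
```

Because it relies only on the position of the closing brackets, it does not care where the rows wrap, but it cannot recover arrays with more than two dimensions.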

【Comments】: