【Question title】: A NumPy equivalent of pandas read_clipboard?
【Posted】: 2018-02-02 13:28:57
【Question description】:

For example, suppose you come across a question or answer that posts an array like this:

[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]
 [24 25 26 27 28 29 30 31]
 [32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47]
 [48 49 50 51 52 53 54 55]
 [56 57 58 59 60 61 62 63]]

How can you load it into a variable in a REPL session without having to add commas everywhere?

【Question discussion】:

Tags: python numpy clipboard


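【Solution 1】:

For a one-off, I would probably do this:

  • Copy the text containing the array to the clipboard.
  • In an IPython shell, type s = """ but do not press Enter.
  • Paste the text from the clipboard.
  • Type the closing triple quotes.

That gives me: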

In [16]: s = """[[ 0  1  2  3  4  5  6  7]
    ...:  [ 8  9 10 11 12 13 14 15]
    ...:  [16 17 18 19 20 21 22 23]
    ...:  [24 25 26 27 28 29 30 31]
    ...:  [32 33 34 35 36 37 38 39]
    ...:  [40 41 42 43 44 45 46 47]
    ...:  [48 49 50 51 52 53 54 55]
    ...:  [56 57 58 59 60 61 62 63]]"""

Then use np.loadtxt() as follows:

In [17]: a = np.loadtxt([line.lstrip(' [').rstrip(']') for line in s.splitlines()], dtype=int)

In [18]: a
Out[18]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47],
       [48, 49, 50, 51, 52, 53, 54, 55],
       [56, 57, 58, 59, 60, 61, 62, 63]])
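
The same steps can be wrapped in a small helper for reuse. This is only a minimal sketch: the name `array_from_str` is hypothetical, and it assumes a 2-D array whose rows each fit on a single line, as in the example above.

```python
import numpy as np

def array_from_str(s, dtype=int):
    """Parse the str() form of a 2-D array whose rows each fit on one line."""
    # Strip leading spaces/brackets and trailing brackets from every line,
    # leaving whitespace-separated numbers that np.loadtxt can read.
    rows = [line.lstrip(' [').rstrip(' ]') for line in s.splitlines() if line.strip()]
    return np.loadtxt(rows, dtype=dtype)

s = """[[0 1 2]
 [3 4 5]]"""
a = array_from_str(s)
print(a)
```

Pass `dtype=float` for floating-point data. Note that a single-row input comes back as a 1-D array, because np.loadtxt drops the extra dimension by default.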

【Discussion】:

    【Solution 2】:

    If you have pandas, pyperclip, or something else that can read from the clipboard, you could use something like this:

    from pandas.io.clipboard import clipboard_get
    # import pyperclip
    import numpy as np
    import re
    import ast
    
    def numpy_from_clipboard():
        inp = clipboard_get()
        # inp = pyperclip.paste()
        inp = inp.strip()
        # if it starts with "array(" we just need to remove the
        # leading "array(" and remove the optional ", dtype=xxx)"
        if inp.startswith('array('):
            inp = re.sub(r'^array\(', '', inp)
            dtype = re.search(r', dtype=(\w+)\)$', inp)
            if dtype:
                return np.array(ast.literal_eval(inp[:dtype.start()]), dtype=dtype.group(1))
            else:
                return np.array(ast.literal_eval(inp[:-1]))
        else:
            # In case it's the string representation it's a bit harder.
            # We need to remove all spaces between closing and opening brackets
            inp = re.sub(r'\]\s+\[', '],[', inp)
            # We need to remove all whitespaces following an opening bracket
            inp = re.sub(r'\[\s+', '[', inp)
            # and all leading whitespaces before closing brackets
            inp = re.sub(r'\s+\]', ']', inp)
            # replace all remaining whitespaces with ","
            inp = re.sub(r'\s+', ',', inp)
            return np.array(ast.literal_eval(inp))
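
The str-form branch can be tried without a clipboard by applying the same four substitutions to a plain string. This is a minimal sketch: `str_to_nested_list` is a hypothetical name that does not appear in the code above, and it returns a nested list so that only the standard library is needed.

```python
import ast
import re

def str_to_nested_list(text):
    """Turn the str() form of an array into a nested Python list."""
    text = text.strip()
    text = re.sub(r'\]\s+\[', '],[', text)  # "] [" between rows becomes "],["
    text = re.sub(r'\[\s+', '[', text)      # drop whitespace after "["
    text = re.sub(r'\s+\]', ']', text)      # drop whitespace before "]"
    text = re.sub(r'\s+', ',', text)        # remaining whitespace separates values
    return ast.literal_eval(text)

result = str_to_nested_list("[[ 0  1  2]\n [ 3  4  5]]")
print(result)  # prints [[0, 1, 2], [3, 4, 5]]
```

Passing the result to `np.array` would give the same array that the function above builds from the clipboard contents.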
    

    Then read whatever you have saved in the clipboard:

    >>> numpy_from_clipboard()
    array([[ 0,  1,  2,  3,  4,  5,  6,  7],
           [ 8,  9, 10, 11, 12, 13, 14, 15],
           [16, 17, 18, 19, 20, 21, 22, 23],
           [24, 25, 26, 27, 28, 29, 30, 31],
           [32, 33, 34, 35, 36, 37, 38, 39],
           [40, 41, 42, 43, 44, 45, 46, 47],
           [48, 49, 50, 51, 52, 53, 54, 55],
           [56, 57, 58, 59, 60, 61, 62, 63]])
    

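    This should be able to parse (most) arrays from the clipboard, in both the str and the repr form of an array. It should even work for arrays whose rows wrap across several lines (where np.loadtxt fails):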

    [[ 0.34866207  0.38494993  0.7053722   0.64586156  0.27607369  0.34850162
       0.20530567  0.46583039  0.52982216  0.92062115]
     [ 0.06973858  0.13249867  0.52419149  0.94707951  0.868956    0.72904737
       0.51666421  0.95239542  0.98487436  0.40597835]
     [ 0.66246734  0.85333546  0.072423    0.76936201  0.40067016  0.83163118
       0.45404714  0.0151064   0.14140024  0.12029861]
     [ 0.2189936   0.36662076  0.90078913  0.39249484  0.82844509  0.63609079
       0.18102383  0.05339892  0.3243505   0.64685352]
     [ 0.803504    0.57531309  0.0372428   0.8308381   0.89134864  0.39525473
       0.84138386  0.32848746  0.76247531  0.99299639]]
    
    >>> numpy_from_clipboard()
    array([[ 0.34866207,  0.38494993,  0.7053722 ,  0.64586156,  0.27607369,
             0.34850162,  0.20530567,  0.46583039,  0.52982216,  0.92062115],
           [ 0.06973858,  0.13249867,  0.52419149,  0.94707951,  0.868956  ,
             0.72904737,  0.51666421,  0.95239542,  0.98487436,  0.40597835],
           [ 0.66246734,  0.85333546,  0.072423  ,  0.76936201,  0.40067016,
             0.83163118,  0.45404714,  0.0151064 ,  0.14140024,  0.12029861],
           [ 0.2189936 ,  0.36662076,  0.90078913,  0.39249484,  0.82844509,
             0.63609079,  0.18102383,  0.05339892,  0.3243505 ,  0.64685352],
           [ 0.803504  ,  0.57531309,  0.0372428 ,  0.8308381 ,  0.89134864,
             0.39525473,  0.84138386,  0.32848746,  0.76247531,  0.99299639]])
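
To see why wrapped rows trip up a line-based reader, it helps to count the fields on each line. The sketch below uses a small made-up array and, as one possible alternative to ast.literal_eval, splits the text on the closing bracket that ends each row. It assumes a 2-D array of floats and is an illustration, not a general parser.

```python
text = """[[ 0.1  0.2  0.3
   0.4]
 [ 0.5  0.6  0.7
   0.8]]"""

# Line by line, a wrapped row falls apart into pieces of different lengths.
counts = [len(line.strip(' []').split()) for line in text.splitlines()]
print(counts)  # prints [3, 1, 3, 1]

# Each row ends with "]", so splitting on it keeps wrapped rows together.
rows = []
for chunk in text.split(']'):
    fields = chunk.replace('[', ' ').split()
    if fields:  # skip the empty chunks left after the final "]]"
        rows.append([float(x) for x in fields])
print(rows)  # prints [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
```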
    

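    However, I'm not very good with regular expressions, so this is probably not foolproof, and using ast.literal_eval feels a bit awkward (though it avoids doing the parsing yourself).

    Suggestions for improvement are welcome.

    【Discussion】: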
