【Question title】: Reading an RDa file in Python as a pandas data frame
【Posted】: 2017-03-02 18:59:28
【Question】:

I have an RDa file that was created in R, and I want to read it in Python as a pandas data frame. This is the code I have so far:

import rpy2.robjects as robjects
import numpy as np
from rpy2.robjects import pandas2ri
pandas2ri.activate()

# load your file
robjects.r['load']('Data.RDa')

matrix = robjects.r['data']

matrix

I get the following result:

R object with classes: ('data.frame',) mapped to:
<DataFrame - Python:0x0CF46F58 / R:0x0ED0F200>
[Float..., Float..., Float..., ..., Float..., Float..., Float...]
  area: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF56A80 / R:0x0F281898>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  i: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF68E68 / R:0x0F2B9520>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  s: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF68940 / R:0x0F380008>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  ...
  upslope_area: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FDA0 / R:0x0FE87C90>
[NA_real_, NA_real_, NA_real_, ..., 292.256494, NA_real_, NA_real_]
  i: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FC88 / R:0x0FEBF918>
[331347.500000, 331352.500000, 331357.500000, ..., 332187.500000, 332192.500000, 332197.500000]
  s: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FE68 / R:0x0FEF75A0>
[4554812.500000, 4554812.500000, 4554812.500000, ..., 4553982.500000, 4553982.500000, 4553982.500000]

How can I convert this to a pandas data frame?

【Comments】:

    Tags: python r pandas dataframe rpy2


    【Solution 1】:

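    This looks like a missing call to the current conversion when retrieving the first R object with the symbol "data" from the search path (in short, when executing robjects.r["data"]). If there is no issue for this on the rpy2 tracker yet, open one; if an issue already exists and is still unresolved, or looks like it was closed prematurely, make some noise in its comments.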

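    Explicitly applying conversion rules that are limited to a block of code should be a simple workaround, and it also helps you keep performance under control. The conversion mechanism is convenient, but that convenience often costs performance, because every conversion, in either direction, makes a copy of the data frame.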

    Something like the following:

    import rpy2.robjects as robjects
    from rpy2.robjects import default_converter
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # assumes robjects.r['load']('Data.RDa') has already been run,
    # as in the question
    # use the default conversion rules to which the pandas conversion
    # is added
    with localconverter(default_converter + pandas2ri.converter) as cv:
        dataf = robjects.r["data"]
    

    This is covered in the documentation: http://rpy2.readthedocs.io/en/version_2.8.x/robjects_convert.html#local-conversion-rules
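
    Putting the question's load step and the local conversion rules together, the workaround could be wrapped in a small helper. This is only a sketch under a few assumptions: the function name load_rda_as_pandas and its name parameter are made up for illustration, the rpy2 imports are kept inside the function so that merely defining it does not start R, and it relies on R's load() returning the names of the objects it restored. Whether the returned object is actually a pandas data frame depends on the installed rpy2 version, so it is worth checking the type of the result.

```python
def load_rda_as_pandas(path, name=None):
    """Load an .RDa file with R and fetch one of its objects while the
    pandas conversion rules are active (hypothetical helper)."""
    # Imported lazily: importing rpy2.robjects initialises the embedded R.
    import rpy2.robjects as robjects
    from rpy2.robjects import default_converter, pandas2ri
    from rpy2.robjects.conversion import localconverter

    # R's load() returns the names of the objects it restored.
    loaded = [str(n) for n in robjects.r["load"](path)]

    if name is None:
        # Only guess the object name when the file holds exactly one object.
        if len(loaded) != 1:
            raise ValueError("file contains several objects: %r" % (loaded,))
        name = loaded[0]
    elif name not in loaded:
        raise KeyError("%r not found in %s; it contains %r" % (name, path, loaded))

    # Same pattern as in the answer: default rules plus the pandas rules,
    # applied only inside this block.
    with localconverter(default_converter + pandas2ri.converter):
        return robjects.r[name]
```

    With the file from the question this would be called as load_rda_as_pandas('Data.RDa', 'data'). Because the conversion copies the data, calling it once and keeping the result is cheaper than converting the same R object repeatedly.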

    【Comments】:
