【Question title】: Export summary table of statsmodels regression results as csv
【Posted】: 2021-12-18 02:43:24
【Question description】:

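Suppose I want to compare three statsmodels OLS objects side by side. I can use summary_col to create a summary table, which I can print as text or export to LaTeX.

How can I export this table as csv?

Here is a reproducible example of what I want to do: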

# Libraries
import pandas as pd
import statsmodels.api as sm
from statsmodels.iolib.summary2 import summary_col

# Load silly data and add constant
df = sm.datasets.stackloss.load_pandas().data
df['CONSTANT'] = 1

# Train three silly models
m0 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW']]).fit()
m1 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW','WATERTEMP']]).fit()
m2 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW','WATERTEMP','ACIDCONC']]).fit()

# Results table
res = summary_col([m0,m1,m2], regressor_order=m2.params.index.tolist())
print(res)

    ================================================
              STACKLOSS I STACKLOSS II STACKLOSS III
    ------------------------------------------------
    CONSTANT  -44.1320    -50.3588     -39.9197     
              (6.1059)    (5.1383)     (11.8960)    
    AIRFLOW   1.0203      0.6712       0.7156       
              (0.1000)    (0.1267)     (0.1349)     
    WATERTEMP             1.2954       1.2953       
                          (0.3675)     (0.3680)     
    ACIDCONC                           -0.1521      
                                       (0.1563)     
    ================================================
    Standard errors in parentheses.

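Is there a way to export res as csv?

【Comments】:

  • The results should be stored internally in a pandas DataFrame that can be written to csv. I don't remember how to access it. Check dir(res).

Tags: python regression statsmodels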


【Solution 1】:

The results are stored as a list of DataFrames:

res.tables
[               STACKLOSS I STACKLOSS II STACKLOSS III
 CONSTANT          -44.1320     -50.3588      -39.9197
                   (6.1059)     (5.1383)     (11.8960)
 AIRFLOW             1.0203       0.6712        0.7156
                   (0.1000)     (0.1267)      (0.1349)
 WATERTEMP                        1.2954        1.2953
                                (0.3675)      (0.3680)
 ACIDCONC                                      -0.1521
                                              (0.1563)
 R-squared           0.8458       0.9088        0.9136
 R-squared Adj.      0.8377       0.8986        0.8983]

This should work:

res.tables[0].to_csv("test.csv")

pd.read_csv("test.csv")

       Unnamed: 0 STACKLOSS I STACKLOSS II STACKLOSS III
0        CONSTANT    -44.1320     -50.3588      -39.9197
1             NaN    (6.1059)     (5.1383)     (11.8960)
2         AIRFLOW      1.0203       0.6712        0.7156
3             NaN    (0.1000)     (0.1267)      (0.1349)
4       WATERTEMP         NaN       1.2954        1.2953
5             NaN         NaN     (0.3675)      (0.3680)
6        ACIDCONC         NaN          NaN       -0.1521
7             NaN         NaN          NaN      (0.1563)
8       R-squared      0.8458       0.9088        0.9136
9  R-squared Adj.      0.8377       0.8986        0.8983
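
The read-back above shows two side effects of the round trip: the row labels land in a column called "Unnamed: 0", and blank cells come back as NaN. A minimal sketch of one way to avoid both, using only pandas options, follows. To keep it runnable without refitting the models, it uses a small stand-in frame with values copied from the output above in place of res.tables[0]; the label "term" is an arbitrary name chosen for illustration, and the sketch assumes the blank cells in the real table are empty strings, which is what the NaN values above suggest.

```python
import io

import pandas as pd

# Stand-in for res.tables[0] (assumption: blank cells are empty strings).
# Rows with an empty label hold the standard errors.
table = pd.DataFrame(
    {
        "STACKLOSS I": ["-44.1320", "(6.1059)", "1.0203", "(0.1000)", "", ""],
        "STACKLOSS II": ["-50.3588", "(5.1383)", "0.6712", "(0.1267)", "1.2954", "(0.3675)"],
    },
    index=["CONSTANT", "", "AIRFLOW", "", "WATERTEMP", ""],
)

# Write to an in-memory buffer; a file path such as "test.csv" works the same way.
buf = io.StringIO()
# index_label names the first column so it is not read back as "Unnamed: 0".
table.to_csv(buf, index_label="term")

buf.seek(0)
# dtype=str keeps the formatted numbers as text, and keep_default_na=False
# leaves empty cells as "" instead of converting them to NaN.
roundtrip = pd.read_csv(buf, index_col="term", dtype=str, keep_default_na=False)
print(roundtrip)
```

With the real object, the same two calls would be applied to res.tables[0] and a file path instead of the stand-in frame and the buffer.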
