[Question title]: In Julia, how can I combine multiple dataframes if some columns are different?
[Posted]: 2021-05-09 15:05:27
[Question description]:

I have several dataframes, in no particular order, whose column keys are similar but not identical. For example:

DataFrame    Columns
Dataframe 1: A, B, C
Dataframe 2: A, B, C
Dataframe 3: A, B, C, D
Dataframe 4: A, C, D

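I would like to stack/concatenate/append them. I don't care how the missing data is filled in for the dataframes that lack a given column.

That is, I want a single dataframe: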

DataFrame combined: A, B, C, D

[Comments]:

    Tags: dataframe julia


    [Solution 1]:

    If you want vertical concatenation, which builds a new data frame, do:

    julia> dfs = [DataFrame(permutedims(1:n), :auto) for n in 1:5]
    5-element Vector{DataFrame}:
     1×1 DataFrame
     Row │ x1
         │ Int64
    ─────┼───────
       1 │     1
     1×2 DataFrame
     Row │ x1     x2
         │ Int64  Int64
    ─────┼──────────────
       1 │     1      2
     1×3 DataFrame
     Row │ x1     x2     x3
         │ Int64  Int64  Int64
    ─────┼─────────────────────
       1 │     1      2      3
     1×4 DataFrame
     Row │ x1     x2     x3     x4
         │ Int64  Int64  Int64  Int64
    ─────┼────────────────────────────
       1 │     1      2      3      4
     1×5 DataFrame
     Row │ x1     x2     x3     x4     x5
         │ Int64  Int64  Int64  Int64  Int64
    ─────┼───────────────────────────────────
       1 │     1      2      3      4      5
    
    julia> vcat(dfs[1], dfs[2], dfs[3], dfs[4], dfs[5], cols=:union)
    5×5 DataFrame
     Row │ x1     x2       x3       x4       x5
         │ Int64  Int64?   Int64?   Int64?   Int64?
    ─────┼───────────────────────────────────────────
       1 │     1  missing  missing  missing  missing
       2 │     1        2  missing  missing  missing
       3 │     1        2        3  missing  missing
       4 │     1        2        3        4  missing
       5 │     1        2        3        4        5
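
    For a vector of data frames of any length, the same call can be written with reduce instead of listing every element. The following is a minimal sketch, assuming DataFrames.jl 1.x; df1 through df4 and their values are made up here to mirror the column layout in the question (A, B, C / A, B, C / A, B, C, D / A, C, D).

```julia
using DataFrames

# Hypothetical stand-ins for the four dataframes in the question;
# the values are invented purely for illustration.
df1 = DataFrame(A = [1], B = [2], C = [3])
df2 = DataFrame(A = [4], B = [5], C = [6])
df3 = DataFrame(A = [7], B = [8], C = [9], D = [10])
df4 = DataFrame(A = [11], C = [12], D = [13])

# reduce(vcat, ...) accepts a vector of any length. cols = :union keeps
# every column that appears in any input and fills the gaps with `missing`.
combined = reduce(vcat, [df1, df2, df3, df4]; cols = :union)
```

    Here combined has the columns A, B, C, D; the rows that came from a data frame without B or D hold missing in that column, and none of the inputs is modified.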
    

    If you want to append in place, which mutates the first data frame, do:

    julia> dfs = [DataFrame(permutedims(1:n), :auto) for n in 1:5]
    5-element Vector{DataFrame}:
     1×1 DataFrame
     Row │ x1
         │ Int64
    ─────┼───────
       1 │     1
     1×2 DataFrame
     Row │ x1     x2
         │ Int64  Int64
    ─────┼──────────────
       1 │     1      2
     1×3 DataFrame
     Row │ x1     x2     x3
         │ Int64  Int64  Int64
    ─────┼─────────────────────
       1 │     1      2      3
     1×4 DataFrame
     Row │ x1     x2     x3     x4
         │ Int64  Int64  Int64  Int64
    ─────┼────────────────────────────
       1 │     1      2      3      4
     1×5 DataFrame
     Row │ x1     x2     x3     x4     x5
         │ Int64  Int64  Int64  Int64  Int64
    ─────┼───────────────────────────────────
       1 │     1      2      3      4      5
    
    julia> append!(dfs[1], dfs[2], cols=:union)
    2×2 DataFrame
     Row │ x1     x2
         │ Int64  Int64?
    ─────┼────────────────
       1 │     1  missing
       2 │     1        2
    
    julia> append!(dfs[1], dfs[3], cols=:union)
    3×3 DataFrame
     Row │ x1     x2       x3
         │ Int64  Int64?   Int64?
    ─────┼─────────────────────────
       1 │     1  missing  missing
       2 │     1        2  missing
       3 │     1        2        3
    
    julia> append!(dfs[1], dfs[4], cols=:union)
    4×4 DataFrame
     Row │ x1     x2       x3       x4
         │ Int64  Int64?   Int64?   Int64?
    ─────┼──────────────────────────────────
       1 │     1  missing  missing  missing
       2 │     1        2  missing  missing
       3 │     1        2        3  missing
       4 │     1        2        3        4
    
    julia> append!(dfs[1], dfs[5], cols=:union)
    5×5 DataFrame
     Row │ x1     x2       x3       x4       x5
         │ Int64  Int64?   Int64?   Int64?   Int64?
    ─────┼───────────────────────────────────────────
       1 │     1  missing  missing  missing  missing
       2 │     1        2  missing  missing  missing
       3 │     1        2        3  missing  missing
       4 │     1        2        3        4  missing
       5 │     1        2        3        4        5
    
    julia> dfs[1]
    5×5 DataFrame
     Row │ x1     x2       x3       x4       x5
         │ Int64  Int64?   Int64?   Int64?   Int64?
    ─────┼───────────────────────────────────────────
       1 │     1  missing  missing  missing  missing
       2 │     1        2  missing  missing  missing
       3 │     1        2        3  missing  missing
       4 │     1        2        3        4  missing
       5 │     1        2        3        4        5
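
    The repeated append! calls above can also be written as a loop. This is a minimal sketch under the same assumption of DataFrames.jl 1.x; the name result and the copy of dfs[1] are additions for illustration, so that the original dfs[1] is not mutated.

```julia
using DataFrames

dfs = [DataFrame(permutedims(1:n), :auto) for n in 1:5]

# Start from a copy so that the data frames stored in `dfs` stay untouched.
result = copy(dfs[1])

# Append each remaining data frame in place; cols = :union adds any new
# columns to `result` and fills the earlier rows with `missing`.
for df in dfs[2:end]
    append!(result, df; cols = :union)
end
```

    If mutating dfs[1] directly is acceptable, the copy can be dropped and append! applied to dfs[1], as in the session above.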
    
