【Question title】: Python3 Data Impute
【Posted】: 2019-07-28 06:52:48
【Question description】:

I bombed a job interview. This was the task:

    Remove all rows where at least half of the entries are negative
    Fill in remaining negative values in each column with the 
    average for that column, excluding invalid entries
Input: [[5],
     [3],
     [1.0, 2.0, 10.0],
     [-1.0, -99.0, 0],
     [-1.0, 4.0, 0],
     [3.0, -6.0, -0.1],
     [1.0, -0.31, 6.0]
    ]

Output: new mean rounded to one decimal place

I don't know where to start.

【Question comments】:

    Tags: python-3.x


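    【Solution 1】:

    This assumes there are 3 columns and that, once the rows have been removed, each remaining column contains at most one negative value, which is replaced by that column's average.

    Then: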

    inp = [[5],[3],[1.0,2.0,10.0],[-1.0,-99.0,0],[-1.0,4.0,0],[3.0,-6.0,-0.1],[1.0,-0.31, 6.0]]
    cols = 0  # widest row seen, used later to loop over columns
    for l in inp[:]:  # iterate over a copy, because rows are removed from inp
        pos = 0  # count positive
        neg = 0  # count negative
        for n in l:
            if n > 0:
                pos += 1
            elif n < 0:
                neg += 1
        if len(l) > cols:  # save num of cols (zeros count too)
            cols = len(l)
        if pos < neg:  # remove list with too many negatives
            inp.remove(l)

    for i in range(cols):  # loop through cols
        neg_index = None  # row index of the negative value to replace, if any
        entries = 0  # for calculating the average
        summ = 0  # for calculating the average
        for r, c in enumerate(inp):
            if i >= len(c):  # this row has no column i
                continue
            if c[i] < 0:
                neg_index = r  # save row index of the negative value found
            else:
                summ += c[i]
                entries += 1
        if neg_index is not None and entries > 0:
            # replace the negative value with the column average
            inp[neg_index][i] = round(summ / entries, 1)

    print(inp)
    

    Result:

    [[5],
     [3],
     [1.0, 2.0, 10.0],
     [2.5, 4.0, 0],
     [1.0, 3.0, 6.0]]
    

    I believe this is what they were looking for. Hope this helps.
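
For comparison, here is a minimal sketch that follows the task wording more literally and does not rely on there being at most one negative value per column. It assumes that "invalid" means negative, that a row is dropped when at least half of its entries are negative, and that column averages are computed after the rows are dropped. The function name `impute_negatives` is made up for this example.

```python
def impute_negatives(rows):
    """Drop rows where at least half the entries are negative, then replace
    the remaining negatives with the mean of the non-negative values in
    the same column (rounded to one decimal place)."""
    # Keep a row only if fewer than half of its entries are negative.
    kept = [list(r) for r in rows if 2 * sum(v < 0 for v in r) < len(r)]

    width = max((len(r) for r in kept), default=0)
    for col in range(width):
        # Valid entries: non-negative values from rows that have this column.
        valid = [r[col] for r in kept if col < len(r) and r[col] >= 0]
        if not valid:
            continue  # nothing to average, leave the column untouched
        mean = round(sum(valid) / len(valid), 1)
        for r in kept:
            if col < len(r) and r[col] < 0:
                r[col] = mean
    return kept


data = [[5], [3], [1.0, 2.0, 10.0], [-1.0, -99.0, 0], [-1.0, 4.0, 0],
        [3.0, -6.0, -0.1], [1.0, -0.31, 6.0]]
cleaned = impute_negatives(data)
print(cleaned)
```

On the input from the question this yields the same rows as the result shown above. The rows are copied first, so the original `data` list is left unchanged.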

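    【Comments】:

    • Thanks, it works. I can't compute the mean of the final array, though. I converted it to a numpy array and ran numpy.mean(inp). I get TypeError: unsupported operand type(s) for /: 'list' and 'int'
    • numpy.mean(...) is trying to divide a list by an int, which is not possible. That happens because the inner lists have different lengths. Take a look at this post, it might help: stackoverflow.com/questions/10058227/…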
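
Since the rows have different lengths, one way around the error is to flatten them with the standard library and average the values directly. This is only a sketch: it assumes the "new mean" in the task is the mean over all remaining values, which the question does not confirm.

```python
from itertools import chain

# The imputed result from the answer above (rows of unequal length).
cleaned = [[5], [3], [1.0, 2.0, 10.0], [2.5, 4.0, 0], [1.0, 3.0, 6.0]]

# Flatten the ragged rows into one list of numbers, then average them.
flat = list(chain.from_iterable(cleaned))
overall_mean = round(sum(flat) / len(flat), 1)
print(overall_mean)  # 3.4
```

The 11 values sum to 37.5, so the mean is about 3.41, which rounds to 3.4.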