Median, mode, mean in Python: answers

【Question title】: Median, mode, mean in Python
【Posted】: 2017-04-17 09:44:38
【Question】:

I am trying to load regex matches into a list and then compute the median, mode, and mean.

Data file (pc1.txt):

2017-04-16 13:32:59 
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.05614841124945
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.50960924380334

2017-04-16 13:33:05
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.08875159384721
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.5102938969471

2017-04-16 13:33:10
\\desktop-XXXXXXX\processor(_total)\% processor time : 0
\\desktop-XXXXXXX\memory\% committed bytes in use : 5.46869437193207

BootTime 200938

 ------------------------------------ 
 ------------------------------------ 

2017-04-16 13:40:11 
\\desktop-XXXXXXX\processor(_total)\% processor time : 4.37510327488846
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.438387242009

2017-04-16 13:40:17
\\desktop-XXXXXXX\processor(_total)\% processor time : 1.90625777477218
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.44426156598249

2017-04-16 13:40:22
\\desktop-XXXXXXX\processor(_total)\% processor time : 0.078229917076289
\\desktop-XXXXXXX\memory\% committed bytes in use : 4.44589104046464

BootTime 69920

Regexes to find the values:

Processor: ^[\\].+processor.+[: ](\d*\.?\d*)
Memory: ^[\\].+memory.+[: ](\d*\.?\d*)
Boottime: ^BootTime.(\d+)

So far, I have tried:

with open('pc1.txt') as f:
    for line in f:
        re.findall(processor, f)

However, I cannot 1) match the values; 2) put them into a list; 3) compute the median, mode, and mean.

I have a basic idea of how to compute the mode:

from statistics import mode
mode([value1, value2])

But I still cannot put all the pieces together. I am also open to any other programming language that handles statistics in a simpler way.

【Comments】:

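What exactly is processor_regex?
Which Python version are you using?
Python 2.7. Ubuntu 16.04.
You need to tell us which Python version you are using, and what the actual error is (if you get one) or what output you get and how it differs from what you expected. Also, limit the question to one topic. You are asking two largely unrelated things here, and most likely you will only get answers about the regex. Your code has a bug: you need to pass line to findall, not f. findall takes a string, not a file descriptor. docs.python.org/2/library/re.html
What do you mean, it's not Python 3?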
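
To illustrate the comment above about passing the line rather than the file object to the regex call, here is a minimal sketch. The sample lines are copied from the data file in the question, but the list `sample_lines`, the simplified pattern, and the variable names are assumptions made for the example, not the asker's code.

```python
import re

# Hypothetical stand-in for the contents of pc1.txt (lines copied from the question).
sample_lines = [
    "2017-04-16 13:32:59",
    r"\\desktop-XXXXXXX\processor(_total)\% processor time : 1.05614841124945",
    r"\\desktop-XXXXXXX\memory\% committed bytes in use : 5.50960924380334",
    "",
    "2017-04-16 13:33:10",
    r"\\desktop-XXXXXXX\processor(_total)\% processor time : 0",
]

# Simplified pattern (an assumption, not the asker's original regex).
processor_re = re.compile(r"processor time\s*:\s*([\d.]+)")

processor_values = []
for line in sample_lines:  # with a real file this would be `for line in f:`
    match = processor_re.search(line)  # search the string `line`, not the file object
    if match:
        # The captured group is a string, so convert it before doing arithmetic.
        processor_values.append(float(match.group(1)))

print(processor_values)
```

The same loop should work for the memory and BootTime values by swapping in a different pattern.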

Tags: python regex statistics

【Solution 1】:

The matches from the text file are strings, not floats, so we need to convert them before computing the median and mean. I used numpy rather than the statistics module, i.e.:

import numpy as np
import re

with open('pc1.txt', 'r') as myfile:
    data = myfile.read()
    processor = re.findall(r"processor time : ([\d.]+)", data, re.IGNORECASE | re.DOTALL | re.MULTILINE)
    processor = [float(i) for i in processor] # we convert the matching list of strings to floats
    if processor:
        print (np.median(processor))
        print (np.mean(processor))

    memory = re.findall(r"memory\\%.*?: ([\d.]+)", data, re.IGNORECASE | re.DOTALL | re.MULTILINE)
    memory = [float(i) for i in memory] # we convert the matching list of strings to floats
    if memory:
        print (np.median(memory))
        print (np.mean(memory))

    boot_time = re.findall(r"BootTime ([\d]+)", data, re.IGNORECASE | re.DOTALL | re.MULTILINE)
    boot_time = [float(i) for i in boot_time]  # we convert the matching list of strings to floats
    if boot_time:
        print (np.median(boot_time))
        print (np.mean(boot_time))
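
The code above covers the median and mean but not the mode that the question also asks for. One possible sketch uses `collections.Counter` from the standard library, which is available in Python 2.7 as well. Every reading in the sample data is distinct, so a mode only becomes informative after rounding; rounding to one decimal place is an arbitrary choice made for this illustration, and the variable names are hypothetical.

```python
from collections import Counter

# Processor readings copied from the sample data in the question.
processor = [1.05614841124945, 1.08875159384721, 0.0,
             4.37510327488846, 1.90625777477218, 0.078229917076289]

# Round to one decimal place (arbitrary precision, chosen only for illustration)
# so that nearby readings fall into the same bucket.
rounded = [round(v, 1) for v in processor]

# most_common(1) returns a list holding the (value, count) pair with the highest count.
mode_value, mode_count = Counter(rounded).most_common(1)[0]

print("mode=%s count=%d" % (mode_value, mode_count))
```

On Python 3.4 or later, `statistics.mode` from the question could be applied to the rounded list instead.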

【Comments】:

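For some reason the code does not output anything, even though it looks correct.
No output.
Honestly, I don't know what could be going wrong when you run the code, but on my side it works as expected.
Oh, snap! Found the problem: whitespace was causing the regex not to match.
Great, glad you found the problem. Good luck.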
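
The thread does not say which whitespace differed. Assuming the spacing around the colon varied between files, a pattern that uses `\s*` instead of literal spaces would tolerate it. The sample strings below are hypothetical and only show the difference between the two patterns.

```python
import re

# Hypothetical variants of the same reading with different spacing around the colon.
lines = [
    "processor time : 1.5",
    "processor time: 1.5",
    "processor time  :\t1.5  ",
]

# Pattern with literal single spaces, as in the answer above.
strict = re.compile(r"processor time : ([\d.]+)")
# Variant that accepts any amount of whitespace (including none) around the colon.
tolerant = re.compile(r"processor time\s*:\s*([\d.]+)")

strict_hits = [bool(strict.search(text)) for text in lines]
tolerant_hits = [bool(tolerant.search(text)) for text in lines]

print(strict_hits)
print(tolerant_hits)
```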