Plot Seaborn Barplots in Subplots with Python: answers

【Question title】: Plot Seaborn Barplots in Subplots with Python
【Posted】: 2017-02-12 23:04:35
【Question】:

I have two files as input, each of which looks like this:

col1 col2
A B
C C
B A
A A
A C
C A
B B

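That is, each file has two columns of letters separated by a space. I want to plot how many times each letter occurs, with each column in its own bar plot. Assume the letter distributions differ between the two files.

Here is the code: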

from collections import Counter
from os.path import isfile, join
from os import listdir
import matplotlib.pyplot as plt

import seaborn as sns
sns.set(color_codes=True)

inputDir = "/tmp/files/"

inputFiles = [ f for f in listdir(inputDir) if isfile(join(inputDir, f)) ]

fig, axes = plt.subplots(figsize=(6,6), ncols=2, nrows=len(inputFiles))

z=0

while inputFiles:

  files = inputFiles[0]
  inputFiles.remove(files)

  c = Counter()
  a = Counter()

  x1 = []
  y1 = []
  x2 = []
  y2 = []

  with open(inputDir + files, "r") as f2:
    for line in f2:
      line = line.strip()
      if line.split(" ")[0] != "col1":
        c[str(line.split(" ")[0])] += 1
        a[str(line.split(" ")[1])] += 1

  try:
    for cc in c:
      x1.append(cc)
      y1.append(c[cc])
    row = z // 2
    col = z % 2
    ax_curr = axes[row, col]
    sns.barplot(x=x1, y=y1, ax=ax_curr)

    z+=1

    for aa in a:
      x2.append(aa)
      y2.append(a[aa])
    row = z // 2
    col = z % 2
    ax_curr = axes[row, col]
    sns.barplot(x=x2, y=y2, ax=ax_curr)

    z+=1

  except:
    continue

plt.show()

The result should be a single image containing the following bar plots as subplots:

---------------------------------------
|                  |                  |
|                  |                  |
|   barplot col1   |   barplot col2   |
|        file1     |       file1      |
|                  |                  |
---------------------------------------
|                  |                  |
|                  |                  |
|   barplot col1   |   barplot col2   |
|        file2     |       file2      |
|                  |                  |
---------------------------------------

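So the height of each bar should correspond to the count of each letter.

The problem so far is that the bars in each subplot look completely different, and I do not know why. Let me know if I can provide more information.

【Question comments】:

Tags: python matplotlib statistics bar-chart seaborn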

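【Solution 1】:

It is not clear what "completely" different means here, but you may need to strip the lines before splitting them. Otherwise, the values in the last column may look like "B " instead of "B". Also, I am not sure why you are trying to convert the string to an integer in c[int(line.split(" ")[0])] += 1. That does not make much sense to me.

Try: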

with open(inputDir + files, "r") as f2:
    for line in f2:
        line = line.strip()
        if line.split(" ")[0] != "col1":
            c[line.split(" ")[0]] += 1
            a[line.split(" ")[1]] += 1
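The parsing step can also be factored into a small helper. The following is a minimal sketch, not the answerer's code: the function name `count_columns` is hypothetical, and it assumes a two-column, whitespace-separated file with a `col1 col2` header as shown in the question. It calls `str.split()` without arguments, which splits on any run of whitespace, so trailing spaces and newlines cannot end up inside a key.

```python
from collections import Counter
import io


def count_columns(lines):
    """Count the values in the first and second column (hypothetical helper).

    Assumes two whitespace-separated columns and a header row starting with "col1".
    """
    first, second = Counter(), Counter()
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) < 2 or fields[0] == "col1":
            continue  # skip blank or short lines and the header row
        first[fields[0]] += 1
        second[fields[1]] += 1
    return first, second


# Sample data from the question, with trailing spaces added on purpose.
sample = io.StringIO("col1 col2\nA B \nC C\nB A\nA A \nA C\nC A\nB B\n")
c, a = count_columns(sample)
print(dict(c))
print(dict(a))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the `io.StringIO` sample.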

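【Comments】:

You are absolutely right. I have different data and adapted my code to the new data files overnight without running it. Sorry about that. I have adjusted my code, and it still does not work properly. The top-left subplot shows thousands of bars, while the top-right one shows only a few (which is not correct, since both files contain about 100k lines).
I hope you understand that we have to discuss the issue based on the information we get here. So if the data you are having trouble with differs from the data shown in the question, it is hard to help you. In that case you need to create a new minimal reproducible example, reducing the data to an amount that can be shared here but that still reproduces the problem.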
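A minimal reproducible example of the kind requested above might look roughly like the sketch below. It is only an illustration under stated assumptions: the `files` dictionary is a hypothetical in-memory stand-in for the two input files (the first entry repeats the sample from the question, the second is made up), and it uses one subplot row per file and one column per data column, as in the layout the question describes.

```python
from collections import Counter

import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical in-memory stand-ins for the two input files (header row omitted).
files = {
    "file1": ["A B", "C C", "B A", "A A", "A C", "C A", "B B"],  # sample from the question
    "file2": ["A A", "A B", "B B", "C B"],  # made-up second file
}

# squeeze=False keeps `axes` two-dimensional even if there is only one file.
fig, axes = plt.subplots(nrows=len(files), ncols=2, figsize=(6, 6), squeeze=False)

for row, (name, rows) in enumerate(files.items()):
    pairs = [r.split() for r in rows]  # split on whitespace, ignoring trailing spaces
    for col in range(2):
        counts = Counter(p[col] for p in pairs)
        letters = sorted(counts)  # fixed order so the subplots are comparable
        heights = [counts[k] for k in letters]
        ax = axes[row, col]
        sns.barplot(x=letters, y=heights, ax=ax)
        ax.set_title(f"col{col + 1}, {name}")

fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure. Indexing the axes by file (row) and data column (column) avoids the running counter `z` used in the question.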