【Question Title】: Bars not aligning with X-axis tick and first bar cut-off in Bokeh plot
【Posted】: 2020-12-03 06:11:03
【Question】:

I went through the posts that were auto-suggested based on my question title, but could not find anything that matches my odd result.

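I am reading in a CSV file of a packet capture, grouping the packets by protocol, summing the packet lengths for each protocol, and dividing by the number of packets of that protocol to get the average packet size per protocol.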
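The averaging described above can also be written as a single groupby mean. The following is a minimal sketch rather than the code from this post: the small `data_r` frame holds made-up illustration data, and only the `Protocol` and `Length` column names and the `temp`/`bytes` naming are taken from the question.

```python
import pandas as pd

# Made-up packet rows for illustration; the real capture is not shared.
data_r = pd.DataFrame({
    "Protocol": ["DNS", "DNS", "TCP", "TCP", "HTTP"],
    "Length": [96, 154, 66, 54, 208],
})

# Mean of Length per protocol is the same as sum / count per protocol.
temp = (
    data_r.groupby("Protocol")["Length"]
    .mean()
    .rename("bytes")
    .reset_index()
)
```

The result has one row per protocol, with the columns `Protocol` and `bytes`, which is the shape the plotting code expects.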

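Using my display commands, I was able to confirm that the Protocol and bytes columns have the same size (there are 18 protocols and 18 average byte sizes). I was also able to confirm that the chart plots the correct values, apart from two things that could be considered a problem:

  1. The first bar, "DNS", starts too early and is cut in half
  2. The bars are not aligned with the x-axis ticks

I tried to reproduce the problem with the simple Bokeh fruit example from their documentation, but it plotted fine. Then again, that example just creates dummy arrays for x and y.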

[Image: Off-Center Plot]

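Here is the code. Keep in mind that the CSV file is not attached, and I have to avoid sharing it because of the IP addresses it contains, but if anyone sees this, any advice or suggestions would be greatly appreciated.

Also, here is the result of temp.head().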

# Here is the myriad of declarations that helps this program work

import pandas as pd
import math as m
import ipympl
import ipywidgets
import numpy as np

# Bokeh libraries and modules (I am aware not all of these are required, just
# occasionally trying a few fun features, though let me know if this could be
# the problem)

from bokeh.io import  show, reset_output, output_notebook, export_png
from bokeh.plotting import figure, output_file
from bokeh.models import Range1d, FactorRange, ColumnDataSource, LabelSet, HoverTool
from bokeh.models import ColorBar, LogColorMapper, LogTicker
from bokeh.layouts import gridplot, row, column
from bokeh.transform import factor_cmap, linear_cmap
from bokeh.models.annotations import Label
from bokeh.models.tools import HoverTool
from bokeh.palettes import Spectral6, Category20, viridis, turbo, linear_palette

# Set up plots to stay inside the notebook
# Remove this when you want to bring up a separate window display

output_notebook()

# Set up Bokeh visualization toolset

TOOLS = "pan,wheel_zoom,box_zoom,reset,save"

data_r = pd.read_csv(r'C:\Users\xxx\Desktop\xxx.csv')
data_s = data_r.groupby('Protocol').Length.sum()
data_p = data_r.groupby('Protocol').Source.count()
data_a_p = data_s / data_p
data_a_p_df = data_a_p.to_frame()

temp = data_a_p_df.reset_index()
temp.columns = ['Protocol', 'bytes']

# Setting up the ColumnDataSource
cds = ColumnDataSource(temp)

p = figure(x_range=cds.data['Protocol'],
           plot_height=300,
           plot_width=800, 
           title="Average Packet Size by Protocol",
           y_axis_label='Size in Bytes',
           tools=TOOLS)

p.vbar(range(len(cds.data['Protocol'])), 
       width=.8,
       top=cds.data['bytes'],
       line_color='black',
       fill_color=turbo(len(cds.data['Protocol'])),
       fill_alpha=.5)

# To help the labels fit nicely, rotate the x-axis labels 45 degrees
p.xaxis.major_label_orientation = 45

# Display the graph
show(p)
display (len(cds.data['Protocol'])) # Results in 18
display (len(cds.data['bytes'])) # Results in 18

Edit: to provide a working example that shows the error, here is the updated code using a dummy DataFrame:

# Here is the myriad of declarations that helps this program work
# Note I added prettytable to make the dummy csv file

import pandas as pd
import math as m
import ipympl
import ipywidgets
import numpy as np

# Bokeh libraries and modules (I am aware not all of these are required, just
# occasionally trying a few fun features, though let me know if this could be
# the problem)

from bokeh.io import  show, reset_output, output_notebook, export_png
from bokeh.plotting import figure, output_file
from bokeh.models import Range1d, FactorRange, ColumnDataSource, LabelSet, HoverTool
from bokeh.models import ColorBar, LogColorMapper, LogTicker
from bokeh.layouts import gridplot, row, column
from bokeh.transform import factor_cmap, linear_cmap
from bokeh.models.annotations import Label
from bokeh.models.tools import HoverTool
from bokeh.palettes import Spectral6, Category20, viridis, turbo, linear_palette

# Set up plots to stay inside the notebook
# Remove this when you want to bring up a separate window display

output_notebook()

# Set up Bokeh visualization toolset

TOOLS = "pan,wheel_zoom,box_zoom,reset,save"

# For this example I created a dummy dataframe for the data

data_r = pd.DataFrame({
   'Protocol': ['DNS','DNS', 'TCP', 'ICMPv6', 'TCP', 'TCP',
                 'HTTP', 'HTTP', 'TCP', 'TCP', 'TCP', 'TCP',
                 'TCP', 'ICMPv6', 'TCP', 'TCP', 'TCP', 
                 'ICMPv6', 'ICMPv6', 'AJP13', 'AJP13'],
   'Length': [96, 154, 66, 110, 171, 171, 208, 209, 56, 56,
            56, 56, 66, 110, 54, 55, 56, 110, 110, 171, 171]
})

data_s = data_r.groupby('Protocol').Length.sum()
data_p = data_r.groupby('Protocol').Protocol.count()
data_a_p = data_s / data_p
data_a_p_df = data_a_p.to_frame()

temp = data_a_p_df.reset_index()
temp.columns = ['Protocol', 'bytes']

# Setting up the ColumnDataSource
cds = ColumnDataSource(temp)

p = figure(x_range=cds.data['Protocol'],
           plot_height=300,
           plot_width=800, 
           title="Average Packet Size by Protocol",
           y_axis_label='Size in Bytes',
           tools=TOOLS)

p.vbar(range(len(cds.data['Protocol'])), 
       width=.8,
       top=cds.data['bytes'],
       line_color='black',
       fill_color=turbo(len(cds.data['Protocol'])),
       fill_alpha=.5)

# To help the labels fit nicely, rotate the x-axis labels 45 degrees
p.xaxis.major_label_orientation = 45

# Display the graph
show(p)

【Comments】:

    Tags: pandas jupyter-notebook bar-chart bokeh


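    【Solution 1】:

    First of all, when you post code that depends on some data, be sure to provide that data as well. Not a picture of temp.head(), but something that can be copied. Ideally, just include some toy data in the code itself.

    As for your question: just don't use range(len(...)) in p.vbar. Simply pass cds.data['Protocol'] as the first argument.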
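A minimal sketch of that change, under stated assumptions: the category names and averages below are illustrative stand-ins for the asker's `temp` frame, the data source is built from a plain dict rather than a DataFrame, and the figure size and colour arguments are left out for brevity.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Illustrative averages; not the asker's real capture data.
cds = ColumnDataSource(data={
    "Protocol": ["AJP13", "DNS", "HTTP", "ICMPv6", "TCP"],
    "bytes": [171.0, 125.0, 208.5, 110.0, 78.5],
})
protocols = list(cds.data["Protocol"])

# A list of strings as x_range makes the x-axis categorical.
p = figure(x_range=protocols,
           title="Average Packet Size by Protocol",
           y_axis_label="Size in Bytes")

# x refers to the category names, not range(len(...)), so Bokeh
# places each bar on the tick of its own category.
bars = p.vbar(x="Protocol", top="bytes", width=0.8, source=cds,
              line_color="black", fill_alpha=0.5)
```

Calling `show(p)` afterwards should render each bar centered on its tick. Referring to the columns by name with `source=cds` is one option; passing `cds.data['Protocol']` directly as the first argument, as suggested above, should behave the same way.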

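    【Discussion】:

    • Eugene, I updated the code with a dummy DataFrame that shows the error for anyone else who might run into it. It doesn't look quite the same, but I do want to help the community as much as I can. You are right, though, that was my problem. If I set the x values based on len, what causes the bars to shift along the x-axis by what is essentially -0.5? Thank you very much for your help and guidance. Your answer is useful, but apparently I am a lurking newbie who can't actually click the "This answer is useful" up arrow.
    • By using categorical values, you ask Bokeh to compute the synthetic coordinates for you. By using numeric values, you tell Bokeh that you are supplying the synthetic coordinates yourself, and doing that correctly requires some specific calculations based on knowledge of Bokeh internals. As for why it is -0.5 specifically, I don't think there is any particular reason. Probably just because a distance of 1 between adjacent categories seemed reasonable.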
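To make the offset concrete: with Bokeh's default categorical range settings (no range or factor padding, which is an assumption here), each category occupies one unit of the synthetic axis, the axis starts at 0, and the center of a category sits at its index plus 0.5. The helpers below are hypothetical, written for illustration only, and are not part of Bokeh's API.

```python
def factor_center(factors, name):
    """Synthetic x of a category's center, assuming zero padding."""
    return factors.index(name) + 0.5


def bar_extent(center, width=0.8):
    """Left and right edge of a bar of the given width."""
    return (center - width / 2, center + width / 2)


factors = ["AJP13", "DNS", "HTTP", "ICMPv6", "TCP"]

# What range(len(...)) does: the first bar is centered at numeric x = 0,
# so its left half falls before the start of the axis and is clipped.
numeric_first = bar_extent(0)

# What a category name does: the first bar is centered at 0.5,
# so it lies fully inside the first category's slot.
categorical_first = bar_extent(factor_center(factors, factors[0]))

# Every numeric bar sits half a unit left of its tick.
shift = 0 - factor_center(factors, factors[0])
```

Under the same assumption, the numeric version of the plot could be repaired by passing `[i + 0.5 for i in range(n)]` as x, although passing the category names is simpler and does not depend on the padding settings.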