【Question title】: Is there a technique to create column headers for a Plotly Sankey diagram, similar to Tableau?
【Posted】: 2022-01-04 19:03:54
【Question】:

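Is there a technique for drawing a time-series flow diagram in which the node columns represent the start date of each month, the values represent the count for each type, and the labels represent the type (i.e., Consumer, Home Office, Corporate, and Small Business in the example image below)?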

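Plotly provides some examples of how to create a Sankey Diagram in Python. Adding dates as column headers, similar to the Tableau example Superstore Interactive Demo, would make the Sankey diagram clearer. For example, "Level 0 Region" would be replaced by "January 1, 2022", and "Level 2 Customer Segment" would be replaced by "February 1, 2022".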

【Question comments】:

    Tags: python plotly tableau-api sankey-diagram


    【Solution 1】:

    Sample data

    from_date            to_date              from_type       to_type      value  source                   target
    2022-01-01 00:00:00  2022-02-01 00:00:00  Consumer        Home Office      3  Consumer_20220101        Home Office_20220201
    2022-01-01 00:00:00  2022-03-01 00:00:00  Consumer        Corporate        6  Consumer_20220101        Corporate_20220301
    2022-01-01 00:00:00  2022-03-01 00:00:00  Small Business  Corporate       21  Small Business_20220101  Corporate_20220301
    2022-01-01 00:00:00  2022-04-01 00:00:00  Consumer        Home Office     14  Consumer_20220101        Home Office_20220401
    2022-02-01 00:00:00  2022-03-01 00:00:00  Corporate       Consumer        20  Corporate_20220201       Consumer_20220301
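For a reproducible starting point, the five sample rows above can be typed in directly instead of generated at random. This is only a sketch: the variable names `rows` and `sample` are illustrative and not part of the original answer, and the `source`/`target` columns are derived the same way the solution derives them (type, underscore, date as `YYYYMMDD`).

```python
import pandas as pd

# The five rows from the sample table, entered by hand (illustrative names).
rows = [
    ("2022-01-01", "2022-02-01", "Consumer", "Home Office", 3),
    ("2022-01-01", "2022-03-01", "Consumer", "Corporate", 6),
    ("2022-01-01", "2022-03-01", "Small Business", "Corporate", 21),
    ("2022-01-01", "2022-04-01", "Consumer", "Home Office", 14),
    ("2022-02-01", "2022-03-01", "Corporate", "Consumer", 20),
]
sample = pd.DataFrame(
    rows, columns=["from_date", "to_date", "from_type", "to_type", "value"]
)
sample["from_date"] = pd.to_datetime(sample["from_date"])
sample["to_date"] = pd.to_datetime(sample["to_date"])

# Node keys combine the type and the month start, e.g. "Consumer_20220101".
sample = sample.assign(
    source=lambda d: d["from_type"] + "_" + d["from_date"].dt.strftime("%Y%m%d"),
    target=lambda d: d["to_type"] + "_" + d["to_date"].dt.strftime("%Y%m%d"),
)
print(sample[["source", "target", "value"]])
```

Assigning `sample` to `df` in place of the simulated data should let the rest of the solution run unchanged, since it only relies on the `source`, `target`, and `value` columns.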

    Solution

    import pandas as pd
    import numpy as np
    import plotly.graph_objects as go
    
    ms = pd.date_range("1-jan-2022", freq="MS", periods=4)
    types = ["Consumer", "Home Office", "Corporate", "Small Business"]
    
    # simulate some data, date and type to date and type
    s = 50
    df = pd.DataFrame(
        {
            "from_date": np.random.choice(ms, s),
            "to_date": np.random.choice(ms, s),
            "from_type": np.random.choice(types, s),
            "to_type": np.random.choice(types, s),
            "value": np.random.randint(1, 20, s),
        }
    ).loc[
        # remove invalid combis from random generation
        lambda d: (d["to_date"] > d["from_date"]) & (d["from_type"] != d["to_type"])
    ].groupby(
        ["from_date", "to_date", "from_type", "to_type"], as_index=False
    ).sum()
    
    # start of solution, define source and target of sankey from column concat
    df = df.assign(source=lambda d: d["from_type"] + "_" + d["from_date"].dt.strftime("%Y%m%d"),
              target=lambda d: d["to_type"] + "_" + d["to_date"].dt.strftime("%Y%m%d"),
             )
    
    
    def factorize(s):
        a = pd.factorize(s, sort=True)[0]
        return (a + 0.01) / (max(a) + 0.1)
    
    
    # unique nodes
    nodes = np.unique(df[["source", "target"]], axis=None)
    nodes = pd.Series(index=nodes, data=range(len(nodes)))
    # work out positioning of nodes
    nodes = (
        nodes.to_frame("id")
        .assign(
            y=lambda d: factorize(d.index.to_series().apply(lambda s: s.split("_")[0])),
            x=lambda d: factorize(d.index.to_series().apply(lambda s: s.split("_")[1])),
        )
    )
    
    # now simple job of building sankey
    fig = go.Figure(
        go.Sankey(
            arrangement="snap",
            node={"label": nodes.index.to_series().apply(lambda s: s.split("_")[0]), "x": nodes["x"], "y": nodes["y"]},
            link={
                "source": nodes.loc[df["source"], "id"],
                "target": nodes.loc[df["target"], "id"],
                "value": df["value"],
            },
        )
    )
    
    # one header per distinct x position; Series.iteritems() was removed in pandas 2.0, so use items()
    for i, x in nodes["x"].drop_duplicates().items():
        fig.add_annotation(x=x, y=1.4, text=i.split("_")[1], showarrow=False)
        
    fig
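The annotation loop above prints the raw `YYYYMMDD` suffix as the column header, whereas the question asks for headers such as "January 1, 2022". A minimal sketch of one way to get readable headers, assuming node keys keep the `type_YYYYMMDD` shape used in the solution; `node_names`, `headers`, and `header_by_x` are hypothetical names, and the `factorize` helper is copied from the answer.

```python
import pandas as pd


def factorize(s):
    # Same helper as in the solution: map sorted categories into (0, 1).
    a = pd.factorize(s, sort=True)[0]
    return (a + 0.01) / (max(a) + 0.1)


# Hypothetical node keys in the "type_YYYYMMDD" form built by the solution.
node_names = pd.Series(
    ["Consumer_20220101", "Corporate_20220201", "Consumer_20220301"]
)
date_part = node_names.str.split("_").str[1]

# x position of each node column, one distinct value per month.
x = factorize(date_part)

# Readable header text for each column (format string is a free choice).
headers = pd.to_datetime(date_part, format="%Y%m%d").dt.strftime("%b %d, %Y")

# Map each x position to the header that would be passed as `text=`.
header_by_x = dict(zip(x.round(4), headers))
print(header_by_x)
```

In the solution's loop, the formatted string would take the place of `i.split("_")[1]` in the `text=` argument of `fig.add_annotation`.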
    

    【Comments】:

    • This is fantastic! It also handles the challenge of building the from/to pairs in a DataFrame. Thanks!