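Plotting two incomplete datasets in the same figure

Time: 2020-07-08 15:25:49

Tags: python plotly

My dataframe has two columns: the daily closing stock price and the earnings per share (EPS). The price is only available on weekdays, while EPS is only available on Saturdays. I want to plot both series in the same figure, using two y-axes.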

            close   eps
date
...         
2020-04-01  240.91  NaN
2020-03-31  254.29  NaN
2020-03-30  254.81  NaN
2020-03-28     NaN  2.59
2020-03-27  247.74  NaN
2020-03-26  258.44  NaN
...
2019-12-28     NaN  5.04
2019-12-27  289.80  NaN
...   

My approach so far uses plotly:

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.dropna(subset=["eps"]),
        y=df["eps"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()

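However, the resulting figure does not show the EPS at all. Across the dates where the eps column has no value, I would like the EPS to be drawn as a line connecting the available points.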


1 answer:

Answer 0 (score: 2)

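I'm not sure whether you want a step plot or just want to connect the points with a line. In the first case I think you can use df["eps"].fillna(method="ffill"), and in the second df["eps"].interpolate().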
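To make the difference between the two options concrete, here is a small sketch on a toy series. The dates and EPS values are made up for illustration; `ffill()` is used as the equivalent of `fillna(method="ffill")`.

```python
import numpy as np
import pandas as pd

# Toy series: two known EPS values separated by missing days (values are made up).
eps = pd.Series(
    [2.0, np.nan, np.nan, np.nan, 4.0],
    index=pd.date_range("2020-01-04", periods=5, freq="D"),
)

# Forward fill repeats the last known value, which plots as a step.
stepped = eps.ffill()

# Linear interpolation spreads the change evenly, which plots as a straight line.
linear = eps.interpolate()

print(pd.DataFrame({"eps": eps, "ffill": stepped, "interpolate": linear}))
```

The forward-filled column stays at 2.0 until the next known value, while the interpolated column rises in equal steps from 2.0 to 4.0.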

Generate data

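import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# One row per calendar day over two years
df = pd.DataFrame({"date": pd.date_range("2019-01-01", "2020-12-31")})

# Random placeholder values for price and EPS
df["close"] = np.abs(np.random.randn(len(df))) * 300
df["eps"] = np.abs(np.random.randn(len(df))) * 10

# Prices exist only on weekdays (Monday=0 ... Friday=4)
df["close"] = np.where(df["date"].dt.weekday >= 5,
                       np.nan,
                       df["close"])

# Keep EPS only on Saturdays in months 4, 8 and 12
df["eps"] = np.where((df["date"].dt.month % 4 == 0) &
                     (df["date"].dt.weekday == 5),
                     df["eps"],
                     np.nan)

# Last non-missing EPS of each month, labelled with the month-end date
# (pandas 2.2+ prefers freq="ME" over "M")
grp = df.set_index("date").groupby(pd.Grouper(freq="M"))["eps"].last().reset_index()

# Replace the daily EPS column with the month-end values
df = df.drop("eps", axis=1)
df = pd.merge(df, grp, how="left", on="date")

df = df.set_index("date")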

Using fillna(method="ffill")

# Carry the last known EPS forward so the line is drawn as steps.
# Series.ffill() is equivalent to fillna(method="ffill"), which newer pandas deprecates.
df["eps_fillna"] = df["eps"].ffill()

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["eps_fillna"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()


Using interpolate()

# Linearly interpolate between known EPS values so the points are joined by straight lines.
df["eps_interpolate"] = df["eps"].interpolate()

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["eps_interpolate"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()
