Plotly Dash: letting users remove and interpolate outliers

Asked: 2019-02-04 16:11:55

Tags: python plotly interpolation dashboard plotly-dash

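I am building a dashboard that plots a graph and lets the user exclude outliers by clicking points on the graph, then replaces the removed values using interpolation.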

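The main idea of this dashboard is to let people who don't use Python prepare data faster and more easily. The Dash app will also be used for simple data visualization (a kind of handmade BI). After all iterations, a new file containing the cleaned data, without outliers, will be written to .csv.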

I have run into two problems with this:

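  1. How to connect the data imported through the dashboard interface to the graph and the layout.
  2. How to select points on the graph, or a time period in a DatePickerRange, delete the values, and then either interpolate ( scipy.interpolate.interp1d ) the missing (deleted) values or replace them with a moving average ( pd.rolling_mean() ). I also found that pandas interpolation gives the same result, so it can be used instead.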
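A minimal sketch of the second point, assuming the data is a pandas Series with a DatetimeIndex. The names `mask_and_interpolate`, `replace_with_rolling_mean`, `start`, and `end` are hypothetical and not part of the code in the question. The first function blanks a date range (such as one chosen in a DatePickerRange) and fills it by time-weighted linear interpolation; the second fills it with a rolling mean of the remaining neighbours. `Series.rolling(...).mean()` is used because `pd.rolling_mean()` has been removed from current pandas releases.

```python
import numpy as np
import pandas as pd


def _range_mask(series, start, end):
    """Boolean mask of the index values that fall inside [start, end]."""
    return (series.index >= pd.Timestamp(start)) & (series.index <= pd.Timestamp(end))


def mask_and_interpolate(series, start, end):
    """Set values in [start, end] to NaN, then fill them by interpolation in time."""
    cleaned = series.astype(float).copy()
    cleaned.loc[_range_mask(cleaned, start, end)] = np.nan
    # Leading NaNs (a removed range at the very start) are left unfilled.
    return cleaned.interpolate(method="time")


def replace_with_rolling_mean(series, start, end, window=3):
    """Replace values in [start, end] with a centered rolling mean of the kept values."""
    cleaned = series.astype(float).copy()
    in_range = _range_mask(cleaned, start, end)
    cleaned.loc[in_range] = np.nan
    # min_periods=1 makes the mean ignore the NaNs that were just inserted.
    smoothed = cleaned.rolling(window=window, center=True, min_periods=1).mean()
    cleaned.loc[in_range] = smoothed[in_range]
    return cleaned


idx = pd.date_range("2019-01-01", periods=6, freq="D")
series = pd.Series([1.0, 2.0, 50.0, 4.0, 5.0, 6.0], index=idx)

filled = mask_and_interpolate(series, "2019-01-03", "2019-01-03")
averaged = replace_with_rolling_mean(series, "2019-01-03", "2019-01-03")
print(filled)
print(averaged)
```

With evenly spaced timestamps both approaches give the same value for a single removed point; they differ once the gap is wider than the rolling window or the timestamps are irregular.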

Here is the code block that parses the data:

import base64
import datetime
import io

import pandas as pd
# Dash >= 2.0; older releases import dash_html_components and dash_table instead
from dash import dash_table, html


def parse_contents(contents, filename, date):
    # contents is a data URL: "<content type>,<base64 payload>"
    content_type, content_string = contents.split(',')
    decoded = base64.b64decode(content_string)

    try:
        if 'csv' in filename:
            # Assume that the user uploaded a semicolon-separated CSV file
            df = pd.read_csv(io.StringIO(decoded.decode('cp1251')), sep = ';')
        elif 'tsv' in filename:
            # Assume that the user uploaded a TSV file
            df = pd.read_csv(io.StringIO(decoded.decode('utf-8')), sep = '\t')
        elif 'xls' in filename:
            # 'xls' also matches '.xlsx', so one branch covers both Excel formats
            df = pd.read_excel(io.BytesIO(decoded))
        else:
            # Without this branch, df would be undefined for other file types
            raise ValueError('Unsupported file type: ' + filename)

    except Exception as e:
        print(e)
        return html.Div([
            'There was an error processing this file.'
        ])

    return html.Div([
        html.H5(filename),
        html.H6(datetime.datetime.fromtimestamp(date)),

        dash_table.DataTable(
            # 'records' replaces the 'rows' alias, which newer pandas no longer accepts
            data = df.to_dict('records'),
            columns = [{'name': i, 'id': i} for i in df.columns]),

        html.Hr(),  # horizontal line

        # For debugging, display the raw contents provided by the web browser
        html.Div('Raw Content'),
        html.Pre(contents[0:10] + '...', style = {
            'whiteSpace': 'pre-wrap',
            'wordBreak': 'break-all'
        })
    ])

And the callback for the upload component:

from dash.dependencies import Input, Output, State

# dashboard is the dash.Dash application instance
@dashboard.callback(
    Output('output-data-upload', 'children'),
    [Input('upload-data', 'contents')],
    [State('upload-data', 'filename'),
     State('upload-data', 'last_modified')])
def update_output(list_of_contents, list_of_names, list_of_dates):
    # With multiple uploads enabled, each argument is a list (or None before any upload)
    if list_of_contents is not None:
        children = [parse_contents(c, n, d)
                    for c, n, d in zip(list_of_contents, list_of_names, list_of_dates)]
        return children

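This part of the code is taken from the official documentation. Being able to view the uploaded data is nice, but I want to use the column names and the dates in those columns to plot the data the same way I would in pandas.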
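One possible way to get from the uploaded file to the column names is sketched below. The helpers `contents_to_dataframe` and `column_options` are hypothetical names, and the separator and encoding defaults are assumptions; the sketch only relies on the `contents` property of the upload component being a base64 data URL, as in the parsing code above.

```python
import base64
import io

import pandas as pd


def contents_to_dataframe(contents, sep=";", encoding="utf-8"):
    """Decode an upload 'contents' string (a base64 data URL) into a DataFrame."""
    _content_type, content_string = contents.split(",", 1)
    decoded = base64.b64decode(content_string)
    return pd.read_csv(io.StringIO(decoded.decode(encoding)), sep=sep)


def column_options(df):
    """Build the list of dicts expected by the `options` property of a dropdown."""
    return [{"label": str(col), "value": col} for col in df.columns]


# Simulate what the browser would send for a small CSV upload.
raw = "Xdate;Yval\n2019-01-01;1.5\n2019-01-02;2.5\n"
contents = "data:text/csv;base64," + base64.b64encode(raw.encode("utf-8")).decode("ascii")

df = contents_to_dataframe(contents)
options = column_options(df)
print(options)
```

In the app, a callback with `Output('xaxis-column', 'options')` and `Input('upload-data', 'contents')` could return `column_options(df)`, with a second output for `'yaxis-column'`. If multiple uploads are enabled, `contents` arrives as a list, so the callback would have to pick one element first.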

To select the column names, I created two dropdown components:

# Create dropdown for X-axis
html.Div([
    dcc.Dropdown(
        id = 'xaxis-column',
        options = [{'label': i, 'value': i} for i in df.columns],
        value = 'Xdate')],
    style = {'width': '48%', 'display': 'inline-block'}),

# Create dropdown for Y-axis
html.Div([
    dcc.Dropdown(
        id = 'yaxis-column',
        options = [{'label': i, 'value': i} for i in df.columns],
        value = 'Yval')],
    style = {'width': '48%', 'float': 'right', 'display': 'inline-block'})

The graph part of the code:

import plotly.graph_objs as go
from dash.dependencies import Input, Output

# In the layout:
dcc.Graph(id = 'graph')

# Callback that redraws the graph:
@dashboard.callback(
    Output('graph', 'figure'),
    [Input('xaxis-column', 'value'),
     Input('yaxis-column', 'value'),
     Input('xaxis-type', 'value'),
     Input('yaxis-type', 'value'),
     Input('XYeardate--slider', 'value')])
def update_graph(xaxis_column_name, yaxis_column_name,
                 xaxis_type, yaxis_type, Year_value):

    dff = df[df['XYeardate'] == Year_value]

    # NOTE: this filtering assumes long-format data in which the 'Xval' column
    # holds the series name; for wide data, dff[xaxis_column_name] would be used.
    return {
        'data': [go.Scatter(
            x = dff[dff['Xval'] == xaxis_column_name]['Xdate'],
            y = dff[dff['Xval'] == yaxis_column_name]['Yval'],
            text = dff[dff['Xval'] == yaxis_column_name]['ID'],
            mode = 'markers',
            marker = {
                'size': 10,  # was 'size': 15
                'opacity': 0.5,
                'line': {'width': 0.5, 'color': 'white'}})],

        'layout': go.Layout(
            xaxis = {
                'title': xaxis_column_name,
                # a conditional expression needs an else branch
                'type': 'linear' if xaxis_type == 'Linear' else 'log'},

            yaxis = {
                'title': yaxis_column_name,
                'type': 'linear' if yaxis_type == 'Linear' else 'log'},

            margin = {'l': 40, 'b': 40, 't': 10, 'r': 0},
            hovermode = 'closest')}
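For removing points picked on the graph, a rough sketch follows. It assumes a single trace built from the DataFrame rows in their original order, so that the `pointIndex` of each selected point is the row position; the function name `drop_selected_points` and the sample data are hypothetical. The dictionary mimics the shape of the `selectedData` property that the graph component reports after a box or lasso selection.

```python
import numpy as np
import pandas as pd


def drop_selected_points(df, y_column, selected_data):
    """Blank the rows picked on the graph, then fill the blanks by linear interpolation."""
    cleaned = df.copy()
    cleaned[y_column] = cleaned[y_column].astype(float)
    if selected_data and selected_data.get("points"):
        # pointIndex is the position of the point inside the trace's x/y arrays.
        positions = [point["pointIndex"] for point in selected_data["points"]]
        cleaned.iloc[positions, cleaned.columns.get_loc(y_column)] = np.nan
    cleaned[y_column] = cleaned[y_column].interpolate(method="linear")
    return cleaned


df = pd.DataFrame({
    "Xdate": pd.date_range("2019-01-01", periods=4, freq="D"),
    "Yval": [1.0, 2.0, 40.0, 4.0],
})

# Shape of the dict the graph reports for a selection (only the keys used here).
selected = {"points": [{"curveNumber": 0, "pointNumber": 2, "pointIndex": 2}]}

cleaned = drop_selected_points(df, "Yval", selected)
csv_text = cleaned.to_csv(index=False, sep=";")  # cleaned data, ready to be saved
print(csv_text)
```

In the app this could be called from a callback that takes `Input('graph', 'selectedData')` (or `clickData` for single clicks) and returns the updated figure. If the plotted rows are a filtered subset, as with `dff` above, the positions would have to be mapped back to the original index before blanking.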

If needed, I can add other parts of the code in the comments.

Any comments would be greatly appreciated!

0 answers:

No answers yet