Linear warping of a pandas time series

Time: 2017-11-28 13:49:04

Tags: python pandas dataframe

I have a pandas DataFrame, df:

import pandas as pd
import numpy as np
np.random.seed(123)

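# Build one frame per unit, then concatenate them
# (DataFrame.append was removed in pandas 2.0).
frames = []
for i in np.arange(5):
    s_df = pd.DataFrame({'time': np.arange(100),
                         'x': np.arange(100),
                         'y': np.arange(100),
                         'r': np.random.randint(60, 100)})
    s_df['unit'] = str(i)
    frames.append(s_df)
df = pd.concat(frames)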

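For each 'unit', I want to select the 'x' and 'y' data from 'time' 0 up to the value of 'r', and then warp the selected data to fit a new normalized time scale of 0-100. The new DataFrame should look the same, but x and y will be stretched to fit the new time scale.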

1 answer:

Answer 0: (score: 0)

I think you can start from this and modify it:

# For each unit: keep rows with time <= r, then interpolate x and y
# at the kept times rescaled by 100/r.
df.groupby('unit', as_index=False, group_keys=False)\
  .apply(lambda g: g[g.time <= g.r.max()].pipe(lambda x: x.assign(x = np.interp(x.time * 100/x.r.max(), g.time, g.x),
                                                                  y = np.interp(x.time * 100/x.r.max(), g.time, g.y))))

Output:

     r  time         x         y unit
0   91     0  0.369445  0.802790    0
1   91     1  0.802881  0.411523    0
2   91     2  0.080290  0.228482    0
3   91     3  0.248865  0.624470    0
4   91     4  0.350376  0.207805    0
5   91     5  0.604447  0.495269    0
6   91     6  0.402430  0.317250    0
7   91     7  0.205757  0.296371    0
8   91     8  0.426954  0.793716    0
9   91     9  0.728095  0.486691    0
10  91    10  0.087941  0.701258    0
11  91    11  0.653719  0.937834    0
12  91    12  0.702571  0.381267    0
13  91    13  0.113419  0.492686    0
14  91    14  0.381172  0.539422    0
15  91    15  0.490320  0.166290    0
16  91    16  0.440490  0.029675    0
17  91    17  0.663973  0.245057    0
18  91    18  0.273116  0.280711    0
19  91    19  0.807658  0.869288    0
20  91    20  0.227972  0.987803    0
21  91    21  0.747295  0.526613    0
22  91    22  0.491929  0.118479    0
23  91    23  0.403465  0.564284    0
24  91    24  0.618359  0.648467    0
25  91    25  0.867436  0.447866    0
26  91    26  0.487128  0.526473    0
27  91    27  0.135412  0.855466    0
28  91    28  0.469281  0.753690    0
29  91    29  0.397495  0.786670    0
..  ..   ...       ...       ...  ...
53  82    53  0.985053  0.534743    4
54  82    54  0.255997  0.789710    4
55  82    55  0.629316  0.889916    4
56  82    56  0.730672  0.539548    4
57  82    57  0.484289  0.278669    4
58  82    58  0.120573  0.754350    4
59  82    59  0.071606  0.912240    4
60  82    60  0.126613  0.775831    4
61  82    61  0.392633  0.706384    4
62  82    62  0.312653  0.698514    4
63  82    63  0.164337  0.420798    4
64  82    64  0.655284  0.317136    4
65  82    65  0.526961  0.484673    4
66  82    66  0.205197  0.516752    4
67  82    67  0.405965  0.314419    4
68  82    68  0.892710  0.620090    4
69  82    69  0.351876  0.422846    4
70  82    70  0.601449  0.152340    4
71  82    71  0.187239  0.486854    4
72  82    72  0.757108  0.727058    4
73  82    73  0.728311  0.623236    4
74  82    74  0.725225  0.279149    4
75  82    75  0.536730  0.746806    4
76  82    76  0.584319  0.543595    4
77  82    77  0.591636  0.451003    4
78  82    78  0.042806  0.766688    4
79  82    79  0.326183  0.832956    4
80  82    80  0.558992  0.507238    4
81  82    81  0.303649  0.143872    4
82  82    82  0.303214  0.113151    4

[428 rows x 5 columns]
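
For readers who want the warp spelled out step by step, here is a minimal sketch. It is one possible reading of the question, not the answerer's method. It assumes that 'r' is constant within each unit and that "stretching" means linear interpolation onto an evenly spaced 0-100 time axis. The helper name `warp_unit`, its `n_points` parameter, and the small `demo` frame are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd


def warp_unit(g, n_points=101):
    """Stretch one unit's x and y from time 0..r onto a 0..100 scale."""
    r = g["r"].iloc[0]                          # assumed constant within a unit
    sel = g[g["time"] <= r].sort_values("time")  # rows from time 0 up to r
    new_time = np.linspace(0, 100, n_points)     # normalized time axis
    src_time = new_time * r / 100.0              # matching positions on the original axis
    t = sel["time"].to_numpy()
    return pd.DataFrame({
        "r": r,
        "time": new_time,
        "x": np.interp(src_time, t, sel["x"].to_numpy()),
        "y": np.interp(src_time, t, sel["y"].to_numpy()),
        "unit": g["unit"].iloc[0],
    })


# Hypothetical demo input: two units with different r values.
frames = []
for unit, r in [("0", 4), ("1", 8)]:
    frames.append(pd.DataFrame({
        "time": np.arange(10),
        "x": np.arange(10, dtype=float),
        "y": np.arange(10, dtype=float) ** 2,
        "r": r,
        "unit": unit,
    }))
demo = pd.concat(frames, ignore_index=True)

# Warp each unit separately and stack the results.
warped = pd.concat(
    [warp_unit(g) for _, g in demo.groupby("unit")],
    ignore_index=True,
)
```

Unlike the one-liner above, this version returns the same number of rows for every unit (`n_points`), so all units share one time axis. Whether that is preferable to keeping the original row count depends on how the result is used downstream.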