Pandas DataFrame with a continuous index

Time: 2016-09-02 12:19:02

Tags: python pandas indexing

I have the following code:

import numpy as np
import pandas as pd
df = pd.DataFrame(
    {'Index' : ['1', '2', '5','7', '8', '9', '10'],
     'Vals' : [1, 2, 3, 4, np.nan, np.nan, 5]})

This gives me:

  Index  Vals
0     1   1.0
1     2   2.0
2     5   3.0
3     7   4.0
4     8   NaN
5     9   NaN
6    10   5.0

But what I want is this:

  Index      Vals
0     1  1.000000
1     2  2.000000
2     3  NaN
3     4  NaN
4     5  3.000000
5     6  NaN
6     7  4.000000
7     8  NaN
8     9  NaN
9    10  5.000000

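I tried to achieve this by creating a new DataFrame with a continuous index. Next I want to assign the values I already have to it, but how? The only thing I have managed so far is: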

clean_data = pd.DataFrame({'Index' : range(1,11)})

This gives me:

   Index
0      1
1      2
2      3
3      4
4      5
5      6
6      7
7      8
8      9
9     10

2 answers:

Answer 0: (score: 3)

So, for your example, it would look like this:

import pandas as pd
import numpy as np 

df = pd.DataFrame(
    {'Index' : ['1', '2', '5','7', '8', '9', '10'],
     'Vals' : [1, 2, 3, 4, np.nan, np.nan, 5]})
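# The Index column holds strings, so convert it to integers before merging.
df['Index'] = df['Index'].astype(int)
# Build the full 1..10 range and merge the existing values onto it.
clean_data = pd.DataFrame({'Index' : range(1,11)})
result = clean_data.merge(df,on="Index",how='outer')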

The result is:

   Index  Vals
0      1   1.0
1      2   2.0
2      3   NaN
3      4   NaN
4      5   3.0
5      6   NaN
6      7   4.0
7      8   NaN
8      9   NaN
9     10   5.0
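
As a possible variant of the merge above (a sketch, not part of the original answer): how='left' keeps exactly one row per entry of clean_data, and indicator=True adds a _merge column that separates rows filled in by the merge from rows whose Vals was already NaN in the input. The name left_result is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})
df['Index'] = df['Index'].astype(int)

clean_data = pd.DataFrame({'Index': range(1, 11)})

# how='left' keeps every row of clean_data, in its original order.
# indicator=True adds a '_merge' column: 'both' for rows that existed in df,
# 'left_only' for rows that were missing from df.
left_result = clean_data.merge(df, on='Index', how='left', indicator=True)

# Rows that the merge had to add (Index values absent from df).
added_rows = left_result.loc[left_result['_merge'] == 'left_only', 'Index'].tolist()
print(left_result)
print(added_rows)
```

Here Index 8 and 9 are marked 'both' even though their Vals is NaN, because those rows were present in the original data.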

Answer 1: (score: 1)

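You can move the Index column into the index (after converting it to integers), select rows 1 to 10 (which creates the corresponding NaNs), and reset the index.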

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Index' : ['1', '2', '5','7', '8', '9', '10'],
     'Vals' : [1, 2, 3, 4, np.nan, np.nan, 5]})
df['Index'] = df['Index'].astype(int)

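# Use reindex for the full 1..10 range; in recent pandas versions,
# .loc with labels missing from the index raises a KeyError.
df = df.set_index('Index').reindex(range(1, 11)).reset_index()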

Output:

   Index  Vals
0      1   1.0
1      2   2.0
2      3   NaN
3      4   NaN
4      5   3.0
5      6   NaN
6      7   4.0
7      8   NaN
8      9   NaN
9     10   5.0
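
If the range is not known in advance, the same idea can be wrapped in a small function that derives it from the data. This is only a sketch: fill_missing_index is a hypothetical helper name, and it assumes the column holds unique integers, since reindex does not work on an index with duplicate labels.

```python
import numpy as np
import pandas as pd


def fill_missing_index(frame, column):
    """Hypothetical helper: add NaN rows so that `column` covers
    every integer from its minimum to its maximum."""
    indexed = frame.set_index(column)
    # Assumption: the values are unique integers.
    full_range = range(indexed.index.min(), indexed.index.max() + 1)
    # reindex inserts NaN rows for labels missing from the index.
    return indexed.reindex(full_range).rename_axis(column).reset_index()


df = pd.DataFrame(
    {'Index': ['1', '2', '5', '7', '8', '9', '10'],
     'Vals': [1, 2, 3, 4, np.nan, np.nan, 5]})
df['Index'] = df['Index'].astype(int)

filled = fill_missing_index(df, 'Index')
print(filled)
```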