Creating a hierarchical index from a .tsv file (pandas)

Time: 2016-06-20 21:01:58

Tags: python csv pandas indexing dataframe

I have a data frame of this shape (not the actual data):

Fruit Banana House-1 15

Fruit Banana House-2 4

Fruit Apple House-2 6

Fruit Apple House-2 8

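Vegetable Broccoli House-3 8

Vegetable Lettuce House-4 12

Vegetable Peppers House-5 3

Vegetable Corn House-4 4

Seasoning Olive Oil House-6 2

Seasoning Vinegar House-7 2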

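I'd like to know whether there is a way in pandas to create a data frame with a hierarchical index on these two levels: food type and food, and then assign each entry its location and amount. I can't do this by hand because the actual dataset has more than 60,000 rows. One approach I thought of is to build a list from the tsv file and use it as the index, but I imagine there is a more automatic way. Thanks in advance!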

1 answer:

Answer 0: (score: 0)

Try this:

In [11]: df
Out[11]:
        Type       Food      Loc  Num
0      Fruit     Banana  House-1   15
1      Fruit     Banana  House-2    4
2      Fruit      Apple  House-2    6
3      Fruit      Apple  House-2    8
4  Vegetable   Broccoli  House-3    8
5  Vegetable    Lettuce  House-4   12
6  Vegetable    Peppers  House-5    3
7  Vegetable       Corn  House-4    4
8  Seasoning  Olive Oil  House-6    2
9  Seasoning    Vinegar  House-7    2

In [13]: df = df.set_index(['Type','Food'])

In [14]: df
Out[14]:
                         Loc  Num
Type      Food
Fruit     Banana     House-1   15
          Banana     House-2    4
          Apple      House-2    6
          Apple      House-2    8
Vegetable Broccoli   House-3    8
          Lettuce    House-4   12
          Peppers    House-5    3
          Corn       House-4    4
Seasoning Olive Oil  House-6    2
          Vinegar    House-7    2
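
Once the two-level index is in place, rows can be looked up by food type or by a (type, food) pair. The following is a minimal sketch of that, assuming the same sample rows as above; the variable names (`indexed`, `fruit`, `bananas`, `totals`) are illustrative only, and the `sort_index()` call is an optional addition that keeps tuple lookups on the index efficient.

```python
import pandas as pd

# Sample rows copied from the frame shown above.
df = pd.DataFrame(
    [
        ("Fruit", "Banana", "House-1", 15),
        ("Fruit", "Banana", "House-2", 4),
        ("Fruit", "Apple", "House-2", 6),
        ("Fruit", "Apple", "House-2", 8),
        ("Vegetable", "Broccoli", "House-3", 8),
        ("Vegetable", "Lettuce", "House-4", 12),
        ("Vegetable", "Peppers", "House-5", 3),
        ("Vegetable", "Corn", "House-4", 4),
        ("Seasoning", "Olive Oil", "House-6", 2),
        ("Seasoning", "Vinegar", "House-7", 2),
    ],
    columns=["Type", "Food", "Loc", "Num"],
)

# Build the two-level index; sorting it is optional but avoids
# slow lookups on an unsorted MultiIndex.
indexed = df.set_index(["Type", "Food"]).sort_index()

# All rows for one food type.
fruit = indexed.loc["Fruit"]

# All rows for one (Type, Food) pair.
bananas = indexed.loc[("Fruit", "Banana")]

# Total amount per (Type, Food) pair.
totals = indexed.groupby(level=["Type", "Food"])["Num"].sum()

print(fruit)
print(bananas)
print(totals)
```

Because the sample contains repeated (Type, Food) pairs, a lookup by pair can return several rows rather than a single one.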

By the way, you can do this "on the fly" while reading the TSV / CSV file:

import pandas as pd

# read_csv takes the column labels via `names` (it has no `columns` argument)
df = pd.read_csv(filename, sep='\t', header=None,
                 names=['Type','Food','Loc','Num'],
                 index_col=['Type','Food'])
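
To try the read-time approach without a file on disk, here is a small self-contained sketch. It assumes a header-less, tab-separated input; `tsv_text` is a hypothetical in-memory stand-in for the real file, holding a few of the sample rows. The column labels are passed through `names`, which is the `read_csv` parameter for supplying labels when the file has no header row.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real .tsv file: tab-separated, no header row.
tsv_text = (
    "Fruit\tBanana\tHouse-1\t15\n"
    "Fruit\tBanana\tHouse-2\t4\n"
    "Fruit\tApple\tHouse-2\t6\n"
    "Vegetable\tBroccoli\tHouse-3\t8\n"
    "Seasoning\tOlive Oil\tHouse-6\t2\n"
)

# Name the columns and build the two-level index in a single call.
df = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    header=None,
    names=["Type", "Food", "Loc", "Num"],
    index_col=["Type", "Food"],
)

print(df)
```

With a real file, the `io.StringIO(tsv_text)` argument would be replaced by the file path.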