Pandas: split the values in a row into multiple rows by delimiter

Time: 2019-06-28 08:46:15

Tags: python pandas

I have a Pandas DataFrame in the following format.

[apple]
[banana]
[apple, orange]

I want to transform it so that it contains only the unique values, with each value on its own row:

apple
banana
orange

2 answers:

Answer 0: (score: 2)

First unnest your lists into rows, then use drop_duplicates.

import pandas as pd

# Make example dataframe
df = pd.DataFrame({'Col1':[['apple'], ['banana'], ['apple', 'orange']]})

              Col1
0          [apple]
1         [banana]
2  [apple, orange]

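Then explode the column and drop the duplicates. explode_list is the function used from the linked answer:

df = explode_list(df, 'Col1').drop_duplicates()

Output

     Col1
0   apple
1  banana
2  orange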
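
The explode_list helper from the linked answer is not reproduced in this thread. As a minimal sketch, assuming pandas 0.25 or later, the built-in DataFrame.explode can stand in for it. This is a substitute technique, not the linked answer's implementation, and the names unique_df and unique_values are only illustrative.

```python
import pandas as pd

# Same example frame as in the answer
df = pd.DataFrame({'Col1': [['apple'], ['banana'], ['apple', 'orange']]})

# explode puts each list element on its own row and repeats the original
# index label; drop_duplicates then keeps the first occurrence of each value.
unique_df = df.explode('Col1').drop_duplicates()
unique_values = unique_df['Col1'].tolist()
print(unique_df)
```

Because explode repeats the original index labels, call reset_index(drop=True) afterwards if a fresh 0..n-1 index is needed.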

Answer 1: (score: 2)

You can flatten the list of lists with itertools.chain.from_iterable() and use OrderedDict to remove duplicates while preserving order:

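from collections import OrderedDict
import itertools

# Flatten the lists, then drop duplicates while keeping first-seen order.
# The assignment only works when the number of unique values equals the number of rows.
df['col2'] = list(OrderedDict.fromkeys(itertools.chain.from_iterable(df['col'])))
print(df)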

               col    col2
0          [apple]   apple
1         [banana]  banana
2  [apple, orange]  orange
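
Assigning the result back to df works here only because there happen to be three unique values and three rows. A sketch that keeps the unique values separate instead, assuming Python 3.7+ where a plain dict preserves insertion order; unique_in_order is a hypothetical helper name, not part of pandas or itertools.

```python
from itertools import chain

def unique_in_order(rows):
    """Flatten an iterable of lists and keep each value's first occurrence."""
    # dict.fromkeys drops repeats and, on Python 3.7+, preserves insertion order.
    return list(dict.fromkeys(chain.from_iterable(rows)))

rows = [['apple'], ['banana'], ['apple', 'orange']]
result = unique_in_order(rows)
print(result)
```

With a DataFrame, pass the column itself, for example unique_in_order(df['col']), and wrap the result in pd.Series or pd.DataFrame if a pandas object is needed.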