Convert a list of floats to a set of values

Time: 2018-02-23 09:53:57

Tags: python pandas

Question 1: Can we convert a list of float values to a set?

Data:

A   B
1   [1212.0, 2121.0, 323.0]
2   [2222.0, 2222.0, 323.0]
3   [3232.0, 2323.0, 323.0]

dtype(B) = object

Expected output:

A   B
1   {1212, 2121, 323}
2   {2222, 2222, 323}
3   {3232, 2323, 323}

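Question 2:

I have a DataFrame in which I combine clusters with songs. If a cluster contains a null value, as cluster 1 does, the null should be ignored and only the entries that hold a number should be kept.

Data: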

cluster songs
1   11
2   22
1   22
2   
3   22
1   
3   11
4   

Current output:

cluster songs
1   [11,  22, ]
2   [22, ]
3   [22,11]
4   []

Expected output:

cluster songs
1   [11, 22]
2   [22]
3   [22, 11]
4   []

1 answer:

Answer 0: (score: 1)

Use a list comprehension to cast every value to int:

df.B = df.B.apply(lambda x: [int(i) for i in x])

Or:

df.B = [[int(i) for i in x] for x in df.B]

print (df)
   A                  B
0  1  [1212, 2121, 323]
1  2  [2222, 2222, 323]
2  3  [3232, 2323, 323]

For sets:

df.B = df.B.apply(lambda x: {int(i) for i in x})

Or:

df.B = [{int(i) for i in x} for x in df.B]

print (df)
   A                  B
0  1  {2121, 323, 1212}
1  2        {323, 2222}
2  3  {3232, 323, 2323}

But if you only need to convert to sets and can keep the float values:

df.B = df.B.apply(set)
print (df)
   A                        B
0  1  {2121.0, 323.0, 1212.0}
1  2          {323.0, 2222.0}
2  3  {3232.0, 323.0, 2323.0}
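
As a self-contained sketch of the conversion above, the following rebuilds the sample data from the question and turns each list of floats into a set of ints. The frame construction and the name `values` are assumptions added for illustration; only the `apply` call mirrors the answer.

```python
import pandas as pd

# Sample data copied from the question; column B holds Python lists (dtype object).
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [
        [1212.0, 2121.0, 323.0],
        [2222.0, 2222.0, 323.0],
        [3232.0, 2323.0, 323.0],
    ],
})

# Set comprehension: cast each float to int and drop duplicates in one pass.
df["B"] = df["B"].apply(lambda values: {int(v) for v in values})

print(df)
```

Note that a set cannot hold duplicates, so the second row collapses to two elements, and the order in which a set prints is not guaranteed.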

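For the other question:

# Remember every cluster label before any rows are dropped
uniq = df['cluster'].unique()
# Drop rows with no song, then cast the remaining values to int
df = df.dropna(subset=['songs'])
df.songs = df.songs.astype(int)
# Collect songs per cluster; clusters left without songs get an empty list
df = df.groupby('cluster')['songs'].apply(list).reindex(uniq, fill_value=[])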
print (df)
cluster
1    [11, 22]
2        [22]
3    [22, 11]
4          []
Name: songs, dtype: object
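
The grouping steps above can also be sketched end to end on the sample data. This variant assumes the empty cells are stored as NaN (the question does not say how they are encoded), and it fills missing clusters after `reindex` rather than through `fill_value`, so each empty cluster gets its own list object. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; empty song cells are assumed to be NaN.
df = pd.DataFrame({
    "cluster": [1, 2, 1, 2, 3, 1, 3, 4],
    "songs": [11, 22, 22, np.nan, 22, np.nan, 11, np.nan],
})

# Keep every cluster label, including those that lose all their rows below.
clusters = df["cluster"].unique()

# Drop rows without a song and cast the remaining values to int.
valid = df.dropna(subset=["songs"]).astype({"songs": int})

# One list of songs per cluster; clusters with no songs come back as NaN.
result = valid.groupby("cluster")["songs"].apply(list).reindex(clusters)

# Replace the NaN placeholders with a fresh empty list for each cluster.
result = result.apply(lambda v: v if isinstance(v, list) else [])

print(result)
```

If the blanks are actually empty strings, they would need to be converted to NaN first (for example with `replace`) before `dropna` can remove them.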