Split rows of a pandas Series containing multi-line strings into separate rows

Time: 2014-11-26 17:14:45

Tags: python pandas split series

I have a pandas Series full of strings:

In:    
s = pd.Series(['This is a single line.', 'This is another one.', 'This is a string\nwith more than one line.'])

Out:
0                        This is a single line.
1                          This is another one.
2    This is a string\nwith more than one line.
dtype: object

How can I split every row in this Series that contains a newline character \n into rows of its own? What I expect is:

0      This is a single line.
1        This is another one.
2            This is a string
3    with more than one line.
dtype: object

I know I can split each row on the newline character with:

s = s.str.split('\n')

which gives

0                        [This is a single line.]
1                          [This is another one.]
2    [This is a string, with more than one line.]

But this only splits the string inside its row into a list; it does not give each token a row of its own.

1 Answer:

Answer 0 (score: 4)

You can iterate over every string in every row to build a new Series:

pd.Series([j for i in s.str.split('\n') for j in i])

It may make more sense to do this on the input rather than creating a temporary Series, for example:

strings = ['This is a single line.', 'This is another one.', 'This is a string\nwith more than one line.']
pd.Series([j for i in strings for j in i.split('\n')])
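
As an alternative that is not part of the original answer: newer pandas versions (0.25 and later, as far as I know) provide Series.explode, which turns each element of a list-valued row into its own row. The sketch below assumes such a version is installed and reuses the question's sample data; the names exploded and result are only illustrative.

```python
import pandas as pd

s = pd.Series([
    'This is a single line.',
    'This is another one.',
    'This is a string\nwith more than one line.',
])

# Split each string into a list of lines, then give every list element its own row.
exploded = s.str.split('\n').explode()

# explode() repeats the original index label for lines that came from the same row,
# so reset the index to get a fresh 0..n-1 index like the expected output.
result = exploded.reset_index(drop=True)

print(result)
```

Keeping the intermediate exploded Series can be handy when you need to know which original row each line came from, since its index still carries the original labels. If a plain renumbered Series is all you want, the list comprehension above does the same job without depending on the pandas version.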