Converting strings with time suffixes to numbers in numpy

Time: 2016-11-09 20:49:31

Tags: pandas numpy

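I have a numpy array whose values look like "1.0s", "100ms", and so on. I cannot plot it (using pandas, by putting the array into a Series), because pandas does not recognize these values as numbers. How can I get numpy or pandas to interpret them as numbers while taking the suffixes into account?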

2 answers:

Answer 0: (score: 1)

See the question "how do I get at the pandas.offsets object given an offset string".

  • Use pandas.tseries.frequencies.to_offset
  • Convert to timedeltas
  • Get the total seconds
import pandas as pd
from pandas.tseries.frequencies import to_offset

s = pd.Series(['1.0s', '100ms', '10s', '0.5T'])
pd.to_timedelta(s.apply(to_offset)).dt.total_seconds()

0     1.0
1     0.1
2    10.0
3    30.0
dtype: float64
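
As a possible shortcut, `pd.to_timedelta` can also parse strings such as "1.0s" and "100ms" directly, so the `to_offset` step may not be needed when every suffix is a plain timedelta unit like `s` or `ms`. The following is a minimal sketch under that assumption; the sample values and the names `raw` and `seconds` are illustrative and not taken from the question.

```python
import pandas as pd

# Illustrative values, including stray leading spaces like those in the question
raw = pd.Series([" 1.0s", " 100ms", "10s"])

# Strip whitespace, parse each string as a Timedelta, then express it in seconds
seconds = pd.to_timedelta(raw.str.strip()).dt.total_seconds()

print(seconds.tolist())
```

The result is a float Series, so it can be plotted directly. Suffixes that are frequency aliases rather than timedelta units may still need the `to_offset` route shown above.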

Answer 1: (score: 0)

This code should solve your problem.

import pandas as pd

# Test data
se = pd.Series(['10s', '100ms', '1.0s'])

# Pattern matching an integer or float followed by a unit (ms or s)
pat = r"([0-9]*\.?[0-9]+)(ms|s)"
# Extract the number and the unit into two columns
df = se.str.extract(pat, flags=0, expand=True)
# Rename the columns
df.columns = ['value', 'unit']
# Convert the numeric strings to numbers
df['value'] = pd.to_numeric(df['value'])
# Convert every row to the same unit (milliseconds)
in_seconds = df['unit'] == 's'
df.loc[in_seconds, 'value'] *= 1000
df.loc[in_seconds, 'unit'] = 'ms'

# Now you are ready to plot!
print(df['value'])
# 0    10000.0
# 1      100.0
# 2     1000.0
# Name: value, dtype: float64
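
The same idea (split each string into a number and a unit, then scale by the unit) can be extended to more suffixes with a lookup table instead of one branch per unit. Below is a rough sketch using only the standard library; the table `UNIT_TO_SECONDS`, the helper `to_seconds`, and the set of supported suffixes are assumptions made for illustration and are not part of either answer.

```python
import re

# Hypothetical suffix table: multiplier that converts each unit to seconds
UNIT_TO_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "min": 60.0, "h": 3600.0}

# An integer or decimal number followed by a lowercase unit suffix
PATTERN = re.compile(r"\s*([0-9]*\.?[0-9]+)\s*([a-z]+)\s*")


def to_seconds(text):
    """Convert a string such as '100ms' to a float number of seconds."""
    match = PATTERN.fullmatch(text)
    if match is None or match.group(2) not in UNIT_TO_SECONDS:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_TO_SECONDS[unit]


values = [to_seconds(v) for v in [" 1.0s", " 100ms", "10s"]]
print(values)
```

The resulting list of floats can be passed to `np.asarray` or `pd.Series` before plotting. Raising on unknown suffixes is a deliberate choice here, so that a typo in the data does not silently become a wrong number.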