Converting a list of "integer" strings to integers while accounting for "non-numeric" strings in Python

Asked: 2019-04-30 20:03:00

Tags: python numpy weather

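I am fetching data from an online database. It returns the dates and values as strings in lists, e.g. ['87', '79', '50', 'M', '65'] (these are the y-axis values, while the x-axis values are the years associated with them, e.g. ['2018', '2017', '2016', '2015', '2014']). Before plotting the values, I first need to convert them to integers. Normally maxT_int = list(map(int, maxTList)) is enough, but as the example above shows, data is sometimes missing, and a missing value is marked with 'M'.

What I would like to do is remove the 'M', or account for it in some way, so that I can still plot the values.

When there is no 'M' in the list, the values plot fine. Any suggestions on how best to handle this?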
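
For reference, a minimal reproduction of the failure described above, using the sample list from the question. The try/except wrapper and the name `message` are only there to show the error text and are not part of the original code.

```python
maxTList = ['87', '79', '50', 'M', '65']

try:
    # int() accepts '87', '79' and '50', then raises on the 'M' marker
    maxT_int = list(map(int, maxTList))
except ValueError as err:
    message = str(err)
    print(message)  # invalid literal for int() with base 10: 'M'
```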

My full code is listed below.

import urllib
import datetime
import urllib.request
import ast
from bokeh.plotting import figure
# output_file and show are called at the end of the script, so they must be imported
from bokeh.io import output_file, show
import numpy as np



# Get user input for day
# in the format of mm-dd
print("Enter a value for the day that you would like to plot.")
print("The format should be mm-dd")
dayofmonth = input("What day would you like to plot? ")


# testing out a range of years
y = datetime.datetime.today().year

# get starting year
ystart = int(input("What year would you like to start with? "))
# get number of years back
ynum = int(input("How many years would you like to plot? "))
# calculate the number of years back to start from current year
diff = y - ystart
#assign values to the list of years
years = list(range(y-diff,y-(diff+ynum), -1))

start = y - diff
endyear = y - (diff+ynum)

i = 0
dateList=[]
minTList=[]
maxTList=[]
for year in years:
    sdate = (str(year) + '-' + dayofmonth)
    #print(sdate)

    url = "http://data.rcc-acis.org/StnData"

    values = {
    "sid": "KGGW",
    "date": sdate,
    "elems": "maxt,mint",
    "meta": "name",
    "output": "json"
    }

    data = urllib.parse.urlencode(values).encode("utf-8")


    req = urllib.request.Request(url, data)
    response = urllib.request.urlopen(req)
    results = response.read()
    results = results.decode()
    results = ast.literal_eval(results)

    if i < 1:
        n_label = results['meta']['name']
        i = 2
    for x in results["data"]:
            date,maxT,minT = x
            #setting the string of date to datetime

            date = date[0:4]
            date_obj = datetime.datetime.strptime(date,'%Y')
            dateList.append(date_obj)
            minTList.append(minT)
            maxTList.append(maxT)

maxT_int = list(map(int,maxTList))


# setting up the array for numpy
x = np.array(years)
y = np.array(maxT_int)


p = figure(title="Max Temps by Year for the day " + dayofmonth + " " + n_label, x_axis_label='Years',
           y_axis_label='Max Temps', plot_width=1000, plot_height=600)

p.line(x,y,  line_width=2)
output_file("temps.html")
show(p)

3 Answers:

Answer 0 (score: 1)

You can use numpy.nan together with a function:

import numpy as np

lst = ['87', '79', '50', 'M', '65']

def convert(item):
    if item == 'M':
        return np.nan
    else:
        return int(item)

new_lst = list(map(convert, lst))
print(new_lst)

Or, if you prefer a list comprehension:

new_lst = [int(item) if item != 'M' else np.nan for item in lst]


Both produce

[87, 79, 50, nan, 65]
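
One thing to keep in mind with this approach is that NaN never compares equal to anything, including itself, so missing entries have to be detected with an isnan test rather than with ==. The sketch below assumes only the standard library and uses `math.nan` as a stand-in for `np.nan` (both are ordinary float NaN values). The names `temps` and `valid` are illustrative.

```python
import math

# Same shape as new_lst above, with math.nan standing in for np.nan
temps = [87, 79, 50, math.nan, 65]

print(math.nan == math.nan)  # False, so `t == nan` cannot find the gaps

# Keep only the real readings before computing statistics
valid = [t for t in temps if not math.isnan(t)]

print(valid)                    # [87, 79, 50, 65]
print(max(valid))               # 87
print(sum(valid) / len(valid))  # 70.25
```

Keeping the NaN in the list that is plotted preserves the one-to-one match with the years; filtering is only needed for calculations that should ignore the missing day.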

Answer 1 (score: 0)

Try this:

>>> maxTList = ['87', '79', '50', 'M', '65']
>>> maxT_int = [int(item) for item in maxTList if item.isdigit()]
>>> maxT_int
[87, 79, 50, 65]

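The code now simply discards the non-digit strings (as specified in the question), which makes maxT_int shorter than maxTList (in that case you must apply the same algorithm to the other lists, to make sure that the corresponding years are excluded as well).
If you want the lists to stay the same length, you can specify a default value for strings that are not a valid int (note that the order of if and for is reversed):

>>> maxT_int2 = [int(item) if item.isdigit() else -1 for item in maxTList]
>>> maxT_int2
[87, 79, 50, -1, 65]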
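
A caveat for temperature data: str.isdigit() returns False for strings with a minus sign or surrounding whitespace, so a reading such as '-5' would be treated as missing. A possible alternative, sketched here with a hypothetical helper named `to_int`, is to attempt the conversion and catch the ValueError. The sample list `raw` is made up for illustration.

```python
def to_int(text, default=None):
    """Return int(text), or `default` when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


# Illustrative input with a negative value and padded whitespace
raw = ['87', '-5', ' 50 ', 'M', '65']

print([s.isdigit() for s in raw])  # [True, False, False, False, True]
print([to_int(s) for s in raw])    # [87, -5, 50, None, 65]
```

Using None as the default, instead of a sentinel such as -1, avoids confusing a missing reading with a real temperature of -1.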

Answer 2 (score: 0)

You can use list comprehensions, iterating over the y values twice.

raw_x = ['2018', '2017', '2016', '2015', '2014']
raw_y = ['87', '79', '50', 'M', '65']

clean_x = [x for x, y in zip(raw_x, raw_y) if y != 'M']
clean_y = [y for y in raw_y if y != 'M']
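
As a variation on the comprehensions above, the filtering and the integer conversion can be done in a single pass by building (year, value) pairs first. This is only a sketch; the name `pairs` is illustrative.

```python
raw_x = ['2018', '2017', '2016', '2015', '2014']
raw_y = ['87', '79', '50', 'M', '65']

# Filter once on the y value and convert both columns while the pairs are together
pairs = [(int(x), int(y)) for x, y in zip(raw_x, raw_y) if y != 'M']

clean_x = [x for x, _ in pairs]
clean_y = [y for _, y in pairs]

print(clean_x)  # [2018, 2017, 2016, 2014]
print(clean_y)  # [87, 79, 50, 65]
```

Because each year travels with its value, the two output lists cannot drift out of alignment.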