Extracting data from a specific line of text files using numpy

Time: 2017-02-26 07:24:55

Tags: python numpy

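I have a file containing some numeric data. What I need to do is find the third-to-last line, extract the data from it, and put it into another file named newyork.csv.

However, I am having trouble extracting the 3rd, 4th, 5th, etc. line from the end; here is my code. My guess is that the problem starts at the line `for line in file2:`

How can I get this code to extract the third-to-last line?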

Weather = open("weathernewyork.txt",'r').read().split('\n')

csvholder = [] 
i = 0
while i < len(Weather):
    with open(Weather[i], 'r') as file2:
        reader = csv.reader(file2)
        wtr__12 = open(Weather[i]).read().split('\n')            
        new_york_d  = wtr__12[-3]
        new_york_d = numpy.array(new_york_d)

        for line in file2:
            line = line.strip()
            new_york_d = line.split(",")[4:]


        xx = numpy.array(new_york_d).reshape(-1,8)
        csvholder.append(xx)        
        i = i+1

xxz = numpy.array(csvholder).reshape(-1,8)
numpy.savetxt("newyork.csv", xxz, delimiter=",", fmt='%s')

This is what the data file looks like.

0.0002%,3/30/2005,0.205130307,-0.001238007,1,0,0,0,0,1,1,1 <- I want to extract this one
0.0004%,3/31/2005,-0.10252641,-0.010432191,1,0,0,0,1,1,1,1
-0.0009%,4/1/2005,0.101510875,-0.000877706,1,0,0,0,0,1,1,1 <- Python extracts this one which I don't want 

Update: I forgot to mention that the file opened in

Weather = open("weathernewyork.txt",'r').read().split('\n')

contains the names of other files to open. I am looking at all the counties.

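2 Answers:

Answer 0: (score: 1)

I'm not sure what you are trying to do in the larger context, but if all you want is to "find the third-to-last line and extract its data", then this is how I would do it.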

Sample data, where the third-to-last line (t,h,i,s,t,h,i,s) is the one to get:

a,b,c,d,a,b,c,d
a,b,c,d,a,b,c,d
a,b,c,d,a,b,c,d
t,h,i,s,t,h,i,s
a,b,c,d,a,b,c,d
a,b,c,d,a,b,c,d
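
One way to pick the third-to-last line from data like the sample above is to read the lines into a list and index it with -3. The following is only a minimal sketch under that assumption, not necessarily the code this answer originally used; the helper name third_from_last is hypothetical, and an in-memory io.StringIO stands in for a real file.

```python
import io


def third_from_last(lines):
    """Return the third-to-last non-empty line, without its newline."""
    # Drop blank lines so a trailing newline at the end of the file
    # does not shift the index.
    rows = [ln.strip() for ln in lines if ln.strip()]
    if len(rows) < 3:
        raise ValueError("need at least three non-empty lines")
    return rows[-3]


# In-memory stand-in for the sample data file shown above.
sample = io.StringIO(
    "a,b,c,d,a,b,c,d\n"
    "a,b,c,d,a,b,c,d\n"
    "a,b,c,d,a,b,c,d\n"
    "t,h,i,s,t,h,i,s\n"
    "a,b,c,d,a,b,c,d\n"
    "a,b,c,d,a,b,c,d\n"
)

picked = third_from_last(sample)
fields = picked.split(",")
```

With a real file, the same helper could be called as third_from_last(open(path)), where path is whichever data file is being processed.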

Then you can store it in another csv as you showed.

Answer 1: (score: 0)

Use a deque to limit the number of lines stored and take the first element:

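from collections import deque
import numpy

csvholder = []
# weathernewyork.txt lists one data file name per line
with open("weathernewyork.txt", 'r') as filenames:
    for filename in filenames:
        filename = filename.strip()
        if not filename:
            continue  # skip blank lines in the list of file names
        with open(filename) as datafile:
            # keep only the last 3 lines; data[0] is then the third-to-last
            data = deque(datafile, 3)
        # drop the first 4 columns and keep the remaining 8 values
        csvholder.append(data[0].strip().split(',')[4:])
xxz = numpy.array(csvholder).reshape(-1, 8)
numpy.savetxt("newyork.csv", xxz, delimiter=",", fmt='%s')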
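
To see why the first element of the deque is the third-to-last line, the bounded deque can be tried on the three rows from the question. This is a small sketch, not part of the original answer: the helper name tail_first is hypothetical, and io.StringIO stands in for one of the data files.

```python
from collections import deque
import io


def tail_first(lines, n=3):
    """Return the n-th line from the end using a bounded deque."""
    # A deque with maxlen=n discards older items as new ones arrive,
    # so after consuming the file it holds only the final n lines.
    last = deque(lines, maxlen=n)
    if len(last) < n:
        raise ValueError("fewer than n lines available")
    return last[0]


# The three rows shown in the question's data file.
datafile = io.StringIO(
    "0.0002%,3/30/2005,0.205130307,-0.001238007,1,0,0,0,0,1,1,1\n"
    "0.0004%,3/31/2005,-0.10252641,-0.010432191,1,0,0,0,1,1,1,1\n"
    "-0.0009%,4/1/2005,0.101510875,-0.000877706,1,0,0,0,0,1,1,1\n"
)

# Take the third-to-last line and keep columns 5 to 12 (8 values).
row = tail_first(datafile).strip().split(",")[4:]
```

Only n lines are held in memory at any time, which matters for large files. Note that a blank line at the end of a file counts as a line here and would shift the result by one.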