Stripping data from a file

Date: 2019-05-23 07:08:58

Tags: python data-manipulation

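I have a file in which the data (numbers) is laid out in 8 columns (this is a screenshot of the original data). I managed to write Python code that reads all of the values, arranges them into a single column, and dumps them to a text file. The file contains 30,000 data points, and I need to strip out part of the data (for example, keep only data points 1400 to 8000) and then dump that to a text file. I am stuck on this. Any pointers on where to start?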

Here is the code I wrote to put everything into one column.

file = open("C://Users//Randall//Desktop//randallseismic//Deep//Nakasen//EW1//AKTH170807240026.EW1", 'r')

scaled_values_file = open("scaled_values.txt", 'w')

scaled_factor = float(raw_input())

for line in file:        
    P = line.split()
    #print(P)
    P[-1] = P[-1].strip()
    for i in range(0,len(P)):
        #print(P[i])
        x = float(P[i]) * scaled_factor
        #print(x)
        y = str(x)
        scaled_values_file.write(y + "\n")

file.close()
scaled_values_file.close()

1 answer:

Answer 0 (score: 0)

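There are a number of ways to do this. Here is how I would go about it. I would not worry about reading the file line by line (a personal choice for small files), and I would use "with" to open the files (both for reading and for writing) so that each file is closed automatically.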
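As a small illustration of the "with" behaviour described above, here is a minimal sketch. The helper name `write_and_read` and the temporary demo file are hypothetical and exist only for this example; they are not part of the original question or answer.

```python
import os
import tempfile


def write_and_read(path, lines):
    """Write lines to path, then read them back, using `with` for both steps."""
    with open(path, "w") as out:
        for line in lines:
            out.write(line + "\n")
    # Leaving the `with` block has already closed `out`.

    with open(path, "r") as src:
        content = src.read().split("\n")
    # Likewise, `src` is closed here without an explicit close() call.

    return out.closed, src.closed, content


# Hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
out_closed, src_closed, content = write_and_read(path, ["1.0 2.0", "3.0 4.0"])
print(out_closed, src_closed, content)
```

Note that splitting on "\n" leaves an empty string at the end of the list when the file ends with a newline.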

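You only want to extract the data between points 1400 and 8000, and there are eight points per row, so you can work out which rows of the source file to keep: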

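import numpy as np

# slice bounds must be integers (on Python 3, "/" returns a float), so convert;
# whole rows are kept, so a few extra points on either side are included
firstRow = int(1400 / 8 - 1)
lastRow = int(8000 / 8 + 1)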

scaled_factor = float(raw_input())  # Python 2; on Python 3 use input()

srcFile = "C://Users//Randall//Desktop//randallseismic//Deep//Nakasen/EW1//AKTH170807240026.EW1"

dstFile = "scaled_values.txt"

with open(srcFile, "r") as file:
    # read the whole file, split on newlines, then keep the rows of interest
    # (mode must be "r": opening the source with "w" would erase it)
    data = file.read().split("\n")[firstRow:lastRow]

# now, convert your list of rows in "data" to an array and change type
data = np.array([r.split() for r in data], dtype=float)

# flatten array to 1d and apply scaling factor
data = data.flatten() * scaled_factor


# write data to file
with open(dstFile, "w") as fid:
    for dataPoint in data:
        fid.write(str(dataPoint) + "\n")
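With the bounds used in the answer, the row-based slice keeps whole rows, which corresponds to points 1393 to 8008 rather than exactly 1400 to 8000. If the exact range matters, one alternative is to flatten first and slice by point index. The following is a minimal sketch of that idea, not the answerer's method: it uses plain Python lists instead of NumPy, the function name `extract_points` and the demo files are hypothetical, and it assumes that points are counted from 1, that both ends are inclusive, and that the file contains nothing but whitespace-separated numbers.

```python
import os
import tempfile


def extract_points(src_path, dst_path, first_point, last_point, scale=1.0):
    """Write points first_point..last_point (1-based, inclusive) to dst_path, scaled."""
    with open(src_path, "r") as src:
        # split() with no arguments splits on any whitespace, including
        # newlines, so the 8-column layout collapses into one flat list
        values = [float(tok) for tok in src.read().split()]

    # 1-based inclusive range -> 0-based half-open slice
    selected = [v * scale for v in values[first_point - 1:last_point]]

    with open(dst_path, "w") as dst:
        for v in selected:
            dst.write(str(v) + "\n")
    return selected


# Hypothetical demo: 3 rows of 8 points holding the numbers 1..24
tmp_dir = tempfile.mkdtemp()
demo_src = os.path.join(tmp_dir, "demo_source.txt")
demo_dst = os.path.join(tmp_dir, "demo_slice.txt")
with open(demo_src, "w") as f:
    for row in range(3):
        f.write(" ".join(str(row * 8 + col + 1) for col in range(8)) + "\n")

result = extract_points(demo_src, demo_dst, 6, 19, scale=2.0)
print(len(result), result[0], result[-1])
```

For the file in the question, the call would look like extract_points(srcFile, dstFile, 1400, 8000, scaled_factor), under the same assumptions.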