Question

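I have xyz text files that need to be gridded. For each xyz file I know the origin coordinates and the number of rows/columns. However, records with missing z values are absent from the xyz file, so creating a grid directly from the records that are present fails because of the missing values. So I tried this: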

import numpy as np

# infile: path to the comma-separated xyz file (the first row is a header)
nxyz = np.loadtxt(infile,delimiter=",",skiprows=1)

ncols = 4781
nrows = 4405
xllcorner = 682373.533843
yllcorner = 205266.898604
cellsize = 1.25

grid = np.zeros((nrows,ncols))

for item in nxyz:
    # array indices must be integers; int() truncates toward zero
    idx = int((item[0]-xllcorner)/cellsize)
    idy = int((item[1]-yllcorner)/cellsize)
    grid[idy,idx] = item[2]

with open(r"e:\test\myrasout.txt","w") as outfile:
    # flip vertically so the row with the largest y is written first
    np.savetxt(outfile,grid[::-1], fmt="%.2f",delimiter= " ")

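This gives me a grid with zeros wherever the xyz file has no record. It works for smaller files, but I get an out-of-memory error for a file of 290 MB (~8900000 records). And that is not the largest file I have to process.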

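So I tried Joe Kington's alternative (iterative) approach for loading xyz files, which I found here. It worked for the 290 MB file, but failed with an out-of-memory error on the next larger file (533 MB, ~1560000 records).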

How can I grid these larger files correctly (accounting for the missing records) without running out of memory?

Answer 1

Following the comments, I changed the code to

ncols = 4781
nrows = 4405
xllcorner = 682373.533843
yllcorner = 205266.898604
cellsize = 1.25
grid = np.zeros((nrows,ncols))

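with open(infile) as f:
    next(f)  # skip the header row (the question used skiprows=1)
    for line in f:
        item = line.split(",") # use whatever separates the values
        # split() returns strings, so convert before doing arithmetic
        idx = int((float(item[0])-xllcorner)/cellsize)
        idy = int((float(item[1])-yllcorner)/cellsize)
        #...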
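
For illustration only, the line-by-line idea above could be completed roughly as follows. This is a sketch, not the answerer's code: the function name `grid_from_xyz_stream`, the bounds check, and the `float32` grid are assumptions, and it assumes that xllcorner/yllcorner mark the lower-left corner of the lower-left cell, so the index is taken with floor.

```python
import io
import math

import numpy as np


def grid_from_xyz_stream(lines, nrows, ncols, xllcorner, yllcorner, cellsize,
                         delimiter=",", skip_header=1):
    """Fill a grid one record at a time; cells without a record stay 0."""
    # float32 halves the grid's memory compared with the float64 default
    grid = np.zeros((nrows, ncols), dtype=np.float32)
    for lineno, line in enumerate(lines):
        if lineno < skip_header or not line.strip():
            continue  # skip header rows and blank lines
        x, y, z = (float(v) for v in line.split(delimiter)[:3])
        # assumption: the corner is the cell edge, so floor gives the cell index
        col = math.floor((x - xllcorner) / cellsize)
        row = math.floor((y - yllcorner) / cellsize)
        if 0 <= row < nrows and 0 <= col < ncols:
            grid[row, col] = z  # records outside the grid are ignored
    return grid


# tiny in-memory stand-in for a real file; a file object works the same way
sample = io.StringIO("x,y,z\n0.625,0.625,1.5\n3.125,1.875,7.0\n")
demo = grid_from_xyz_stream(sample, nrows=2, ncols=3,
                            xllcorner=0.0, yllcorner=0.0, cellsize=1.25)
```

Only the grid and a single line of text are held in memory at any time, so the size of the input file no longer matters much; the price is a Python-level loop, which is slow for millions of records.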

Answer 2

You can use NumPy fancy indexing. Try something like this instead of the loop, which is probably the root of your problem:

grid = np.zeros((nrows,ncols))
# index arrays must be integers; rows come from y, columns from x
grid[nxyz[:,1].astype(int),nxyz[:,0].astype(int)] = nxyz[:,2]

With the conversion for the origin and cell size it gets a bit more involved:

grid = np.zeros((nrows,ncols))
cols = ((nxyz[:,0]-xllcorner)/cellsize).astype(int)
rows = ((nxyz[:,1]-yllcorner)/cellsize).astype(int)
grid[rows,cols] = nxyz[:,2]

If this does not help, the nxyz array is too large, but I doubt it. If it is, you can load the text file in several parts and apply the above to each part in turn.
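
Loading the file in parts might look something like the sketch below. The function name `grid_in_chunks` and the `chunk_lines` parameter are made up for this example, and the floor-based index assumes that xllcorner/yllcorner mark the lower-left corner of the lower-left cell. Each part is parsed with np.loadtxt and written into the grid with fancy indexing, so only one part is in memory at a time.

```python
import io
from itertools import islice

import numpy as np


def grid_in_chunks(fh, nrows, ncols, xllcorner, yllcorner, cellsize,
                   chunk_lines=100_000, delimiter=",", skip_header=1):
    """Grid an xyz text stream part by part using fancy indexing."""
    grid = np.zeros((nrows, ncols), dtype=np.float32)
    for _ in range(skip_header):
        next(fh, None)  # discard header rows
    while True:
        chunk = list(islice(fh, chunk_lines))  # read at most chunk_lines lines
        if not chunk:
            break  # end of file
        block = [ln for ln in chunk if ln.strip()]  # drop blank lines
        if not block:
            continue
        # ndmin=2 keeps a single-record part two-dimensional
        xyz = np.loadtxt(block, delimiter=delimiter, ndmin=2)
        cols = np.floor((xyz[:, 0] - xllcorner) / cellsize).astype(int)
        rows = np.floor((xyz[:, 1] - yllcorner) / cellsize).astype(int)
        # ignore records outside the grid (negative indices would wrap around)
        ok = (rows >= 0) & (rows < nrows) & (cols >= 0) & (cols < ncols)
        grid[rows[ok], cols[ok]] = xyz[ok, 2]
    return grid


# tiny in-memory stand-in for a real file, read two lines at a time
sample = io.StringIO("x,y,z\n0.625,0.625,1.5\n3.125,1.875,7.0\n1.875,0.625,3.0\n")
demo = grid_in_chunks(sample, nrows=2, ncols=3, xllcorner=0.0, yllcorner=0.0,
                      cellsize=1.25, chunk_lines=2)
```

A larger `chunk_lines` means fewer np.loadtxt calls but a higher peak memory use, so the value is something to tune against the available RAM.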

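P.S. You probably know the range of the data in the text file, and you can limit memory usage by stating it explicitly when loading the file. For example, if your values fit in 16-bit integers: np.loadtxt("myfile.txt", dtype=np.int16).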
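
To get a feel for what the dtype choice means here, the memory needed by the grid alone can be worked out from the nrows and ncols given in the question. The helper `grid_bytes` is only for this illustration.

```python
import numpy as np

nrows, ncols = 4405, 4781  # grid size from the question


def grid_bytes(dtype):
    """Bytes needed for an nrows x ncols array of the given dtype."""
    return nrows * ncols * np.dtype(dtype).itemsize


for name in ("float64", "float32", "int16"):
    print(f"{name}: {grid_bytes(name) / 2**20:.1f} MiB")
```

Note that coordinates such as 682373.533843 are outside the int16 range (at most 32767), so a small dtype suits the grid of z values rather than the x/y columns of the loaded records.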

How to grid large xyz files with missing records without running out of memory
