Python: loading a large set of files

Time: 2016-01-27 12:28:30

Tags: python numpy

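I am trying to load a large number of files saved in the Ensight Gold format into a numpy array. To do the reading I have written my own class, libvec, which reads the geometry file and then preallocates the arrays that Python will use to hold the data, as shown in the code below.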

N = len(file_list)
# Create the class object and read geometry file
gvec = vec.libvec(os.path.join(current_dir,casefile))
x,y,z = gvec.xyz()

# Preallocate arrays
U_temp = np.zeros((len(y),len(x),N),dtype=np.dtype('f4'))
V_temp = np.zeros((len(y),len(x),N),dtype=np.dtype('f4'))
u_temp = np.zeros((len(x),len(y),N),dtype=np.dtype('f4'))
v_temp = np.zeros((len(x),len(y),N),dtype=np.dtype('f4'))

# Read the individual files into the previously allocated arrays
for idx,current_file in enumerate(file_list):
    U,V = gvec.readvec(os.path.join(current_dir,current_file))
    U_temp[:,:,idx] = U
    V_temp[:,:,idx] = V

    del U,V

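However, this seems to take forever, so I was wondering whether you have any idea how to speed this process up. The code that reads the individual files into the array structure is shown below: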

def readvec(self,filename):
    # For now we assume that the naming scheme PIV__vxy.case / PIV__vxy.geo does not change.
    # If it does, appropriate changes have to be made to the corresponding file.
    data_temp = np.loadtxt(filename, dtype=np.dtype('f4'), delimiter=None, converters=None, skiprows=4)

    # U value
    for i in range(len(self.__y)):
        # x value counter
        for j in range(len(self.__x)):
            # y value counter
            self.__U[i,j]=data_temp[i*len(self.__x)+j]

    # V value
    for i in range(len(self.__y)):
        # x value counter
        for j in range(len(self.__x)):
            # y value counter
            self.__V[i,j]=data_temp[len(self.__x)*len(self.__y)+i*len(self.__x)+j]

    # W value
    if len(self.__z)>1:

        for i in range(len(self.__y)):
            # x value counter
            for j in range(len(self.__x)):
                # y value counter
                self.__W[i,j]=data_temp[2*len(self.__x)*len(self.__y)+i*len(self.__x)+j]

        return self.__U,self.__V,self.__W
    else:    
        return self.__U,self.__V

Thanks a lot in advance and best regards,

J

1 Answer:

Answer 0 (score: 1)

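It is hard to say without any test input/output to compare against. But I think this would give you the same U/V arrays as the nested for loops in readvec, and it should be much faster than the for loops.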

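# size_x = len(x), size_y = len(y); rows run over y because the loops use U[i,j] = data[i*size_x + j]
U = data[:size_x*size_y].reshape(size_y, size_x)
V = data[size_x*size_y:2*size_x*size_y].reshape(size_y, size_x)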
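As a quick sanity check, here is a minimal sketch comparing slice-and-reshape against the nested loops from readvec. The grid sizes and the `data` array are made up for illustration, and the sizes are deliberately unequal so that a swapped shape would show up. Because the loops index `U[i,j]` with `i` running over y and `j` over x, the reshape target used here is `(size_y, size_x)`.

```python
import numpy as np

# Hypothetical grid sizes (not from the question), unequal on purpose.
size_x, size_y = 4, 3

# Stand-in for the flat array returned by np.loadtxt: all U values, then all V values.
data = np.arange(2 * size_x * size_y, dtype="f4")

# Reference result: the element-by-element loops used in readvec.
U_loop = np.zeros((size_y, size_x), dtype="f4")
V_loop = np.zeros((size_y, size_x), dtype="f4")
for i in range(size_y):
    for j in range(size_x):
        U_loop[i, j] = data[i * size_x + j]
        V_loop[i, j] = data[size_x * size_y + i * size_x + j]

# Vectorized version: slice the flat array, then reshape (rows over y, columns over x).
n = size_x * size_y
U = data[:n].reshape(size_y, size_x)
V = data[n:2 * n].reshape(size_y, size_x)

# True when both approaches produce identical arrays.
same = np.array_equal(U, U_loop) and np.array_equal(V, V_loop)
```

Slicing and reshaping a contiguous array returns views, so no per-element Python work is done and no extra copy is made at this step.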

Returning these directly into U_temp/V_temp should help as well. Right now you are making 3(?) copies of your data to get it into U_temp/V_temp:

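  1. From the file into data_temp
  2. From data_temp into self.__U/V
  3. From U/V into U_temp/V_temp

That said, my guess is that the two nested for loops, which access one element at a time, are what is causing the slowness.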
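To illustrate skipping the intermediate self.__U/V copy, here is a rough sketch that writes reshaped views straight into the preallocated arrays. The helper `read_uv`, the in-memory `fake_file` inputs, and the grid sizes are all hypothetical. It assumes, as the question's `np.loadtxt(..., skiprows=4)` call suggests, four header lines followed by one value per line with all U values before all V values.

```python
import io
import numpy as np

def read_uv(source, nx, ny, skiprows=4):
    """Hypothetical helper: return U and V as (ny, nx) views of the flat data."""
    flat = np.loadtxt(source, dtype="f4", skiprows=skiprows)
    n = nx * ny
    return flat[:n].reshape(ny, nx), flat[n:2 * n].reshape(ny, nx)

# Made-up grid sizes and in-memory stand-ins for the real data files.
nx, ny = 3, 2

def fake_file(offset):
    header = "h1\nh2\nh3\nh4\n"
    values = "\n".join(str(float(v + offset)) for v in range(2 * nx * ny))
    return io.StringIO(header + values)

sources = [fake_file(0), fake_file(100)]
N = len(sources)

# Preallocate once, as in the question.
U_temp = np.zeros((ny, nx, N), dtype="f4")
V_temp = np.zeros((ny, nx, N), dtype="f4")

# Each file is copied exactly once: from the loaded flat array into its slot.
for idx, src in enumerate(sources):
    U_temp[:, :, idx], V_temp[:, :, idx] = read_uv(src, nx, ny)
```

With real files, `sources` would be the list of file paths instead. If the loading itself still dominates after the loops are gone, the text parsing in `np.loadtxt` is the next thing to profile.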