Question

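I am reading a CSV file into a DataFrame while specifying the data type of each column. If the CSV file contains blank rows, this code raises an error. How can I read the CSV without the blank rows?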

dtype = {'material_id': object, 'location_id' : object, 'time_period_id' : int, 'demand' : int, 'sales_branch' : object, 'demand_type' : object }

df = pd.read_csv('./demand.csv', dtype = dtype)

I thought of a workaround along these lines, but I'm not sure it is efficient:

df=pd.read_csv('demand.csv')
df=df.dropna()

and then redefine the column data types in df.
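The workaround described above could be sketched roughly as follows. This is only an illustration under assumptions: the inline `csv_text` is a hypothetical stand-in for demand.csv (the real file is not shown), and it assumes the "blank" rows contain only delimiters. Every column is first read as a string so that missing values cannot break the integer conversion, the all-empty rows are dropped, and the integer types are applied last.

```python
import io

import pandas as pd

# Hypothetical sample standing in for demand.csv; the ",,,,," line is a
# "blank" row that still contains delimiters.
csv_text = (
    "material_id,location_id,time_period_id,demand,sales_branch,demand_type\n"
    "M1,L1,1,10,B1,T1\n"
    ",,,,,\n"
    "M2,L2,2,20,B2,T2\n"
)

# Read everything as strings first so missing values cannot break an int cast.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Drop rows in which every column is missing, then renumber the index.
df = df.dropna(how="all").reset_index(drop=True)

# Apply the intended integer types now that the empty rows are gone.
df = df.astype({"time_period_id": int, "demand": int})
print(df.dtypes)
```

Using `how="all"` rather than a bare `dropna()` keeps rows that are only partially filled in; whether those should also be dropped depends on the data.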

Edit: code -

import pandas as pd
dtype1 = {'material_id': object, 'location_id' : object, 'time_period_id' : int, 'demand' : int, 'sales_branch' : object, 'demand_type' : object }
df = pd.read_csv('./demand.csv', dtype = dtype1)
df

Error - ValueError: Integer column has NA values in column 2

A snapshot of my CSV file -

Answer 1

Try this:

data = pd.read_table(filenames, skip_blank_lines=True, na_filter=True)

Answer 2

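This worked for me.

import pandas as pd

def delete_empty_rows(file_path, new_file_path):
    # Truly empty lines are skipped while parsing
    data = pd.read_csv(file_path, skip_blank_lines=True)
    # Drop rows in which every field is missing (for example ",,,")
    data.dropna(how="all", inplace=True)
    data.to_csv(new_file_path, header=True)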

Answer 3

A possible solution:

data = pd.read_table(filenames,skip_blank_lines=True, na_filter=True)
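A caveat worth checking: `skip_blank_lines=True` and `na_filter=True` are already the defaults, so passing them may not change anything on its own. The sketch below uses a made-up three-column sample (not the asker's file) to show that a line containing only delimiters still comes through as a row of NaN, and that pandas' nullable `"Int64"` dtype is one way to read such a file without the ValueError. It uses `read_csv` on comma-separated text; `read_table` defaults to a tab separator.

```python
import io

import pandas as pd

# Made-up sample: one truly empty line and one delimiter-only line.
csv_text = (
    "material_id,time_period_id,demand\n"
    "M1,1,10\n"
    "\n"      # truly empty line: skipped by default
    ",,\n"    # delimiters only: parsed as a row of NaN
    "M2,2,20\n"
)

# Both keyword arguments below are the defaults, spelled out for clarity.
raw = pd.read_csv(io.StringIO(csv_text), skip_blank_lines=True, na_filter=True)
print(len(raw))  # 3 rows: the ",," line survives as an all-NaN row

# The nullable integer dtype "Int64" tolerates missing values while reading.
dtype = {"material_id": object, "time_period_id": "Int64", "demand": "Int64"}
df = pd.read_csv(io.StringIO(csv_text), dtype=dtype).dropna(how="all")
print(df)
```

If plain NumPy integers are needed afterwards, `df.astype({"demand": int})` should work once no missing values remain in that column.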

Answer 4

I'm not sure whether it is efficient, but it works. This code avoids loading NaN values when reading the CSV.

data_mod[21:28]

Answer 5

try.csv

s,v,h,h
1,2,3,4

4,5,6,7



9,10,1,2

Python code

import pandas as pd

df = pd.read_csv('try.csv', delimiter=',')
print(df)

Output

   s   v  h  h.1
0  1   2  3    4
1  4   5  6    7
2  9  10  1    2
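The output above reflects the default `skip_blank_lines=True`. As a rough check, the sketch below inlines the same content as try.csv (so it runs without the file) and compares the default with `skip_blank_lines=False`, which keeps each empty line as an all-NaN row that `dropna(how="all")` can then remove.

```python
import io

import pandas as pd

# Same content as try.csv, inlined: 3 data rows and 4 empty lines.
csv_text = "s,v,h,h\n1,2,3,4\n\n4,5,6,7\n\n\n\n9,10,1,2"

# Default behaviour: empty lines are skipped during parsing.
default = pd.read_csv(io.StringIO(csv_text))

# With skipping turned off, each empty line becomes a row of NaN.
kept = pd.read_csv(io.StringIO(csv_text), skip_blank_lines=False)

print(len(default))  # 3
print(len(kept))     # 7 (3 data rows + 4 all-NaN rows)

# Dropping the all-NaN rows recovers the three data rows.
print(kept.dropna(how="all"))
```

Note that this only covers lines that are completely empty. Lines made up solely of delimiters are not treated as blank and need `dropna(how="all")` regardless of the `skip_blank_lines` setting.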

pandas read_csv: removing blank rows