How do I remove the triple double quotes around values in a CSV file?

Time: 2019-05-07 10:02:01

Tags: python sql pandas postgresql

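I have a CSV file in which every value is wrapped in triple double quotes ("""). I want to remove them so I can process the data further.

This is my CSV file: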

Name,age,class,place
""""ishika""","""21""","""B"""","""Whitefield"""
"""anju""","""23""","""C""","""ITPL"""

I want the output to be:

Name,age,class,place
ishika,21,B,Whitefield
anju,23,C,ITPL

I am exporting the CSV from a Postgres table:

import csv

import numpy as np
import pandas as pd
import psycopg2

import config as cfg

conn = cfg.DATABASE_CONNECT
cur = conn.cursor()

tablename = "sf_paymentprofile_error_log"
query = "SELECT * from {}".format(tablename)
outputquery = "COPY ({0}) TO STDOUT WITH CSV HEADER".format(query)

# Stream the COPY output into the open file object f
with open(cfg.PG_EXTRACT_PATH + 'sf_paymentprofile_error_log.csv', 'w') as f:
    cur.copy_expert(outputquery, f)

conn.commit()
conn.close()

I want to produce the output shown above using Python.

3 answers:

Answer 0: (score: 0)

Using pandas:

import pandas as pd

df = pd.read_csv("your_file.csv")


for i in df.columns :         
    df[i] = df[i].apply(lambda x: str(x).replace('"',''))

df.to_csv("output.csv",index=False)

If the data is a list of rows:

output = []
for row in your_data:  # your_data: a list of rows, each row a list of strings
    b = []
    for val in row:
        b.append(val.replace('"', ''))
    output.append(b)

print(output)

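Answer 1: (score: 0)

You can remove them by treating them as quote characters. The csv module only accepts a single-character quotechar, though, so collapse each run of quotes into one quote first: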

import csv
import re

with open('data.csv') as f:
    # collapse every run of " (such as """) into a single "
    data = (re.sub(r'"+', '"', line) for line in f.readlines())
    # now parse it as a normal CSV
    rd = csv.reader(data, delimiter=',', quotechar='"')
    # print each row without quotes
    for row in rd:
        print(','.join(row))

Alternatively, if you consider it safe, run re.sub('"', '', f.read()) on the whole file.
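
As a minimal sketch of that whole-file approach, the snippet below removes every double quote and then parses what remains as ordinary CSV. The helper name strip_all_quotes and the in-memory sample are illustrative assumptions, not part of the original answer. It is only safe when no field contains a comma or a double quote that should be kept.

```python
import csv
import io


def strip_all_quotes(text):
    """Drop every double-quote character, then parse the rest as plain CSV."""
    cleaned = text.replace('"', '')
    return list(csv.reader(io.StringIO(cleaned)))


# Sample copied from the question, including the irregular first data row
sample = (
    'Name,age,class,place\n'
    '""""ishika""","""21""","""B"""","""Whitefield"""\n'
    '"""anju""","""23""","""C""","""ITPL"""\n'
)

rows = strip_all_quotes(sample)
for row in rows:
    print(','.join(row))
```

For a real file, read its contents with open(...).read(), pass them to the helper, and write the rows back out with csv.writer.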

Answer 2: (score: 0)

Both Series.str.replace and Series.str.strip will help here, for example:

df.apply(lambda x: x.str.strip('"'))

However, some rows of your CSV contain runs of " that hide some of the , separators, so if I only apply strip I get:

import pandas as pd

df = pd.read_csv("my.csv")
df = df.apply(lambda x: x.str.strip('"'))
print(df)

     Name age            class place
0  ishika  21  B"","Whitefield   NaN
1    anju  23                C  ITPL

The first workaround I found is to change the quotechar parameter:

import pandas as pd

df = pd.read_csv("my.csv", quotechar="'")
df = df.apply(lambda x: x.str.strip('"'))
print(df)

     Name age class       place
0  ishika  21     B  Whitefield
1    anju  23     C        ITPL
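
A related sketch, offered as an assumption rather than as part of the original answer: instead of substituting a different quotechar, quoting=csv.QUOTE_NONE asks read_csv to treat " as an ordinary character, so the stray quotes no longer hide any separators. The sample is held in memory via io.StringIO for illustration, and dtype=str is used so that .str.strip works on every column.

```python
import csv
import io

import pandas as pd

# Sample copied from the question, including the irregular first data row
sample = (
    'Name,age,class,place\n'
    '""""ishika""","""21""","""B"""","""Whitefield"""\n'
    '"""anju""","""23""","""C""","""ITPL"""\n'
)

# QUOTE_NONE: do no quote processing, so each field splits on commas only
df = pd.read_csv(io.StringIO(sample), quoting=csv.QUOTE_NONE, dtype=str)

# Strip the leftover double quotes from both ends of every value
df = df.apply(lambda col: col.str.strip('"'))

# Values are still strings at this point; convert age back to a number
df["age"] = pd.to_numeric(df["age"])
print(df)
```

Like the quotechar workaround, this only holds up when no field legitimately contains a comma.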