Question

import pandas as pd
from pandas import Series, DataFrame

dframe_final.to_csv('C:/Program Files/source/csv_data/2015/Merged files/jjj_all_merged.csv')

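I have this part of the code, and I need to add a new column at the end of this CSV file and name it "New_name". It should be filled in according to different criteria:

For example, if cell1 is "a" and cell2 is "b" and cell3 is "1" and cell4 is "2 or 5", then enter "OK". If not, enter "NOT OK" or leave it blank.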

Column 1    Column 2    Column 3    Column 4    "New_name"
a              b           1           2            "OK"
a              b           1           5            "OK"
c              d           e           f          "NOT OK"

Please help!!! :)

Answer 1

IIUC：

import numpy as np
# True only for rows that meet every condition; Column 4 may be '2' or '5'
mask = (df['Column 1'] == 'a') & (df['Column 2'] == 'b') & (df['Column 3'] == '1') & (df['Column 4'].isin(['2', '5']))
df['new_value'] = np.where(mask, 'OK', 'NOT OK')

Output:

  Column 1 Column 2 Column 3 Column 4 "New_name" new_value
0        a        b        1        2       "OK"        OK
1        a        b        1        5       "OK"        OK
2        c        d        e        f   "NOT OK"    NOT OK
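For reference, here is a minimal self-contained sketch of the same idea that writes the result directly into a column called "New_name", as the question asks. The sample frame is rebuilt by hand from the question's table, and every value is assumed to be a string. That assumption matters: if a column in the real CSV holds only digits, read_csv will parse it as integers by default, and a comparison against '1' would then never match.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; all values are kept as strings (assumption).
df = pd.DataFrame({
    "Column 1": ["a", "a", "c"],
    "Column 2": ["b", "b", "d"],
    "Column 3": ["1", "1", "e"],
    "Column 4": ["2", "5", "f"],
})

# Combine the per-column conditions with &; isin covers the "2 or 5" case.
mask = (
    (df["Column 1"] == "a")
    & (df["Column 2"] == "b")
    & (df["Column 3"] == "1")
    & df["Column 4"].isin(["2", "5"])
)

# Rows where mask is True get "OK", all other rows get "NOT OK".
df["New_name"] = np.where(mask, "OK", "NOT OK")
print(df)
```

To leave non-matching rows blank instead, an empty string can be passed as the third argument of np.where.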

Answer 2

In my case:

import pandas as pd
import csv

The value passed in the options means 'append'. Then I import the CSV file as a 'pd.DataFrame':

df = pd.read_csv('csv_file.csv')

I don't know if this is the best way.

You will need both of the things above.
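Putting the pieces from this thread together, a full round trip (read the CSV, add the column, write it back) might look like the sketch below. The helper name add_status_column and the temporary file example.csv are hypothetical, chosen only for illustration. The sketch assumes that overwriting the file is acceptable and reads every column as text with dtype=str so that comparisons such as == "1" behave as expected.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def add_status_column(csv_path):
    """Read csv_path, append a "New_name" column, and write the file back."""
    # dtype=str keeps digits as text so they compare equal to "1", "2", "5".
    df = pd.read_csv(csv_path, dtype=str)
    mask = (
        (df["Column 1"] == "a")
        & (df["Column 2"] == "b")
        & (df["Column 3"] == "1")
        & df["Column 4"].isin(["2", "5"])
    )
    df["New_name"] = np.where(mask, "OK", "NOT OK")
    # index=False avoids writing the row index as an extra unnamed column.
    df.to_csv(csv_path, index=False)
    return df


# Demonstration on a throwaway file (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    pd.DataFrame({
        "Column 1": ["a", "a", "c"],
        "Column 2": ["b", "b", "d"],
        "Column 3": ["1", "1", "e"],
        "Column 4": ["2", "5", "f"],
    }).to_csv(path, index=False)

    add_status_column(path)
    reread = pd.read_csv(path, dtype=str)
    print(reread)
```

A new column cannot simply be appended to the end of an existing CSV file in place, because every row has to change. Rewriting the whole file through a DataFrame is the simpler route.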

Adding a new column to a CSV file in Python pandas

2 answers: