I have a master table that lists the rate for each product, along with its package and risk categories.
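In a second table I store the user options: the user can create as many rules as they like for these products, based on the rates of other products. Each rule applies only to a specific package or risk class.
So, in the example below, the rate for product B is the rate of product A plus 5%, only for the basic package and good/mid risk. The rate for product C is the rate of product D plus 10%, for all packages and only for bad risk.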
import pandas as pd
df = pd.DataFrame({'package': {0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium'},
'risk_bin': {0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad'},
'A': {0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07},
'B': {0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071},
'C': {0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072},
'D': {0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073}})
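# Reorder by name; positional reordering depends on how the pandas version orders dict keys
df = df[['package', 'risk_bin', 'A', 'B', 'C', 'D']]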
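Because the user can create as many rules as they want, I need to loop over the rules and apply the values according to the relationship each rule defines.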
rules = pd.DataFrame({'rule': {0: '1', 1: '2'},
'product1': {0: 'B', 1: 'C'},
'relantioship': {0:'=',1:'='},
'product2': {0:'A',1:'D'},
'symbol': {0:'+',1:'-'},
'value': {0:0.05,1:0.10},
'package':{0:'basic',1:'all'},
'risk': {0:'good/mid', 1:'bad'}})
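# Reorder by name; positional reordering depends on how the pandas version orders dict keys
rules = rules[['rule', 'product1', 'relantioship', 'product2', 'symbol', 'value', 'package', 'risk']]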
When I run the following loop, I get the error quoted after the code:
df2 = df.reset_index()
# .values replaces get_values(), which was removed in pandas 1.0
rules_nc = rules['rule'].values
nc_cnt = rules_nc.size
for i in range(nc_cnt):
    if pd.isnull(rules['rule'][i]):
        break
    product_1 = rules['product1'][i]
    product_2 = rules['product2'][i]
    sym = str(rules['symbol'][i])
    val = rules['value'][i]
    pack = rules['package'][i]
    risk = rules['risk'][i]
    # This line raises the error: the comparison yields a boolean Series, not a single True/False
    if (df2['risk_bin']==risk) & (df2['package']==pack):
        if sym=='+':
            df2[product_1] = df2[product_2] + val
        if sym=='-':
            df2[product_1] = df2[product_2] - val
    else:
        df2[product_1] = df2[product_1]
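ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The output I expect for this set of rules: B changes only in the basic, good/mid row, and C changes in every bad-risk row. All other values stay as they are.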
Can you help me? Many thanks!
Answer 0 (score: 3)
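Here is one possible solution. It is not ideal because it relies on `apply`, which is faster than an explicit loop but not as fast as a vectorized solution. I renamed `risk` to `risk_bin` in `rules` so that both tables share the same key column.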
import pandas as pd
df = pd.DataFrame({'package': {0: 'basic', 1: 'medium', 2: 'premium', 3:'basic', 4:'medium', 5:'premium'},
'risk_bin': {0: 'good/mid', 1: 'good/mid', 2: 'good/mid', 3:'bad', 4:'bad',5:'bad'},
'A': {0:0.012,1:0.022,2:0.032,3:0.05,4:0.06,5:0.07},
'B': {0:0.013,1:0.023,2:0.033,3:0.051,4:0.061,5:0.071},
'C': {0:0.014,1:0.024,2:0.034,3:0.052,4:0.062,5:0.072},
'D': {0:0.015,1:0.025,2:0.035,3:0.053,4:0.063,5:0.073}})
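# Reorder by name; positional reordering depends on how the pandas version orders dict keys
df = df[['package', 'risk_bin', 'A', 'B', 'C', 'D']]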
rules = pd.DataFrame({'rule': {0: '1', 1: '2'},
'product1': {0: 'B', 1: 'C'},
'relantioship': {0:'=',1:'='},
'product2': {0:'A',1:'D'},
'symbol': {0:'+',1:'-'},
'value': {0:0.05,1:0.10},
'package':{0:'basic',1:'all'},
'risk_bin': {0:'good/mid', 1:'bad'}})
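# Reorder by name; positional reordering depends on how the pandas version orders dict keys
rules = rules[['rule', 'product1', 'relantioship', 'product2', 'symbol', 'value', 'package', 'risk_bin']]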
def fun(row):
    if row["symbol"] == "+":
        row[row["product1"]] = row[row["product2"]] + row["value"]
    else:
        row[row["product1"]] = row[row["product2"]] - row["value"]
    return row
# rows of df whose package and risk_bin both match a rule
df1 = pd.merge(df.reset_index(), rules, on=["package", "risk_bin"])
# rules whose package is `all` match on risk_bin only
df2 = pd.merge(df.reset_index(),
               rules[rules["package"]=='all'].loc[:, rules.columns != "package"],
               on=["risk_bin"])
# apply the rule carried by each merged row
df1 = df1.apply(fun, axis=1)
df2 = df2.apply(fun, axis=1)
# mark the original rows that were touched by a rule
bad_idx = df.index.isin(df1["index"].tolist()+df2["index"].tolist())
# concatenate the modified rows with the untouched ones (the original row order is not preserved)
res = pd.concat([df1[df.columns], df2[df.columns], df[~bad_idx]], ignore_index=True)
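As for the error in the question: `if` needs a single True/False, but comparing a column with a value gives a boolean Series with one entry per row, so pandas refuses to pick one. A mask-based variant avoids both the `if` and the merges. What follows is only a minimal sketch under some assumptions of mine: the helper name `apply_rules` is hypothetical, `value` is added or subtracted as an absolute amount (as `fun` above does), `all` is treated as a wildcard for `package` only, and rules are applied in order, so a later rule sees the result of an earlier one.

```python
import pandas as pd

df = pd.DataFrame({
    "package": ["basic", "medium", "premium", "basic", "medium", "premium"],
    "risk_bin": ["good/mid", "good/mid", "good/mid", "bad", "bad", "bad"],
    "A": [0.012, 0.022, 0.032, 0.05, 0.06, 0.07],
    "B": [0.013, 0.023, 0.033, 0.051, 0.061, 0.071],
    "C": [0.014, 0.024, 0.034, 0.052, 0.062, 0.072],
    "D": [0.015, 0.025, 0.035, 0.053, 0.063, 0.073],
})

rules = pd.DataFrame({
    "product1": ["B", "C"],
    "product2": ["A", "D"],
    "symbol": ["+", "-"],
    "value": [0.05, 0.10],
    "package": ["basic", "all"],
    "risk_bin": ["good/mid", "bad"],
})


def apply_rules(df, rules):
    """Hypothetical helper: return a copy of df with each rule applied via a boolean mask."""
    out = df.copy()
    for rule in rules.itertuples(index=False):
        # Boolean Series selecting the rows this rule applies to
        mask = out["risk_bin"] == rule.risk_bin
        if rule.package != "all":  # assumption: 'all' means "any package"
            mask &= out["package"] == rule.package
        sign = 1 if rule.symbol == "+" else -1
        # Assign only to the selected rows; every other row keeps its value
        out.loc[mask, rule.product1] = out.loc[mask, rule.product2] + sign * rule.value
    return out


res = apply_rules(df, rules)
```

There is still a loop, but it runs once per rule rather than once per row, and the result keeps the original row order and index, unlike the `concat` version.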