Mapping multiple values without hard-coding

Time: 2018-07-17 06:58:26

Tags: python regex

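I am using a for loop to read each row of the source data below, extract the values inside array['...'], and use them to convert the row into the required format. The conversion works when a row contains only two array values, but not when it contains more, as in Row 3.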

Source data

 Row 1:
    [(array['X'] <@ type_list AND (array['X6'] <@ value_list OR array['A6.5'] <@ value_list OR array['YZ'] <@ value_list)]
 Row 2:
    [(array['Z'] <@ type_list AND array['30'] <@ value_list)]
 Row 3:
[(array['KZA'] <@ type_list AND (array['AM'] <@ value_list OR array['UA'] <@ value_list OR array['RC'] <@ value_list OR array['WEQZ4.5'] <@ value_list)]

Code:

import re
regex = r"array\['(?P<array>.*?)\']"
for val in data1:
    data=val
    arrVal2 = re.findall(regex, str(data))
    cntInnr2=len(arrVal2)
    rt=arrVal2[0]
    for cont in range(cntInnr2):
        if rt != arrVal2[cont]:
            val1=arrVal2[cont]
            updVal=("(type_value LIKE ANY ('%%%s=%s;%%'))" % (rt, val1))
            #f=f.replace(data, updVal)

After conversion, the rows should look like this:

  Row 1:
(type_value LIKE ANY ('%X=X6;%','%X=A6.5;%','%X=YZ;%'))

  Row 2:
(type_value LIKE ANY ('%Z=30;%'))

  Row 3:
(type_value LIKE ANY ('%KZA=AM;%','%KZA=UA;%','%KZA=RC;%','%KZA=WEQZ4.5;%'))

I can handle Row 2 with the code above, but not rows with more values, such as Row 1 and Row 3.

2 answers:

Answer 0 (score: 1)

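You can get the value inside the first array['...'] and use it as the key, then collect every value between array[' and '] that appears after <@ type_list (these become the values), and then build the result: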

import re
strs=["[(array['X'] <@ type_list AND (array['X6'] <@ value_list OR array['A6.5'] <@ value_list OR array['YZ'] <@ value_list)]", "[(array['Z'] <@ type_list AND array['30'] <@ value_list)]", "[(array['KZA'] <@ type_list AND (array['AM'] <@ value_list OR array['UA'] <@ value_list OR array['RC'] <@ value_list OR array['WEQZ4.5'] <@ value_list)]"]
r = re.compile(r"array\['(.*?)']")    # Compile the regex
for s in strs:
    m = r.search(s)                   # Get the key value
    if m:                             # If we found it
        array_vals = r.findall(s, s.index("<@ type_list")) # Get the values
        if len(array_vals) > 0:       # If there is at least 1 value, build the result
            print("(type_value LIKE ANY ({}))".format(",".join(["'%{}={};%'".format(m.group(1), x) for x in array_vals])))

Output:

(type_value LIKE ANY ('%X=X6;%','%X=A6.5;%','%X=YZ;%'))
(type_value LIKE ANY ('%Z=30;%'))
(type_value LIKE ANY ('%KZA=AM;%','%KZA=UA;%','%KZA=RC;%','%KZA=WEQZ4.5;%'))

See the Python demo.

Note the check that array_vals contains at least one value (if len(array_vals) > 0:), so that rows without any values are skipped.
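The approach in this answer can also be wrapped in a small function, which makes it easier to reuse and to test. The following is only a sketch: the names ARRAY_RE and to_like_any are hypothetical, and returning None for rows that cannot be converted is an assumption, not something the question specifies. It uses str.find instead of str.index so that a row without "<@ type_list" does not raise ValueError.

```python
import re

# Same pattern as in the answer: captures the text inside array['...'].
ARRAY_RE = re.compile(r"array\['(.*?)'\]")


def to_like_any(row):
    """Build the LIKE ANY clause for one row, or return None (assumed fallback)."""
    key_match = ARRAY_RE.search(row)      # the first array['...'] is the key
    start = row.find("<@ type_list")      # find() returns -1 instead of raising
    if key_match is None or start == -1:
        return None
    # Only look for values after the "<@ type_list" marker.
    values = ARRAY_RE.findall(row, start)
    if not values:
        return None
    key = key_match.group(1)
    items = ",".join("'%{}={};%'".format(key, v) for v in values)
    return "(type_value LIKE ANY ({}))".format(items)


rows = [
    "[(array['Z'] <@ type_list AND array['30'] <@ value_list)]",
    "[(array['KZA'] <@ type_list AND (array['AM'] <@ value_list OR array['UA'] <@ value_list OR array['RC'] <@ value_list OR array['WEQZ4.5'] <@ value_list)]",
]

for row in rows:
    print(to_like_any(row))
```

Returning the string instead of printing it also fits the commented-out replace step in the question, since the caller can decide what to do with each converted row.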

Answer 1 (score: 0)

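You can split the initial string into a key part ("X", "Z") and a value part ("X6", "A6.5", etc.).
Then use regular expressions to get the key and value components you need.
Stitch them together with a list comprehension and a join, and you are good to go.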

import re

# basic matching patterns
p_k = re.compile("[A-Z]+")   # key: a run of uppercase letters, e.g. X or KZA
p_v = re.compile("'[^']+'")  # value: any quoted text, e.g. 'X6' or 'WEQZ4.5'

for row in data:
    k, v = row.split("type_list") # split into key/value sections

    k = k.split()[0] 
    new_k = p_k.search(k).group(0)

    new_vs = [x.replace("'","") for x in p_v.findall(v)]
    # rejoin keys and values with the new formatting
    k_v = ",".join(["'%{k}={v};%'".format(k=new_k, v=v) for v in new_vs])
    # add the string wrapping for the new rows
    new_row = "(type_value LIKE ANY ({}))".format(k_v)

    print(new_row)

Output:

(type_value LIKE ANY ('%X=X6;%','%X=A6.5;%','%X=YZ;%'))
(type_value LIKE ANY ('%Z=30;%'))

The regex could be a bit more elegant, which would save a cleanup step or two, but this is the idea (it works as is on the data below).
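As one possible take on the more elegant regex mentioned above, a single pattern with a capturing group can return the quoted text without the quotes, so the replace step is no longer needed. This is a sketch under assumptions: the names QUOTED_RE and convert are hypothetical, keys and values are assumed never to contain a single quote, and returning None for unconvertible rows is an arbitrary choice. It uses str.partition so that a row without "type_list" is detected instead of failing on unpacking.

```python
import re

# Captures whatever sits between a pair of single quotes (any length).
QUOTED_RE = re.compile(r"'([^']+)'")


def convert(row):
    """Split the row at "type_list" and rebuild it as a LIKE ANY clause."""
    key_part, sep, value_part = row.partition("type_list")
    if not sep:                           # "type_list" not present in the row
        return None
    key_match = QUOTED_RE.search(key_part)
    values = QUOTED_RE.findall(value_part)  # the group strips the quotes for us
    if key_match is None or not values:
        return None
    key = key_match.group(1)
    items = ",".join("'%{}={};%'".format(key, v) for v in values)
    return "(type_value LIKE ANY ({}))".format(items)


sample = "[(array['X'] <@ type_list AND (array['X6'] <@ value_list OR array['A6.5'] <@ value_list OR array['YZ'] <@ value_list)]"
print(convert(sample))
```

Because the pattern does not limit the length of the quoted text, longer keys and values such as KZA or WEQZ4.5 from Row 3 of the question are handled the same way.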

Data:

data = ["[(array['X'] <@ type_list AND (array['X6'] <@ value_list OR array['A6.5'] <@ value_list OR array['YZ'] <@ value_list)]",
        "[(array['Z'] <@ type_list AND array['30'] <@ value_list)]"]