Question

I have a txt file to parse that looks like this:

--- What kind of submission is this? ---
Sold Property
--- State? ---
Los Angeles
...

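and I need to store the values that follow the --- --- markers in variables. It works with all these if statements, but I was wondering whether the pile of ifs could be refactored into some structure (a dictionary, for example) that would then be easy to write to the output file.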

Here is what I did:

"""Open a file to read"""
        for line in res:
            if "Instagram Usernames" in line:
                usernames = next(res)
            if "Date" in line:
                date = next(res)
            if "Address" in line:
                address = next(res)
            if "Neighborhood" in line:
                market = next(res)
            if "State" in line:
                city = next(res)
            if "Asset" in line:
                as_type = next(res)
            if "Sale Price" in line:
                price = next(res)
                if "," in price:
                    price = price.replace(',', '')
                if "$" in price:
                    price = price.replace('$', '')
            if "Square" in line:
                sf = next(res)
                if "," in sf:
                    sf = sf.replace(',', '')
                if "$" in sf:
                    sf = sf.replace('$', '')
            if "Buyer" in line:
                buyer = next(res)
            if "Seller" in line:
                seller = next(res)
            if "Broker" in line:
                brokers = next(res)
            if "Notes" in line:
                notes = next(res)

        """Write to output file"""
        fin.write("IMAGE:  @" + usernames)
        fin.write("DATE: " + date)
        fin.write("ADDRESS: " + address)
        fin.write("MARKET: " + market)
        fin.write("CITY: " + city)
        if as_type.strip() == "Multi Family":
            fin.write("ASSET TYPE: Multifamily\n")
        else:
            fin.write("ASSET TYPE: " + as_type)
        fin.write("PRICE: $" + price)
        if sf in bad_symb:
            fin.write("SF: N/A\n")
            fin.write("PPSF: N/A\n")
        else:
            fin.write("SF: " + sf)
            fin.write("PPSF: $" + "{0:.2f}\n".format(float(price) / float(sf)))
        fin.write("BUYER: " + buyer)
        fin.write("SELLER: " + seller)
        fin.write("BROKERS: " + brokers + "\n")
        if notes != "\n":
            fin.write("NOTES: " + notes + "\n")
        fin.write(footer_sale(market, buyer, seller))

Any help would be appreciated, thanks in advance!

Answer 1

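When I have a series of items like this, I like to set up a small data structure that specifies what I am looking for and, if it is found, where it should go.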

def strip_currency(s):
    """Function to strip currency and commas from a real number string"""
    return s.replace('$', '').replace(',', '')

# mapping of data labels to attribute/key names
label_attr_map = (
    ('Instagram Usernames', 'usernames'),
    ('Date', 'date'),
    ('Address', 'address'),
    ('Neighborhood', 'market'),
    ('State', 'city'),            # <-- copy-paste bug?
    ('Asset', 'as_type'),
    ('Sale Price', 'price', strip_currency),
    ('Square', 'sf', strip_currency),
    ('Buyer', 'buyer'),
    ('Seller', 'seller'),
    ('Broker', 'broker'),
    ('Notes', 'notes'),
    )

# populate data dict with values from file, as defined in the label_attr_map
data = {}
for line in file:
    # find any matching label, or just go on to the next line
    match_spec = next((spec for spec in label_attr_map if spec[0] in line), None)
    if match_spec is None:
        continue

    # found a label, now extract the next line, and transform it if necessary
    key = match_spec[1]
    data[key] = next(file)
    if len(match_spec) > 2:
        transform_fn = match_spec[2]
        data[key] = transform_fn(data[key])

Now the label-to-attribute mapping is much easier to verify, and the cascade of "if"s collapses into a single next expression.

To write the output, just access the different items in the data dict.
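As a rough sketch of that last step, the output labels can be table-driven too. The names `output_fields` and `write_report` are made up here for illustration, and the sample `data` dict only stands in for whatever the parsing loop produced. The special cases from the question (asset type, SF/PPSF, notes) would still need their own handling.

```python
import io

# Hypothetical table of (output label, key in data); not part of the original answer.
output_fields = (
    ('DATE', 'date'),
    ('ADDRESS', 'address'),
    ('MARKET', 'market'),
    ('CITY', 'city'),
    ('BUYER', 'buyer'),
    ('SELLER', 'seller'),
)

def write_report(out, data):
    """Write one 'LABEL: value' line for every listed field present in data."""
    for label, key in output_fields:
        if key in data:
            # values read with next(file) usually still end in a newline
            out.write("{}: {}\n".format(label, data[key].strip()))

# Sample values standing in for the parsed file.
data = {'date': '01/02/2019\n', 'address': '1 Main St\n', 'city': 'Los Angeles\n'}
out = io.StringIO()  # any writable file object would work here
write_report(out, data)
report = out.getvalue()
print(report)
```

Fields missing from `data` are skipped rather than raising a `KeyError`, which may or may not be what you want for required fields.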

Answer 2

You can use a dictionary: everything between the dashes is the key, and the line after it is the corresponding value.

Since we are not using a loop, we first split the contents of the file into lines:

res = res.split("\n")

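The dictionary is built from two slices: res[::2] selects every second item of res starting from the first (all the --- lines), and res[1::2] selects every second item starting from the second (all the information lines).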

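Now we take each --- line as the key of a dictionary entry and the information line that follows it as the value (key: value). Since you probably don't want the dashes in the keys, we remove dashes and spaces from both ends with .strip("- "), and we pair the two slices with zip so that each key gets its own value:

x = {key.strip("- "): value for key, value in zip(res[::2], res[1::2])}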

You can now simply index x to get the information you need, which also simplifies writing to the output file.
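Here is a small worked run on the sample from the question, as a sketch only. It assumes the file strictly alternates label lines and value lines with no blank lines in between; the names `text`, `lines`, and `parsed` are chosen for this example.

```python
# Sample input taken from the question's example file.
text = (
    "--- What kind of submission is this? ---\n"
    "Sold Property\n"
    "--- State? ---\n"
    "Los Angeles\n"
)

lines = text.splitlines()  # unlike split("\n"), leaves no trailing empty string

# Pair each label line with the value line that follows it.
parsed = {key.strip("- "): value for key, value in zip(lines[::2], lines[1::2])}

print(parsed["State?"])
```

Note that the keys keep any punctuation from the label, so the lookup is `parsed["State?"]`, not `parsed["State"]`.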

Answer 3

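Define a lambda function that, given a search string, finds the line following each matching line in the list of all line strings:

search_func = lambda search_str : [line_list[line_list.index(line)+1] for line in line_list[:-1] if search_str in line]

Put the variable names as keys, and the corresponding search strings as values, in another dictionary:

all_vars_search_dict = {'usernames' : "Instagram Usernames" , 'date' : "Date", 'address' : "Address", 'market' : "Neighborhood", 'city' : "State",...}

Now create another dictionary that calls the function above to get the values you are searching for:

all_vals = {k: search_func(all_vars_search_dict[k]) for k in all_vars_search_dict}

When writing to the output file, you can simply iterate over this dictionary.

Note: this approach does not work as-is for keywords like "Square" (and, for the same reason, "Sale Price"), whose values the question cleans up further after reading them.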
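A self-contained variant of the same idea, as a sketch under assumptions: `line_list` holds sample lines rather than a real file, and `find_values`, `search_terms`, and `found` are names invented for this example.

```python
# Sample lines; in practice line_list might come from file.read().splitlines().
line_list = [
    "--- Date ---",
    "01/02/2019",
    "--- State? ---",
    "Los Angeles",
]

def find_values(search_str):
    """Return the line following each line that contains search_str."""
    # zip over adjacent pairs instead of calling list.index() for every match
    return [nxt for cur, nxt in zip(line_list, line_list[1:]) if search_str in cur]

# variable name -> text to search for in the label lines
search_terms = {'date': "Date", 'city': "State"}
found = {name: find_values(term) for name, term in search_terms.items()}
print(found)
```

Each value is a list, because a search string can match more than one line. Walking adjacent pairs also sidesteps `list.index`, which always returns the first occurrence and so picks the wrong neighbour when the same line text appears twice.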
