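I have a list of strings that I want to filter using some rules. Any string that passes the filter is appended to a new list of strings. An example rule would be to pass a string if it contains X, or if it contains both Y and Z. I know I could write this with Python if statements and so on, but is there a more concise, user-friendly way to do this kind of filtering? Is there some (possibly SQL-like) language for doing such a thing?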
# Accept or filter specified datasets.
filterDatasets = False
if filterDatasets:
    # Filter specified datasets.
    datasets = []
    # Cycle over all datasets specified.
    logger.info('filtering specified datasets')
    for dataset in datasetsSpecified:
        # If data was specified, then skip a specified dataset if its name
        # lacks a required substring ("data12" and "merge" are currently
        # commented out). If data was not specified, then skip a specified
        # dataset if its name does not contain "mc12".
        if isData:
            requiredSubstrings = [
                #'data12',
                'Egamma',
                'Muons',
                #'merge',
            ]
        else:
            requiredSubstrings = [
                'mc12'
            ]
        excludedSubstrings = [
            '#'
        ]
        # Assumption: a dataset is accepted only if it contains every
        # required substring and no excluded substring.
        accepted = True
        for substring in requiredSubstrings:
            if substring not in dataset:
                logger.debug("substring {substring} not in dataset name {dataset}".format(substring = substring, dataset = dataset))
                accepted = False
        for substring in excludedSubstrings:
            if substring in dataset:
                logger.debug("substring {substring} in dataset name {dataset}".format(substring = substring, dataset = dataset))
                accepted = False
        if accepted:
            datasets.append(dataset)
else:
    datasets = datasetsSpecified
logger.info('datasets accepted: {datasets}'.format(datasets = datasets))
Answer 0 (score: 0)
I think of regular expressions as the "SQL" of text strings: almost any text processing you can think of can be done with them:
http://rick.measham.id.au/paste/explain.pl?regex=hello|[a-z]%2Bbye
Examples that match this:
Examples that do not match this:
Just run each item in the list through the regular expression and keep it if it matches.
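A minimal sketch of that approach in Python, assuming the pattern from the link above (hello|[a-z]+bye, with %2B decoded to +). The sample strings and the names pattern, items, and kept are made up for illustration and are not from the original post; re.search looks for a match anywhere in the string.

```python
import re

# Pattern from the linked explanation: "hello", or one or more
# lowercase letters followed by "bye".
pattern = re.compile(r"hello|[a-z]+bye")

# Hypothetical sample data, for illustration only.
items = ["hello world", "goodbye", "bye", "HELLO", "say hello"]

# Keep every item in which the pattern matches somewhere.
kept = [item for item in items if pattern.search(item)]

print(kept)
```

Here "bye" is dropped because the pattern needs at least one lowercase letter before it, and "HELLO" is dropped because matching is case-sensitive unless re.IGNORECASE is passed.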
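I am not a heavy Python user myself (I only use it occasionally), but according to the documentation, regular expressions are supported: https://docs.python.org/2/library/re.html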
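The rule from the question ("contains X, or contains both Y and Z") can probably be written as a single pattern using lookaheads. The sketch below is one possible way to do it, not the only one: build_rule is a hypothetical helper, and the dataset names are invented for illustration. It assumes the strings contain no newlines, since . does not match a newline by default.

```python
import re


def build_rule(x, y, z):
    """Compile a pattern for strings that contain x, or both y and z."""
    # Escape the substrings so characters like "." are taken literally.
    x, y, z = (re.escape(s) for s in (x, y, z))
    # Each lookahead (?=.*s) asserts that s occurs somewhere ahead
    # without consuming text, so the order of y and z does not matter.
    return re.compile(r"{x}|(?=.*{y})(?=.*{z})".format(x=x, y=y, z=z))


rule = build_rule("mc12", "data12", "Egamma")

# Hypothetical dataset names, for illustration only.
names = [
    "mc12_8TeV.sample",
    "data12_8TeV.physics_Egamma",
    "data12_8TeV.physics_Muons",
    "physics_Egamma.data12",
]

# Keep the names that satisfy the rule.
datasets = [name for name in names if rule.search(name)]

print(datasets)
```

Once rules get more complicated than this, a plain Python expression with any() and all() over substring tests may end up easier to read than one large pattern.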