Question

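I have an assignment that requires me to use regular expressions in Python to find alliterative names in a file containing a list of names. Here are the specific instructions: "Open a file and return all of the alliterative names in the file. For our purposes, a "name" is two sequences of letters separated by a space, with capital letters only in the leading positions. We call a name alliterative if the first and last names begin with the same letter, with the exception that s and sh are considered distinct, and likewise for c/ch and t/th. The names file will contain a list of strings separated by commas. Suggestion: do this in two stages." Here is my attempt so far: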

def check(regex, string, flags=0):
    return not (re.match("(?:" + regex + r")\Z", string, flags=flags)) is None

def alliterative(names_file):
    f = open(names_file)
    string = f.read()
    lst = string.split(',')
    lst2 = []
    for i in lst:
        x=lst[i]
        if re.search(r'[A-Z][a-z]* [A-Z][a-z]*', x):
            k=x.split(' ')
            if check('{}'.format(k[0][0]), k[1]):
                if not check('[cst]', k[0][0]):
                    lst2.append(x)
                elif len(k[0])==1:
                    if len(k[1])==1:
                        lst2.append(x)
                    elif not check('h',k[1][1]):
                        lst2.append(x)
                elif len(k[1])==1:
                    if not check('h',k[0][1]):
                        lst2.append(x)
    return lst2

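I have two questions. First, what I've coded seems to make sense to me. The general idea is that I first check that the name is in the correct format (first name, last name, letters only, with only the initials of the first and last names capitalized), then check whether the starting letters of the first and last names match, then see whether those initials are c, s, or t. If they are not, we add the name to the new list; if they are, we check that we are not accidentally matching [cst] with [cst]h. The code compiles, but when I try to run it on this list of names: Umesh Vazirani, Vijay Vazirani, Barbara Liskov, Leslie Lamport, Scott Shenker, R2D2 Rover, Shaq, Sam Spade, Thomas Thing

it returns an empty list instead of ["Vijay Vazirani", "Leslie Lamport", "Sam Spade", "Thomas Thing"], which is what it should return. I added print statements to alliterative to see where things go wrong, and the line if check('{}'.format(k[0][0]), k[1]): seems to be the problem.

Second, beyond the problem with my program, I feel I'm missing the point of regular expressions: am I overcomplicating this? Is there a better way to do it with regular expressions?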

Answer 1

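Please consider improving your question.

In particular, as written it is useful only to someone who needs to solve exactly the same problem, which I think is very unlikely. Please think about how to generalize it so that this Q&A can help other people.

That said, I think you are headed in the right direction.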

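Using a regular expression to check that the input is well formed is a good idea. r'[A-Z][a-z]* [A-Z][a-z]*' is a good expression.
You can group the output with parentheses. That way you can easily get the first and last names later.
Note the difference between re.match and re.search. re.search(r'[A-Z][a-z]* [A-Z][a-z]*', 'aaRob Smith') returns a MatchObject. See this.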
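
To make the points above concrete, here is a small sketch. NAME_RE is just a name chosen for this example, and the last two lines reproduce what the question's check() helper builds when it appends \Z; that anchoring appears to be why the initial-versus-surname check never succeeds.

```python
import re

# Same pattern as above, with parentheses added to capture both halves.
NAME_RE = r'([A-Z][a-z]*) ([A-Z][a-z]*)'

# re.search scans forward, so it finds 'Rob Smith' after the junk prefix.
found = re.search(NAME_RE, 'aaRob Smith')
print(found.group(0))

# re.match only tries position 0, where 'a' is not a capital letter.
rejected = re.match(NAME_RE, 'aaRob Smith')
print(rejected)

# The groups give the first and last names directly.
first, last = re.match(NAME_RE, 'Vijay Vazirani').group(1, 2)
print(first, last)

# With \Z appended, the pattern must consume the whole string, so a
# single initial can never match an entire surname.
anchored = re.match(r'(?:V)\Z', 'Vazirani')
unanchored = re.match(r'V', 'Vazirani')
print(anchored, unanchored is not None)
```

Comparing k[0][0] with k[1][0] directly, or dropping the \Z for that one test, would avoid the problem.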

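Also, a comment on general programming style:

It is better to name the variables first and last for readability, rather than k[0] and k[1] (and how did you settle on the letter k!)

Here is one approach: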

import re

FULL_NAME_RE = re.compile(r'^([A-Z][a-z]*) ([A-Z][a-z]*)$')

def is_alliterative(name):
    """Return True if name meets the alliterative requirement, otherwise False."""
    # Reject anything that does not have the required name format
    match = FULL_NAME_RE.match(name)
    if not match:
        return False
    first, last = match.group(1, 2)
    first, last = first.lower(), last.lower()  # easy to assume all lower-cases

    if first[0] != last[0]:
        return False

    if first[0] in 'cst':  # Check sh/ch/th
        # Do special check
        return _is_cst_h(first) == _is_cst_h(last)

    # All check passed!
    return True


def _is_cst_h(text):
    """Return True if text starts with 'ch', 'sh', or 'th'."""
    # Relies on the (possibly bad) assumption that the first letter is c, s, or t
    return text[1:].startswith('h')


names = [
    'Umesh Vazirani', 'Vijay Vazirani', 'Barbara Liskov', 'Leslie Lamport',
    'Scott Shenker', 'R2D2 Rover', 'Shaq', 'Sam Spade', 'Thomas Thing',
]
print([name for name in names if is_alliterative(name)])
# Prints:
# ['Vijay Vazirani', 'Leslie Lamport', 'Sam Spade', 'Thomas Thing']
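
The assignment reads the names from a comma-separated file, which the code above leaves out. Below is a minimal sketch of that step, assuming the file is small enough to read in one go and that whitespace around each comma should be ignored. The names leading_sound, alliterative_names, and alliterative are hypothetical, and the sound comparison is a compact variant of the check above rather than the same code.

```python
import re

FULL_NAME_RE = re.compile(r'([A-Z][a-z]*) ([A-Z][a-z]*)')

def leading_sound(word):
    """Return 'ch', 'sh' or 'th' if word starts with one, else its first letter."""
    word = word.lower()
    return word[:2] if word[:2] in ('ch', 'sh', 'th') else word[:1]

def alliterative_names(text):
    """Stage 1: keep well-formed names. Stage 2: compare leading sounds."""
    result = []
    for raw in text.split(','):
        # fullmatch rejects entries such as 'R2D2 Rover' and 'Shaq'
        match = FULL_NAME_RE.fullmatch(raw.strip())
        if match and leading_sound(match.group(1)) == leading_sound(match.group(2)):
            result.append(match.group(0))
    return result

def alliterative(names_file):
    """Read the whole file and return its alliterative names."""
    with open(names_file) as f:
        return alliterative_names(f.read())

sample = ('Umesh Vazirani, Vijay Vazirani, Barbara Liskov, Leslie Lamport, '
          'Scott Shenker, R2D2 Rover, Shaq, Sam Spade, Thomas Thing')
print(alliterative_names(sample))
# ['Vijay Vazirani', 'Leslie Lamport', 'Sam Spade', 'Thomas Thing']
```

Keeping the parsing in alliterative_names and the file access in alliterative makes the logic testable on plain strings.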

Answer 2

Try this regular expression:

Note: it does not handle the sh/ch/th special cases.
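
A single pattern with this behaviour could use a backreference to require that both words start with the same capital letter. The sketch below is an assumption about the approach, not necessarily the pattern this answer originally contained; SAME_INITIAL_RE is a hypothetical name.

```python
import re

# Group 1 captures the first initial; \1 requires the surname to start
# with that same capital letter.
SAME_INITIAL_RE = re.compile(r'([A-Z])[a-z]* \1[a-z]*')

names = [
    'Umesh Vazirani', 'Vijay Vazirani', 'Barbara Liskov', 'Leslie Lamport',
    'Scott Shenker', 'R2D2 Rover', 'Shaq', 'Sam Spade', 'Thomas Thing',
]
same_initial = [name for name in names if SAME_INITIAL_RE.fullmatch(name)]
print(same_initial)
# ['Vijay Vazirani', 'Leslie Lamport', 'Scott Shenker', 'Sam Spade', 'Thomas Thing']
```

'Scott Shenker' slips through here, which illustrates the limitation noted above: s and sh are treated as the same sound.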

How can I use regular expressions effectively to find alliterative names?
