Regular expression for identifying keywords

Time: 2018-12-11 10:59:13

Tags: python

I want to check whether any keyword is present in my text data. The keywords are:

Keywords=["just checking to see if you are there so we can continue.",
          "please let me know if you're receiving my responses or i will need to end our session", 
          "our chat session is now ending. thanks for choosing at&t!  we appreciate your business."]


Text= "Agent 'jl759s' enters chat (as Jacqueline)", "Hi Cristina!  My name is Jacqueline.  I'm happy to help.", 
  'Sure.', 
  'I see that you have issue with the internet service.', 
  "I'm sorry to hear that Cristina. Let me check that for you", 
  'Please not to worry, I can help you with that!', 
  'have you rebooted the router recently?', 
  'Thanks for the info!', 
  'Let me quickly run a line test to check if there is any issue detected either with the network lines or with the modem.',
  "You're welcome.",
  'Thank you for waiting Cristina.',
  'I wish I could resolve this issue through the chat session however it looks like this particular issue demands the expertise of the field technician.',
  'I understand that it is a little inconvenient to be waiting for a technician however we want to ensure that there is a permanent solution to this issue. I will do my best to get the earliest appointment available for you.',
  'I will dispatch a technician to the premise to help you on this.',
  'I am cehcking on the earliest available time slots now.',
  '*checking',
  'I see the earliest time slot available for the technician visit is on 10/16/2018 between 2:00 PM - 4:00 PM. Will it work for you?',
  'Ye sure.',
  'the technician will call you before the arrival.',
  'I will check that as well.',
  'The appointment is available on 10/20/2018.',
  'The available timings are 8:30 AM - 9:30 AM, 10:00 AM - 12:00 PM, 12:00 PM - 2:00 PM, 2:00 PM - 4:00 PM and 4:00 PM - 8:00 PM.',
  'Yes Cristina. Its available.',
  'Sure Cristina.',
  'Give me a minute.',
  'I have scheduled your appointment for October 16, xxxx. An AT&T technician will arrive as early as 4:00 PM or as late as 8:00 PM.',
  'Your service call may take 2-4 hours after arrival to resolve the issue.',
  'Please make sure all AT&T equipment is accessible to do repairs. Our technician will not move any furniture.',
  'An adult 18 years of age or older must be on-site for the duration of your Service Call and reachable on the day of the service call at 5863854186.',
  'With just two taps, you will able to track your repair/install appointment using the myAT&T app.',
  'Please launch the myAT&T app, input your member id and password, and then tap login. By tapping on the alert you will see all of your appointment details.',
  'Just to recap, You have reached us for the Internet service issue, as there is a line issue detected while troubleshooting I have dispatched a technician to help you fix this issue.',
  'I hope you do not have any concern with the assistance provided to you today. Is there anything else I can help you with I will be happy to assist?',
  'Pleasure is Mine Cristina!',
  'It was pleasure working with you!',
  'I appreciate your patience on this.',
  'Thank you for choosing AT&T. We appreciate your business. Have a great day!',
  'Bye!'

I wrote the following code:

def Key_words(y):
    if(any(bool(re.search(r'\b'+x+r'\b', ''.join(y).lower()))) for x in keywords):
        return("Yes")
    else:
        return("No")

Key_words(Text)

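It outputs Yes, but the output should be No, because none of the keywords occurs in the text.

Please help me get the correct output.

4 answers:

Answer 0: (score: 0)

Here is one approach using re.search.

For example: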

import re

keywords=["just checking to see if you are there so we can continue.",  "please let me know if you're receiving my responses or i will need to end our session", "our chat session is now ending. thanks for choosing at&t!  we appreciate your business."]
Text= ["Agent 'jl759s' enters chat (as Jacqueline)", "Hi Cristina!  My name is Jacqueline.  I'm happy to help.", 'Sure.', 'I see that you have issue with the internet service.', "I'm sorry to hear that Cristina. Let me check that for you", 'Please not to worry, I can help you with that!', 'have you rebooted the router recently?', 'Thanks for the info!', 'Let me quickly run a line test to check if there is any issue detected either with the network lines or with the modem.', "You're welcome.", 'Thank you for waiting Cristina.', 'I wish I could resolve this issue through the chat session however it looks like this particular issue demands the expertise of the field technician.', 'I understand that it is a little inconvenient to be waiting for a technician however we want to ensure that there is a permanent solution to this issue. I will do my best to get the earliest appointment available for you.', 'I will dispatch a technician to the premise to help you on this.', 'I am cehcking on the earliest available time slots now.', '*checking', 'I see the earliest time slot available for the technician visit is on 10/16/2018 between 2:00 PM - 4:00 PM. Will it work for you?', 'Ye sure.', 'the technician will call you before the arrival.', 'I will check that as well.', 'The appointment is available on 10/20/2018.', 'The available timings are 8:30 AM - 9:30 AM, 10:00 AM - 12:00 PM, 12:00 PM - 2:00 PM, 2:00 PM - 4:00 PM and 4:00 PM - 8:00 PM.', 'Yes Cristina. Its available.', 'Sure Cristina.', 'Give me a minute.', 'I have scheduled your appointment for October 16, xxxx. An AT&T technician will arrive as early as 4:00 PM or as late as 8:00 PM.', 'Your service call may take 2-4 hours after arrival to resolve the issue.', 'Please make sure all AT&T equipment is accessible to do repairs. Our technician will not move any furniture.', 'An adult 18 years of age or older must be on-site for the duration of your Service Call and reachable on the day of the service call at 5863854186.', 'With just two taps, you will able to track your repair/install appointment using the myAT&T app.', 'Please launch the myAT&T app, input your member id and password, and then tap login. By tapping on the alert you will see all of your appointment details.', 'Just to recap, You have reached us for the Internet service issue,  as there is a line issue detected while troubleshooting I have dispatched a technician to help you fix this issue.', 'I hope you do not have any concern with the assistance provided to you today. Is there anything else I can help you with I will be happy to assist?', 'Pleasure is Mine Cristina!', 'It was pleasure working with you!', 'I appreciate your patience on this.', 'Thank you for choosing AT&T. We appreciate your business. Have a great day!', 'Bye!']


def Key_words(y):
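    # Escape each keyword so "." and similar characters match literally; join messages with a real newline.
    if re.search("|".join(map(re.escape, keywords)), "\n".join(y), flags=re.IGNORECASE):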
        return("Yes")
    else:
        return("No")

print(Key_words(Text))
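
One caveat when joining keywords into a pattern: the keywords contain characters such as . that are regex metacharacters, so an unescaped pattern can match text that is not literally the keyword. A minimal sketch of the difference, using a shortened, made-up keyword and messages rather than the data above:

```python
import re

# Hypothetical, shortened inputs for illustration only.
keyword = "we can continue."
messages = ["so we can continue?", "Bye!"]

text = "\n".join(messages)

# Unescaped: "." matches any character, so "continue?" counts as a hit.
raw_hit = bool(re.search(keyword, text, flags=re.IGNORECASE))

# Escaped: "." must match a literal period, so there is no hit.
escaped_hit = bool(re.search(re.escape(keyword), text, flags=re.IGNORECASE))

print(raw_hit, escaped_hit)
```

With these inputs the unescaped search reports a match and the escaped one does not.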

Answer 1: (score: 0)

You can use the in operator (faster and simpler).

Here we iterate over every keyword only when none of the keywords is in the text. In all other cases we get better performance.
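
The code from this answer did not survive in this copy. A minimal sketch of what a plain substring test with in could look like, assuming the messages are first joined into one lowercase string; the function name contains_keyword and the sample data are hypothetical:

```python
def contains_keyword(messages, keywords):
    """Return "Yes" if any keyword occurs as a substring of the chat, else "No"."""
    # Join with a newline so a keyword cannot match across two messages.
    text = "\n".join(messages).lower()
    # any() stops at the first keyword that is found.
    return "Yes" if any(k.lower() in text for k in keywords) else "No"


# Hypothetical sample data, shortened from the question.
sample_keywords = ["our chat session is now ending."]
print(contains_keyword(["Sure.", "Bye!"], sample_keywords))
print(contains_keyword(["Our chat session is now ending. Bye!"], sample_keywords))
```

Because any() short-circuits, the full keyword list is scanned only when nothing matches.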

Answer 2: (score: 0)

Is there any reason you can't do something as simple as this?

Keywords=["just checking to see if you are there so we can continue.",  "please let me know if you're receiving my responses or i will need to end our session", "our chat session is now ending. thanks for choosing at&t!  we appreciate your business."]
Text= "Agent 'jl759s' enters chat (as Jacqueline)", "Hi Cristina!  My name is Jacqueline.  I'm happy to help.", 'Sure.', 'I see that you have issue with the internet service.', "I'm sorry to hear that Cristina. Let me check that for you", 'Please not to worry, I can help you with that!', 'have you rebooted the router recently?', 'Thanks for the info!', 'Let me quickly run a line test to check if there is any issue detected either with the network lines or with the modem.', "You're welcome.", 'Thank you for waiting Cristina.', 'I wish I could resolve this issue through the chat session however it looks like this particular issue demands the expertise of the field technician.', 'I understand that it is a little inconvenient to be waiting for a technician however we want to ensure that there is a permanent solution to this issue. I will do my best to get the earliest appointment available for you.', 'I will dispatch a technician to the premise to help you on this.', 'I am cehcking on the earliest available time slots now.', '*checking', 'I see the earliest time slot available for the technician visit is on 10/16/2018 between 2:00 PM - 4:00 PM. Will it work for you?', 'Ye sure.', 'the technician will call you before the arrival.', 'I will check that as well.', 'The appointment is available on 10/20/2018.', 'The available timings are 8:30 AM - 9:30 AM, 10:00 AM - 12:00 PM, 12:00 PM - 2:00 PM, 2:00 PM - 4:00 PM and 4:00 PM - 8:00 PM.', 'Yes Cristina. Its available.', 'Sure Cristina.', 'Give me a minute.', 'I have scheduled your appointment for October 16, xxxx. An AT&T technician will arrive as early as 4:00 PM or as late as 8:00 PM.', 'Your service call may take 2-4 hours after arrival to resolve the issue.', 'Please make sure all AT&T equipment is accessible to do repairs. Our technician will not move any furniture.', 'An adult 18 years of age or older must be on-site for the duration of your Service Call and reachable on the day of the service call at 5863854186.', 'With just two taps, you will able to track your repair/install appointment using the myAT&T app.', 'Please launch the myAT&T app, input your member id and password, and then tap login. By tapping on the alert you will see all of your appointment details.', 'Just to recap, You have reached us for the Internet service issue,  as there is a line issue detected while troubleshooting I have dispatched a technician to help you fix this issue.', 'I hope you do not have any concern with the assistance provided to you today. Is there anything else I can help you with I will be happy to assist?', 'Pleasure is Mine Cristina!', 'It was pleasure working with you!', 'I appreciate your patience on this.', 'Thank you for choosing AT&T. We appreciate your business. Have a great day!', 'Bye!'

for k in Keywords:
    if k in Text:
        print(k)
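
Note that Text here is a tuple of messages, so k in Text is true only when a keyword equals an entire message, including its case; it is not a substring search. A small sketch of the difference, with made-up data:

```python
# Hypothetical data for illustration.
messages = ("Sure.", "Thanks for choosing AT&T! Bye!")
keyword = "thanks for choosing at&t!"

# Membership on a tuple compares whole elements.
whole_message_match = keyword in messages

# Membership on a string looks for a substring.
substring_match = keyword in " ".join(messages).lower()

print(whole_message_match, substring_match)
```

Whether whole-message matching is enough depends on whether the keywords are always sent as complete, identically cased messages.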

Answer 3: (score: 0)

The problem with your code is a misplaced parenthesis on this line:

if (any(bool(re.search(r'\b'+x+r'\b', ''.join(y).lower()))) for x in keywords):

In your case, the expression re.search(r'\b'+x+r'\b', ''.join(y).lower()) always evaluates to None, so the if statement above can be rewritten as:

if (any(bool(None)) for x in keywords):

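Now, the expression (any(bool(None)) for x in keywords) returns a generator that, for each x in keywords, yields the value of any(bool(None)). That expression is itself invalid, because any expects an iterable rather than a bool as its argument, so evaluating it would raise a TypeError.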

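However, the expression any(bool(None)) is never evaluated, because it never needs to be. The if statement succeeds immediately because its argument is a generator (not a bool), and a generator, when converted to bool, always evaluates to True, no matter what it would yield.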
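
Both claims can be checked with a short sketch; the list ["a", "b"] is just a stand-in for keywords:

```python
# The generator body is not executed when the generator is created.
gen = (any(bool(None)) for x in ["a", "b"])

# A generator object is truthy regardless of what it would yield.
is_truthy = bool(gen)

# Only consuming the generator runs any(False), which raises TypeError.
try:
    next(gen)
    raised = False
except TypeError:
    raised = True

print(is_truthy, raised)
```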

So, to fix this behavior, you need to move the parentheses like this:

if any(bool(re.search(r'\b'+x+r'\b', ''.join(y).lower())) for x in keywords):
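
Putting the corrected line into a complete function, as a sketch with shortened, hypothetical data. Two assumptions go beyond the answer: the keyword list is passed in as a parameter (the question defines Keywords but the function reads keywords, which would raise a NameError once the generator is actually evaluated), and each keyword goes through re.escape so that characters like . are matched literally.

```python
import re


def key_words(messages, keywords):
    """Return "Yes" if any keyword occurs in the joined messages, else "No"."""
    text = "".join(messages).lower()
    # any() now wraps the whole generator, so each search result is tested.
    if any(bool(re.search(r"\b" + re.escape(k) + r"\b", text)) for k in keywords):
        return "Yes"
    return "No"


# Hypothetical, shortened data.
sample_keywords = ["our chat session is now ending", "just checking to see if you are there"]
print(key_words(["Sure.", "Bye!"], sample_keywords))
print(key_words(["Just checking to see if you are there.", "Bye!"], sample_keywords))
```

Keep in mind that \b only matches between a word character and a non-word character (or the start or end of the string), so a keyword that ends in punctuation will not match at the very end of the text.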