Extracting capitalized words from a string

Time: 2019-06-12 10:08:45

Tags: php regex

Edit: I solved my problem. Here is the solution:

$string = "Hello my Name is Paloppa. Im' 20 And? Hello! Words I  Io Man";     
// Word boundary before the first uppercase letter, followed by one or more lowercase letters;
// skips the first word of the string and words followed by " ! ? . or '
preg_match_all( '/(?<!^)\b[A-Z][a-z]{1,}\b(?!["!?.\\\'])/', $string, $matches);
print_r( $matches[0] );

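Now I have another problem.

Each matched word is stored as its own array element, but I want consecutive capitalized words to be joined into a single element.

If I have the sentence "Whats is your Name and Surname? My Name And Surname' is Paolo Celio and Serie A Iim 25 Thanksbro Bro Ciao", this is my code: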

$string = "Whats is your Name and Surname? My Name And Surname' is Paolo Celio and Serie A Iim 25 Thanksbro Bro Ciao";
// One or more capitalized words, each followed by at least one space
preg_match_all( '/(?<!^)\b([A-Z][a-z]+ +){1,}\b(?!["!?.\\\'])/', $string, $matches);
print_r( $matches[0] );

The output is as follows:

Array ( 
        [0] => Name 
        [1] => Name And Surname 
        [2] => Paolo Celio 
        [3] => Serie 
        [4] => Iim 
        [5] => Thanksbro Bro 
       )

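Why doesn't it join "Serie A", and why is the "A" not printed? Why is the last word missing from the output?

Thanks

Edit: I solved my problem; this is my regex: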

preg_match_all('/(?<!^)\b[A-Z]([a-z0-9A-Z]| [A-Z]){1,}\b(?!["!?.\\\'])/', $string, $matches);
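
For readers comparing the two patterns, here is a minimal sketch that runs both on the sentence from the question. In the earlier pattern, `[a-z]+` needs at least one lowercase letter, so the lone "A" can never match, and ` +` needs a trailing space, so the last word of the string ("Ciao") is never included. The variable names `$old`, `$new`, `$oldMatches` and `$newMatches` are illustrative, not from the original post.

```php
<?php
$string = "Whats is your Name and Surname? My Name And Surname' is Paolo Celio and Serie A Iim 25 Thanksbro Bro Ciao";

// Earlier pattern: each word needs at least one lowercase letter
// and must be followed by at least one space.
preg_match_all('/(?<!^)\b([A-Z][a-z]+ +){1,}\b(?!["!?.\\\'])/', $string, $old);

// Final pattern: after the first capital, keep consuming either a single
// alphanumeric character or a space that is followed by another capital.
preg_match_all('/(?<!^)\b[A-Z]([a-z0-9A-Z]| [A-Z]){1,}\b(?!["!?.\\\'])/', $string, $new);

// The earlier pattern keeps the trailing space in each match, so trim it.
$oldMatches = array_map('trim', $old[0]);
$newMatches = $new[0];

print_r($oldMatches);
print_r($newMatches);
// $newMatches holds: Name, My Name And, Paolo Celio, Serie A Iim, Thanksbro Bro Ciao
```

Note that the final pattern treats "Serie A Iim" as one group, because any space followed by a capital letter continues the match.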

4 Answers:

Answer 0: (score: 1)

You can use this:

<?php
      $test="the Quick brown Fox jumps Over the Lazy Dog";
      preg_match_all("/[A-Z][a-z]*/",$test,$op);
      $output = implode(' ',$op[0]);
      echo $output;
?>
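
One caveat worth checking: this pattern has no word boundaries, so it also picks up capital letters in the middle of a word. The sketch below contrasts it with a `\b`-anchored variant; the sample sentence and the names `$loose` and `$strict` are made up for illustration, and whether the stricter behavior is wanted depends on the input.

```php
<?php
$text = "the Quick iPhone beats a MacBook";

// Pattern from the answer above: matches any capital plus trailing lowercase letters.
preg_match_all('/[A-Z][a-z]*/', $text, $loose);

// Variant with a word boundary on both sides, so only whole words qualify.
preg_match_all('/\b[A-Z][a-z]*\b/', $text, $strict);

echo implode(' ', $loose[0]), "\n";  // Quick Phone Mac Book
echo implode(' ', $strict[0]), "\n"; // Quick
```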

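Answer 1: (score: 0)

To extract complete words, you need word boundaries and a character class that matches the rest of the word, plus lookbehinds to exclude what comes before: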

$string = "Hello my Name is Paloppa. I'm 20 And? Hello! Words' Man";     
// Word boundary before the first uppercase letter, followed by any number of letters
preg_match_all( '/(?<!^)(?<!\. )\b[A-Z][a-zA-Z]*\b(?!["!?\\\'])/', $string, $matches);
print_r( $matches[0] );

If you only want capitalized words (excluding mixed-case words), just replace [a-zA-Z] with [a-z].
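
A small sketch of the difference between the two character classes, using a made-up sentence (the input and the variable names `$mixed`, `$lower`, `$withMixed` and `$withLower` are assumptions for illustration):

```php
<?php
$text = "we pay with PayPal or Stripe Today";

// [a-zA-Z]* lets capitals appear inside the word, so "PayPal" qualifies.
$mixed = '/(?<!^)(?<!\. )\b[A-Z][a-zA-Z]*\b(?!["!?\\\'])/';
// [a-z]* allows only lowercase after the first letter, so "PayPal" is rejected.
$lower = '/(?<!^)(?<!\. )\b[A-Z][a-z]*\b(?!["!?\\\'])/';

preg_match_all($mixed, $text, $withMixed);
preg_match_all($lower, $text, $withLower);

print_r($withMixed[0]); // PayPal, Stripe, Today
print_r($withLower[0]); // Stripe, Today
```

With `[a-z]*`, "PayPal" is dropped entirely rather than split, because the trailing `\b` cannot match between "y" and "P".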

Demo here

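Answer 2: (score: 0)

Because of the edge cases this gets a bit complicated, but we only need to define two char classes based on the desired output and the input, probably with word boundaries, using an expression similar to: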

(?=[^I'])\b([A-Z][a-z'é]+)\b

which we would extend to fit our cases.

Demo

Test

$re = '/(?=[^I\'])\b([A-Z][a-z\'é]+)\b/m';
$str = 'Hello my name is Paloppa. I\'m 20 And i love Football.
Hello my name is Chloé. I\'m 20 And i love Football.
Hello my name is Renée O\'neal. I\'m 20 And i love Football.';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
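
With `PREG_SET_ORDER`, each entry of `$matches` is an array holding the full match and capture group 1, so a flat list of words needs one more step. A minimal sketch using `array_column`, on an ASCII-only sample; the `é` is left out of the character class here to keep the example independent of encoding settings, which is a simplification and not part of the answer above, and `$words` is an illustrative name:

```php
<?php
$re = '/(?=[^I\'])\b([A-Z][a-z\']+)\b/';
$str = 'Hello my name is Paloppa. I\'m 20 And i love Football.';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// Pull capture group 1 out of every match set.
$words = array_column($matches, 1);

print_r($words); // Hello, Paloppa, And, Football
```

Keep in mind that the lookahead `(?=[^I'])` rejects every word starting with "I", not only "I'm", so a name such as "Italy" would be skipped as well.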

RegEx Circuit

jex.im visualizes regular expressions.

Answer 3: (score: -1)

You can do it quickly like this:

$test="Hi There this Is my First Job";
preg_match_all('/[A-Z][a-z]*/', $test, $matches, PREG_OFFSET_CAPTURE);

$res=array();

foreach( $matches[0] as $key=> $value){

    $res[]=$value[0];
}

print_r($res);

Output:

  Array
(
    [0] => Hi
    [1] => There
    [2] => Is
    [3] => First
    [4] => Job
)
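
For context, `PREG_OFFSET_CAPTURE` makes each match a pair of matched text and byte offset, which is why the loop reads `$value[0]`. If the offsets are not needed the flag can be left out; if they are, `array_column` can replace the loop. A short sketch (the names `$words` and `$offsets` are illustrative):

```php
<?php
$test = "Hi There this Is my First Job";
preg_match_all('/[A-Z][a-z]*/', $test, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is [matched text, byte offset in $test].
$words   = array_column($matches[0], 0);
$offsets = array_column($matches[0], 1);

print_r($words);   // Hi, There, Is, First, Job
print_r($offsets); // 0, 3, 14, 20, 26
```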

DEMO