I'm trying to write a script that takes a text string and lets me replace random words. For example:
$str = "The quick brown fox jumps over the lazy dog";
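I would replace a few of the words with blanks, like so:
The quick ______ fox jumps over the ____ dog
I could probably do this by first splitting the string into an array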
$arr = str_word_count($str, 1);
and then replacing $arr[2] and $arr[7].
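A minimal sketch of that idea, assuming a sentence without punctuation. The helper name blankWordsAt and its parameters are made up for illustration; only str_word_count, str_repeat and implode are standard PHP functions.

```php
<?php
// Hypothetical helper: blank out the words at the given indices.
function blankWordsAt(string $str, array $indices): string
{
    // Format 1 returns a plain list of the words found in the string.
    $arr = str_word_count($str, 1);
    foreach ($indices as $i) {
        if (isset($arr[$i])) {
            // Replace the word with underscores of the same length.
            $arr[$i] = str_repeat('_', strlen($arr[$i]));
        }
    }
    // Joining with spaces only works when the input had nothing but words and spaces.
    return implode(' ', $arr);
}

echo blankWordsAt("The quick brown fox jumps over the lazy dog", [2, 7]), "\n";
// prints: The quick _____ fox jumps over the ____ dog
```

Because str_word_count returns only the words, anything else in the input (commas, semicolons, dots) is lost when the array is joined back together.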
The problem I expect to run into is when the string contains non-words, such as punctuation:
$str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";
How can I work around this? Any ideas?
Answer 0 (score: 3)
You can do it like this:
// Replacement words
$test1 = "test1";
$test2 = "test2";
$test3 = "Bingo2";
$str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";
// Split the string on spaces into the array $re
$re = explode(" ", $str);
echo $str . "<br>";
foreach ($re as $key => $value) {
    // Debug output: print each chunk as it is processed
    echo $value . "<br>";
    $word = "";
    switch (true) {
        case (strpos($value, 'Jack') !== false):
            // This chunk contains a word to replace;
            // split at that word so the surrounding punctuation is kept
            $new = explode("Jack", $value, 2);
            // Replace the word and put the punctuation back
            $word = $new[0] . $test1 . $new[1];
            break;
        case (strpos($value, 'dog') !== false):
            $new1 = explode("dog", $value, 2);
            $word = $new1[0] . $test2 . $new1[1];
            break;
        case (strpos($value, 'Bingo') !== false):
            $new2 = explode("Bingo", $value, 2);
            $word = $new2[0] . $test3 . $new2[1];
            break;
        default:
            // No word to replace was found, so leave the chunk as it is
            $word = $value;
    }
    // Store the new chunk back at the same position
    $re[$key] = $word;
}
// Join the chunks back together with spaces
echo implode(" ", $re);
Result:
The quick brown fox, named test1, jumps over the lazy test2; and Bingo2 was his...
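It works with or without punctuation.
Keep in mind, though, that if you have both Jack and Jacky, for example, you will need extra logic: check whether the part left over after the match contains any letters (for instance with a regex that matches only letters), and if it does, skip it, because it is not an exact match. Or something similar.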
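One way to get that exact-match behavior is a regex with word boundaries. The following is only a sketch of that alternative, not the code from this answer; the function name blankWholeWords and its parameters are hypothetical, while preg_quote and preg_replace_callback are standard PHP functions.

```php
<?php
// Hypothetical helper: blank out whole words only, so "Jack" does not touch "Jacky".
function blankWholeWords(string $str, array $words, string $fill = '_'): string
{
    if (count($words) === 0) {
        return $str;
    }
    // Escape each word so it is treated literally inside the pattern.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    // \b matches only at a word boundary, which rules out partial matches.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/';
    return preg_replace_callback(
        $pattern,
        function ($m) use ($fill) {
            // Keep the length of the original word (byte length, so ASCII is assumed).
            return str_repeat($fill, strlen($m[0]));
        },
        $str
    );
}

echo blankWholeWords("Jack and Jacky saw the dog; Bingo barked.", ["Jack", "dog", "Bingo"]), "\n";
// prints: ____ and Jacky saw the ___; _____ barked.
```

Punctuation is untouched because only the matched word itself is rewritten, so there is no need to split and rejoin the string.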
Edit (based on the comments):
$wordstoraplce = ["Jacky", "Jack", "dog", "Bingo", "dontreplace"];
$replacewith = "_";
$word = "";
$str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";
echo $str . "<br>";
foreach ($wordstoraplce as $key1 => $value1) {
    // Re-split on every pass so earlier replacements are kept
    $re = explode(" ", $str);
    foreach ($re as $key => $value) {
        if (strpos($value, $value1) !== false) {
            $countn = strlen($value1);
            // Split at the word so whatever surrounds it is kept
            $new = explode($value1, $value, 2);
            if (!ctype_alpha($new[1])) {
                // The remainder is not purely letters (e.g. "" or ","): treat it as a match
                $word = $new[0] . str_repeat($replacewith, $countn) . $new[1];
            } else {
                // The remainder is all letters (e.g. "y" when "Jack" hits "Jacky"): not an exact match
                $word = $value;
            }
        } else {
            $word = $value;
        }
        $re[$key] = $word;
    }
    $str = implode(" ", $re);
}
echo $str;
Result:
The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...
The quick brown fox, named ____, jumps over the lazy ___; and _____ was his...
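Answer 1 (score: 2)
I think a better approach is to use a regular expression, because then you allow not only commas but anything that is not a word character. A single regex pass is also generally faster than splitting or taking substrings inside a loop. My solution is: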
<?php
function randomlyRemovedWords($str)
{
    $sentenceParts = [];
    // Each match holds a word in [1] and the non-word characters that follow it in [2]
    $wordCount = preg_match_all("/([\w']+)([^\w']*)/", $str, $sentenceParts, PREG_SET_ORDER);
    // Blank roughly a quarter of the words; the same index may be picked more than once
    for ($i = 0; $i < $wordCount / 4; $i++) {
        $index = rand(0, $wordCount - 1);
        // Replace every character of the chosen word with an underscore
        $sentenceParts[$index][1] = preg_replace("/./", "_", $sentenceParts[$index][1]);
    }
    // Rebuild the sentence from the words and their trailing non-word characters
    $str = "";
    foreach ($sentenceParts as $part) {
        $str .= $part[1] . $part[2];
    }
    return $str;
}
echo randomlyRemovedWords("The quick brown fox, doesn't jumps over, the lazy dog.");
echo "\n<br>\n";
echo randomlyRemovedWords("The quick brown fox, jumps over, the lazy dog.");
Result:
The quick brown ___, _______ jumps over, the ____ dog.
<br>
The quick brown fox, jumps ____, ___ ____ dog.
This way you can be sure that all non-word characters are left alone and that words are removed at random.
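Since rand() can return the same index more than once, the number of blanked words varies from run to run. If a fixed number of distinct words is wanted, a variant along these lines might work. This is a sketch under assumptions: the function name blankDistinctWords and the $count parameter are hypothetical, and the regex is the same one used above.

```php
<?php
// Hypothetical variant: blank exactly $count distinct words, chosen at random.
function blankDistinctWords(string $str, int $count): string
{
    $parts = [];
    // Same pattern as above: [1] is a word, [2] the non-word characters after it.
    $wordCount = preg_match_all("/([\w']+)([^\w']*)/", $str, $parts, PREG_SET_ORDER);
    if (!$wordCount) {
        return $str; // no words found (or the match failed)
    }
    // Clamp the request to the number of words that actually exist.
    $count = max(0, min($count, $wordCount));
    if ($count > 0) {
        // array_rand returns a single key when $count is 1, hence the cast to array.
        $keys = (array) array_rand($parts, $count);
        foreach ($keys as $i) {
            $parts[$i][1] = str_repeat('_', strlen($parts[$i][1]));
        }
    }
    // Rebuild the sentence; text before the first word is not captured by the pattern.
    $out = '';
    foreach ($parts as $part) {
        $out .= $part[1] . $part[2];
    }
    return $out;
}

echo blankDistinctWords("The quick brown fox, doesn't jump over the lazy dog.", 3), "\n";
```

array_rand picks keys without repetition, so the result always has exactly $count blanked words as long as the sentence contains at least that many.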