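I wrote this algorithm to compute the total number of deletions and insertions (that is, edits) needed to make the first string equal to the second. But it doesn't work.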
public static int distance(String s1, String s2) {
    return distance(s1, s2, 0, 0);
}

private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length()) return j;
    if (j == s2.length()) return i;
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}
EDIT: Example: string 1: "casa", string 2: "cara", edit_distance = 2 (1 deletion + 1 insertion).
EDIT 2: These inputs work: string 1: "casa", string 2: "cassa", edit_distance = 1; string 1: "pioppo", string 2: "pioppo", edit_distance = 0.
These do not work: string 1: "casa", string 2: "cara", edit_distance = 2 (my code returns 0); string 1: "tassa", string 2: "passato", edit_distance = 4 (my code returns 2).
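Answer 0 (score: 1)
I think the implementation is almost correct; only the stopping conditions are wrong. When one string is exhausted, the cost is the number of characters remaining in the other. They should be: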
if (j == s2.length()) {
    return s1.length() - i;
}
if (i == s1.length()) {
    return s2.length() - j;
}
So the complete implementation should be:
private static int distance(String s1, String s2, int i, int j) {
    if (j == s2.length()) {
        return s1.length() - i;
    }
    if (i == s1.length()) {
        return s2.length() - j;
    }
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 2; // since Jim Belushi considers replacement to be worth 2.
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}
UPDATE
Here is the result for "tassa" and "passato".
Code:
private static int distance(String s1, String s2, int i, int j) {
    if (j == s2.length()) {
        return s1.length() - i;
    }
    if (i == s1.length()) {
        return s2.length() - j;
    }
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 2;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}

public static void main(String[] args) {
    int dist = distance("tassa", "passato", 0, 0);
    System.out.println(dist);
}
If you run this, you get:
4
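The plain recursion above recomputes the same (i, j) pairs many times, so its running time grows exponentially with the string lengths. As a minimal sketch, assuming the same cost model (insert = 1, delete = 1, replace = 2), the results can be cached in a table. The class name `EditDistanceMemo` and the `memo` array are illustrative names, not part of the original code.

```java
import java.util.Arrays;

class EditDistanceMemo {
    // Public entry point: allocates the cache and starts at the beginning of both strings.
    static int distance(String s1, String s2) {
        int[][] memo = new int[s1.length() + 1][s2.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1); // -1 marks "not computed yet"
        }
        return distance(s1, s2, 0, 0, memo);
    }

    // Distance between the suffixes s1[i..] and s2[j..].
    private static int distance(String s1, String s2, int i, int j, int[][] memo) {
        if (j == s2.length()) return s1.length() - i; // only s1 characters remain
        if (i == s1.length()) return s2.length() - j; // only s2 characters remain
        if (memo[i][j] != -1) return memo[i][j];

        int result;
        if (s1.charAt(i) == s2.charAt(j)) {
            result = distance(s1, s2, i + 1, j + 1, memo);
        } else {
            int rep = distance(s1, s2, i + 1, j + 1, memo) + 2; // one delete plus one insert
            int del = distance(s1, s2, i, j + 1, memo) + 1;
            int ins = distance(s1, s2, i + 1, j, memo) + 1;
            result = Math.min(del, Math.min(ins, rep));
        }
        memo[i][j] = result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distance("tassa", "passato")); // prints 4
        System.out.println(distance("casa", "cara"));     // prints 2
    }
}
```

With the cache, each (i, j) pair is evaluated at most once, so the work is proportional to the product of the two lengths.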
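Answer 1 (score: 0)
This should be what you want.
Each changed character adds 2 to the distance (one deletion plus one insertion). The difference in length is also added, but each added or removed character counts only +1, not +2.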
// Count the differing positions, then add the length difference.
public static void editDistance() {
    String s1 = "casa";
    String s2 = "cara";
    String longer;
    String shorter;
    if (s1.length() > s2.length()) {
        longer = s1;
        shorter = s2;
    } else {
        shorter = s1;
        longer = s2;
    }
    int edits = 0;
    for (int i = 0; i < shorter.length(); i++) {
        if (shorter.charAt(i) != longer.charAt(i)) {
            edits++;
        }
    }
    edits = edits * 2; // each changed character is one delete plus one insert, as you described
    edits = edits + Math.abs(s1.length() - s2.length()); // different lengths: add the number of added/removed chars
    System.out.println("edit count: " + edits);
}
Answer 2 (score: 0)
You need to specify how to proceed when you reach the end of one string but not the other. Try this:
public static void main(String[] args) {
    System.out.println(distance("casa", "cassa"));
}

public static int distance(String s1, String s2) {
    return distance(s1, s2, 0, 0);
}

private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length() && j == s2.length())
        return 0;
    else if (i == s1.length())
        return s2.length() - j;
    else if (j == s2.length())
        return s1.length() - i;
    if (s1.charAt(i) == s2.charAt(j))
        return distance(s1, s2, i + 1, j + 1);
    int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    return Math.min(del, Math.min(ins, rep));
}
Output:
1
Note: the first if is not required; it only makes the code easier to understand. You can remove it in your implementation.
Answer 3 (score: 0)
Two simple changes and your code works.
First:
if (i == s1.length()) return s2.length() - j;
if (j == s2.length()) return s1.length() - i;
instead of
if (i == s1.length()) return j;
if (j == s2.length()) return i;
Next:
int rep = distance(s1, s2, i + 1, j + 1) + 2;
The 2 at the end is important here. If rep means a replacement, it is a deletion AND an insertion: two operations, not one.
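Because a replacement costs the same as one deletion plus one insertion, it never beats doing those two steps separately, so only insertions and deletions matter. Under that cost model the distance equals the two lengths added together minus twice the length of the longest common subsequence (LCS): every character outside the LCS is either deleted from the first string or inserted from the second. The sketch below illustrates this identity; the class name `LcsEditDistance` and the `lcs` helper are hypothetical names introduced here, not taken from the question.

```java
class LcsEditDistance {
    // Length of the longest common subsequence of a and b, computed bottom-up.
    // dp[i][j] holds the LCS length of the prefixes a[0..i) and b[0..j).
    static int lcs(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.length()][b.length()];
    }

    // Insert/delete-only edit distance: characters outside the LCS are deleted or inserted.
    static int distance(String s1, String s2) {
        return s1.length() + s2.length() - 2 * lcs(s1, s2);
    }

    public static void main(String[] args) {
        System.out.println(distance("casa", "cara"));     // prints 2
        System.out.println(distance("tassa", "passato")); // prints 4
    }
}
```

For "tassa" and "passato" the LCS is "assa" (length 4), which gives 5 + 7 - 2 * 4 = 4.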
Answer 4 (score: 0)
This works for me:
private static int distance(String s1, String s2, int i, int j) {
    if (i == s1.length() && j == s2.length()) {
        return 0;
    } else if (i == s1.length()) {
        return s2.length() - j;
    } else if (j == s2.length()) {
        return s1.length() - i;
    }
    if (s1.charAt(i) == s2.charAt(j)) {
        return distance(s1, s2, i + 1, j + 1);
    }
    // int rep = distance(s1, s2, i + 1, j + 1) + 1;
    int del = distance(s1, s2, i, j + 1) + 1;
    int ins = distance(s1, s2, i + 1, j) + 1;
    // return Math.min(del, Math.min(ins, rep));
    return Math.min(del, ins);
}
Here is a test, which also passes:
/**
 * Test of distanceRec method, of class EditDistance.
 */
@Test
public void testDistanceRec() {
    System.out.println("distanceRec");
    String s1 = "passato";
    String s2 = "tassa";
    int expResult = 4;
    int result = EditDistance.distanceRec(s1, s2);
    assertEquals(expResult, result);
    // Review the generated test code and remove the default call to fail.
    //fail("The test case is a prototype.");
}
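In this exercise you can use only two operations, insertion and deletion; there are no other operations such as replacement. Exercise text:
Assume that only two operations are available: deleting and inserting a character. Examples:
- "casa" and "cassa" have an edit distance of 1 (1 deletion);
- "casa" and "cara" have an edit distance of 2 (1 deletion + 1 insertion);
- "tassa" and "passato" have an edit distance of 4 (3 deletions + 1 insertion);
- "pioppo" and "pioppo" have an edit distance of 0.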
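As a quick check, the four examples from the exercise can be run against the insert/delete-only recursion. This is only a sketch: the class name `ExerciseExamples` and the `pairs` and `expected` arrays are made up for illustration, and the string pairs are the ones given in the question.

```java
class ExerciseExamples {
    static int distance(String s1, String s2) {
        return distance(s1, s2, 0, 0);
    }

    // Insert/delete-only edit distance between the suffixes s1[i..] and s2[j..].
    private static int distance(String s1, String s2, int i, int j) {
        if (i == s1.length()) return s2.length() - j; // insert the rest of s2
        if (j == s2.length()) return s1.length() - i; // delete the rest of s1
        if (s1.charAt(i) == s2.charAt(j)) {
            return distance(s1, s2, i + 1, j + 1);    // matching characters cost nothing
        }
        int del = distance(s1, s2, i, j + 1) + 1;
        int ins = distance(s1, s2, i + 1, j) + 1;
        return Math.min(del, ins);
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"casa", "cassa"}, {"casa", "cara"}, {"tassa", "passato"}, {"pioppo", "pioppo"}
        };
        int[] expected = {1, 2, 4, 0};
        for (int k = 0; k < pairs.length; k++) {
            int actual = distance(pairs[k][0], pairs[k][1]);
            String verdict = actual == expected[k] ? "as expected" : "expected " + expected[k];
            System.out.println(pairs[k][0] + " / " + pairs[k][1] + ": " + actual + " (" + verdict + ")");
        }
    }
}
```

The recursion is exponential in the worst case, which is acceptable for strings this short.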