Difference between two files

Time: 2015-07-27 23:00:46

Tags: java string file diff

I'm looking for a code snippet that does the following:

Given two lists of strings representing two files,

for example:

  • FILE1 = {"Some", "Simple", "Text", "File"}
  • FILE2 = {"Another", "Text", "File", "With", "Additional", "Lines"}

If I call diff(file1, file2),

the output should be the difference between FILE1 and FILE2:

  1. *Some|Another
  2. -Simple
  3. Text
  4. File
  5. +With
  6. +Additional
  7. +Lines

Thanks a lot!

2 answers:

Answer 0 (score: 0)

Here is what I tried.

import java.util.*;

public class SetDemo
{
    public static void main(String[] args){
        String[] file1 = new String[]{"Some", "Simple", "Text", "File"};
        String[] file2  = new String[]{"Another", "Text", "File", "With", "Additional", "Lines"};
        Set<String> set1 = new HashSet<String>();
        Set<String> set2 = new HashSet<String>();

        for(String s: file1)
            {
                set1.add(s);
            }

        for(String s2: file2)
            {
                set2.add(s2);
            }

        Set<String> s1intercopy = new HashSet<String>(set1);
        Set<String> s2intercopy = new HashSet<String>(set2);

        s1intercopy.retainAll(s2intercopy); // Finds the intersection

        Set<String> s1symdiffcopy = new HashSet<String>(set1);
        Set<String> s2symdiffcopy = new HashSet<String>(set2);

        s1symdiffcopy.removeAll(set2);
        s2symdiffcopy.removeAll(set1);

        int count = 0;
        for(String s7: s1intercopy){
            count++;
            System.out.println(Integer.toString(count)+'.'+s7);
        }
        if (set1.size() > set2.size())
        {
            for(String s3: s1symdiffcopy){
                count++;
                System.out.println(Integer.toString(count)+'.'+'+'+s3);
            }
            for(String s4: s2symdiffcopy){
                count++;
                System.out.println(Integer.toString(count)+'.'+'-'+s4);
            }
        } else // set2 is larger, or both sets are the same size (otherwise a tie would print no differences)
        {
            for(String s5: s2symdiffcopy){
                count++;
                System.out.println(Integer.toString(count)+'.'+'+'+s5);
            }
            for(String s6: s1symdiffcopy){
                count++;
                System.out.println(Integer.toString(count)+'.'+'-'+s6);
            }
        }

    }
}

Output:

1.Text
2.File
3.+Lines
4.+Additional
5.+Another
6.+With
7.-Some
8.-Simple

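I'm not sure what you mean by *Some|Another, but all the code above does is find the intersection and the symmetric difference of the two sets, determine which set is larger, and assign '+' to the values that belong to the larger set and '-' to those that belong to the smaller one. I didn't read from files, to save time, but that part is easy and you can look it up. Judging by your output, it seems you are walking through one file and searching the other file for each of its strings. That is very inefficient for large files, so I believe the solution above optimizes it by storing the lines in sets and performing set operations.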
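The file-reading step that this answer leaves out could look roughly like the following. This is only a minimal sketch: the class name `FileLines` and the helper `readLines` are made up for illustration, and it assumes the files are UTF-8 text small enough to fit in memory. It returns a `List`, which keeps line order and duplicates; copying the result into a `HashSet`, as the answer does, discards both.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class FileLines {
    /** Reads a text file into a list with one entry per line, in file order. */
    static List<String> readLines(String fileName) throws IOException {
        // Assumption: the file is UTF-8 encoded and fits in memory.
        return Files.readAllLines(Paths.get(fileName), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FileLines <file1> <file2>");
            return;
        }
        List<String> file1 = readLines(args[0]);
        List<String> file2 = readLines(args[1]);
        System.out.println(args[0] + ": " + file1.size() + " lines");
        System.out.println(args[1] + ": " + file2.size() + " lines");
    }
}
```

The two lists can then be passed to whichever comparison you choose, either converted to sets as above or compared in order.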

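Answer 1 (score: 0)

I gather the following from your question:

  • *word1|word2 - means the word from file 1 was changed in file 2
  • -word - means the word from file 1 was removed in file 2
  • word - means the word from file 1 is unchanged in file 2
  • +word - means the word was not originally in file 1 but was added in file 2

I treat file 1 as the "source" file and file 2 as the "target" file against which we show these differences. With that said, try this algorithm (it is not as good as DiffNow, but it comes pretty close):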

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiffDemo {
    public static void main(String[] args) {
        List<String> file1 = new ArrayList<>(Arrays.asList("Some", "Simple", "Text", "File"));
        List<String> file2 = new ArrayList<>(Arrays.asList("Another", "Text", "File", "With", "Additional", "Lines"));

        boolean diff = false; // true while we are inside a run of differing words
        int file2Index = 0;
        for (int file1Index = 0; file1Index < file1.size();) {
            String word1 = file1.get(file1Index);
            // File 2 may run out first; guard the index so get() cannot throw
            boolean hasWord2 = file2Index < file2.size();
            if (hasWord2 && word1.equals(file2.get(file2Index))) {
                // The word is unchanged
                System.out.println(word1);
                diff = false;
                file1Index++;
                file2Index++;
            } else if (!diff && hasWord2) {
                // The word from file 1 was changed
                diff = true;
                System.out.println("*" + word1 + "|" + file2.get(file2Index));
                file1Index++;
                file2Index++;
            } else {
                // This word was removed from file 1
                System.out.println("-" + word1);
                file1Index++;
            }
        }

        // Print what's left from file 2
        for (; file2Index < file2.size(); file2Index++) {
            System.out.println("+" + file2.get(file2Index));
        }
    }
}

Result:

*Some|Another
-Simple
Text
File
+With
+Additional
+Lines
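
The greedy walk above compares the two lists position by position, so it can fall out of step. With file 1 = {"A", "B"} and file 2 = {"X", "Y", "B"}, for example, it reports B as both removed and added even though B appears in both files. A common alternative, which is not what this answer does, is to base the diff on the longest common subsequence (LCS) of the two lists. The following is a minimal sketch under that assumption. The class name `LcsDiff` is made up for illustration, and it only emits unchanged, "-" and "+" entries; it does not try to pair a removal with an addition into the `*old|new` form from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LcsDiff {
    /** Returns diff entries: "word" (unchanged), "-word" (only in a), "+word" (only in b). */
    static List<String> diff(List<String> a, List<String> b) {
        int n = a.size(), m = b.size();
        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.get(i).equals(b.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both lists from the front, following the table to stay on an LCS path
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < n && j < m) {
            if (a.get(i).equals(b.get(j))) {
                out.add(a.get(i));          // common to both files
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("-" + a.get(i));    // dropping a[i] loses nothing from the LCS
                i++;
            } else {
                out.add("+" + b.get(j));    // otherwise b[j] is an addition
                j++;
            }
        }
        while (i < n) out.add("-" + a.get(i++)); // leftovers from file 1
        while (j < m) out.add("+" + b.get(j++)); // leftovers from file 2
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("Some", "Simple", "Text", "File");
        List<String> file2 = Arrays.asList("Another", "Text", "File", "With", "Additional", "Lines");
        // Prints: -Some, -Simple, +Another, Text, File, +With, +Additional, +Lines (one per line)
        diff(file1, file2).forEach(System.out::println);
    }
}
```

The table takes time and memory proportional to the product of the two list lengths, so for very large files a more memory-efficient algorithm or an existing diff library would probably be a better fit.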