Java Array, Finding Duplicates

Time: 2010-10-17 01:15:30

Tags: java arrays

I have an array, and I am looking for duplicates.

duplicates = false;
for(j = 0; j < zipcodeList.length; j++){
    for(k = 0; k < zipcodeList.length; k++){
        if (zipcodeList[k] == zipcodeList[j]){
            duplicates = true;
        }
    }
}

However, this code doesn't work when there are no duplicates. Why?

15 answers:

Answer 0 (score: 136)

On the nose answer..

duplicates=false;
for (j=0;j<zipcodeList.length;j++)
  for (k=j+1;k<zipcodeList.length;k++)
    if (k!=j && zipcodeList[k] == zipcodeList[j])
      duplicates=true;

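Edited to switch .equals() back to ==, since I read that you're using int, which wasn't clear in the initial question. Also set k=j+1, which halves the execution time, but it's still O(n^2).

A faster (in the limit) way

Here's a hash-based approach. You have to pay for the autoboxing, but it's O(n) instead of O(n^2). An enterprising soul would go find a primitive int-based hash set (Apache or Google Collections has such a thing, I think).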

boolean duplicates(final int[] zipcodelist)
{
  Set<Integer> lump = new HashSet<Integer>();
  for (int i : zipcodelist)
  {
    if (lump.contains(i)) return true;
    lump.add(i);
  }
  return false;
}
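
If Java 8 or later is available, the same "collect the distinct values and compare" idea can also be written with streams. This is only a sketch and not part of the original answer: the class name StreamDuplicates and the method name hasDuplicates are made up for illustration. Unlike the loop above, it always processes the whole array instead of stopping at the first duplicate.

```java
import java.util.Arrays;

// Hypothetical helper, not from the original answer.
final class StreamDuplicates {
    // True when at least one value occurs more than once.
    static boolean hasDuplicates(int[] values) {
        // distinct() keeps one copy of each value, so a smaller count means repeats.
        return Arrays.stream(values).distinct().count() < values.length;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicates(new int[] {90210, 10001, 60601}));
        System.out.println(hasDuplicates(new int[] {90210, 10001, 90210}));
    }
}
```

The early-return loop is probably still the better choice when duplicates are expected near the front of a long array.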

Bow to HuyLe

See HuyLe's answer for a more or less O(n) solution, which I think needs a couple of additional steps:

static boolean duplicates(final int[] zipcodelist)
{
   final int MAXZIP = 99999;
   boolean[] bitmap = new boolean[MAXZIP+1];
   java.util.Arrays.fill(bitmap, false);
   for (int item : zipcodelist) {
     if (!bitmap[item]) bitmap[item] = true;   // first time this zip code is seen
     else return true;                         // already marked: duplicate
   }
   return false;
}

Or just to be compact

static boolean duplicates(final int[] zipcodelist)
{
   final int MAXZIP = 99999;
   boolean[] bitmap = new boolean[MAXZIP+1];  // Java guarantees init to false
   for (int item : zipcodelist)
     if (!(bitmap[item] ^= true)) return true;
   return false;
}

Does it matter?

OK, so I ran a little benchmark, which is iffy all over the place, but here's the code:

import java.util.BitSet;

class Yuk
{
  static boolean duplicatesZero(final int[] zipcodelist)
  {
    boolean duplicates=false;
    for (int j=0;j<zipcodelist.length;j++)
      for (int k=j+1;k<zipcodelist.length;k++)
        if (k!=j && zipcodelist[k] == zipcodelist[j])
          duplicates=true;

    return duplicates;
  }


  static boolean duplicatesOne(final int[] zipcodelist)
  {
    final int MAXZIP = 99999;
    boolean[] bitmap = new boolean[MAXZIP + 1];
    java.util.Arrays.fill(bitmap, false);
    for (int item : zipcodelist) {
      if (!(bitmap[item] ^= true))
        return true;
    }
    return false;
  }

  static boolean duplicatesTwo(final int[] zipcodelist)
  {
    final int MAXZIP = 99999;

    BitSet b = new BitSet(MAXZIP + 1);
    b.set(0, MAXZIP, false);
    for (int item : zipcodelist) {
      if (!b.get(item)) {
        b.set(item, true);
      } else
        return true;
    }
    return false;
  }

  enum ApproachT { NSQUARED, HASHSET, BITSET};

  /**
   * @param args
   */
  public static void main(String[] args)
  {
    ApproachT approach = ApproachT.BITSET;

    final int REPS = 100;
    final int MAXZIP = 99999;

    int[] sizes = new int[] { 10, 1000, 10000, 100000, 1000000 };
    long[][] times = new long[sizes.length][REPS];

    boolean tossme = false;

    for (int sizei = 0; sizei < sizes.length; sizei++) {
      System.err.println("Trial for zipcodelist size= "+sizes[sizei]);
      for (int rep = 0; rep < REPS; rep++) {
        int[] zipcodelist = new int[sizes[sizei]];
        for (int i = 0; i < zipcodelist.length; i++) {
          zipcodelist[i] = (int) (Math.random() * (MAXZIP + 1));
        }
        long begin = System.currentTimeMillis();
        switch (approach) {
        case NSQUARED :
          tossme ^= (duplicatesZero(zipcodelist));
          break;
        case HASHSET :
          // Note: duplicatesOne is the boolean[] bitmap version, not a HashSet.
          tossme ^= (duplicatesOne(zipcodelist));
          break;
        case BITSET :
          tossme ^= (duplicatesTwo(zipcodelist));
          break;

        }
        long end = System.currentTimeMillis();
        times[sizei][rep] = end - begin;


      }
      long avg = 0;
      for (int rep = 0; rep < REPS; rep++) {
        avg += times[sizei][rep];
      }
      System.err.println("Size=" + sizes[sizei] + ", avg time = "
            + avg / (double)REPS + "ms");
    }
  }

}

With NSQUARED:

Trial for size= 10
Size=10, avg time = 0.0ms
Trial for size= 1000
Size=1000, avg time = 0.0ms
Trial for size= 10000
Size=10000, avg time = 100.0ms
Trial for size= 100000
Size=100000, avg time = 9923.3ms

With HashSet

Trial for zipcodelist size= 10
Size=10, avg time = 0.16ms
Trial for zipcodelist size= 1000
Size=1000, avg time = 0.15ms
Trial for zipcodelist size= 10000
Size=10000, avg time = 0.0ms
Trial for zipcodelist size= 100000
Size=100000, avg time = 0.16ms
Trial for zipcodelist size= 1000000
Size=1000000, avg time = 0.0ms

With BitSet

Trial for zipcodelist size= 10
Size=10, avg time = 0.0ms
Trial for zipcodelist size= 1000
Size=1000, avg time = 0.0ms
Trial for zipcodelist size= 10000
Size=10000, avg time = 0.0ms
Trial for zipcodelist size= 100000
Size=100000, avg time = 0.0ms
Trial for zipcodelist size= 1000000
Size=1000000, avg time = 0.0ms

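BITSET wins!

But only by a hair... 0.15 ms is within the error of currentTimeMillis(), and there are some gaping holes in my benchmark. Note that for any list longer than 100000, you can simply return true, because there will be a duplicate. In fact, if the list is anything like random, you can return true with high probability for a much shorter list. What's the moral? In the limit, the most efficient implementation is: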

 return true;

And you won't be wrong very often.
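
To put a rough number on "won't be wrong very often", here is a small sketch of the birthday-problem calculation. It is not from the original answer and assumes the zip codes are drawn uniformly at random from 0..99999; the class name DuplicateOdds and the method name probabilityOfDuplicate are made up for illustration.

```java
// Hypothetical helper for illustration; assumes values are drawn uniformly at random.
final class DuplicateOdds {
    // Probability that n uniform random draws from `range` possible values contain a repeat.
    static double probabilityOfDuplicate(int n, int range) {
        if (n > range) return 1.0;   // pigeonhole: a repeat is certain
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            // The draw at index i must miss the i values already taken.
            allDistinct *= (double) (range - i) / range;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        int range = 100000;          // zip codes 0..99999
        for (int n : new int[] {10, 100, 373, 1000, 10000}) {
            System.out.printf("n=%d -> %.4f%n", n, probabilityOfDuplicate(n, range));
        }
    }
}
```

Under that uniform assumption the odds of a duplicate reach about even at a few hundred entries and are above 99% by 1000 entries, which is why the benchmark's random lists almost always contain a duplicate.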

Answer 1 (score: 13)

Let's see how your algorithm works:

an array of unique values:

[1, 2, 3]

check 1 == 1. yes, there is duplicate, assigning duplicate to true.
check 1 == 2. no, doing nothing.
check 1 == 3. no, doing nothing.
check 2 == 1. no, doing nothing.
check 2 == 2. yes, there is duplicate, assigning duplicate to true.
check 2 == 3. no, doing nothing.
check 3 == 1. no, doing nothing.
check 3 == 2. no, doing nothing.
check 3 == 3. yes, there is duplicate, assigning duplicate to true.

A better algorithm:

for (j=0;j<zipcodeList.length;j++) {
    for (k=j+1;k<zipcodeList.length;k++) {
        if (zipcodeList[k]==zipcodeList[j]){ // or use .equals()
            return true;
        }
    }
}
return false;

Answer 2 (score: 12)

You can use a bitmap for better performance with large arrays.

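    // Assumes every value lies in 0..99999, the MAXZIP bound used in andersoj's answer.
    boolean[] bitmap = new boolean[100000];
    java.util.Arrays.fill(bitmap, false);

    for (int item : zipcodeList)
        if (!bitmap[item]) bitmap[item] = true;   // first time this value is seen
        else break;                               // value seen before: duplicate found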

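Update: This was a pretty negligent answer of mine back in the day; I'm keeping it here only for reference. You should refer to andersoj's excellent answer.

Answer 3 (score: 4)

To check for duplicates, you need to compare distinct pairs.

Answer 4 (score: 2)

Because you are comparing the first element of the array against itself, it finds duplicates even where there aren't any.

Answer 5 (score: 2)

Initialize k = j + 1. You won't compare elements to themselves, and you also won't duplicate comparisons. For example, j = 0, k = 1 and k = 0, j = 1 compare the same pair of elements. This removes the k = 0, j = 1 comparison.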
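
To make the saving concrete, here is a small sketch that only counts how many comparisons each loop shape performs. It is not from the original answer; the class name ComparisonCount and both method names are made up for illustration.

```java
// Hypothetical demo class; the counters only show how many comparisons each loop makes.
final class ComparisonCount {
    // Inner loop starts at 0: every ordered pair, including (j, j).
    static long fullScan(int n) {
        long count = 0;
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                count++;
        return count;
    }

    // Inner loop starts at j + 1: each pair of distinct indices exactly once.
    static long upperTriangle(int n) {
        long count = 0;
        for (int j = 0; j < n; j++)
            for (int k = j + 1; k < n; k++)
                count++;
        return count;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println(fullScan(n));       // n * n
        System.out.println(upperTriangle(n));  // n * (n - 1) / 2
    }
}
```

The second loop does a little less than half the work, but both counts grow quadratically with n.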

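Answer 6 (score: 1)

Don't use ==, use .equals.

Try this instead (IIRC, ZipCode needs to implement Comparable for this to work).

boolean unique = true;
Set<ZipCode> s = new TreeSet<ZipCode>();
for( ZipCode zc : zipcodelist )
    unique &= s.add(zc);   // add() returns false if zc was already in the set
duplicates = !unique;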

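Answer 7 (score: 1)

You can also work with a Set, which doesn't allow duplicates in Java..

    Set<String> set = new HashSet<String>();
    for (String name : names)
    {
      if (set.add(name) == false)
      {
         // your duplicate element
      }
    }

Use the add() method and check the return value. If add() returns false, the element was already in the Set, and that is your duplicate.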

Answer 8 (score: 1)

public static ArrayList<Integer> duplicate(final int[] zipcodelist) {

    HashSet<Integer> hs = new HashSet<>();
    ArrayList<Integer> al = new ArrayList<>();
    for(int element: zipcodelist) {
        if(hs.add(element)==false) {
            al.add(element);
        }   
    }
    return al;
}

Answer 9 (score: 0)

How about using this method?

var apiArray=[

         {
           "api" : "login",
           "path" : "/auth/login",
           "baseurl":"/auth/login",
           "method" : "POST"
         },
         {
           "api" : "logout",
           "path" : "/auth/logout/{sessionId}",
           "baseurl":"/auth/logout/",
           "method" : "POST"
         }]


function getApiName(apiArray, reqbaseurl){

  // compares reqbaseurl with the baseurl property of each element in the array
  // returns the api name of the first element that matched.
  for(var i=0; i < apiArray.length;i++){
     if(apiArray[i].baseurl=== reqbaseurl) return apiArray[i].api;
  }
  return null;
}

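Answer 10 (score: 0)

@andersoj gave a great answer, but I'd also like to add a new, simple way:

    private boolean checkDuplicateBySet(Integer[] zipcodeList) {
        Set<Integer> zipcodeSet = new HashSet<>(Arrays.asList(zipcodeList));
        // The set drops repeated values, so a smaller size means a duplicate existed.
        return zipcodeSet.size() != zipcodeList.length;
    }

If zipcodeList is an int[], you first need to convert the int[] to an Integer[] (it is not autoboxed); code here

The complete code would be: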

    private boolean checkDuplicateBySet2(int[] zipcodeList) {
        // Box each int by hand; Arrays.asList(int[]) would not produce a list of Integer.
        Integer[] zipcodeIntegerArray = new Integer[zipcodeList.length];
        for (int i = 0; i < zipcodeList.length; i++) {
            zipcodeIntegerArray[i] = Integer.valueOf(zipcodeList[i]);
        }

        Set<Integer> zipcodeSet = new HashSet<>(Arrays.asList(zipcodeIntegerArray));
        // The set drops repeated values, so a smaller size means a duplicate existed.
        return zipcodeSet.size() != zipcodeList.length;
    }
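
On Java 8 or later, the int[] to Integer[] conversion can also be done with a stream instead of a hand-written loop. The following is only a sketch, not the author's code: the class name BoxedDuplicates and the method name hasDuplicates are made up for illustration, and the method returns true when the array contains at least one duplicate.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not from the original answer.
final class BoxedDuplicates {
    // True when zipcodeList contains at least one repeated value.
    static boolean hasDuplicates(int[] zipcodeList) {
        // boxed() turns the IntStream into a Stream<Integer>.
        Integer[] boxed = Arrays.stream(zipcodeList).boxed().toArray(Integer[]::new);
        Set<Integer> zipcodeSet = new HashSet<>(Arrays.asList(boxed));
        // The set keeps one copy per value, so a smaller size means a repeat.
        return zipcodeSet.size() < boxed.length;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicates(new int[] {10001, 10002, 10001}));
        System.out.println(hasDuplicates(new int[] {10001, 10002, 10003}));
    }
}
```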

Hope this helps!

Answer 11 (score: 0)

Print all the duplicate elements. Output -1 when no duplicate elements are found.

import java.util.*;

public class PrintDuplicate {

    public static void main(String args[]){
        HashMap<Integer,Integer> h = new HashMap<Integer,Integer>();


        Scanner s=new Scanner(System.in);
        int ii=s.nextInt();
        int k=s.nextInt();
        int[] arr=new  int[k];
        int[] arr1=new  int[k];
        int l=0;
        for(int i=0; i<arr.length; i++)
            arr[i]=s.nextInt();
        for(int i=0; i<arr.length; i++){
            if(h.containsKey(arr[i])){
                h.put(arr[i], h.get(arr[i]) + 1);
                arr1[l++]=arr[i];
            } else {
                h.put(arr[i], 1);
            }
        }
        if(l>0)
        { 
            for(int i=0;i<l;i++)
                System.out.println(arr1[i]);
        }
        else
            System.out.println(-1);
    }
}
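
For comparison, the counting step can be written more compactly with Map.merge (Java 8 or later). This is a sketch under my own assumptions rather than the author's method: the class name CountDuplicates and the method name repeatedValues are made up, it takes an array instead of reading from System.in, and it reports each repeated value once, in order of first appearance, whereas the program above prints a value every time it shows up again.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answer.
final class CountDuplicates {
    // Returns every value that occurs more than once, in order of first appearance.
    static List<Integer> repeatedValues(int[] values) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int v : values) {
            counts.merge(v, 1, Integer::sum);   // insert 1, or add 1 to the existing count
        }
        List<Integer> repeated = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) repeated.add(e.getKey());
        }
        return repeated;
    }

    public static void main(String[] args) {
        System.out.println(repeatedValues(new int[] {1, 2, 2, 3, 3, 3}));
    }
}
```

An empty result plays the role of the -1 printed by the original program.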

Answer 12 (score: 0)

import java.util.Scanner;

public class Duplicates {
    public static void main(String[] args) {
        Scanner console = new Scanner(System.in);
        int number = console.nextInt();
        String numb = "" + number;
        int leng = numb.length()-1;

        if (numb.charAt(0) != numb.charAt(1)) {
            System.out.print(numb.substring(0,1));
        }

        for (int i = 0; i < leng; i++){

          if (numb.charAt(i)==numb.charAt(i+1)){ 
             System.out.print(numb.substring(i,i+1));
          }
          else {
              System.out.print(numb.substring(i+1,i+2));
          }
       }
   }
}

Answer 13 (score: 0)

This program will print all the duplicate values in the array.

public static void main(String [] args){

    int[] array = new int[] { -1, 3, 4, 4,4,3, 9,-1, 5,5,5, 5 };

    Arrays.sort(array);   // needs java.util.Arrays; equal values become adjacent

    boolean isMatched = false;
    int lstMatch = -1;
    // Stop one short of the end so array[i+1] is always in bounds.
    for(int i = 0; i < array.length - 1; i++) {
        if(array[i] == array[i+1]) {
            isMatched = true;
            lstMatch = array[i+1];
        }
        else if(isMatched) {
            // The run of equal values just ended: report it once.
            System.out.println(lstMatch);
            isMatched = false;
            lstMatch = -1;
        }
    }
    // A run that reaches the last element is reported here.
    if(isMatched) {
        System.out.println(lstMatch);
    }

}
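
If only a yes/no answer is needed, the same sort-then-compare-neighbours idea reduces to a short check. This is a sketch and not part of the original answer: the class name SortedDuplicates and the method name hasDuplicates are made up for illustration. It runs in O(n log n) time and sorts a copy so the caller's array is left unchanged.

```java
import java.util.Arrays;

// Hypothetical helper, not from the original answer.
final class SortedDuplicates {
    // True when at least one value occurs more than once.
    static boolean hasDuplicates(int[] values) {
        int[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);             // equal values end up next to each other
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] == sorted[i - 1]) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicates(new int[] {-1, 3, 4, 4, 4, 3, 9, -1, 5, 5, 5, 5}));
        System.out.println(hasDuplicates(new int[] {3, 1, 2}));
    }
}
```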

Answer 14 (score: 0)

  public static void findDuplicates(List<Integer> list){
    if(list!=null && !list.isEmpty()){
      Set<Integer> uniques = new HashSet<>();
      Set<Integer> duplicates = new HashSet<>();

      for(int i=0;i<list.size();i++){
        if(!uniques.add(list.get(i))){
          duplicates.add(list.get(i));
        }
      }
      System.out.println("Uniques: "+uniques);
      System.out.println("Duplicates: "+duplicates);
    }else{
      System.out.println("LIST IS INVALID");
    }
  }