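I created two HashMaps containing the strings from two separate txt files.
Now I am trying to compare the two HashMaps and count how many duplicate values each file contains. For example, if file1 and file2 both contain the string "hello" twice, my console should print: hello occurred 2 times.
Here is my first HashMap: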
List<String> word_list = new ArrayList<>();
// Load the words from the first file into word_list
while (INPUT_TEXT1.hasNext()) {
    String input_word = INPUT_TEXT1.next();
    word_list.add(input_word);
}
INPUT_TEXT1.close();
// Strip everything except letters and normalize to lower case
String regexPattern = "[^a-zA-Z]";
int index = 0;
for (String s : word_list) {
    word_list.set(index++, s.replaceAll(regexPattern, "").toLowerCase());
}
// Find the unique words in the list
String[] uniqueWords = word_list.stream().distinct()
        .toArray(size -> new String[size]);
Map<String, Integer> wordsMap = new HashMap<>();
int frequency = 0;
// Load the words into the map, with each unique word as key and its frequency as value
for (String uniqueWord : uniqueWords) {
    frequency = Collections.frequency(word_list, uniqueWord);
    System.out.println(uniqueWord + " occurred " + frequency + " times");
    wordsMap.put(uniqueWord, frequency);
}
// Sort the words in descending order of frequency (the map's values)
Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()).limit(5);
// Print the top 5 words to the console
System.out.println("Top 5 Words:::");
topWords.forEach(System.out::println);
System.out.println("\n\n");
Here is my second HashMap:
List<String> wordList = new ArrayList<>();
// Load the words from the second file into wordList
while (INPUT_TEXT2.hasNext()) {
    String input_word1 = INPUT_TEXT2.next();
    wordList.add(input_word1);
}
INPUT_TEXT2.close();
String regex = "[^a-zA-Z]";
int index1 = 0;
for (String s : wordList) {
    wordList.set(index1++, s.replaceAll(regex, "").toLowerCase());
}
String[] uniqueWords1 = wordList.stream().distinct()
        .toArray(size -> new String[size]);
Map<String, Integer> wordsMap1 = new HashMap<>();
// Load the words into the map, with each unique word as key and its frequency as value
for (String uniqueWord : uniqueWords1) {
    frequency = Collections.frequency(wordList, uniqueWord);
    System.out.println(uniqueWord + " occurred " + frequency + " times");
    // Store into wordsMap1 (the second map), not wordsMap
    wordsMap1.put(uniqueWord, frequency);
}
// Sort the words in descending order of frequency (the map's values)
Stream<Entry<String, Integer>> topWords1 = wordsMap1.entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()).limit(5);
Here is my original approach for finding the duplicate values:
boolean val = wordsMap.keySet().containsAll(wordsMap1.keySet());
for (Entry<String, Integer> str : wordsMap.entrySet()) {
    System.out.println("================= " + str.getKey());
    if (wordsMap1.containsKey(str.getKey())) {
        System.out.println("Map2 Contains Map 1 Key");
    }
}
System.out.println("================= " + val);
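Does anyone have any other suggestions for achieving this? Thanks.
Edit: How can I count the number of occurrences of each individual value?
Answer 0 (score: 3)
I think your code works too. If your goal is to find a better way to implement the last check, you could try this: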
Set<String> keySetMap1 = new HashSet<String>(wordsMap.keySet());
Set<String> keySet2 = wordsMap1.keySet();
keySetMap1.retainAll(keySet2);
keySetMap1.stream().forEach(x -> System.out.println("Map2 Contains Map 1 Key: "+x));
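To address the edit about counting occurrences of each shared word, here is a minimal sketch that builds on the `retainAll` idea above. The class name `CommonWordCounts` and the helpers `countWords` and `commonCounts` are hypothetical names made up for this example, and the sample word lists stand in for the contents of the two files. As an assumption that differs from the question's code, tokens that become empty after the regex cleanup are skipped, and `Map.merge` replaces the `distinct()` plus `Collections.frequency` combination so each list is only traversed once.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class CommonWordCounts {

    // Builds a word -> frequency map in a single pass,
    // applying the same cleanup regex as the question.
    static Map<String, Integer> countWords(List<String> rawWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String raw : rawWords) {
            String word = raw.replaceAll("[^a-zA-Z]", "").toLowerCase();
            if (!word.isEmpty()) { // assumption: ignore tokens with no letters
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // For every word present in both maps, returns {countInFirst, countInSecond}.
    // A TreeMap is used only so the words come out in alphabetical order.
    static Map<String, int[]> commonCounts(Map<String, Integer> first,
                                           Map<String, Integer> second) {
        Set<String> shared = new HashSet<>(first.keySet());
        shared.retainAll(second.keySet()); // keep only keys found in both maps
        Map<String, int[]> result = new TreeMap<>();
        for (String word : shared) {
            result.put(word, new int[] { first.get(word), second.get(word) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data standing in for the words read from file1 and file2
        Map<String, Integer> map1 = countWords(Arrays.asList("Hello,", "world", "hello"));
        Map<String, Integer> map2 = countWords(Arrays.asList("hello", "HELLO!", "there"));
        commonCounts(map1, map2).forEach((word, c) ->
                System.out.println(word + " occurred " + c[0]
                        + " times in file1 and " + c[1] + " times in file2"));
    }
}
```

With the sample data this prints `hello occurred 2 times in file1 and 2 times in file2`. If you only want a single number per word, you could replace the `int[]` with `Math.min(first.get(word), second.get(word))` or with the sum of both counts, depending on what "duplicate" should mean for your files.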