hash table

Work in Progress

Summary

Hash table

OperationMethodPerformance
search(v)check if the index h(v) is filled-
insert(v)fill index h(v)-
remove(v)empty index h(v)-
0123...mv0v3vmh(v)

identical to DAT, just using the hash of the value as the index

Separate chaining

  • closed addressing
  • usually DLL for each “bucket”
  • can have
0123...mv1v2...v3...vkvn...

worse case is to traverse the bucket, a good hash function will prevent this

Linear probing

  • open addressing
  • forms undesirable primary clusters
(h(v) + step++) % m

1, 2, 3, 4, 5, ...
01234:::mbaseaddress

Quadratic probing

  • open addressing
  • forms secondary clusters
  • and and is a prime, then quadratic probing will always find an empty slot
(h(v) + (step * step++)) % m

1, 4, 9, 16, 25, ...
01234:::mbaseaddress

Double hashing

  • open addressing
  • secondary hash adds variation to the jump
(h(v) + (step++ * h2(v))) % m

Concept

Hashing

  • map some large range of values(might be non-integers, ie. strings) to integer keys
  • prone to collisions due to the pigeonhole principle
  • must be deterministic

Perfect hash function

  • a bijective function -> all keys are mapped 1-1 to a value
  • no collisions
  • no wasted space

Load factor

usually the hash table is rehashed when for open addressing

Collision resolution

  • closed addressing
    • multiple values stored in a bucket
  • open addressing
    • at most one value in each slot
    • maintain a DEL indicator to avoid breaking probing
  • goals:
    • find an empty slot if it exists

Application

Leetcode: Contains Duplicate II

java
if (nums.length < 2) // ignore trivial case
	return false;

Map<Integer, Integer> mp = new HashMap<>();

for (int i = 0; i < nums.length; i++) {
	int num = nums[i];
	if (mp.containsKey(num)) 
		if (i - mp.get(num) <= k) 
			return true;

	mp.put(num, i); // add the most recent index of the number to the map
}

return false;

Leetcode: Longest Substring Without Repeating Characters

java
Map<Character, Integer> mp = new HashMap<>();

int left = 0, max = 0;

for (int i = 0; i < s.length(); i++) {
	char c = s.charAt(i);
	if (mp.containsKey(c))
		left = Math.max(left, mp.get(c) + 1); // set left if its larger then the old value
	max = Math.max(max, i - left + 1); // compare against the length of the current string

	mp.put(c, i); // add the most rescent index of this character to the map
}

return max;

Leetcode: longest-consecutive-sequence

java
HashSet<Integer> hs = new HashSet<>(); // dont care about duplicates

for (int n : nums)
	hs.add(n);

int longest = 0;
for (int n : hs) {
	if (hs.contains(n - 1)) continue; // dont bother searching, since the smaller value would have been searched already

	int cur = 1;
	while (hs.contains(++n))
		cur++;

	longest = Math.max(longest, cur);
}

return longest;

Leetcode: group-anagrams

  • map the anagrams(values) according to the sorted version(hash)
java
Map<String, List<String>> mp = new HashMap<>(); // map from the sorted anagram to the other anagrams

for (String w : strs) {
	char[] chars = w.toCharArray();
	Arrays.sort(chars);
	String k = new String(chars); // sorted anagram as the key

	List<String> ls;
	if (mp.containsKey(k)) {
		ls = mp.get(k); // retrieve the existing list
	} else {
		ls = new ArrayList<>(); // create new list if it doesnt yet exist
	}
	ls.add(w);
	mp.put(k, ls);
}

return new ArrayList<>(mp.values()); // HashSet -> Collection -> List

Leetcode: unique-number-of-occurrences

  • use a set to remove duplicates
java
Map<Integer, Integer> map = new HashMap<>(); // map each number to its occurance

for (int n : arr) {
	if (map.containsKey(n))
		map.put(n, map.get(n) + 1);
	else
		map.put(n, 1);
}

Set<Integer> count = new HashSet<>(map.values());
return map.size() == count.size(); // if they are the same size there are no duplicates

Leetcode: Ugly Number II

java
TreeSet<Long> pq = new TreeSet<>(); // overflows with Integer, -ve values get polled first

pq.add(1L);
while (--n > 0) {
	long curr = pq.pollFirst();
	pq.add(curr * 2);
	pq.add(curr * 3);
	pq.add(curr * 5);
}

return pq.pollFirst().intValue();

Leetcode: Unique Length-3 Palindromes

  • palindromes in the format X * X
java
// s.length >= 3
HashMap<Character, int[]> mp = new HashMap<>();

for (int i = 0; i < s.length(); i++) {
	int[] app = mp.getOrDefault(s.charAt(i), new int[] { i, i });
	app[1] = i;
	mp.put(s.charAt(i), app); // first and last appearances
}

int total = 0;
for (Map.Entry<Character, int[]> e : mp.entrySet()) {
	char c = e.getKey();
	int[] app = e.getValue();
	
	HashSet<Character> hs = new HashSet<>(); // prevent duplicates
	for (int i = app[0] + 1; i < app[1]; i++) 
		hs.add(s.charAt(i));

	total += hs.size();
}

return total;