Category: Hash-based
Topics: key-value storage, caching, indexing, frequency counting
A data structure that stores key-value pairs using a hash function to compute an index into an array of buckets, providing average O(1) lookup, insertion, and deletion.
| Operation | Average | Worst | Description |
|---|---|---|---|
| Get | O(1) | O(n) | Retrieve value by key |
| Put | O(1) | O(n) | Insert or update key-value pair |
| Delete | O(1) | O(n) | Remove key-value pair |
| Contains | O(1) | O(n) | Check if key exists |
| Resize | O(n) | O(n) | Rehash all elements |
```python
# Python dict IS a hash map
cache = {}

# Insert - O(1) average
cache["key1"] = "value1"
cache["key2"] = "value2"

# Get - O(1) average
val = cache.get("key1", "default")

# Check existence - O(1) average
if "key1" in cache:
    print("Found!")

# Delete - O(1) average
del cache["key1"]
```
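The topics above list frequency counting; with a plain dict it is a short loop. A sketch (the word list is made up for illustration):

```python
from collections import Counter

# Count occurrences of each word with a plain dict
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1
print(counts)  # {'apple': 3, 'banana': 2, 'cherry': 1}

# collections.Counter is a dict subclass purpose-built for this
assert Counter(words) == counts
```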
```python
# Simple hash map implementation (separate chaining)
class HashMap:
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]
        self.load_factor = 0.75

    def _hash(self, key):
        return hash(key) % self.capacity

    def put(self, key, value):
        if self.size / self.capacity >= self.load_factor:
            self._resize()
        idx = self._hash(key)
        for i, (k, v) in enumerate(self.buckets[idx]):
            if k == key:
                self.buckets[idx][i] = (key, value)  # update existing key
                return
        self.buckets[idx].append((key, value))
        self.size += 1

    def get(self, key):
        idx = self._hash(key)
        for k, v in self.buckets[idx]:
            if k == key:
                return v
        raise KeyError(key)

    def delete(self, key):
        # Remove key-value pair
        idx = self._hash(key)
        for i, (k, _) in enumerate(self.buckets[idx]):
            if k == key:
                del self.buckets[idx][i]
                self.size -= 1
                return
        raise KeyError(key)

    def _resize(self):
        # Double the capacity and rehash every entry
        old_buckets = self.buckets
        self.capacity *= 2
        self.buckets = [[] for _ in range(self.capacity)]
        self.size = 0
        for bucket in old_buckets:
            for key, value in bucket:
                self.put(key, value)
```
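To make the chaining behavior concrete, here is a stripped-down sketch (fixed capacity, no resize; the key values are chosen to collide, since `hash(n) == n` for small non-negative ints in CPython):

```python
# Minimal chained table: two colliding keys coexist in one bucket.
# With capacity 16, keys 1 and 17 both map to bucket 1 (17 % 16 == 1).
capacity = 16
buckets = [[] for _ in range(capacity)]

def put(key, value):
    idx = hash(key) % capacity
    for i, (k, _) in enumerate(buckets[idx]):
        if k == key:
            buckets[idx][i] = (key, value)
            return
    buckets[idx].append((key, value))

def get(key):
    for k, v in buckets[hash(key) % capacity]:
        if k == key:
            return v
    raise KeyError(key)

put(1, "a")
put(17, "b")            # collides with key 1
print(len(buckets[1]))  # 2: both pairs chained in the same bucket
print(get(17))          # b
```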
Load factor: CPython's dict resizes when it is about 2/3 full. A higher load factor means more collisions; a lower one wastes more memory on empty slots.
Space: O(n) for n key-value pairs. Additional space for empty buckets depends on load factor and capacity. With separate chaining, O(m + n) where m is the bucket array size.
Pros:
- Average O(1) get, put, delete, and membership checks
- Keys can be any hashable type
- Natural fit for caching, indexing, and frequency counting
Cons:
- Worst case O(n) when many keys collide
- No sorted ordering of keys (Python dicts preserve only insertion order)
- Memory overhead from empty buckets and periodic resizing