Category: Hash-based
Topics: membership testing, deduplication, graph visited sets, set theory operations
An unordered collection of unique hashable elements, implemented as a hash table with only keys (no values), providing O(1) average membership testing.
| Operation | Average | Worst | Description |
|---|---|---|---|
| add(x) | O(1) | O(n) | Add element |
| remove(x) | O(1) | O(n) | Remove element, raises KeyError if missing |
| discard(x) | O(1) | O(n) | Remove element if present |
| x in s | O(1) | O(n) | Check membership |
| len(s) | O(1) | O(1) | Number of elements |
| s | t (union) | O(n+m) | O(n+m) | Elements in either set |
| s & t (intersect) | O(min) | O(n*m) | Elements in both sets |
| s - t (diff) | O(n) | O(n) | Elements in s but not t |
| s ^ t (sym diff) | O(n+m) | O(n+m) | Elements in exactly one set |
# Creation
s = {1, 2, 3}
s = set([1, 2, 2, 3]) # {1, 2, 3} - duplicates removed
# Empty set (not {} which creates dict)
empty = set()
# Add/Remove
s.add(4)
s.remove(1) # KeyError if missing
s.discard(10) # No error if missing
s.pop() # Remove and return arbitrary element
# Membership - O(1) average
if 2 in s:
print("Found!")
# Set operations
a = {1, 2, 3}
b = {2, 3, 4}
union = a | b # {1, 2, 3, 4}
intersection = a & b # {2, 3}
difference = a - b # {1}
symmetric_diff = a ^ b # {1, 4}
# Set comparisons
a.issubset(b) # Is a contained in b?
a.issuperset(b) # Does a contain b?
a.isdisjoint(b) # No common elements?
# Set comprehension
evens = {x for x in range(10) if x % 2 == 0}
# Common patterns: deduplication
duplicated_list = [1, 2, 2, 3, 3, 3]
unique = list(set(duplicated_list)) # [1, 2, 3]
# Graph visited set pattern
visited = set()
def dfs(node, graph):
if node in visited:
return
visited.add(node)
for neighbor in graph.get(node, []):
dfs(neighbor, graph)
O(n) for n elements. Like dictionaries, Python sets over-allocate for fast access. Each element stores the object reference and its hash.
Pros:
Cons: