Sets and Frozensets
Introduction
In this chapter of our tutorial, we deal with Python's implementation of sets. Though sets are nowadays an integral part of modern mathematics, this has not always been the case. Set theory was rejected by many, even by some great thinkers. One of them was the philosopher Wittgenstein. He didn't like set theory and complained that mathematics is "ridden through and through with the pernicious idioms of set theory...". He dismissed set theory as "utter nonsense", as being "laughable" and "wrong". His criticism appeared years after the death of the German mathematician Georg Cantor, the founder of set theory. David Hilbert defended it from its critics by famously declaring: "No one shall expel us from the Paradise that Cantor has created."
Cantor defined a set at the beginning of his "Beiträge zur Begründung der transfiniten Mengenlehre" as: "A set is a gathering together into a whole of definite, distinct objects of our perception and of our thought - which are called elements of the set." Nowadays, we can say in "plain" English: A set is a well-defined collection of objects.
The elements or members of a set can be anything: numbers, characters, words, names, letters of the alphabet, even other sets, and so on. Sets are usually denoted with capital letters. This is not the exact mathematical definition, but it is good enough for the following.
The data type "set", which is a collection type, has been part of Python since version 2.4. A set contains an unordered collection of unique and immutable (more precisely, hashable) objects. The set data type is, as the name implies, a Python implementation of sets as they are known from mathematics. This explains why sets, unlike lists or tuples, can't have multiple occurrences of the same element.
Sets
If we want to create a set, we can call the built-in set function with a sequence or another iterable object.
In the following example, a string is broken up into its individual characters to build the resulting set x:
x = set("A Python Tutorial")
x
type(x)
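As a small sketch of what the call above does: set() iterates over its argument and keeps each distinct item exactly once, so repeated characters collapse into a single element. The string "banana" and the name letters are only illustrative choices, not part of the original example.

```python
# Building a set from a string keeps each distinct character exactly once.
letters = set("banana")

# Sets are unordered, so we sort the elements to get a predictable listing.
print(sorted(letters))     # ['a', 'b', 'n']
print(len(letters))        # 3
print("n" in letters)      # True: membership test
```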
We can pass a list to the built-in set function, as we can see in the following:
x = set(["Perl", "Python", "Java"])
x
Now we want to show what happens if we pass a tuple with repeated elements to the set function, in our example the city "Paris":
cities = set(("Paris", "Lyon", "London","Berlin","Paris","Birmingham"))
cities
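Sets cannot contain mutable objects such as lists, because the elements of a set have to be hashable. The following call therefore raises a TypeError:
cities = set((["Python","Perl"], ["Paris", "Berlin", "London"]))  # raises TypeError: lists are unhashable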
cities
Tuples, on the other hand, are fine because they are immutable:
cities = set((("Python","Perl"), ("Paris", "Berlin", "London")))
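The rule behind these two examples can be sketched as follows: an object can be a set element only if it is hashable. This is a minimal illustration; the names bad, good and message are assumptions introduced here, and the exact wording of the error message may differ between Python versions.

```python
# Set elements must be hashable: lists are not, tuples of strings are.
message = ""
try:
    bad = set((["Python", "Perl"], ["Paris", "Berlin", "London"]))
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

good = set((("Python", "Perl"), ("Paris", "Berlin", "London")))
print(len(good))                      # 2
print(("Python", "Perl") in good)     # True
```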
Although sets cannot contain mutable objects, sets themselves are mutable:
cities = set(["Frankfurt", "Basel","Freiburg"])
cities.add("Strasbourg")
cities
Frozensets are like sets except that they cannot be changed, i.e. they are immutable:
cities = frozenset(["Frankfurt", "Basel","Freiburg"])
cities.add("Strasbourg")  # raises AttributeError: 'frozenset' object has no attribute 'add'
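To make the difference to ordinary sets more concrete, here is a hedged sketch. It shows that a frozenset simply lacks the mutating methods, that non-mutating operations still work, and that a frozenset, being hashable, can itself be an element of a set. The names extended and regions are illustrative assumptions.

```python
cities = frozenset(["Frankfurt", "Basel", "Freiburg"])

# A frozenset has no mutating methods such as add().
print(hasattr(cities, "add"))         # False

# Non-mutating operations still work and return a new frozenset.
extended = cities | {"Strasbourg"}
print(type(extended).__name__)        # frozenset
print(len(cities), len(extended))     # 3 4

# Being hashable, a frozenset can be an element of another set.
regions = {cities, frozenset(["Paris", "Lyon"])}
print(cities in regions)              # True
```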
We can also define sets without the built-in set function by writing the elements in curly braces:
adjectives = {"cheap","expensive","inexpensive","economical"}
adjectives
The method add adds an element, which has to be immutable, to a set:
colours = {"red","green"}
colours.add("yellow")
colours
colours.add(["black","white"])  # raises TypeError: a list is unhashable
The method clear removes all elements from a set:
cities = {"Stuttgart", "Konstanz", "Freiburg"}
cities.clear()
cities
The method copy creates a shallow copy of a set:
more_cities = {"Winterthur","Schaffhausen","St. Gallen"}
cities_backup = more_cities.copy()
more_cities.clear()
cities_backup
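In case you think a plain assignment would be enough: it is not, because an assignment only creates a second name for the same set object, so clearing one also empties the other: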
more_cities = {"Winterthur","Schaffhausen","St. Gallen"}
cities_backup = more_cities
more_cities.clear()
cities_backup
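The difference between the two variants can be checked directly with the identity operator "is". The following is a minimal sketch; the names alias and backup are introduced here only for illustration.

```python
more_cities = {"Winterthur", "Schaffhausen", "St. Gallen"}

alias = more_cities            # second name for the same set object
backup = more_cities.copy()    # new, independent (shallow) copy

print(alias is more_cities)    # True
print(backup is more_cities)   # False

more_cities.clear()
print(len(alias))              # 0: the alias sees the change
print(len(backup))             # 3: the copy is unaffected
```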
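The method difference returns the difference of two or more sets as a new set, leaving the original set unchanged:
x = {"a","b","c","d","e"}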
y = {"b","c"}
z = {"c","d"}
x.difference(y)
x.difference(y).difference(z)
Instead of using the method difference, we can use the operator "-":
x - y
x - y - z
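The method difference_update removes all elements of another set from this set. x.difference_update(y) has the same effect on x as x = x - y, but it modifies the set in place instead of creating a new one:
x = {"a","b","c","d","e"}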
y = {"b","c"}
x.difference_update(y)
x = {"a","b","c","d","e"}
y = {"b","c"}
x = x - y
x
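A short sketch of the contrast between the two methods: difference returns a new set, whereas difference_update changes the set it is called on and returns None. The names result and returned are illustrative assumptions.

```python
x = {"a", "b", "c", "d", "e"}
y = {"b", "c"}

result = x.difference(y)             # new set, x stays unchanged
print(sorted(result))                # ['a', 'd', 'e']
print(len(x))                        # 5

returned = x.difference_update(y)    # modifies x in place
print(returned)                      # None
print(sorted(x))                     # ['a', 'd', 'e']
```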
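The method discard removes an element from the set if it is present. If it is not a member of the set, nothing happens:
x = {"a","b","c","d","e"}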
x.discard("a")
x
x.discard("z")
x
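The method remove works like discard, but it raises a KeyError if the element is not a member of the set:
x = {"a","b","c","d","e"}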
x.remove("a")
x
x.remove("z")  # raises KeyError: 'z'
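The two behaviours can be put side by side in a small sketch. It assumes we want to handle the missing element ourselves with try/except; the name missing is introduced here for illustration.

```python
x = {"a", "b", "c"}

# discard: silently does nothing if the element is absent.
x.discard("z")
print(sorted(x))                     # ['a', 'b', 'c']

# remove: raises KeyError if the element is absent.
missing = None
try:
    x.remove("z")
except KeyError as error:
    missing = error.args[0]
    print("not in set:", missing)    # not in set: z
```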
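The method union returns the union of two sets as a new set, i.e. all elements that are in either set:
x = {"a","b","c","d","e"}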
y = {"c","d","e","f","g"}
x.union(y)
This can be abbreviated with the pipe operator "|":
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x | y
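The method intersection returns the intersection of two sets as a new set, i.e. all elements that are in both sets:
x = {"a","b","c","d","e"}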
y = {"c","d","e","f","g"}
x.intersection(y)
This can be abbreviated with the ampersand operator "&":
x = {"a","b","c","d","e"}
y = {"c","d","e","f","g"}
x & y
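One detail worth knowing when choosing between the method and the operator spelling: the methods accept any iterable as an argument, while the operators require sets on both sides. The following sketch illustrates this; the flag operator_needs_set is a name made up for this example.

```python
x = {"a", "b", "c", "d", "e"}
y = {"c", "d", "e", "f", "g"}

print(sorted(x | y))    # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
print(sorted(x & y))    # ['c', 'd', 'e']

# The method accepts any iterable, here a list.
print(sorted(x.intersection(["c", "x"])))    # ['c']

# The operator insists on a set and raises TypeError for a list.
operator_needs_set = False
try:
    x & ["c", "x"]
except TypeError:
    operator_needs_set = True
print(operator_needs_set)                    # True
```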
The method isdisjoint returns True if two sets have no element in common:
x = {"a","b","c"}
y = {"c","d","e"}
x.isdisjoint(y)
x = {"a","b","c"}
y = {"d","e","f"}
x.isdisjoint(y)
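Put differently, two sets are disjoint exactly when their intersection is empty. A minimal sketch of that equivalence, with z as an additional illustrative set:

```python
x = {"a", "b", "c"}
y = {"c", "d", "e"}
z = {"d", "e", "f"}

print(x.isdisjoint(y))    # False: both contain "c"
print(x.isdisjoint(z))    # True

# isdisjoint agrees with "the intersection is empty".
print(x.isdisjoint(y) == (len(x & y) == 0))    # True
print(x.isdisjoint(z) == (len(x & z) == 0))    # True
```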
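x.issubset(y) returns True if every element of x is also an element of y. The operator "<=" is an abbreviation for it, while "<" tests for a proper subset:
x = {"a","b","c","d","e"}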
y = {"c","d"}
x.issubset(y)
y.issubset(x)
x < y
y < x # y is a proper subset of x
x < x # a set can never be a proper subset of itself
x <= x
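x.issuperset(y) returns True if x contains every element of y. The operator ">=" is an abbreviation for it, while ">" tests for a proper superset:
x = {"a","b","c","d","e"}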
y = {"c","d"}
x.issuperset(y)
x > y
x >= y
x >= x
x > x
x.issuperset(x)
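The relationship between the plain and the proper variants of these comparisons can be summarised in a short sketch: "proper" means "subset (or superset) and not equal". The example reuses the sets x and y from above.

```python
x = {"a", "b", "c", "d", "e"}
y = {"c", "d"}

print(y <= x, y < x)    # True True
print(x >= y, x > y)    # True True
print(x <= x, x < x)    # True False

# A proper subset is a subset that is not equal to the other set.
print((y < x) == (y <= x and y != x))    # True

# Like the other methods, issubset accepts any iterable.
print(y.issubset(["c", "d", "z"]))       # True
```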
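The method pop removes and returns an arbitrary element of the set. It raises a KeyError if the set is empty:
x = {"a","b","c","d","e"}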
x.pop()
x.pop()
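Because sets are unordered, we cannot predict which element pop returns. The following sketch therefore only checks properties that hold for any outcome, and it also shows the KeyError for an empty set. The names removed, empty and empty_pop_failed are illustrative assumptions.

```python
x = {"a", "b", "c"}

removed = x.pop()                   # some element, we cannot say which
print(removed in {"a", "b", "c"})   # True
print(removed in x)                 # False: it is gone from the set
print(len(x))                       # 2

# Popping from an empty set raises KeyError.
empty = set()
empty_pop_failed = False
try:
    empty.pop()
except KeyError:
    empty_pop_failed = True
print(empty_pop_failed)             # True
```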