Python 101: Learning About Sets

The Python 3 documentation defines a set as an “unordered collection of distinct hashable objects”. You can use a set for membership testing, for removing duplicates from a sequence, and for computing mathematical operations such as intersection, union, difference, and symmetric difference.

Because sets are unordered collections, they do not record element position or order of insertion. As a result, they do not support indexing, slicing, or other sequence-like behaviors that you have seen with lists and tuples.

There are two types of set built into the Python language:

  • set – which is mutable
  • frozenset – which is immutable and hashable
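The practical consequence of frozenset being hashable is that it can be used where a regular set cannot, for example as a dictionary key or as a member of another set. Here is a minimal sketch; the variable names are made up for illustration:

```python
# A frozenset is hashable, so it can be a dictionary key or a set member.
pair = frozenset({"a", "b"})
lookup = {pair: "first pair"}
nested = {pair, frozenset({"c"})}   # a set that contains frozensets

# A regular set is mutable and therefore cannot be used as a key.
error_message = None
try:
    bad = {{"a", "b"}: "first pair"}
except TypeError as exc:
    error_message = str(exc)

# Element order does not matter when comparing sets, so this lookup succeeds.
print(lookup[frozenset({"b", "a"})])
print(error_message)
```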

This article will focus on set.

You will learn how to do the following with sets:

  • Creating a set
  • Accessing set members
  • Changing items
  • Adding items
  • Removing items
  • Deleting a set
  • Set operations

Let’s get started by creating a set!

Creating a Set

Creating a set is pretty straightforward. You can create one by placing a series of comma-separated objects inside curly braces, or you can pass an iterable to the built-in set() function.

Let’s look at an example:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set
{'c', 'a', 'b'}
>>> type(my_set)
<class 'set'>

A set uses the same curly braces that you used to create a dictionary. Note that instead of key: value pairs, you have a series of values. When you print out the set, you can see that duplicates were removed automatically.

Now let’s try creating a set using set():

>>> my_list = [1, 2, 3, 4]
>>> my_set = set(my_list)
>>> my_set
{1, 2, 3, 4}
>>> type(my_set)
<class 'set'>

In this example, you created a list and then converted it to a set using set(). If there had been any duplicates in the list, they would have been removed.
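One thing to watch out for is the empty set. Because curly braces are shared with dictionaries, an empty pair of braces gives you a dictionary, so an empty set has to be created with set(). The short sketch below shows that, along with set() accepting iterables other than lists:

```python
empty_dict = {}      # empty curly braces create a dictionary, not a set
empty_set = set()    # an empty set has to be created with set()

print(type(empty_dict))   # <class 'dict'>
print(type(empty_set))    # <class 'set'>

# set() accepts any iterable, not only lists
letters = set("hello")            # each character becomes an element
numbers = set((1, 2, 2, 3, 3))    # duplicates in the tuple are dropped

print(len(letters))               # 4, because "l" appears twice in "hello"
print(numbers == {1, 2, 3})       # True
```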

Now let’s move along and see some of the things that you can do with this data type.

Accessing Set Members

You can check if an item is in a set by using Python’s in operator:

>>> my_set = {"a", "b", "c", "c"}
>>> "a" in my_set
True

Sets do not allow you to use slicing or the like to access individual members of the set. Instead, you need to iterate over a set. You can do that using a loop, such as a while loop or a for loop.

Loops are not covered until chapter 12, but here is the basic syntax for iterating over a collection using a for loop:

>>> for item in my_set:
...     print(item)
... 
c
a
b

This will loop over each item in the set one at a time and print it out.
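If you are curious what happens when you try to index a set anyway, or you need the items in a predictable order, the following sketch may help. It assumes the items can be compared with each other, which sorted() requires:

```python
my_set = {"a", "b", "c"}

# Indexing is not supported, so this raises TypeError
indexing_failed = False
try:
    my_set[0]
except TypeError as exc:
    indexing_failed = True
    print(f"Indexing failed: {exc}")

# sorted() returns a new list, which has a stable order and can be indexed
ordered = sorted(my_set)
print(ordered)       # ['a', 'b', 'c']
print(ordered[0])    # a

# "not in" is the negated form of the membership test
print("z" not in my_set)   # True
```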

Changing Items

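Once a set is created, you cannot change any of its items in place. Sets have no indexes or keys to assign to, so the only way to “change” an item is to remove the old value and add the new one.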

However, you can add new items to a set. Let’s find out how!

Adding Items

There are two ways to add items to a set:

  • add()
  • update()

Let’s try adding an item using add():

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.add('d')
>>> my_set
{'d', 'c', 'a', 'b'}

That was easy! You were able to add an item to the set by passing it into the add() method.

If you’d like to add multiple items all at once, then you should use update() instead:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.update(['d', 'e', 'f'])
>>> my_set
{'a', 'c', 'd', 'e', 'b', 'f'}

Note that update() will take any iterable you pass to it. So it could take, for example, a list, tuple or another set.
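A few details of add() and update() are easy to trip over, so here is a small sketch of them. The set names chars and words are only for illustration:

```python
my_set = {"a", "b", "c"}

my_set.add("a")      # "a" is already present, so nothing changes
print(len(my_set))   # 3

# update() accepts several iterables in a single call
my_set.update(["d", "e"], ("f",), {"g"})
print(sorted(my_set))   # ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# A string is an iterable of characters, so update() adds each character...
chars = set()
chars.update("hi")
print(sorted(chars))    # ['h', 'i']

# ...while add() stores the whole string as a single item
words = set()
words.add("hi")
print(words)            # {'hi'}
```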

Removing Items

You can remove items from sets in several different ways.

You can use:

  • remove()
  • discard()
  • pop()

Let’s go over each of these in the following sub-sections!

Using .remove()

The remove() method will attempt to remove the specified item from a set:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.remove('a')
>>> my_set
{'c', 'b'}

If you happen to ask the set to remove() an item that does not exist, you will receive an error:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.remove('f')
Traceback (most recent call last):
  Python Shell, prompt 208, line 1
builtins.KeyError: 'f'

Now let’s see how the closely related discard() method works!

Using .discard()

The discard() method works in almost exactly the same way as remove() in that it will remove the specified item from the set:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.discard('b')
>>> my_set
{'c', 'a'}

The difference with discard(), though, is that it won’t raise an error if you try to remove an item that doesn’t exist:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.discard('d')
>>>

If you want to be able to catch an error when you attempt to remove an item that does not exist, use remove(). If that doesn’t matter to you, then discard() might be a better choice.
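To make that choice concrete, here is one way the two methods might look side by side. Exception handling is covered in more depth elsewhere, so treat the try/except here as a sketch:

```python
my_set = {"a", "b", "c"}

# remove() signals a missing item with KeyError, which you can handle
missing_item_reported = False
try:
    my_set.remove("z")
except KeyError:
    missing_item_reported = True
    print("'z' was not in the set")

# discard() stays silent whether or not the item was present
my_set.discard("z")
my_set.discard("a")
print(sorted(my_set))   # ['b', 'c']
```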

Using .pop()

The pop() method will remove and return an arbitrary item from the set:

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.pop()
'c'
>>> my_set
{'a', 'b'}

If your set is empty and you try to pop() an item out, you will receive an error:

>>> my_set = {"a"}
>>> my_set.pop()
'a'
>>> my_set.pop()
Traceback (most recent call last):
  Python Shell, prompt 219, line 1
builtins.KeyError: 'pop from an empty set'

This is very similar to the way that pop() works with the list data type, except that an empty list raises an IndexError instead of a KeyError. Also, a list’s pop() removes the last item by default, whereas sets are unordered, so you can’t be sure which item pop() will remove.
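One common way to avoid the KeyError is to check that the set still has items before calling pop(). The sketch below drains a set that way; the names pending and processed are hypothetical:

```python
pending = {"a", "b", "c"}
processed = []

# An empty set is falsy, so the loop ends before pop() can raise KeyError
while pending:
    processed.append(pending.pop())

# The pop order is arbitrary, so sort before printing
print(sorted(processed))   # ['a', 'b', 'c']
print(pending)             # set()
```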

Clearing or Deleting a Set

Sometimes you will want to empty a set or even completely remove it.

To empty a set, you can use clear():

>>> my_set = {"a", "b", "c", "c"}
>>> my_set.clear()
>>> my_set
set()

If you want to completely remove the set, then you can use Python’s del statement, which deletes the name my_set:

>>> my_set = {"a", "b", "c", "c"}
>>> del my_set
>>> my_set
Traceback (most recent call last):
  Python Shell, prompt 227, line 1
builtins.NameError: name 'my_set' is not defined
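The difference between the two matters when more than one name refers to the same set. As a rough sketch (the alias names are made up), del removes only a name, while clear() empties the set object that every name shares:

```python
my_set = {"a", "b", "c"}
alias = my_set           # a second name for the same set object

del my_set               # removes only the name my_set
print(sorted(alias))     # ['a', 'b', 'c'], the set itself still exists

alias_two = alias        # yet another name for the same object
alias.clear()            # empties the object in place
print(alias_two)         # set(), every name bound to the object sees the change
```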

Now let’s learn what else you can do with sets!

Set Operations

Sets provide you with some common operations such as:

  • union() – Combines two sets and returns a new set
  • intersection() – Returns a new set with the elements that are common between the two sets
  • difference() – Returns a new set with elements that are not in the other set

These operations are the most common ones that you will use when working with sets.

The union() method is similar to the update() method that you learned about earlier, in that it combines two or more sets. The difference is that union() returns a new set rather than updating the original set with new items:

>>> first_set = {'one', 'two', 'three'}
>>> second_set = {'orange', 'banana', 'peach'}
>>> first_set.union(second_set)
{'two', 'banana', 'three', 'peach', 'orange', 'one'}
>>> first_set
{'two', 'three', 'one'}

In this example, you create two sets. Then you call union() on the first set and pass in the second set. However, union() doesn’t update the first set. It creates a new set. If you want to save the new set, then you should do the following instead:

>>> united_set = first_set.union(second_set)
>>> united_set
{'two', 'banana', 'three', 'peach', 'orange', 'one'}

The intersection() method takes two sets and returns a new set that contains only the items that are the same in both of the sets.

Let’s look at an example:

>>> first_set = {'one', 'two', 'three'}
>>> second_set = {'orange', 'banana', 'peach', 'one'}
>>> first_set.intersection(second_set)
{'one'}

These two sets have only one item in common: the string “one”. So when you call intersection(), it returns a new set with a single element in it. As with union(), if you want to save off this new set, then you would want to do something like this:

>>> intersection = first_set.intersection(second_set)
>>> intersection
{'one'}

The difference() method will return a new set with the elements in the set that are not in the other set. This can be a bit confusing, so let’s look at a couple of examples:

>>> first_set = {'one', 'two', 'three'}
>>> second_set = {'three', 'four', 'one'}
>>> first_set.difference(second_set)
{'two'}
>>> second_set.difference(first_set)
{'four'}

When you call difference() on the first_set, it returns a set with “two” as its only element. This is because “two” is the only string not found in the second_set. When you call difference() on the second_set, it returns a set containing only “four”, because “four” is the only string not found in the first_set.
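Python also provides operators that correspond to these three methods: | for union, & for intersection, and - for difference. The sketch below compares them with the methods and shows one difference in behavior, namely that the methods accept any iterable while the operators expect sets on both sides:

```python
first_set = {"one", "two", "three"}
second_set = {"three", "four", "one"}

# Each operator returns the same new set as the matching method
print((first_set | second_set) == first_set.union(second_set))          # True
print((first_set & second_set) == first_set.intersection(second_set))   # True
print((first_set - second_set) == first_set.difference(second_set))     # True

# The methods accept any iterable, such as a list...
combined = first_set.union(["four"])
print(sorted(combined))   # ['four', 'one', 'three', 'two']

# ...but the operators require both operands to be sets
try:
    first_set | ["four"]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)   # False
```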

There are other methods that you can use with sets, but they are used pretty infrequently. If you need them, check the Python documentation for full details on set methods.
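As a quick taste of those less common methods, here is a brief sketch of symmetric_difference(), which was mentioned at the start of this article, along with a few comparison methods. It is only a sample, not a complete list:

```python
first_set = {"one", "two", "three"}
second_set = {"three", "four", "one"}

# Items that are in exactly one of the two sets
only_in_one = first_set.symmetric_difference(second_set)
print(sorted(only_in_one))   # ['four', 'two']

# Comparison methods return True or False
print({"one"}.issubset(first_set))             # True
print(first_set.issuperset({"one", "two"}))    # True
print(first_set.isdisjoint({"five"}))          # True
```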

Wrapping Up

Sets are a great data type for some fairly specific situations. You will find them most useful for de-duplicating lists or tuples and for finding the differences between multiple lists.
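Both of those uses can be sketched in a few lines. The lists old_users and new_users below are invented purely for illustration:

```python
old_users = ["ann", "bob", "bob", "cara"]
new_users = ["bob", "cara", "dan", "dan"]

# Converting a list to a set drops the duplicates
unique_old = set(old_users)
unique_new = set(new_users)
print(sorted(unique_old))   # ['ann', 'bob', 'cara']

# difference() shows what changed between the two lists
added = unique_new.difference(unique_old)
removed = unique_old.difference(unique_new)
print(sorted(added))        # ['dan']
print(sorted(removed))      # ['ann']
```

Keep in mind that converting to a set loses the original order of the list, so sort the result if the order matters.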

In this article, you learned about the following:

  • Creating a set
  • Accessing set members
  • Changing items
  • Adding items
  • Removing items
  • Deleting a set
  • Set operations

Any time you need to use a set-like operation, you should take a look at this data type. However, in all likelihood, you will be using lists, dictionaries, and tuples much more often.

Related Reading