Python Tuples
Python provides another type that is an ordered collection of objects, called a tuple. Unlike lists, however, tuples are immutable.
Defining and Using Tuples
Tuples are identical to lists in all respects, except for the following properties:
- Tuples are defined by enclosing the elements in parentheses (()) instead of square brackets ([]).
- Tuples are immutable.
Here is a short example showing a tuple definition, indexing, and slicing:
>>> t = ('foo', 'bar', 'baz', 'qux', 'quux', 'corge')
>>> t
('foo', 'bar', 'baz', 'qux', 'quux', 'corge')
>>> t[0]
'foo'
>>> t[-1]
'corge'
>>> t[1::2]
('bar', 'qux', 'corge')
Never fear! Our favorite string and list reversal mechanism works for tuples as well:
>>> t[::-1]
('corge', 'quux', 'qux', 'baz', 'bar', 'foo')
Note: Even though tuples are defined using parentheses, you still index and slice tuples using square brackets, just as for strings and lists.
Everything you’ve learned about lists (they are ordered, they can contain arbitrary objects, they can be indexed and sliced, they can be nested) is true of tuples as well. But they can’t be modified:
>>> t = ('foo', 'bar', 'baz', 'qux', 'quux', 'corge')
>>> t[2] = 'Bark!'
Traceback (most recent call last):
  File "<pyshell#65>", line 1, in <module>
    t[2] = 'Bark!'
TypeError: 'tuple' object does not support item assignment
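Because a tuple can’t be changed in place, "modifying" one really means building a new tuple. The following is a minimal sketch of that idea, not the only approach; the names assignment_failed and t_new are illustrative and do not appear in the text above:

```python
t = ('foo', 'bar', 'baz', 'qux', 'quux', 'corge')

# Item assignment raises TypeError, so catch it to record the failure.
try:
    t[2] = 'Bark!'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new tuple from slices of the old one instead.
# ('Bark!',) is a one-element tuple; note the trailing comma.
t_new = t[:2] + ('Bark!',) + t[3:]

print(assignment_failed)  # True
print(t_new)              # ('foo', 'bar', 'Bark!', 'qux', 'quux', 'corge')
print(t[2])               # baz (the original tuple is unchanged)
```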
Why use a tuple instead of a list?
- Program execution can be slightly faster when manipulating a tuple than it is for the equivalent list. (The difference is unlikely to be noticeable when the list or tuple is small.)
- Sometimes you don’t want data to be modified. If the values in the collection are meant to remain constant for the life of the program, using a tuple instead of a list guards against accidental modification.
- There is another Python data type that you will encounter shortly called a dictionary, which requires as one of its components a value that is of an immutable type. A tuple can be used for this purpose, whereas a list can’t be.
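The last point can be sketched briefly. Dictionaries have not been covered yet, so treat this as a preview under that assumption; the grid table and its keys are hypothetical and exist only for illustration:

```python
# Hypothetical lookup table keyed by (row, column) coordinates.
grid = {(0, 0): 'origin', (1, 2): 'treasure'}
label = grid[(1, 2)]

# A list is mutable and therefore unhashable, so it is rejected as a key.
try:
    grid[[3, 4]] = 'nope'
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(label)             # treasure
print(list_key_allowed)  # False
```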
In a Python REPL session, you can display the values of several objects simultaneously by entering them directly at the >>> prompt, separated by commas:
>>> a = 'foo'
>>> b = 42
>>> a, 3.14159, b
('foo', 3.14159, 42)
Python displays the response in parentheses because it is implicitly interpreting the input as a tuple.
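The same implicit interpretation applies in an assignment, not only at the prompt. A small sketch, where the name packed is an illustrative choice:

```python
a = 'foo'
b = 42

# No parentheses are needed: the commas alone make the right-hand side a tuple.
packed = a, 3.14159, b

print(type(packed))  # <class 'tuple'>
print(packed)        # ('foo', 3.14159, 42)
```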
There is one peculiarity regarding tuple definition that you should be aware of. There is no ambiguity when defining an empty tuple, nor one with two or more elements. Python knows you are defining a tuple:
>>> t = ()
>>> type(t)
<class 'tuple'>
>>>
>>> t = (1, 2)
>>> type(t)
<class 'tuple'>
>>> t = (1, 2, 3, 4, 5)
>>> type(t)
<class 'tuple'>
But look at what happens when you try to define a tuple with one item:
>>> t = (2)
>>> type(t)
<class 'int'>
Doh! Since parentheses are also used to define operator precedence in expressions, Python evaluates the expression (2) as simply the integer 2 and creates an int object. To tell Python that you really want to define a singleton tuple, include a trailing comma (,) just before the closing parenthesis:
>>> t = (2,)
>>> type(t)
<class 'tuple'>
>>> t[0]
2
>>> t[-1]
2
You probably won’t need to define a singleton tuple often, but there has to be a way.
When you display a singleton tuple, Python includes the comma, to remind you that it’s a tuple:
>>> print(t)
(2,)
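A side-by-side comparison may make the difference easier to remember. This is only a sketch, and the variable names are illustrative:

```python
not_a_tuple = (2)   # grouping parentheses around the int 2
singleton = (2,)    # the trailing comma makes a one-element tuple

print(type(not_a_tuple).__name__)  # int
print(type(singleton).__name__)    # tuple
print(len(singleton))              # 1
print(singleton[0] == not_a_tuple) # True
```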
Defining a Set
Python’s built-in set type has the following characteristics:
- Sets are unordered.
- Set elements are unique. Duplicate elements are not allowed.
- A set itself may be modified, but the elements contained in the set must be of an immutable type.
Let’s see what all that means, and how you can work with sets in Python.
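As a quick preview of the third characteristic, here is a hedged sketch. It uses the .add() method, which the text has not introduced yet, and the names tags and points are illustrative. Strictly speaking, Python requires set elements to be hashable, which for the built-in types means immutable ones such as strings, numbers, and tuples of those:

```python
# The set itself is mutable: elements can be added after creation.
tags = set(['python', 'tuple'])
tags.add('set')

# Elements must be hashable. Lists are mutable, so they are rejected.
try:
    bad = set([[1, 2], [3, 4]])
    list_elements_allowed = True
except TypeError:
    list_elements_allowed = False

# Tuples of immutable values are hashable, so they work as elements.
points = set([(1, 2), (3, 4)])

print(sorted(tags))           # ['python', 'set', 'tuple']
print(list_elements_allowed)  # False
print((1, 2) in points)       # True
```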
A set can be created in two ways. First, you can define a set with the built-in set() function:
x = set(<iter>)
In this case, the argument <iter> is an iterable (again, for the moment, think list or tuple) that generates the list of objects to be included in the set. This is analogous to the <iter> argument given to the .extend() list method:
>>> x = set(['foo', 'bar', 'baz', 'foo', 'qux'])
>>> x
{'qux', 'foo', 'bar', 'baz'}
>>> x = set(('foo', 'bar', 'baz', 'foo', 'qux'))
>>> x
{'qux', 'foo', 'bar', 'baz'}
Strings are also iterable, so a string can be passed to set() as well. You have already seen that list(s) generates a list of the characters in the string s. Similarly, set(s) generates a set of the characters in s:
>>> s = 'quux'
>>> list(s)
['q', 'u', 'u', 'x']
>>> set(s)
{'x', 'u', 'q'}
You can see that the resulting sets are unordered: the original order, as specified in the definition, is not necessarily preserved. Additionally, duplicate values are only represented in the set once, as with the string 'foo' in the first two examples and the letter 'u' in the third.
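Because display order is arbitrary, comparing sets or sorting them is a more reliable way to check their contents than reading the printed output. The sketch below also uses the curly-brace form that appears in the later examples in this section; the variable names are illustrative:

```python
from_list = set(['foo', 'bar', 'baz', 'foo', 'qux'])
from_tuple = set(('foo', 'bar', 'baz', 'foo', 'qux'))
from_braces = {'foo', 'bar', 'baz', 'qux'}

# Order and duplicates in the source don't matter; only membership does.
print(from_list == from_tuple == from_braces)  # True
print(len(from_list))                          # 4

# sorted() returns a list with a predictable order for display.
print(sorted(set('quux')))                     # ['q', 'u', 'x']
```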
Operating on a Set
Many of the operations that can be used for Python’s other composite data types don’t make sense for sets. For example, sets can’t be indexed or sliced. However, Python provides a whole host of operations on set objects that generally mimic the operations that are defined for mathematical sets.
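To see what does and doesn’t work, consider this small sketch. The names indexable and ordered are illustrative, and converting with sorted() is just one possible way to get positional access:

```python
x = {'foo', 'bar', 'baz'}

# Indexing a set raises TypeError because sets have no positions.
try:
    x[0]
    indexable = True
except TypeError:
    indexable = False

print(indexable)     # False
print('bar' in x)    # True (membership tests are supported)
print(len(x))        # 3

# Convert to a sorted list when position matters.
ordered = sorted(x)
print(ordered[0])    # bar
```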
Operators vs. Methods
Most, though not quite all, set operations in Python can be performed in two different ways: by operator or by method. Let’s take a look at how these operators and methods work, using set union as an example.
Given two sets, x1 and x2, the union of x1 and x2 is a set consisting of all elements in either set.
Consider these two sets:
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}
The union of x1 and x2 is {'foo', 'bar', 'baz', 'qux', 'quux'}.
Note: The element 'baz', which appears in both x1 and x2, appears only once in the union. Sets never contain duplicate values.
In Python, set union can be performed with the | operator:
>>> x1 = {'foo', 'bar', 'baz'}
>>> x2 = {'baz', 'qux', 'quux'}
>>> x1 | x2
{'baz', 'quux', 'qux', 'bar', 'foo'}
Set union can also be obtained with the .union() method. The method is invoked on one of the sets, and the other is passed as an argument:
>>> x1.union(x2)
{'baz', 'quux', 'qux', 'bar', 'foo'}
The way they are used in the examples above, the operator and method behave identically. But there is a subtle difference between them. When you use the | operator, both operands must be sets. The .union() method, on the other hand, will take any iterable as an argument, convert it to a set, and then perform the union.
Observe the difference between these two statements:
>>> x1 | ('baz', 'qux', 'quux')
Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    x1 | ('baz', 'qux', 'quux')
TypeError: unsupported operand type(s) for |: 'set' and 'tuple'
>>> x1.union(('baz', 'qux', 'quux'))
{'baz', 'quux', 'qux', 'bar', 'foo'}
Both attempt to compute the union of x1 and the tuple ('baz', 'qux', 'quux'). This fails with the | operator but succeeds with the .union() method.
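If you prefer the operator form, one option is to convert the other operand to a set yourself. A brief sketch, with other, via_method, and via_operator as illustrative names:

```python
x1 = {'foo', 'bar', 'baz'}
other = ('baz', 'qux', 'quux')

# The method accepts the tuple directly.
via_method = x1.union(other)

# The operator needs a set on both sides, so convert explicitly.
via_operator = x1 | set(other)

print(via_method == via_operator)  # True
print(sorted(via_operator))        # ['bar', 'baz', 'foo', 'quux', 'qux']
```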
Available Operators and Methods
Below is a list of the set operations available in Python. Some are performed by operator, some by method, and some by both. The principle outlined above generally applies: where a set is expected, methods will typically accept any iterable as an argument, but operators require actual sets as operands.
x1.union(x2[, x3 ...])
x1 | x2 [| x3 ...]
Compute the union of two or more sets.
x1.union(x2) and x1 | x2 both return the set of all elements in either x1 or x2:
>>> x1 = {'foo', 'bar', 'baz'}
>>> x2 = {'baz', 'qux', 'quux'}
>>> x1.union(x2)
{'foo', 'qux', 'quux', 'baz', 'bar'}
>>> x1 | x2
{'foo', 'qux', 'quux', 'baz', 'bar'}
More than two sets may be specified with either the operator or the method:
>>> a = {1, 2, 3, 4}
>>> b = {2, 3, 4, 5}
>>> c = {3, 4, 5, 6}
>>> d = {4, 5, 6, 7}
>>> a.union(b, c, d)
{1, 2, 3, 4, 5, 6, 7}
>>> a | b | c | d
{1, 2, 3, 4, 5, 6, 7}
The resulting set contains all elements that are present in any of the specified sets.
x1.intersection(x2[, x3 ...])
x1 & x2 [& x3 ...]
Compute the intersection of two or more sets.
x1.intersection(x2) and x1 & x2 return the set of elements common to both x1 and x2:
>>> x1 = {'foo', 'bar', 'baz'}
>>> x2 = {'baz', 'qux', 'quux'}
>>> x1.intersection(x2)
{'baz'}
>>> x1 & x2
{'baz'}
You can specify multiple sets with the intersection method and operator, just like you can with set union:
>>> a = {1, 2, 3, 4}
>>> b = {2, 3, 4, 5}
>>> c = {3, 4, 5, 6}
>>> d = {4, 5, 6, 7}
>>> a.intersection(b, c, d)
{4}
>>> a & b & c & d
{4}
The resulting set contains only elements that are present in all of the specified sets.
x1.difference(x2[, x3 ...])
x1 - x2 [- x3 ...]
Compute the difference between two or more sets.
x1.difference(x2) and x1 - x2 return the set of all elements that are in x1 but not in x2:
>>> x1 = {'foo', 'bar', 'baz'}
>>> x2 = {'baz', 'qux', 'quux'}
>>> x1.difference(x2)
{'foo', 'bar'}
>>> x1 - x2
{'foo', 'bar'}
Another way to think of this is that x1.difference(x2) and x1 - x2 return the set that results when any elements in x2 are removed or subtracted from x1.
Once again, you can specify more than two sets:
>>> a = {1, 2, 3, 30, 300}
>>> b = {10, 20, 30, 40}
>>> c = {100, 200, 300, 400}
>>> a.difference(b, c)
{1, 2, 3}
>>> a - b - c
{1, 2, 3}
When multiple sets are specified, the operation is performed from left to right. In the example above, a - b is computed first, resulting in {1, 2, 3, 300}. Then c is subtracted from that set, leaving {1, 2, 3}.
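The evaluation order matters because set difference is not associative. As a sketch, the following compares the default left-to-right grouping with a different one; left_to_right and regrouped are illustrative names:

```python
a = {1, 2, 3, 30, 300}
b = {10, 20, 30, 40}
c = {100, 200, 300, 400}

left_to_right = (a - b) - c   # how a - b - c is evaluated
regrouped = a - (b - c)       # a different grouping gives a different result

print(a - b - c == left_to_right)  # True
print(sorted(left_to_right))       # [1, 2, 3]
print(sorted(regrouped))           # [1, 2, 3, 300]
```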
x1.symmetric_difference(x2)
x1 ^ x2 [^ x3 ...]
Compute the symmetric difference between sets.
x1.symmetric_difference(x2) and x1 ^ x2 return the set of all elements in either x1 or x2, but not both:
>>> x1 = {'foo', 'bar', 'baz'}
>>> x2 = {'baz', 'qux', 'quux'}
>>> x1.symmetric_difference(x2)
{'foo', 'qux', 'quux', 'bar'}
>>> x1 ^ x2
{'foo', 'qux', 'quux', 'bar'}
The ^ operator also allows more than two sets:
>>> a = {1, 2, 3, 4, 5}
>>> b = {10, 2, 3, 4, 50}
>>> c = {1, 50, 100}
>>> a ^ b ^ c
{100, 5, 10}
As with the difference operator, when multiple sets are specified, the operation is performed from left to right.
Curiously, although the ^ operator allows multiple sets, the .symmetric_difference() method doesn’t:
>>> a = {1, 2, 3, 4, 5}
>>> b = {10, 2, 3, 4, 50}
>>> c = {1, 50, 100}
>>> a.symmetric_difference(b, c)
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    a.symmetric_difference(b, c)
TypeError: symmetric_difference() takes exactly one argument (2 given)
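If you want the method form with more than two sets, one possible workaround is to chain the calls or to fold them with functools.reduce() from the standard library. This is a sketch of that workaround rather than a recommended idiom; chained and reduced are illustrative names:

```python
from functools import reduce

a = {1, 2, 3, 4, 5}
b = {10, 2, 3, 4, 50}
c = {1, 50, 100}

# Chain the single-argument method calls, left to right.
chained = a.symmetric_difference(b).symmetric_difference(c)

# Or fold the method over a list of sets.
reduced = reduce(set.symmetric_difference, [a, b, c])

print(sorted(chained))                    # [5, 10, 100]
print(chained == reduced == (a ^ b ^ c))  # True
```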