
Objective

  • Learn about Python sets.
  • Learn how to dynamically manipulate sets.
  • Learn about set math and comparison.
  • Learn about built-in set functions.
  • Learn when to use sets and when not to.

Lesson

Python Sets

Sets are mutable collections, like lists, but they are not sequences. Unlike lists, sets have no append() method and cannot be indexed or sliced, because their elements have no defined position. Despite these limitations, sets have two advantages.

First, a set can contain only unique items: if the same item is supplied two or more times, only one copy is kept. This makes sets a convenient way to get rid of duplicates. (The technical definition: a set object is an unordered collection of distinct hashable objects.)

Second, sets can perform set mathematics, which makes Python sets much like mathematical sets.

To create a set, list its items inside curly braces ({}). To create an empty set you have to use set().

>>> spam = {1, 2, 3}
>>> spam
{1, 2, 3}
>>> eggs = {1, 2, 1, 3, 5, 2, 7, 3, 4}
>>> eggs
{1, 2, 3, 4, 5, 7}    # each object unique
>>> {True, False, True, False, True}
{False, True}
>>> {"hi", "hello", "hey", "hi", "hiya", "sup"}
{'hey', 'sup', 'hi', 'hello', 'hiya'}
>>>
>>> a = {} ; a
{}
>>> isinstance(a,set)
False
>>>
>>> a = set() ; a # to create empty set.
set()
>>> isinstance(a,set)
True
>>>


Note: Empty curly braces ({}) create an empty dictionary, not an empty set, because dictionaries also use curly braces. To create an empty set, use set(), as in spam = set(). An empty set is also displayed as set(), so there is no confusion with dictionaries.
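
The word "hashable" in the technical definition has a practical consequence, sketched below: immutable values such as numbers, strings, and tuples can be set elements, while mutable containers such as lists cannot. The variable names are illustrative only.

```python
# Hashable (immutable) values are accepted as set elements.
points = {(0, 0), (1, 2), (0, 0)}                # tuples are hashable; the duplicate is dropped
nested = {frozenset({1, 2}), frozenset({2, 1})}  # frozenset is the hashable set type

# Unhashable (mutable) values are rejected with TypeError.
try:
    bad = {[1, 2], [3, 4]}
    error = None
except TypeError as exc:
    error = str(exc)

print(len(points))   # 2
print(len(nested))   # 1
print(error)         # the message mentions "unhashable type"
```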

Operations on a single set

Initialize the set:

>>> b = set('alacazam') ; b
{'z', 'l', 'm', 'c', 'a'}
>>> 
>>> b = {'alacazam'} ; b
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan',  'plum', 'peach', 'pecan',  'plum', 'peach'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan', 'plum', 'peach', 'pecan', 'plum', 'peach']
>>> b = set(d) ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>>

Familiar operations:

>>> isinstance(b,set)
True
>>> len(b)
5
>>> 'apple' in b
True
>>> 'grape' in b
False
>>> 
>>> 'grape' not in b
True
>>> 
>>> for x in b : print ( x[0:3] ) # for x in set :
... 
pea
pec
pea
plu
app
>>> 
>>> f = b # Not a copy: f is another name for the same set.
>>> f == b
True
>>> f is b
True
>>> f = set(b) # A shallow copy: a new set holding the same elements.
>>> f == b
True
>>> f is b
False
>>>
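
The difference between plain assignment and copying shows up once one of the sets is changed. The following is a minimal sketch; the names alias, copy1, and copy2 are illustrative, and copy() is the standard set method for making a shallow copy.

```python
b = {'pear', 'plum'}

alias = b            # no copy: both names refer to the same set object
copy1 = set(b)       # new set with the same elements
copy2 = b.copy()     # equivalent way to make a shallow copy

alias.add('apple')   # the change is visible through b as well

print('apple' in b)                # True
print('apple' in copy1)            # False
print(copy2 == {'pear', 'plum'})   # True
```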

Operations available for set:

>>> b = set() ; b.add('alacazam') ; b # add element 'alacazam' to set b.
{'alacazam'}
>>> 
>>> b = {'pear','plum'} ; b
{'pear', 'plum'}
>>>
>>> d = ['apple', 'pear', 'plum', 'peach', 'pecan'] ; d
['apple', 'pear', 'plum', 'peach', 'pecan']
>>> for c in d : b.add(c) ; b
... 
{'pear', 'apple', 'plum'} # 'apple' was added
{'pear', 'apple', 'plum'} # 'pear' was not added.
{'pear', 'apple', 'plum'} # 'plum' was not added.
{'peach', 'pear', 'apple', 'plum'} # 'peach' was added
{'peach', 'pecan', 'pear', 'plum', 'apple'} # 'pecan' was added. ordering not same as list d.
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> b.clear() ; b # remove all elements from set b
set()
>>> 
>>> b = {'peach', 'pecan', 'pear', 'plum', 'apple'} ; b
{'peach', 'pecan', 'pear', 'plum', 'apple'}
>>> a = b.pop() ; a ; b # Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
'peach'
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('grape') ; b # Remove element 'grape' from set b if element is present.
{'pecan', 'pear', 'plum', 'apple'}
>>> 
>>> b.discard('pear') ; b
{'pecan', 'plum', 'apple'}
>>> 
>>> b.remove('apple') ; b # Remove element 'apple' from set b. Raises KeyError if element is not contained in the set.
{'pecan', 'plum'}
>>>
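
The three removal methods differ mainly in how they treat a missing element or an empty set. A small sketch of that behaviour, with illustrative variable names:

```python
b = {'pecan', 'plum'}

b.discard('grape')          # silently does nothing: 'grape' is absent

try:
    b.remove('grape')       # remove() insists that the element exists
    removed = True
except KeyError:
    removed = False

# pop() empties the set one arbitrary element at a time.
popped = []
while b:
    popped.append(b.pop())

try:
    b.pop()                 # popping from an empty set
    pop_failed = False
except KeyError:
    pop_failed = True

print(removed)              # False
print(sorted(popped))       # ['pecan', 'plum']
print(pop_failed)           # True
```

As a rule of thumb, discard() suits cases where the element may or may not be present, and remove() suits cases where its absence would indicate a bug.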

Set comprehensions

Set comprehensions work much like list comprehensions:

>>> {x*x%7   for x in range(-234,79)}
{0, 1, 2, 4}
>>>
>>> a = {x for x in 'abracadabra' if x in 'abcrmgz'} ; a
{'b', 'a', 'c', 'r'}
>>>
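
A set comprehension gives the same result as building a list first and converting it, but it never stores the duplicates. The sketch below compares the two approaches using the 'abracadabra' example; the variable names are illustrative.

```python
word = 'abracadabra'
allowed = 'abcrmgz'

# Set comprehension: duplicates never accumulate.
from_comprehension = {x for x in word if x in allowed}

# Equivalent two-step version: build a list, then convert it.
as_list = [x for x in word if x in allowed]
from_list = set(as_list)

print(len(as_list))                      # 10
print(from_comprehension == from_list)   # True
print(sorted(from_comprehension))        # ['a', 'b', 'c', 'r']
```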

Operations on two sets

set.isdisjoint(other)

Return True if the set has no elements in common with other. Two sets are disjoint if and only if their intersection is the empty set.

>>> set1 = {'pecan', 'pear', 'plum', 'apple'}
>>> set2 = {'pecan', 'pear', 'orange', 'mandarin'}
>>> set3 = {'grape', 'watermelon', 'orange', 'mandarin'}
>>> set1.isdisjoint(set2)
False
>>> set1.isdisjoint(set3)
True
>>> 
>>> {'a', 'b', 'c'}.isdisjoint( {'a', 'd', 'e'} )
False
>>> {'a', 'b', 'c'}.isdisjoint( {'z', 'd', 'e'} )
True
>>>
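
The link between isdisjoint() and the intersection can be checked directly. This sketch reuses set1, set2, and set3 from the session above and relies on the & operator, which is described later in this lesson.

```python
set1 = {'pecan', 'pear', 'plum', 'apple'}
set2 = {'pecan', 'pear', 'orange', 'mandarin'}
set3 = {'grape', 'watermelon', 'orange', 'mandarin'}

for other in (set2, set3):
    # isdisjoint() is True exactly when the intersection is empty.
    assert set1.isdisjoint(other) == (len(set1 & other) == 0)

print(sorted(set1 & set2))            # ['pear', 'pecan']
print(set1 & set3)                    # set()
print(set1.isdisjoint(['grape']))     # True: any iterable is accepted
```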

set.issubset(other)

Test whether every element of the set is in other. Equivalent to set <= other when other is also a set.

>>> {'a', 'b', 'c'}.issubset( {'a', 'b', 'c', 'd'} )
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'c'] ) # this form accepts iterable for 'other'.
True
>>> {'a', 'b', 'c'}.issubset( ['a', 'b', 'd'] )
False
>>> {'a', 'b', 'c'}.issubset( 'abd' )
False
>>> {'a', 'b', 'c'}.issubset( 'abcdef' )
True
>>> 
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c'} # In this form both arguments are sets.
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'c', 'd'}
True
>>> {'a', 'b', 'c'} <= {'a', 'b', 'd'}
False
>>> 
>>> {'a', 'b', 'c'} < {'a', 'b', 'c'}
False
>>> {'a', 'b', 'c'} < {'a', 'b', 'c', 'd'} # set is a proper subset of other
True
>>> {'a', 'b', 'c'} < {'a', 'b', 'd'}
False
>>>
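
The mirror image of issubset() is issuperset(), with the operators >= and >. It is not covered above, so the sketch below is an aside under that assumption. It also shows what happens when the operator form is given a list instead of a set.

```python
small = {'a', 'b', 'c'}
big = {'a', 'b', 'c', 'd'}

print(big.issuperset(small))        # True, same test as small.issubset(big)
print(big >= small, big > small)    # True True

# The method accepts any iterable, but the operator needs a set on both sides.
try:
    small <= ['a', 'b', 'c']
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)                  # False
```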

Symmetric difference

newSet = set.symmetric_difference(other).

Return a new set with elements in either set or other but not both.

>>> newSet = {'a', 'b', 'c', 'g'}.symmetric_difference( {'a', 'b', 'h', 'i'} ) ; newSet
{'i', 'c', 'h', 'g'}
>>> {'a', 'b', 'c', 'g'}.symmetric_difference( 'abcdef' ) # this form accepts iterable for 'other'.
{'e', 'f', 'd', 'g'}
>>> {'a', 'b', 'c', 'g'} ^ {'a', 'b', 'h', 'i'} # In this form both arguments are sets.
{'i', 'c', 'h', 'g'}
>>>
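
The symmetric difference can also be expressed with the union (|), intersection (&), and difference (-) operators described later in this lesson. A short check of the two usual identities, using the same sets as above:

```python
a = {'a', 'b', 'c', 'g'}
b = {'a', 'b', 'h', 'i'}

sym = a ^ b
print(sym == (a | b) - (a & b))   # True: in the union but not in the intersection
print(sym == (a - b) | (b - a))   # True: only in a, or only in b
print(sorted(sym))                # ['c', 'g', 'h', 'i']
```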

Operations on two or more sets

Union


newSet = set.union(*others).

Return a new set with elements from set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.union( {'b', 'h'},'abcz', [1,2,3] ) ; newSet # iterable as argument
{'b', 1, 'z', 'c', 2, 'h', 3, 'g', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | set([1,2,3,4]) ; newSet # operands must be sets.
{'b', 1, 2, 'c', 3, 'h', 4, 'g', 'a'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} |  {'b', 'h'} | [1,2,3,4] ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
>>>
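
Because union() accepts any number of iterables, it can merge a whole list of collections in one call. The sketch below assumes a hypothetical list named groups and starts from an empty set so that the list itself may be empty.

```python
groups = [{'b', 'h'}, 'abcz', [1, 2, 3]]   # a set, a string, and a list

merged = set().union(*groups)   # start from an empty set and add everything

print(len(merged))                                          # 8
print(merged == {'a', 'b', 'c', 'h', 'z', 1, 2, 3})         # True
print(set().union(*[]))                                     # set()
```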

Intersection


newSet = set.intersection(*others).

Return a new set with elements common to set and all others.

>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg' ) ; newSet # iterable as argument
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'}.intersection( {'g', 'b', 'h'},'abczg','p', 'q', 's', 'b', 'g' ) ; newSet
set()
>>> 
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'}  ; newSet # operands must be sets
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & set(('g', 'b', 'z', 'm')) ; newSet
{'b', 'g'}
>>>
>>> newSet = {'a', 'b', 'c', 'g'} & {'g', 'b', 'h'} & ('g', 'b', 'z', 'm') ; newSet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'set' and 'tuple'
>>>
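
The same unpacking idea finds the elements shared by every set in a list. In this sketch baskets is a hypothetical list of sets; calling set.intersection through the class means the first set in the list plays the role of "set" and the rest are "others".

```python
baskets = [
    {'apple', 'pear', 'plum'},
    {'pear', 'plum', 'peach'},
    {'plum', 'pear', 'pecan'},
]

common = set.intersection(*baskets)   # requires at least one set in the list
print(sorted(common))                 # ['pear', 'plum']
```

Unlike the union case, there is no natural starting value here: an empty list of sets would raise TypeError, so the caller has to decide what "common to no sets" should mean.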

Difference


newSet = set.difference(*others).

Return a new set with elements in set that are not in others.

>>> newSet = {'a', 'b', 'c', 'g','q', 'x'}.difference( {'g', 'b', 'h'},'bczg' ) ; newSet # iterable as argument
{'q', 'x', 'a'}
>>> 
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} -  {'g', 'b', 'h'} - set('bczg')  ; newSet # operands are sets
{'a', 'q', 'x'}
>>> newSet = {'a', 'b', 'c', 'g','q', 'x'} - set( {'g', 'b', 'h'} | set('bczg') )  ; newSet
{'q', 'x', 'a'} # same as above
>>> 
>>> {'a', 'q', 'x'} == {'q', 'x', 'a'}
True
>>>
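
Unlike union and intersection, difference depends on the order of its operands. A brief sketch with two illustrative sets:

```python
a = {'a', 'b', 'c'}
b = {'b', 'c', 'd'}

print(sorted(a - b))     # ['a']
print(sorted(b - a))     # ['d']
print(a - b == b - a)    # False
print(a - a)             # set()
```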

Assignments

String str1 contains the names of all 50 states of the United States of America with some duplicates and extraneous white space. Use sets, including set comprehensions, to determine the one letter that does not appear in the name of any state.

str1 = '''                                                                                                                                       
  Indiana , Kentucky , Nebraska , California , Oregon , Washington , Hawaii , Alaska ,                                                           
  Arizona , Utah , Nevada , Idaho , New Mexico , Colorado , Wyoming , Montana , Texas                                                            
 , Oklahoma , Kansas , Nebraska , South Dakota , North D , Louisiana , Ark , Missouri ,                                                          
 Iowa , Illinois , Minnesota , Michigan , Mississippi , Tennessee , Alabama ,                                                                    
Ohio,West V , Virginia , Michigan , Florida , Georgia , S Carolina ,  ,    N C ,                                                                 
      , Maryland , Delaware , New Jersey , New York , Pennsylvania , Vermont , New  Hampshire                                                    
   , Maine , Connecticut ,         , Rhode Island , Massachussetts , Maine , Wisconsin ,                                                         
'''

Why are the letters "N C" sufficient for "North Carolina"?

One Possible Solution
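
One way the assignment might be approached is sketched below. It is not necessarily the solution the lesson originally intended. It assumes str1 holds the string given above (reproduced here without the trailing white space), and the names names, letters_used, missing, and others are illustrative.

```python
import string

str1 = '''
  Indiana , Kentucky , Nebraska , California , Oregon , Washington , Hawaii , Alaska ,
  Arizona , Utah , Nevada , Idaho , New Mexico , Colorado , Wyoming , Montana , Texas
 , Oklahoma , Kansas , Nebraska , South Dakota , North D , Louisiana , Ark , Missouri ,
 Iowa , Illinois , Minnesota , Michigan , Mississippi , Tennessee , Alabama ,
Ohio,West V , Virginia , Michigan , Florida , Georgia , S Carolina ,  ,    N C ,
      , Maryland , Delaware , New Jersey , New York , Pennsylvania , Vermont , New  Hampshire
   , Maine , Connecticut ,         , Rhode Island , Massachussetts , Maine , Wisconsin ,
'''

# Split on commas, strip white space, and drop empty entries.
# Building a set removes the duplicate names.
names = {name.strip() for name in str1.split(',') if name.strip()}

# Collect every letter used in any name, ignoring case.
letters_used = {ch for name in names for ch in name.lower() if ch.isalpha()}

# The missing letter is whatever remains of the alphabet.
missing = set(string.ascii_lowercase) - letters_used
print(missing)   # {'q'}

# "N C" is enough because every letter of "North Carolina"
# already appears in the other names.
others = {ch for name in names - {'N C'} for ch in name.lower() if ch.isalpha()}
print(set('northcarolina') <= others)   # True
```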

References

    1. Python's documentation:

    "Sets", "Set Types — set, frozenset", "Displays for lists, sets and dictionaries", "Set displays"

    2. Python's built-in functions:

    "class set(....)"

    This article is issued from Wikiversity. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.