1 00:00:00,870 --> 00:00:02,700 - [Tutor] Up to this point in the videos, 2 00:00:02,700 --> 00:00:05,270 we've been focusing on a number 3 00:00:05,270 --> 00:00:08,570 of different built-in collections in Python. 4 00:00:08,570 --> 00:00:12,210 We've looked in detail now at lists, tuples, 5 00:00:12,210 --> 00:00:15,230 and in this lesson we've been focusing on dictionaries. 6 00:00:15,230 --> 00:00:18,810 Next up, we'd like to take a look at the set collection, 7 00:00:18,810 --> 00:00:21,440 which is also built into Python. 8 00:00:21,440 --> 00:00:24,910 A set is a collection of unique values, 9 00:00:24,910 --> 00:00:28,700 an unordered collection of unique values specifically, 10 00:00:28,700 --> 00:00:32,020 so you should never depend on the order 11 00:00:32,020 --> 00:00:34,640 in which items appear in a set. 12 00:00:34,640 --> 00:00:38,000 It is mainly used to determine membership, 13 00:00:38,000 --> 00:00:40,180 whether or not something is in a set, 14 00:00:40,180 --> 00:00:43,230 and we're going to show you a number of different operations 15 00:00:43,230 --> 00:00:46,230 that are used along those lines. 16 00:00:46,230 --> 00:00:49,540 So let's go ahead and create our very first set. 17 00:00:49,540 --> 00:00:52,710 In this case, we have a set of strings called colors. 18 00:00:52,710 --> 00:00:56,600 And you'll notice that sets are delimited by curly braces, 19 00:00:56,600 --> 00:00:58,280 just like dictionaries. 20 00:00:58,280 --> 00:01:03,060 However, in a dictionary we have a key, a colon and a value 21 00:01:03,060 --> 00:01:06,240 for each item in the comma-separated list. 22 00:01:06,240 --> 00:01:10,290 Whereas in a set, we just have a key in each item 23 00:01:10,290 --> 00:01:13,670 that we are passing into the curly braces here. 24 00:01:13,670 --> 00:01:15,560 So another thing I wanna point out 25 00:01:15,560 --> 00:01:18,370 is that I don't have all unique values here.
26 00:01:18,370 --> 00:01:21,680 So what happens is, as the set gets created, 27 00:01:21,680 --> 00:01:25,320 each of these values is going to be inserted into the set 28 00:01:25,320 --> 00:01:28,240 and then when we get to the second copy of red, 29 00:01:28,240 --> 00:01:30,650 it will try to insert that into the set, 30 00:01:30,650 --> 00:01:32,320 see that the value is already there 31 00:01:32,320 --> 00:01:35,060 and simply discard the extra value. 32 00:01:35,060 --> 00:01:39,750 So duplicate elimination is one of the key capabilities 33 00:01:39,750 --> 00:01:42,540 of sets: a set ignores the duplicates. 34 00:01:42,540 --> 00:01:45,900 So let's go ahead and evaluate colors 35 00:01:45,900 --> 00:01:47,800 and you'll notice two things about 36 00:01:47,800 --> 00:01:52,800 the resulting curly-bracketed list of keys. 37 00:01:52,920 --> 00:01:55,730 In this case, there's only one key 38 00:01:55,730 --> 00:01:58,580 with the value red, even though we started with two 39 00:01:58,580 --> 00:02:00,880 in the statement that we wrote up above. 40 00:02:00,880 --> 00:02:03,540 And you'll also notice that the values are not 41 00:02:03,540 --> 00:02:06,950 in the same order that we specified 42 00:02:06,950 --> 00:02:09,940 when we created the set in the first place. 43 00:02:09,940 --> 00:02:12,500 So you should never rely on the order 44 00:02:12,500 --> 00:02:17,100 in which the elements are listed inside of a set. 45 00:02:17,100 --> 00:02:22,100 It is, again, mainly used for membership-style testing. 46 00:02:22,100 --> 00:02:24,500 Now you can determine how many elements 47 00:02:24,500 --> 00:02:26,160 are in a set very easily. 48 00:02:26,160 --> 00:02:29,360 So we have that built-in len function once again 49 00:02:29,360 --> 00:02:31,580 and if you pass it a set, 50 00:02:31,580 --> 00:02:34,060 it will count up the number of elements in that set 51 00:02:34,060 --> 00:02:35,770 and give that back to you.
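The statements described so far might look roughly like the following sketch. The transcript only tells us that colors is a set of strings in which red appears twice; the other color names here are assumptions chosen for illustration.

```python
# Assumed color names; only the duplicated 'red' is taken from the lesson.
colors = {'red', 'orange', 'yellow', 'green', 'red', 'blue'}

# The second 'red' is discarded while the set is built.
# The display order is arbitrary and may differ from the order typed above.
print(colors)

# len counts the unique elements that remain.
print(len(colors))  # 5
```

Because the duplicate is dropped at creation time, len reports five elements even though six strings were written inside the braces.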
52 00:02:35,770 --> 00:02:39,540 And you can also very easily check whether a value 53 00:02:39,540 --> 00:02:44,100 is in a set by using the in and not in operators. 54 00:02:44,100 --> 00:02:47,320 So if you want to do a membership test, for example, 55 00:02:47,320 --> 00:02:50,900 if you wanna see if red is in colors, 56 00:02:50,900 --> 00:02:53,260 you can write an expression like this. 57 00:02:53,260 --> 00:02:55,680 You might use that in an if statement, for example, 58 00:02:55,680 --> 00:02:57,930 and in this case, we can see that indeed, 59 00:02:57,930 --> 00:03:02,930 red is in colors, whereas if we do purple in colors, 60 00:03:04,070 --> 00:03:07,810 of course, that's false because purple is not in colors. 61 00:03:07,810 --> 00:03:09,360 And along those same lines, 62 00:03:09,360 --> 00:03:13,690 if I then test not in colors, purple not in colors, 63 00:03:13,690 --> 00:03:16,490 you can see that indeed, that's true. 64 00:03:16,490 --> 00:03:18,610 Now, as you might expect, 65 00:03:18,610 --> 00:03:21,820 you are able to iterate through sets, 66 00:03:21,820 --> 00:03:24,470 just like you can iterate through lists 67 00:03:24,470 --> 00:03:27,340 and tuples, and dictionaries and strings. 68 00:03:27,340 --> 00:03:30,390 So, again, Python's really flexible 69 00:03:30,390 --> 00:03:32,450 in that you can do a lot of the same things 70 00:03:32,450 --> 00:03:34,670 across many different data types. 71 00:03:34,670 --> 00:03:36,840 So sets are iterable 72 00:03:36,840 --> 00:03:38,640 and here's a little for loop 73 00:03:38,640 --> 00:03:40,850 that's going to walk through each color 74 00:03:40,850 --> 00:03:43,480 in the set called colors and for each one, 75 00:03:43,480 --> 00:03:44,920 in this case, we're going to give you 76 00:03:44,920 --> 00:03:47,860 the all-uppercase version of the string, 77 00:03:47,860 --> 00:03:50,320 separated from the next by a space.
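As a rough sketch of the membership tests and the loop just described: the contents of colors, apart from red being present and purple being absent, are assumed for illustration, and the variable names holding the test results are hypothetical.

```python
# Assumed contents; the lesson only confirms 'red' is present and 'purple' is not.
colors = {'red', 'orange', 'yellow', 'green', 'blue'}

red_present = 'red' in colors            # membership test
purple_present = 'purple' in colors      # 'purple' was never added
purple_absent = 'purple' not in colors   # the negated test
print(red_present, purple_present, purple_absent)  # True False True

# Sets are iterable; the order in which elements come out is arbitrary.
for color in colors:
    print(color.upper(), end=' ')
print()
```

The end=' ' argument is what separates each uppercase string from the next by a space rather than a newline.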
78 00:03:50,320 --> 00:03:51,710 So let's go ahead and do that 79 00:03:51,710 --> 00:03:55,380 and you can see we get the same values once again. 80 00:03:55,380 --> 00:03:59,140 However, I would like you to notice a couple of things here. 81 00:03:59,140 --> 00:04:00,960 Number one, they're all uppercase, 82 00:04:00,960 --> 00:04:03,440 because we called upper on each element. 83 00:04:03,440 --> 00:04:06,210 And number two, they're not in the same order 84 00:04:06,210 --> 00:04:08,690 that they were displayed up above either. 85 00:04:08,690 --> 00:04:10,920 So again, don't rely on the order 86 00:04:10,920 --> 00:04:13,140 of the elements within that set; 87 00:04:13,140 --> 00:04:16,020 I can't emphasize that enough. 88 00:04:16,020 --> 00:04:19,500 Now, you can also create sets 89 00:04:19,500 --> 00:04:23,090 using a built-in function called set. 90 00:04:23,090 --> 00:04:25,330 So to demonstrate this concept, 91 00:04:25,330 --> 00:04:28,700 let me first create a list called numbers. 92 00:04:28,700 --> 00:04:31,630 And in this particular case, we have a list 93 00:04:32,860 --> 00:04:35,430 with the elements from zero to nine in it 94 00:04:35,430 --> 00:04:36,600 that we're going to create, 95 00:04:36,600 --> 00:04:38,350 and we're going to concatenate that 96 00:04:38,350 --> 00:04:41,160 with another list containing zero through four. 97 00:04:41,160 --> 00:04:44,090 So there will be some duplicates in this list. 98 00:04:44,090 --> 00:04:47,070 And to show you that, we'll display the list here. 99 00:04:47,070 --> 00:04:49,860 So you see zero through nine and then zero through four. 100 00:04:49,860 --> 00:04:53,130 So zero through four each appear twice 101 00:04:53,130 --> 00:04:55,090 in this particular list. 102 00:04:55,090 --> 00:04:58,410 Now, when you use the set built-in function, 103 00:04:58,410 --> 00:05:01,000 and you give it a sequence of values, 104 00:05:01,000 --> 00:05:03,770 it's going to create a set object.
105 00:05:03,770 --> 00:05:06,340 And as you might expect, 106 00:05:06,340 --> 00:05:10,400 that set object will have only the unique items 107 00:05:10,400 --> 00:05:12,720 from the argument sequence. 108 00:05:12,720 --> 00:05:14,890 So if I do that, you can see we wind up 109 00:05:14,890 --> 00:05:17,930 with a set containing only zero through nine. 110 00:05:17,930 --> 00:05:20,410 And the duplicate values at the end of the list 111 00:05:20,410 --> 00:05:24,320 were simply ignored as the set was created. 112 00:05:24,320 --> 00:05:29,320 Now, you may recall that when I created an empty dictionary, 113 00:05:29,890 --> 00:05:33,090 I did that with an empty pair of curly braces. 114 00:05:33,090 --> 00:05:37,110 Well, that means I can't use an empty pair of curly braces 115 00:05:37,110 --> 00:05:39,670 to define an empty set object. 116 00:05:39,670 --> 00:05:42,070 So if you do need an empty set, 117 00:05:42,070 --> 00:05:44,830 the way you're going to do that is with the set function 118 00:05:44,830 --> 00:05:46,430 with no arguments, 119 00:05:46,430 --> 00:05:50,640 and that is going to be represented in its string format 120 00:05:50,640 --> 00:05:53,740 as the word set followed by empty parentheses. 121 00:05:53,740 --> 00:05:56,770 So sets are a little bit different 122 00:05:56,770 --> 00:06:01,570 in terms of how they are represented in string format 123 00:06:01,570 --> 00:06:04,150 for an empty set versus an empty list, 124 00:06:04,150 --> 00:06:07,980 which would be displayed as square brackets, 125 00:06:07,980 --> 00:06:11,390 an empty tuple, which would be displayed as parentheses, 126 00:06:11,390 --> 00:06:13,560 or an empty dictionary, 127 00:06:13,560 --> 00:06:17,500 which would be displayed as a pair of curly braces. 128 00:06:17,500 --> 00:06:21,610 Now, just for your information, sets are mutable, 129 00:06:21,610 --> 00:06:24,690 meaning you can modify their contents.
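The set function and the empty-set case could be sketched as follows. The list is built the way the lesson describes (zero through nine concatenated with zero through four); the names unique_numbers, empty_set and empty_dict are hypothetical.

```python
# Zero through nine, then zero through four again, so 0-4 appear twice.
numbers = list(range(10)) + list(range(5))
print(numbers)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]

# set() keeps only the unique values from its argument.
unique_numbers = set(numbers)
print(len(unique_numbers))  # 10

# Empty braces create a dictionary, so an empty set needs set().
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)  # dict
print(repr(empty_set))            # set()

# For comparison: an empty list, tuple and dictionary.
print([], (), {})  # [] () {}
```

Displaying an empty set as set() rather than as braces avoids any ambiguity with the empty dictionary.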
130 00:06:24,690 --> 00:06:29,010 The elements within sets, however, have to be immutable, 131 00:06:29,010 --> 00:06:32,170 because they have to also be hashable, 132 00:06:32,170 --> 00:06:35,850 and Python's built-in mutable collections are not hashable. 133 00:06:35,850 --> 00:06:40,390 And the hashability is what enables items to be placed 134 00:06:40,390 --> 00:06:44,960 into a set or a dictionary and allows the data structures 135 00:06:44,960 --> 00:06:47,930 to maintain those unique keys. 136 00:06:47,930 --> 00:06:51,520 Now, it turns out that there is a type 137 00:06:51,520 --> 00:06:55,420 in the Python standard library called frozenset, 138 00:06:55,420 --> 00:06:58,740 which is an immutable representation of a set. 139 00:06:58,740 --> 00:07:03,450 So if in fact you need a frozen set of elements, 140 00:07:03,450 --> 00:07:05,610 meaning that once the set is created, 141 00:07:05,610 --> 00:07:07,520 it can never be modified, 142 00:07:07,520 --> 00:07:09,960 then you can use the frozenset type. 143 00:07:09,960 --> 00:07:13,030 And frozensets, because they are immutable, 144 00:07:13,030 --> 00:07:16,170 and because they also contain only immutable objects, 145 00:07:16,170 --> 00:07:18,170 which they have to for keys, 146 00:07:18,170 --> 00:07:22,430 can then actually be used as keys in other sets, 147 00:07:22,430 --> 00:07:24,893 or in dictionaries as well.
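A minimal sketch of these last points, using only built-in types. The variable names and the sample values are hypothetical and are not taken from the lesson.

```python
# Sets are mutable: elements can be added after creation.
colors = {'red', 'blue'}
colors.add('green')

# A list is mutable and unhashable, so it cannot be a set element.
try:
    colors.add(['a', 'list'])
except TypeError as error:
    message = str(error)
    print(message)

# A frozenset is immutable and hashable, so it can be an element
# of another set or a key in a dictionary.
primary = frozenset(['red', 'blue'])
nested = {primary, frozenset(['green'])}
palettes = {primary: 'two-color palette'}
print(palettes[frozenset(['blue', 'red'])])  # two-color palette

# A frozenset has no add method, so it cannot be modified.
try:
    primary.add('green')
    frozen_is_immutable = False
except AttributeError:
    frozen_is_immutable = True
```

The dictionary lookup succeeds with a frozenset written in a different order because two frozensets with the same elements compare equal and hash equally.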