1 00:00:00,710 --> 00:00:04,190 - In the preceding video, we began to introduce the concept 2 00:00:04,190 --> 00:00:07,800 of list comprehensions in Python. 3 00:00:07,800 --> 00:00:08,633 In this video, 4 00:00:08,633 --> 00:00:11,030 we're going to talk about generator expressions. 5 00:00:11,030 --> 00:00:14,900 Which look almost identical to list comprehensions. 6 00:00:14,900 --> 00:00:17,070 The only difference visually 7 00:00:17,070 --> 00:00:18,860 is that list comprehensions 8 00:00:18,860 --> 00:00:21,000 are defined in square brackets 9 00:00:21,000 --> 00:00:22,700 and generator expressions 10 00:00:22,700 --> 00:00:26,330 are defined in parentheses as you'll see momentarily. 11 00:00:26,330 --> 00:00:29,460 Now there are a couple of other differences 12 00:00:29,460 --> 00:00:33,470 between list comprehensions and generator expressions. 13 00:00:33,470 --> 00:00:37,350 List comprehensions are what we call greedy expressions, 14 00:00:37,350 --> 00:00:39,860 they have to evaluate immediately 15 00:00:39,860 --> 00:00:42,460 because their job is to immediately 16 00:00:42,460 --> 00:00:44,705 produce a list containing values 17 00:00:44,705 --> 00:00:48,000 as specified in the comprehension. 18 00:00:48,000 --> 00:00:49,570 On the other hand there's lots 19 00:00:49,570 --> 00:00:51,910 of functional style programming techniques 20 00:00:51,910 --> 00:00:55,380 that use what's known as lazy evaluation. 21 00:00:55,380 --> 00:00:59,520 Where values get produced on demand as needed 22 00:00:59,520 --> 00:01:02,750 or where values only get produced 23 00:01:02,750 --> 00:01:04,930 once you begin to execute 24 00:01:04,930 --> 00:01:07,150 the code that implements 25 00:01:07,150 --> 00:01:10,230 the functional style programming technique. 26 00:01:10,230 --> 00:01:14,600 Generator expressions are on-demand expressions.
27 00:01:14,600 --> 00:01:17,150 They produce one value at a time, 28 00:01:17,150 --> 00:01:21,500 as that value gets asked for somewhere in your program. 29 00:01:21,500 --> 00:01:23,130 And because of that 30 00:01:23,130 --> 00:01:26,760 they can actually significantly improve performance 31 00:01:26,760 --> 00:01:30,570 and reduce memory consumption in an application. 32 00:01:30,570 --> 00:01:35,540 So for example if I were going to use a list comprehension 33 00:01:35,540 --> 00:01:38,500 to produce a one million element list, 34 00:01:38,500 --> 00:01:41,320 I would have to create that one million element list 35 00:01:41,320 --> 00:01:45,490 immediately and all of that memory would be occupied 36 00:01:45,490 --> 00:01:48,610 to store the actual elements of the list. 37 00:01:48,610 --> 00:01:52,790 On the other hand if I create a generator expression 38 00:01:52,790 --> 00:01:54,620 for one million elements, 39 00:01:54,620 --> 00:01:57,304 I can ask for one element at a time, 40 00:01:57,304 --> 00:01:59,060 do something with that 41 00:01:59,060 --> 00:02:03,370 and then move on and thus tremendously reduce 42 00:02:03,370 --> 00:02:06,861 the amount of memory that's required for processing 43 00:02:06,861 --> 00:02:09,180 those one million elements. 44 00:02:09,180 --> 00:02:12,650 In fact I could have a billion element generator 45 00:02:12,650 --> 00:02:16,080 and maybe my program only asks for ten elements, 46 00:02:16,080 --> 00:02:19,750 so I never wind up processing most of the elements 47 00:02:19,750 --> 00:02:22,754 that the generator is capable of producing 48 00:02:22,754 --> 00:02:25,212 and in the case of a list comprehension, 49 00:02:25,212 --> 00:02:28,200 I would have to produce all one billion elements 50 00:02:28,200 --> 00:02:32,940 in advance before I could then use that list of elements 51 00:02:32,940 --> 00:02:35,210 and do something with it in the program.
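The memory point above can be sketched roughly as follows. This is an illustration, not code from the video: the squaring expression, the variable names, and the use of `sys.getsizeof` and `itertools.islice` are assumptions chosen for the example. Note that `sys.getsizeof` on a list reports only the list object's own storage, not the integers it refers to, so the real gap is larger than what is measured here.

```python
import sys
from itertools import islice

# Hypothetical size, matching the "one million element" example in the narration.
n = 1_000_000

# Greedy: the whole list is built immediately.
squares_list = [x * x for x in range(n)]

# Lazy: no values are computed until something asks for them.
squares_gen = (x * x for x in range(n))

list_bytes = sys.getsizeof(squares_list)  # storage for one million references
gen_bytes = sys.getsizeof(squares_gen)    # small, fixed-size generator object
print(list_bytes > gen_bytes)

# A generator capable of producing a billion values, of which only ten are requested.
billion_gen = (x * x for x in range(1_000_000_000))
first_ten = list(islice(billion_gen, 10))
print(first_ten)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Only the ten requested values are ever computed; the equivalent list comprehension would have to build all one billion elements first.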
52 00:02:35,210 --> 00:02:38,940 So generator expressions can be a really handy 53 00:02:38,940 --> 00:02:41,930 technique and they're just one of several different 54 00:02:41,930 --> 00:02:46,220 techniques that are capable of producing values on demand 55 00:02:46,220 --> 00:02:49,090 in the Python programming language. 56 00:02:49,090 --> 00:02:50,240 So let's go ahead 57 00:02:50,240 --> 00:02:53,880 and take a look at a basic generator expression, 58 00:02:53,880 --> 00:02:56,940 so I'm gonna switch back over to my terminal window here. 59 00:02:56,940 --> 00:02:59,670 And for the purpose of this demonstration 60 00:02:59,670 --> 00:03:01,670 we'll define a list called numbers 61 00:03:01,670 --> 00:03:06,085 with a bunch of integers in a kind of random order here. 62 00:03:06,085 --> 00:03:10,070 Some of them of course are positive and some are negative. 63 00:03:10,070 --> 00:03:12,400 And we're going to create a little for loop here 64 00:03:12,400 --> 00:03:16,070 that's going to use a generator expression to pick off 65 00:03:16,070 --> 00:03:19,000 the odd values and just display them. 66 00:03:19,000 --> 00:03:21,460 Now, I'll talk more about generator expressions 67 00:03:21,460 --> 00:03:22,550 after this as well. 68 00:03:22,550 --> 00:03:26,390 So this expression right here looks very much 69 00:03:26,390 --> 00:03:28,740 like the list comprehensions we were doing 70 00:03:28,740 --> 00:03:30,720 in the preceding video but notice 71 00:03:30,720 --> 00:03:34,270 it has parentheses around it, rather than square brackets. 72 00:03:34,270 --> 00:03:37,110 If I put square brackets here, this, 73 00:03:37,110 --> 00:03:38,800 it would be a list comprehension 74 00:03:38,800 --> 00:03:42,910 and it would have to first evaluate in its entirety 75 00:03:42,910 --> 00:03:45,180 to create the entire list.
76 00:03:45,180 --> 00:03:48,190 Only then would the for loop be able 77 00:03:48,190 --> 00:03:51,060 to walk through all the elements within that list 78 00:03:51,060 --> 00:03:52,770 and then do whatever the for loop 79 00:03:52,770 --> 00:03:55,130 says to do in the body here. 80 00:03:55,130 --> 00:03:57,264 Now, if it's a generator expression 81 00:03:57,264 --> 00:04:00,760 the for loop can basically ask the expression 82 00:04:00,760 --> 00:04:03,880 to give me the next item in the sequence 83 00:04:03,880 --> 00:04:06,040 during each iteration of the loop 84 00:04:06,040 --> 00:04:08,660 and then print out the corresponding value. 85 00:04:08,660 --> 00:04:09,700 So, here you can see 86 00:04:09,700 --> 00:04:13,180 we're getting the square of X for each value of X 87 00:04:13,180 --> 00:04:14,970 in the numbers list 88 00:04:14,970 --> 00:04:18,120 where X divided by 2 89 00:04:18,120 --> 00:04:20,720 gives a remainder that's not equal to 0. 90 00:04:20,720 --> 00:04:23,630 So basically, if it's equal to 0, it's even. 91 00:04:23,630 --> 00:04:25,493 If it's not equal to 0 then it's odd 92 00:04:25,493 --> 00:04:29,210 and so this is going to pick off the odd values 93 00:04:29,210 --> 00:04:34,210 in that list and only give me the squares of the odd values. 94 00:04:34,340 --> 00:04:36,290 So, let's go ahead and execute that loop 95 00:04:36,290 --> 00:04:40,590 and we can see 10 was ignored but 3 was squared. 96 00:04:40,590 --> 00:04:45,070 7 was squared to give us 49, 1 squared still gives us 1, 97 00:04:45,070 --> 00:04:49,071 9 squared gives us 81, 4 was ignored, 2 was ignored, 98 00:04:49,071 --> 00:04:54,071 8 was ignored, 5 squared gave us 25 and 6 was ignored.
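The loop described above might look something like the sketch below. The on-screen code is not part of this transcript, so the contents of `numbers` are reconstructed from the order in which the results are narrated (10, 3, 7, 1, 9, 4, 2, 8, 5, 6, with signs assumed positive), and the names `value` and `squares` are assumptions for illustration.

```python
# Reconstructed from the narration; the exact list shown in the video is an assumption.
numbers = [10, 3, 7, 1, 9, 4, 2, 8, 5, 6]

squares = []  # kept only so the results can be inspected afterwards

# The parenthesized expression is a generator expression: the for loop
# pulls one squared odd value from it per iteration.
for value in (x ** 2 for x in numbers if x % 2 != 0):
    squares.append(value)
    print(value, end="  ")
print()
# prints: 9  49  1  81  25
```

Replacing the parentheses with square brackets would give the same output, but the full list of squares would be built before the loop started.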
99 00:04:54,100 --> 00:04:56,830 So we filtered out half of the elements 100 00:04:56,830 --> 00:04:58,700 in this particular example 101 00:04:58,700 --> 00:05:01,770 and produced the squares of the other half of the elements 102 00:05:01,770 --> 00:05:03,710 and in terms of performance here 103 00:05:03,710 --> 00:05:04,730 because we're only dealing 104 00:05:04,730 --> 00:05:06,660 with ten elements in the first place, 105 00:05:06,660 --> 00:05:09,480 there's no major difference in performance 106 00:05:09,480 --> 00:05:12,210 but if we were dealing with millions of elements 107 00:05:12,210 --> 00:05:15,550 then you would have a massive difference in performance 108 00:05:15,550 --> 00:05:18,290 between creating a list of that size, 109 00:05:18,290 --> 00:05:19,580 the memory that it takes 110 00:05:19,580 --> 00:05:23,260 versus using a generator expression. 111 00:05:23,260 --> 00:05:26,280 Now, just to show you that a generator expression 112 00:05:26,280 --> 00:05:28,890 is not actually creating a list, 113 00:05:28,890 --> 00:05:31,000 I'm going to take this same expression 114 00:05:31,000 --> 00:05:32,780 and assign it to a variable. 115 00:05:32,780 --> 00:05:35,150 I'll call it square of odds, here. 116 00:05:35,150 --> 00:05:37,800 And if I now go and evaluate 117 00:05:37,800 --> 00:05:39,653 square of odds, 118 00:05:40,490 --> 00:05:42,300 you can see that it tells me this thing 119 00:05:42,300 --> 00:05:44,714 is actually a generator object, 120 00:05:44,714 --> 00:05:47,460 based on a generator expression. 121 00:05:47,460 --> 00:05:49,430 So this is just a notation 122 00:05:49,430 --> 00:05:51,730 that tells you that this thing 123 00:05:51,730 --> 00:05:54,493 is in fact an object in memory.
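A minimal sketch of that last step, assuming the variable is spelled `squares_of_odds` (the narration only says "square of odds") and reusing the reconstructed `numbers` list. The calls to `next` and `list` go slightly beyond what is shown in the video; they are included only to illustrate that values are produced on request.

```python
# Reconstructed list; the exact values shown in the video are an assumption.
numbers = [10, 3, 7, 1, 9, 4, 2, 8, 5, 6]

# Assigning the expression creates a generator object; no squares are computed yet.
squares_of_odds = (x ** 2 for x in numbers if x % 2 != 0)

# Displays something like: <generator object <genexpr> at 0x...>
print(squares_of_odds)

kind = type(squares_of_odds).__name__
print(kind)  # generator

first = next(squares_of_odds)    # computes just the first value: 9
rest = list(squares_of_odds)     # consumes the remaining values: [49, 1, 81, 25]
leftover = list(squares_of_odds) # the generator is now exhausted: []
```

Unlike a list, a generator object can be iterated only once; to walk the values again, the generator expression has to be created again.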