1 00:00:00,000 --> 00:00:03,012 Throughout my experience working with large data 2 00:00:03,013 --> 00:00:06,762 sets, I've noticed some techniques that data collecting 3 00:00:06,763 --> 00:00:10,836 institutes use to save storage and make 4 00:00:10,837 --> 00:00:12,804 their data take up less space 5 00:00:12,805 --> 00:00:15,338 on computer disks. A common technique 6 00:00:15,339 --> 00:00:17,226 is not to use decimals. 7 00:00:17,227 --> 00:00:19,066 For example, if you search for temperature 8 00:00:19,067 --> 00:00:21,674 data, you'll not see the actual temperatures. 9 00:00:21,675 --> 00:00:23,034 Let's say these are Celsius. 10 00:00:23,035 --> 00:00:25,164 You'll not see them like that, but 11 00:00:25,165 --> 00:00:27,292 you'll see them without the dots because that 12 00:00:27,293 --> 00:00:29,890 will save storage on the servers. 13 00:00:29,891 --> 00:00:33,628 So one character less, that means some bytes less. 14 00:00:33,629 --> 00:00:38,140 So your duty as a programmer is to actually divide all 15 00:00:38,141 --> 00:00:42,112 those values by 10, so that you get the actual data. 16 00:00:42,113 --> 00:00:45,648 So let's say these data were in some text files, and 17 00:00:45,649 --> 00:00:49,222 we open these data with Python, and the course actually covers 18 00:00:49,223 --> 00:00:52,486 how to manage, how to open text files in Python. 19 00:00:52,487 --> 00:00:54,580 But let us skip that part for now. 20 00:00:54,581 --> 00:00:57,460 So let's say we got them loaded as a list here. 21 00:00:57,461 --> 00:01:01,076 Now we want to actually divide each of them by 10. 22 00:01:01,077 --> 00:01:02,950 So how do you do that? 23 00:01:02,951 --> 00:01:07,278 Well, a technique is to iterate over temps. 24 00:01:07,279 --> 00:01:12,950 So for temp in temps, and in each iteration, 25 00:01:12,951 --> 00:01:16,668 you want to do temp divided by ten. 26 00:01:16,669 --> 00:01:20,908 But where do we store this temp divided by ten?
27 00:01:20,909 --> 00:01:23,530 Well, we can store it in a new list. 28 00:01:23,531 --> 00:01:25,110 Let's say new_temps. 29 00:01:26,010 --> 00:01:28,080 That is an empty list for now. 30 00:01:28,081 --> 00:01:31,515 So the loop will iterate through this list, and in each iteration, 31 00:01:31,516 --> 00:01:33,700 [Author typing] 32 00:01:33,701 --> 00:01:40,530 we're going to append this value to new_temps. 33 00:01:40,531 --> 00:01:46,452 So append this value. That looks okay, 34 00:01:46,453 --> 00:01:49,970 and then print(new_temps). 35 00:01:49,971 --> 00:01:52,240 Let me go ahead and execute that. 36 00:01:52,241 --> 00:01:54,566 [Author typing] 37 00:01:54,566 --> 00:01:57,086 So this is the desired output. 38 00:01:57,087 --> 00:01:59,368 Now, that is perfectly correct. 39 00:01:59,369 --> 00:02:02,152 But there is a neater way to do 40 00:02:02,153 --> 00:02:05,064 this in just one line of Python code, 41 00:02:05,065 --> 00:02:07,954 and that is by using list comprehensions. 42 00:02:07,955 --> 00:02:09,639 It goes like this. 43 00:02:10,810 --> 00:02:13,004 You don't have to create an empty list 44 00:02:13,005 --> 00:02:16,130 because a list will be generated dynamically. 45 00:02:16,131 --> 00:02:23,334 You'd say new_temps = [temp / 10 for temp in temps], 46 00:02:24,833 --> 00:02:28,624 print(new_temps) to see what we get, 47 00:02:28,625 --> 00:02:31,950 execute, and you get exactly the same output. 48 00:02:33,170 --> 00:02:35,898 So that is a list comprehension. 49 00:02:35,899 --> 00:02:39,332 It's a way to build lists without having to 50 00:02:39,333 --> 00:02:42,932 create a for loop, a standard for loop, because 51 00:02:42,933 --> 00:02:46,440 we actually have an inline for loop in here. 52 00:02:46,441 --> 00:02:49,208 So what's going on here is that we have an 53 00:02:49,209 --> 00:02:52,520 iteration here, and in each iteration we're going to store 54 00:02:52,521 --> 00:02:56,428 this value in the list, in the new_temps list.
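As a rough sketch, the loop-and-append approach described above might look like this. The sample values in temps are hypothetical, since the transcript does not show the actual numbers on screen.

```python
# Hypothetical readings stored without the decimal point
# (tenths of a degree Celsius, so 221 stands for 22.1).
temps = [221, 234, 340, 230]

new_temps = []  # start with an empty list to collect the results

for temp in temps:
    # Divide by 10 to restore the decimal, then store the result.
    new_temps.append(temp / 10)

print(new_temps)
```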
55 00:02:56,429 --> 00:03:01,292 So this is like saying store temp divided by ten, 56 00:03:01,293 --> 00:03:04,594 but Python will say, okay, but what is temp? 57 00:03:04,595 --> 00:03:06,490 It's a new variable. 58 00:03:06,491 --> 00:03:08,258 We haven't defined it anywhere. 59 00:03:08,259 --> 00:03:11,794 And then we say, well, for temp in temps. 60 00:03:11,795 --> 00:03:13,970 So temp is each item of temps. 61 00:03:13,971 --> 00:03:18,076 So that's how you create a list with these 62 00:03:18,077 --> 00:03:21,470 values for each of the values of temps. 63 00:03:22,370 --> 00:03:24,270 That is a list comprehension. 64 00:03:24,271 --> 00:03:27,766 [Outro sound]
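For reference, here is a minimal sketch of the list comprehension walked through above. The values in temps are made up for illustration and are not taken from the video.

```python
# Hypothetical readings stored without the decimal point.
temps = [221, 234, 340, 230]

# The expression comes first (temp / 10), then the inline for loop
# that says where temp comes from (each item of temps).
new_temps = [temp / 10 for temp in temps]

print(new_temps)  # prints [22.1, 23.4, 34.0, 23.0]
```

Read left to right, the comprehension says what to store first and only then names the variable being iterated, which is why temp can appear before it is introduced by the for clause.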