1 00:00:00,720 --> 00:00:03,130 - In this Intro to Data Science section, we're going 2 00:00:03,130 --> 00:00:06,380 to talk about dynamic visualization. 3 00:00:06,380 --> 00:00:10,050 We'll do that over the course of a couple of videos here. 4 00:00:10,050 --> 00:00:13,050 You may recall from the preceding Intro to Data Science 5 00:00:13,050 --> 00:00:17,320 section where we first introduced static visualizations 6 00:00:17,320 --> 00:00:20,010 that we did a die rolling simulation. 7 00:00:20,010 --> 00:00:23,100 We rolled a six sided die some number of times, 8 00:00:23,100 --> 00:00:26,600 and we talked about the fact that if it's truly random 9 00:00:26,600 --> 00:00:29,800 each of those six values has an equal probability 10 00:00:29,800 --> 00:00:33,650 or likelihood of being chosen every time we pick 11 00:00:33,650 --> 00:00:35,500 a random number. 12 00:00:35,500 --> 00:00:38,610 Every die face on a six sided die has approximately 13 00:00:38,610 --> 00:00:42,180 a one in six chance of being chosen on a given 14 00:00:42,180 --> 00:00:47,180 random roll, and that equates to approximately 16.667%. 15 00:00:49,290 --> 00:00:52,600 You may recall that we ran that static visualization 16 00:00:52,600 --> 00:00:53,560 several times. 17 00:00:53,560 --> 00:00:57,288 First we started with only 600 die rolls, and we saw 18 00:00:57,288 --> 00:01:01,900 that yes, all of the die faces came out close 19 00:01:01,900 --> 00:01:06,900 to 100 each, which is what we would expect if we have 600 20 00:01:06,920 --> 00:01:08,080 total die rolls. 21 00:01:08,080 --> 00:01:10,530 But they weren't so close that the bars looked 22 00:01:10,530 --> 00:01:12,230 to be about the same height. 23 00:01:12,230 --> 00:01:16,150 Then we ran the example again, and we did 60000 rolls. 24 00:01:16,150 --> 00:01:20,520 Now the bar heights were getting closer to being even.
25 00:01:20,520 --> 00:01:23,510 Then we took it a step further and did six million rolls, 26 00:01:23,510 --> 00:01:28,250 and we got nearly one million of each of the different faces. 27 00:01:28,250 --> 00:01:32,100 Those static visualizations were helpful for starting 28 00:01:32,100 --> 00:01:35,560 to understand the law of large numbers 29 00:01:35,560 --> 00:01:39,600 which shows that as we roll those dice more and more 30 00:01:39,600 --> 00:01:43,920 and more that we narrow in on, or close in on, 31 00:01:43,920 --> 00:01:48,920 the 16.667% we expect for each of those die faces. 32 00:01:50,310 --> 00:01:53,800 As helpful as that was, to really get a sense 33 00:01:53,800 --> 00:01:57,050 of the law of large numbers in action, it's helpful 34 00:01:57,050 --> 00:02:01,210 to do it with an animated visualization. 35 00:02:01,210 --> 00:02:05,960 A dynamic visualization that changes as we roll the dice. 36 00:02:05,960 --> 00:02:09,842 That's what we're going to focus on in this example. 37 00:02:09,842 --> 00:02:13,902 In our static visualizations, we ran the simulation first 38 00:02:13,902 --> 00:02:16,440 then displayed the results. 39 00:02:16,440 --> 00:02:20,000 With dynamic visualizations, we're going to take advantage 40 00:02:20,000 --> 00:02:24,660 of a capability from the Matplotlib visualization library 41 00:02:24,660 --> 00:02:29,140 called a FuncAnimation in which we use a function 42 00:02:29,140 --> 00:02:33,030 to drive an animation that updates the visualization 43 00:02:33,030 --> 00:02:38,030 dynamically every period of time in our particular case. 44 00:02:38,960 --> 00:02:41,660 As you'll see momentarily, every approximately 45 00:02:41,660 --> 00:02:45,670 33 milliseconds we're going to update what is presented 46 00:02:45,670 --> 00:02:48,940 on screen, so the bars will be jumping and dancing 47 00:02:48,940 --> 00:02:52,420 and you'll really get a sense of the law of large numbers 48 00:02:52,420 --> 00:02:53,689 in action.
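As a rough illustration of the law of large numbers recalled above, here is a minimal sketch that rolls a simulated six-sided die and reports each face's share of the rolls. It is not the code from the earlier section: the function name face_percentages and its seed parameter are assumptions made for this example, and it uses only Python's standard library rather than any plotting.

```python
import random
from collections import Counter


def face_percentages(number_of_rolls, seed=None):
    """Return {face: percentage of all rolls} for a simulated six-sided die."""
    rng = random.Random(seed)  # hypothetical seed parameter, for repeatable runs
    counts = Counter(rng.randrange(1, 7) for _ in range(number_of_rolls))
    return {face: 100 * counts[face] / number_of_rolls for face in range(1, 7)}


if __name__ == "__main__":
    expected = 100 / 6  # about 16.667%
    # Six million rolls also works here, but takes noticeably longer.
    for number_of_rolls in (600, 60_000):
        percentages = face_percentages(number_of_rolls)
        worst = max(abs(value - expected) for value in percentages.values())
        print(f"{number_of_rolls:>9,} rolls: largest gap from 16.667% "
              f"is {worst:.3f} percentage points")
```

With more rolls, the largest gap from 16.667% should generally shrink, which is the effect the bar heights showed in the static visualizations.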
49 00:02:53,689 --> 00:02:57,100 The way FuncAnimation works is it drives 50 00:02:57,100 --> 00:03:00,080 an animation one frame at a time, what we call 51 00:03:00,080 --> 00:03:02,880 a frame-by-frame animation. 52 00:03:02,880 --> 00:03:06,048 Every animation frame is going to specify everything 53 00:03:06,048 --> 00:03:10,470 about the diagram that needs to be updated during 54 00:03:10,470 --> 00:03:13,420 one individual update. 55 00:03:13,420 --> 00:03:16,310 By stringing a whole bunch of these together, 56 00:03:16,310 --> 00:03:20,355 we will get that animated effect that we're looking for. 57 00:03:20,355 --> 00:03:24,170 In this example, every animation frame we create 58 00:03:24,170 --> 00:03:28,000 is going to roll the dice a specified number of times. 59 00:03:28,000 --> 00:03:31,670 As you'll see momentarily, we've created this as a script 60 00:03:31,670 --> 00:03:35,010 where you can specify arguments to the script 61 00:03:35,010 --> 00:03:39,075 that indicate how many dice to roll and how many 62 00:03:39,075 --> 00:03:42,600 animation frames to perform. 63 00:03:42,600 --> 00:03:45,140 On each frame, we're going to clear 64 00:03:45,140 --> 00:03:48,130 the current visualization that we had already displayed, 65 00:03:48,130 --> 00:03:52,410 and create an entirely new visualization with all 66 00:03:52,410 --> 00:03:55,613 of the updated die frequency information. 67 00:03:56,452 --> 00:04:00,138 The more frames-per-second you can display, 68 00:04:00,138 --> 00:04:04,180 the smoother your animation is going to appear. 69 00:04:04,180 --> 00:04:08,060 For example, video games with fast-moving elements in them 70 00:04:08,060 --> 00:04:11,530 generally are going to try to aim for at least 30 71 00:04:11,530 --> 00:04:13,020 frames-per-second.
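The frame-by-frame idea described above might look roughly like the following sketch. This is not the script from the lecture: the update function, its parameters, and the FACES list are names assumed for illustration, and the chart is drawn with plain Matplotlib bars rather than Seaborn. Each call to update rolls the die, clears the axes, and redraws the whole bar chart.

```python
import random

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

FACES = [1, 2, 3, 4, 5, 6]


def update(frame_number, rolls_per_frame, frequencies, axes):
    """Roll the die rolls_per_frame times, then redraw the entire bar chart."""
    for _ in range(rolls_per_frame):
        frequencies[random.randrange(1, 7) - 1] += 1

    axes.clear()  # discard the bars and text drawn for the previous frame
    axes.bar(FACES, frequencies)
    axes.set_title(f"Die frequencies for {sum(frequencies):,} rolls")
    axes.set_xlabel("Die value")
    axes.set_ylabel("Frequency")


if __name__ == "__main__":
    figure, axes = plt.subplots()
    frequencies = [0] * 6  # one running count per die face
    # Keep a reference to the animation object so it is not garbage collected.
    animation = FuncAnimation(
        figure, update, frames=600, fargs=(1, frequencies, axes),
        interval=33, repeat=False)
    plt.show()
```

FuncAnimation passes the frame number as the first argument and the fargs values after it. The interval of 33 milliseconds asks for about 1000 / 33, or roughly 30, frames per second, though heavy work inside update can make the real rate lower.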
72 00:04:13,020 --> 00:04:17,010 Nowadays, it's not uncommon to have 60 frames-per-second 73 00:04:17,010 --> 00:04:22,010 or higher for some of these gaming devices 74 00:04:22,060 --> 00:04:24,760 that are now available to us. 75 00:04:24,760 --> 00:04:27,220 The amount of work, however, that you do in each 76 00:04:27,220 --> 00:04:30,640 of the animation frames in a FuncAnimation can affect 77 00:04:30,640 --> 00:04:34,469 your animation speed, so we do want to try to minimize 78 00:04:34,469 --> 00:04:37,490 the amount of work we do in each frame to keep 79 00:04:37,490 --> 00:04:40,290 that smooth animation going. 80 00:04:40,290 --> 00:04:42,750 In this example, I mentioned a moment ago, we're going 81 00:04:42,750 --> 00:04:45,830 to use approximately 33 milliseconds between 82 00:04:45,830 --> 00:04:48,450 our animation frames, which will give us 83 00:04:48,450 --> 00:04:51,020 about 30 frames-per-second. 84 00:04:51,020 --> 00:04:54,250 If you divide a thousand milliseconds by 33 milliseconds, 85 00:04:54,250 --> 00:04:59,050 that gives you about 30 frames-per-second being displayed. 86 00:04:59,050 --> 00:05:02,420 Let's actually demonstrate the final result, 87 00:05:02,420 --> 00:05:05,130 and then in the next video, we'll start talking about 88 00:05:05,130 --> 00:05:09,720 the code that enabled us to build the dynamic visualization. 89 00:05:09,720 --> 00:05:12,850 Let me switch over to a terminal window here. 90 00:05:12,850 --> 00:05:17,850 You can see I've listed out the contents of my ch06 folder, 91 00:05:18,020 --> 00:05:20,580 which contains all of the code for the lesson 92 00:05:20,580 --> 00:05:22,370 we're currently talking about. 93 00:05:22,370 --> 00:05:25,996 The script that we're going to look at starting in the next 94 00:05:25,996 --> 00:05:28,940 video is defined in the RollDieDynamic.py file.
95 00:05:31,320 --> 00:05:34,820 If I would like to run this file, I'm going to go ahead 96 00:05:34,820 --> 00:05:38,520 and type ipython followed by the name of the file. 97 00:05:38,520 --> 00:05:42,590 This particular example also demonstrates for the first time 98 00:05:42,590 --> 00:05:47,040 how to process command line arguments in Python. 99 00:05:47,040 --> 00:05:50,110 We're going to allow this script to receive two 100 00:05:50,110 --> 00:05:51,760 command line arguments. 101 00:05:51,760 --> 00:05:54,650 The first one is going to represent the number 102 00:05:54,650 --> 00:05:58,341 of animation frames we would like to perform in total. 103 00:05:58,341 --> 00:06:01,651 This is also going to control the overall duration 104 00:06:01,651 --> 00:06:03,390 of my animation. 105 00:06:03,390 --> 00:06:06,651 Let's say I wanted to do 6000 animation frames. 106 00:06:06,651 --> 00:06:09,890 The second command line argument is going to represent 107 00:06:09,890 --> 00:06:14,260 the number of dice to roll in each animation frame. 108 00:06:14,260 --> 00:06:18,100 To start out, I'm going to do 6000 die... 109 00:06:18,100 --> 00:06:22,070 6000 animation frames, one die roll at a time. 110 00:06:22,070 --> 00:06:25,980 That's going to enable us to really see the bars changing 111 00:06:25,980 --> 00:06:29,660 on a die-by-die or a roll-by-roll basis. 112 00:06:29,660 --> 00:06:32,870 Let me go ahead and run that and get that on the screen here 113 00:06:32,870 --> 00:06:36,655 and we'll get a chance to see it in action. 114 00:06:36,655 --> 00:06:39,660 After I let this run for a little bit here, 115 00:06:39,660 --> 00:06:43,010 we'll kill it off and start it up again rolling 116 00:06:43,010 --> 00:06:45,900 more dice at a time-per-frame. 117 00:06:45,900 --> 00:06:50,170 As you can see here, it's pretty slow in terms of updating, 118 00:06:50,170 --> 00:06:52,720 but there's a couple of things that you can see 119 00:06:52,720 --> 00:06:53,590 going on here.
120 00:06:53,590 --> 00:06:57,880 First of all, the dynamic visualization. 121 00:06:57,880 --> 00:07:00,602 So you do see the bars and the text changing 122 00:07:00,602 --> 00:07:02,420 as we roll the dice. 123 00:07:02,420 --> 00:07:05,940 You can see how many die rolls there are at a given time here. 124 00:07:05,940 --> 00:07:09,870 Also, notice that Matplotlib and Seaborn, 125 00:07:09,870 --> 00:07:12,450 the two libraries that we're using for the visualization 126 00:07:12,450 --> 00:07:15,356 here, are capable of dynamically adjusting 127 00:07:15,356 --> 00:07:19,440 the y-axis here to represent the magnitudes 128 00:07:19,440 --> 00:07:21,249 of the bars that we have. 129 00:07:21,249 --> 00:07:25,200 The longest bar is affecting the total height 130 00:07:25,200 --> 00:07:27,330 of the diagram at any given time. 131 00:07:27,330 --> 00:07:31,810 Right now, we have more fives than we have of anything else. 132 00:07:31,810 --> 00:07:35,080 Over 20% of our rolls so far have been fives. 133 00:07:35,080 --> 00:07:37,230 Now it's decreasing a little bit. 134 00:07:37,230 --> 00:07:40,542 But you can see the bar heights are bouncing around, 135 00:07:40,542 --> 00:07:45,180 the number of each face is changing dynamically. 136 00:07:45,180 --> 00:07:47,130 I'm going to go ahead and kill this off here 137 00:07:47,130 --> 00:07:49,920 because it's kind of slow when we're only rolling 138 00:07:49,920 --> 00:07:53,260 one at a time, so we'd have to wait a pretty long time 139 00:07:53,260 --> 00:07:57,170 before those bars actually balance out to demonstrate 140 00:07:57,170 --> 00:07:59,897 that law of large numbers in action.
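The run just shown passed two command line arguments to the script: the number of animation frames first, then the number of die rolls per frame, along the lines of ipython RollDieDynamic.py 6000 1. One possible way to read such arguments is sketched below using sys.argv from the standard library. The parse_arguments helper and its usage message are hypothetical names for this example, and the real script may process its arguments differently.

```python
import sys


def parse_arguments(argv):
    """Return (number_of_frames, rolls_per_frame) from an argv-style list."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} number_of_frames rolls_per_frame")
    # Command line arguments arrive as strings, so convert them to integers.
    return int(argv[1]), int(argv[2])


if __name__ == "__main__":
    # When started without both arguments, fall back to the values from the
    # first demonstration run: 6000 frames, one roll per frame.
    arguments = sys.argv if len(sys.argv) == 3 else [sys.argv[0], "6000", "1"]
    number_of_frames, rolls_per_frame = parse_arguments(arguments)
    print(f"{number_of_frames} frames, {rolls_per_frame} roll(s) per frame")
```

The first element of sys.argv is the script name, which is why the two values of interest sit at positions 1 and 2.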
141 00:07:59,897 --> 00:08:03,820 What we're going to do next is run this thing again, 142 00:08:03,820 --> 00:08:07,460 but this time, let's say we want to do 10 thousand 143 00:08:07,460 --> 00:08:09,760 animation frames, actually it doesn't really matter 144 00:08:09,760 --> 00:08:12,050 because I'm going to cut it off a little bit short, 145 00:08:12,050 --> 00:08:16,294 but let's say we want to roll 600 dice at a time. 146 00:08:16,294 --> 00:08:20,470 If I roll 600 dice at a time, that's 600 dice 147 00:08:20,470 --> 00:08:24,000 and we're trying to do that 30 times-per-second. 148 00:08:24,000 --> 00:08:27,810 So that's 18 thousand dice per second that we want to roll 149 00:08:27,810 --> 00:08:32,340 to try to see how that affects the animation in this case. 150 00:08:32,340 --> 00:08:33,823 Let's go ahead and run that. 151 00:08:34,780 --> 00:08:37,110 In a moment a window will pop up, and you can see 152 00:08:37,110 --> 00:08:40,849 it's cranking through die rolls super quickly now. 153 00:08:40,849 --> 00:08:44,050 Notice right away that the bars are already starting 154 00:08:44,050 --> 00:08:47,250 to even out, and that all of our percentages 155 00:08:47,250 --> 00:08:49,840 are in the 16% range. 156 00:08:49,840 --> 00:08:53,690 You can now really see that law of large numbers in action. 157 00:08:53,690 --> 00:08:58,690 The more dice we roll, the faster we converge on 16.667%. 158 00:09:01,100 --> 00:09:04,620 We're already up to 160 thousand plus rolls here, 159 00:09:04,620 --> 00:09:09,083 and the bars are almost even in height based on the current 160 00:09:09,083 --> 00:09:11,577 die frequency values. 161 00:09:11,577 --> 00:09:16,577 This really emphasizes that law of large numbers in action.
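The arithmetic behind that second run can be checked with a short sketch. The animation_summary function and its dictionary keys are made up for this illustration, and the 33 millisecond default is the frame interval mentioned earlier. The duration is only a lower bound, since the work done in each frame can slow the animation down.

```python
def animation_summary(number_of_frames, rolls_per_frame, interval_ms=33):
    """Estimate totals and rates for a run, assuming every frame is on time."""
    frames_per_second = 1000 / interval_ms
    return {
        "total_rolls": number_of_frames * rolls_per_frame,
        "frames_per_second": frames_per_second,
        "rolls_per_second": rolls_per_frame * frames_per_second,
        "minimum_duration_seconds": number_of_frames * interval_ms / 1000,
    }


if __name__ == "__main__":
    # The two runs from the demonstration: (frames, rolls per frame).
    for frames, rolls in ((6000, 1), (10_000, 600)):
        summary = animation_summary(frames, rolls)
        print(f"{frames:,} frames x {rolls} roll(s) per frame:")
        for name, value in summary.items():
            print(f"  {name}: {value:,.1f}")
```

Because 1000 / 33 is slightly more than 30, the computed rate for 600 rolls per frame comes out a little above the rounded figure of 18 thousand rolls per second.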
162 00:09:17,370 --> 00:09:21,500 We're going to terminate this by closing the window, 163 00:09:21,500 --> 00:09:24,653 and in the next video, I'm going to show you how we 164 00:09:24,653 --> 00:09:28,000 created that animation in the context 165 00:09:28,000 --> 00:09:31,463 of the RollDieDynamic.py script.