1 00:00:07,100 --> 00:00:10,120 - Welcome to lesson 9, files and exceptions. 2 00:00:10,120 --> 00:00:12,000 In this lesson, we're going to look at a number 3 00:00:12,000 --> 00:00:14,350 of different file processing techniques, 4 00:00:14,350 --> 00:00:18,350 and also introduce the Python exception handling mechanism 5 00:00:18,350 --> 00:00:22,150 for dealing with problems as they occur at execution time. 6 00:00:22,150 --> 00:00:25,127 And the reason we bundled these two things together is that 7 00:00:25,127 --> 00:00:27,250 in the context of file processing, 8 00:00:27,250 --> 00:00:30,330 you will often run into various exceptions, 9 00:00:30,330 --> 00:00:33,990 so it just felt like a good way to introduce both topics. 10 00:00:33,990 --> 00:00:35,660 Now, we'll start out in this lesson 11 00:00:35,660 --> 00:00:38,650 with some basic text-file processing 12 00:00:38,650 --> 00:00:41,100 and we'll use similar techniques to work 13 00:00:41,100 --> 00:00:44,110 with what's known as JavaScript Object Notation, 14 00:00:44,110 --> 00:00:47,530 which we call JSON for short. It is 15 00:00:47,530 --> 00:00:50,930 one of the most popular data interchange formats 16 00:00:50,930 --> 00:00:54,290 for sending and receiving data over the Internet. 17 00:00:54,290 --> 00:00:57,290 So we'll demonstrate a library called json, 18 00:00:57,290 --> 00:00:59,100 a module called json that's built 19 00:00:59,100 --> 00:01:01,350 into the Python standard library. 
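As a rough illustration of the json module mentioned above, here is a minimal sketch that writes a dictionary to a JSON file and reads it back. The file name accounts.json and the sample account data are assumptions made up for this example, not taken from the lesson.

```python
import json
import os
import tempfile

# Hypothetical sample data; the lesson's own examples may differ.
accounts = {'accounts': [
    {'account': 100, 'name': 'Jones', 'balance': 24.98},
    {'account': 200, 'name': 'Doe', 'balance': 345.67},
]}

# Use a temporary directory so the sketch leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), 'accounts.json')

# json.dump serializes the dictionary as JSON text into the open file.
with open(path, mode='w') as out_file:
    json.dump(accounts, out_file)

# json.load parses the JSON text back into Python objects.
with open(path, mode='r') as in_file:
    loaded = json.load(in_file)

print(loaded == accounts)
```

The files are opened with the with statement, which the lesson overview describes next.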
20 00:01:01,350 --> 00:01:03,600 And in both the text-file processing 21 00:01:03,600 --> 00:01:06,770 and the JSON file processing we will be using 22 00:01:06,770 --> 00:01:10,240 the Python with statement, which enables you 23 00:01:10,240 --> 00:01:13,240 to acquire a resource such as a file 24 00:01:13,240 --> 00:01:15,710 that you open a connection to on your disk, 25 00:01:15,710 --> 00:01:19,440 use that resource and then automatically de-allocate 26 00:01:19,440 --> 00:01:22,950 or close that resource when you're done using it. 27 00:01:22,950 --> 00:01:25,880 So with statements are going to be frequently used 28 00:01:25,880 --> 00:01:29,300 any time you do things like opening files 29 00:01:29,300 --> 00:01:33,020 for reading or writing, opening database connections 30 00:01:33,020 --> 00:01:36,210 to read from a database or write into a database, 31 00:01:36,210 --> 00:01:39,100 opening network connections to send bytes 32 00:01:39,100 --> 00:01:40,930 and characters over the Internet, 33 00:01:40,930 --> 00:01:42,930 or receive them onto your machine, 34 00:01:42,930 --> 00:01:46,390 and many other similar scenarios, as well. 35 00:01:46,390 --> 00:01:48,640 So after that, we're going to move on to 36 00:01:48,640 --> 00:01:50,930 the exception handling part of the lesson, 37 00:01:50,930 --> 00:01:54,090 and there we'll introduce the various syntax elements 38 00:01:54,090 --> 00:01:56,020 for exception handling in Python. 39 00:01:56,020 --> 00:01:58,940 And part of what I'll do there is compare and contrast 40 00:01:58,940 --> 00:02:01,580 how Python does things with some of the 41 00:02:01,580 --> 00:02:06,580 C-based languages, specifically C++, Java and C#. 42 00:02:07,040 --> 00:02:08,580 You'll see a lot of similarities, 43 00:02:08,580 --> 00:02:12,500 but you will also see some differences in Python's way 44 00:02:12,500 --> 00:02:16,120 of doing things with the exception handling mechanism. 
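The acquire, use, and automatically close pattern described above for the with statement might look like the following minimal sketch. The file name grades.txt and its contents are hypothetical, chosen only to show that the file object is open inside the block and closed once the block ends.

```python
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'grades.txt')

# The with statement acquires the file and binds it to the name grades.
with open(path, mode='w') as grades:
    grades.write('88\n92\n79\n')
    closed_inside = grades.closed   # still open while inside the block

# Leaving the block closes the file automatically, even if an error occurred.
closed_after = grades.closed

# Reading works the same way: iterate over the lines, then the file is closed.
with open(path, mode='r') as grades:
    total = sum(int(line) for line in grades)

print(closed_inside, closed_after, total)
```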
45 00:02:16,120 --> 00:02:18,864 As part of that we'll introduce the try statement. 46 00:02:18,864 --> 00:02:23,340 Instead of a catch clause, it has an except clause, 47 00:02:23,340 --> 00:02:26,010 unlike some of the other programming languages. 48 00:02:26,010 --> 00:02:27,930 We'll also introduce the else clause 49 00:02:27,930 --> 00:02:30,800 which none of those other programming languages has. 50 00:02:30,800 --> 00:02:33,070 And the finally clause, as well, 51 00:02:33,070 --> 00:02:36,317 which you may or may not know from Java and C#. 52 00:02:38,040 --> 00:02:40,310 Also in the exception handling presentation, 53 00:02:40,310 --> 00:02:42,590 we'll take a look at the raise statement, 54 00:02:42,590 --> 00:02:45,990 which is how you throw an exception in Python. 55 00:02:45,990 --> 00:02:48,930 We call that raising an exception in Python. 56 00:02:48,930 --> 00:02:52,680 And we'll take a look in more detail at what's known 57 00:02:52,680 --> 00:02:56,130 as a traceback in the context of Python. 58 00:02:56,130 --> 00:02:58,500 Other programming languages typically refer to that 59 00:02:58,500 --> 00:03:00,120 as a stack trace. 60 00:03:00,120 --> 00:03:03,240 Python has some different information in the traceback 61 00:03:03,240 --> 00:03:05,760 and it's ordered differently compared to a lot 62 00:03:05,760 --> 00:03:09,140 of the other programming languages you may be familiar with. 63 00:03:09,140 --> 00:03:10,470 Now, once we get to the intro 64 00:03:10,470 --> 00:03:13,000 to data science section in this lesson, 65 00:03:13,000 --> 00:03:16,670 we'll go back to our discussions of file processing, 66 00:03:16,670 --> 00:03:21,130 this time in the context of a module called csv, 67 00:03:21,130 --> 00:03:23,380 which stands for comma-separated values 68 00:03:23,380 --> 00:03:26,610 and is built into the Python standard library. 
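The try statement clauses and the raise statement described above can be sketched roughly as follows. The function names safe_divide and check_age are hypothetical, invented for this illustration; the log list exists only to record which clauses ran.

```python
def safe_divide(numerator, denominator):
    """Return (result, log), where log records which clauses executed."""
    log = []
    result = None
    try:
        log.append('try')
        result = numerator / denominator
    except ZeroDivisionError:
        # Runs only if the try suite raised ZeroDivisionError.
        log.append('except')
    else:
        # Runs only if the try suite finished without raising.
        log.append('else')
    finally:
        # Runs in either case.
        log.append('finally')
    return result, log


def check_age(age):
    """Raise ValueError for a negative age; otherwise return the age."""
    if age < 0:
        raise ValueError('age must be non-negative')
    return age


print(safe_divide(10, 2))
print(safe_divide(10, 0))
```

Calling check_age(-1) without a surrounding try statement would terminate the program and display a traceback ending in the ValueError.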
69 00:03:26,610 --> 00:03:29,150 And one of the reasons for that is 70 00:03:29,150 --> 00:03:31,450 from this point forward we're going to do a lot 71 00:03:31,450 --> 00:03:34,140 with data sets in subsequent lessons 72 00:03:34,140 --> 00:03:36,530 and a lot of those data sets are provided 73 00:03:36,530 --> 00:03:39,030 in comma-separated values format, 74 00:03:39,030 --> 00:03:42,500 so it is important to understand how to read that format, 75 00:03:42,500 --> 00:03:45,100 and how to write that format as well. 76 00:03:45,100 --> 00:03:46,940 And to finish up that section, 77 00:03:46,940 --> 00:03:49,350 we're also going to do that in the context 78 00:03:49,350 --> 00:03:50,920 of the pandas library, 79 00:03:50,920 --> 00:03:53,970 which it turns out provides a super simple function 80 00:03:53,970 --> 00:03:56,220 that you can use to load data 81 00:03:56,220 --> 00:04:00,253 from a data set represented as a CSV file.
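Writing and then reading the comma-separated values format with the standard library's csv module might look like this minimal sketch. The file name accounts.csv, the column names, and the sample rows are assumptions for illustration only.

```python
import csv
import os
import tempfile

# Hypothetical file name and sample rows.
path = os.path.join(tempfile.mkdtemp(), 'accounts.csv')
rows = [[100, 'Jones', 24.98], [200, 'Doe', 345.67]]

# newline='' lets the csv module control line endings itself.
with open(path, mode='w', newline='') as out_file:
    writer = csv.writer(out_file)
    writer.writerow(['account', 'name', 'balance'])  # header row
    writer.writerows(rows)

with open(path, mode='r', newline='') as in_file:
    reader = csv.reader(in_file)
    header = next(reader)  # skip past the header row
    # Every field comes back as a string, so convert the numeric columns.
    records = [[int(account), name, float(balance)]
               for account, name, balance in reader]

print(header)
print(records)
```

With pandas installed, the same file could be loaded in one call with pandas.read_csv(path). That is presumably the function referred to above, though the transcript does not name it.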