1 00:00:00,000 --> 00:00:02,574 In this lesson, I'm going to show you how you 2 00:00:02,586 --> 00:00:05,000 can read some text from a file with Python 3 00:00:05,001 --> 00:00:09,000 and how to organize this text so you can use it in your program. 4 00:00:09,001 --> 00:00:15,000 So I have first created a new Python file here, read-write.py, for this section. 5 00:00:15,001 --> 00:00:21,000 And then in the same directory here, at the same level, I have created a file named test. 6 00:00:21,001 --> 00:00:26,000 Okay, this is just the name of the file, which contains a list of numbers. 7 00:00:26,001 --> 00:00:29,000 And actually, you can see we have just one number per line. 8 00:00:29,001 --> 00:00:33,000 Okay, those are integers, both positive and negative. 9 00:00:33,001 --> 00:00:38,000 One thing you can see is that this file doesn't have any extension, okay? 10 00:00:38,001 --> 00:00:41,000 You can create a file without any extension. That's not a problem. 11 00:00:41,001 --> 00:00:46,000 On Windows, it may say that it doesn't recognize the file, 12 00:00:46,001 --> 00:00:50,000 but you actually don't need an extension to create a text file. 13 00:00:50,001 --> 00:00:54,000 You are going to be able to read it from Python without any problem. 14 00:00:54,001 --> 00:00:58,000 So you can create a file and put it in the same folder here. 15 00:00:58,001 --> 00:01:01,000 If you want to do that, you can just create a new file here. 16 00:01:01,001 --> 00:01:05,000 Don't create a Python file. Just create a new file, give 17 00:01:05,001 --> 00:01:08,000 it a name, and that's it, all from the file manager. 18 00:01:08,001 --> 00:01:11,000 And now, how are we going to read that file from Python? 19 00:01:11,001 --> 00:01:15,000 So I'm going to first write the code structure, okay? 20 00:01:15,001 --> 00:01:17,000 And that's going to be the same every time. 21 00:01:17,001 --> 00:01:19,000 And then I'm going to explain it. 
22 00:01:19,001 --> 00:01:23,000 So the with keyword, then the open function. 23 00:01:23,001 --> 00:01:27,000 The first parameter is the name of the file or the path. 24 00:01:28,000 --> 00:01:35,000 The second is r for reading, and then you have as f, colon. 25 00:01:35,001 --> 00:01:37,000 Go to a new line with indentation. 26 00:01:37,001 --> 00:01:42,000 And if you want, for example, to print the entire 27 00:01:42,001 --> 00:01:44,000 file, you can do print, and then f.read. 28 00:01:44,001 --> 00:01:46,000 So let's actually run that. 29 00:01:46,001 --> 00:01:51,000 And you can see we have the content of the file here. 30 00:01:51,001 --> 00:01:53,000 Okay, all the numbers you see in the file are here. 31 00:01:53,001 --> 00:01:54,000 So that worked. 32 00:01:54,001 --> 00:01:57,000 And now let me explain that a bit more. 33 00:01:57,001 --> 00:02:00,000 So this is the structure you are going to use to read from any file. 34 00:02:00,001 --> 00:02:03,000 Okay, you start with the with keyword, okay? 35 00:02:03,001 --> 00:02:05,000 And then the open function, okay? 36 00:02:05,001 --> 00:02:09,000 The thing with a file is that you need to first 37 00:02:09,001 --> 00:02:11,000 open it, and then you need to close the file. 38 00:02:11,001 --> 00:02:13,000 That's very important. 39 00:02:13,001 --> 00:02:14,000 Here you can see we don't close the file. 40 00:02:14,001 --> 00:02:16,000 We don't call a close function. 41 00:02:16,001 --> 00:02:17,000 Why is that? 42 00:02:17,001 --> 00:02:21,000 Because the with keyword is going to take care of that, okay? 43 00:02:21,001 --> 00:02:24,000 So whatever is after the with, you can be sure 44 00:02:24,001 --> 00:02:26,000 that you can do whatever you want with this file. 45 00:02:26,001 --> 00:02:29,000 And then when you leave the block, or if you have an error, 46 00:02:29,001 --> 00:02:32,000 the with keyword will make sure that the file is closed. 47 00:02:32,001 --> 00:02:36,000 So you can open it again safely the next time. 
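What was just typed on screen can be sketched roughly as follows. The file name test comes from the lesson; the temporary folder and the three sample numbers are assumptions, added only so the snippet runs on its own.

```python
import os
import tempfile

# Assumption: build a small stand-in for the lesson's "test" file
# (one integer per line) inside a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "test")
with open(path, "w") as f:
    f.write("12\n-7\n3\n")

# The structure from the lesson: open for reading, use f inside the block.
with open(path, "r") as f:
    content = f.read()  # the whole file as one string
    print(content)

# Once the with block is left, the file has been closed automatically.
print(f.closed)
```

In the lesson the first argument is simply "test", because the file sits in the same folder as the script.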
48 00:02:36,001 --> 00:02:39,000 So the function open is going to open the file. 49 00:02:39,001 --> 00:02:40,000 You need first to provide the path. 50 00:02:40,001 --> 00:02:42,000 Okay, so this is the path. 51 00:02:42,001 --> 00:02:46,000 This is directly the name of the file because we are in the same folder. 52 00:02:46,001 --> 00:02:51,000 And then you need to provide, so between quotes, you need to provide the mode. 53 00:02:51,001 --> 00:02:53,000 What do you want to do with the file? 54 00:02:53,001 --> 00:02:57,000 Here we just want to read, so you're going to put the letter r. 55 00:02:57,001 --> 00:02:58,000 As simple as that. 56 00:02:58,001 --> 00:03:01,000 And then you have the as keyword and then f. 57 00:03:01,001 --> 00:03:06,000 f is going to be the variable you can use inside the with block. 58 00:03:06,001 --> 00:03:10,000 And this represents actually the file that was opened here. 59 00:03:10,001 --> 00:03:15,000 Okay, so you write this and then you can do whatever you want with the file. 60 00:03:15,001 --> 00:03:20,000 And now we have f.read, which is a function you can 61 00:03:20,001 --> 00:03:23,000 use on a file, which is simply going to read the file. 62 00:03:23,001 --> 00:03:27,000 So it reads everything, and we just print that. 63 00:03:27,001 --> 00:03:32,000 All right, and now what if we want, for example, to read each line separately 64 00:03:32,001 --> 00:03:34,000 and not just get the whole text in one block? 65 00:03:34,001 --> 00:03:37,000 What you can do, I'm going to remove that. 66 00:03:37,001 --> 00:03:39,000 You can do for line in f. 67 00:03:39,001 --> 00:03:43,000 So this also is a very common structure, for line in f. 68 00:03:43,001 --> 00:03:46,000 And let's do print line. 69 00:03:46,001 --> 00:03:49,000 So with for line in f, you take the file that you get here 70 00:03:49,001 --> 00:03:53,000 and iterate through all the lines of the file. 
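A minimal sketch of the line-by-line loop described above. The sample file contents and the lines list are assumptions for illustration; the lesson only prints each line.

```python
import os
import tempfile

# Assumption: a stand-in for the lesson's "test" file, one integer per line.
path = os.path.join(tempfile.mkdtemp(), "test")
with open(path, "w") as f:
    f.write("12\n-7\n3\n")

lines = []  # hypothetical helper list, kept only to inspect what was read
with open(path, "r") as f:
    for line in f:          # one iteration per line of the file
        lines.append(line)
        print(line)
```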
71 00:03:53,001 --> 00:03:58,000 So I can run this and you can see we still have the same numbers. 72 00:03:58,001 --> 00:04:01,000 But we also have a blank line every time. 73 00:04:01,001 --> 00:04:04,000 Okay, we have an additional new line. 74 00:04:04,001 --> 00:04:06,000 Okay, so how to get rid of this. 75 00:04:06,001 --> 00:04:09,000 So this is very common when you use this structure. 76 00:04:09,001 --> 00:04:11,000 How to get rid of this? 77 00:04:11,001 --> 00:04:12,000 Well, line is a string. 78 00:04:12,001 --> 00:04:18,000 And the thing is that every line you read with 79 00:04:18,001 --> 00:04:21,000 this structure still ends with a newline character, and print adds its own. 80 00:04:21,001 --> 00:04:25,000 So you need to remove that newline character for each line you read. 81 00:04:25,001 --> 00:04:30,000 And to do that, you simply do line.rstrip. 82 00:04:30,001 --> 00:04:33,000 Okay, this is a function on strings. 83 00:04:33,001 --> 00:04:37,000 And you can pass, in quotes, backslash n. 84 00:04:37,001 --> 00:04:40,000 So actually, what is this backslash n? 85 00:04:40,001 --> 00:04:43,000 Okay, if I print something, 86 00:04:44,000 --> 00:04:49,000 say a, backslash n, b, backslash n, c. 87 00:04:49,001 --> 00:04:52,000 Okay, and I'm going to run that. 88 00:04:52,001 --> 00:04:56,000 And you can see we have a, b, c each on a new line. 89 00:04:56,001 --> 00:05:00,000 Okay, the backslash n character is actually a special character. 90 00:05:00,001 --> 00:05:05,000 So when you write a backslash plus a letter, you usually get a special character. 91 00:05:05,001 --> 00:05:08,000 And this one simply goes to a new line. 92 00:05:08,001 --> 00:05:11,000 So the problem with this structure is simply that 93 00:05:11,001 --> 00:05:14,000 you get an extra newline character for each line. 94 00:05:14,001 --> 00:05:18,000 And rstrip is simply going to remove it from the end of the string. 
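To make the backslash n and rstrip behavior concrete, here is a small sketch. The strings are made-up examples, not the contents of the lesson's file.

```python
# "\n" is one special character: the newline.
print("a\nb\nc")  # a, b and c each appear on their own line

raw = "42\n"                 # assumed example of a line as read from a file
cleaned = raw.rstrip("\n")   # strip newline characters from the right end

# repr shows the special character explicitly instead of acting on it.
print(repr(raw), repr(cleaned))

# rstrip removes every trailing newline, not only one.
double = "42\n\n".rstrip("\n")
print(repr(double))
```

Only the right end of the string is touched, so a newline in the middle of a string would stay where it is.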
95 00:05:18,001 --> 00:05:21,000 So if you have this string, for example, 96 00:05:21,001 --> 00:05:26,000 and you use that function, it's going to remove the last backslash n. 97 00:05:26,001 --> 00:05:30,000 Okay, so now I'm going to run it like this. 98 00:05:30,001 --> 00:05:34,489 And you can see we have each line without any 99 00:05:34,501 --> 00:05:39,000 additional backslash n, or newline character. 100 00:05:39,001 --> 00:05:43,000 Great, so now you can open the file and process each line separately. 101 00:05:44,000 --> 00:05:46,000 What I'm going to show you now is about this. 102 00:05:46,001 --> 00:05:51,000 I'm going to come back to this file name or file path. 103 00:05:51,001 --> 00:05:54,000 So you have two ways to provide the path here. 104 00:05:54,001 --> 00:05:58,000 Either the relative path, which is what we've done here. 105 00:05:58,001 --> 00:06:05,000 So relative simply means: where is this file relative to the folder the script runs from? 106 00:06:05,001 --> 00:06:08,000 This is in the same folder, so you just write the name of the file. 107 00:06:08,001 --> 00:06:11,000 If this is somewhere else, you need to change the path, okay? 108 00:06:12,000 --> 00:06:15,000 So if you create a new folder and you put this in the folder, 109 00:06:15,001 --> 00:06:17,000 you need to put the name of the folder first. 110 00:06:17,001 --> 00:06:22,000 Now what you can also do is provide the absolute path. 111 00:06:22,001 --> 00:06:29,000 So I'm going to go here and right click and do copy path, okay? 112 00:06:29,001 --> 00:06:31,602 And you can see, so we have the relative path, 113 00:06:31,614 --> 00:06:34,000 which is that one, path from content root. 114 00:06:34,001 --> 00:06:41,000 And we have the absolute path, which is actually where the file is exactly, okay? 115 00:06:41,001 --> 00:06:45,386 So C, Users, my username, and then PyCharm 116 00:06:45,398 --> 00:06:50,000 projects, my first project and the file name. 
117 00:06:50,001 --> 00:06:53,000 So I'm going to copy this and I'm going to replace it here. 118 00:06:53,001 --> 00:06:58,000 And using this, I'm going to be able to open the file the same way. 119 00:06:58,001 --> 00:07:01,000 But for Windows, we need to change something, okay? 120 00:07:01,001 --> 00:07:07,000 If you're using Linux or macOS, you don't need to change anything in the path 121 00:07:07,001 --> 00:07:10,000 because it's going to use forward slashes, okay? 122 00:07:10,001 --> 00:07:13,000 So a forward slash is simply like this, okay? 123 00:07:13,001 --> 00:07:15,000 But on Windows, you have backslashes. 124 00:07:15,001 --> 00:07:19,000 And as you have seen with backslash n, when you use a backslash, 125 00:07:19,001 --> 00:07:23,000 usually it means that you have a special character. 126 00:07:23,001 --> 00:07:27,000 So what you need to do if you want to use a backslash in a string, 127 00:07:27,001 --> 00:07:33,000 without triggering a special character, is to double each backslash. 128 00:07:33,001 --> 00:07:35,000 So that's what I'm going to do. 129 00:07:35,001 --> 00:07:39,000 Every time I see a backslash, I double it with another backslash. 130 00:07:39,001 --> 00:07:43,000 As you can see here, you have a recognized 131 00:07:43,001 --> 00:07:45,000 backslash f, which is going to do something. 132 00:07:45,001 --> 00:07:50,000 If I put another backslash here, it's simply a normal backslash, okay? 133 00:07:50,001 --> 00:07:52,000 And now I have the absolute path. 134 00:07:52,001 --> 00:07:55,000 I'm going to run this and this is going to work the same. 135 00:07:55,001 --> 00:08:01,000 So you can either provide the absolute path or the 136 00:08:01,001 --> 00:08:03,000 relative path, depending on what is more convenient. 137 00:08:03,001 --> 00:08:07,000 So now what you can do with this, well, you can do whatever you want. 
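The backslash doubling can be checked without opening anything. The path below is hypothetical (the username someone and the folder names are placeholders, not the path shown on screen). The r prefix for raw strings is not covered in the lesson; it is a standard Python alternative to doubling, shown here only for comparison.

```python
# Hypothetical Windows path, used only to show the escaping.
escaped = "C:\\Users\\someone\\PycharmProjects\\my_first_project\\test"

# Same path as a raw string: backslashes are taken literally.
raw = r"C:\Users\someone\PycharmProjects\my_first_project\test"

print(escaped)         # each doubled backslash prints as a single backslash
print(escaped == raw)  # both spellings describe the same string

# "\f" is one special character (form feed); "\\f" is a backslash plus f.
print(len("\f"), len("\\f"))
```

On Linux or macOS the path would use forward slashes and need no doubling at all.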
138 00:08:07,001 --> 00:08:10,000 For example, you could create a number list, 139 00:08:10,001 --> 00:08:14,000 an empty number list, before the with structure. 140 00:08:14,001 --> 00:08:17,000 And then for each line, you're going to add 141 00:08:17,001 --> 00:08:20,000 the line, so the new number, to the list. 142 00:08:20,001 --> 00:08:26,000 Okay, so make sure you also cast the number, so cast this 143 00:08:26,001 --> 00:08:31,000 to integer or float, so you can make them usable in your list 144 00:08:31,001 --> 00:08:33,000 when you want to make computations with numbers. 145 00:08:33,001 --> 00:08:36,000 All right, so now you can read from a file, 146 00:08:36,001 --> 00:08:40,000 and you have also seen how to organize the content you read, 147 00:08:40,001 --> 00:08:43,000 so you can then process it in your Python code.
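Putting the pieces together, the number list could look roughly like this. The sample file and the name numbers are assumptions; the lesson describes the idea without showing the final code.

```python
import os
import tempfile

# Assumption: a stand-in for the lesson's "test" file, one integer per line.
path = os.path.join(tempfile.mkdtemp(), "test")
with open(path, "w") as f:
    f.write("12\n-7\n3\n")

numbers = []  # empty list created before the with block
with open(path, "r") as f:
    for line in f:
        # Remove the trailing newline, then cast the text to an integer.
        numbers.append(int(line.rstrip("\n")))

print(numbers)
print(sum(numbers))  # the values are real integers, so arithmetic works
```

If the file held decimal values, float could replace int in the cast.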