- [Instructor] In this and the next several videos, we're going to take a look at regular expression processing in Python. If you're not familiar with regular expressions, they are generally used for matching patterns within text. Once you've matched those patterns, you typically use the results to do things like validating data to make sure it's in the right format, cleaning up data to prepare it for use in your code, and scraping data to extract information. For example, you might want to get hashtags out of social media posts. Sometimes you'll use them for transforming data as well. Now, you rarely have to create your own regular expressions nowadays, because there are tremendous repositories out there that catalog regular expressions and help you find what you need. These are just a few of the repositories that I've used, and there are many more out there as well.
So if you find the need to use the techniques that I'm about to show you and you have a particular pattern you're looking to match, I would take a look at the repositories first. One of the great things about IPython and its interactive mode is that it's really easy to work with regular expressions and test them out, so that you can see whether they perform the tasks that you need in your code.
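To give a feel for the kind of quick experiment you might run in an interactive session, here is a minimal sketch using Python's standard `re` module. The sample strings and the patterns are assumptions made up for illustration, not patterns taken from any repository, and the ZIP code check is deliberately simplified.

```python
import re

# Hypothetical sample post, used only for illustration.
post = "Learning #Python regex today! #100DaysOfCode is going well."

# Extracting: '#' followed by one or more letters, digits, or underscores.
hashtag_pattern = r"#\w+"
hashtags = re.findall(hashtag_pattern, post)
print(hashtags)

# Validating: fullmatch succeeds only if the ENTIRE string fits the pattern.
# Simplified five-digit ZIP code check (an assumption, not a full validator).
zip_pattern = r"\d{5}"
good = re.fullmatch(zip_pattern, "02215") is not None
bad = re.fullmatch(zip_pattern, "0221a") is not None
print(good, bad)

# Cleaning: collapse each run of whitespace into a single space.
cleaned = re.sub(r"\s+", " ", "too    many   spaces").strip()
print(cleaned)
```

Typing these lines one at a time into an interactive session lets you adjust a pattern and immediately see what it matches before you commit it to your code.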