1 00:00:01,640 --> 00:00:03,350 - [Lecturer] Before we get into actually using 2 00:00:03,350 --> 00:00:06,530 the APIs, it is important to get a sense of what 3 00:00:06,530 --> 00:00:09,260 you are going to receive as a developer 4 00:00:09,260 --> 00:00:12,210 when you make requests through the Twitter APIs 5 00:00:12,210 --> 00:00:14,620 to get tweet-based information. 6 00:00:14,620 --> 00:00:18,620 So what exactly is in a tweet from a developer perspective, 7 00:00:18,620 --> 00:00:20,720 not from a user perspective? 8 00:00:20,720 --> 00:00:24,670 So all the Twitter API methods that we're going to be using 9 00:00:24,670 --> 00:00:26,850 in general are going to return objects 10 00:00:26,850 --> 00:00:31,330 in JavaScript Object Notation format or JSON for short. 11 00:00:31,330 --> 00:00:34,000 This is a text-based data-interchange format 12 00:00:34,000 --> 00:00:37,815 and it's extremely popular in web services. 13 00:00:37,815 --> 00:00:40,600 It's the preferred way to send and receive objects 14 00:00:40,600 --> 00:00:41,890 over the Internet. 15 00:00:41,890 --> 00:00:44,730 All of the objects are represented in data structures 16 00:00:44,730 --> 00:00:47,610 that look very much like Python dictionaries, 17 00:00:47,610 --> 00:00:51,340 so it's a familiar format to just about anybody 18 00:00:51,340 --> 00:00:53,980 who works in the context of Python. 19 00:00:53,980 --> 00:00:57,980 And most importantly it's both human and computer readable. 20 00:00:57,980 --> 00:01:00,520 So the basic format of a JSON object 21 00:01:00,520 --> 00:01:04,290 is a set of curly braces with property name 22 00:01:04,290 --> 00:01:07,140 and value pairs separated by colons 23 00:01:07,140 --> 00:01:11,010 and each key-value pair is going to have a comma 24 00:01:11,010 --> 00:01:13,990 separating it from the next one in the list. 
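To make the object format just described concrete, here is a minimal sketch using Python's standard json module. The JSON text and its property names are made up for illustration; they are not real Twitter output.

```python
import json

# Hypothetical JSON object: curly braces, name/value pairs joined by colons,
# and commas separating one pair from the next.
json_text = '{"name": "Example User", "followers": 42, "verified": false}'

# json.loads converts the JSON text into a Python dictionary.
account = json.loads(json_text)

print(type(account).__name__)  # dict
print(account["name"])
print(account["verified"])     # JSON false becomes Python False
```

Note that JSON's true, false, and null arrive in Python as True, False, and None.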
25 00:01:13,990 --> 00:01:17,000 If you want to do an array in JSON format, 26 00:01:17,000 --> 00:01:19,550 it looks very much like a Python list does. 27 00:01:19,550 --> 00:01:21,990 So simply a bunch of comma-delimited values 28 00:01:21,990 --> 00:01:24,460 in a set of square brackets. 29 00:01:24,460 --> 00:01:27,080 And the nice thing about the Tweepy library is that 30 00:01:27,080 --> 00:01:31,200 it handles the JavaScript Object Notation stuff for you 31 00:01:31,200 --> 00:01:33,770 behind the scenes, so you don't have to figure out 32 00:01:33,770 --> 00:01:35,720 how to do the formatting, 33 00:01:35,720 --> 00:01:38,620 it will take care of all of that for you. 34 00:01:38,620 --> 00:01:43,090 Now there are lots of properties in a tweet object 35 00:01:43,090 --> 00:01:44,350 that you get back. 36 00:01:44,350 --> 00:01:47,180 So every tweet, which is also sometimes referred to 37 00:01:47,180 --> 00:01:50,770 as a status update, contains a tremendous amount 38 00:01:50,770 --> 00:01:54,680 of metadata that describes various aspects 39 00:01:54,680 --> 00:01:56,590 of what's in that tweet. 40 00:01:56,590 --> 00:01:58,820 When was that tweet created? 41 00:01:58,820 --> 00:02:00,180 Who created it? 42 00:02:00,180 --> 00:02:02,880 What are the hashtags inside that tweet? 43 00:02:02,880 --> 00:02:05,780 Those are the words that start with a pound symbol 44 00:02:05,780 --> 00:02:07,480 or a hash symbol. 45 00:02:07,480 --> 00:02:09,790 What URLs are in the tweet? 46 00:02:09,790 --> 00:02:12,640 Are there mentions of other Twitter users, 47 00:02:12,640 --> 00:02:16,310 which are known as @mentions inside the tweet? 48 00:02:16,310 --> 00:02:19,490 Are there any images or videos and other information 49 00:02:19,490 --> 00:02:20,550 inside the tweet? 50 00:02:20,550 --> 00:02:24,180 All of this and more is embedded inside 51 00:02:24,180 --> 00:02:26,540 of every tweet object that you get back. 
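As a rough illustration of the array format, the sketch below parses a made-up JSON document whose values are arrays and shows that they come back as ordinary Python lists. This is the kind of conversion a library can do for you behind the scenes; the sketch uses only the standard json module and does not show Tweepy's own internals.

```python
import json

# Hypothetical JSON text: each array is comma-delimited values in square brackets.
json_text = '{"hashtags": ["python", "json"], "counts": [1, 2, 3]}'

data = json.loads(json_text)

# JSON arrays become Python lists.
print(type(data["hashtags"]).__name__)  # list
print(data["counts"][0])

# json.dumps goes the other way, from Python objects back to JSON text.
round_trip = json.dumps(data)
print(round_trip)
```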
52 00:02:26,540 --> 00:02:30,520 So here we have a table that lists a bunch of key properties 53 00:02:30,520 --> 00:02:31,880 inside of a tweet object, 54 00:02:31,880 --> 00:02:35,210 and of course in the online documentation for Twitter 55 00:02:35,210 --> 00:02:37,990 they have details that go far 56 00:02:37,990 --> 00:02:39,820 beyond what I'm showing you here 57 00:02:39,820 --> 00:02:41,970 about what all of these keys mean 58 00:02:41,970 --> 00:02:45,630 and what the format is of the data that is associated 59 00:02:45,630 --> 00:02:47,500 with each of these keys as well. 60 00:02:47,500 --> 00:02:50,240 So you can see when the tweet was created, 61 00:02:50,240 --> 00:02:52,640 you can find what they call the entities, 62 00:02:52,640 --> 00:02:54,660 which are all those things I was just listing 63 00:02:54,660 --> 00:02:56,803 up above a few moments ago. 64 00:02:57,730 --> 00:03:01,450 You can see the so-called extended tweet information, 65 00:03:01,450 --> 00:03:02,730 which is for tweets that 66 00:03:02,730 --> 00:03:07,650 are in between 141 and 280 characters, 67 00:03:07,650 --> 00:03:10,310 the new larger tweet format. 68 00:03:10,310 --> 00:03:12,920 You can see how many times people favorited 69 00:03:12,920 --> 00:03:17,420 a given tweet, which is kind of like liking it on Facebook. 70 00:03:17,420 --> 00:03:21,500 And you can also see if the user 71 00:03:21,500 --> 00:03:24,270 has allowed Twitter to track their location, 72 00:03:24,270 --> 00:03:26,910 the exact coordinates where the user was 73 00:03:26,910 --> 00:03:29,910 at the time that the tweet was sent. 74 00:03:29,910 --> 00:03:32,650 Now this is almost always null 75 00:03:32,650 --> 00:03:36,210 because by default in Twitter, 76 00:03:36,210 --> 00:03:39,480 they disable this and people have to opt in 77 00:03:39,480 --> 00:03:41,460 to location tracking. 
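Because the coordinates are almost always null, code that reads them should expect None. The sketch below is one hedged way to handle that. The key name coordinates and the GeoJSON-style point layout ([longitude, latitude]) are my assumptions about the tweet object format, the helper describe_location is hypothetical, and all values are invented.

```python
# Hypothetical tweet dictionaries; the values are invented for illustration.
tweet = {
    "favorite_count": 12,
    "coordinates": None,  # JSON null arrives in Python as None
}
geo_tweet = {
    "favorite_count": 3,
    # Assumption: a GeoJSON-style point, listed as [longitude, latitude].
    "coordinates": {"type": "Point", "coordinates": [-73.9855, 40.7580]},
}

def describe_location(tweet):
    """Return a short description of where the tweet was sent from."""
    coords = tweet.get("coordinates")
    if coords is None:
        return "no location shared"
    longitude, latitude = coords["coordinates"]
    return f"lat={latitude}, lon={longitude}"

print(describe_location(tweet))
print(describe_location(geo_tweet))
```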
78 00:03:41,460 --> 00:03:43,410 Some of the other key things you'll find, 79 00:03:43,410 --> 00:03:47,110 the place that's associated with the tweet, if any. 80 00:03:47,110 --> 00:03:50,840 So if somebody tweets about Times Square, New York, 81 00:03:50,840 --> 00:03:54,563 that will automatically get turned into a place object. 82 00:03:55,960 --> 00:03:59,130 There's an ID for the tweet that's an integer 83 00:03:59,130 --> 00:04:01,460 and there's also a string representation 84 00:04:01,460 --> 00:04:04,390 of that ID as well, which is the one that Twitter 85 00:04:04,390 --> 00:04:08,060 actually recommends using if you ever need an ID. 86 00:04:08,060 --> 00:04:09,610 There's the language of the tweet. 87 00:04:09,610 --> 00:04:13,010 So for example, en for English or fr for French. 88 00:04:13,010 --> 00:04:15,490 And we'll see some of that here in the examples 89 00:04:15,490 --> 00:04:16,910 later on in this lesson. 90 00:04:16,910 --> 00:04:19,210 How many times the tweet gets retweeted, 91 00:04:19,210 --> 00:04:22,040 meaning somebody sent it again. 92 00:04:22,040 --> 00:04:26,400 So if you ever see a tweet that has the letters RT in it, 93 00:04:26,400 --> 00:04:29,950 RT at the beginning of a tweet means that it was retweeted, 94 00:04:29,950 --> 00:04:34,870 and that's actually a Twitter reserved word, if you will, 95 00:04:34,870 --> 00:04:37,350 to indicate the concept of retweeting. 96 00:04:37,350 --> 00:04:40,390 And there are several such words like that. 97 00:04:40,390 --> 00:04:43,010 The text attribute is going to tell you 98 00:04:43,010 --> 00:04:47,910 the text of the tweet in 140-character-or-less format. 99 00:04:47,910 --> 00:04:50,850 If it's an extended tweet, it will actually show you 100 00:04:50,850 --> 00:04:53,300 or give you back a truncated version 101 00:04:53,300 --> 00:04:56,330 of the extended tweet, excuse me. 
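Two of the points above lend themselves to a small sketch: preferring the string form of the ID, and recovering the full text when the text attribute is truncated. The key names (id, id_str, truncated, text, extended_tweet, full_text) are my assumptions about the tweet object format, the helper full_text_of is hypothetical, and the ID and text are made up. One likely reason the string form is safer is that very large integers do not survive a round trip through floating point, which is how some languages store all numbers.

```python
# Hypothetical tweet dictionary; the ID and text are invented for illustration.
tweet = {
    "id": 1037404890354606081,
    "id_str": "1037404890354606081",
    "truncated": True,
    "text": "A long announcement that was cut off...",
    "extended_tweet": {
        "full_text": "A long announcement that was cut off in the short text field.",
    },
}

def full_text_of(tweet):
    """Prefer the extended text when the short text was truncated."""
    if tweet.get("truncated") and "extended_tweet" in tweet:
        return tweet["extended_tweet"]["full_text"]
    return tweet["text"]

print(full_text_of(tweet))

# An integer this large cannot be stored exactly as a 64-bit float,
# so converting it to float and back changes the value.
survives_float = int(float(tweet["id"])) == tweet["id"]
print(survives_float)  # False

# The string form is unaffected.
print(tweet["id_str"] == str(tweet["id"]))  # True
```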
102 00:04:56,330 --> 00:04:59,350 And then separately, it'll give you a user object 103 00:04:59,350 --> 00:05:02,720 that represents who posted the tweet in the first place, 104 00:05:02,720 --> 00:05:06,380 and user objects themselves also have key-value pairs 105 00:05:06,380 --> 00:05:09,280 within them of additional information. 106 00:05:09,280 --> 00:05:13,020 Same thing is true up here with the place object as well. 107 00:05:13,020 --> 00:05:18,020 So here we have some sample tweet JSON that we've embedded 108 00:05:18,620 --> 00:05:21,770 into this page that I'm showing you at the moment. 109 00:05:21,770 --> 00:05:25,560 And this is an actual tweet that we pulled off 110 00:05:25,560 --> 00:05:28,790 the @nasa Twitter account back 111 00:05:28,790 --> 00:05:30,630 when we were writing the book. 112 00:05:30,630 --> 00:05:34,490 Some of the fields are not returned with every single 113 00:05:34,490 --> 00:05:38,330 object that you get back, so it depends on which API 114 00:05:38,330 --> 00:05:41,590 method you are calling as to what you'll actually get back. 115 00:05:41,590 --> 00:05:43,430 So some are in every tweet object, 116 00:05:43,430 --> 00:05:46,130 some are only in certain tweet objects, 117 00:05:46,130 --> 00:05:48,730 and all of that information is explained 118 00:05:48,730 --> 00:05:50,700 in the online documentation. 119 00:05:50,700 --> 00:05:53,420 So this particular tweet that I showed you up above 120 00:05:53,420 --> 00:05:58,420 was created originally on September fifth of 2018. 121 00:05:58,810 --> 00:06:01,890 This is the ID information for the tweet. 122 00:06:01,890 --> 00:06:04,060 The strings are more portable which is why 123 00:06:04,060 --> 00:06:06,970 they generally recommend using strings 124 00:06:08,882 --> 00:06:12,080 for interacting with tweet IDs if you need them. 125 00:06:12,080 --> 00:06:14,470 This is the original text of the tweet. 
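Here is a small sketch of reading the nested user object and turning the creation timestamp into a Python datetime. The timestamp layout and the key names (created_at, user, screen_name, followers_count) are my assumptions about the tweet object format, and the values are invented; this is not the NASA tweet shown in the video.

```python
from datetime import datetime

# Hypothetical tweet dictionary with a nested user object.
tweet = {
    "created_at": "Wed Sep 05 05:35:33 +0000 2018",
    "user": {"screen_name": "example_account", "followers_count": 1234},
}

# Assumed timestamp layout: weekday, month, day, time, UTC offset, year.
created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
print(created.date().isoformat())  # 2018-09-05

# The user object is itself a dictionary of key-value pairs.
user = tweet["user"]
print(f'@{user["screen_name"]} has {user["followers_count"]:,} followers')
```

Since some fields are not returned by every API method, tweet.get("user") with a default is a safer choice than direct indexing when you are not sure a key is present.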
126 00:06:14,470 --> 00:06:17,760 Sometimes you'll see a dot-dot-dot notation in the tweet 127 00:06:17,760 --> 00:06:20,100 indicating that a portion of the tweet 128 00:06:20,100 --> 00:06:23,110 was chopped off because it couldn't fit 129 00:06:23,110 --> 00:06:25,710 in the 140-character limit. 130 00:06:25,710 --> 00:06:27,860 And that limit, by the way, includes things 131 00:06:27,860 --> 00:06:30,800 like the URLs and the hashtags and other stuff 132 00:06:30,800 --> 00:06:32,680 that shows up in the tweet. 133 00:06:32,680 --> 00:06:35,840 If it was truncated, it will tell you that. 134 00:06:35,840 --> 00:06:37,600 Then we have this entities key. 135 00:06:37,600 --> 00:06:39,800 And in the entities key you can see 136 00:06:39,800 --> 00:06:43,430 a collection of hashtags, a collection of user mentions, 137 00:06:43,430 --> 00:06:44,850 a collection of URLs. 138 00:06:44,850 --> 00:06:48,160 We tried to format this to make it a little bit more 139 00:06:48,160 --> 00:06:50,810 readable for your own purpose. 140 00:06:50,810 --> 00:06:54,380 As you scroll down you can see whether it was replying 141 00:06:54,380 --> 00:06:56,900 to some other tweet, who the user was. 142 00:06:56,900 --> 00:06:59,780 And these are a whole bunch of pieces of information 143 00:06:59,780 --> 00:07:01,460 about that user. 144 00:07:01,460 --> 00:07:04,920 So in the case of NASA, they, at the time, 145 00:07:04,920 --> 00:07:09,920 had over 29 million followers of that particular account. 146 00:07:10,500 --> 00:07:14,300 And there are Twitter users who have over 100 million 147 00:07:14,300 --> 00:07:17,170 followers nowadays as well. 148 00:07:17,170 --> 00:07:19,780 As you scroll through here, or as I scroll through here, 149 00:07:19,780 --> 00:07:22,270 you can see there's all sorts of information 150 00:07:22,270 --> 00:07:26,530 that you have access to on a tweet-by-tweet basis. 
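The entities key described above holds three collections, and a few list comprehensions are enough to pull them apart. In this sketch the inner key names (text, screen_name, expanded_url) are my assumptions about the tweet object format, and the sample data is invented rather than copied from the tweet on screen.

```python
# Hypothetical tweet dictionary; only the entities key is filled in.
tweet = {
    "entities": {
        "hashtags": [{"text": "python"}, {"text": "json"}],
        "user_mentions": [{"screen_name": "example_account"}],
        "urls": [{"expanded_url": "https://example.com/article"}],
    },
}

entities = tweet.get("entities", {})

# Each collection is a list of small dictionaries; a missing collection
# is treated as an empty list.
hashtags = [tag["text"] for tag in entities.get("hashtags", [])]
mentions = [m["screen_name"] for m in entities.get("user_mentions", [])]
urls = [u["expanded_url"] for u in entities.get("urls", [])]

print(hashtags)
print(mentions)
print(urls)
```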
151 00:07:26,530 --> 00:07:29,900 So lots of different information that you can take 152 00:07:29,900 --> 00:07:34,230 advantage of as you work your way through the Twitter APIs 153 00:07:34,230 --> 00:07:36,140 and learn how to use them. 154 00:07:36,140 --> 00:07:40,710 So there are some Twitter object, JSON object resources 155 00:07:40,710 --> 00:07:42,850 that you should be aware of. 156 00:07:42,850 --> 00:07:46,820 So the complete list of all the tweet object attributes, 157 00:07:46,820 --> 00:07:49,520 if you go to this URL that you now see 158 00:07:49,520 --> 00:07:52,360 in the address bar or if you just search for this 159 00:07:52,360 --> 00:07:54,660 on the Twitter developer website using 160 00:07:54,660 --> 00:07:56,610 the search field up here, 161 00:07:56,610 --> 00:07:59,740 this is a complete description of everything 162 00:07:59,740 --> 00:08:01,690 that you'll find in a tweet object. 163 00:08:01,690 --> 00:08:03,870 By the way, they'll often have an Overview tab 164 00:08:03,870 --> 00:08:06,660 that gives you a general overview and a Guides tab 165 00:08:06,660 --> 00:08:09,740 that gives you a lot more detail on how to use 166 00:08:09,740 --> 00:08:12,400 particular APIs for instance. 167 00:08:12,400 --> 00:08:16,440 Going back over here, you'll also want to see the details 168 00:08:16,440 --> 00:08:20,933 that have to do with the switch from 140 to 280 characters. 169 00:08:22,100 --> 00:08:25,210 So that's what are called extended tweets. 170 00:08:25,210 --> 00:08:28,040 And this is just lower down on the same page 171 00:08:28,040 --> 00:08:29,500 I was showing you. 172 00:08:29,500 --> 00:08:32,740 And then finally there's a general overview 173 00:08:32,740 --> 00:08:36,000 of all the JSON objects that the Twitter APIs return 174 00:08:36,000 --> 00:08:38,670 and links to the specific object details. 175 00:08:38,670 --> 00:08:43,670 You can find that also at the Tweet Objects page. 
176 00:08:43,770 --> 00:08:46,900 It looks like they redid that URL to point to this page 177 00:08:46,900 --> 00:08:49,580 since I had originally created that. 178 00:08:49,580 --> 00:08:52,850 So with that said, next up we're going to talk briefly 179 00:08:52,850 --> 00:08:54,690 about getting Tweepy installed 180 00:08:54,690 --> 00:08:57,190 and then we're going to jump into the source code.