Now, I'm excited to introduce Matt Welsh here for a talk on building applications using large language models. Matt is the co-founder and CEO of Fixie.ai, a startup developing a new computing platform for AI-based applications. Previously, he was Senior Vice President of Engineering at OctoML, he spent time at Apple and Google and as a professor at Harvard, and he's also the co-author of one of the first O'Reilly books ever, Running Linux. So, really happy to have you here, Matt. Go ahead and take it away.

All right, thanks so much for having me. Yeah, it was, gosh, maybe almost 30 years ago that I wrote my first O'Reilly book. It was also my only O'Reilly book, but it was the best O'Reilly book; once you've written one, you're pretty much done.

Today, what I'd like to do is talk to you a little bit. I mean, Malte's talk was a great overview of some of the challenges around building applications that use large language models.
What we're doing at Fixie is building on top of some of those same ideas, but also enabling people to connect their own applications using LLMs with their own data, and not just static data, but also live data coming from things like APIs, as well as using the language model to drive application logic. And so we're seeing a lot of interesting potential applications here.

So, I think everyone here is probably freaking out about ChatGPT and these technologies. This guy in particular should be freaking out that he's using a cell phone from 1997, I suppose. But I think one of the reasons that the world is kind of going crazy around this tech stack right now is that these large language models are a bit like an alien technology that just landed in our backyard, and we're only now figuring out how to make the best use of it. You know, it's not as if we'd seen slow, gradual improvement in natural language models over the course of years and now they're just starting to get fairly good; this transformation has kind of happened overnight.
And you know, this would be as if computer graphics went, in the span of something like six months, from the quality of Pong to Red Dead Redemption 2. And so I think this is the reason that most of the world is going a little bit crazy here.

So, the interesting thing about large language models is that you can think of them like a virtual machine that you can program in English, and this is a really interesting characteristic of these models. One interesting aspect is that this was discovered empirically. Large language models were not previously known to have this kind of reasoning ability; in fact, they were designed to autocomplete text. And then people discovered empirically that you could actually get the language model to follow instructions, take a complex problem statement, break it into steps, and then execute those steps one at a time, manipulating a kind of world model as the execution proceeds.
So, let me show you an example of how that might work. When you program a large language model as a virtual machine, you can effectively teach the language model a new trick on the fly. If you've seen the movie The Matrix, it's like that scene where they need to learn how to fly the helicopter all of a sudden, and all of a sudden they have learned how to fly a helicopter. By sending the model a small number of examples of a task that you want it to perform, you can teach it a new trick. In this case, I'm going to teach the language model how to look up stock prices and how to answer questions about stock prices. What I'll do is say, "What is the current stock price for Apple?" and instruct the model that what it needs to do is call some external function to actually get the data it needs, because the language model does not know internally what the stock price is. But assuming that the model can do that and it gets back a response from that function, then the share price can be shared back as part of the response.
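The pattern described here can be sketched in a few lines. Everything below is illustrative: the `Func[...]`/`Say[...]` convention, the prompt wording, and the `lookup_price` stub are invented for this sketch (not Fixie's actual syntax), and a real system would send the few-shot prompt to an actual model rather than hard-coding the reply.

```python
# Few-shot prompt that teaches the model to emit a function call
# instead of guessing a stock price it cannot know.
FEW_SHOT_PROMPT = """\
Q: What is the current stock price for Apple?
A: Func[lookup_price] Say[AAPL]
Func says: 173.5
A: The current stock price for Apple is $173.50.

Q: {question}
A:"""

def lookup_price(ticker: str) -> float:
    """Stand-in for a real market-data API call."""
    fake_quotes = {"AAPL": 173.5, "MSFT": 312.1}
    return fake_quotes[ticker]

def run_turn(model_reply: str) -> str:
    """If the model's reply asks for a function call, perform it and
    return the observation to feed back into the next model turn."""
    if model_reply.startswith("Func[lookup_price]"):
        ticker = model_reply.split("Say[")[1].rstrip("]")
        return f"Func says: {lookup_price(ticker)}"
    return model_reply  # ordinary text: just show it to the user

# Simulate the model imitating the few-shot example for Microsoft:
print(run_turn("Func[lookup_price] Say[MSFT]"))  # Func says: 312.1
```

The point of the loop is that the model never needs to know the price; it only needs to have learned, from the examples, *when* to ask for it.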
And so the missing piece here is the ability to connect the language model to some external API or data source. This turns out to be quite easy to do, and it turns out to be an extremely powerful abstraction for building new software systems.

So, where I think a lot of this is going in the future is that we are going to start treating large language models like a new kind of computational substrate. That is, the language model itself is the code. We're not using the language model to generate code and then compiling that code and running it on a machine; instead, think of the language model as the machine that you're programming, but you're programming it by teaching it a new skill in English. And so you're teaching the language model how to interface to an API, maybe how to pull data from a database, how to transform data in certain ways, and also potentially how to use software that was originally meant for humans. We've seen some really interesting demos and products built up around this idea of interconnecting the language model with conventional software systems. And this is the reason that I think the industry is going berserk
around these ideas right now: it's primarily because this is such a powerful and new way of instructing machines that it's having a dramatic impact on our entire industry.

Okay, so the question is, how do we get there? Today, you could start by grabbing an API key from OpenAI, feeding prompts into a model, and getting back responses, and then you need to do something with those responses. But this turns out to be a very cumbersome, very low-level way of doing things, and where we see the world going is a kind of new programming abstraction built up around all of these ideas.

So, say you've got an amazing large language model app that you want to build. How do you do it today? Well, the first thing that you need to do is figure out which model you want to work with. Today, of course, most people are using OpenAI's models, but there are also Anthropic's models, and there are some open-source models like LLaMA and Alpaca. Google is coming out with its models, and other companies are coming out with theirs.
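The "grab an API key and feed prompts" workflow mentioned above looks roughly like this in outline. The network call is stubbed out so the sketch runs offline; in real code the stub would be an HTTP request to the model provider (for example, OpenAI's chat completions API).

```python
# Raw prompt-in, text-out loop against a chat model.
def call_model(messages: list[dict]) -> str:
    """Stand-in for the actual model call over the network."""
    last = messages[-1]["content"]
    return f"(model reply to: {last})"

def ask(question: str) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": question},
    ]
    reply = call_model(messages)
    # ...and now YOU must parse, validate, retry, log, and route this
    # free-form string yourself: that is the cumbersome, low-level part.
    return reply

print(ask("What is the capital of France?"))
```

Everything interesting (tool calls, multi-step plans, routing) has to be built by hand on top of this loop, which is the gap the rest of the talk is about.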
I actually expect that there will potentially be hundreds of open-source large language models that are competitive with today's GPT-3.5 model in the not-too-distant future, because the technology for training these models and the datasets that you need are becoming much more widely available. So you've got to figure out which model you want to use, okay? Then, of course, you've got to build connections between your application and the model and whatever data sources you want to interface with, whether it's databases, web pages, S3 buckets, or external APIs, and Malte's talk showed examples of how to do that. But then, of course, you've got all of the, let's say, fun cloud crap to deal with, and what I mean by that is: where are you hosting all of this logic? How are you interconnecting all these pieces together? Many of these things require specialized infrastructure, like a vector database such as Weaviate or Pinecone. And of course you need to have your monitoring, your testing, your evaluation, your authentication; all of these things need to be built up in order to use this stuff in a production environment. So I hope that I'm kind of setting this up for you as: of course, Matt, isn't there a better way?
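Since vector databases come up here without being unpacked, this is a toy sketch of the core operation that stores like Weaviate or Pinecone provide: embed text, then return the nearest stored document by cosine similarity. The bag-of-words "embedding" is a deliberate simplification; real systems use learned embedding vectors, approximate nearest-neighbor indexes, and persistence.

```python
import math

def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words 'embedding'; real systems use learned vectors."""
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    def __init__(self):
        self.docs: list[str] = []

    def add(self, text: str):
        self.docs.append(text)

    def query(self, text: str) -> str:
        """Return the stored document most similar to the query."""
        q = embed(text)
        return max(self.docs, key=lambda d: cosine(q, embed(d)))

db = ToyVectorDB()
db.add("the helicopter flight manual")
db.add("apple stock closed higher today")
print(db.query("what did apple stock do"))  # apple stock closed higher today
```

In an LLM application, the retrieved document is what gets stuffed into the model's prompt as context.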
This is maybe where this talk takes a left turn and sounds a little more like an infomercial, but I do want to show you how we're thinking about this at Fixie, because it's an example of how I think the platform for building large language model applications is going to evolve in the future. We have a particular way of thinking about this at Fixie, but we're also seeing plenty of other open-source projects and commercial companies building out their own kind of platform.

So, the idea with Fixie is this: it's a cloud-hosted service that integrates all of the pieces you need to build large language model applications, and in particular it incorporates a wide range of models under the hood, as well as all the cloud infrastructure pieces that I mentioned earlier. The key idea in Fixie is that you build what we call agents, where an agent is a large language model plus a little bit of code that knows how to talk to some external system. So, for example, a database agent would go from English queries to SQL queries and fetch data from a database. The GitHub agent would go from English to the GitHub API and fetch information about your pull requests, or issues, or whatnot.
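The "agent = language model + a little code" idea might look like this in outline. The `Agent` class, the few-shot format, and the keyword dispatch (standing in for the model's decision about which function to call) are all invented for illustration; this is a sketch, not Fixie's actual SDK.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    few_shots: list[str]  # English examples that teach the new trick
    funcs: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def handle(self, query: str) -> str:
        # A real agent sends few_shots + query to the model and lets it
        # pick the function; a keyword check stands in for that here.
        for name, func in self.funcs.items():
            if name.split("_")[-1] in query:  # e.g. "issues"
                return func(query)
        return "I don't know how to do that yet."

def github_issues(query: str) -> str:
    """Stand-in for a real call to the GitHub API."""
    return "You have 21 open issues."

github_agent = Agent(
    few_shots=["Q: How many issues are assigned to me?\n"
               "A: Func[github_issues]"],
    funcs={"github_issues": github_issues},
)
print(github_agent.handle("how many issues are assigned to mdwelsh"))
```

The division of labor is the point: the few-shot examples teach the model the trick, and the small function is the only piece that touches the outside world.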
And our idea is that as you build these agents, first of all, the agents are super easy to build, as I showed you just a few slides ago; the idea of instructing the language model to do a new trick is not that hard. And once you plug these agents into the backplane, if you will, they can communicate with each other entirely in English. And so the idea of building up an application becomes a matter of composing the abilities of these different agents, which could be built by different people and even be running in different places on the internet, and having them communicate with one another in English.

I'm going to show you a quick example of what that might look like in a customer service context. In this case, we have a user who's gone onto a website, and maybe there's a little chat widget on the website where they can type in a question; we've all seen these things pop up. So in this case the user is saying, "I ordered the wrong size t-shirt, I'd like to get one size larger." This ticket, plus some other relevant context, would flow into the Fixie platform, and Fixie, using the large language model, would figure out what sequence of steps needs to be taken in order to handle this particular ticket, because it's a return ticket; they're asking about an exchange. So in this instance, the ticket would flow through a set of agents that individually use the language model to do things like look up the customer's order history, check the stock to see if the t-shirt is still available, generate a return label by calling, say, the UPS APIs for that, and then finally draft an email reply that could be sent back either directly to the customer or potentially reviewed by a human before it's sent out. This is just an example, and I think the important thing here is that the sequence of agents and the sequence of steps that are invoked really comes down to the kind of input that was provided. And this is the kind of rich, powerful symbolic manipulation that large language models are so good at.

Another thing about Fixie is that we allow agents to process any kind of data, any kind of media, so it's not just text. Think about this a little bit like email attachments, where I can send you an email with an image in it or a PowerPoint deck, and you, as the recipient of that email, know how to deal with those attachments in most cases.
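The t-shirt scenario a moment ago can be sketched as a pipeline of stand-in agents, with the plan chosen from the ticket text. In the real platform the language model does the planning; the keyword rule and every function name here are invented for illustration.

```python
# Each function stands in for one agent in the customer-service flow.
def order_history(ticket):  return {**ticket, "order": "T-shirt, size M"}
def check_stock(ticket):    return {**ticket, "in_stock": True}
def return_label(ticket):   return {**ticket, "label": "UPS-12345"}
def draft_reply(ticket):    return {**ticket, "reply": "Your size L is on its way!"}

def plan(ticket_text: str):
    # A real system asks the LLM to plan; a keyword rule stands in here.
    if "wrong size" in ticket_text:
        return [order_history, check_stock, return_label, draft_reply]
    return []

def handle(ticket_text: str) -> dict:
    """Run the planned steps, threading the ticket state through each."""
    state = {"text": ticket_text}
    for step in plan(ticket_text):
        state = step(state)
    return state

result = handle("I ordered the wrong size t-shirt, I'd like one size larger")
print(result["reply"])  # Your size L is on its way!
```

Notice that the pipeline itself is data chosen at runtime from the input, which is the property being claimed for LLM-driven planning.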
So in this case, if we were to say, "Generate an image of a red panda and put it on top of the night sky," Fixie would build up a workflow that first generates the red panda image, second generates the night sky image, third uses a different agent to mask out the background of the red panda, and then a final agent composites the two images together to give you the final result. And again, you're doing all of this directly in natural language, and the individual agents are executing those steps.

So, as far as how you build agents in Fixie: the key idea is that you use a set of few-shot learning examples, exactly like the few shots that I showed earlier for the agent that looks up stock prices, and then you simply augment them with a little bit of code that knows how to call out to some external API or access some external data. And the idea is that this is just so easy to do that most developers can build an agent in something like ten minutes; the GitHub agent that we've built took me something like twenty minutes to build.
It's just incredibly easy to put these examples together and then provide those little low-level function hooks that need to be invoked. The reason this ends up being interesting is that we're leaning heavily on the language model to do the heavy lifting here, right? It's responsible for interpreting the query from the user and transforming it into these underlying function calls, and that's exactly what language models are good at, so we're using that as much as we possibly can.

A few people have asked in the chat about LangChain, and a lot of people ask me this question, so I'll just address it here. LangChain is another way to build agents in Fixie. LangChain is this fantastic library, in Python and TypeScript, that lets you build up large language model applications, and Fixie supports it. We have a simpler model here for building agents, and we think that LangChain is useful in the cases where you want full control over what's going on and more sophisticated agent behaviors; but if you want to build simpler agents, then this approach here is pretty easy to do.

The other thing we can do is build agents automatically, without any real code. And in this case, I'm showing it as code.
But you can pretty much think of this as just configuration. Here, what we're doing is providing a list of web pages, in this case the Wikipedia articles about the TV show Silicon Valley, and now I have an agent that knows how to answer questions about Silicon Valley. So I could go to this agent, without writing any additional code beyond this configuration, and ask questions about the TV show and what happened in it.

What I'd like to do in a few quick minutes here is show a live demo of this working in action, and I want to emphasize that this is all very rough-and-ready, kind of early-beta-quality software, so you'll forgive me if things don't work as expected. What we're looking at here is the Fixie developer dashboard. Keep in mind, Fixie is a platform, not a website; it's not about going to a different web page and chatting with Fixie instead of, say, ChatGPT. Instead, it's a set of APIs that you can integrate into your own application, but we do have this developer dashboard to make it really easy to experiment with. And what you're going to see here is that we have a bunch of agents that have been built.
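The configuration-only agent described a moment ago might be sketched like this: point an agent at a list of pages and it answers questions over them. The URLs, the `DocQAAgent` class, and the keyword-overlap retrieval are all illustrative stand-ins; a real agent would embed and index the pages, then prompt the LLM with the best-matching passage as context.

```python
# "Configuration": a list of pages the agent should know about.
PAGES = {
    "https://en.wikipedia.org/wiki/Silicon_Valley_(TV_series)":
        "Silicon Valley is a comedy about Richard Hendricks and Pied Piper.",
    "https://en.wikipedia.org/wiki/Pied_Piper_(Silicon_Valley)":
        "Pied Piper is a fictional startup with a compression algorithm.",
}

class DocQAAgent:
    def __init__(self, pages: dict[str, str]):
        self.pages = pages

    def ask(self, question: str) -> str:
        # Toy retrieval: pick the page sharing the most words with the
        # question. A real agent would use vector search, then ask the
        # LLM to answer from the retrieved text.
        words = set(question.lower().split())
        return max(self.pages.values(),
                   key=lambda text: len(words & set(text.lower().split())))

agent = DocQAAgent(PAGES)
print(agent.ask("what is pied piper"))
```

The developer writes no logic at all; the behavior comes entirely from the page list.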
Each of these agents knows how to talk to some different kind of system. For example, we have a Pillow agent that invokes the Python Pillow library for manipulating images. The Google Calendar agent can answer questions about your calendar and create events on it. We have the stock agent that I mentioned earlier. And so what I can do is pick any one of these agents that I might want to talk to and get it to answer some questions for me. In this case, I'm going to chat with the GitHub agent, and I'll say, "How many issues are assigned to mdwelsh?", which is my GitHub username. And what you should see here is that Fixie sends this query into the agent, and the agent converts it into a query against the GitHub API. In this case, we happen to be using, not an actual database, but a SQL interface to the GitHub API, so we convert the English to SQL. And then when we issue the query against GitHub, I get back the answer: there are 21 issues assigned to mdwelsh, which actually makes me nervous because I didn't know there were so many. And then I can say, "Show them to me."
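The English-to-SQL step can be sketched against an in-memory SQLite table that mimics a GitHub-issues view. A template stands in for the LLM translation here, and the schema and question format are invented for illustration; note also that a real agent would have to guard LLM-generated SQL against injection, which this naive f-string does not.

```python
import sqlite3

# A tiny table that mimics a SQL view over the GitHub issues API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id INTEGER, assignee TEXT, state TEXT)")
conn.executemany("INSERT INTO issues VALUES (?, ?, ?)",
                 [(i, "mdwelsh", "open") for i in range(21)] +
                 [(99, "someone-else", "open")])

def english_to_sql(question: str) -> str:
    # Stand-in for the LLM translation step.
    if question.startswith("How many issues are assigned to"):
        user = question.rstrip("?").split()[-1]
        return ("SELECT COUNT(*) FROM issues "
                f"WHERE assignee = '{user}' AND state = 'open'")
    raise ValueError("unsupported question")

sql = english_to_sql("How many issues are assigned to mdwelsh?")
count = conn.execute(sql).fetchone()[0]
print(count)  # 21
```

Swapping the template for a model call is the whole trick: the rest of the plumbing stays exactly as mundane as this.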
And when I do this... I... I'm sorry, this is the downside of doing demos live: we're changing things all the time, and sometimes things don't work as expected. Let me show you the version of this that I had done previously. What I say here is, "Show them to me," and it says, okay, I will go and fetch the open issues assigned to mdwelsh and return them as a list. So this worked pretty well in this instance.

Now we're going to come back and look at another example, where what I might want to do is fetch some data from the web and then visualize it. In this case, I said, "Make a pie chart of the top 10 countries by population." Fixie takes in this query and says, okay, the first thing I recognize is that this is a multi-step task: I first need to find out what the top 10 countries by population are. It sends that query into the router; the router in Fixie is responsible for deciding which agent should get the call, and it sends it into a search agent. The search agent gets the query, looks up the top 10 countries by population, does a Google search for that information, and comes back with the list of the top 10 countries and their relative populations.
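The router-plus-plan flow in this demo might be sketched like so. The agent names, the keyword routing rule (standing in for the LLM's routing decision), and the canned search result are all invented for illustration.

```python
def search_agent(task, _):
    """Stand-in for a web search; returns canned data."""
    return ["China", "India", "USA"]

def chart_agent(task, data):
    """Stand-in for a charting agent."""
    kind = "bar" if "bar" in task else "pie"
    return f"{kind} chart of {len(data)} countries"

AGENTS = {"search": search_agent, "chart": chart_agent}

def route(task: str):
    # A real router asks the LLM which agent fits; a keyword check
    # stands in for that decision here.
    return AGENTS["chart" if "chart" in task else "search"]

def run(plan: list[str]) -> str:
    """Execute each step, feeding one step's output into the next."""
    data = None
    for task in plan:
        data = route(task)(task, data)
    return data

print(run(["find the top 10 countries by population",
           "make a pie chart of this data"]))  # pie chart of 3 countries
```

Because each step's output flows into the next, changing only the final instruction ("bar chart instead") reuses the earlier steps unchanged, which is what the conversational follow-up in the demo relies on.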
Then we send this data back up into the router and say the second step is to generate a pie chart from this data; we send the data down into the chart agent, and the chart agent generates a pie chart.

Next, if I were to say, "Make a bar chart instead": in this case, I didn't say "make a bar chart of the top 10 countries by population." The conversational context is present here, and therefore it's really easy for the system to understand that I mean to do exactly the same thing, just using a bar chart instead, and so the final result is a bar chart.

So I think this is just intended to show a little bit of the power of building out these individual agents, getting them communicating with one another in English, and hopefully showing that it's really easy to do, and that all these things are running in the cloud. So if this has gotten you excited, definitely head on over to our website, fixie.ai, and you can sign up for the developer preview. It's completely free to use; you can build client applications, you can interact with the existing agents in the system, and everything that I just showed you should be working. Okay, so, takeaways here, basically:
We're just looking at large language models as this new kind of computational engine, and what we're seeing is that it's still pretty hard to tap into their abilities; there's a lot of machinery that you need to build up. But once you build up that machinery, we think it becomes extremely compelling to start replacing conventional software with this concept of natural-language agents. And so my general message is: if you haven't been freaking out yet, now might be a pretty good time to start. So with that, thank you very much, and if we have time for questions, I'm happy to answer any questions.

Thanks, Matt. We've got time for one question, which has come up quite a bit here in the Q&A. It's about testing: obviously, given the current error rates of large language models, how are you going to test the applications that you build with Fixie?

Yes, exactly. And I think Malte's talk earlier also spoke to that. My expectation is that any company, any platform, any system that's building around large language models is going to have to have a solution for quality evaluation and testing.
At Fixie, the way that we're thinking about that is, first of all, you can effectively unit-test the individual agents, and that's really helpful because an agent itself is fairly constrained in what it should be allowed to do and what kinds of answers it's going to give you. And then, at a second level above that kind of rigorous unit testing of one agent, you need the same kind of integration testing, to make sure that multiple agents are going to work together effectively. This is obviously an ongoing area of research for us and for many others, and so we are integrating a lot of capabilities into the Fixie platform to make that super easy and automated. But this is going to evolve a lot over time.

Just like everything else in this space, it sounds like it's moving fast. It sure is. Thanks very much for spending some time with us today. Great presentation, really appreciate it. Thank you.
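As a sketch of what "unit-testing an agent" from the closing answer can mean in practice: pin down golden question/answer pairs for a constrained agent and assert on them. The toy `stock_agent` and the golden cases are invented for illustration; real suites would also need to score fuzzier, non-deterministic LLM outputs rather than demand exact matches.

```python
def stock_agent(query: str) -> str:
    """Toy agent with a deliberately constrained answer format,
    which is what makes exact-match testing possible."""
    prices = {"AAPL": 173.50}
    ticker = query.rstrip("?").split()[-1]
    return f"The price of {ticker} is ${prices[ticker]:.2f}"

# Golden question/answer pairs pinned down ahead of time.
GOLDEN_CASES = [
    ("What is the price of AAPL?", "The price of AAPL is $173.50"),
]

def run_unit_tests() -> int:
    """Return the number of golden cases the agent fails."""
    failures = 0
    for query, expected in GOLDEN_CASES:
        if stock_agent(query) != expected:
            failures += 1
    return failures

print(run_unit_tests())  # 0
```

Integration testing, in the same spirit, would run golden conversations through several agents at once and check the end-to-end result.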