[Instructor] One of our biggest goals in this lesson is to give you experience with a variety of cloud and desktop-based big-data software. Now, the cloud vendors are going to be focused on providing you with what they call service-oriented architecture. So, as a service, they provide you with these complex capabilities that enable you to write Hadoop apps, Spark apps, and pub/sub-based IoT applications by registering for, potentially (usually) paying for, and working with their services in the cloud. As part of this lesson, you're going to be doing that a number of times.

You'll be using a MongoDB Atlas cluster to work with a NoSQL database in the cloud. We're going to use a Microsoft Azure HDInsight cloud-based cluster for our Hadoop example. In that same service, we'll also use a Spark cluster, and we'll run a Spark cluster locally on the desktop as well. We'll call that "a cluster" because, when it's running locally, it's one computer, but it simulates the concept of a cluster with threads of execution.
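To make the idea of one computer standing in for a cluster a little more concrete, here is a minimal sketch in plain Python. It is only an analogy and does not use Spark: the function names `count_words` and `word_count`, the sample sentences, and the round-robin partitioning are all assumptions made for illustration. Each thread plays the role of one worker node, and the partial results are merged at the end. Spark's own local mode is usually requested with a master setting such as `local[*]`, which asks for one worker thread per available core.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Count the words in one partition (the work a single 'node' would do)."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def word_count(lines, num_workers=4):
    """Split lines into partitions, count each in its own thread, then merge."""
    # Hypothetical round-robin split; a real cluster framework decides
    # how to partition the data itself.
    partitions = [lines[i::num_workers] for i in range(num_workers)]

    # Each thread stands in for one worker node of the simulated cluster.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))

    # Combine the per-partition results into a single answer.
    total = Counter()
    for partial in partial_counts:
        total += partial
    return total


if __name__ == "__main__":
    sample = [
        "big data in the cloud",
        "spark in the cloud",
        "hadoop and spark",
    ]
    print(word_count(sample, num_workers=2).most_common(3))
```

The split-process-merge shape is the same one a real cluster uses. The difference is that the "nodes" here are threads sharing one machine's memory, so nothing travels over a network.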
We'll also be using other types of online services, like the Freeboard.io dashboard visualizations I mentioned in a couple of the preceding videos, the PubNub publish/subscribe service, and the Dweet.io publish/subscribe service as well. There are many other options out there for you. We chose the particular ones that we're showing you for their ease of use and sometimes for their free versions, free tiers, or the credits that they give you, so that you have a chance to experience these different capabilities and work with them. But there are plenty of similar services from companies like Amazon, Google, and IBM Watson.

There are also, for the purpose of Hadoop and Spark, free desktop versions of the Hortonworks and Cloudera platforms. Those two companies recently merged, so those will most likely become one platform eventually. They too have paid cloud-based versions, often running on the clouds of other providers like Amazon, Google, IBM, and Microsoft. And then finally, you also have the ability to work with a single-node Spark cluster in the context of the free Databricks Community Edition.
The folks who founded Databricks are actually the people who originally created Spark, so it's an interesting website for a number of reasons, not the least of which is that they created the software in the first place. They also have a lot of great learning resources for those of you who are interested in pursuing the Spark platform in more detail.