From the course: MongoDB Aggregation Pipeline: Advanced Data Analysis and Manipulation

Application and data model

- [Instructor] Before we get started with writing aggregation pipelines, let's first learn a little bit about the dataset and tools we'll be working with in this series. To follow along, you'll need access to a MongoDB database. You can install the database locally on your machine by downloading it from mongodb.com, or you can create a free cluster using MongoDB Atlas or deploy a server on a cloud provider like Microsoft Azure or DigitalOcean. Once you have a database server up and running, connect to it using MongoDB Compass, the official GUI tool for interacting with MongoDB. Compass can be downloaded from the MongoDB website and is free to use.

Once you have MongoDB Compass downloaded and installed, connect to your MongoDB database server. Depending on where you deployed the server, you'll simply need to provide the connection string to MongoDB Compass. For this series, I'll be using MongoDB Atlas, so I will head over to my MongoDB Atlas dashboard, find the database cluster I'm going to use, which in this case is called LinkedIn, and hit the Connect button. I'm going to be connecting with MongoDB Compass, so I will just copy the connection string. Now let's go into MongoDB Compass and paste our connection string. Before we can connect, we'll need to provide a username and password, which I've already set as LinkedIn and pass123. Yours will likely be something different, but that's okay; just make sure that you do have a user and a password. Then we'll hit Connect.

Once connected, we'll see all of the databases within this cluster. Currently we have admin, config, and local, but we want to work with our own database to write these aggregations. So let's go ahead and create a new database by hitting the Create database button, and we'll call our database LinkedIn. The initial collection name we'll set is customers. Once our database is created, we'll create three additional collections, which together will represent our dataset for this course: products, vendors, and orders.

Now that we have our collections set up, they don't have any data in them, so we'll need to go and get the data. To get our dataset, let's navigate to the GitHub repo, which has a data folder containing the dataset. I have the dataset in two different versions: a set of JSON files that you can import directly, and a faker-generate-dataset.js Node.js file, which will automatically generate and add all the data to your collections. I recommend using the JSON files because that ensures you are using the exact same data that I will be using in the course. If you use the faker-generate-dataset.js file, you'll have the right data format, but your results will likely be different.

Now, to show you how to import data into our database, let's go back into MongoDB Compass. Here in the customers collection, I'm going to hit the Import Data button, then Select a file. I'm already in the right directory, but if you don't have this dataset, go to the GitHub repo and clone it or download it. Let's import our customers: I'll double-click the customers file, confirm it is in JSON format, and hit Import. This is going to take a couple of seconds to load 100,000 documents into our database, and we'll repeat the process for orders, products, and vendors. For brevity, I'm not going to show the rest of those imports, but you'll need to do them before we continue on to writing the aggregations.
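One note that isn't shown in the video: if you'd rather script the import than click through Compass, the official MongoDB Node.js driver can load the same JSON files. The sketch below is only an illustration built on assumptions, not a file from the course repo; the data/customers.json path and the single-JSON-array file format are guesses you would adjust to match the actual files in the data folder, and MONGODB_URI is the same connection string you give to Compass.

// Minimal sketch of a scripted import, assuming data/customers.json holds one JSON array.
// Not part of the course repo; requires the mongodb package (npm install mongodb).
const fs = require('fs');
const { MongoClient } = require('mongodb');

async function importCustomers() {
  // If the file is newline-delimited JSON instead of a single array,
  // split its contents on newlines and JSON.parse each line instead.
  const docs = JSON.parse(fs.readFileSync('data/customers.json', 'utf8'));

  const client = new MongoClient(process.env.MONGODB_URI);
  try {
    await client.connect();
    const result = await client.db('LinkedIn').collection('customers').insertMany(docs);
    console.log(`Imported ${result.insertedCount} documents`);
  } finally {
    await client.close();
  }
}

importCustomers().catch(console.error);

The same pattern would repeat for the products, vendors, and orders files. Compass's Import Data button remains the simpler path and is the one used throughout this course.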
Once our data is imported, we'll hit Done, and we can look at our records. We have 100,000 documents, and each document has an _id field, the customer's full name, and their address. You'll do the same process for orders: Import Data, select a file, and note that orders is split across five different files because we want a large number of orders to work with, so you can just import those sequentially.

The next thing we're going to do is set up our application so that we can write aggregations in code as well, and we're going to do this in the GitHub repo that we cloned earlier. I have that repo opened up here in Visual Studio Code, and looking at the repo, we have a number of different files. We have the data folder, which holds our sample dataset, and we have this test.js file, which is going to verify that we've set everything up correctly. The next thing we'll need to do is set up our environment variable to connect to MongoDB. If we open .env.example, it has a MONGODB_URI field with a place to set our connection string. So let's rename this file to .env, just getting rid of the .example, and then we'll have to set our MONGODB_URI. To get the connection string again, since I'm using MongoDB Atlas, I'll go back to my dashboard, hit Connect, and this time choose the option for connecting from my application. I will copy the connection string, go back to Visual Studio Code, and paste it. Just like in the Compass example, I'll have to set my username and password, which I've set as LinkedIn and pass123.

To ensure that all of this works, let's go ahead and run the test.js file, which is going to go into our LinkedIn database, into the customers collection, and get the first record out. Before we can execute this code, though, we do need to run npm install to get our dependencies installed locally. Now that our dependencies are installed and ready to go, let's clear this and run node test.js. What we should get back is the first record in the customers collection, and we do: we get Willie Conn and his information. So now that we have our database set up and our data ready to go, we are ready to start learning about the MongoDB aggregation framework.
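For reference, once the .env file is renamed it holds a single line of the form MONGODB_URI=<your connection string>, with your own username and password filled in. The repo's actual test.js may look different, but a minimal sketch of the same check, assuming the connection string is read with the dotenv package and the official mongodb driver, would be:

// Minimal sketch of a connection test like test.js; the repo's real file may differ.
// Assumes npm install has pulled in the mongodb and dotenv packages.
require('dotenv').config(); // loads MONGODB_URI from the .env file

const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient(process.env.MONGODB_URI);
  try {
    await client.connect();
    // Grab the first document from the customers collection in the LinkedIn database.
    const firstCustomer = await client.db('LinkedIn').collection('customers').findOne();
    console.log(firstCustomer);
  } finally {
    await client.close();
  }
}

main().catch(console.error);

Running node test.js and seeing a customer document printed, as happens with Willie Conn in the video, confirms that the connection string, database name, and imported data are all in place.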
