From the course: Synthetic Data for Software Testers
Creating your first synthetic dataset
- [Instructor] Have you ever encountered obstacles in acquiring test data due to privacy or accessibility issues? Well, the solution is crafting your own synthetic dataset. In this lesson, we're going to focus on guiding you through the creation of your first dataset. Synthetic data is fabricated information that mimics real-world data, allowing us to bypass privacy issues while retaining valuable statistical properties for things like analysis, development, and testing. Using Python, a favorite tool among data scientists, let's generate a synthetic dataset. We'll need Python installed along with the Pandas library, which is a cornerstone for data manipulation. Over in our IDE, we want to make sure that we have the Pandas library installed. Now, I know that it's installed on my machine, but in order to install it on yours, you're going to want to run either pip install pandas or pip3 install pandas, depending on your flavor of Python. All of these commands will be available…
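As a rough illustration of where this lesson is heading, here is a minimal sketch of building a small synthetic dataset with Pandas. It is not the instructor's code: the function name make_synthetic_users, the column names, and the sample values are all hypothetical, chosen only to show the idea of fabricating rows that look like real user records without containing any real person's data.

```python
import random

import pandas as pd


def make_synthetic_users(n_rows=10, seed=42):
    """Build a small DataFrame of made-up user records (hypothetical schema)."""
    # A seeded generator makes the fake data reproducible between test runs.
    rng = random.Random(seed)

    # Illustrative value pools; none of these come from real data.
    first_names = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
    plans = ["free", "basic", "premium"]

    rows = []
    for i in range(n_rows):
        rows.append(
            {
                "user_id": i + 1,                    # sequential identifier
                "name": rng.choice(first_names),     # random name from the pool
                "age": rng.randint(18, 80),          # inclusive on both ends
                "plan": rng.choice(plans),           # random subscription tier
            }
        )
    return pd.DataFrame(rows)


df = make_synthetic_users()
print(df.head())
```

Seeding the generator is a deliberate choice for testing: the same seed yields the same rows, so a failing test can be reproduced exactly.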