From the course: Learning Data Science: Understanding the Basics

Use statistics and software

- Because data science is still being defined by practice, there's an extra emphasis on using common software and tools. Data scientists are like early archaeologists, so think of software as the brushes and pickaxes you'll need to make discoveries. Try not to get too focused on learning all the tools. The tools in themselves will not make you a data scientist. It's the scientific method, and not the tools, that makes someone a data scientist. The tools basically fall into three categories: storing, scrubbing, and analyzing. To store the data, you can use spreadsheets, databases, and key-value stores. Some popular ones are Hadoop, Cassandra, and PostgreSQL. Scrubbing is a common practice to make the data easier to work with. Here you use text editors, scripting tools, and programming languages like Python and Scala. Finally, there are the statistical packages to help analyze the data. The most popular are the open-source package R, SPSS, and Python's data libraries. When you use…
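
As a rough illustration of the three categories named in the transcript (storing, scrubbing, analyzing), here is a minimal Python sketch using only the standard library. It is not taken from the course: the sample readings, the table name `readings`, and the helper functions `store`, `scrub`, and `analyze` are all hypothetical, and SQLite stands in for whichever database you would actually use.

```python
import sqlite3
import statistics

# Hypothetical raw readings: inconsistent whitespace and case, one missing value.
RAW_ROWS = [
    ("  Site A ", "12.5"),
    ("site a", "13.1"),
    ("Site B", ""),
    ("SITE B ", "9.8"),
    ("Site B", "10.4"),
]


def store(rows):
    """Store: load the raw rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (site TEXT, value TEXT)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


def scrub(conn):
    """Scrub: normalize site names and drop rows that have no value."""
    cleaned = []
    for site, value in conn.execute("SELECT site, value FROM readings"):
        if not value.strip():
            continue  # skip rows with a missing measurement
        cleaned.append((site.strip().lower(), float(value)))
    return cleaned


def analyze(cleaned):
    """Analyze: compute the mean value for each site."""
    by_site = {}
    for site, value in cleaned:
        by_site.setdefault(site, []).append(value)
    return {site: statistics.mean(values) for site, values in by_site.items()}


if __name__ == "__main__":
    conn = store(RAW_ROWS)
    print(analyze(scrub(conn)))
```

The point of the sketch is the shape of the workflow rather than the specific tools: each stage could be swapped for a different database, scripting language, or statistical package without changing the overall process.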
