From the course: Introduction to Machine Learning with KNIME
Balancing data with Row Sampling node - KNIME Tutorial
Balancing data with Row Sampling node
- [Instructor] We're going to continue with the select task of the data preparation phase using the Row Sampling node, but we're also going to talk a bit about balancing. Let's take a look. Okay, we're going to continue where we left off, and let me just start typing "sampling." We actually see a bunch of results here, so I want to draw your attention to something. We have the Row Sampling node, which is the one that we want, but if we scroll down, there are numerous others. We have a database version of sampling in KNIME Labs. We have a Spark version of row sampling. So we don't want to forget all the additional power here. Over time, by using all of the different ways of pulling up these nodes, you're going to start to learn the numerous ones that might apply to your particular situation. So let's start with the most basic one: the Row Sampling node. I think, generally speaking, folks are too afraid to sample, or rather, I think most of us…
Contents
- Merging with the Joiner node (2m 24s)
- Aggregating with the GroupBy node (1m 37s)
- Creating new variables with Construct (3m 24s)
- Select data with Column Filter (2m 17s)
- Balancing data with Row Sampling node (3m 18s)
- Clean data with the Missing Value node (2m 46s)
- Format with Cell Splitter (3m 37s)