From the course: AI in Risk Management and Fraud Detection

Building and evaluating baseline models

- [Presenter] With your features engineered and your data cleaned, it's time to build your first fraud detection models. Let's train and evaluate two foundational algorithms: logistic regression and decision trees. These are your baseline models: simple, interpretable, and fast to train. They establish a performance benchmark and show how well your data is structured for fraud prediction. First, we'll split the dataset. A common approach is to use 80% of the data for training and 20% for testing, so the model learns from most of the data but is still evaluated on transactions it has not seen. In ChatGPT, try "Split the dataset 80/20 using train_test_split with random_state=42." Setting random_state=42 fixes the random seed, so the data is split the same way every time you run the command. The data has been split into 1,600 samples in the training set and 400 samples in the test set. There are 133 numeric columns. Now that the data's been split, we can train our logistic…
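For readers who want to run the split outside ChatGPT, here is a minimal sketch in Python with scikit-learn. The course dataset is not available here, so the code generates a synthetic stand-in with make_classification, sized to match the 2,000 rows and 133 numeric columns mentioned in the transcript. The class balance, the model settings (max_iter, max_depth), and the use of accuracy as a quick check are illustrative assumptions, not values taken from the course.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the course data (assumption): 2,000 rows,
# 133 numeric features, with a small positive ("fraud") class.
X, y = make_classification(
    n_samples=2000,
    n_features=133,
    weights=[0.95, 0.05],
    random_state=0,
)

# 80/20 split; random_state=42 makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)

# Two baseline models. Hyperparameters are illustrative assumptions.
baselines = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=42),
}

for name, model in baselines.items():
    model.fit(X_train, y_train)
    # .score() reports accuracy on the held-out test set.
    print(name, "test accuracy:", model.score(X_test, y_test))
```

With 2,000 rows, test_size=0.2 holds out 400 rows and leaves 1,600 for training, which matches the counts reported in the transcript. Running the split again with the same random_state returns the same rows in each set.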
