In this blog, I am going to discuss a 6-step framework for approaching any Machine Learning problem. Let's not waste any time and dive into the topic.
Step 1: Frame the problem
As a first step, you need to articulate your problem by identifying its type, which depends on your business problem.
The type can be binary classification, unidimensional regression, multi-class single-label classification, multi-class multi-label classification, multidimensional regression, clustering (unsupervised), or something else (translation, parsing, bounding box identification, etc.).
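To make the problem types a bit more concrete, here is a small sketch that looks at the shape and values of a target column and reports what kind of problem it suggests. It uses scikit-learn's `type_of_target` helper; the toy targets and the `example_targets` name are made up for illustration, and the business problem should still have the final say.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical toy targets, one per problem type discussed above.
example_targets = {
    "binary classification": np.array([0, 1, 1, 0]),
    "multi-class single-label": np.array([0, 1, 2, 1]),
    "multi-class multi-label": np.array([[1, 0, 1], [0, 1, 0]]),
    "unidimensional regression": np.array([0.5, 1.2, 3.3]),
    "multidimensional regression": np.array([[1.5, 2.0], [3.1, 4.2]]),
}

# Ask scikit-learn how it would interpret each target.
detected = {name: type_of_target(y) for name, y in example_targets.items()}

for name, kind in detected.items():
    print(f"{name:30s} -> {kind}")
```

Clustering has no target column at all, so it does not show up here.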
Step 2: Get the Data
The next step is to get the data and store it in the right format according to your problem statement.
Analyze your data to check whether you have enough of it, and check the quality of the data as well.
The quality of the data fundamentally determines whether you will be able to solve the problem at all.
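A quick first pass at the "enough data" and "quality" questions could look like the sketch below. It assumes the data fits in a pandas DataFrame; the columns (`age`, `income`, `churned`) are a made-up toy example, not data from a real project.

```python
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, None, 41, 32],
    "income": [40000, 52000, 61000, None, 52000],
    "churned": [0, 0, 1, 0, 0],
})

n_rows = len(df)                              # how much data do we have?
missing_per_column = df.isna().sum()          # missing values per column
n_duplicates = int(df.duplicated().sum())     # exact duplicate rows
class_counts = df["churned"].value_counts()   # is the target balanced?

print("rows:", n_rows)
print("missing values:\n", missing_per_column)
print("duplicate rows:", n_duplicates)
print("target distribution:\n", class_counts)
```

Checks like these won't tell you everything, but they surface the most common problems early.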
Step 3: Data Pre-processing
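Once you have the data, the next step is to analyze it and extract insights that can inform business decisions.
Also, apply basic data pre-processing operations, such as handling missing values and encoding categorical columns, to bring the data into a ready-to-use format.
Choose a library that suits your data and your task.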
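As an illustration of basic pre-processing, here is a minimal sketch using pandas (my choice of library for this example, not the only option). The `raw` table and its columns are hypothetical. In a real project you would compute the median on the training split only.

```python
import pandas as pd

# Hypothetical raw data with missing values and a categorical column.
raw = pd.DataFrame({
    "age": [25.0, None, 41.0, 38.0],
    "city": ["Pune", "Delhi", "Pune", None],
})

clean = raw.copy()

# Fill missing numeric values with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Fill missing categories with an explicit placeholder.
clean["city"] = clean["city"].fillna("unknown")

# One-hot encode the categorical column so models can consume it.
encoded = pd.get_dummies(clean, columns=["city"])

print(encoded)
```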
Step 4: Evaluation Metric
The most important step is knowing how to evaluate our results.
We need to choose the right evaluation metric for the problem we are going to solve.
For example, if we have an imbalanced dataset, we usually choose the ROC-AUC metric, because plain accuracy can look high even when the model ignores the minority class.
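Here is a small sketch of why the metric matters on imbalanced data. It uses scikit-learn's `accuracy_score` and `roc_auc_score`; the labels and the "always negative" model are invented for the example.

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical imbalanced labels: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5

# A useless "model" that predicts the negative class for everyone
# and assigns the same score to every sample.
always_negative_labels = [0] * 100
constant_scores = [0.1] * 100

accuracy = accuracy_score(y_true, always_negative_labels)
auc = roc_auc_score(y_true, constant_scores)

print(f"accuracy: {accuracy:.2f}")  # looks great, but the model learned nothing
print(f"ROC-AUC:  {auc:.2f}")       # no better than random ranking
```

Accuracy rewards the model for the class imbalance itself, while ROC-AUC shows that it cannot separate the two classes.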
Step 5: Split the Data
In any machine learning problem, we split the data into multiple sets like training, validation, and test.
Stratified splitting, which preserves the class proportions in every subset, is the most common choice for classification problems, while plain K-Fold is typically used for regression problems.
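Both splitting strategies can be sketched with scikit-learn as follows. The data is synthetic (100 samples, 20% positives), and the split sizes and `random_state` values are arbitrary choices for the example.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

# Synthetic data: 100 samples, 80 negatives and 20 positives.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 80 + [1] * 20)

# Stratified hold-out split: the test set keeps the 80/20 class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print("positives in test set:", int(y_test.sum()), "of", len(y_test))

# Plain K-Fold: every sample lands in the validation fold exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes = [len(val_idx) for _, val_idx in kfold.split(X)]
print("validation fold sizes:", fold_sizes)
```

For classification with cross-validation, scikit-learn also offers `StratifiedKFold`, which combines both ideas.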
The most important thing to note is that whatever operations you apply to the training set must also be applied to the validation and test sets, using parameters learned from the training set only.
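A minimal sketch of this idea, using scikit-learn's `StandardScaler` as the example operation (the tiny arrays are made up): the scaler is fitted on the training set only, and the same fitted scaler is then reused on the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical one-feature train and test sets.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

# Learn the mean and standard deviation from the training data only.
scaler = StandardScaler().fit(X_train)

# Apply the same learned transformation to both sets.
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("mean learned from train:", scaler.mean_[0])
print("scaled test value:", X_test_scaled[0, 0])
```

Fitting the scaler on the test set as well would leak information about the test data into training.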
Step 6: Apply ML algorithms
And finally, we apply ML models to the data. We can't say in advance which model will work best; it's a matter of trial and error.
Apply multiple algorithms, tune their hyperparameters, evaluate the results, and choose the model that gives the most satisfying results.
Benchmark your solution based on your selected evaluation metric.
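Putting the last step together, here is one possible sketch: try a couple of algorithms, tune each with a small grid search, and benchmark them on the chosen metric. The synthetic dataset, the two candidate models, the parameter grids, and the use of ROC-AUC are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with small, hypothetical hyperparameter grids.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Tune with cross-validation on the training set, scored by ROC-AUC.
    search = GridSearchCV(model, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)
    test_scores = search.predict_proba(X_test)[:, 1]
    results[name] = {
        "best_params": search.best_params_,
        "cv_auc": search.best_score_,
        "test_auc": roc_auc_score(y_test, test_scores),
    }

# Pick the winner by cross-validated score; the test score is the benchmark.
best_name = max(results, key=lambda n: results[n]["cv_auc"])
print("best model:", best_name, results[best_name])
```

Selecting on the cross-validated score and reporting the test score separately keeps the test set as an honest final check.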
That's all from my end, folks. Hope you enjoyed this. Connect with me on Twitter, where I post daily about Data Science and Machine Learning.