ARTIFICIAL INTELLIGENCE
AI Knowledge Base by Sunil Marella

Introduction

Machine Learning is about learning from data and making decisions without being explicitly programmed. It creates predictive models by finding patterns in data.

Main phases of Machine Learning

  • Data Collection: Gathering data from various sources
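  • Data Preprocessing: Cleaning and transforming raw data so it is suitable for training
  • Model Training: Fitting a model to the prepared data so it learns patterns
  • Model Evaluation: Measuring model quality on data not used for training
  • Model Deployment: Making the trained model available to serve predictions
  • Model Monitoring: Tracking the deployed model's performance over time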

How Machine Learning Works

ML models extract patterns from historical data to make predictions or decisions about new, unseen data.

Machine Learning Models

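There are two main phases:

  1. Training: the model learns patterns from historical data
  2. Inferencing: the trained model makes predictions on new, unseen data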
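
The two phases can be sketched in a few lines. This is only an illustration, assuming a simple straight-line model fitted by least squares; the function names `train` and `infer` and the house-size data are hypothetical and not from any particular library.

```python
# Minimal sketch: "training" fits parameters from historical data,
# "inferencing" applies those parameters to a new, unseen input.

def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def infer(model, x):
    """Use the learned parameters to predict a value for a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical historical data: house size (m^2) -> price (in thousands).
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

model = train(sizes, prices)   # training phase
predicted = infer(model, 70)   # inferencing phase, on an unseen size
print(round(predicted, 2))
```

Training happens once (or periodically) and is the expensive part. Inferencing reuses the stored parameters and is cheap by comparison.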

Machine Learning Types

There are multiple types of machine learning; which one you use depends on what you're trying to predict.

  • Supervised Learning
    • Regression
    • Classification
      • Binary Classification
      • Multiclass Classification
      • Multi-label Classification
  • Unsupervised Learning
    • Clustering
    • Dimensionality Reduction
    • Association Rules
  • Reinforcement Learning
  • Semi-Supervised Learning
  • Self-Supervised Learning

Supervised Learning

A model learns from labeled data, meaning the correct answers are already known during training.

Ex:

  • Speech recognition: from audio to a text transcript
  • Translation: English to Dutch
  • Spam filtering: whether an email is spam or not

Algorithm Types

  • Classification: Predicts which category something belongs to (small number of possible outputs).
    • Binary classification: Predicts one of two possible outcomes. Ex: whether a patient is diabetic or not; whether an email is spam or not.
    • Multiclass classification: Predicts one of more than two possible outcomes. Ex: whether a movie is horror, comedy, or fiction.
  • Regression: Predicts a continuous numeric value (many possible outputs, and often many inputs are required). Ex: house pricing, stock prices, sales prediction.
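
The difference between the two algorithm types shows up in what the model returns: a category for classification, a number for regression. The sketch below uses k-nearest neighbors purely as an illustration (the source does not name an algorithm here). The function names and the toy email and house data are hypothetical.

```python
from collections import Counter

def nearest_neighbors(train_rows, query, k):
    """Return the k labeled rows whose features are closest to the query."""
    def squared_distance(features):
        return sum((a - b) ** 2 for a, b in zip(features, query))
    return sorted(train_rows, key=lambda row: squared_distance(row[0]))[:k]

def knn_classify(train_rows, query, k=3):
    """Classification: return the most common label among the neighbors."""
    neighbors = nearest_neighbors(train_rows, query, k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def knn_regress(train_rows, query, k=3):
    """Regression: return the average target value of the neighbors."""
    neighbors = nearest_neighbors(train_rows, query, k)
    return sum(target for _, target in neighbors) / k

# Hypothetical labeled data: (links, exclamation marks) -> label.
emails = [((0, 0), "not spam"), ((1, 0), "not spam"), ((0, 1), "not spam"),
          ((8, 5), "spam"), ((9, 7), "spam"), ((7, 6), "spam")]
# Hypothetical labeled data: (size in m^2,) -> price in thousands.
houses = [((50,), 150.0), ((60,), 180.0), ((80,), 240.0), ((100,), 300.0)]

label = knn_classify(emails, (8, 6))      # a category
price = knn_regress(houses, (70,), k=2)   # a continuous value
print(label, price)
```

Both functions learn from labeled rows, which is what makes them supervised. Only the type of the answer differs.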

Unsupervised Learning

A model learns from unlabeled data and tries to find structure on its own, for example by automatically grouping data points into clusters.

Algorithm Types

  • Clustering/Grouping - Groups similar data points together. Ex: Google News looks at thousands of news articles and groups related stories together; customers can be segmented into premium, medium, or budget groups based on age, salary, and location.
  • Dimensionality Reduction - Takes a big data set and compresses it into a much smaller one using fewer features (while losing as little information as possible). Ex: customer data contains a lot of information; reduce it to the 10 key features related to finances.
  • Anomaly Detection - Finds unusual data points. Ex: financial fraud detection systems, AIOps.
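
As one small illustration of anomaly detection, the sketch below flags values that lie far from the mean, measured in standard deviations (a z-score rule). This is only one simple approach, chosen for brevity. The function name, the threshold of 2.0, and the transaction amounts are assumptions made for the example.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical transaction amounts; no labels are provided.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
print(find_anomalies(amounts))
```

No labels are needed: the rule only uses the distribution of the data itself, which is what makes this an unsupervised technique.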

Supervised Learning - Training the model

Regression

Teaching a machine learning model to predict a continuous numeric value from input data.

Typical workflow:

  1. Collect data
  2. Split data
    • Training data
    • Testing data
  3. Train the regression model
  4. Evaluate using metrics: To check if the model is good, data scientists use evaluation metrics, for example:
    • MAE (Mean Absolute Error) – average absolute prediction error
    • MSE (Mean Squared Error) – average squared prediction error
    • RMSE (Root Mean Squared Error) – square root of MSE
    • R² (R-squared) – proportion of the variance in the target that the model explains
  5. Improve the model
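
The workflow above can be sketched end to end with the standard library alone. This is a minimal illustration, assuming a one-feature straight-line model. The helper names (`split_rows`, `fit_line`, `evaluate`), the 25% test fraction, and the synthetic data are all hypothetical choices for the example.

```python
import math
import random

def split_rows(rows, test_fraction=0.25, seed=0):
    """Step 2: shuffle, then hold out a fraction of rows for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Step 3: least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x, _ in rows))
    return slope, mean_y - slope * mean_x

def evaluate(y_true, y_pred):
    """Step 4: compute MAE, MSE, RMSE and R-squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}

# Step 1: hypothetical data, roughly y = 3x + 2 with a little noise.
rows = [(x, 3 * x + 2 + (0.5 if x % 2 else -0.5)) for x in range(1, 13)]

train_rows, test_rows = split_rows(rows)
slope, intercept = fit_line(train_rows)
predictions = [slope * x + intercept for x, _ in test_rows]
metrics = evaluate([y for _, y in test_rows], predictions)
print(metrics)
```

The metrics are computed on the held-out test rows only, so they estimate how the model behaves on data it has not seen. Step 5 would repeat the loop with more data, different features, or a different model.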

Real-world scenario: training is iterative. The model is repeatedly trained, validated, and evaluated until its performance is acceptable.


Classification

  • Binary Classification
  • Multiclass Classification

1. Binary Classification

Binary classification is a type of supervised machine learning task where a model learns from labeled data to classify inputs into one of two possible categories.

These categories are usually represented as:

  • 0 or 1
  • True or False
  • Yes or No
  • Positive or Negative

Popular algorithms for binary classification:

  • Logistic Regression
  • Decision Trees
  • Random Forest
  • Support Vector Machines (SVM)
  • Neural Networks
  • Gradient Boosting (XGBoost, LightGBM)
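
To make the first algorithm in the list concrete, here is a minimal sketch of logistic regression trained with plain gradient descent on a single feature. It is an illustration only; in practice a library implementation would normally be used. The function names, learning rate, epoch count, and the standardized toy data are assumptions for the example.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, labels, learning_rate=0.1, epochs=2000):
    """Fit weight and bias by gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(weight * x + bias) - y  # predicted minus actual
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias

def predict(model, x, threshold=0.5):
    """Return 1 (positive) or 0 (negative) for a new input."""
    weight, bias = model
    return 1 if sigmoid(weight * x + bias) >= threshold else 0

# Hypothetical standardized measurement -> label (1 = diabetic, 0 = not).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_logistic(xs, labels)
print(predict(model, 1.2), predict(model, -0.8))
```

The model outputs a probability, and the threshold turns it into one of the two categories. Moving the threshold trades false positives against false negatives.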

2. Multiclass Classification


Unsupervised Learning - Training the model

Clustering
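
Clustering can be sketched with k-means, a common clustering algorithm (the source does not name one, so this choice is an assumption). The example works on one-dimensional data to stay short. The function name `kmeans_1d`, the evenly spaced initialisation, and the customer spend figures are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters by alternating two steps:
    assign each value to its nearest centroid, then move each
    centroid to the mean of its assigned values."""
    ordered = sorted(values)
    # Deterministic initialisation: k evenly spaced values as centroids.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step (an empty cluster keeps its previous centroid)
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical monthly spend per customer; no labels are provided.
spend = [12, 15, 14, 48, 52, 50, 95, 102, 99]
centroids, clusters = kmeans_1d(spend, k=3)
print(clusters)
```

The three groups that emerge could be read as budget, medium, and premium customers, but that interpretation is added by a person afterwards; the algorithm only sees the numbers.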