February 21, 2023

3 Modeling Techniques

1 Modeling Methods

#	Modeling Method	Response Variable (Numerical/Categorical)	Supervised or Unsupervised	Strategy
1	Linear & Polynomial Regression	Numerical	Supervised	Error Based (Minimizing Error)
2	Logistic Regression	Categorical (Binary)	Supervised	Maximizing Likelihood
3	Discriminant Analysis	Categorical	Supervised
4	K Nearest Neighbor	Categorical	Supervised	Similarity Based
5	Decision and Regression Trees	Categorical + Numerical	Supervised	Information Based
6	Naïve Bayes	Categorical	Supervised	Probability Based
7	Neural Networks	Numerical + Categorical	Supervised	Mimicking Human Brain
8	Clustering		Unsupervised
9	Principal Component Analysis		Unsupervised
10	Support Vector Machines	Categorical	Supervised	Error Based
11	ARIMA : Time Series	Numerical	Supervised	Auto Regression & Moving Average

2 Estimation or Classification

Goals of Machine Learning Application: Estimation or Classification

Estimation – a regression modeling technique is used

Output is a number
- House price
- Product sales for next quarter
- GNP growth for the next quarter
- Employment

Classification – modeling techniques such as Naïve Bayes and Decision Trees are used

Output is a categorical variable
- Sports team will win or lose
- Email is junk or not
- Which grade a student will get
- Tweet is positive or negative
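
The difference between the two goals shows up in the type of the output. The sketch below is illustrative only: the function names, the price coefficients, and the keyword list are hypothetical and are not fitted to any data. It shows an estimator returning a number and a classifier returning a category.

```python
def estimate_house_price(area_sqft: float) -> float:
    """Estimation: the output is a number (hypothetical linear rule)."""
    base_price = 50_000.0      # assumed intercept, for illustration only
    price_per_sqft = 120.0     # assumed slope, for illustration only
    return base_price + price_per_sqft * area_sqft


def classify_email(text: str) -> str:
    """Classification: the output is a category (hypothetical keyword rule)."""
    junk_words = {"winner", "free", "prize"}   # assumed keyword list
    hits = sum(word in junk_words for word in text.lower().split())
    return "junk" if hits >= 2 else "not junk"


price = estimate_house_price(1000)             # a number
label = classify_email("Free prize inside")    # one of two categories
```

In a real application both rules would be learned from training data rather than written by hand.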

3 Classification of Modeling Methods

Response Variable

Numerical or Categorical

Supervised or unsupervised

Strategy

Error Based Learning
Similarity Based Learning
Information Based Learning
Probability Based Learning
Mimicking the Human Brain

4 Supervised vs. Unsupervised

Supervised learning is the most common learning type; it has a target/output variable (also called the supervisor)

The supervisor (target variable) teaches the algorithm how to build/learn the pattern model
In PA, supervised learning ≈ predictive modeling

Unsupervised learning has NO target variable

No supervisor to teach → algorithm has to learn by itself
In PA, unsupervised learning ≈ descriptive modeling
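
A minimal sketch of the contrast, assuming one-dimensional toy data invented for this example. `fit_supervised` and `fit_unsupervised` are hypothetical helper names: the first uses the labels (the supervisor) to learn one mean per class, while the second gets no labels and has to discover two groups by itself with a simple two-center k-means.

```python
def fit_supervised(xs, labels):
    """Supervised: the labels say which group each x belongs to.
    Learns the mean of x for every label."""
    sums, counts = {}, {}
    for x, y in zip(xs, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def fit_unsupervised(xs, iterations=10):
    """Unsupervised: no labels. 1-D k-means with two centers."""
    lo, hi = min(xs), max(xs)   # start the two centers at the extremes
    for _ in range(iterations):
        near_lo = [x for x in xs if abs(x - lo) <= abs(x - hi)]
        near_hi = [x for x in xs if abs(x - lo) > abs(x - hi)]
        if near_lo:
            lo = sum(near_lo) / len(near_lo)
        if near_hi:
            hi = sum(near_hi) / len(near_hi)
    return lo, hi


xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]                       # toy data
labels = ["small", "small", "small", "large", "large", "large"]

class_means = fit_supervised(xs, labels)    # needs the target variable
cluster_centers = fit_unsupervised(xs)      # works from xs alone
```

On this toy data both approaches end up with roughly the same two centers, but only the supervised one knows what the groups are called.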

5 Classifying Based on Strategy to Build a Model

5.1 Error Based Learning

Multivariable Linear Regression
Support Vector Machine

In error-based machine learning, we search for a set of parameters for a parameterized model that minimizes the total error across the predictions made by the model, with respect to a set of training instances (the training data).
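
The parameter search described above can be sketched with gradient descent on a straight-line model. This is one possible search procedure, not the only one; the training data, learning rate, and step count are assumptions chosen for illustration.

```python
def total_squared_error(w, b, data):
    """Total error of the model y = w*x + b over all training instances."""
    return sum((w * x + b - y) ** 2 for x, y in data)


def fit_line(data, learning_rate=0.05, steps=2000):
    """Search for (w, b) by repeatedly stepping against the error gradient."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradient of the total squared error, divided by n to scale the step.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (assumed for illustration).
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_line(training_data)
final_error = total_squared_error(w, b, training_data)
```

After the search, `w` and `b` should be close to the slope and intercept that generated the data, and `final_error` close to zero.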

5.2 Similarity Based Learning

K Nearest Neighbor

Compute the distance matrix between objects, then predict from the closest (most similar) ones
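
A minimal sketch of similarity-based learning, assuming Euclidean distance and a majority vote among the k closest training points. The points, labels, and function names are invented for this example.

```python
import math
from collections import Counter


def distance_matrix(points):
    """Pairwise Euclidean distances between all objects."""
    return [[math.dist(p, q) for q in points] for p in points]


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    by_distance = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumed): two well-separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

matrix = distance_matrix(points)
prediction = knn_predict(points, labels, query=(0.5, 0.5))
```

The distance matrix is symmetric with zeros on the diagonal; the prediction only needs the distances from the query to each training point.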

5.3 Information Based Learning

Decision Trees
Regression Trees
Splits in decision trees are chosen based on the entropy of the resulting data tables
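
As a sketch of how entropy can drive a split, the code below computes Shannon entropy of a label column and the information gain from splitting on one feature. The toy rows and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy table (assumed): columns are (outlook, temperature).
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(rows, labels, 0)
gain_temperature = information_gain(rows, labels, 1)
```

Here the first feature separates the labels perfectly and the second not at all, so a tree built this way would ask about the first feature first.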

Learn by Asking Questions

The Socratic approach to questioning is based on the practice of disciplined, thoughtful dialogue.
Socrates, the early Greek philosopher/teacher, believed that disciplined practice of thoughtful questioning enabled the student to examine ideas logically and to determine the validity of those ideas.

5.4 Probability Based Learning

Naïve Bayes

Bayes' theorem provides a way to compute the reverse probability.

Given P(B | A), we can compute P(A | B) = P(B | A) P(A) / P(B)

Naïve Assumption: Assuming Variable Independence
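
Under the independence assumption, the probability of a class given several features is proportional to the class prior multiplied by each per-feature likelihood. The sketch below applies that rule; every probability in it is a made-up number for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihoods):
    """Naive Bayes posterior.

    prior:       {class: P(class)}
    likelihoods: {class: [P(feature_1 | class), P(feature_2 | class), ...]}
    """
    scores = {}
    for cls, p_cls in prior.items():
        score = p_cls
        for p_feature in likelihoods[cls]:
            score *= p_feature     # naive assumption: features are independent
        scores[cls] = score
    evidence = sum(scores.values())   # normalizing constant
    return {cls: s / evidence for cls, s in scores.items()}


# Assumed numbers: an email containing the words "free" and "prize".
prior = {"junk": 0.4, "not junk": 0.6}
likelihoods = {
    "junk": [0.5, 0.4],        # P("free" | junk), P("prize" | junk)
    "not junk": [0.1, 0.05],   # P("free" | not junk), P("prize" | not junk)
}
result = posterior(prior, likelihoods)
```

The likelihoods run from class to feature, and the result runs from features to class, which is the reverse direction.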

5.5 Mimicking the Human Brain: Neural Networks

Extract linear combinations of the inputs
Model the target as a non-linear function of these features
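
A minimal sketch of these two steps for a network with one hidden layer. The choice of `tanh` as the non-linearity, the single numeric output, and all weights are assumptions for illustration; in practice the weights are learned from training data.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a single-hidden-layer network."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Step 1: extract a linear combination of the inputs.
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        # Step 2: pass it through a non-linear function.
        hidden.append(math.tanh(z))
    # The target is modeled from the non-linear hidden features.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hand-picked weights (assumed) for a network with 2 inputs and 2 hidden units.
output = forward(
    inputs=[1.0, 2.0],
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[2.0, -1.0],
    output_bias=0.5,
)
```

Stacking many such layers, each feeding the next, gives the deeper networks described below.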

Deep Learning: Complex set of Neural Networks with many layers of processing

Main Applications of Deep Learning Neural Networks

Image Recognition
- Convolutional Neural Networks
Image Classification
- Convolutional Neural Networks
Handwriting Identification
Speech Recognition
- Long Short-Term Memory Networks
