1 Similarity-Based Learning
Compute the distance matrices between objects
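As a minimal sketch of this step, the snippet below builds a pairwise distance matrix. The notes do not say which distance measure is used, so Euclidean distance is an assumption here, and the names `euclidean`, `distance_matrix`, and the toy `points` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(objects):
    """Return the symmetric matrix D where D[i][j] is the distance
    between object i and object j."""
    return [[euclidean(a, b) for b in objects] for a in objects]

# Hypothetical toy data: three objects with two numeric features each.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = distance_matrix(points)
for row in D:
    print(row)
```

The diagonal is zero (each object is at distance 0 from itself) and the matrix is symmetric, so in practice only the upper triangle needs to be computed.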
2 k Nearest Neighbor (kNN) Model
2.1 Pros and Cons of kNN
Pros | Cons |
---|---|
Simple and effective | Does not produce a model, limiting the ability to understand how the features are related to the class |
Makes no assumption about the underlying data distribution (non-parametric) | Requires selection of an appropriate value of k |
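To illustrate two of the points in the table, that kNN stores the training data instead of fitting a model and that the result depends on the choice of k, here is a small sketch. It assumes Euclidean distance and simple majority voting; `knn_predict` and the toy data are hypothetical and not taken from the notes.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs. There is no fitting step:
    the training data itself is all that kNN keeps.
    """
    # Sort training points by Euclidean distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy training set: two "A" points close to the query,
# three "B" points farther away.
train = [
    ((1.0, 1.0), "A"),
    ((2.0, 1.0), "A"),
    ((4.0, 4.0), "B"),
    ((5.0, 4.0), "B"),
    ((4.0, 5.0), "B"),
]
query = (2.0, 2.0)

for k in (1, 3, 5):
    print(k, knn_predict(train, query, k))
```

With this data the prediction is "A" for k = 1 and k = 3 but flips to "B" for k = 5, because at k = 5 every training point votes and the majority class wins regardless of distance.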
2.2 Example
3 kNN Model Assessment
4 Data Normalization: Standardization & Scaling
Suppose we have 2 variables
- Height: varies from 4 to 7 feet
- Net Worth: 100B
If we use both variables in a model
- Net Worth will dominate any distance calculation because its values are orders of magnitude larger than those of Height
Solution
- Standardize
- Scale
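The two options can be sketched as follows. The notes do not define either term, so this assumes "standardize" means z-scores (subtract the mean, divide by the standard deviation) and "scale" means min-max rescaling to [0, 1]. The function names and sample values are hypothetical.

```python
import statistics

def standardize(values):
    """Z-score standardization: (x - mean) / sample standard deviation.

    Assumes the values are not all identical (the standard deviation
    would be zero).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def min_max_scale(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1.

    Assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sample values, chosen only to match the ranges in the text.
heights = [4.0, 5.5, 7.0]        # feet
net_worths = [1e4, 1e6, 1e11]    # up to 100B

print(min_max_scale(heights))
print(min_max_scale(net_worths))
print(standardize(heights))
print(standardize(net_worths))
```

After either transformation both variables are on comparable ranges, so neither dominates a distance computation simply because of its units.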