Selecting Data For Modeling
Two Approaches
Dot notation
Select the "prediction target"
Selecting with a column list
Selects the "features"
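The two approaches above can be sketched side by side. This is a minimal illustration, assuming the data lives in a pandas DataFrame named model (the name these notes use). The column names and values are hypothetical, invented only for the example.

```python
import pandas as pd

# Hypothetical data; column names and values are made up for illustration.
model = pd.DataFrame({
    "price": [10.0, 25.5, 8.0],
    "rating": [4.1, 3.8, 4.7],
    "sold": [120, 45, 300],
})

# Dot notation pulls out a single column as a Series: the prediction target.
y = model.sold

# A list of column names selects several columns as a DataFrame: the features.
features = ["price", "rating"]
X = model[features]

print(type(y).__name__, y.shape)   # Series (3,)
print(type(X).__name__, X.shape)   # DataFrame (3, 2)
```

Dot notation only works when the column name is a valid Python identifier and does not clash with a DataFrame attribute. Bracket access such as model["sold"] works for any column name.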
Dot Notation (choosing the prediction target)
Select the column we want to predict (the prediction target)
# By convention, this is named y
y = model.Column
Choosing "features"
Features are the columns fed into the model as inputs
features = ['price', 'name', 'clothes']
# By convention, this is named X
X = model[features]
Steps to building and using a model
Define
What type of model will it be? (e.g., a decision tree)
Fit
Capture patterns from the provided data
Predict
Evaluate
Determine accuracy of model's predictions
Example using scikit-learn
If you do not specify a random_state value, model training may involve some randomness, so results can vary between runs; setting it to a fixed number makes the results reproducible.
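The define, fit, predict, and evaluate steps can be sketched with scikit-learn as follows. This is a minimal sketch rather than the original example from these notes. The toy data, the column names, and the variable names data and tree_model are assumptions made for illustration. The DataFrame is called data here so that it does not clash with the estimator.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data; values are invented for illustration only.
data = pd.DataFrame({
    "rooms": [2, 3, 4, 3, 5, 2],
    "area": [50, 70, 95, 80, 120, 45],
    "price": [150, 210, 300, 240, 390, 140],
})
y = data.price                 # prediction target
X = data[["rooms", "area"]]    # features

# Define: choose the model type; a fixed random_state keeps runs reproducible.
tree_model = DecisionTreeRegressor(random_state=1)

# Fit: capture patterns from the provided data.
tree_model.fit(X, y)

# Predict: here on the training rows themselves, just to show the call.
predictions = tree_model.predict(X)

# Evaluate: mean absolute error between the true values and the predictions.
mae = mean_absolute_error(y, predictions)
print(mae)
```

Evaluating on the same rows used for fitting overstates accuracy. Measuring the error on data the model has not seen gives a more honest estimate.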