Project: ML - Neural Networks (Flower Type & Wine Cultivator)


Problem 1 (Flower Type):

  • Predict species of flower from flower properties: sepal width and length, petal width and length
  • Binary classification (only two species of flowers), using MLPClassifier (NN)

Problem 2 (Wine Cultivator):

  • Predict cultivator from wine properties (data: alcohol, malic acid, color intensity, hue, magnesium, etc)
  • Multiclass classification (3 cultivators), using MLPClassifier (NN)


Tools:

  • Feature engineering: rescaling with StandardScaler and reshuffling the dataframe rows
  • Models: MLPClassifier (feed-forward neural network with three hidden layers)
  • Model validation: holdout validation via train_test_split (default test_size=0.25)
  • Error metrics: AUC, classification_report, confusion_matrix


Load defaults

In [32]:
import numpy as np
import pandas as pd
import seaborn as sns
import re
import requests 

%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from matplotlib import rcParams
import matplotlib.dates as mdates
from datetime import datetime
from IPython.display import display, Math

from functions import *

plt.style.use('seaborn')
plt.rcParams.update({'axes.titlepad': 20, 'font.size': 12, 'axes.titlesize':20})

colors = [(0/255,107/255,164/255), (255/255, 128/255, 14/255), 'red', 'green', '#9E80BA', '#8EDB8E', '#58517A']
Ncolors = 10
color_map = plt.cm.Blues_r(np.linspace(0.2, 0.5, Ncolors))
#color_map = plt.cm.tab20c_r(np.linspace(0.2, 0.5, Ncolors))


#specific to this project
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
#to normalize data
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report, confusion_matrix

print("Defaults Loaded")
Defaults Loaded


Problem 1: Predict Flower Type from flower properties

In [66]:
# Read in dataset
iris = pd.read_csv("./data/iris.csv")

display(iris[:3])

# shuffle rows (train_test_split shuffles too, but this also removes
# the original ordering by species from the dataframe itself)
shuffled_rows = np.random.permutation(iris.index)
iris = iris.loc[shuffled_rows,:]

# encode the two species labels as integers
species_map = {'Iris-versicolor': 1, 'Iris-virginica': 2}
iris['species'] = iris['species'].map(species_map)

display(iris.describe().transpose())
   sepal_length  sepal_width  petal_length  petal_width          species
0           7.0          3.2           4.7          1.4  Iris-versicolor
1           6.4          3.2           4.5          1.5  Iris-versicolor
2           6.9          3.1           4.9          1.5  Iris-versicolor

              count   mean       std  min    25%  50%    75%  max
sepal_length  100.0  6.262  0.662834  4.9  5.800  6.3  6.700  7.9
sepal_width   100.0  2.872  0.332751  2.0  2.700  2.9  3.025  3.8
petal_length  100.0  4.906  0.825578  3.0  4.375  4.9  5.525  6.9
petal_width   100.0  1.676  0.424769  1.0  1.300  1.6  2.000  2.5
species       100.0  1.500  0.502519  1.0  1.000  1.5  2.000  2.0
In [44]:
X = iris.drop('species',axis=1)
y = iris['species']

#default split: 75% train / 25% test
X_train, X_test, y_train, y_test = train_test_split(X, y)

#normalize data using StandardScaler, fit on the training set only
#so that no information from the test set leaks into the transform
scaler = StandardScaler()
scaler.fit(X_train)

# Now apply the transformations to the data:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

print("Data normalized")
Data normalized
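As an aside, scikit-learn's Pipeline can bundle the scaler and the classifier so the fit/transform bookkeeping happens automatically and the scaler is always fit on training data only. A minimal sketch of an equivalent setup, assuming it is run on the raw (untransformed) output of train_test_split:

from sklearn.pipeline import make_pipeline

#the pipeline fits the scaler on the training data and applies it
#before every fit/predict call on the classifier
pipe = make_pipeline(StandardScaler(),
                     MLPClassifier(hidden_layer_sizes=(13,13,13), max_iter=500))
pipe.fit(X_train, y_train)
predictions = pipe.predict(X_test)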
In [45]:
# 3 hidden layers of 13 units each, training capped at 500 iterations
mlp = MLPClassifier(hidden_layer_sizes=(13,13,13),max_iter=500)
mlp.fit(X_train,y_train)
Out[45]:
MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(13, 13, 13), learning_rate='constant',
       learning_rate_init=0.001, max_iter=500, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='adam', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False)
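As a quick convergence check, the adam and sgd solvers record the training loss at each iteration in loss_curve_; plotting it shows whether max_iter=500 was enough. A minimal sketch, using the fitted mlp above:

#training loss recorded at each iteration (available for the sgd/adam solvers)
plt.plot(mlp.loss_curve_)
plt.xlabel('iteration')
plt.ylabel('training loss')
plt.show()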


Error Metrics: Confusion Matrix and AUC

In [63]:
predictions = mlp.predict(X_test)

#classification report
print(classification_report(y_test,predictions))

#confusion matrix: rows = actual class, columns = predicted class
#(class 1 is taken as the positive class, class 2 as the negative class)
cm = confusion_matrix(y_test, predictions)
True_Positives = cm[0][0]   #actual 1, predicted 1
False_Negatives = cm[0][1]  #actual 1, predicted 2
False_Positives = cm[1][0]  #actual 2, predicted 1
True_Negatives = cm[1][1]   #actual 2, predicted 2

print("True_Positives: {:d}".format(True_Positives))
print("False_Negatives: {:d}".format(False_Negatives))
print("False_Positives: {:d}".format(False_Positives))
print("True_Negatives: {:d}".format(True_Negatives))

#AUC (computed here from the hard class predictions; see the
#probability-based variant after the output below)
auc = roc_auc_score(y_test, predictions)
print("\nAUC: {:0.3f}".format(auc))
              precision    recall  f1-score   support

           1       1.00      0.83      0.91        12
           2       0.87      1.00      0.93        13

   micro avg       0.92      0.92      0.92        25
   macro avg       0.93      0.92      0.92        25
weighted avg       0.93      0.92      0.92        25

True_Positives: 10
False_Negatives: 2
False_Positives: 0
True_Negatives: 13

AUC: 0.917
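Note that the AUC above is computed from hard class predictions, which ties the score to a single decision threshold. Scoring the predicted probability of the positive class instead gives the usual threshold-independent AUC; a minimal sketch, assuming the fitted mlp and the test split above (scikit-learn takes the larger label, 2, as the positive class):

#column 1 of predict_proba is P(species == 2), since mlp.classes_ is [1, 2]
probs = mlp.predict_proba(X_test)[:, 1]
print("AUC from probabilities: {:0.3f}".format(roc_auc_score(y_test, probs)))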


Problem 2: Predict cultivator from Wine properties

In [22]:
columns = ["Cultivator", "Alcohol", "Malic_Acid", "Ash", "Alcalinity_of_Ash",
           "Magnesium", "Total_phenols", "Flavanoids", "Nonflavanoid_phenols",
           "Proanthocyanins", "Color_intensity", "Hue", "OD280", "Proline"]
wine = pd.read_csv('./data/wine_data.csv', names = columns)
display(wine.iloc[:3,:12])
   Cultivator  Alcohol  Malic_Acid   Ash  Alcalinity_of_Ash  Magnesium  Total_phenols  Flavanoids  Nonflavanoid_phenols  Proanthocyanins  Color_intensity   Hue
0           1    14.23        1.71  2.43               15.6        127           2.80        3.06                  0.28             2.29             5.64  1.04
1           1    13.20        1.78  2.14               11.2        100           2.65        2.76                  0.26             1.28             4.38  1.05
2           1    13.16        2.36  2.67               18.6        101           2.80        3.24                  0.30             2.81             5.68  1.03
In [23]:
wine.describe().transpose()
Out[23]:
                      count        mean         std     min       25%      50%       75%      max
Cultivator            178.0    1.938202    0.775035    1.00    1.0000    2.000    3.0000     3.00
Alcohol               178.0   13.000618    0.811827   11.03   12.3625   13.050   13.6775    14.83
Malic_Acid            178.0    2.336348    1.117146    0.74    1.6025    1.865    3.0825     5.80
Ash                   178.0    2.366517    0.274344    1.36    2.2100    2.360    2.5575     3.23
Alcalinity_of_Ash     178.0   19.494944    3.339564   10.60   17.2000   19.500   21.5000    30.00
Magnesium             178.0   99.741573   14.282484   70.00   88.0000   98.000  107.0000   162.00
Total_phenols         178.0    2.295112    0.625851    0.98    1.7425    2.355    2.8000     3.88
Flavanoids            178.0    2.029270    0.998859    0.34    1.2050    2.135    2.8750     5.08
Nonflavanoid_phenols  178.0    0.361854    0.124453    0.13    0.2700    0.340    0.4375     0.66
Proanthocyanins       178.0    1.590899    0.572359    0.41    1.2500    1.555    1.9500     3.58
Color_intensity       178.0    5.058090    2.318286    1.28    3.2200    4.690    6.2000    13.00
Hue                   178.0    0.957449    0.228572    0.48    0.7825    0.965    1.1200     1.71
OD280                 178.0    2.611685    0.709990    1.27    1.9375    2.780    3.1700     4.00
Proline               178.0  746.893258  314.907474  278.00  500.5000  673.500  985.0000  1680.00
In [25]:
print(wine.shape)
(178, 14)

178 data points with 13 features and 1 label column

In [28]:
X = wine.drop('Cultivator',axis=1)
y = wine['Cultivator']

#default split: 75% train / 25% test
X_train, X_test, y_train, y_test = train_test_split(X, y)

#normalize data using StandardScaler, again fit on the training set only
scaler = StandardScaler()
scaler.fit(X_train)

# Now apply the transformations to the data:
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

print("Data normalized")
Data normalized
/Users/BrunoHenriques/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py:617: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.
  return self.partial_fit(X, y)
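This DataConversionWarning only notes that integer-typed feature columns (such as Magnesium) were cast to float64 by StandardScaler. Casting the features explicitly beforehand silences it; a minimal sketch:

#explicit cast to float avoids the DataConversionWarning from StandardScaler
X = wine.drop('Cultivator', axis=1).astype(float)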


Train the model

In [30]:
# same architecture as before: 3 hidden layers of 13 units each
# (13 matches the number of input features here), up to 500 iterations
mlp = MLPClassifier(hidden_layer_sizes=(13,13,13),max_iter=500)
mlp.fit(X_train,y_train)
Out[30]:
MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(13, 13, 13), learning_rate='constant',
       learning_rate_init=0.001, max_iter=500, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='adam', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False)

Hyperparameters (e.g. hidden_layer_sizes, max_iter, activation, solver) can be adjusted

In [36]:
predictions = mlp.predict(X_test)
#confusion matrix: rows = actual class, columns = predicted class
print(confusion_matrix(y_test,predictions))
print(classification_report(y_test,predictions))
[[18  0  0]
 [ 1 12  0]
 [ 0  1 13]]
              precision    recall  f1-score   support

           1       0.95      1.00      0.97        18
           2       0.92      0.92      0.92        13
           3       1.00      0.93      0.96        14

   micro avg       0.96      0.96      0.96        45
   macro avg       0.96      0.95      0.95        45
weighted avg       0.96      0.96      0.96        45
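The report and confusion matrix above give a per-class picture; for a single AUC-style summary in the multiclass case, newer scikit-learn releases (0.22 and later) let roc_auc_score consume class probabilities with a one-vs-rest strategy. A minimal sketch, assuming the fitted mlp above and a sufficiently recent scikit-learn:

#one-vs-rest AUC averaged over the 3 cultivator classes (requires scikit-learn >= 0.22)
probs = mlp.predict_proba(X_test)
print("multiclass AUC (OvR): {:0.3f}".format(roc_auc_score(y_test, probs, multi_class='ovr')))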

  • coefs_ is a list of weight matrices, where the matrix at index i holds the weights between layer i and layer i+1.
  • intercepts_ is a list of bias vectors, where the vector at index i holds the bias values added to layer i+1.
In [34]:
print(len(mlp.coefs_))          #4 weight matrices: input->h1, h1->h2, h2->h3, h3->output
print(len(mlp.coefs_[0]))       #first matrix has 13 rows, one per input feature
print(len(mlp.intercepts_[0]))  #13 biases, one per unit in the first hidden layer
4
13
13
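To inspect the full architecture at once, the weight matrices and bias vectors can be iterated in parallel; a minimal sketch using the fitted mlp above:

#shape of every layer-to-layer weight matrix and bias vector
for i, (W, b) in enumerate(zip(mlp.coefs_, mlp.intercepts_)):
    print("layer {} -> layer {}: weights {}, biases {}".format(i, i + 1, W.shape, b.shape))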