The renowned Iris dataset comes from:

https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

```python
import numpy as np
import pandas as pd
import os

os.chdir(r"D:\Python\Numpy_JP\ch04-3")  # Choose your own working directory

df = pd.read_csv('iris.data', header=None)
print(df)
```
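The file has no header row, so `header=None` is required. As a minimal sketch of how the raw lines parse (the three sample rows below are in the same comma-separated format as iris.data, not read from the actual file):

```python
import io
import pandas as pd

# A few lines mimicking the format of iris.data (no header row)
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
df = pd.read_csv(sample, header=None)
print(df.shape)  # (3, 5): four numeric features plus the species label
```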


This prints the iris data frame. Then we extract features and target:

```python
x = df.iloc[0:100, [0, 1, 2, 3]].values
y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', 0, 1)
```

We take the first 100 rows and split the dataset into *x* (features) and *y* (target). Then `np.where` transforms the nominal data in *y* from text to numeric: 0 for *Iris-setosa*, 1 otherwise.
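The encoding step can be checked in isolation. A tiny sketch with toy labels (these are not rows from the actual file):

```python
import numpy as np

# Toy labels in the same format as column 4 of iris.data
labels = np.array(['Iris-setosa', 'Iris-virginica', 'Iris-setosa'])
encoded = np.where(labels == 'Iris-setosa', 0, 1)
print(encoded)  # [0 1 0]
```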

```python
x_train = np.empty((80, 4))
x_test = np.empty((20, 4))
y_train = np.empty(80)
y_test = np.empty(20)

x_train[:40], x_train[40:] = x[:40], x[50:90]
x_test[:10], x_test[10:] = x[40:50], x[90:100]
y_train[:40], y_train[40:] = y[:40], y[50:90]
y_test[:10], y_test[10:] = y[40:50], y[90:100]
```

Rows 1~50 are "*Iris-setosa*", and rows 51~100 are "*Iris-virginica*." So we collect rows 1~40 and rows 51~90 as the training set, and the remaining rows form the test set.
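This manual slicing keeps both sets perfectly balanced: 40 samples per class in training, 10 per class in testing. A quick sanity check with a stand-in target vector (synthetic, not the real labels):

```python
import numpy as np

# Stand-in target vector: rows 1~50 are class 0, rows 51~100 are class 1
y = np.array([0] * 50 + [1] * 50)

y_train = np.empty(80)
y_test = np.empty(20)
y_train[:40], y_train[40:] = y[:40], y[50:90]
y_test[:10], y_test[10:] = y[40:50], y[90:100]

# Sums count the class-1 samples: 40 of 80 in training, 10 of 20 in testing
print(y_train.sum(), y_test.sum())  # 40.0 10.0
```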

```python
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def activation(x, w, b):
    return sigmoid(np.dot(x, w) + b)

def update(x, y_train, w, b, eta):
    y_pred = activation(x, w, b)  # activator
    # partial derivative of the loss function
    a = (y_pred - y_train) * y_pred * (1 - y_pred)
    for i in range(4):
        w[i] -= eta * 1 / float(len(y_train)) * np.sum(a * x[:, i])
    b -= eta * 1 / float(len(y_train)) * np.sum(a)
    return w, b
```
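Before wiring `update` to the iris data, it helps to confirm it learns at all. A self-contained sanity check on two well-separated synthetic clusters (synthetic data standing in for the two species, not the actual iris measurements):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def activation(x, w, b):
    return sigmoid(np.dot(x, w) + b)

def update(x, y_train, w, b, eta):
    y_pred = activation(x, w, b)
    a = (y_pred - y_train) * y_pred * (1 - y_pred)
    for i in range(4):
        w[i] -= eta / float(len(y_train)) * np.sum(a * x[:, i])
    b -= eta / float(len(y_train)) * np.sum(a)
    return w, b

# Two well-separated 4-feature clusters stand in for the two iris species
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(-2, 0.5, (40, 4)),   # class 0
               rng.normal(+2, 0.5, (40, 4))])  # class 1
y = np.array([0] * 40 + [1] * 40)

w, b = np.ones(4) / 10, np.ones(1) / 10
for _ in range(100):
    w, b = update(x, y, w, b, eta=0.1)

# Threshold the sigmoid outputs at 0.5 and score the training accuracy
acc = ((activation(x, w, b) >= 0.5).astype(int) == y).mean()
print(acc)  # 1.0
```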

Let’s probe into the math behind the preceding code:

- **Activator**: sigmoid function
- **Loss function**: MSE (mean squared error)
- **Optimizer**: gradient descent
- **Weight updates**: tiresome math work
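Spelling out that "tiresome math work": with MSE loss over *n* training rows and the sigmoid activator, the chain rule (using the sigmoid identity σ′(z) = σ(z)(1 − σ(z))) gives the gradients the code applies:

```latex
L = \frac{1}{2n}\sum_{j=1}^{n}(\hat{y}_j - y_j)^2,
\qquad \hat{y}_j = \sigma(w \cdot x_j + b)

\frac{\partial L}{\partial w_i}
= \frac{1}{n}\sum_{j=1}^{n}(\hat{y}_j - y_j)\,\hat{y}_j\,(1-\hat{y}_j)\,x_{j,i},
\qquad
\frac{\partial L}{\partial b}
= \frac{1}{n}\sum_{j=1}^{n}(\hat{y}_j - y_j)\,\hat{y}_j\,(1-\hat{y}_j)
```

The summand (ŷⱼ − yⱼ) ŷⱼ (1 − ŷⱼ) is exactly the array `a` in `update`, and the 1/n factor is the `1/float(len(...))` term in the weight and bias updates.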

```python
weights = np.ones(4) / 10
bias = np.ones(1) / 10
eta = 0.1

for _ in range(15):  # Run both epochs = 15 & epochs = 100
    weights, bias = update(x_train, y_train, weights, bias, eta=0.1)
```

- **Initial weights**: let *wᵢ* and *b* all be 0.1
- **Learning rate**: set eta = 0.1
- **Epochs**: run both epochs = 15 & epochs = 100

```python
print("Epochs = 15")  # Run both epochs = 15 & epochs = 100
print('weights = ', weights, 'bias = ', bias)
print("y_test = {}".format(y_test))
activation(x_test, weights, bias)
```

If we set the decision boundary at 0.5, then we get 100% accuracy in both epochs = 15 & 100.
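Turning the 0.5 decision boundary into an accuracy number is one thresholding step. A minimal sketch with a few hypothetical sigmoid outputs in the reported ranges (illustrative values, not the actual model outputs):

```python
import numpy as np

# Hypothetical sigmoid outputs: low values for setosa, high for virginica
y_prob = np.array([0.46, 0.49, 0.47, 0.61, 0.57, 0.63])
y_true = np.array([0, 0, 0, 1, 1, 1])

y_hat = (y_prob >= 0.5).astype(int)   # decision boundary at 0.5
accuracy = (y_hat == y_true).mean()
print(accuracy)  # 1.0
```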

The first 10 predictions of epochs = 15 are between 0.46~0.49, while the first 10 predictions of epochs = 100 are between 0.23~0.30. The last 10 predictions of epochs = 15 are between 0.57~0.63, while the last 10 predictions of epochs = 100 are between 0.64~0.81. As the number of epochs rises, the predicted values of the two flower groups grow further apart.

*Without any **NN** framework*, we built a single-layer neural network! Along the way we met the concepts of **activator** (*sigmoid*), **loss function** (*MSE*), **optimizer** (*gradient descent*), **weight updates** (*tiresome math work*), **initial weights** (*wᵢ = b = 0.1*), **learning rate** (*eta = 0.1*), and **epoch** (*tried epochs = 15 & 100*).


Credit: BecomingHuman By: Morton Kuo