These days, the creation of artificial intelligence models is facilitated by numerous libraries and ready-made resources. Although these resources are extremely helpful, they often obscure the inner workings of the models.
With this in mind, I embarked on a project to understand how a neural network operates without the help of libraries, relying solely on linear algebra. The dataset consisted of handwritten letters, and the model's task was to predict which letter was written.
import numpy as np

# Convert the DataFrame to a NumPy array and get its dimensions
data = data.to_numpy()
m, n = data.shape

# Shuffle the data
np.random.shuffle(data)

# Split the data into training and validation sets
train_data = data[1000:]
dev_data = data[:1000]

# Transpose the training data so that samples are in columns
data_train = train_data.T
Y_train = data_train[0]
X_train = data_train[1:] / 255.0

# Transpose the validation data so that samples are in columns
data_dev = dev_data.T
Y_dev = data_dev[0]
X_dev = data_dev[1:] / 255.0

# Get the number of training examples
_, m_train = X_train.shape
This code snippet prepares the dataset for training. It begins by converting the DataFrame (data) into a NumPy array (data.to_numpy()), which is essential for efficient numerical computation. The array's dimensions (m rows and n columns) are then captured to record the dataset's size. Next, the data is shuffled randomly (np.random.shuffle(data)) to remove any inherent ordering, which helps prevent bias during training.
After shuffling, the dataset is split into two sets: training and validation data. The training data (train_data) comprises the examples from index 1000 onwards and is transposed (train_data.T) so that each sample occupies a column. From this transposed array, the labels (Y_train) are taken from the first row, while the features (X_train) are normalized by dividing by 255.0 to scale them between 0 and 1. Similarly, the validation data (dev_data) consists of the first 1000 examples, which are also transposed (dev_data.T); its labels (Y_dev) and features (X_dev) are extracted and normalized in the same way.
Then, we move on to creating the basic neural network building blocks, such as the ReLU and Softmax layers. Additionally, the Sparse Categorical Cross-Entropy loss function had to be implemented. Finally, we have the golden boys: the propagation functions.
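The post does not list these helper functions, so here is a minimal sketch of how they could look, assuming labels are integer class indices and activations are stored with samples in columns (the layout used throughout this post); the exact implementations in the notebook may differ.
import numpy as np

def relu(Z):
    # Element-wise ReLU: max(0, z)
    return np.maximum(0, Z)

def relu_derivative(Z):
    # Derivative of ReLU: 1 where Z > 0, 0 elsewhere
    return (Z > 0).astype(float)

def softmax(Z):
    # Column-wise softmax; subtract the max for numerical stability
    expZ = np.exp(Z - np.max(Z, axis=0, keepdims=True))
    return expZ / np.sum(expZ, axis=0, keepdims=True)

def sparse_categorical_cross_entropy(A, Y):
    # A: (classes, m) predicted probabilities; Y: (m,) integer labels
    m = Y.shape[0]
    eps = 1e-12  # avoid log(0)
    return -np.mean(np.log(A[Y, np.arange(m)] + eps))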
def forward_prop(W1, b1, W2, b2, W3, b3, X):
    # Two hidden layers with ReLU, output layer with softmax
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = relu(Z2)
    Z3 = np.dot(W3, A2) + b3
    A3 = softmax(Z3)
    return Z1, A1, Z2, A2, Z3, A3

def backward_propagation(X, Y, cache, parameters):
    grads = {}
    L = len(parameters) // 2   # number of layers
    m = X.shape[1]             # number of examples
    Y = Y.T
    # Gradient at the output: softmax probabilities minus the one-hot labels
    dZL = cache['A' + str(L)]
    dZL[Y, range(m)] -= 1
    grads['dW' + str(L)] = 1 / m * dZL.dot(cache['A' + str(L-1)].T)
    grads['db' + str(L)] = 1 / m * np.sum(dZL, axis=1, keepdims=True)
    # Propagate the gradient back through the hidden layers
    for l in reversed(range(1, L)):
        dZ = parameters['W' + str(l+1)].T.dot(dZL) * relu_derivative(cache['Z' + str(l)])
        grads['dW' + str(l)] = 1 / m * dZ.dot(cache['A' + str(l-1)].T)
        grads['db' + str(l)] = 1 / m * np.sum(dZ, axis=1, keepdims=True)
        dZL = dZ
    return grads
def update_parameters(parameters, grads, learning_rate):
    L = len(parameters) // 2
    # Gradient descent step for every weight matrix and bias vector
    for l in range(1, L + 1):
        parameters['W' + str(l)] -= learning_rate * grads['dW' + str(l)]
        parameters['b' + str(l)] -= learning_rate * grads['db' + str(l)]
    return parameters
The forward propagation function takes the input data (X) and the parameters (W and b for each layer), passing the data through each layer of the neural network. It computes weighted sums (Z) and applies activation functions (relu for the hidden layers and softmax for the output layer) to generate the activations (A). These activations are crucial because they represent the network's output after each layer, capturing nonlinearities and preparing the data for prediction.
The backward propagation function, on the other hand, is the detective of the operation. It traces back through the network, using the cached activations, to compute how much each parameter (weights and biases) contributed to the error between the predicted and actual outputs. By calculating gradients recursively from the output layer to the input layer, it lets us update the parameters in the direction that minimizes prediction error during training. This dynamic duo of functions is essential for training neural networks, enabling them to learn from data iteratively and improve their predictions over time.
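The listing above does not show how the cache dictionary that backward_propagation expects is assembled from the outputs of forward_prop. A single training step might look like the sketch below; the helper name train_step and the cache keys (A0 for the input, then Z1, A1, and so on) are assumptions inferred from how backward_propagation indexes the cache.
def train_step(X, Y, parameters, learning_rate):
    # Unpack the parameters for the three-layer forward pass
    W1, b1 = parameters['W1'], parameters['b1']
    W2, b2 = parameters['W2'], parameters['b2']
    W3, b3 = parameters['W3'], parameters['b3']
    Z1, A1, Z2, A2, Z3, A3 = forward_prop(W1, b1, W2, b2, W3, b3, X)
    # Build the cache in the layout backward_propagation expects
    cache = {'A0': X,
             'Z1': Z1, 'A1': A1,
             'Z2': Z2, 'A2': A2,
             'Z3': Z3, 'A3': A3}
    grads = backward_propagation(X, Y, cache, parameters)
    return update_parameters(parameters, grads, learning_rate)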
Then, we simply use everything we created:
layer_dims = [
    784,   # input layer: one unit per pixel of a 28 x 28 image
    128,   # first hidden layer
    64,    # second hidden layer
    26,    # output layer: one unit per letter
]

parameters = SequentialModel(X_train, Y_train, layer_dims, learning_rate=0.1, epochs=30)
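The SequentialModel training loop is not reproduced in the post. Assuming it initializes one weight matrix and bias vector per layer and then repeatedly runs the forward/backward/update cycle (here via the hypothetical train_step helper sketched earlier), it could look roughly like this; the real notebook may differ in initialization scheme, batching, and logging.
def SequentialModel(X, Y, layer_dims, learning_rate=0.1, epochs=30):
    # Initialize small random weights and zero biases for each layer
    parameters = {}
    for l in range(1, len(layer_dims)):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
    # Full-batch gradient descent for the requested number of epochs
    for epoch in range(epochs):
        parameters = train_step(X, Y, parameters, learning_rate)
    return parameters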
After training the model, it's important to write an accuracy function to measure how well the model generalizes. In the end, the results were quite solid: 74.7% accuracy and 0.894 loss.
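The accuracy function itself is not shown in the post; a minimal version, assuming predictions come from the forward pass with samples in columns and the name get_accuracy is hypothetical, could be:
def get_accuracy(X, Y, parameters):
    # Forward pass, take the most probable class per column, compare to labels
    W1, b1 = parameters['W1'], parameters['b1']
    W2, b2 = parameters['W2'], parameters['b2']
    W3, b3 = parameters['W3'], parameters['b3']
    _, _, _, _, _, A3 = forward_prop(W1, b1, W2, b2, W3, b3, X)
    predictions = np.argmax(A3, axis=0)
    return np.mean(predictions == Y)
For example, get_accuracy(X_dev, Y_dev, parameters) evaluates the trained model on the held-out validation set.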
The final notebook can be found on Kaggle: NN from Scratch (only Numpy).
Find more of my projects in my portfolio: Adriano Leão's Portfolio.