Human Activity Recognition (HAR) is an important area in ubiquitous computing, with applications in health monitoring, smart homes, and more. This project uses deep learning techniques to predict six different activities (walking, walking upstairs, walking downstairs, sitting, standing, and laying) from gyroscope and accelerometer data. The dataset comes from the UCI Machine Learning Repository (UCI HAR) and involves 30 participants performing the activities while wearing a Samsung Galaxy S II smartphone.
Data Exploration and Preprocessing
The dataset contains readings from the smartphone's accelerometer and gyroscope, captured at 50 Hz. It consists of preprocessed data partitioned into training (70%) and test (30%) sets. Each participant's data was transformed into 561-feature vectors derived from the time and frequency domains.
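As a rough illustration of how fixed-length feature vectors can be derived from a 50 Hz sensor stream, the sketch below slices a synthetic signal into overlapping windows and computes a few time- and frequency-domain statistics per window. The window length of 128 samples (2.56 s) with 50% overlap follows the UCI HAR dataset documentation rather than anything stated in this article. The function names and the three features are illustrative assumptions and are far simpler than the real 561-feature pipeline.

```python
import numpy as np

SAMPLING_HZ = 50      # sampling rate stated in the article
WINDOW = 128          # assumed: 2.56 s windows, per the dataset documentation
STEP = WINDOW // 2    # assumed: 50% overlap between consecutive windows


def sliding_windows(signal, window=WINDOW, step=STEP):
    """Split a 1-D signal into overlapping windows of equal length."""
    n_windows = 1 + (len(signal) - window) // step
    return np.stack([signal[i * step:i * step + window] for i in range(n_windows)])


def window_features(windows):
    """Compute three toy features per window: mean, std, and spectral energy."""
    mean = windows.mean(axis=1)
    std = windows.std(axis=1)
    spectrum = np.abs(np.fft.rfft(windows, axis=1))
    energy = (spectrum ** 2).sum(axis=1) / windows.shape[1]
    return np.column_stack([mean, std, energy])


# Ten seconds of a synthetic 2 Hz oscillation plus noise (not real sensor data).
rng = np.random.default_rng(0)
t = np.arange(10 * SAMPLING_HZ) / SAMPLING_HZ
acc_x = np.sin(2 * np.pi * 2 * t) + 0.1 * rng.normal(size=t.size)

windows = sliding_windows(acc_x)
features = window_features(windows)
print(windows.shape, features.shape)  # (6, 128) (6, 3)
```

Each row of `features` plays the role of one sample in the preprocessed dataset, only with 3 columns instead of 561.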
Loading and Normalizing the Data
We load the data and separate the features from the target variable. The features are then normalized using MinMaxScaler, and the activity labels are encoded using LabelEncoder.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
from sklearn.model_selection import train_test_split

# Load the dataset
train_data = pd.read_csv('train.csv')
test_data = pd.read_csv('test.csv')

# Separate features and target variable
X_train_full = train_data.drop(columns=['subject', 'Activity'])
y_train_full = train_data['Activity']
X_test = test_data.drop(columns=['subject', 'Activity'])
y_test = test_data['Activity']
# Normalize the feature vectors using MinMaxScaler
# (fit on the training set only, then reuse the same scaling for the test set)
scaler = MinMaxScaler()
X_train_full_scaled = scaler.fit_transform(X_train_full)
X_test_scaled = scaler.transform(X_test)

# Encode activity labels as integers using LabelEncoder
label_encoder = LabelEncoder()
y_train_full_encoded = label_encoder.fit_transform(y_train_full)
y_test_encoded = label_encoder.transform(y_test)
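The reason the scaler is fitted on the training set only and then reused on the test set can be seen in a small NumPy sketch of the min-max formula x' = (x - min) / (max - min). The toy arrays and helper names are made up for illustration. This mirrors what MinMaxScaler does with its default feature range of (0, 1), but it is not the library's actual implementation.

```python
import numpy as np


def fit_min_max(X):
    """Return the per-column minimum and range learned from the training data."""
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return col_min, col_range


def apply_min_max(X, col_min, col_range):
    """Scale X with statistics learned elsewhere (for example on the training set)."""
    return (X - col_min) / col_range


X_train_toy = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
X_test_toy = np.array([[2.0, 40.0]])  # second value lies outside the training range

col_min, col_range = fit_min_max(X_train_toy)
train_scaled = apply_min_max(X_train_toy, col_min, col_range)
test_scaled = apply_min_max(X_test_toy, col_min, col_range)

print(train_scaled)  # every column spans exactly 0..1
print(test_scaled)   # [[0.25 1.5]]
```

Test values can fall outside [0, 1], as the 1.5 here shows. That is expected, and it is preferable to refitting the scaler on the test data, which would leak information about the test set into preprocessing.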
Feature Extraction and Visualization
To better understand the data, we visualize a few features using scatter matrix plots. This helps in identifying patterns and correlations in the data.
import plotly.express as px

# Convert the scaled training features to a DataFrame for visualization
X_train_df = pd.DataFrame(X_train_full_scaled, columns=X_train_full.columns)
X_train_df['Activity'] = label_encoder.inverse_transform(y_train_full_encoded)

# Visualize a few features, colored by activity
fig = px.scatter_matrix(X_train_df, dimensions=['tBodyAcc-mean()-X', 'tBodyAcc-mean()-Y', 'tBodyAcc-mean()-Z'], color='Activity')
fig.show()
Model Architecture and Design Choices
We use a deep neural network (DNN) to classify the activities. The model consists of several dense layers with ReLU activation, plus dropout layers to prevent overfitting.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Define the model
model = Sequential([
    Dense(128, activation='relu', input_shape=(X_train_full_scaled.shape[1],)),
    Dropout(0.5),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dense(6, activation='softmax')  # 6 output classes, one per activity
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model, holding out 20% of the training data for validation
history = model.fit(X_train_full_scaled, y_train_full_encoded, epochs=50, batch_size=32, validation_split=0.2)
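As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand: each dense layer contributes inputs × units weights plus one bias per unit, and dropout layers add none. The sketch assumes 561 input features, as stated in the dataset description, and the helper name is hypothetical.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


# Input (561 features) -> 128 -> 64 -> 32 -> 6 output classes
sizes = [561, 128, 64, 32, 6]
total_params = dense_param_count(sizes)
print(total_params)  # 82470
```

If the input really has 561 columns, this figure should agree with the total reported by `model.summary()`.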
# Print a Graphviz DOT description of a fully connected network diagram.
# Note: these layer sizes are for the drawing only and differ from the model above.
layers = [100, 25, 12, 6, 6]

layers_str = ["Input"] + ["Hidden"] * (len(layers) - 2) + ["Output"]
layers_col = ["none"] + ["none"] * (len(layers) - 2) + ["none"]
layers_fill = ["black"] + ["gray"] * (len(layers) - 2) + ["black"]

penwidth = 10
font = "Hilda 10"

print("digraph G {")
print("\tfontname = \"{}\"".format(font))
print("\trankdir=LR")
print("\tsplines=line")
print("\tnodesep=.15;")
print("\tranksep=15;")
print("\tedge [color=black, arrowsize=.5];")
print("\tnode [fixedsize=true,label=\"\",style=filled," +
      "color=none,fillcolor=gray,shape=circle]\n")

# Clusters: one subgraph per layer, listing that layer's nodes
for i in range(0, len(layers)):
    print("\tsubgraph cluster_{} {{".format(i))
    print("\t\tcolor={};".format(layers_col[i]))
    print("\t\tnode [style=filled, color=white, penwidth={},"
          "fillcolor={} shape=circle];".format(
              penwidth,
              layers_fill[i]))
    print("\t\t", end=' ')
    for a in range(layers[i]):
        print("l{}{} ".format(i + 1, a), end=' ')
    print(";")
    print("\t\tlabel = {};".format(layers_str[i]))
    print("\t}\n")

# Edges: connect every node to every node in the next layer
for i in range(1, len(layers)):
    for a in range(layers[i - 1]):
        for b in range(layers[i]):
            print("\tl{}{} -> l{}{}".format(i, a, i + 1, b))

print("}")
Hyperparameter Tuning
Using KerasTuner, we carry out hyperparameter tuning to find the best model architecture. This involves adjusting the number of units in each layer and the number of layers.
import keras_tuner as kt

def build_model(hp):
    model = Sequential()
    # First hidden layer: number of units is tunable
    model.add(Dense(units=hp.Int('units', min_value=32, max_value=512, step=32), activation='relu', input_shape=(X_train_full_scaled.shape[1],)))
    model.add(Dropout(0.5))
    # One to three additional hidden layers, each with its own tunable width
    for i in range(hp.Int('num_layers', 1, 3)):
        model.add(Dense(units=hp.Int('units_' + str(i), min_value=32, max_value=512, step=32), activation='relu'))
        model.add(Dropout(0.5))
    model.add(Dense(6, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

tuner = kt.RandomSearch(build_model, objective='val_accuracy', max_trials=5, executions_per_trial=3, directory='my_dir', project_name='activity_recognition')
tuner.search(X_train_full_scaled, y_train_full_encoded, epochs=50, validation_split=0.2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
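To make the search space concrete, here is a plain-Python sketch of what a random search over this space does: draw a configuration, score it, and keep the best. It does not use KerasTuner. The names `sample_config`, `random_search`, and especially `toy_score` are hypothetical stand-ins; in the real search the score would be the validation accuracy of a trained model.

```python
import random


def sample_config(rng):
    """Draw one configuration from the same ranges used in build_model."""
    choices = list(range(32, 513, 32))  # 32, 64, ..., 512
    num_layers = rng.randint(1, 3)      # inclusive on both ends
    return {
        "units": rng.choice(choices),
        "num_layers": num_layers,
        "hidden_units": [rng.choice(choices) for _ in range(num_layers)],
    }


def random_search(score_fn, max_trials, seed=0):
    """Evaluate max_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = sample_config(rng)
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Hypothetical score that merely prefers about 256 first-layer units."""
    return -abs(config["units"] - 256) / 256


best_config, best_score = random_search(toy_score, max_trials=5)
print(best_config, best_score)
```

With only five trials out of a much larger space, the result depends heavily on the random draws, which is worth keeping in mind when reading the tuner's output.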
Final Model Performance and Accuracy
The best model from hyperparameter tuning is trained and then evaluated on the test set, where it achieves high accuracy.
# Rebuild the best model, train it, and evaluate it on the test set
best_model = tuner.hypermodel.build(best_hps)
best_model.fit(X_train_full_scaled, y_train_full_encoded, epochs=50, validation_split=0.2)
test_loss, test_acc = best_model.evaluate(X_test_scaled, y_test_encoded)
print(f'Test accuracy: {test_acc}')
Challenges Faced and Solutions Implemented
One of the main challenges was preventing overfitting, given the small dataset size. We addressed this with dropout layers and hyperparameter tuning. Another challenge was ensuring that the model generalizes well to unseen data, which we tackled by splitting the data into training and validation sets and using early stopping during training.
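Early stopping does not appear in the code listings in this article. In Keras it is usually done by passing a `tf.keras.callbacks.EarlyStopping` callback to `fit`; the underlying patience rule can be sketched in plain Python as follows. The function name, the patience of 3, and the loss values are illustrative assumptions, not the author's settings.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a patience-based stopping rule.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: improvement stalls after epoch 3.
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)  # 6 3
```

With a Keras model, a comparable setup would be `EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` passed through the `callbacks` argument of `fit`, so that the weights from the best epoch are kept rather than those from the last one.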
Conclusion and Future Work
This project demonstrates the potential of deep learning for HAR tasks, achieving high accuracy in predicting activities. Future work could explore more complex models such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), and could incorporate additional sensor data to improve accuracy.
Visualizations
1. Training History:
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))

# Plot training & validation accuracy values
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
2. Confusion Matrix:
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

# Predictions on the test set
y_pred = best_model.predict(X_test_scaled)
y_pred_classes = y_pred.argmax(axis=1)

# Confusion matrix
conf_matrix = confusion_matrix(y_test_encoded, y_pred_classes)
plt.figure(figsize=(10, 7))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=label_encoder.classes_, yticklabels=label_encoder.classes_)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

# Classification report
print(classification_report(y_test_encoded, y_pred_classes, target_names=label_encoder.classes_))
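For readers who want to see what the confusion matrix and classification report compute, the NumPy sketch below rebuilds the main quantities for a toy three-class example. The labels and helper names are invented for illustration. Rows are true classes and columns are predicted classes, matching the scikit-learn convention.

```python
import numpy as np


def confusion_counts(y_true, y_pred, n_classes):
    """Build a matrix where entry [i, j] counts samples of class i predicted as j."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label, pred_label] += 1
    return matrix


def precision_recall(matrix):
    """Per-class precision (over columns) and recall (over rows)."""
    true_positives = np.diag(matrix).astype(float)
    predicted_totals = matrix.sum(axis=0)
    actual_totals = matrix.sum(axis=1)
    precision = np.divide(true_positives, predicted_totals,
                          out=np.zeros_like(true_positives),
                          where=predicted_totals > 0)
    recall = np.divide(true_positives, actual_totals,
                       out=np.zeros_like(true_positives),
                       where=actual_totals > 0)
    return precision, recall


# Toy labels for three classes (not results from the model above).
y_true_toy = [0, 0, 1, 1, 2, 2, 2]
y_pred_toy = [0, 1, 1, 1, 2, 2, 0]

cm = confusion_counts(y_true_toy, y_pred_toy, n_classes=3)
precision, recall = precision_recall(cm)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(precision, recall, accuracy)
```

Off-diagonal entries show which pairs of classes get confused with each other, which is the detail a single accuracy number hides.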
References
UCI Human Activity Recognition Data Set