Driver Drowsiness Classification with Deep Learning

Vraj Mistry
4 min read · Jul 28, 2022

Drowsiness detection is a car safety technology that helps prevent accidents caused by drivers who fall asleep at the wheel. According to the NHTSA (National Highway Traffic Safety Administration), 91,000 police-reported crashes in 2017 involved drowsy drivers, leading to an estimated 50,000 people injured and nearly 800 deaths. Current methods focus mainly on eye-blink detection using deep learning or machine learning, but what happens when drivers wear sunglasses?

My Approach!!!

What if we also took the driver’s head tilt, yawning, and other factors into consideration? That is exactly what I did. Before moving into the feature extraction part, I took the data from the “ULg Multimodality Drowsiness Database,” also called DROZY, a database containing various types of drowsiness-related data (signals, images, etc.). The dataset contains about 45 video clips, each labeled with its Karolinska Sleepiness Scale (KSS) score. The KSS ranges from 1 to 9, with 1 being very alert and 9 very sleepy.

Since the data and labels in this dataset are limited, I converted the labels from 1–9 to 1–3, standing for no drowsiness, moderate drowsiness, and high drowsiness. I would have used a video-classification approach instead, but since there is not enough data, I chose to extract features first and use them as my model input. This way the model is more accurate with less data.
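As a concrete sketch, the 9-point KSS scale can be collapsed into the three classes like this; the exact cut points (1–3, 4–6, 7–9) are my assumption, since the post does not state them:

```python
# Collapse a 1-9 KSS score into three drowsiness classes.
# The band boundaries are illustrative, not from the original post.
def kss_to_class(kss):
    if kss <= 3:
        return 1  # no drowsiness
    if kss <= 6:
        return 2  # moderate drowsiness
    return 3      # high drowsiness
```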

Feature Extraction

For this particular task, I am going to use TensorFlow-GPU 2.6 and Python 3.6, with the libraries OpenCV, dlib, and SciPy preinstalled via pip. I will link my GitHub at the very end of the page so you can get full access to the code. Here are all the libraries needed for the feature extraction:

from scipy.spatial import distance as dist
from imutils.video import FileVideoStream
from imutils.video import VideoStream
from imutils import face_utils
import numpy as np
import argparse
import imutils
import time
import dlib
import cv2
import datetime
import csv
import os
import math

Average Blink Duration: a blink is detected when the eye aspect ratio (EAR) drops below 0.3 and then rises back above 0.3. The time this takes is the blink duration; blink durations over each minute are averaged to get the average blink duration.

Blink Frequency: the number of blinks per minute.
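The two blink features above can be sketched together. `eye_aspect_ratio` is the standard six-landmark EAR formula; `blink_stats` and its inputs (`ear_per_frame`, `fps`) are illustrative names I made up for this sketch, and `math.dist` stands in for SciPy’s `distance.euclidean` to keep the example self-contained.

```python
import math

# Standard eye aspect ratio over the six (x, y) eye landmarks from dlib.
# The value drops toward zero as the eye closes.
def eye_aspect_ratio(eye):
    A = math.dist(eye[1], eye[5])  # first vertical distance
    B = math.dist(eye[2], eye[4])  # second vertical distance
    C = math.dist(eye[0], eye[3])  # horizontal distance
    return (A + B) / (2.0 * C)

EAR_THRESHOLD = 0.3  # frames with EAR below this count as eye-closed

# Blink statistics from one EAR value per video frame.
def blink_stats(ear_per_frame, fps):
    blinks, durations, closed_run = 0, [], 0
    for ear in ear_per_frame:
        if ear < EAR_THRESHOLD:
            closed_run += 1                     # eye still closed
        elif closed_run > 0:
            blinks += 1                         # EAR rose above 0.3: one blink
            durations.append(closed_run / fps)  # closed frames -> seconds
            closed_run = 0
    avg_duration = sum(durations) / len(durations) if durations else 0.0
    return blinks, avg_duration  # blink count, average blink duration
```

Feeding one minute of frames at a time into `blink_stats` gives the blink frequency directly as the returned count.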

Mouth Aspect Ratio: the MAR is calculated to detect whether a person is yawning.
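One common way to compute the MAR uses dlib’s eight inner-lip landmarks (points 60–67); the 0.6 yawn threshold below is my illustrative choice, not a value from the post:

```python
import math

# Mouth aspect ratio from eight inner-lip landmarks, ordered so that
# indices 0 and 4 are the mouth corners. The ratio grows as the mouth opens.
def mouth_aspect_ratio(mouth):
    A = math.dist(mouth[1], mouth[7])  # vertical distances between
    B = math.dist(mouth[2], mouth[6])  # upper and lower inner lip
    C = math.dist(mouth[3], mouth[5])
    D = math.dist(mouth[0], mouth[4])  # horizontal corner-to-corner distance
    return (A + B + C) / (3.0 * D)

def is_yawning(mouth, threshold=0.6):  # illustrative threshold
    return mouth_aspect_ratio(mouth) > threshold
```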

Head Pose: pitch, yaw, and roll angles are calculated at each frame to get the head pose.
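In the usual pipeline, `cv2.solvePnP` matches 2D facial landmarks to a 3D face model and returns a rotation vector, which `cv2.Rodrigues` turns into a rotation matrix; the angles are then read off that matrix. The decomposition below (standard ZYX Euler convention) is a sketch of only that last step; the landmark-to-rotation part is omitted.

```python
import math
import numpy as np

# Convert a 3x3 rotation matrix (e.g. from cv2.Rodrigues applied to the
# solvePnP rotation vector) into pitch, yaw, roll angles in degrees.
def rotation_to_euler(R):
    sy = math.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:  # away from gimbal lock
        pitch = math.degrees(math.atan2(R[2, 1], R[2, 2]))
        yaw = math.degrees(math.atan2(-R[2, 0], sy))
        roll = math.degrees(math.atan2(R[1, 0], R[0, 0]))
    else:          # gimbal lock: roll is not separately recoverable
        pitch = math.degrees(math.atan2(-R[1, 2], R[1, 1]))
        yaw = math.degrees(math.atan2(-R[2, 0], sy))
        roll = 0.0
    return pitch, yaw, roll
```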

This is what my feature extraction looks like during the process.

Time to make the model!

Since we have already done the feature extraction part, we don’t have to build a complex model. I am going to use an artificial neural network (ANN).

import numpy as np
import csv
import os
import random
import pickle
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from tensorflow import keras
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation, BatchNormalization
from keras import regularizers, optimizers
from keras.utils import np_utils

Here is the design of the simplest model in deep learning, yet it is effective because of the feature extraction. Since we are calculating drowsiness per minute, make sure you concatenate your per-minute features into one input vector before passing it into the model.
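A minimal sketch of such a model, using the modern `tensorflow.keras` import path; the layer sizes and the four-feature input (average blink duration, blink frequency, MAR, head pose) are my assumptions, with three softmax outputs for the no / moderate / high drowsiness classes:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Small feed-forward classifier over the extracted per-minute features.
model = Sequential([
    Dense(32, activation="relu", input_shape=(4,)),  # 4 engineered features
    Dropout(0.2),                                    # light regularization
    Dense(16, activation="relu"),
    Dense(3, activation="softmax"),                  # 3 drowsiness classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Training is then a plain `model.fit(X_train, y_train, validation_data=(X_val, y_val))` call on the concatenated feature vectors.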

Model performance: training vs. validation accuracy, and training vs. validation loss.

Hyperparameter tuning the model! I changed the learning rate (0.01 -> 0.001), tried a different optimizer (RMSprop), and increased the number of epochs (20 -> 50).

Using scikit-learn’s accuracy score and confusion matrix, I evaluated the model and got 73% accuracy on the test set against 89% training accuracy. After adding about 4 more hidden layers, I got about 74% accuracy on the test set and 93% on the training set.
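The evaluation step looks roughly like this; the label vectors below are placeholders, and in the real pipeline `y_true` comes from the test split while `y_pred` comes from `model.predict(...).argmax(axis=1)`:

```python
from sklearn.metrics import confusion_matrix, accuracy_score

# Placeholder labels: 0 = none, 1 = moderate, 2 = high drowsiness.
y_true = [0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 1, 1, 1, 2, 2, 0, 0]

print(confusion_matrix(y_true, y_pred))  # per-class breakdown of errors
print(accuracy_score(y_true, y_pred))    # fraction of correct predictions
```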

I hope you liked my blog post! Thank you for taking the time to read, and here is my GitHub link for this project (link).

I am still new when it comes to posting blogs, but I would appreciate any feedback from your side in the comments below. Thank you.

Vraj Mistry

Master’s student at San Jose State University, with a strong math background and experience in big data, machine learning, and statistics.