Ahsanullah University of Science & Technology
Real Time Bangladeshi Sign Language Detection Using Faster R-CNN
Oishee Binty Hoque, Mohammad Imrul Jubair, Md. Saiful Islam, Al-Farabi Akash, Alvin Sachie Paulson
Dept. of Computer Science & Engineering
International Conference on Innovation in Engineering and Technology (ICIET) 2018, 12/28/2018
Introduction
How can a non-signer communicate with a signer?
A real-time interpreter might be a possible solution
Signers Communication
Our target is to construct a real-time interpreting system.
To make communication easier
Signers Communication
ই
Research Domain
Research Problem
Bangladeshi Sign Letters Recognition
Real-Time
How can we implement that?
Using a deep learning based object detection method
Object detection (e.g. 'car', 'bicycle') based on deep learning
Can 'signs' be detected similarly?
Related Works
Existing Works
To the best of our knowledge -
Rahman et al. 2018
“Bangla Language Modeling Algorithm For Automatic Recognition of Hand-Signspelled Bangla Sign Language”
Yasir et al. 2018
“Bangla Sign Language Recognition Using Convolutional Neural Network”
Ahmed et al. 2016
“Bangladeshi Sign Language Recognition Using Fingertip Position”
Rahman et al. 2015
“Computer Vision Based Bengali Sign Words Recognition Using Contour Analysis”
Rahman et al. 2014
“Realtime Computer Vision-Based Bengali Sign Language Recognition”
Reviews
Additional device for input
Dataset
Requires preprocessing, such as background segmentation
Absence of variation in background, illumination, etc. (which is a must for deep learning)
Methodology
Traditional machine learning method
Feature extraction required beforehand
Output
Not “completely” real time
Not real-time “enough” to be integrated into applications, e.g. on smartphones
*For a detailed review, please go through our paper
**Images are collected from the Internet and Rahman et al.
Challenges and Contributions of our Work
Challenges
The method must be -
Device Independent
Real-time in erratic backgrounds
Challenges
Needs Deep Learning based recognition methods
Dependent on Robust Dataset
Large number of images/class
Enough variation in input data in terms of background, gesture angle, age and gender
Contribution
Domain     | Criterion                  | Our Contribution
Dataset    | Open for Community         | ✓
Dataset    | Dynamic Lighting Condition | ✓
Dataset    | Dynamic Background         | ✓
Dataset    | Similar Gesture            | ✓
Our System | Device Based Input         | ✕
Our System | Real Time                  | ✓
Methodology (Faster R-CNN)
Faster R-CNN
Basic Concepts
Neural Network & Convolutional Neural Network (CNN)
Anchors
Region Proposal Network (RPN)
Neural Network
Input, Hidden, Output
Cat: 90%
Not Cat: 10%
Convolutional Neural Network (CNN)
Convolution, Fully Connected, Output
Cat: 90%
Dog: 5%
Bird: 1%
Faster R-CNN (Anchor)
Center
Some Examples of Anchors
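To make the idea of anchors concrete, here is a minimal sketch of how a set of anchor boxes can be generated around one center point. The function name `make_anchors` and the example center are hypothetical; the default of three scales and three aspect ratios (nine anchors per position) follows the original Faster R-CNN paper and is not necessarily the configuration used in this work.

```python
import math

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchor boxes (x_min, y_min, x_max, y_max) centred on (cx, cy).

    Each anchor has an area of scale * scale and a height/width ratio
    taken from `ratios`. Scales and ratios are illustrative defaults.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)    # wider box when ratio < 1
            height = scale * math.sqrt(ratio)   # taller box when ratio > 1
            anchors.append((cx - width / 2, cy - height / 2,
                            cx + width / 2, cy + height / 2))
    return anchors

# Hypothetical center position on the input image.
anchors = make_anchors(cx=300, cy=200)
print(len(anchors), "anchors share the same center")
```

Every box shares the same center and differs only in size and shape, which is what lets one position on the feature map propose hands of different sizes.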
Region Proposal Network
Selected Discarded
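One common way to decide which candidate boxes are kept is the intersection over union (IoU) with a labeled box. The sketch below is an illustration only: the function names, the sample boxes and the single threshold of 0.5 are assumptions, and in the actual RPN the decision at detection time comes from learned objectness scores rather than from a ground-truth box.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def split_anchors(anchors, labeled_box, threshold=0.5):
    """Split anchors into (selected, discarded) by IoU with a labeled box.

    The threshold is an illustrative assumption, not a value from the slides.
    """
    selected, discarded = [], []
    for anchor in anchors:
        if iou(anchor, labeled_box) >= threshold:
            selected.append(anchor)
        else:
            discarded.append(anchor)
    return selected, discarded

# Hypothetical boxes: one overlaps the labeled hand region, one does not.
hand = (100, 100, 200, 200)
selected, discarded = split_anchors([(110, 110, 210, 210), (400, 400, 500, 500)], hand)
print(len(selected), "selected,", len(discarded), "discarded")
```

The original Faster R-CNN paper uses two thresholds during training (0.7 for positive and 0.3 for negative anchors); a single threshold is used here only to keep the example short.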
Faster R-CNN (Architecture)
Input
CNN
convolutional feature map (anchors)
proposed regions
flatten images
FINAL OUTPUT
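The "flatten images" step most likely corresponds to RoI pooling, where each proposed region of the feature map is reduced to a fixed-size grid and flattened before classification. That reading is an assumption. The sketch below uses plain Python lists, a hypothetical `roi_max_pool` helper and a made-up 4x4 feature map.

```python
def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool one region of a 2-D feature map into out_size * out_size values.

    roi is (row_start, col_start, row_end, col_end) with exclusive ends.
    The result is returned already flattened, row by row.
    """
    r0, c0, r1, c1 = roi
    height, width = r1 - r0, c1 - c0
    if height < out_size or width < out_size:
        raise ValueError("region is smaller than the output grid")
    flat = []
    for i in range(out_size):
        rs = r0 + i * height // out_size          # first row of bin i
        re = r0 + (i + 1) * height // out_size    # one past the last row
        for j in range(out_size):
            cs = c0 + j * width // out_size
            ce = c0 + (j + 1) * width // out_size
            flat.append(max(feature_map[r][c]
                            for r in range(rs, re)
                            for c in range(cs, ce)))
    return flat

# Made-up 4x4 feature map with values 0..15.
feature_map = [[row * 4 + col for col in range(4)] for row in range(4)]
print(roi_max_pool(feature_map, (0, 0, 4, 4)))  # 4x4 region
print(roi_max_pool(feature_map, (1, 1, 4, 4)))  # 3x3 region
```

Both regions yield a vector of the same length, which is why regions of different sizes can be fed to the same fully connected layers.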
Bangladeshi Sign Language Image Dataset(BdSLImSet)
BdSLImSet
Background Variation
Different Signers
Fig: A few samples from our dataset
BdSLImSet: Labelling
Each image in every class is labeled
Labels are converted into XML files
Available on GitHub (https://github.com/imruljubair/bdslimset)
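As an illustration of how such XML labels can be read back, the sketch below parses a Pascal VOC style annotation with Python's standard library. The VOC layout is an assumption (the slides only say the labels are XML files), and the file name, class name and coordinates in `SAMPLE_XML` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented annotation in Pascal VOC style; not taken from the dataset.
SAMPLE_XML = """<annotation>
  <filename>sample_001.jpg</filename>
  <size><width>700</width><height>1280</height><depth>3</depth></size>
  <object>
    <name>letter_1</name>
    <bndbox>
      <xmin>120</xmin><ymin>340</ymin><xmax>420</xmax><ymax>760</ymax>
    </bndbox>
  </object>
</annotation>"""

def read_boxes(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        boxes.append((obj.findtext("name"),
                      int(box.findtext("xmin")), int(box.findtext("ymin")),
                      int(box.findtext("xmax")), int(box.findtext("ymax"))))
    return boxes

print(read_boxes(SAMPLE_XML))
```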
BdSLImSet: Verification
This dataset has been verified by Dhaka Badhir High School (ঢাকা বধির হাই স্কুল)
Source: A book of Bangla Sign Letters
BdSLImSet: Current specifics
Total Images               | 1600
Total Classes              | 10
Images/Class               | 100
Image Size                 | ≤200 KB
Resolution                 | ≤700×1280
Number of Participants     | 10
Training Set : Testing Set | 80:20
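An 80:20 split can be produced per class as sketched below. The slides do not say how the split was made (randomly, per signer, or otherwise), so the random shuffle, the fixed seed and the file names here are assumptions for illustration.

```python
import random

def split_class(image_names, train_fraction=0.8, seed=0):
    """Shuffle one class's images and split them into (train, test) lists."""
    names = sorted(image_names)           # fixed starting order
    random.Random(seed).shuffle(names)    # reproducible shuffle
    cut = round(len(names) * train_fraction)
    return names[:cut], names[cut:]

# Hypothetical file names for one class of 100 images.
images = [f"img_{i:03d}.jpg" for i in range(100)]
train, test = split_class(images)
print(len(train), "training images,", len(test), "testing images")
```

Splitting each class separately keeps the 80:20 ratio the same for every letter.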
Experiment & Result
Training Phase
Implementation platform –
TensorFlow GPU v1.5
CUDA v9.0
CPU: Intel® Core™ i7-7500U, 2.7 GHz up to 3.5 GHz
GPU: Nvidia 940MX with 4.00 GB
10 classes with 100 images for each letter.
Training Phase
Took about 12 hours
28000 iterations to train the model
Loss started at 3.00, quickly dropped to 0.8, and stopped at 0.07
Experimental Result
Average confidence rate: 98%
Detection time: 90.03 ms
Accuracy rate: 98.20%
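The reported detection time can be turned into an approximate frame rate. This is a back-of-the-envelope sketch that assumes detection is the only per-frame cost; camera capture and display time are not counted.

```python
detection_time_ms = 90.03                       # reported detection time
frames_per_second = 1000.0 / detection_time_ms  # 1000 ms in one second
print(f"{frames_per_second:.1f} frames per second")  # prints 11.1
```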
Experimental Result
MATCHED
ঢ: 90%
উ: 94%
আ: 94%
এ: 99%
এ: 87%
Limitations
Two letters with similar gestures were sometimes recognized incorrectly.
Future Plan
Android Based Recognition System
ই
Future Plan
Word and Sentence Recognition In Real Time
HELLO
Future Plan
User friendly system
Increasing the dataset
Evaluating the performance of our system with genuine users (e.g. hearing-impaired people or sign language teachers)
Demo Screenplay
Go to the YouTube link:
https://youtu.be/8NLwOpQCmW0
Any Questions?