Hassanin M. Al-Barhamtoshy

Date post: 21-Mar-2022

By Hassanin M. Al-Barhamtoshy

[email protected]

Presentation Outline

1. Introduction to Language Engineering

2. Major Technology Advances through 2030

3. What is Language Engineering Technology?

4. AI’s Core of Language Engineering technology

(1) NLP (2) SP (3) OCR

5. Challenges facing the advances of Arabic

6. Applied Examples in Arabic Language

7. What do we need from the Datasets

8. Conclusions

Major Technology Advances through 2030

▪ 90% of the global population will have a supercomputer in their pocket by 2023.

▪ 1 trillion sensors will be connected to the internet by 2022.

▪ 10% of reading glasses will be connected to the internet by 2023 (augmented reality and eye tracking).

▪ A government will collect taxes via blockchain for the first time by 2023.

▪ Driverless cars will account for 10% of all cars in the US by 2026.

▪ Robots will be more intelligent; they will listen and talk, and will be everywhere: homes, schools, etc.

▪ 10% of the world's population will be wearing clothes connected to the internet by 2022.

▪ Up to 5% of products will be printed on 3D printers, and the first 3D-printed car will be in production by 2022.

▪ Artificial intelligence will take on decision making.

Sources:

https://www.news-innovation.com/research-development/10-technologies-that-will-change-the-world-by-2030

https://www.hiveforhousing.com/article/21-technology-milestones-we-will-achieve-by-2030_c

What is Language Engineering Technology?

Language engineering technologies involve oral (spoken), signed, and written languages.

▪ To enhance communication between:

  Man and machine (in either direction)

  Man and man (different languages)

▪ To ease access to information

Figure labels (device capabilities): high-performance processor; huge memory capacity; biometric and environment sensors; wireless communication; human language interaction; very long battery life; language dependent.

AI’s Core of Language Engineering Technology

1) Natural Language Processing (NLP), which includes:

Machine Translation

Text Understanding, Generation, and Summarization

Information Retrieval and Question Answering

2) Speech Processing (SP), which includes:

Automatic Speech Recognition

Text to Speech

3) Computer Vision: Document Analysis using OCR, which covers:

Printed or typewritten, and handwritten text

Off-line and on-line recognition

1. Natural Language Processing (NLP)

Bilingual Machine Translation Ecosystem (English/Arabic)

Bilingual Machine Translation

▪ Develop an English-to-Arabic translation model whose quality can be improved continuously and that is flexible enough to be extended to other language pairs.

▪ Bilingual corpora and dictionaries will be used after cleaning: non-alphanumeric text is removed through linguistic modification tasks before the data reaches the proposed machine translation model.

▪ Dataset creation, collection, annotation, adaptation, etc.

▪ A bilingual machine translation model based on neural networks will be developed.

▪ Encoder and decoder models are used for this machine translation.

▪ Evaluate the proposed translation model using standard methods.
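The corpus-cleaning step mentioned in the bullets can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `clean_sentence` and `clean_pairs` are hypothetical, "non-alphanumeric" is interpreted as anything that is not a Unicode letter, digit, or whitespace, and Arabic diacritics are dropped instead of being replaced by spaces so that words are not split. The talk does not specify the actual cleaning rules.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def clean_sentence(text: str) -> str:
    """Keep letters, digits, and whitespace; drop diacritics; collapse spaces."""
    text = unicodedata.normalize("NFKC", text)
    kept = [
        ch if (ch.isalnum() or ch.isspace()) else " "
        for ch in text
        # Combining marks (category Mn) include Arabic diacritics.
        # Assumption: they are removed outright so words stay intact.
        if unicodedata.category(ch) != "Mn"
    ]
    return _WHITESPACE.sub(" ", "".join(kept)).strip()


def clean_pairs(pairs):
    """Clean both sides of each (english, arabic) pair; drop pairs left empty."""
    cleaned = []
    for english, arabic in pairs:
        en, ar = clean_sentence(english), clean_sentence(arabic)
        if en and ar:
            cleaned.append((en, ar))
    return cleaned
```

A pair such as `("???", "!!!")` is discarded because nothing alphanumeric survives on either side, while punctuation is stripped from ordinary sentences.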

Arabic/English Machine Translation

Natural Language #1 (text) → Natural Language Translator → Natural Language #2 (text)

Example: Arabic "انا أأكل" → English "I am eating"

Design and Implementation View

Evaluation:

(1) Human expertise

(2) Machine translation accuracy

Using BERT-Score (English Reference)

BERT: Bidirectional Encoder Representations from Transformers

BLEU: Bilingual Evaluation Understudy
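To make the BLEU metric named above concrete, here is a minimal sentence-level sketch with a single reference, whitespace tokenization, and no smoothing. The function names and the `max_n` parameter are illustrative choices, not the evaluation code used in the talk; production work would normally rely on an established implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against one reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Modified precision: each candidate n-gram is credited at most
        # as many times as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)
```

For short sentences a smaller `max_n` (for example 2) avoids the score collapsing to zero just because no 4-gram exists.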

2. Speech Processing

Automatic Speech Recognition (ASR)

Example utterance: "أنا سامع" ("I am listening")

Automatic Speech Recognition

Figure (ASR architecture), block labels: feature vectors; decoder (modeling/classification and search); dictionary model; acoustic model; language model; training data (1200 hours); recognized text.
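The "feature vectors" block of the ASR figure can be illustrated with a deliberately simplified sketch. The talk does not say which features were used; real recognizers typically use MFCC or filterbank features. As a stand-in, the code below cuts a waveform into overlapping frames and computes two basic values per frame (log energy and zero-crossing rate). All names and the 25 ms / 10 ms framing defaults are assumptions.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a waveform into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def frame_features(frame):
    """Return (log energy, zero-crossing rate) for one frame."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    # The small constant keeps log() defined for silent frames.
    return (math.log(energy + 1e-10), crossings / (len(frame) - 1))


def extract_features(samples, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Turn a list of audio samples into one feature vector per frame."""
    frame_len = sample_rate * frame_ms // 1000
    hop = sample_rate * hop_ms // 1000
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]
```

The resulting sequence of vectors is what a decoder would score against the acoustic, dictionary, and language models.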

Demos

Example 1

Example 2

Example 3

ArSL https://youtu.be/iBZqCt13JQs

ASL (Virtual Character) WebSign (SML&XML)

Jalees Reader http://www.jaleesreader.com/DemoArabic/OEBPS/pages/index00.html

3. Arabic Documents: Manuscript Analysis and Retrieval

Special Arabic Characteristics

1. Connectivity properties.

2. Dotting properties.

3. Multiple graphemes (location dependent).

4. Ligature properties. Arabic word segments can be represented by a single atomic grapheme.

5. Overlapping properties.

6. Font size properties. Arabic graphemes do not have a fixed height or a fixed width.

Font families and variations: Naskh (نسخ), Ruq'ah (رقعة), Kufi (كوفي), ...

▪ Historical documents are challenging and difficult to recognize: Arabic calligraphy is cursive in nature, is composed of dots and diacritics, and comes in different writing styles.

▪ We propose an implementation approach for document layout analysis, feature extraction, object segmentation, and recognition of Arabic historical documents using deep learning.

▪ Collect Arabic manuscript images in a dataset (dataset collection).

▪ Extract features from the Arabic manuscripts.

Optical Character Recognition

Arabic OCR Overview

Large volumes of Arabic documents:

Early Printed Documents

Printed Documents

Calligraphy Documents

Handwritten Documents

Historical Documents

1. Which domains should the document datasets cover to start with?

2. What are the sizes and volumes of the images to be processed in the training and classification phases?

3. How should accuracy and system performance be measured?

Document Understanding System Modules

First, an image is described by object data of different types:

1. Graphic information, where the whole image is represented as a sequence of orthogonal pixel runs [1]

2. Segmented information to describe texture regions

3. Layout data description to represent the arrangement of objects (their geometry)

4. Symbolic data representing multiple glyph images

Modules / Approaches: Processes (extracted attributes and features)

Preprocessing: binarization; document enhancement and noise removal; skew and slant detection.

Layout Analysis: document categorization; page orientation; segmentation into text and non-text regions.
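As an illustration of the binarization step listed under preprocessing, the sketch below uses Otsu's global threshold on an 8-bit grayscale image stored as a list of rows. Otsu's method is a common choice and is used here as an assumption; the talk does not state which binarization algorithm was applied, and degraded manuscripts often need adaptive (local) thresholds instead.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the lower class
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    threshold = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= threshold else 0 for p in row] for row in image]
```

The convention that ink is 1 and background is 0 is a choice made for this sketch; it suits later steps that look for connected dark components.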

Arabic Documents Types Overview

Figure panels: Islamic/Christian observatory; a chemical process in an Arabic manuscript; patterns used to decorate buildings; painting in honor of Sultan Murad III (1574-95).

Image source: https://www.google.com.sa/search?q=Islamic+manuscripts+%2B+PPT&safe=active&sa=N&biw=1164&bih=595&tbm=isch&tbo=u&source=univ&ved=0ahUKEwib9aPVm7HLAhVDORQKHY-9A4w4KBDsCQgu

Page Layout Analysis and Decomposition

Figure (block labels): Document Preprocessing & Layout Analysis; Textual Regions; Non-Textual Regions; Optical Character Recognition; Graphical Processing; Regions and Symbol Processing; Unified/Universal Description.

Document Image Analysis

  Textual Processing

    Optical Character Recognition: text

    Page Layout Analysis: skew, blocks, paragraphs

  Graphical Processing

    Line Processing: lines, curves, corners

    RoI Processing: filled regions

Text/Image Separation

Intervals between peaks (Indian language example)

Line Separation

▪ Ascenders and descenders interfere with neighboring lines

▪ Region-growing approach

▪ In Devanagari, a single word is a single connected component

▪ Grow regions using horizontally adjacent components
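The region-growing idea for line separation can be sketched as follows, assuming connected components have already been extracted as bounding boxes `(x0, y0, x1, y1)` with y increasing downward. The function names and the `max_gap` parameter are hypothetical; this is one simple way to grow regions from horizontally adjacent components, not necessarily the method shown in the slides.

```python
def vertical_overlap(a, b):
    """Height in pixels shared by two boxes; zero or negative means no overlap."""
    return min(a[3], b[3]) - max(a[1], b[1])


def grow_lines(boxes, max_gap=20):
    """Group component boxes into text lines by growing regions left to right."""
    lines = []  # each entry is [region_box, member_boxes]
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            region = line[0]
            gap = box[0] - region[2]
            # Attach the component if it sits on the same band of the page
            # and is close enough to the right edge of the grown region.
            if vertical_overlap(region, box) > 0 and gap <= max_gap:
                line[0] = (min(region[0], box[0]), min(region[1], box[1]),
                           max(region[2], box[2]), max(region[3], box[3]))
                line[1].append(box)
                break
        else:
            lines.append([box, [box]])
    # Return the member boxes of each line, ordered top to bottom.
    return sorted((members for _, members in lines),
                  key=lambda m: min(b[1] for b in m))
```

Because each region's box grows to include tall ascenders and deep descenders, closely spaced lines can end up overlapping vertically and get merged, which is the interference problem noted above.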

Analysis of Arabic Calligraphy Pages

Figure: Example of an Arabic calligraphy document after the pre-processing steps: (a) original document, (b) after binarization, (c) after de-noising, (d) after de-framing, (f) after de-skewing, (g) after segmentation.

Figure: Example of an Arabic printed document after the pre-processing steps: (a) original document, (b) after binarization, (c) after de-skewing, (d) after de-noising, (f) after de-framing, (g) after segmentation.

Figure: A second example of an Arabic printed document after the same pre-processing steps, with panels (a) to (g) as in the previous figure.

Overall Evaluation: OCR Results Accuracy (%)

Scanned Image    Version 1.0    Version 2.0

#1               82.35 %        90.50 %

#2               86.43 %        94.97 %

#3               0.74 %         90.07 %
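One common way to obtain an OCR accuracy percentage like those reported above is character-level accuracy derived from the edit distance between the recognized text and a ground-truth transcription. The slides do not say how their figures were computed, so the sketch below is an assumption, and the function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,            # delete ca
                               current[j - 1] + 1,         # insert cb
                               previous[j - 1] + (ca != cb)))  # substitute
        previous = current
    return previous[-1]


def character_accuracy(recognized: str, ground_truth: str) -> float:
    """Percentage of ground-truth characters recovered, floored at zero."""
    if not ground_truth:
        return 100.0 if not recognized else 0.0
    errors = edit_distance(recognized, ground_truth)
    return max(0.0, 100.0 * (1 - errors / len(ground_truth)))
```

Under this definition, one wrong character in a four-character word gives 75% accuracy; word-level accuracy would be computed the same way over token lists.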

IR Design View

Implementation View

Evaluation:

(1) Human expertise

(2) IR accuracy

Document Analysis / Machine Translation

Source Document → Target Translation (Arabic)

Vietnamese document → OCR (Tesseract) → Translation (Google) → Arabic language

German document → OCR (Tesseract) → Translation (Google) → English language

I need a beer!
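The two-stage flow above (OCR first, then translation) can be sketched as a small pipeline that takes the two engines as plain functions. Everything here is illustrative: `ocr_then_translate`, `fake_ocr`, and `fake_translate` are hypothetical names, the file name and the German source sentence are invented for the example, and the stand-in functions only mimic what a real OCR engine such as Tesseract and a real translation service would return.

```python
from typing import Callable


def ocr_then_translate(image_path: str,
                       ocr: Callable[[str], str],
                       translate: Callable[[str], str]) -> str:
    """Run OCR on an image, then translate the recognized text."""
    text = ocr(image_path).strip()
    if not text:
        return ""  # nothing was recognized, so there is nothing to translate
    return translate(text)


# Stand-in engines so the sketch runs without external tools or services.
def fake_ocr(image_path: str) -> str:
    # Hypothetical recognized text for a hypothetical image file.
    return {"sign.png": "Ich brauche ein Bier!\n"}.get(image_path, "")


def fake_translate(text: str) -> str:
    # Hypothetical German-to-English lookup; unknown text passes through.
    return {"Ich brauche ein Bier!": "I need a beer!"}.get(text, text)


print(ocr_then_translate("sign.png", fake_ocr, fake_translate))
```

Passing the engines in as arguments keeps the pipeline independent of any particular OCR or translation backend, so either stage can be swapped or evaluated separately.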

New Technologies

▪ BCI: Brain-Computer Interface, also called brain-machine interface (BMI)

▪ EEG: electroencephalogram (electrical activity of thinking)

▪ ECG: electrocardiogram (electrical activity of the heart)

▪ Head-mounted display (HMD): a display device paired to a headset such as a harness or helmet

▪ Eyeglasses: eyewear that employs cameras to intercept the real-world view and re-display its augmented view through the eyepieces

▪ Mobile technology

▪ Augmented, virtual, and mixed realities

▪ Sign language

Conclusion

▪ Several tests were measured successfully, with 90% accuracy for all the examples.

▪ The audio dataset is prepared and labeled to contain Arabic audio signals with different accents, sizes, domains, styles, directions, and skew.

▪ The OCR system used basic computer vision and image processing algorithms (edge detection, contour extraction, and contour filtering) to segment characters and words from an input image.

▪ The datasets included bilingual lexicons (MT), audio signals (SP), and document images of calligraphy manuscripts, early printed documents, and printed documents (OCR).

▪ We need to unify international languages for use in direct translation.

▪ Using new technologies:

  Study anytime and anywhere.

  Integrate with VR, AR, and mixed reality to help others.

Questions are welcome.