Language Technologies for Geomatics: From Intelligence to Agility
Vision Géomatique - 2014-11-12
Stéphane Gagnon, Ph.D.
Professor, DSA, UQO
Outline
1. Business Intelligence
2. Language Technologies
3. Geomatics Applications
4. Big Data and Geo-Agility
Abstract
Language Technologies are used for automated text analytics, and rely on a blend of Linguistics, Artificial Intelligence (AI), and Decision Sciences.
They include such applications as content management, document indexing and search, text classification, automated translation, geographic and contextual localization, semantic web, real-time text stream processing, event pattern analysis, and others.
We present a brief discussion of how Language Technologies may be integrated with geomatics applications, not simply to enhance business and decisional intelligence, but with the aim of making organizations more agile and resilient in the face of risk and uncertainty.
Sources
Bachelor of Business Administration - Management Information Systems
SIG1003 - Information Systems for Managers
Efraim Turban, Linda Volonino, Gregory Wood, and Janice Sipior (2013), Information Technology for Management: Advancing Sustainable, Profitable Business Growth, 9th edition, New York, Wiley, 480 pages, ISBN: 9781118547861
SIG1043 - Business Intelligence
Ramesh Sharda, Dursun Delen, and Efraim Turban (2013), Business Intelligence: A Managerial Perspective on Analytics, CourseSmart eTextbook, 3rd edition, New York, Pearson Higher Education, 330 pages, ISBN: 9780133051070
Typical BI Architecture
[Figure: Typical BI architecture, in four parts. Data Warehouse Environment: technical staff build the data warehouse from data sources (organizing, summarizing, standardizing). Business Analytics Environment: business users access the warehouse, manipulate data, and obtain results. Performance and Strategy: managers and executives apply the BPM strategy. User Interface: browser, portal, dashboard. Future component: intelligent systems.]
Data Mining (DM)
[Figure: Data mining at the intersection of Statistics, Management Science & Information Systems, Artificial Intelligence, Databases, Pattern Recognition, Machine Learning, and Mathematical Modeling.]
DM Tasks
Data Mining Task | Learning Method | Popular Algorithms
Prediction | Supervised | Classification and Regression Trees, ANN, SVM, Genetic Algorithms
Classification | Supervised | Decision trees, ANN/MLP, SVM, Rough sets, Genetic Algorithms
Regression | Supervised | Linear/Nonlinear Regression, Regression trees, ANN/MLP, SVM
Association | Unsupervised | Apriori, OneR, ZeroR, Eclat
Link analysis | Unsupervised | Expectation Maximization, Apriori Algorithm, Graph-based Matching
Sequence analysis | Unsupervised | Apriori Algorithm, FP-Growth technique
Clustering | Unsupervised | K-means, ANN/SOM
Outlier analysis | Unsupervised | K-means, Expectation Maximization (EM)
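As one illustration of the unsupervised methods listed for clustering, here is a minimal k-means sketch in plain Python. It assumes 2-D points (for example, projected coordinates) and fixed starting centroids. The function names `assign` and `kmeans` and the sample points are hypothetical, chosen only for this example.

```python
from math import dist

def assign(points, centroids):
    """Assignment step: group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # Update step: move each centroid to the mean of its members;
        # a centroid with no members stays where it is.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical sample: two visibly separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(clusters[0])  # [(0, 0), (0, 1), (1, 0)]
```

In practice the result depends on the starting centroids, so a real analysis would typically try several initializations.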
Language Technologies
Statistical Methods: analyze documents as bags of words
Semantic Methods: analyze documents using tags from ontologies describing relationships
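To make the bag-of-words idea concrete, the sketch below counts tokens and discards word order. The function name `bag_of_words` and the two sample sentences are hypothetical. The example also hints at why semantic methods are needed: two sentences stating opposite spatial relationships produce the same bag.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count them."""
    return Counter(re.findall(r"\w+", text.lower()))

# Hypothetical sentences with opposite spatial meaning.
a = bag_of_words("The bridge is north of the river")
b = bag_of_words("The river is north of the bridge")

print(a["the"])  # 2
print(a == b)    # True: word order, and so the relationship, is lost
```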
Statistical Methods
Information retrieval/search
Topic/keyword tracking
Geo-language recognition
Categorization/classification
Clustering/recommendation
Concept linking/association rules
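As a rough illustration of the first item, information retrieval/search, the following sketch builds an inverted index and answers boolean AND queries. The names `build_index` and `search` and the three sample documents are hypothetical. Real search engines add ranking, stemming, and stop-word handling, which are omitted here.

```python
from collections import defaultdict

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical documents.
docs = {
    "d1": "flood warning issued for gatineau",
    "d2": "road closure in ottawa",
    "d3": "gatineau road flood update",
}
index = build_index(docs)
print(sorted(search(index, "gatineau flood")))  # ['d1', 'd3']
```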
Semantic Methods
Natural Language Processing (NLP)
Part-of-speech tagging
Text segmentation
Word sense disambiguation
Syntax ambiguity
Imperfect or irregular input
Speech acts
NLP Tasks
Information extraction
Named-entity recognition
Question answering
Automatic summarization
Natural language generation & understanding
Machine translation
Foreign language reading & writing
Speech recognition
Text proofing, optical character recognition
Sentiment analysis
Text Mining (TM) Process
Task 1: Establish the Corpus. Collect and organize the domain-specific unstructured data.
The inputs to the process include a variety of relevant unstructured (and semi-structured) data sources such as text, XML, HTML, etc.
The output of Task 1 is a collection of documents in some digitized format for computer processing.
Task 2: Create the Term-Document Matrix. Introduce structure to the corpus.
The output of Task 2 is a flat file called the term-document matrix, where the cells are populated with the term frequencies.
Task 3: Extract Knowledge. Discover novel patterns from the term-document (T-D) matrix.
The output of Task 3 is a number of problem-specific classification, association, and clustering models and visualizations.
Feedback from the later tasks flows back to the earlier ones.
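The three tasks can be sketched end to end on a toy corpus. This is a minimal illustration, not a production pipeline. The corpus, the function names `term_document_matrix` and `cosine`, and the choice of cosine similarity as the "pattern" to extract are all assumptions made for this example.

```python
import re
from collections import Counter
from math import sqrt

def term_document_matrix(corpus):
    """Task 2: return (terms, matrix); matrix[i][j] is the count of terms[i] in document j."""
    counts = [Counter(re.findall(r"\w+", doc.lower())) for doc in corpus]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix

def cosine(matrix, j, k):
    """Task 3 (one simple pattern): cosine similarity between document columns j and k."""
    dot = sum(row[j] * row[k] for row in matrix)
    norm_j = sqrt(sum(row[j] ** 2 for row in matrix))
    norm_k = sqrt(sum(row[k] ** 2 for row in matrix))
    return dot / (norm_j * norm_k) if norm_j and norm_k else 0.0

# Task 1: a hypothetical three-document corpus.
corpus = [
    "flood risk near the river",
    "river flood risk map",
    "election poll results",
]
terms, matrix = term_document_matrix(corpus)
print(matrix[terms.index("flood")])  # [1, 1, 0]
print(cosine(matrix, 0, 2))          # 0.0
```

Documents 0 and 1 share three terms and score well above zero, while document 2 shares none with document 0. Similarity scores of this kind are what clustering and classification models built on the matrix rely on.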
Web Mining
[Figure: Customer interaction on the Web feeds the analysis of interactions (web analytics, voice of customer, customer experience management), which yields knowledge about the holistic view of the customer.]
IBM Watson QA
[Figure: IBM Watson QA pipeline in five numbered stages. A question goes through question analysis and query decomposition; each decomposed query then passes through hypothesis generation (primary search over answer sources, candidate answer generation), soft filtering, and hypothesis and evidence scoring (support evidence retrieval from evidence sources, deep evidence scoring); the results go through synthesis, then final merging and ranking with trained models, producing an answer and a confidence.]
TM for Lies
[Figure: Text mining process for deception detection. Statements transcribed for processing; statements labeled as truthful or deceptive by law enforcement; cues extracted and selected; text processing software identified cues in statements; text processing software generated quantified cues; classification models trained and tested on quantified cues.]
Geo-Textual Contextualization
[Figure: Context diagram for process A0, "Extract knowledge from available data sources".
Inputs: unstructured data (text), structured data (databases)
Output: context-specific knowledge
Constraints: software/hardware limitations, privacy issues, linguistic limitations
Mechanisms: tools and techniques, domain expertise
Geo annotations: Geo-Localized Contents, Geographic Information, Geo-Intelligence Models, Geo-Information Sensors]
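One simple way to attach geographic context to unstructured text is a gazetteer lookup, sketched below. The `GAZETTEER` table (with approximate, rounded coordinates), the function name `geotag`, and the sample sentence are hypothetical. The sketch matches single-word place names only and performs no disambiguation, which is one face of the linguistic limitations noted in the diagram.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "gatineau": (45.48, -75.70),
    "ottawa": (45.42, -75.70),
    "montreal": (45.50, -73.57),
}

def geotag(text, gazetteer=GAZETTEER):
    """Return (place, (lat, lon)) pairs for gazetteer names found in the text, in order of first mention."""
    found = []
    for token in re.findall(r"\w+", text.lower()):
        if token in gazetteer and token not in found:
            found.append(token)
    return [(name, gazetteer[name]) for name in found]

tags = geotag("Flooding reported in Gatineau; Ottawa crews on standby near Gatineau.")
print([name for name, _ in tags])  # ['gatineau', 'ottawa']
```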
Geo-Analytics of Voter Talk
INPUT: Data Sources
- Census data: population specifics, age, race, sex, income, etc.
- Election databases: party affiliations, previous election outcomes, trends and distributions
- Market research: polls, recent trends and movements
- Social media: Facebook, Twitter, LinkedIn, newsgroups, blogs, etc.
- Web (in general): web pages, posts and replies, search trends, etc.
- Other data sources
OUTPUT: Goals
- Raise money contributions
- Increase number of volunteers
- Organize movements
- Mobilize voters to get out and vote
- Other goals and objectives
Big Data & Analytics (Data Mining, Web Mining, Text Mining, Multi-media Mining)
- Predicting outcomes and trends
- Identifying associations between events and outcomes
- Assessing and measuring the sentiments
- Profiling (clustering) groups with similar behavioral patterns
- Other knowledge nuggets
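As a sketch of the "assessing and measuring the sentiments" step with a geographic dimension, the code below averages a naive lexicon-based sentiment score per region. The word lists, the region labels, the sample posts, and the function names `score` and `sentiment_by_region` are all hypothetical. Real systems would use far richer lexicons or trained classifiers.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "great", "support"}
NEGATIVE = {"bad", "poor", "against"}

def score(text):
    """Count positive tokens minus negative tokens."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def sentiment_by_region(posts):
    """Average the sentiment score of (region, text) posts for each region."""
    scores = defaultdict(list)
    for region, text in posts:
        scores[region].append(score(text))
    return {region: sum(s) / len(s) for region, s in scores.items()}

# Hypothetical geo-tagged posts.
posts = [
    ("north", "great plan we support it"),
    ("north", "bad idea"),
    ("south", "poor turnout and bad roads"),
]
print(sentiment_by_region(posts))  # {'north': 0.5, 'south': -2.0}
```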
BI and Agility
Process efficiency and cost reduction
Brand management
Revenue maximization, cross-selling/up-selling
Enhanced customer experience
Churn identification, customer recruiting
Improved customer service
Identifying new products and market opportunities
Risk management
Regulatory compliance
Enhanced security capabilities
Big Data - Definition
Big Data means different things to people with different backgrounds and interests
Traditionally, “Big Data” meant gigabytes and terabytes of data
E.g., volume of data at CERN, NASA, Google, …
The Vs that define Big Data
Volume
Variety
Velocity
Veracity
Variability
Value
Big Data Examples
Data Sources
Web text documents
Multimedia annotations
Web logs
RFID
GPS systems
Sensor networks
Social networks
Internet search indexes
Call detail records
Application Domains
Financial markets
Broadcasting
Biology and life sciences
Healthcare informatics
Transportation
Security and defense
Atmospheric science
Genomics and research
Energy and SCADA
Big Data Architecture
[Figure: Unified Data Architecture, system conceptual view. Big data sources (ERP, SCM, CRM, images, audio and video, machine logs, text, web and social) feed, through move, manage, and access layers, a data platform, an integrated data warehouse, a discovery platform, and event processing. These serve analytic tools and apps (math and stats, data mining, business intelligence, applications, languages, marketing), which in turn serve users (marketing, executives, operational systems, frontline workers, customers, partners, engineers, data scientists, business analysts).]
Big Data Requirements
Keys to Success with Big Data Analytics
A clear business need
Strong, committed sponsorship
Alignment between the business and IT strategy
A fact-based decision-making culture
A strong data infrastructure
The right analytics tools
Personnel with advanced analytical skills
Conclusion: Toward Geo-Agility
People-Centric: Track geo-information from key individuals and assets across and around the enterprise
Contextualize: Add geo-information to unstructured contents; use DM and TM with geo-analytics
Exploration: Link contextualized geo-information to real-time decision requirements
Open: Leverage open and mobile sources
Big Data: Build real-time streaming capabilities
Event-Driven: Develop organizational agility and resilience, and the capability to automate adaptation
Emergent Strategies: Adapt business strategy along with evidence-based decision-making
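The real-time streaming and event-driven points can be illustrated with a sliding-window counter that flags a region once enough geo-tagged events arrive within a time window. The class name `WindowedCounter`, the window length, the threshold, and the sample events are hypothetical. The sketch assumes timestamps (in seconds) arrive in non-decreasing order.

```python
from collections import defaultdict, deque

class WindowedCounter:
    """Track recent event timestamps per region and flag bursts."""

    def __init__(self, window_seconds, threshold):
        self.window = window_seconds
        self.threshold = threshold
        self.events = defaultdict(deque)  # region -> timestamps still inside the window

    def observe(self, region, timestamp):
        """Record one event; return True when the region has reached the threshold."""
        recent = self.events[region]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and recent[0] <= timestamp - self.window:
            recent.popleft()
        return len(recent) >= self.threshold

# Hypothetical stream: three events within 60 seconds trigger an alert.
counter = WindowedCounter(window_seconds=60, threshold=3)
alerts = [counter.observe("hull", t) for t in (0, 10, 20, 100)]
print(alerts)  # [False, False, True, False]
```

An alert of this kind could then trigger an automated response, which is the sense of "capability to automate adaptation" in the list above.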
Thank you!
Stéphane Gagnon, Ph.D.
Associate Professor
Department of Administrative Sciences
Université du Québec en Outaouais
Pavillon Lucien-Brault
101, rue St-Jean-Bosco, Local A2228
C.P. 1250, succursale Hull
Gatineau (Québec) J8X 3X7
Telephone: 819-595-3900, ext. 1942
Fax: 819-773-1747
Email: [email protected]
Web: http://www.gagnontech.org
Skype: stephanegagnon
Photo credits: SJ: http://www.lgt.ws, AT and LB: http://www.flickr.com/photos/uqo/