FirstDIG
First Data Investigation on the Grid
Paul Graham, Terry Sloan, Adam Carter
EPCC
Ian Gregory, Darren Unwin
First South Yorkshire
tel:+44 (0)131 650 5155 email:[email protected]
Description
First plc - UK’s largest public transport operator
Data sources
  Huge range – mileage, revenue, fuel, maintenance, routes …
  Collected – manually, ticket machines, GPS …
Disparate DBMS
  Acquisitions, historical, OS, physical location, representation …
  Issues NOT unique to the bus industry
Fine for day to day operations, but …
  Business questions – data from >1 source
    Complaints vs Lateness, Revenue vs Lost Miles …
    Aggregation – by service, by day, weekdays only …
  Introduces challenges for data analysis
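The kind of aggregation named above (by service, weekdays only) can be sketched in a few lines of Python. This is only an illustration: the record layout, the service names `S1` and `S2`, and the lost-mile values are all hypothetical and are not taken from First's data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (service, day, lost_miles). All values are invented.
MILEAGE = [
    ("S1", date(2003, 9, 1), 12.0),  # Monday
    ("S1", date(2003, 9, 2), 8.0),   # Tuesday
    ("S1", date(2003, 9, 6), 30.0),  # Saturday
    ("S2", date(2003, 9, 1), 5.5),   # Monday
    ("S2", date(2003, 9, 7), 9.0),   # Sunday
]


def lost_miles_by_service(records, weekdays_only=True):
    """Sum lost miles per service, optionally skipping Saturdays and Sundays."""
    totals = defaultdict(float)
    for service, day, lost_miles in records:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        if weekdays_only and day.weekday() >= 5:
            continue
        totals[service] += lost_miles
    return dict(totals)


if __name__ == "__main__":
    print(lost_miles_by_service(MILEAGE))
    print(lost_miles_by_service(MILEAGE, weekdays_only=False))
```

The aggregation itself is simple once the records sit in one place; the difficulty described on this slide is getting data from more than one source into that position.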
Description
First South Yorkshire situation
  No common interface
  No common reporting process
  Statistics produced manually when required
    Labour intensive
    Not performed often or well
  Process to produce what is needed
    Expensive
    Impractical
Description and Aims
OGSA-DAI (Open Grid Services Architecture: Data Access and Integration)
  Assists with the access and integration of data from separate data sources via Grid Services
Our remit:
  To evaluate the suitability of OGSA-DAI for use in a commercial environment. Whether OGSA-DAI:
    Is appropriate, secure, straightforward to deploy and use …
    Does what we need!
  Provide feedback to OGSA-DAI team
Aims
  1. Demonstrate deployment of OGSA-DAI within the First South Yorkshire bus operational environment and learn from it
  2. Short data analysis using OGSA-DAI service-enabled data sources to answer business questions posed by First South Yorkshire
Status: Workpackages
WP 1: Data Source requirements capture (FINISHED)
  D1.1 Data Source Requirements Capture & D1.2 Organisation Data Schema (COMMERCIAL-IN-CONFIDENCE)
WP 2: Development of data interfaces (FINISHED)
  OGSA-DAI Deployment
WP 3: Deployment & refinement of OGSA-DAI (FINISHED)
  First Data Service Browser User Guide
  First Data Service Browser Software
WP 4: Data mining requirements capture (FINISHED)
  D4.1 Data Mining Requirements Capture (COMMERCIAL-IN-CONFIDENCE)
WP 5: Initial data mining analysis (FINISHED)
  D5.1 Initial Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
WP 6: Data mining detailed analysis (FINISHED)
  D6.1 Final Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
Technical Achievements 1
Data Mining
  Combined two databases to answer First’s business questions
    The Customer Contact System (CCS)
      Microsoft Access
      Information on customer complaints e.g. time, service, nature
    The Mileage database
      dBASE IV
      Information on bus mileage e.g. lost miles
  Also investigated Revenue and Schedule Adherence suitability for data mining
  Produced detailed data mining report
Technical Achievements 2
OGSA-DAI deployment at First South Yorkshire
  Created Grid Data Services (GDS) for DBMS previously unsupported by OGSA-DAI
    MS Access – CCS, dBASE IV – Mileage
  Investigated GDS for SQL Server and CSV-based DBMS
Rigorously exercised use of OGSA-DAI in a commercial setting:
  Identified numerous areas for improvement in OGSA-DAI
  Identified new requirements for use of OGSA-DAI in business
  Confirmed the relevance and potential of OGSA-DAI for business
Technical Achievements 3
Data Service Browser
  Identified need to aid ‘ease of use’ for OGSA-DAI Middleware
  Developed a generic Grid Data Service Browser
    Simple GUI – avoids XML etc
    Allows SQL queries and updates to databases
    Enables JOIN queries across databases
  Will be included in future OGSA-DAI releases
  … demo later
Achievements – First’s perspective
Project has proven that:
  There is a cost-effective solution that First South Yorkshire can utilise
  First can get to its data and analyse it in a useful manner
  With considerably reduced labour time First can produce more accurate and more wide-ranging information for the business management
Achievements
“the results of this exercise will revolutionise the way we do things in the bus industry”
Darren Unwin
Divisional IT Manager
Dissemination
Presentations
  Ernst & Young, WestInfo Services, Strategy & Performance Associates, SingTel Optus, Executive Briefing Centre, Curtin Business School, Curtin University of Technology, Perth, Australia, February 24th, 26th, 2004
  Curtin Business School Information Systems Seminar, Curtin University of Technology, Perth, Australia, February 20th 2004
  UK e-Science booth, Supercomputing 2003, Phoenix, USA, November 2003
Flyers
  UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
Posters
  UK e-Science All Hands Conference, Nottingham, UK, 2-4 September 2003
Articles
  T.M. Sloan, A. Carter, P.J. Graham, D. Unwin, I. Gregory, "First Data Investigation on the Grid: FirstDIG", Proceedings of the 2nd UK e-Science All Hands Meeting, 2-4 September 2003, Nottingham, UK
Exploitation
First Data Service Browser is being used and extended in the INWA project with Curtin Business School, Perth, Australia
First are keen to extend their deployment to other databases
Future Plans
Project is finished, no effort remaining.
Incorporation of First Data Service Browser into future releases of OGSA-DAI
First South Yorkshire want to build management reporting applications based on OGSA-DAI
Demo
Data Service Browser
  Accessing three different DBMS
    Mileage, CCS, MySQL
  A JOIN – similar to the queries required for the data mining
    Easy within one DB, requires intermediary steps for distributed DB
    Without OGSA-DAI would have been impractical
  Looking at Lost Miles and Customer Complaints
Run the Demo
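The intermediary steps a distributed JOIN needs can be sketched as follows. This is a minimal illustration under stated assumptions, not the Data Service Browser's implementation and not the OGSA-DAI API: two separate in-memory SQLite databases stand in for the Mileage and CCS sources, and every table name, column name, and value is hypothetical.

```python
import sqlite3


def make_mileage_db():
    """Stand-in for the Mileage source (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE mileage (day TEXT, service TEXT, lost_miles REAL)")
    db.executemany("INSERT INTO mileage VALUES (?, ?, ?)", [
        ("2003-09-01", "S1", 12.0),
        ("2003-09-01", "S2", 5.5),
        ("2003-09-02", "S1", 8.0),
    ])
    return db


def make_complaints_db():
    """Stand-in for the CCS source (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE complaints (day TEXT, service TEXT, nature TEXT)")
    db.executemany("INSERT INTO complaints VALUES (?, ?, ?)", [
        ("2003-09-01", "S1", "late"),
        ("2003-09-01", "S1", "did not run"),
        ("2003-09-02", "S1", "late"),
    ])
    return db


def join_lost_miles_with_complaints(mileage_db, complaints_db):
    # Step 1: query each source on its own; neither can see the other's tables.
    lost = mileage_db.execute(
        "SELECT day, service, SUM(lost_miles) FROM mileage GROUP BY day, service"
    ).fetchall()
    counts = complaints_db.execute(
        "SELECT day, service, COUNT(*) FROM complaints GROUP BY day, service"
    ).fetchall()

    # Step 2: stage both partial results in a single scratch database.
    scratch = sqlite3.connect(":memory:")
    scratch.execute("CREATE TABLE lost (day TEXT, service TEXT, lost_miles REAL)")
    scratch.execute("CREATE TABLE counts (day TEXT, service TEXT, n INTEGER)")
    scratch.executemany("INSERT INTO lost VALUES (?, ?, ?)", lost)
    scratch.executemany("INSERT INTO counts VALUES (?, ?, ?)", counts)

    # Step 3: the JOIN is now an ordinary single-database query.
    rows = scratch.execute(
        """
        SELECT lost.day, lost.service, lost.lost_miles, COALESCE(counts.n, 0)
        FROM lost
        LEFT JOIN counts
          ON lost.day = counts.day AND lost.service = counts.service
        ORDER BY lost.day, lost.service
        """
    ).fetchall()
    scratch.close()
    return rows


if __name__ == "__main__":
    for row in join_lost_miles_with_complaints(make_mileage_db(), make_complaints_db()):
        print(row)
```

A LEFT JOIN is used in this sketch so that days with lost miles but no complaints still appear, with a complaint count of zero.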
[Chart: Lost miles and Number of Complaints, plotted by Date; series: Lost miles, Complaints]
In Conclusion
Successfully demonstrated the use of Grid middleware in a ‘real-world’ environment
OGSA-DAI team:
  Gained (in)valuable feedback
  Incorporated Data Service Browser
First
  Discovered valuable information from their data which would have otherwise been practically unobtainable
  Keen to extend to other DBMS