Brochure

Date post: 25-Sep-2015
  • BiDA2014

    BiDA2014

    A National Workshop On

    Big Data Analytics

    at CRRao AIMSCS and CMSD, UoH,

    Hyderabad, India

    on 22-24 August, 2014

    Organized jointly by:

    CR Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS),

    Centre for Modelling, Simulation and Design (CMSD), University of Hyderabad (UoH),

    and

    Computer Society of India (CSI)

    Foreword

    Message from Professor C R Rao

    BiDA2014 Schedule

    Inaugural address by H R Mohan

    Invited Speakers (in alphabetical order)

    S Chattopadhyay, C-DAC Bangalore

    C Hota, BITS Pilani, Hyderabad

    K Karlapalem, IIIT Hyderabad

    P Manimaran, CRRao AIMSCS

    B L S Prakasa Rao, CRRao AIMSCS

    K Prasad, C-DAC Bangalore

    A K Pujari, UoH, Hyderabad

    S Pyne, CRRao AIMSCS

    K S Rajan, IIIT Hyderabad

    S B Rao, CRRao AIMSCS

    V C V Rao, C-DAC Pune

    Y Simmhan, IISc Bangalore

    S K Udgata, CMSD, UoH, Hyderabad

    Expert Panel Discussion

    Technical Experts

    V C V Rao, V Handa, R R Naik, S Rout, C-DAC Bangalore

    Foreword

    Welcome to the First National Workshop on Big Data Analytics (BiDA2014), 22-24 August, 2014, in Hyderabad. The data and information deluge is a truly remarkable phenomenon in today's world. Today we no longer need to go out seeking precious bits of data; it is, in fact, the staggering amounts of data that stare at us begging our attention, whether or not we are suitably equipped to tackle the numerous challenges that they present. Indeed, with their ever-increasing volume, velocity and variety, large datasets generated from different sources in all spheres of life continue to pose new challenges, and increasing demands, on computational analysis and data management. The inherent complexities (e.g., data in motion, unstructured nature, security, veracity, ethical issues) of Big Data have forced us to rethink how we can collect, store, combine and analyze it in an efficient, reliable and cost-effective manner. The foundations and capabilities to understand and address these needs must be built up urgently.

    Large amounts of data from all sectors, ranging from health to education, environment to agriculture, commerce to art and humanities, security to governance, etc., now provide us, professionals and researchers alike, with an unprecedented opportunity to make an informed difference: introduce new approaches to industry and policy-making, take objective data-driven decisions, take advantage of inference and prediction, gain insights on risks and protect health and wealth, create more useful infrastructure, offer enhanced and dynamic understanding of various phenomena, and through all these myriad abilities, profoundly empower our lives and the world around us.

    CR Rao AIMSCS embodies the foresight and vision of Professor C R Rao in uniting the analytical trinity of Mathematics, Statistics and Computer Science under one roof. It is, therefore, both timely and appropriate for CR Rao AIMSCS to organize BiDA2014, jointly with the Centre for Modelling, Simulation and Design (CMSD), University of Hyderabad (UoH), and the Computer Society of India (CSI), which is celebrating its Golden Jubilee Year in 2014.

    As the field of Big Data is taking shape around the world, we have decided to offer in this workshop both knowledge and skills that are relevant to Big Data. The participants will learn about different domains and applications, both established and emerging, in which the task of dealing with large and complex datasets is becoming increasingly unavoidable and, in fact, highly sought after. Distinguished experts will speak on the current problems and technical aspects in their areas of research, while a panel of experts from the government and the industry will share their perspectives with the participants. I thank all of them for agreeing to speak at the workshop. Further, we have also arranged for an experienced team of technical experts from the Centre for Development of Advanced Computing (C-DAC) to provide the participants with multiple sessions of hands-on laboratory training to demonstrate and work on practical problems in Big Data. We thank them for their valuable time and expertise.

    We are pleased to note that academia, industry and the government are all represented in both our list of speakers and our list of participants. Notably, the latter spans the length and breadth of the country, in the true spirit of a National Workshop. We wish all of them a most enriching experience here.

    I take this opportunity to thank all my colleagues and scholars, as well as the co-organizers and the Organizing Committee members, for their hard work in organizing the workshop. I also thank the University of Hyderabad, C-DAC, CSI Hyderabad Chapter and IIIT Hyderabad for their kind and enthusiastic co-operation. Finally, we express our gratitude to all the sponsors, especially DST and MoS&PI, for providing generous support to the event.

    Thank you and best wishes,

    Saumyadipta Pyne

    Chair, BiDA2014, CR Rao AIMSCS, Hyderabad, India.

    15th August, 2014.

    Message from Professor C R Rao

    Statisticians are used to developing methodologies for analysis of data

    collected for a specific purpose in a planned way. Sample surveys and design of experiments

    are typical examples. Big data refers to massive amounts of very high dimensional and

    unstructured data which are continuously produced and stored at much lower cost than

    they used to be. High dimensionality combined with large sample size creates issues such

    as heavy computational cost and algorithmic instability. The massive samples in big data are

    typically aggregated from multiple sources at different time points using different

    technologies. This creates issues of heterogeneity, experimental variations, and statistical

    biases and requires us to develop more adaptive and robust procedures.

    I am glad that Mahalanobis Professor Pyne has taken the initiative to conduct short courses,

    discussion meetings and conferences to evolve suitable methodologies to extract information

    from large data. I wish the workshop a great success.

    C.R. Rao, Sc.D., F.R.S.

    Padma Vibhushan, India

    Research Professor, University at Buffalo,

    Williamsville, NY 14221, USA.

    15th August, 2014.

    BiDA2014 Workshop Schedule

    Day 1: Fri. 22nd August, 2014

    Venue: Auditorium, Ramanujan building, CR Rao Advanced Institute of Mathematics, Statistics and

    Computer Science (AIMSCS), University of Hyderabad Campus, Gachibowli, Hyderabad 500046.

    Registration starts at 8:30 AM in CR Rao AIMSCS.

    Session 1

    Chair: Dr. S Pyne

    09:30 AM Welcome address by Dr. Allam Apparao

    09:40 Inaugural address by Shri H R Mohan (Chief Guest)

    BIG DATA: Opportunities Ahead.

    10:10 Introductory Remarks by Dr. S Pyne

    10:15 Tea Break

    Session 2

    Chair: Dr. B L S Prakasa Rao

    10:45 Dr. V C V Rao: An Overview of Cloud and Distributed Computing -

    Programming Paradigms.

    11:30 Dr. Subrata Chattopadhyay: The Role and Challenges of

    e-Infrastructure for Supporting Big Science Discoveries.

    12:15 PM Dr. Chittaranjan Hota: Security and Privacy Concerns in Campus Wide

    Networks: Easing Out Using Big Data Analytics.

    01:00 Lunch

    Session 3

    Chair: Dr. S B Rao

    02:00 Dr. B L S Prakasa Rao: Big Data and High-Dimensional Data Analysis.

    02:45 Dr. Kamal Karlapalem: Towards Visualizing Clusters and Classes for Real

    Valued High Dimensional Data Sets.

    03:30 Dr. S K Udgata: Large Sensor Network and IoT Data Management with

    Big Data: The Research Challenges.

    04:15 Tea Break

    Session 4

    Chair: Dr. S K Udgata

    04:45 Dr. K S Rajan: Challenges in Managing Spatio-Temporal Big Data -

    Sharing Current Research Initiatives.

    05:30 Expert Panel Discussion (Big Data: New Challenges and Directions)

    Panelists: Mr. K Mohan Raidu (Chair), Mr. T. Krishna Kumar

    Day 1 ends at 06:30 PM.

    6:30 - 7:30 PM Exhibition and tour of Professor C.R. Rao Gallery (CR Rao AIMSCS First Floor).

    Day 2: Sat. 23rd August, 2014

    Venue: Morning session: Auditorium, Ramanujan Building, CR Rao AIMSCS.

    Afternoon session: Centre for Modelling Simulation and Design (CMSD), University of Hyderabad (UoH).

    Session 1

    Chair: Dr. S Pyne

    09:00 AM Dr. Yogesh Simmhan: Fast Data Analytics for the Internet of Things.

    09:45 Dr. A K Pujari: Data Mining Trends in the Big Data Era.

    10:30 Tea Break

    Session 2

    Chair: Dr. A K Pujari

    11:00 Dr. S Pyne: Multivariate Stream Data Analytics with Applications to

    Health Sciences and Technology.

    11:45 Dr. S B Rao: Privacy preservation in Graphs and Social Networks.

    12:30 PM Dr. P Manimaran: Graph Mining Applications to Social Network

    Analysis.

    01:15 Lunch

    Session 3 (CMSD: Please note change in venue)

    Chair: Dr. V C V Rao

    02:15 Lab Session 1: Dr. V C V Rao & Team

    04:00 Tea Break

    Session 4

    Chair: Dr. Yogesh Simmhan

    04:30 Lab Session 2: Dr. V C V Rao & Team

    Day 2 ends at 06:00 PM.

    Day 3: Sun. 24th August, 2014

    Venue: Centre for Modelling Simulation and Design (CMSD), University of Hyderabad (UoH).

    Session 1

    Chair: Dr. P Manimaran

    09:00 AM Lab Session 3: Dr. V C V Rao & Team

    10:30 Tea Break

    Session 2

    Chair: Dr. S Pyne

    11:00 Ms. Karuna Prasad: Processing Large Datasets Using Hadoop.

    01:00 PM Conclusion of Workshop

    01:15 Lunch

    Day 3 (and workshop) ends at 02:15 PM.

    Shri HR Mohan

    President, Computer Society of India.

    Chairman, IEEE Computer Society & IEEE Professional Communication Society

    Vice Chairman, IEEE Communications Society

    Interface to Technical Societies at ACM Chennai

    Former Associate Vice President (Systems), The Hindu, Chennai

    Title: BIG DATA: Opportunities Ahead.

    Speaker Biosketch: Mr. HR Mohan is a graduate in Engineering from IIT Madras. He is currently a consultant in

    the Information and Communications Technology area and ICT Education. He has rich experience in the publishing

    industry. Until recently, he served at India's national newspaper The Hindu as Associate Vice President (Systems).

    At The Hindu, Mr. Mohan looked after the Corporate MIS activities. He was instrumental in the Internet

    publishing of The Hindu, the first newspaper from India to go online. Subsequently, Business Line, Frontline,

    Sportstar & other group publications including all supplements were made online with his efforts. At The Hindu, his

    other initiatives included Library Automation, Indexing, Digital Archives of 135 years of The Hindu, Information

    Services, Content Syndication and compilation and publishing of special thematic publications.

    Mr. Mohan is associated with a number of professional bodies in the areas of Information & Communication Technology,

    Library, Information Sciences, Management, Technical Communication, Media and Industry bodies such as FICCI,

    Hindustan Chamber of Commerce, CII, AIMA & MMA. He has assisted in organizing over 750 technical meetings /

    seminars / workshops / conferences during the last 30 plus years of his association with the professional societies.

    Mr. Mohan, currently the President of the Computer Society of India, is a Fellow of CSI and has served in the

    executive council of CSI at Chennai chapter and at the national level for over two decades in various capacities.

    Further, he has served as the Chairman of Conferences Committee, Intersociety Relations Committee, Publications

    Committee and Committee on Special Interest Groups. He is the Convener of CSI Chennai CIO Forum, and a

    Member of CSI Digital Library Committee. He represents India in forums such as ICANN & SEARCC.

    Mr. Mohan is closely associated with the international associations such as IEEE & ACM as Senior Member. He

    serves as the Chairman of the IEEE Computer Society, Madras Chapter & IEEE Professional Communication

    Society, Madras Chapter. He serves in the executive committees of IEEE Communications Society (as Vice

    Chairman) and IEEE Technology Management Council Madras Chapter (as Treasurer) and ACM Madras Chapter as

    an interface to technical societies. He has served as the Vice Chairman of the IEEE Madras Section for the term

    2012-2013. He is a Trustee of Ranganathan Centre for Information Studies. He is a Director at Internet Society India

    Chennai Chapter. He is also associated with a number of educational institutions as a member in their academic and

    governing councils.

    Mr. Mohan has rich experience in editing, publishing and content management. He currently edits the monthly CSI

    eNewsletter, which reaches over 100,000 members of CSI and members of a few other ICT-related eGroups with

    a readership of over 20,000. He was the editor of the IEEE India Info, the newsletter of the IEEE India Council (for

    2013), which reaches about 55,000 members across the country. He also edited IEEE MAS LINK, the monthly

    eNewsletter of the IEEE Madras Section, which reaches about 12,500 members, for over seven years till Dec 2013,

    INFOLINE, the newsletter of CSI Madras Chapter for about 20 years and CSI Digest, a quarterly monograph of

    CSI for about three years.

    Mr. Mohan manages a number of eGroups relating to his professional activities and interest and assists non-profit

    organizations and institutions in managing their websites. He is a regular contributor to the ICT Happenings column

    in CSI Communications and ICT Quiz columns in the newsletters and conducts ICT Quiz programmes. Mr. Mohan

    delivers guest lectures and presentations at various institutions and forums in the areas of his interest such as

    Information and Communication Technology, Open Source Software, Software Engineering, Knowledge

    Management, Internet, Web & ePublishing, eLearning, Web Marketing & Electronic Commerce, Cloud Computing,

    Big Data & Analytics, eGovernance, Information & Cyber Security, Digital Libraries & Archives, Library &

    Information/Content Management and Services, Employability & Soft Skills, IT Education and related areas. Mr.

    Mohan believes in information sharing and makes himself available for such initiatives.

    Dr. Subrata Chattopadhyay

    Centre for Development of Advanced Computing (C-DAC), Bangalore.

    Title: The role and challenges of e-Infrastructure for supporting Big Science discoveries.

    Abstract: The basic steps of scientific discoveries are to conduct experiments which

    normally generate huge data that need to be checked, shared and analyzed or

    visualized. Finally these findings are published and announced to the communities.

    Some of the well-known discoveries in the field of high energy physics and life science

    will be presented to reflect on the basic steps and challenges of huge data handling and

    processing addressed for these discoveries.

    Data handling and analysis is a common challenge that is becoming more and more complex, as indicated

    by some of the use cases presented above and some of the collaborative global experiments already planned. In

    that context, understanding Big Data and innovating to address future challenges are becoming more

    critical. C-DAC has considerable experience on developing and managing e-Infrastructures that include High

    Performance Computing, Grid and Cloud technologies. The present status and future roadmap of these

    technologies being planned to address these challenges will be elaborated in this talk.

    Speaker Biosketch: Dr. Subrata Chattopadhyay is currently Associate Director at C-DAC, Bangalore. He is also the

    Chief Investigator of Garuda, the national grid computing initiative of India. Previously he was involved in setting up

    PARAM Padma, the first supercomputing facility listed from India. He was also involved in setting up the

    nationwide high-speed communication fabric of GARUDA and deploying grid middleware across various platforms of

    supercomputers. From C-DAC, he was the technical manager for the EUIndia Grid project that interconnects the Indian

    grid project Garuda with the European grid initiative EGI. He is also leading another international project funded

    by the European Commission entitled Co-ordination and Harmonization of Advanced E-Infrastructure - Research and

    Education Data Sharing (CHAIN-REDS).

    He received his Bachelor of Engineering (BE) degree from NIT, Durgapur, his Masters (M.Tech) from IIT, Kanpur and his

    Doctorate (Ph.D.) from the University of British Columbia, Vancouver, Canada. He brings more than 27 years of

    experience from both the IT industry and research organizations. His major areas of interest include high performance

    computing, grid/cloud computing and process modeling and simulations.

    Dr. Chittaranjan Hota

    Department of Computer Science, Birla Institute of Technology and Science

    (BITS) Pilani, Hyderabad Campus.

    Title: Security and Privacy Concerns in Campus Wide Networks: Easing Out Using Big

    Data Analytics.

    Abstract: With the proliferation of P2P systems, it is critical to consider the impact of

    these systems on the security of an Internet environment that is already struggling with

    several security issues. Many developing and developed countries have less stringent

    regulations on P2P application usage. Currently, P2P traffic control is achieved by

    either throttling the P2P bandwidth or allowing P2P traffic at certain times. Recent

    empirical studies indicate that P2P and Web traffic together dominate today's Internet

    traffic. Several open source and proprietary products detect and alert policy violations for usage of P2P applications

    using techniques like port-based analysis, and protocol analysis. In this talk, we will assess the impact of P2P traffic

    on perimeter security appliances, and develop intelligent approaches using machine learning techniques to counter

    their impact on campus wide networks. We will discuss the usage of Hadoop, Hive and Mahout to scale the data

    analytics framework that can capture gigabytes of network traffic and determine whether it contains

    anomalous P2P traffic, such as botnet or malware traffic, within the corporate network traffic.

    Speaker Biosketch: Dr. Chittaranjan Hota received his Ph.D. in Computer Science and Engineering from Birla Institute of

    Technology & Science, Pilani. He was the founding Head of the Computer Science Department at BITS Hyderabad,

    and currently he is Associate Dean, Admissions. He has been teaching and researching in the Computer Science and

    Engineering area at BITS-Pilani for the past 15 years, and overall for the past twenty-five years. He has been a visiting

    researcher and visiting professor at University of New South Wales, Sydney; University of Cagliari, Italy; Aalto

    University, Finland; City University, London. He has research funding from UGC, New Delhi; DIT, New Delhi; and

    TCS, India. He has guided three Ph.D. students and is currently guiding five Ph.D. students in the areas of P2P

    Overlays, Information Security, Wireless Networks, and Distributed Computing Systems. He is the recipient of an

    Australian Vice Chancellors' Committee award, an Erasmus Mundus fellowship from the European Commission,

    and a Certificate of Excellence for the Faculty Excellence Award from BITS Pilani. He has published

    extensively in peer-reviewed journals and conferences. He has also edited LNCS volumes. He is a member of IEEE,

    ACM, IE, and ISTE.

    Dr. Kamalakar Karlapalem

    Centre for Data Engineering, International Institute of Information Technology

    (IIIT), Hyderabad.

    Title: Towards Visualizing Clusters and Classes for Real Valued High Dimensional Data

    Sets.

    Abstract: High dimensional real valued data sets are the most difficult to process and mine. Such data sets are usually mined with techniques such as clustering and classification. A major challenge is to conceptualize the results of the clustering and classification algorithms. Data visualization helps to comprehend and gain deeper insight into the data and the data mining results. The problem is to come up with techniques and tools to visualize these results.

    We have built four tools to visualize and comprehend high dimensional real valued data. Heidi helps in visualizing the subspace overlap among the clusters. Beads helps in visualizing the spatial spread, size and shape of the clusters. PEARLS is a visual tool kit to simultaneously query and explore data sets using the concepts behind Beads. CROVHD helps to visualize the spread of data across dimensions, and to show closeness and separation of classes. In this talk, I shall present the background for data visualization, technical insights behind the above tools, and list some open problems.

    Speaker Biosketch: Dr. Kamalakar Karlapalem is a faculty member at the International Institute of Information Technology, Hyderabad, and heads the Centre for Data Engineering. His research spans the areas of database visualization, data analytics, workflow management systems, multi-agent systems, and data systems. He and his students have won awards in academic competitions such as RoboCup, VAST, and TAC. He has graduated eight Ph.D. and thirty-three Masters by Research students.

    He is an alumnus of the Indian Statistical Institute (M.Stat.), IIT, Kharagpur (M.Tech.) and Georgia Tech (Ph.D.), and was a faculty member in the computer science department at HKUST (1992-2000) before joining IIIT, Hyderabad.

    Dr. P Manimaran

    CR Rao Advanced Institute of Mathematics, Statistics and Computer Science

    (AIMSCS), Hyderabad.

    Title: Graph Mining Applications to Social Network Analysis.

    Abstract: Nowadays, with the growth of social media, any individual in this world can easily

    connect to another in cyberspace. With this information, a social network can be

    constructed considering the individuals as nodes and their interactions as edges between

    them. The most challenging task is to mine the patterns in such social networks. In this

    talk, we will discuss graph mining applications using centrality analysis and community

    detection to extract information from social networks.
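The centrality analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest measure, degree centrality (a node's number of neighbours divided by the maximum possible, n - 1), on an undirected network; the talk itself may use other measures. The function name `degree_centrality` and the toy edge list are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected graph given as edge pairs."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they say nothing about interactions
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical toy social network: individuals are nodes, interactions are edges.
EDGES = [("asha", "ravi"), ("asha", "meena"), ("asha", "john"), ("ravi", "meena")]

centrality = degree_centrality(EDGES)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```

In this toy network, "asha" interacts with every other individual and therefore has the highest degree centrality. Community detection, the other technique named in the abstract, is not covered by this sketch.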

    Speaker Biosketch: Dr. Manimaran is an Assistant Professor in CR Rao Advanced Institute of Mathematics,

    Statistics and Computer Science, Hyderabad. Previously, he did his post-doctoral research at the Centre for

    DNA Fingerprinting and Diagnostics, Hyderabad. Dr. Manimaran received his Ph.D. in Physics from University of

    Hyderabad. His main research interests are computational Physics, time series analysis, complex networks and

    wavelet transform. He has published research papers in peer-reviewed journals and conferences. He has guided

    ten M.Tech. students, and is guiding a Ph.D. student.

    Dr. B L S Prakasa Rao

    CR Rao Advanced Institute of Mathematics, Statistics and Computer Science

    (AIMSCS), Hyderabad.

    Title: Big Data and High Dimensional Data Analysis.

    Abstract: Over the last ten to fifteen years, more and more corporations have been adopting a

    data-driven approach to provide targeted services, reduce risks and improve performance.

    They are implementing specialized data analytic programs to collect, store, manage and

    analyze large data sets or what is now called BIG DATA. Such data sets are

    characterized by massive sample size and high-dimensionality. Traditional statistical

    methods are inappropriate to tackle such problems. There are many types of events

    where there are a potentially large number of parameters/covariates but relatively few

    instances of the event. This type of data is termed HIGH-DIMENSIONAL DATA. We will discuss some issues

    arising in analysis of BIG DATA and HIGH-DIMENSIONAL DATA.

    Speaker Biosketch: Dr. B.L.S. Prakasa Rao holds the prestigious Ramanujan Chair Professorship at C R Rao

    Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad. He was previously the Director of

    Indian Statistical Institute, Kolkata. He has held academic positions at Indian Statistical Institute, Kolkata and New

    Delhi, University of Iowa, University of Wisconsin, University of California, Davis, Purdue University, University of

    Illinois, University of California, Berkeley, Universite de Montreal, Indian Institute of Technology, Kanpur, University of

    Hyderabad and many others. He did his M.A. from Andhra University, M. Stat. at Indian Statistical Institute, Kolkata,

    and Ph.D. at Michigan State University. His research interests span Limit Theorems, Stochastic inequalities,

    Characterization of Distributions, Stochastic Processes, Inference for Stochastic Processes, Nonparametric

    Functional Estimation and Asymptotic Theory of Statistical Inference. He is the Editor-in-Chief of Sankhya Series A

    and Sankhya Series B, and member of Editorial boards of many journals of international repute. He has published

    over 220 research papers in journals of international repute. He also has written 13 books as well as many expository

    articles. He is a Fellow of all the Science Academies of the country, Fellow of Institute of Mathematical Statistics,

    USA and an elected member of the International Statistical Institute and many reputed professional societies. Dr.

    Prakasa Rao is the recipient of the prestigious S S Bhatnagar award and P V Sukhatme prize.

    Ms. Karuna Prasad

    Centre for Development of Advanced Computing (C-DAC), Bangalore.

    Title: Processing Large Datasets Using Hadoop.

    Abstract: The talk will give an overview of the Hadoop Distributed File System

    (HDFS), its architecture and features. We will discuss MapReduce, the framework

    for processing data, and the components of the Hadoop ecosystem. Further, the need for

    higher-level tools like Pig and Hive on a Hadoop cluster, the architecture of Pig

    and Hive, and how their data models differ will be discussed.
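The MapReduce model named in the abstract can be sketched with the classic word-count example. This is a single-process, in-memory simulation of the map, shuffle and reduce phases, written only to show the data flow; it does not use the Hadoop API, and the function names and sample lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input; on a real cluster each line would come from a file split in HDFS.
LINES = ["big data needs big tools", "Hadoop stores big data"]
word_counts = word_count(LINES)
print(word_counts)
```

On a real cluster the map and reduce functions run in parallel on different nodes and the shuffle moves data over the network; higher-level tools such as Pig and Hive generate jobs of this shape from scripts or queries.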

    Speaker Biosketch: Ms. Karuna Prasad is working as Senior Technical Officer in C-DAC, Bangalore. She

    received an MCA degree from Nagpur University and an MS in Software Systems from BITS, Pilani. Her

    experience in Distributed Computing includes grid and cloud computing. She has publications in

    national and international conferences.

    Dr. Arun K Pujari

    School of Computer and Information Sciences, University of Hyderabad,

    Hyderabad.

    Title: Data Mining Trends in the Big Data Era.

    Speaker Biosketch: Dr. A K Pujari is Professor of Computer Science at the

    University of Hyderabad, Hyderabad. He is currently Dean, School of Computer &

    Information Sciences. Prior to joining UoH, he served at Automated Cartography

    Cell, Survey of India, and Jawaharlal Nehru University, New Delhi. He received

    Ph.D. from the Indian Institute of Technology, Kanpur and M.Sc. from Sambalpur

    University, Sambalpur. In 2008-2011, he was the Vice Chancellor, Sambalpur

    University, Sambalpur, Orissa.

    He has 32 years of experience teaching post-graduate classes in several national institutions. During this

    period he has taught, designed and developed several new courses. The important ones are Data Bases,

    Theory of Computation, Computer Vision, Computational Geometry, Algorithms, Artificial Intelligence,

    Computer Based Optimization Techniques, Knowledge Representation & Reasoning, Machine Learning,

    Data Mining & Data Warehousing, Neural Networks, GIS and Bioinformatics.

    His research interests include AI, GIS, Combinatorial Algorithms and Data Mining.

    Dr. Saumyadipta Pyne

    CR Rao Advanced Institute of Mathematics, Statistics and Computer Science,

    Hyderabad.

    Title: Multivariate Stream Data Analytics with Applications to Health Sciences and

    Technology.

    Abstract: Analytics for large volume, high velocity data streams presents serious

    research challenges. In recent years, many methodological developments have

    been made to address a variety of problems in stream data analytics. In this talk,

    we'll briefly review the field, and then look at different methods and algorithms for

    tackling the key problem of anomaly detection in multivariate stream data as applied

    to, in particular, disease outbreak analysis.
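As a rough illustration of anomaly detection on a data stream, the sketch below flags a value whose z-score, computed against the running mean and standard deviation of the points seen so far, exceeds a threshold. It is a deliberately simplified, univariate stand-in and not the multivariate methods the talk covers. The class name `RunningZScore`, the threshold, the warm-up length and the case counts are all hypothetical. The running statistics are updated with Welford's online algorithm, so the stream is processed in one pass with constant memory.

```python
import math


class RunningZScore:
    """Flag stream values that lie far from the running mean of earlier values."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # points to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against past points, then fold it into the running statistics."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                anomalous = True
        # Welford's online update of the mean and squared deviations.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return anomalous


# Hypothetical daily case counts with one sudden spike.
daily_cases = [10, 12, 11, 9, 10, 11, 10, 48, 11, 10]
detector = RunningZScore()
flags = [detector.update(c) for c in daily_cases]
print([day for day, flagged in enumerate(flags) if flagged])
```

Note that the spike is itself folded into the running statistics after being flagged, which inflates the variance for later points; a practical detector would likely use a sliding window or a robust estimator instead.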

    Speaker Biosketch: Dr. Saumyadipta Pyne holds the prestigious PC Mahalanobis

    Chair and is Professor and Head of Bioinformatics in CR Rao Advanced Institute of

    Mathematics, Statistics and Computer Science in Hyderabad. He is also Adjunct Professor in Public Health

    Foundation of India, and Remote Research Associate of Broad Institute of MIT and Harvard University. Dr. Pyne is a

    Ramalingaswami Fellow of Department of Biotechnology, Government of India, and a former research fellow of the

    Indian Statistical Institute. Formerly he worked in Dana-Farber Cancer Institute of Harvard Medical School in Boston.

    He received his doctorate from the State University of New York at Stony Brook in USA working simultaneously in the

    Departments of Computer Science, Molecular Genetics and Microbiology. He conducted his postdoctoral research in

    Broad Institute of MIT and Harvard University in Cambridge. Dr. Pyne conducted pioneering research in the field of

    computational modeling of single cell level high-resolution, high-dimensional data. Dr. Pyne's research interests

    include Big Data in Life Sciences and Health Informatics, Computational Statistics and High-dimensional Data

    Modeling. He has published extensively in top international journals.

    Dr. Pyne is actively engaged in promoting Big Data research and training activities in both India and abroad. He is

    the Workshop Co-Chair of IEEE International Conference on Big Data 2014, to be held in Washington DC in October

    2014. He is currently the only Member of the Program Committee from India, as he was for the First IEEE International

    Conference on Big Data, held in Santa Clara, Silicon Valley, in 2013, and as he is for the ACM International

    Workshop on Big Data in Life Sciences, to be held in September 2014. Dr. Pyne currently teaches a course on Big

    Data and High-Dimensional Data Analysis to the final year Integrated Masters students in the School of Mathematics

    and Statistics of the University of Hyderabad. He is also co-editing a book on the subject.

    Dr. K S Rajan

    Lab for Spatial Informatics, International Institute of Information Technology (IIIT),

    Hyderabad.

    Title: Challenges in Managing Spatio-Temporal Big Data: Sharing Current Research

    Initiatives.

    Abstract: With the growing range of location-based data collection models and

    devices, and the increasing frequency of such data collection, current computer systems are

    challenged not only to store and manage these data but also to adapt and

    develop algorithms that can handle them. Examples

    include vehicle tracking systems for traffic flow analysis, mobile location-based

    service requests and publicly contributed data, and mobile phone, GPS and other sensor

    trajectories. In many of these cases, answering scientific questions from the data still takes considerable effort.

    Spatio-temporal big data has been looked upon to provide clues to some of these complex questions, with efforts ranging

    from spatio-temporal hypothesis generation to attempts at answering the questions themselves.

    In this talk, we will cover some of the ongoing efforts in our research group, with a focus on our current research work

    in using spatio-temporal data for traffic pattern flow understanding in a city-wide road network and in knowledge

    discovery in the field of epidemiology. The latter has led to the development of MiSTIC, a spatio-temporal data mining

    algorithm with applications in crime and climatic studies too.

    Speaker Biosketch: Dr. KS Rajan is an Associate Professor at IIIT, Hyderabad, one of India's top ranked research

    institutes, and leads the institute's Lab for Spatial Informatics (LSI). Recently, IIIT Hyderabad was ranked 3rd among

    India's Top Technical Institutions in Dataquest-IDC's Survey 2012.

    Dr. Rajan is a multi-disciplinarian, with major interests in Geo-Spatial Technologies - GIS and Remote Sensing; Land

    use modelling and Environmental Policy. He has taken a key interest in the gap areas between computer science

    and geospatial technologies, and his research has helped bridge this gap, be it in developing

    spatio-temporal data mining algorithms, Web-based geospatial technologies, or new algorithms to help convert

    satellite imagery to useful satellite-based thematic products. He is also an active proponent of Open Source in India,

    and for geospatial technologies in particular. His Lab has recently released two Open Source tools, LSIViewer and

    VRGeo.in. In environmental modelling, his work includes the integrated analysis of multi-disciplinary

    fields of science and engineering, and agent-based modeling of Human-Land-Water-Energy linkages and their interactions, with

    ecosystem-wide understanding, for impact studies and national and global level policy

    initiatives.

    Dr. Rajan has handled more than 20 projects (from both Government and Industry), has over 100 publications in

    books, journals and conferences, and has given more than 60 invited talks in a range of domains. He has been active in

    curricular, academic and research matters of other universities and institutes, and in international and national programs.

    Recently, Dr. Rajan was awarded the Indian National Geospatial Award 2013 of the Indian Society of Remote

    Sensing.

    Dr. S B Rao

    CR Rao Advanced Institute of Mathematics, Statistics and Computer Science

    (AIMSCS), Hyderabad.

    Title: Privacy Preservation in Graphs and Social Networks.

    Abstract: The recent proliferation of graph and network data in various application

    domains has raised privacy-preservation concerns for the individuals/organizations

    involved. Recent studies show that simply removing the identity of the persons/nodes

    before publishing the graph/social network data does not guarantee privacy of the

    individuals. The structure of the graph/network and its basic parameters, such as the

    degrees of the nodes, their immediate neighbors, similarities and centrality measures of

    nodes like eccentricity and betweenness centrality, can reveal the identity of the

    individuals. To address these issues, we survey specific graph anonymization problems based on degree,

    neighborhood and automorphisms, and some of the known solutions. We also discuss recent work in this regard on

    other parameters and centrality measures of nodes, such as eccentricity, betweenness centrality,

    centroid, stress and PageRank, and on combinations of these.

    Let P be a parameter set of a node that can be calculated easily (in polynomial time) for each node of a graph,

    such as the degree, the neighbors, the eccentricity, the betweenness centrality or other centrality measures. We call a

    graph Pk-anonymous if, for every node v, there are at least k-1 other nodes in the graph with the same values of the

    parameter set P. This definition of anonymity prevents adversaries from identifying an individual or organization

    with probability greater than 1/k, based on prior knowledge of the parameter set P for the corresponding node v.

    We formally define the graph Pk-anonymity problem as follows: given a graph G and a parameter

    set P, find a Pk-anonymous graph obtained from G with the minimum number of the specified graph modifications

    (addition/deletion of edges etc.). The algorithms for finding a Pk-anonymous graph for a given graph G

    and parameter set P will be based on the principle of realizability of the sequences of these parameter values of

    nodes of a new graph, which is Pk-anonymous.
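
    The definition above can be checked directly on small graphs. The Python sketch below is only an illustration, not the speaker's algorithm: it tests whether a graph is Pk-anonymous, but does not solve the minimum-modification problem. The graph is assumed to be an undirected adjacency dictionary, and the function names and the two example parameters (degree, and the sorted degrees of the neighbors) are hypothetical choices.

```python
from collections import Counter


def degree(graph, v):
    """Parameter: number of neighbors of v."""
    return len(graph[v])


def neighbor_degrees(graph, v):
    """Parameter: sorted degrees of the immediate neighbors of v."""
    return tuple(sorted(len(graph[u]) for u in graph[v]))


def signature_counts(graph, params):
    """Count how many nodes share each combination of parameter values."""
    return Counter(tuple(p(graph, v) for p in params) for v in graph)


def anonymity_level(graph, params=(degree,)):
    """Largest k for which the graph is Pk-anonymous (0 for an empty graph)."""
    counts = signature_counts(graph, params)
    return min(counts.values()) if counts else 0


def is_pk_anonymous(graph, k, params=(degree,)):
    """True if every node shares its parameter values with at least k-1 others."""
    return anonymity_level(graph, params) >= k


# A path a-b-c-d: degrees are 1, 2, 2, 1, so every degree value occurs twice.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
# A star: the center has a unique degree, so it is identifiable.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
```

    On these examples the path is 2-anonymous with respect to degree, while the star is only 1-anonymous, because an adversary who knows that the target has three contacts can single out the center.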

    Speaker Biosketch: Dr. S B Rao is an Emeritus Professor at CR Rao Advanced Institute of Mathematics, Statistics

    and Computer Science (AIMSCS), Hyderabad. He was the first Director of CRRao AIMSCS. He was previously the

    Director of Indian Statistical Institute. He has held academic positions at Indian Statistical Institute Kolkata, Ohio

    State University, The Hungarian Academy of Sciences, The National University of Singapore, University of Western

    Australia and many more. His research interests encompass Graph Theory and Applications, Social, Biological and

    Economic Networks, Ramsey Theory, and other areas of Discrete Mathematics as well as Design of Experiments. He

    has published more than 70 research papers in journals of national and international repute, including the Journal of

    Combinatorial Theory, Discrete Mathematics and Sankhya. Dr. S B Rao received his Ph.D. from the Indian Statistical Institute

    Kolkata under the supervision of the legendary Dr. C R Rao. He has held various administrative positions and has

    served on various Government bodies.

    Dr. V C V Rao

    Centre for Development of Advanced Computing (C-DAC), Pune.

    Title: An Overview of Cloud & Distributed Computing - Programming Paradigms.

    Abstract: The author gives an overview of current trends in High Performance Computing and High Throughput Computing from a distributed computing perspective. Distributed computing technologies covering a wide spectrum of Message Passing Clusters, Massively Parallel Processors and Grid Computing are explained from a programming perspective. Most importantly, platform characteristics based on shared address space and message passing are summarized, with emphasis on performance, scalability, energy efficiency and virtualization. An overview is given of Cloud Computing, which evolved from cluster, grid and utility computing. The author then surveys parallel and distributed computing paradigms such as Message Passing Interface (MPI), Hadoop MapReduce and the Hadoop library from Apache, along with their performance issues. Computational results are presented for application kernels such as image processing, large scale matrix computations and graph analytics based on graph partitioning algorithms, all on the Big Data Hadoop MapReduce framework.
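
    To make the MapReduce paradigm concrete, the following Python sketch simulates its three phases (map, shuffle, reduce) in memory for a sparse matrix-vector product, one of the simplest large scale matrix computations. It is an assumption-laden illustration: it does not use the Hadoop API, all function names are hypothetical, and the vector is assumed small enough to be available to every mapper.

```python
from collections import defaultdict


def map_phase(entries, vector):
    """Map: each nonzero entry (i, j, a_ij) emits the pair (i, a_ij * x_j)."""
    for i, j, a_ij in entries:
        yield i, a_ij * vector[j]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the row index)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the partial products of each row."""
    return {key: sum(values) for key, values in groups.items()}


def matvec(entries, vector):
    """Compute y = A x, with A given as a list of (row, column, value) triples."""
    return reduce_phase(shuffle(map_phase(entries, vector)))
```

    In a real cluster the map calls would run on the nodes holding each block of matrix entries and the shuffle would move data across the network, which is where most of the performance issues mentioned in the abstract tend to arise.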

    Speaker Biosketch: Dr. VCV Rao is an Associate Director and Head of the High-Performance Computing - Frontier Technologies & Exploration (HPC-FTE) Group at C-DAC, Pune, India. He specializes in implementation of parallel algorithms on emerging parallel processing platforms (clusters of multi-core processors with accelerator devices: GPUs & CPUs). His group works on performance of application and system benchmarks and implementation of distributed computing algorithms focusing on heterogeneous computing environments such as distributed computing systems with coprocessors, accelerators, power aware computing, and out-of-core algorithms for large data processing. Dr. VCV Rao has contributed to the design, development and deployment of C-DAC's PARAM Series of Supercomputers from the year 1994 onwards. He also plays an active role in proliferating parallel processing technology through workshops in India and has contributed to the PARAM series at premier institutes in India. Dr. VCV Rao has been associated with C-DAC since 1993. He received his Ph.D. degree in Mathematics in 1993 from IIT-Kanpur. He was a visiting faculty member at the Dept. of Computer Science, University of Minnesota (UoM), Minneapolis, and a Post-Doctoral fellow at the Army High Performance Computing Research Center (AHPCRC), UoM, during the year 1997-98.

    Dr. Yogesh Simmhan

    Supercomputer Education and Research Centre, Indian Institute of Science

    (IISc), Bangalore.

    Title: Fast Data Analytics for the Internet of Things.

    Abstract: The pervasive spread of sensing, actuation and communication

    technology is helping realize the hardware aspects of the Internet of Things (IoT).

    However, capitalizing on the true potential of IoT for achieving societal impact in a

    developing country like India requires both affordable IoT solutions as well as

    meaningful analytics on top of the data collected. Data from IoT is often streaming in

    nature, distributed in its sources, and intermittent. Real-time data aggregation and

    analytics require research into stream processing systems that adapt to system

    behavior, and complex event processing across edge devices and the Cloud. This

    talk will discuss an IoT Architecture for India based on ongoing projects at IISc, and lay emphasis on managing the

    velocity dimension of Big Data for intelligent actions.

    Speaker Biosketch: Dr. Yogesh Simmhan is an Assistant Professor at the SERC Department at IISc. Previously, he

    was a Research Assistant Professor in the Electrical Engineering Department (Computer Engineering) at the

    University of Southern California, Los Angeles and Associate Director of the USC Centre for Energy Informatics. His

    research explores abstractions, algorithms and applications on distributed data and computing systems. These span

    Cloud Computing, Scalable and Distributed Computing, distributed data and metadata management, and software

    architectures for large scale applications in eScience and eEngineering. His research advances fundamental

    knowledge, and offers a practitioner's insight, on building scalable, resilient systems to empower dynamic, distributed

    and Big Data applications. Yogesh has a Ph.D. in Computer Science from Indiana University and was earlier a

    Postdoc at Microsoft Research, San Francisco. He is a Senior Member of IEEE and ACM.

    Dr. Siba K Udgata

    Centre for Modelling, Simulation and Design (CMSD), University of Hyderabad,

    Hyderabad.

    Title: Large Sensor Network and IoT Data Management with Big Data: The Research

    Challenges.

    Abstract: The ongoing convergence of evolution of devices (Internet of Things),

    accumulation of Big Data (large scale sensors) and deployment of large shared

    infrastructures (computing clouds) has created exciting new research challenges. The

    Internet of Things (IoT) has generated a large amount of research interest across a wide

    variety of technical areas. These include the physical devices themselves,

    communications among them, and relationships between them. One of the effects of

    ubiquitous sensors networked together into large ecosystems has been an enormous flow of data supporting a wide

    variety of applications. Technical and management challenges abound in this area, including sensor network

    management and data management, analysis, and visualization. New research, tools, and applications in the field of

    big data have been exploding as researchers find new ways of addressing the challenges posed by volume,

    velocity and variety of data. The convergence of IoT and big data creates new opportunities for interesting and high

    impact research. Many sensor network data flows exhibit high velocity, distributed streams of heterogeneous data,

    often from mobile sources, and varying quality. We will discuss some applications and challenges of

    large scale sensor networks with IoT, using Big Data and data analytics.

    Speaker Biosketch: Dr. Siba K Udgata is a Professor in the School of Computer and Information Sciences,

    University of Hyderabad, India. He is also presently working at Centre for Modelling, Simulation and Design (CMSD)

    as Director. He has a Ph.D. in Computer Science in the area of mobile computing and wireless communication. His

    main research interests are Wireless Communication, Mobile Computing and Wireless Sensor Networks. He has

    twenty years of experience teaching Master's students of Computer Science and guiding research

    students. So far he has supervised more than 50 Master's theses and four Ph.D. theses.

    He was a United Nations Fellow and worked in the United Nations University/ International Institute for Software

    Technology (UNU/IIST), Macau as research fellow in the year 2001. He was a visiting fellow at Ball State University,

    USA. He was also a visiting Professor at the Asian Institute of Technology, Bangkok; Mahasarakham University, Thailand;

    and Tribhuvan University, Kathmandu, Nepal.

    His research focus is on intelligent algorithms for wireless communication and related domains, mobile computing,

    sensor network algorithms and applications. He has worked as principal investigator in many Government of India

    funded research projects mainly for development of wireless sensor network applications and application of swarm

    intelligence techniques in cognitive radio network domain. Presently, he is leading a multi-institutional project funded

    by Information Technology Research Academy (ITRA), Department of Electronics and Information Technology

    (DeITy), Govt. of India.

    Expert Panel Discussion Theme: Big Data: New Challenges and Directions.

    Chair and Panelist: Mr. K Mohan Raidu, Director, Informatics India

    Mr. Raidu founded Informatics India and presently heads it as its Director. He is an Enterprise Solutions expert in the verticals of Sugar, Cement and Banking Industries. He has 35 years of experience in Software Development, starting from 1979. As no formal education was available in many Computer subjects in those days, he acquired his skills on the job. While in service, he took short term courses at RR Labs-Hyderabad, CMC Hyderabad, IBM Bangalore, and the University of Genova, Italy. Mr. Raidu holds an MBA (Osmania University), an MA (Philosophy, OU), and a BSc (OU).

    Presently, Mr. Raidu's company works through two divisions, viz., Turn-Key Projects and Software Products. The Projects Division provides solutions for units of the DAE (Department of Atomic Energy). The Products Division provides ECS Methods (Electronic Clearing Services) to branches of SBI. Other products are Infrasoft (Infrastructure Asset Management Solution), LawOffice (Law Office Management Solution) and FileManager, a workflow management software for offices.

    Currently, Mr. Raidu is the Convener of CSI Golden Jubilee Event 2014. He is a Member of the CSI-SIG on e-Governance. He is also President of the TITA Corporate Wing (Telangana Information Technology Association).

    Co-Panelist: Mr. T Krishna Kumar, Vice President, Tech Mahindra

    Mr. T Krishna Kumar is currently the Vice President at Tech Mahindra. He has about 18 years of experience in Management Consulting, IT and the manufacturing sector. Before Tech Mahindra he worked in organizations like Avalon Consulting, KPMG, Satyam Computer Services and Batliboi. He has consulted for several Fortune 500 and international companies in the area of BPR and IT, in industries such as Manufacturing, Retail and Telecom. He has successfully driven IT strategies in the areas of Manufacturing, CRM, Retail, SAP and Data warehousing solutions. Mr. Krishna Kumar is a Six Sigma Black Belt. He holds an MBA from SP Jain, Mumbai, a BE from REC-Surat, a CFA from ICFAI and a Business Management qualification from Harvard Business School. He has represented the Indian IT Industry at the G-15 summit in Malaysia.

    His areas of passion are Business Performance Management and how IT can deliver significant value to various human endeavors, business or otherwise.

    Dr. V C V Rao, Mr. Rahul Ravidas Naik, Mr. Swapnajit Rout, Mr. Vikalp Handa

    High-Performance Computing - Frontier Technologies & Exploration Group, Centre for Development of

    Advanced Computing (C-DAC), Pune.

    An Overview of the Laboratory Sessions: The laboratory sessions focus on understanding the Hadoop

    MapReduce framework, and on writing and executing codes for numerical and non-numerical computations on the

    IBM Multi-Core Processor system (IBM P755, 32 CPUs, 124GB RAM, AIX operating system) of the Message Passing

    Cluster of CMSD, UoH. Basic codes (accessible from the C-DAC website) are provided for the following:

    (i) Hadoop MapReduce framework,

    (ii) Large scale matrix computations,

    (iii) Graph analytics based on the Open Source distributed graph computing software GIRAPH and the Bulk

    Synchronous Parallel (BSP) model, and

    (iv) Demonstration of Image Processing Kernels on the Big Data Framework.
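
    As a hedged illustration of the Bulk Synchronous Parallel model named in item (iii), the Python sketch below labels the connected components of an undirected graph in vertex-centric supersteps: in each superstep a vertex reads its incoming messages, adopts the smallest label it has seen, and notifies its neighbors only if its label changed. This is a single-process simulation with hypothetical names. It does not use the Giraph API and is not the laboratory code.

```python
def bsp_connected_components(graph):
    """Label each vertex with the smallest vertex id in its component.

    `graph` is assumed to be an undirected adjacency dict with comparable ids.
    """
    labels = {v: v for v in graph}

    # Superstep 0: every vertex sends its own label to all of its neighbors.
    inbox = {v: [] for v in graph}
    for v in graph:
        for u in graph[v]:
            inbox[u].append(labels[v])

    # Later supersteps run until no messages are in flight.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, messages in inbox.items():
            if not messages:
                continue                      # vertex stays halted this superstep
            smallest = min(messages)
            if smallest < labels[v]:
                labels[v] = smallest          # adopt the better label
                for u in graph[v]:
                    outbox[u].append(smallest)
        inbox = outbox                        # barrier: messages arrive next superstep
    return labels
```

    The barrier between supersteps is what distinguishes BSP from asynchronous message passing: no vertex sees a message until every vertex has finished the current superstep.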

    Mr. Rahul Ravidas Naik is a project engineer in High-Performance Computing - Frontier

    Technologies & Exploration (HPC-FTE) Group, at C-DAC, Pune, India. Mr. Rahul specializes

    in implementation of parallel algorithms on large scale distributed computing systems based on

    Hadoop MapReduce with GPU accelerators, and in tuning the performance of application and

    system benchmarks. He did his post graduate course at C-DAC, Mumbai in the year 2012-13.

    Prior to this, Mr. Rahul did his Masters in Computer Science at the University of Southern

    California in the years 2006-08 and his Bachelor's in Computer Engineering at the American University of

    Sharjah in the year 2006.

    Mr. Swapnajit Rout is a project engineer in High-Performance Computing - Frontier

    Technologies & Exploration (HPC-FTE) Group, at C-DAC, Pune, India. Mr. Swapnajit

    specializes in performance of application and system benchmarks on distributed computing

    systems with Hadoop MapReduce and GPUs. He did his post graduate diploma in system

    software at C-DAC's ACTS in Pune in the year 2013. He did his Bachelors in Information

    Technology at Biju Patnaik University of Technology, Odisha in the year 2011.

    Mr. Vikalp Handa is a project engineer in High-Performance Computing - Frontier

    Technologies & Exploration (HPC-FTE) Group, at C-DAC, Pune, India. Mr. Vikalp

    specializes in design and implementation of distributed parallel algorithms for information

    science and scientific application kernels on distributed computing systems with Hadoop

    MapReduce and GPUs. Prior to this, he worked on implementation of computational

    finance applications and machine learning algorithms. He did his Bachelors in Computer

    Science Engineering at UIET, Panjab University, Chandigarh in the year 2013.

