BPPIMT(VIPRoad)-Big Data Analytics with R-Gr12

Date post: 06-Jan-2017
Upload: swami-nath-satpal

ANALYSING SURVEY DATA RELATED TO ATTITUDE TOWARDS INFLATION

GROUP MEMBERS:

1. SWAMI NATH SATPAL, NETAJI SUBHASH ENGINEERING COLLEGE
2. SATYAM KUMAR, NETAJI SUBHASH ENGINEERING COLLEGE
3. GAURAV KUMAR, NETAJI SUBHASH ENGINEERING COLLEGE
4. SHIVAM KUMAR, NETAJI SUBHASH ENGINEERING COLLEGE
5. WASHID SAYEED, NETAJI SUBHASH ENGINEERING COLLEGE
6. SUMAN KUNDU, RCC INSTITUTE OF TECHNOLOGY

Acknowledgement

I take this opportunity to express my profound gratitude and deep regards to my faculty, Sanjoy Chowdhury, for his exemplary guidance, monitoring and constant encouragement throughout the course of this project.

The blessing, help and guidance given by him from time to time shall carry me a long way in the journey of life on which I am about to embark.

I am obliged to my project team members for the valuable information they provided in their respective fields. I am grateful for their cooperation during the period of my assignment.

Swami Nath Satpal

CONTENTS

ACKNOWLEDGEMENT
PROJECT OBJECTIVE
PROJECT SCOPE
REQUIREMENT SPECIFICATION
DATA VISUALISATION
-E R DIAGRAM
-LOOK UP TABLES
FUTURE SCOPE OF IMPROVEMENTS
SCREENSHOTS
CODES
REPORTS
CERTIFICATE

Project Objective

The objective is to analyse survey data on the attitude of common people towards inflation in their day-to-day life.

The project also sets out some guidelines on the fundamental ideas of analysing data from surveys.

Project Scope

The purpose of analyzing this data is to generate the following reports:

1. The percentage of male and female respondents who said "Gone down" in each quarter when asked which of the options best describes how prices have changed over the past 12 months.

2. The percentage of respondents (in each income category) who said "is too high" in each quarter from 2003 to 2010 when asked for their thoughts on the government setting an inflation target of 2.0%.

3. The percentage of respondents (at each education level) who said "Stayed about the same" in each quarter from 2004 to 2012 when asked how interest rates on things such as mortgages, bank loans and savings have changed over the past 12 months.

4. The percentage of respondents (in each tenure category) who said "Rise a lot" in each quarter from 2003 to 2015 when asked how they expected interest rates to change over the next 12 months.
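Each of these reports reduces to the same calculation: within every quarter and respondent group, divide the number of respondents who gave the target answer by the number who answered the question at all. The following is a minimal Python sketch of that calculation, not the project's own code. The field names `sex` and the sample rows are made up for illustration, and the answer code `"1"` is only assumed to stand for the target answer.

```python
from collections import defaultdict


def percent_by_quarter(rows, group_field, question_field, target_code):
    """Return {(quarter, group): percentage of rows whose answer equals target_code}."""
    totals = defaultdict(int)  # respondents per (quarter, group)
    hits = defaultdict(int)    # respondents per (quarter, group) giving the target answer
    for row in rows:
        key = (row["yyyyqq"], row[group_field])
        totals[key] += 1
        if row[question_field] == target_code:
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}


# Made-up sample rows; the codes are illustrative only.
sample = [
    {"yyyyqq": "200301", "sex": "M", "q1": "1"},
    {"yyyyqq": "200301", "sex": "M", "q1": "2"},
    {"yyyyqq": "200301", "sex": "F", "q1": "1"},
    {"yyyyqq": "200301", "sex": "F", "q1": "3"},
    {"yyyyqq": "200301", "sex": "F", "q1": "3"},
    {"yyyyqq": "200301", "sex": "F", "q1": "2"},
]

result = percent_by_quarter(sample, "sex", "q1", "1")
for key in sorted(result):
    print(key, result[key])
```

The same function covers the income, education and tenure reports by changing `group_field` and `question_field`; a year range can be applied by filtering `rows` before the call.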

Requirement Specifications

Problem Definition:

This project analyses data from a survey conducted by a bank in England to understand the general population's attitude to inflation. The data covers 2003 to 2015 and was captured for every quarter within this period. The reports are based on the following survey questions:

1. Which of the options best describes how prices have changed over the last 12 months? We calculated the percentage of male and female respondents who said "Gone down" in each quarter.

2. The Government has set an inflation target of 2.0%. What does the public think of this target? We calculated the percentage of respondents in each income category who said it "is too high" in each quarter from 2003 to 2010.

3. How would you say interest rates on things such as mortgages, bank loans and savings have changed over the last 12 months? We calculated the percentage of respondents at each education level who said "Stayed about the same" in each quarter from 2004 to 2012.

4. How would you expect interest rates to change over the next 12 months? We calculated the percentage of respondents in each tenure category who said "Rise a lot" in each quarter from 2003 to 2015.

Software Requirements

PIG

HIVE

R

FILEZILLA

VMWARE

DATA VISUALISATION

E-R DIAGRAM

LOOK-UP TABLE

Future Scope of Improvements

In future, users can run further queries on this data to analyse and inspect people's attitudes towards inflation, and can generate additional reports from these surveys as required.

Screenshots of the percentage of male & female respondents who said "Gone down" over each quarter when asked which of the options best describes how prices have changed over the past 12 months.

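CODES

-- Male respondents (table q1) who chose option '1', taken here as "Gone down", per quarter
INSERT OVERWRITE TABLE male_1
SELECT yyyyqq, count(q1) FROM q1 WHERE q1 = '1' GROUP BY yyyyqq;

-- All male respondents who answered the question, per quarter
INSERT OVERWRITE TABLE male_q1
SELECT yyyyqq, count(q1) FROM q1 GROUP BY yyyyqq;

-- Female respondents (table fq1) who chose option '1', per quarter
INSERT OVERWRITE TABLE female_1
SELECT yyyyqq, count(q1) FROM fq1 WHERE q1 = '1' GROUP BY yyyyqq;

-- All female respondents who answered the question, per quarter
INSERT OVERWRITE TABLE female_q1
SELECT yyyyqq, count(q1) FROM fq1 GROUP BY yyyyqq;

-- Report table: one column per value selected by the final INSERT below
CREATE TABLE final(yyyyqq string, male_res1 string, male_q1 string, male_per string, female_res1 string, female_q1 string, female_per string)
COMMENT 'Male and female respondents to the question asked, per quarter'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- final2 and final3 are assumed to hold the joined male and female counts (their creation is not shown).
-- The two ratios are fractions; multiply by 100 to read them as percentages.
INSERT OVERWRITE TABLE final
SELECT f2.yyyyqq, f2.male_res1, f2.male_q1, f2.male_res1/f2.male_q1,
f3.female_res1, f3.female_q1, f3.female_res1/f3.female_q1
FROM final2 f2 JOIN final3 f3 ON f2.yyyyqq = f3.yyyyqq;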
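One property of building the report from two separate count tables is worth illustrating: a quarter in which nobody chose option '1' has no row in the "option 1" table, so an inner join on `yyyyqq` drops that quarter instead of reporting 0%. The sketch below reproduces the pattern with SQLite from Python as a stand-in for Hive. The sample rows and the column names `res1` and `total` are made up for illustration, and whether the real survey data contains such a quarter is not known from this report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical male responses; nobody chose '1' in quarter 200302.
conn.execute("CREATE TABLE q1 (yyyyqq TEXT, q1 TEXT)")
conn.executemany(
    "INSERT INTO q1 VALUES (?, ?)",
    [
        ("200301", "1"), ("200301", "2"), ("200301", "2"), ("200301", "1"),
        ("200302", "2"), ("200302", "3"),
    ],
)

# Same two aggregates as in the listing above (column names are assumptions).
conn.execute(
    "CREATE TABLE male_1 AS "
    "SELECT yyyyqq, COUNT(q1) AS res1 FROM q1 WHERE q1 = '1' GROUP BY yyyyqq"
)
conn.execute(
    "CREATE TABLE male_q1 AS "
    "SELECT yyyyqq, COUNT(q1) AS total FROM q1 GROUP BY yyyyqq"
)

# Inner join: quarters without any '1' answers disappear from the report.
inner = conn.execute(
    "SELECT t.yyyyqq, 100.0 * h.res1 / t.total "
    "FROM male_q1 t JOIN male_1 h ON t.yyyyqq = h.yyyyqq "
    "ORDER BY t.yyyyqq"
).fetchall()

# Left join from the totals table: missing counts are treated as zero.
left = conn.execute(
    "SELECT t.yyyyqq, 100.0 * COALESCE(h.res1, 0) / t.total "
    "FROM male_q1 t LEFT JOIN male_1 h ON t.yyyyqq = h.yyyyqq "
    "ORDER BY t.yyyyqq"
).fetchall()

print(inner)
print(left)
```

Hive also supports `LEFT OUTER JOIN` and `COALESCE`, so the same idea should carry over, but the sketch has only been reasoned through against SQLite.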

Screenshots of the percentage of respondents (in each income category) who said "is too high" over each quarter from 2003 to 2010 when asked what are their thoughts on the government setting an inflation target of 2.0%.

CODES

-- Respondents in income category '1' who answered q4, per quarter, 2003 to 2010
INSERT OVERWRITE TABLE income_1
SELECT yyyyqq, count(q4) FROM data WHERE income = '1' AND yyyy BETWEEN '2003' AND '2010' GROUP BY yyyyqq;

-- Of those, respondents who chose option '1', taken here as "is too high" (same year range as above)
INSERT OVERWRITE TABLE income_q1_1
SELECT yyyyqq, count(q4) FROM data WHERE income = '1' AND yyyy BETWEEN '2003' AND '2010' AND q4 = '1' GROUP BY yyyyqq;

-- The equivalent tables for income categories 2 to 4 are not shown in this listing.

CREATE TABLE final(yyyyqq string, inc1_res1 string, inc1_tres string, inc1_per string,
inc2_res1 string, inc2_tres string, inc2_per string,
inc3_res1 string, inc3_tres string, inc3_per string,
inc4_res1 string, inc4_tres string, inc4_per string)
COMMENT 'Percentage of respondents in each income category who said the inflation target is too high'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Every per-category table is joined on the quarter key so that each quarter yields exactly one row
INSERT OVERWRITE TABLE final
SELECT in1.yyyyqq, in1.q4_res1, in1.q4_tot, in1.q4_res1/in1.q4_tot,
in2.q4_res1, in2.q4_tot, in2.q4_res1/in2.q4_tot,
in3.q4_res1, in3.q4_tot, in3.q4_res1/in3.q4_tot,
in4.q4_res1, in4.q4_tot, in4.q4_res1/in4.q4_tot
FROM income_join1 in1
JOIN income_join2 in2 ON in1.yyyyqq = in2.yyyyqq
JOIN income_join3 in3 ON in1.yyyyqq = in3.yyyyqq
JOIN income_join4 in4 ON in1.yyyyqq = in4.yyyyqq;
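When several per-category tables are combined on the quarter key, every table needs its own equality on `yyyyqq`; any table left unconstrained is cross-joined with the rest, which multiplies the rows. The sketch below shows the difference in row counts, using SQLite from Python as a stand-in for Hive. The four two-row tables are hypothetical and simply reuse the aliases `in1` to `in4` as table names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-ins for the four per-income-category tables, two quarters each.
for name in ("in1", "in2", "in3", "in4"):
    conn.execute(
        f"CREATE TABLE {name} (yyyyqq TEXT, q4_res1 INTEGER, q4_tot INTEGER)"
    )
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?, ?)",
        [("200301", 10, 40), ("200302", 12, 48)],
    )

# Only in1 and in2 are tied together; in3 and in4 are cross-joined.
partial = conn.execute(
    "SELECT COUNT(*) FROM in1, in2, in3, in4 WHERE in1.yyyyqq = in2.yyyyqq"
).fetchone()[0]

# Every table is tied to the same quarter key.
full = conn.execute(
    "SELECT COUNT(*) FROM in1 "
    "JOIN in2 ON in1.yyyyqq = in2.yyyyqq "
    "JOIN in3 ON in1.yyyyqq = in3.yyyyqq "
    "JOIN in4 ON in1.yyyyqq = in4.yyyyqq"
).fetchone()[0]

print(partial, full)  # 8 rows versus 2 rows for two quarters
```

With two quarters the partially constrained query returns 2 x 2 x 2 = 8 rows, while the fully constrained one returns one row per quarter.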

Screenshots showing the percentage of respondents (at each education level) who mentioned "Stayed about the same" for each quarter from 2004 to 2012.

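CODES

-- Respondents at education level '1' who chose option '3', taken here as "Stayed about the same", per quarter, 2004 to 2012
INSERT OVERWRITE TABLE educ1
SELECT yyyyqq, count(q5) FROM data1 WHERE educ = '1' AND q5 = '3' AND yyyy BETWEEN '2004' AND '2012' GROUP BY yyyyqq;

-- Pair each quarter's option-'3' count with the count in educ_q1 (how educ_q1 is populated is not shown)
INSERT OVERWRITE TABLE educ_total1
SELECT educ1.yyyyqq, educ1.q5, educ_q1.q5 FROM educ1 JOIN educ_q1 ON educ1.yyyyqq = educ_q1.yyyyqq;

-- Same pairing for education level '2'
INSERT OVERWRITE TABLE educ_total2
SELECT educ2.yyyyqq, educ2.q5, educ_q2.q5 FROM educ2 JOIN educ_q2 ON educ2.yyyyqq = educ_q2.yyyyqq;

-- Final report: counts and ratio for each education level, one row per quarter
INSERT OVERWRITE TABLE final
SELECT edu1.yyyyqq, edu1.res_3, edu1.res_q5, edu1.res_3/edu1.res_q5,
edu2.res_3, edu2.res_q5, edu2.res_3/edu2.res_q5
FROM educ_total1 edu1 JOIN educ_total2 edu2 ON edu1.yyyyqq = edu2.yyyyqq;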
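The listing builds one pair of tables per education level. An alternative formulation, which is not what the project used, groups by the category column as well as the quarter, so a single query returns one row per quarter and education level. The sketch below runs that query with SQLite from Python as a stand-in for Hive. The column layout of `data1` is inferred from the statements above, and the sample rows and answer codes are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data1 (yyyy TEXT, yyyyqq TEXT, educ TEXT, q5 TEXT)")
conn.executemany(
    "INSERT INTO data1 VALUES (?, ?, ?, ?)",
    [
        ("2003", "200304", "1", "3"),  # outside the 2004 to 2012 window
        ("2004", "200401", "1", "3"),
        ("2004", "200401", "1", "2"),
        ("2004", "200401", "2", "3"),
        ("2004", "200401", "2", "3"),
        ("2012", "201204", "1", "1"),
        ("2013", "201301", "2", "3"),  # outside the window
    ],
)

# One pass over the data: count option '3' answers and all answers per group.
rows = conn.execute(
    """
    SELECT yyyyqq,
           educ,
           SUM(CASE WHEN q5 = '3' THEN 1 ELSE 0 END) AS res_3,
           COUNT(q5) AS res_q5,
           100.0 * SUM(CASE WHEN q5 = '3' THEN 1 ELSE 0 END) / COUNT(q5) AS pct
    FROM data1
    WHERE yyyy BETWEEN '2004' AND '2012'
    GROUP BY yyyyqq, educ
    ORDER BY yyyyqq, educ
    """
).fetchall()

for row in rows:
    print(row)
```

The result is in long format (one row per quarter and level) rather than the wide format of the `final` table, so it would still need pivoting to match the report layout shown in the screenshots.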

Reports

Which of the options best describes how prices have changed over the last 12 months? We calculated the percentage of male and female respondents who said "Gone down" in each quarter.

The Government has set an inflation target of 2.0%. What does the public think of this target? We calculated the percentage of respondents in each income category who said it "is too high" in each quarter from 2003 to 2010.

How would you say interest rates on things such as mortgages, bank loans and savings have changed over the last 12 months? We calculated the percentage of respondents at each education level who said "Stayed about the same" in each quarter from 2004 to 2012.

How would you expect interest rates to change over the next 12 months? We calculated the percentage of respondents in each tenure category who said "Rise a lot" in each quarter from 2003 to 2015.

CERTIFICATE

This is to certify that Mr. SWAMI NATH SATPAL of NETAJI SUBHASH ENGINEERING COLLEGE, registration number: 141090110114, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

CERTIFICATE

This is to certify that Mr. SATYAM KUMAR of NETAJI SUBHASH ENGINEERING COLLEGE, registration number: 141090110085, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

CERTIFICATE

This is to certify that Mr. SHIVAM KUMAR of NETAJI SUBHASH ENGINEERING COLLEGE, registration number: 141090110092, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

CERTIFICATE

This is to certify that Mr. GAURAV KUMAR of NETAJI SUBHASH ENGINEERING COLLEGE, registration number: 141090110034, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

CERTIFICATE

This is to certify that Mr. WASHID SAYEED of NETAJI SUBHASH ENGINEERING COLLEGE, registration number: 141090110123, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

CERTIFICATE

This is to certify that Mr. SUMAN KUNDU of RCC INSTITUTE OF TECHNOLOGY, registration number: 131170110091, has successfully completed the project on BIG DATA ANALYTICS WITH R using BIG DATA under the guidance of Mr. SANJOY CHOWDHURY.

--- ---------------------------------------------------

SANJOY CHOWDHURY Globsyn Finishing School (a division of Globsyn Skills)

