+ All Categories
Home > Business > Resume parser

Resume parser

Date post: 03-Jul-2015
Category:
Upload: akrita-agarwal
View: 1,040 times
Download: 7 times
Share this document with a friend
13
Resume Parser Alpha Phase Akrita Agarwal
Transcript
Page 1: Resume parser

Resume Parser – Alpha Phase

Akrita Agarwal

Page 2: Resume parser

Introduction

• While creating/updating a profile, the user has to enter all the information manually.

• Resume parser aims to accept a user’s resume in Microsoft Word .doc format to reduce the manual step.

• In this phase, we have grouped the titles together to understand the structure of the document.

Page 3: Resume parser

Tools used

• Apache POI – Java API for Microsoft Documents

• HWPF – Apache POI component for Word Documents

• Eclipse

Page 4: Resume parser

Algorithm

• Document scanned for titles

• Each title acquires a 4-D vector, stored as a key in a map.

• The title is saved as a value in the map with its start and stop index.

• Titles grouped by their keys.

• 4-D key vector :[B][I][U].[font size].[font name].[font color]

E.g. 110.24.Times New Roman.0

Page 5: Resume parser

• Values –

– Each value is of form -

• ([start index] [stop index] [title text])

– List of values -

• ([start index] [stop index] [title text]), ([start index] [stop index] [title text]) …

– E.g.

– (353 362 Education) (1918 1931 Skills & Tools) …

– A list of values exist corresponding to each key

– Each key is unique to the system

– All values with the same key will be grouped together

– Sample resume run in the next slide

Page 6: Resume parser

AKRITA AGARWAL 521 W MLK Dr Unit A1 513-652-7687 Cincinnati 45220 Ohio USA [email protected]

EDUCATION Masters of Science, Computer Science, University of Cincinnati 2013 – 2015 (expected) GPA: 3.91/4.00 Relevant courses: Machine Learning, Artificial Intelligence, Pattern Recognition, Intelligent Data Analysis Projects: Detect the facial key points on face images using Neural Networks view presentation August 2013 – present Bachelor of Technology, Computer Science, Jaypee Institute of Information Technology 2009 – 2013

Relevant courses: Data Mining & Information Retrieval, E-commerce, Introduction to Data Science, Human Computer Interaction, Data structures and algorithms, Problem Solving and Research Methodologies

Projects:

Dance posture correction system based on Human Action Recognition using Microsoft Kinect Jun 2012- Jun 2013

SPAM filter for SMS on phones with supervised learning using Naïve Bayesian Classifier Jan 2012 – Aug 2012

Recommender systems for social networks based on supervised content and collaborative filtering July 2011 - Dec 2012

Rainfall estimation in India based on 150 years data using Linear Regression model view presentation Jan 2011- Jun2011

SKILLS & TOOLS Languages: C, C++, Java, Python2.7-3.3, R Tools: Microsoft Visual Studio, MATLAB, R studio

PROFESSIONAL EXPERIENCE University of Cincinnati, UCIT, Application Developer – working with Java and Play Framework to build Nov 2013 – present

an online network of professionals called ‘Eprofessional’ at University of Cincinnati, Office of Research.

Microsoft India, Student Partner - Developed applications for Windows 8, Windows phone 7, Sep 2012 – Sep 2013

designed the website for the Delhi region, conducted the MS Dreamspark Yatra and took sessions

on Cloud Computing, phone application development in many colleges of Delhi.

Indian Institute of Technology Delhi, Research Intern - Worked on real-time action recognition for office Jun 2012 – Aug 2013 environment (typing, reading, etc.), modelled by Support Vector Machines. The activities were recorded

using Microsoft Kinect.

KPC, Noida India, Software Intern, Jun 2010 – Aug 2010 Live website development and training on asp.net, Silverlight and XNA framework in Visual Studio

PUBLICATIONS WebTrovert: An AutoSuggest Search and Suggestions Implementing Recommendation System Algorithms, Book Title: Advances in Computer Science, Engineering & Applications Publisher: Springer Berlin / Heidelberg, ISBN: 978-3-642-30156-8, view here

AWARDS & HONORS Won IUnlockJoy & IUnlockJoy2, Windows phone application development competitions organized by Microsoft, 2012

Won MeetMyApp Windows 8 application development competition organized by Microsoft, 2012

ACM Brand Ambassador, JIIT ACM Student Chapter 2011-12

INTERESTS Playing Piano Open source coding (Github) Learning Languages

Output –

No. of unique groups – 4 --

100.20.Calibri.0[2937 2954 , 3414 3424 IUnlockJoy, 3426 3437 IUnlockJoy, 3524 3533 MeetMyApp]

100.24.Calibri.9[353 362 Education, 1918 1931 Skills & Tool, 2027 2049 Professional Experienc, 3074 3086 Publications, 3394 3409 Awards & Honors, 3665 3673 Interest]

1.20.Calibri.2[331 351 [email protected], 820 837 view presentation, 1873 1890 view presentation, 3382 3391 view here]

1.20.Calibri.0[2051 2074 University of Cincinnat, 2323 2337 Microsoft Indi, 2634 2669 Indian Institute of Technology Delh]

Page 7: Resume parser

Output

• The algorithm was tested on 3 resumes.

• Akrita.doc (1 page)

• Vagal.doc(14 pages)

• Collins.doc(112 pages)

• Note - Akrita.doc was used as a sample resume and the output is given in the previous slide.

Page 8: Resume parser

AKRITA AGARWAL 521 W MLK Dr Unit A1 513-652-7687 Cincinnati 45220 Ohio USA [email protected]

EDUCATION Masters of Science, Computer Science, University of Cincinnati 2013 – 2015 (expected) GPA: 3.91/4.00 Relevant courses: Machine Learning, Artificial Intelligence, Pattern Recognition, Intelligent Data Analysis Projects: Detect the facial key points on face images using Neural Networks view presentation August 2013 – present Bachelor of Technology, Computer Science, Jaypee Institute of Information Technology 2009 – 2013

Relevant courses: Data Mining & Information Retrieval, E-commerce, Introduction to Data Science, Human Computer Interaction, Data structures and algorithms, Problem Solving and Research Methodologies

Projects:

Dance posture correction system based on Human Action Recognition using Microsoft Kinect Jun 2012- Jun 2013

SPAM filter for SMS on phones with supervised learning using Naïve Bayesian Classifier Jan 2012 – Aug 2012

Recommender systems for social networks based on supervised content and collaborative filtering July 2011 - Dec 2012

Rainfall estimation in India based on 150 years data using Linear Regression model view presentation Jan 2011- Jun2011

SKILLS & TOOLS Languages: C, C++, Java, Python2.7-3.3, R Tools: Microsoft Visual Studio, MATLAB, R studio

PROFESSIONAL EXPERIENCE University of Cincinnati, UCIT, Application Developer – working with Java and Play Framework to build Nov 2013 – present

an online network of professionals called ‘Eprofessional’ at University of Cincinnati, Office of Research.

Microsoft India, Student Partner - Developed applications for Windows 8, Windows phone 7, Sep 2012 – Sep 2013

designed the website for the Delhi region, conducted the MS Dreamspark Yatra and took sessions

on Cloud Computing, phone application development in many colleges of Delhi.

Indian Institute of Technology Delhi, Research Intern - Worked on real-time action recognition for office Jun 2012 – Aug 2013 environment (typing, reading, etc.), modelled by Support Vector Machines. The activities were recorded

using Microsoft Kinect.

KPC, Noida India, Software Intern, Jun 2010 – Aug 2010 Live website development and training on asp.net, Silverlight and XNA framework in Visual Studio

PUBLICATIONS WebTrovert: An AutoSuggest Search and Suggestions Implementing Recommendation System Algorithms, Book Title: Advances in Computer Science, Engineering & Applications Publisher: Springer Berlin / Heidelberg, ISBN: 978-3-642-30156-8, view here

AWARDS & HONORS Won IUnlockJoy & IUnlockJoy2, Windows phone application development competitions organized by Microsoft, 2012

Won MeetMyApp Windows 8 application development competition organized by Microsoft, 2012

ACM Brand Ambassador, JIIT ACM Student Chapter 2011-12

INTERESTS Playing Piano Open source coding (Github) Learning Languages

Output –

No. of unique groups – 4 --

100.20.Calibri.0[2937 2954 , 3414 3424 IUnlockJoy, 3426 3437 IUnlockJoy, 3524 3533 MeetMyApp]

100.24.Calibri.9[353 362 Education, 1918 1931 Skills & Tool, 2027 2049 Professional Experienc, 3074 3086 Publications, 3394 3409 Awards & Honors, 3665 3673 Interest]

1.20.Calibri.2[331 351 [email protected], 820 837 view presentation, 1873 1890 view presentation, 3382 3391 view here]

1.20.Calibri.0[2051 2074 University of Cincinnat, 2323 2337 Microsoft Indi, 2634 2669 Indian Institute of Technology Delh]

My resume

Page 9: Resume parser

Dr. Vagal’s resume

ACHALA VAGAL, MD Department of Radiology

University of Cincinnati Academic Health Center

234 Goodman Street

Cincinnati, OH 45267-0761

Phone: (513 584-1584

FAX: (513) 584-9100

[email protected]

ACADEMIC APPOINTMENTS

9/2013-current Associate Professor, Department of Radiology, University Of Cincinnati

6/2007-8/2013 Assistant Professor, Department of Radiology, University Of Cincinnati

6/2005-6/2007 Clinical Instructor, Veterans Affairs Hospital and University of Cincinnati

3/1997-5/2001 Lecturer, Mahatma Gandhi Mission Hospital and Medical College, Mumbai, India

POSTGRADUATE TRAINING

7/8/13-7/12/13 Visiting Research Fellow, Imaging Research Lab, Weil Cornell Medical College,

New York.

7/22/13-7/28/13 Visiting Research Fellow, Imaging Lab, University of Virginia

6/2012-6/2013 Certificate in Clinical and Translational Research, Division of Epidemiology and

Biostatistics, University of Cincinnati

7/2003-6/2004 Neuroradiology Fellowship, University of Cincinnati Medical Center 7/2004-

6/2005 Body Imaging Fellowship, University of Cincinnati Medical Center

2/1994-1/1997 Diagnostic Radiology Residency, University of Mumbai, India

12/1993-1/1994 Internship, University of Mumbai, India

EDUCATION

6/88-12/93 MBBS with honors, Grant Medical College and Sir J.J Group of Hospitals,

Mumbai, India

6/86-6/88 Undergraduate, Sir Parshurambhau College, Pune, India

HONORS AND AWARDS

2006 Certificate of Merit, Educational Exhibit, RSNA Annual Meeting

2006 Cum Laude Award, Educational Exhibit, RSNA Annual Meeting

2006 Certificate of Merit, Educational Exhibit, ARRS Annual Meeting

2008 Summa Cum Laude Award: Scientific Presentation, American

Roentgen Ray Society (ARRS) Annual Meeting

2009 Certificate of Merit, Educational Exhibit, RSNA Annual Meeting

2009 Clinician Educator Development Program Grant , American

No. of unique groups - 4--

100.24.Times New Roman.0[5432 5470 PROFESSIONAL ORGANIZATION MEMBERSHIPS:8322 8383 Advanced CT/MR perfusion training for neuroradiology fellows:

101.24.Times New Roman.0[254 275 ACADEMIC APPOINTMENTS652 674 POSTGRADUATE TRAINING 1377 1387 EDUCATION 1559 1575 HONORS AND AWARD2802 2883 PROFESSIONAL ASSOCIATIONS AND COMMITTEES MEMBERSHIPS DEPARTMENTAL COMMITTEES:4199 4220 NATIONAL COMMITTEES:7927 7966 TEACHINGTeaching Material Developed: 9508 9541 PEER REVIEWED PUBLICATIONS14082 14117 SCIENTIFIC AND EDUCATIONAL EXHIBITS23120 23135 BOOK CHAPTER: 23370 23410 INVITED SPEAKER / PRESENTATIONSLocal:23957 23967 Regional:24531 24542 National: 26512 26557 RESEARCH SUPPORT Ongoing Research Support29818 29846 Completed Research Support 31991 32011 MANUSCRIPT REVIEWER:32118 32154 BOARD CERTIFICATION AND LICENSURE:]

1.24.Times New Roman.0[3809 3829 HOSPITAL COMMITTEES:5224 5250 INTERNATIONAL COMMITTEES:5827 5874 LOCAL CLINICAL/QI INITIATIVES: 3D Lab set up:6392 6415 CT/MR Perfusion Program6847 6859 Resident QA:7489 7499 Stroke QI:7975 8007 Resident Core Curriculum Project8751 8797 ASNR online educational curriculum development]

10.24.Times New Roman.0[4991 5003 Radiographic16085 16105 Certificate of Merit17077 17098 Magna Cum laude Award17698 17718 Magna Cum Laude Awar18787 18812 Certificate of Merit Awar19046 19071 Certificate of Merit Awar23082 23101 Certificate of Meri31933 31946 Radiographic]

Page 10: Resume parser

Dr. Collin’s resumeCURRICULUM VITAE

Jannette Collins, MD, MEd, FCCP, FACR

May 29, 2013

CURRENT APPOINTMENT:

Ben Felson Professor and Chair of Radiology, University of Cincinnati College of Medicine

Professor of Medicine, University of Cincinnati College of Medicine

ADDRESS:

Department of Radiology

University of Cincinnati Medical Center

234 Goodman Street, ML 0761

Cincinnati, OH 45267-0761

Phone: 513-584-4396

FAX: 513-584-0431

E-mail: [email protected]

CERTIFICATION:

Medical License Wisconsin #31295

Ohio #92719

National Board of Medical Education Certificate #369504

American Board of Radiology June 8, 1994

NIOSH B Reader Certification May 1, 1995

NIOSH B Reader Re-Certification 3/1/99 - 2/28/03

NIOSH B Reader Re-Certification 3/1/03 – 2/28/07

EDUCATION:

1975-76 University of Montana

Major: Music Missoula, Montana

1976-78 Montana State University

Bachelor of Science, Bozeman, Montana

Elementary Education G.P.A. 3.53/4.00

1980-81 University of Cincinnati

Master of Education, Cincinnati, Ohio

Special Education G.P.A. 4.00/4.00

1981-83 University of Cincinnati

Physics, Chemistry, Biology Cincinnati, Ohio

G.P.A. 3.69/4.00

1985-89 Medical College of Ohio

Doctor of Medicine Toledo, Ohio

No. of unique groups - 4--

100.24.Times New Roman.0[5432 5470 PROFESSIONAL ORGANIZATIO

Page 11: Resume parser

Performance

Resume Name Pages Runtime Error % in title grouping

Akrita.doc 1 ~ 1 minutes 0-1%

Vagal.doc 14 ~ 5 minutes Upto 50%

Collins.doc 112 ~ 2 hours 30-40%

The challenge lay in parsing the collins resume and produce meaningful results.

The system is able to recognize and group the titles with a fair accuracy.

Page 12: Resume parser

Next phase proposal

• Focus on individual sections.

• Extract the needed information from the groups.

• Further Semantic Analysis of the text to reduce the error rate

• Remove noise and junk values.

Page 13: Resume parser

• I am sorry I couldn’t make it to the meeting

• A BIG thank you to Josette for the motivation and Divya for presenting this

• If there are any questions/suggestions you can email me at – [email protected]

Thank You Everyone!

- Akrita


Recommended