Practical Data Science and Informatics Education
Sean Davis, MD, PhD
Center for Cancer Research
National Cancer Institute
CI4CC, October 1, 2018
@seandavis12
Slides: http://bit.ly/sd-ci4cc-2018
- Information Transfer
- ++Audience Sharing++
- Panel Discussion
- Sharing: http://bit.ly/CI4CC-DSE
- Networking and Collaboration--KISS principle
Learning objectives
- Name several practical venues for teaching Data Science (DS)
- Identify practical strategies to enhance DS education
- Know that building a data-mature organization involves education and community-building
- Appreciate the value of trainers in DS education
- Recognize the importance of community efforts and collaboration in DS education
Approach
Illustrative examples of formats for Data Science education
Example                                        Approach
Software Carpentry                             Two-day interactive teaching
Statistical Methods for Functional Genomics    Two-week intensive course
Bioconductor Workshops                         Short, topical workshops
NIH Hackathons                                 Hackathons
Software Carpentry is a volunteer project dedicated to teaching basic computing
skills to researchers.
Software Carpentry highlights
- Uses limited, highly-focused, community-developed curricula
- Requires that trainers be trained!
- Value to organization:
  - Visible and recognizable to leadership
  - Addresses need
  - Cheap and easy
  - Builds community of practice and identifies trainers
- Formal, utilizes best practices
CSHL course highlights
- Maintain identity as a statistics course, not bioinformatics
- Longer course, allows creativity and flexibility
- Isolate students from daily work environment
- Do not underestimate the value of the “cost” of the course
- Application process forces students AND instructors to evaluate course objectives
- Learning environment is ideal
- No “project” involved, so choose students who are in a position to apply knowledge
CSHL Daily schedule
8-9am Breakfast
9-12noon “Interactive lectures”
12-1pm Lunch
1-4pm Afternoon lab
4-5pm Research Talk
5-7pm Dinner and free time
7-8:00pm Research Talk (continued)
8-11pm Open lab
https://seandavi.github.io/ITR/
Bioconductor Workshop highlights
- Workshops make up more than 50% of the 3-day annual Bioconductor conference
- Multiple 1-2 hour hands-on, topical practicums in analysis methods, taught by the developers
- Workshops delivered via cloud infrastructure
- All materials developed collaboratively with continuous integration testing (everything works)
- Utilize a standard template
- Modern collaboration (GitHub) and publishing (GitBook)
Bioconductor workshop syllabus template
https://github.com/Bioconductor/BioC2018/tree/c2bd6d8e6330d0ceff8d23d59e111bbc68c66564
Results
- 6 weeks
- 15 workshops
- 17 contributors
- 388 pages
- 125 participants over two days
NIH Hackathons (with notation of other formats and organizations)
https://biohackathons.github.io/
Hackathon highlights
- Best for participants with well-formed domain expertise and technical knowledge
- Goal- and outcome-oriented
- Self-directed
- Like Software Carpentry, can be very attractive to leadership
- Can require significant up-front planning and logistics
NIH Hackathon schedule
- Day -60: develop projects and project leads
- Day -30: begin to develop teams
- Day 0.5: Introductions, team organization, infrastructure (GitHub, cloud instances), roles
- Day 1: morning hacking, report out at lunch, afternoon hacking, report out in evening
- Day 2: morning hacking, afternoon writing and publishing
- Day 2: close with presentations
- Day 3+: continue hacking, publication process
Tips
- Adult education is about building confidence
- Adult education is about building community
- Adult education is about utility
- Do not underestimate the power of food
- Focus on active learning using formative evaluation (questions), faded examples, independent work
- Allow lots of time for students to interact with the instructor(s) one-on-one
Tips
- For many students, simply having dedicated time to focus on learning is valuable
- Value the trainers with training, recognition, and community
- Recognize that every educational opportunity is part of a community-building effort
- Adopt and encourage educational and community best practices
- We are not doing this in isolation!
Longer-term
- Mentors and mentoring (build a list)
- Online forums (force students to ask a question)
- Communication tools (Slack, email lists)
- Organizational issues (but do not let this be a barrier)
- Partnering
- Funding
Oft-forgotten
- Licensing
- Contributions and attribution
- Code of conduct
- Privacy and ethics
- Disabilities and accessibility (physical and electronic)
Just do it!
Google Doc: http://bit.ly/CI4CC-DSE
Acknowledgments
- Software Carpentry: Greg Wilson (SWC), Lisa Federer (NIH), Francis Collins (NIH), Ben Busby (NIH)
- Bioconductor Workshops: Levi Waldron (CUNY), Marcel Ramos (CUNY), Martin Morgan (Roswell Park), Lori Shepherd (Roswell Park)
- CSHL: Harmen Bussemaker (Columbia), Tuuli Lappalainen (NY Genome Center), Olivier Elemento (Cornell), David Stewart (CSHL), Alicia Franco (CSHL), Tomas Rube (Columbia)
- NIH Hackathons: Ben Busby (NIH), MANY participants
[email protected]
https://seandavi.github.io
@seandavis12