PyPedia The free programming environment that anyone can edit! Alexandros Kanterakis Genomics...

Post on 31-Mar-2015


PyPedia

The free programming environment that anyone can edit!

Alexandros Kanterakis

Genomics Coordination Center, Department of Genetics, University Medical Center, Groningen, The Netherlands

Introduction

How not to be a bioinformatician

• Stay low level at every level
• Be open source without being open
• Make tools that make no sense to scientists
• Do not ever share your results and do not reuse
• Never maintain your databases and web services
• Be unreachable and isolated

So, you think you can be a bioinformatician…

• Imagine you only have: A personal computer with a browser and an Internet connection

• Answer the following question: Who is the current prime minister of Latvia?

SYTYCBAB

• Imagine you only have: A personal computer with a browser and an Internet connection

• Answer the following question: Compute the Hardy-Weinberg equilibrium of a set of genotypes
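To make the question concrete, here is a minimal Python 3 sketch of one common way to check Hardy-Weinberg equilibrium at a single biallelic site: compare observed genotype counts with the counts expected from the allele frequencies, using a chi-square statistic. The function name `hardy_weinberg_chi_square` and the genotype encoding (pairs of allele letters) are assumptions made for illustration; this is not PyPedia's implementation.

```python
from collections import Counter


def hardy_weinberg_chi_square(genotypes):
    """Chi-square statistic for Hardy-Weinberg equilibrium at one biallelic site.

    `genotypes` is a list of 2-tuples of allele symbols, e.g. ("A", "G").
    """
    n = len(genotypes)
    # Count every allele across all genotypes.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) != 2:
        raise ValueError("this sketch expects exactly two distinct alleles")
    a, b = sorted(allele_counts)
    p = allele_counts[a] / (2 * n)  # frequency of the first allele
    q = 1 - p                       # frequency of the second allele
    # Observed genotype counts; ("A", "G") and ("G", "A") are the same genotype.
    observed = Counter("".join(sorted(g)) for g in genotypes)
    # Expected counts under Hardy-Weinberg proportions p^2, 2pq, q^2.
    expected = {a + a: p * p * n, a + b: 2 * p * q * n, b + b: q * q * n}
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())


# Four genotypes in exact Hardy-Weinberg proportions give a statistic of 0.
print(hardy_weinberg_chi_square([("A", "A"), ("A", "G"), ("G", "G"), ("G", "A")]))
```

A statistic near zero means the observed counts match the expected ones; turning it into a p-value (one degree of freedom for two alleles) is left out to keep the sketch dependency-free.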

✔ Execute ✖ Source ✖ Documentation

✖ Execute ✔ Source ✖ Documentation

✖ Execute ✖ Source ✔ Documentation

? Web environment, online execution
? Open Source
? Integrate with other tools
? Edit a method and share it
? Examples and Unit tests
? Deploy in the cloud
? Frequency of new releases

✔ Execute ✔ Source ✔ Documentation

But what about…

wiki

A python sandbox to the rescue

From: http://wiki.python.org/moin/SandboxedPython

So: Google App Engine + MediaWiki = PyPedia

www.pypedia.com

Code as wiki

HTML input as wiki

Executing a method in a remote computer

• Edit your user page and add an “ssh” section:

==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia

• This content is NOT shown to anyone
• Install the PyPedia client on the remote computer (details on pypedia.com)

“Execute on remote computer”

Example: Fixed_point_user_JohnDoe

The cloud instance contains: numpy, scipy, matplotlib

Like SAGE but with custom execution environments (e.g. BioPython, PyCogent, …)

Cool, but I want to call the function from my local computer..

• Install the PyPedia python library:
git clone git://github.com/kantale/pypedia.git

• Load the function in python:
>>> import pypedia
>>> from pypedia import Pairwise_linkage_disequilibrium
>>> Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675}
>>>

• You can call the method of any user and your method can be called by anyone.

• Edit locally, push changes.
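For readers wondering what sits behind the 'R_sq' and 'Dprime' fields in the output above, here is a minimal Python 3 sketch. It assumes the standard textbook definitions of D, |D'| and r² and takes the four haplotype frequencies as given (rounded from that output). The name `ld_statistics` is hypothetical and not part of the PyPedia library, and the way PyPedia estimates the haplotype frequencies themselves is not reproduced here.

```python
def ld_statistics(hap):
    """Return (D, |D'|, r^2) from two-site haplotype frequencies.

    `hap` maps two-character haplotypes (first site, second site) to
    frequencies that sum to 1. Both sites are assumed to be biallelic and
    polymorphic; otherwise r^2 is undefined (division by zero).
    """
    a = sorted({h[0] for h in hap})[0]  # reference allele at the first site
    b = sorted({h[1] for h in hap})[0]  # reference allele at the second site
    p_a = sum(f for h, f in hap.items() if h[0] == a)
    p_b = sum(f for h, f in hap.items() if h[1] == b)
    # D: observed frequency of haplotype ab minus its expectation under independence.
    d = hap.get(a + b, 0.0) - p_a * p_b
    # Largest |D| that is possible with these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max else 0.0
    r_sq = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_sq


# Haplotype frequencies rounded from the output shown above.
d, d_prime, r_sq = ld_statistics({"AA": 0.5, "AG": 0.0, "GA": 0.125, "GG": 0.375})
print(d, d_prime, round(r_sq, 6))
```

With these rounded inputs the sketch gives D' = 1 and r² ≈ 0.6, in line with the 'Dprime' and 'R_sq' values in the output.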

• On the top of each article there is a button:

• Creates a personalized version of the article that only you can edit.

• This is similar to GitHub’s “fork” feature.

Using PyPedia for open science

• A complete analysis can be hosted in PyPedia

• Any finding generated or published should be easily shared and reproduced.

• The reproduction of a finding takes time even when the source code is released.

Reproducible science

• PyPedia offers a REST interface:
www.pypedia.com/index.php?b_timestamp=<YYYYMMDDHHMMSS>&get_code=<python code>

• Get the most recent version of the <python code> that was edited before the timestamp.

• Reproduce the analysis by sharing a single URL:
http://www.pypedia.com/index.php?b_timestamp=20120102101010&get_code=print Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])
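The URL above contains spaces, quotes and brackets, so in practice it has to be percent-encoded before it is shared. A minimal Python 3 sketch of building such a URL follows; it only uses the two parameters named on the slide (`b_timestamp`, `get_code`), the helper name `reproduction_url` is made up for illustration, and the code builds a string without contacting the server.

```python
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = "http://www.pypedia.com/index.php"  # endpoint as given on the slide


def reproduction_url(code, when):
    """Build a URL asking for `code` as it stood before the moment `when`."""
    query = urlencode({
        "b_timestamp": when.strftime("%Y%m%d%H%M%S"),  # YYYYMMDDHHMMSS
        "get_code": code,  # urlencode percent-encodes the snippet
    })
    return BASE_URL + "?" + query


snippet = (
    'print Pairwise_linkage_disequilibrium('
    '[("A","A"), ("A","G"), ("G","G"), ("G","A")], '
    '[("A","A"), ("A","G"), ("G","G"), ("A","A")])'
)
url = reproduction_url(snippet, datetime(2012, 1, 2, 10, 10, 10))
print(url)
```

The snippet being requested is Python 2, as on the slide; only the URL-building helper is written in Python 3.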

Reproducing an experiment

#> curl \
    --data-urlencode 'b_timestamp=20120501010101' \
    --data-urlencode 'get_code=print Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])' \
    http://www.pypedia.com/index.php \
    --output code.py

#> python code.py
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675}
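The idea behind `b_timestamp` is that every article keeps its edit history and the server answers with the newest revision that predates the requested moment. A toy Python 3 sketch of that selection rule follows; the revision list, the function name `latest_before` and the strict "before" comparison are all assumptions made for illustration, not PyPedia's server code.

```python
from bisect import bisect_left


def latest_before(revisions, b_timestamp):
    """Return the code of the newest revision edited strictly before `b_timestamp`.

    `revisions` is a list of (timestamp, code) pairs sorted by timestamp.
    Timestamps are YYYYMMDDHHMMSS strings, so string order is time order.
    Returns None when no revision is old enough.
    """
    stamps = [stamp for stamp, _ in revisions]
    index = bisect_left(stamps, b_timestamp)  # number of revisions older than the cutoff
    return revisions[index - 1][1] if index else None


# Hypothetical edit history of one article.
history = [
    ("20110101000000", "version 1"),
    ("20120301000000", "version 2"),
    ("20120601000000", "version 3"),
]
print(latest_before(history, "20120501010101"))  # version 2
```

Pinning the timestamp is what keeps a shared URL reproducible: later edits to the article do not change the code that an old URL resolves to.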

Meta-webserver

• HTML injection is allowed and encouraged!
http://www.pypedia.com/index.php/Draw_face_user_Kantale

• Example: run HTML code posted in a gist:
http://www.pypedia.com/index.php?run_code=import urllib2
print urllib2.urlopen('https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04b3f7a23ba43f558fba98b/index_full.html').read()

Click me!

• All content is under the Simplified BSD License
• Two namespaces:
– Validated articles, e.g. Minor_allele_frequency
  • Safe, only admins can edit
– User articles, e.g. Minor_allele_frequency_user_John
  • Unsafe, edited by the individual user
– High-quality articles from the User namespace are promoted to the Validated namespace
– Validated articles cannot call User articles (duh..)
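The namespace rule can be read straight off the article names: a User article carries a `_user_<name>` suffix, a Validated article does not. A small Python 3 sketch of that reading follows; the helper names `article_owner` and `may_call` are hypothetical, and the sketch assumes user names contain no underscore, which the slides do not state.

```python
import re

# Assumption: User articles end in "_user_<name>" and <name> has no underscore.
USER_SUFFIX = re.compile(r"_user_(?P<user>[^_]+)$")


def article_owner(name):
    """Return the user who owns `name`, or None for a Validated article."""
    match = USER_SUFFIX.search(name)
    return match.group("user") if match else None


def may_call(caller, callee):
    """Validated articles may only call Validated articles; User articles may call anything."""
    return article_owner(caller) is not None or article_owner(callee) is None


print(article_owner("Minor_allele_frequency_user_John"))  # John
print(may_call("Minor_allele_frequency", "Minor_allele_frequency_user_John"))  # False
```

Keeping Validated articles free of calls into the User namespace means that admin-reviewed code cannot be changed indirectly by editing an unreviewed user article.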

Some thoughts (in the embarrassing occasion I have some minutes left)

Code as wiki, program as wiki concept
• Multidimensional expansion
• As Mao said: Let a thousand flowers scripts bloom (and some of them rot in hell)
• Minimize the distance: D_sanity(SCRIPT_made_by_IT_guy, SCRIPT_useful_to_biologists)
• Encyclopedialize ™ your scripts because open source isn’t enough!

Future steps:
• Attract editors, make communities!
• If it can be done in python, why not Ruby, …?

• Contact: admin@pypedia.com
• Source code license: GPL v3
• Content license: Simplified BSD license
• Join us in google groups: http://groups.google.com/group/pypedia
• Twitter: @PyPedia

• PyPedia’s source code:
– Mediawiki extension: https://github.com/kantale/PyPedia_server
– Python library: https://github.com/kantale/pypedia

• Acknowledgements:
– Despoina Antonakaki
– Kostas Tselios
– Morris A. Swertz