1. Data Visualization in Python/Django By KENNETH EMEKA ODOH
2. Table of Contents: Introduction, Motivation, Method, Appendices,
Conclusion, References
3. Introduction. My background. Requirements: Python, Django,
Matplotlib, Ajax and other third-party libraries. What this talk
is about: we will be restricted to Python, Matplotlib and Django.
What this talk is not about: we are not trying to re-implement
Google Analytics. Source code is available at (
https://github.com/kenluck2001/PyCon2012_Talk ).
4. Motivation. There is a need to represent business analytics
data in graphical form, because a picture is worth a thousand
words. Source: en.wikipedia.org
5. Where do we find data? Source: en.wikipedia.org
6. Sources of Data: CSV files, databases.
7. Steps for data gathering. Identify the data source.
Preprocess the data (remove nulls and wide characters), e.g. with
Google Refine. Process the data (perform some statistical
analysis). Present the clean data in a descriptive format, i.e.
data visualization. See Appendix 1.
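The four steps above can be sketched with the standard library alone. This is a minimal illustration, not the code from the talk: the inline sample data, the column names `month` and `solar_radiation`, and the helper names are all assumptions made for the example.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a real data source (step 1).
RAW_CSV = """month,solar_radiation
1,10.5
2,NA
3,14.0
4,
5,18.5
"""


def load_rows(text):
    """Step 1: read rows from the identified data source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows, field):
    """Step 2: drop rows whose value is empty or marked "NA"."""
    cleaned = []
    for row in rows:
        value = (row.get(field) or "").strip()
        if value and value != "NA":
            cleaned.append((int(row["month"]), float(value)))
    return cleaned


def summarize(pairs):
    """Step 3: compute simple descriptive statistics."""
    values = [value for _, value in pairs]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


rows = load_rows(RAW_CSV)
pairs = clean_rows(rows, "solar_radiation")
summary = summarize(pairs)

# Step 4: present the clean data (as text here, as a chart in the talk).
for month, value in pairs:
    print(month, value)
print(summary)
```

Dropping a whole row when its value is missing keeps the month and radiation values paired, which matters when the two columns are later plotted against each other.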
8. Visual Representation of data. Charts / diagrams. Text.
Tables. Log files. Source: devk2.wordpress.com Source:
elementsdatabase.com
9. Categorization of data. Real-time: charts are generated in real
time, which can include a mechanism for refreshing the site to
get the latest chart. See Appendix 2. Batch-based: charts are
created from a CSV file (example in my blog). See Appendix 2.
10. Rules of Data Collection. Keep data in the most easily
processable form, e.g. a database or CSV. Keep a timestamp with the
data collected (the time the data was collected or processed) for
filtering. Gather data that is relevant to the business needs.
When the data grows too large, prune stale or old data that is no
longer needed.
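The timestamp and pruning rules can be illustrated with plain Python records. This is only a sketch under stated assumptions: the `collected_at` and `bids` field names, the sample dates, and the 120-day retention window are hypothetical, and a real project would apply the same idea to database rows.

```python
from datetime import datetime, timedelta


def filter_by_time(records, start, end):
    """Return the records whose timestamp falls in [start, end)."""
    return [r for r in records if start <= r["collected_at"] < end]


def prune_stale(records, now, max_age_days):
    """Keep only the records collected within the last max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records if r["collected_at"] >= cutoff]


# Hypothetical records; each one carries the time it was collected.
now = datetime(2012, 9, 1)
records = [
    {"collected_at": datetime(2012, 1, 15), "bids": 4},
    {"collected_at": datetime(2012, 6, 20), "bids": 9},
    {"collected_at": datetime(2012, 8, 30), "bids": 7},
]

recent = prune_stale(records, now, max_age_days=120)
august = filter_by_time(records, datetime(2012, 8, 1), datetime(2012, 9, 1))
print(len(recent), len(august))
```

Without the stored timestamp neither operation is possible, which is why the rule asks for it at collection time.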
11. Where is the data visualization done? Server: see Appendices
2 to 6. Client: examples of JavaScript libraries are D3.js (
http://d3js.org/ ) and gRaphael.js ( http://g.raphaeljs.com/ ).
12. Factors to Consider for Choice of Visualization. Where do we
perform the visualization processing: on the server or the client?
It depends. Security. Scalability.
15. Appendix 1
## This describes a scatter plot of solar radiation against the month.
## It aims to illustrate the steps of data gathering.
## The CSV file is from the data science hackathon website.
## The source code is available in a folder named plotCode.
import csv
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure

def prepareList(month_most_common_list):
    """Prepare the input for processing by removing all unnecessary
    values: entries equal to "NA" are skipped."""
    output_list = []
    for x in month_most_common_list:
        if x != "NA":
            output_list.append(x)
    return output_list
16. Appendix 1 contd.
def plotSolarRadiationAgainstMonth(filename):
    # 'rb' is the Python 2 idiom; on Python 3 use open(filename, newline='')
    trainRowReader = csv.reader(open(filename, 'rb'), delimiter=',')
    month_most_common_list = []
    Solar_radiation_64_list = []
    for row in trainRowReader:
        month_most_common = row[3]
        Solar_radiation_64 = row[6]
        month_most_common_list.append(month_most_common)
        Solar_radiation_64_list.append(Solar_radiation_64)
    # convert all elements in the list to float while skipping the first
    # element, for the 1st element is a description of the field.
    month_most_common_list = [float(i) for i in prepareList(month_most_common_list)[1:]]
    Solar_radiation_64_list = [float(i) for i in prepareList(Solar_radiation_64_list)[1:]]
    fig = Figure()
    ax = fig.add_subplot(111)
    title = 'Scatter Diagram of solar radiation against month of the year'
    ax.set_xlabel('Most common month')
    ax.set_ylabel('Solar Radiation')
    fig.suptitle(title, fontsize=14)
    try:
18. Appendix 3
# staff_member_required is assumed to be Django's admin decorator
from django.contrib.admin.views.decorators import staff_member_required
from django.http import HttpResponse
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure
from YAAS.stats.models import RegisteredUser, OnlineUser, StatBid

# scatter diagram of number of bids made against number of online users
# weekly report
@staff_member_required
def weeklyScatterOnlinUsrBid(request, week_no):
    page_title = 'Weekly Scatter Diagram based on Onlineuser versus Bid'
    weekno = week_no
    fig = Figure()
    ax = fig.add_subplot(111)
    year = stat.getYear()  # stat is not defined on the slide
    onlUserObj = OnlineUser.objects.filter(week=weekno).filter(year=year)
    bidObj = StatBid.objects.filter(week=weekno).filter(year=year)
    onlUserlist = list(onlUserObj.values_list('no_of_online_user', flat=True))
    bidlist = list(bidObj.values_list('no_of_bids', flat=True))
    title = 'Scatter Diagram of number of online User against number of bids (week {0}){1}'.format(weekno, year)
    ax.set_xlabel('Number of online Users')
    ax.set_ylabel('Number of Bids')
    fig.suptitle(title, fontsize=14)
    try:
        ax.scatter(onlUserlist, bidlist)
    except ValueError:
        pass
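The slide stops after the scatter call, so how the figure reaches the browser is not shown. One possible way, sketched here without Django so it runs on its own, is to render the figure to PNG bytes with the Agg canvas. The function name `scatter_png` and the sample numbers are assumptions for illustration, not the talk's code.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure


def scatter_png(xs, ys, title, xlabel, ylabel):
    """Render a scatter plot off-screen and return it as PNG bytes."""
    fig = Figure()
    ax = fig.add_subplot(111)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.suptitle(title, fontsize=14)
    try:
        ax.scatter(xs, ys)
    except ValueError:
        # xs and ys differ in length: fall back to empty axes.
        pass
    buffer = io.BytesIO()
    # The Agg canvas writes a PNG to any file-like object.
    FigureCanvas(fig).print_png(buffer)
    return buffer.getvalue()


# Hypothetical weekly figures: online users against bids.
png = scatter_png(
    [3, 5, 8],
    [12, 20, 31],
    "Online users against bids",
    "Number of online users",
    "Number of bids",
)
print(len(png), "bytes")
```

In a Django view, bytes like these could be returned with `HttpResponse(png, content_type="image/png")`, so the chart URL can be used directly in an `<img>` tag.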
19. Appendix 4
# Example of how old database records may be deleted to recover some space.
# From the folder named YAAS. Check task.py
from datetime import datetime, timedelta
# Celery imports assumed (2.x/3.x-era API); Item and Bid are the project's models.
from celery.schedules import crontab
from celery.task import periodic_task

@periodic_task(run_every=crontab(hour=1, minute=30, day_of_week=0))
def deleteOldItemsandBids():
    hundredandtwentydays = datetime.today() - timedelta(days=120)
    myItem = Item.objects.filter(end_date__lte=hundredandtwentydays).delete()
    myBid = Bid.objects.filter(end_date__lte=hundredandtwentydays).delete()

# populate the registereduser and onlineuser model at regular intervals
20. Appendix 5. Check the project in YAAS/stats/ for more
information on statistical processing.
21. Appendix 6
# How to refresh the views in Django to keep the charts updated.
# See the WebMonitor project.
{% extends "base.html" %}
{% block site_wrapper %}