Introduction to Julia (Short Version)
Julia Language for Data Science, Machine Learning, Statistics, Mathematics
Presented by Diego Marinho de Oliveira, Lead Data Scientist
Date: 2016-08-25
Summary
Introduction
Interactive/Development Environments
Syntax
Libraries Highlights
Code Examples
Integration
Community
Important Events/Projects
Introduction
Julia is a high-level, high-performance dynamic programming language for technical computing
Suitable for data science, research, web applications, and scientific computing, among others
Multi-paradigm: multiple dispatch ("object-oriented"), procedural, functional, meta, multistaged
High-performance JIT compiler
First appeared in 2012
Stable release: 0.4.6 (June 2016)
Julia paper: http://arxiv.org/pdf/1411.1607v3.pdf
Introduction: How to Install Julia
Install Julia version 0.4.6 (latest stable version)
Windows
https://s3.amazonaws.com/julialang/bin/winnt/x64/0.4/julia-0.4.6-win64.exe
Mac OS X
https://s3.amazonaws.com/julialang/bin/osx/x64/0.4/julia-0.4.6-osx10.7+.dmg
Ubuntu
sudo add-apt-repository ppa:staticfloat/juliareleases
sudo add-apt-repository ppa:staticfloat/julia-deps
sudo apt-get update
sudo apt-get install julia
For more details check http://julialang.org/downloads/platform.html
Interactive/Development Environments
Julia Studio / Forio
Julia REPL
Kaggle Kernels
IJulia
JuliaBox
Atom
julia-vim
Syntax: Getting Started
Running Julia code
Call a Julia script
julia script.jl arg1 arg2…
Run a Julia command from the shell
julia -e 'for x in ARGS; println(x); end' foo bar
Run code at startup via ~/.juliarc.jl (UTF-8 characters are supported)
echo 'println("Greetings! 你好! 안녕하세요?")' > ~/.juliarc.jl
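The script call above passes its arguments to Julia in the global array ARGS. A minimal sketch of what such a script.jl could contain follows; the helper name format_args is hypothetical and used only for illustration.

```julia
# script.jl: a hypothetical script that echoes its command-line arguments.
# `format_args` is an illustrative helper, not part of Julia itself.
format_args(args) = ["argument $i: $a" for (i, a) in enumerate(args)]

# ARGS holds whatever was typed after the script name on the command line.
for line in format_args(ARGS)
    println(line)
end
```

Running julia script.jl foo bar would print one line per argument; with no arguments the loop body never runs.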
Syntax: Variables
Example Variables
An integer variable
x = 10
A float variable
y = x + 2.0
A Unicode variable name
𝛔 = 1
Built-in Types
Int8, Int16, Int32, Int64, Int128, UInt8, UInt16, UInt32, UInt64, UInt128: 103
Bool: false, true
AbstractString: "Data!"
Char: 'Z'
Float16, Float32, Float64
Complex: 1 + 2im
Rational: 5//6
Some math functions
round(Int, 76.0)
floor, ceil, trunc, eps, ...
div, rem, mod, gcd, lcm, ...
abs, sqrt, cbrt, exp, log, log2, ...
sin, cos, tan, cot, sec, hypot, ...
beta, gamma, eta, zeta, ...
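As a rough illustration of how the built-in types and math functions listed above interact, here is a small sketch. The variable names are arbitrary, and the printed values in the comments follow from integer, rational, and exactly representable float arithmetic.

```julia
# The type of a value follows from its literal or from the operation that produced it.
x = 10          # integer literal
y = x + 2.0     # mixing an integer with a Float64 promotes the result to Float64
q = 5 // 6      # exact rational number
c = 1 + 2im     # complex number with integer parts

println(typeof(y))        # Float64
println(q + 1 // 6)       # 1//1
println(abs2(c))          # 5, the squared magnitude 1^2 + 2^2

# A few of the listed math functions
println(round(Int, 76.0)) # 76
println(div(7, 2))        # 3, integer division
println(rem(-7, 2))       # -1, remainder takes the sign of the dividend
println(mod(-7, 2))       # 1, modulus takes the sign of the divisor
println(gcd(12, 18))      # 6
println(lcm(4, 6))        # 12
```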
Syntax: Strings
Types
AbstractString
UTF8String
ASCIIString
Char
Simple Usage Examples
Using a char
a = 'A'
Simple string
b = "for Data Science"
Interpolation
println("Julia is $b")
Regular expression
match(r"(\W\w+){1,2}", b)
Unicode usage
println("\u2200 x \u2203 y")
Triple quote
json = """{ "Id": 10232}"""
Concatenation
"Let's" * " code"
Repeat a string
repeat("Julia", 10)
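The string operations above can be combined in a short session. The following is a sketch only; the regular expression is an illustrative pattern chosen here, not the one from the slide.

```julia
b = "for Data Science"

# Interpolation: $b is replaced by the value of b.
sentence = "Julia is $b"
println(sentence)               # Julia is for Data Science

# Regular expression: capture the last two words of b.
m = match(r"(\w+) (\w+)$", b)
println(m.match)                # Data Science
println(m.captures[1])          # Data

# Concatenation uses *, repetition uses repeat.
title = "Let's" * " code"
println(title)                  # Let's code
println(repeat("Ju", 3))        # JuJuJu

# length counts characters, not bytes, so Unicode symbols count once each.
println(length("\u2200x"))      # 2
```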
Syntax: Functions, Control Flow
Functions
Basic function definition
function f(x,y)
    x + y
end
Terse function definition
f(x,y) = x+y
Optional and keyword args (y: optional; a: keyword)
f(x, y=1; a=3) = 1
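To make the difference between optional and keyword arguments concrete, and to show the multiple dispatch mentioned in the introduction, here is a minimal sketch. The function names scale and describe are hypothetical and chosen only for this example.

```julia
# x is required, y is optional (positional, default 1), a is a keyword argument (default 3).
scale(x, y=1; a=3) = a * (x + y)

println(scale(2))            # 9,  uses y = 1 and a = 3
println(scale(2, 4))         # 18, uses a = 3
println(scale(2, 4, a=10))   # 60

# Multiple dispatch: Julia picks the method whose signature best matches the argument types.
describe(x::Integer) = "integer"
describe(x::AbstractFloat) = "float"
describe(x) = "something else"

println(describe(1))         # integer
println(describe(1.0))       # float
println(describe("one"))     # something else
```

Keyword arguments come after the semicolon in the definition and must be passed by name at the call site.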
Control Flow
Compound expressions
Z = begin
    f(x,y)
    x + 1
    x + y
end
Repeated evaluation (loops)
while x < 1
for x=1:10
Short-circuit evaluation
&&, ||
Conditional evaluation
if x < 10
    x += 1
elseif 10 <= x < 12
    x += 2
else
    x += 3
end
Exception handling
try - catch
Tasks (coroutines)
yieldto
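The control-flow constructs above can be seen working together in a small sketch. The helper names bump, count_up, safe_sqrt, and parse_or_zero are hypothetical; bump simply wraps the conditional shown on the slide.

```julia
# Conditional evaluation: the if/elseif/else chain from the slide, wrapped in a function.
function bump(x)
    if x < 10
        x += 1
    elseif 10 <= x < 12
        x += 2
    else
        x += 3
    end
    return x
end

# Repeated evaluation: keep bumping until the limit is reached.
function count_up(limit)
    n = 0
    while n < limit
        n = bump(n)
    end
    return n
end

# Short-circuit evaluation: the right-hand side runs only when the left side is true.
function safe_sqrt(x)
    x < 0 && return nothing
    return sqrt(x)
end

# Exception handling: parse throws on invalid input, so fall back to 0.
function parse_or_zero(s)
    try
        return parse(Int, s)
    catch
        return 0
    end
end

println(count_up(12))          # 12
println(safe_sqrt(9))          # 3.0
println(parse_or_zero("abc"))  # 0
```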
Syntax: Types, Parallel and Packages
Types
Abstract type
abstract Integer <: Real
Create a composite type
type Point
    x::Float64
    y::Float64
end
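A composite type becomes useful once functions are defined on it. The sketch below is written in the later Julia 1.x spelling, where the type keyword was renamed to mutable struct; on Julia 0.4.6 the declaration would read type Point ... end as on the slide. The distance function is a hypothetical helper.

```julia
# Julia 1.x spelling (assumption); on 0.4.6 write `type Point ... end` instead.
mutable struct Point
    x::Float64
    y::Float64
end

# Hypothetical helper: Euclidean distance between two points.
distance(p::Point, q::Point) = sqrt((p.x - q.x)^2 + (p.y - q.y)^2)

p = Point(0, 0)   # the default constructor converts the integers to Float64
q = Point(3, 4)

println(distance(p, q))   # 5.0

# Fields of a mutable composite type can be reassigned.
p.x = 3.0
println(distance(p, q))   # 4.0
```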
Parallel
Execute a parallel command
nheads = @parallel (+) for i=1:200000000
    Int(rand(Bool))
end
Packages
Show status
Pkg.status()
Install a new package
Pkg.add("<Package Name>")
Remove a package
Pkg.rm("<Package Name>")
Install from GitHub
Pkg.clone("<Package Git URL>")
Update packages
Pkg.update()
Library Highlights
● DataFrames.jl (data analysis)
● Gadfly.jl (data visualization)
● Vega.jl (data visualization)
● HypothesisTests.jl (statistics)
● XGBoost.jl (machine learning)
● GLM.jl (machine learning)
● Mocha.jl (machine learning/deep learning)
● LowRankModels.jl (machine learning)
● JuMP.jl (optimization)
Code Examples
Simple Example: Preprocess and Plot Data
Import packages
using RDatasets, DataFrames, Gadfly
Load some data
mtcars = dataset("datasets", "mtcars")
Filter models by horsepower (keep all columns so HP is still available for plotting)
mtcars = mtcars[mtcars[:HP] .> 100, :]
Plot
plot(mtcars, x=:Model, y=:HP, Geom.bar)
Integrations
Python: PyCall
Use math from Python
using PyCall
@pyimport math
math.sin(math.pi / 4) - sin(pi / 4)
Use matplotlib from Python
@pyimport matplotlib.pyplot as plt
x = linspace(0,2*pi,1000); y = sin(3*x + 4*cos(2*x));
plt.plot(x, y, color="red", linewidth=2.0, linestyle="--")
plt.show()
Integrations
R: RCall
Community
Julia community groups: Lang, Stats, Opt, Parallel, DB, Quantum, Astro, GPU, Finance, Sparse, Math
...among others
Important Events/Projects