Date posted: 27-Jan-2017
Category: Software
Uploaded by: mohit-jaggi
Building DSLs with Scala
Mohit Jaggi
Code Ninja and Big (Data) Troublemaker
Ayasdi
Who am I?
• Software engineer and architect
• Past life: built networking, security and application delivery appliances using ASICs, microcode, C, some C++. Operating system optional.
• This life: distributed systems, (big) data analytics, machine learning using Java and Scala. And real servers with a proper operating system!
Why do I love Scala?
• Less red tape
• Multi-paradigm
• Exploits the Java ecosystem
• Static typing and type inference
• Same language for everybody from casual script writer to application programmer to library designer
Who will find this useful?
• Beginner to intermediate Scala programmers
• If you write code, you have APIs
• Library designers
Agenda
• Motivation for DSLs
• Scala Constructs useful for DSLs
• Lessons from my limited experience
Motivation for DSLs
• API Design Considerations
• DSL
• Types of DSLs and tradeoffs
Considerations for a good API
• sufficient for current needs
• extensible for anticipated needs
• does not preclude unanticipated needs
• forward/backward compatibility
• easy to use
• little or no documentation required, but
• complete documentation available
DSL
• Domain-Specific Language
• helps with the ease-of-use part of API design
• e.g. SQL, R, jOOQ, bigdf
Types of DSLs
• External or standalone
• Internal or embedded
External DSL
• New language from “scratch”
• Write a parser using something like lex/yacc or ANTLR
• Write an interpreter or
• Write a code generator
• Manage a run-time environment (Java?, native?, REPL?)
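The parser-plus-interpreter route can be sketched on a very small scale. The block below is only an illustration, not something from the talk: the one-line filter language (`<column> <op> <integer>`), the names `ExternalDslDemo`, `Compare`, `parse`, `eval` and `run` are all hypothetical, and a regular expression stands in for a real parser generator such as lex/yacc or ANTLR.

```scala
object ExternalDslDemo {
  // AST for a hypothetical mini filter language: "<column> <op> <integer>"
  sealed trait Expr
  final case class Compare(column: String, op: String, value: Int) extends Expr

  // A regex stands in for a generated parser; it must match the whole input.
  private val Comparison = """\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d{1,9})\s*""".r

  // "Parser": turns source text into an AST, or reports why it could not.
  def parse(source: String): Either[String, Expr] = source match {
    case Comparison(column, op, value) => Right(Compare(column, op, value.toInt))
    case _ => Left(s"cannot parse '$source': expected <column> <op> <integer>")
  }

  // "Interpreter": evaluates the AST against one row of data.
  def eval(expr: Expr, row: Map[String, Int]): Either[String, Boolean] = expr match {
    case Compare(column, op, value) =>
      row.get(column).toRight(s"unknown column '$column'").map { actual =>
        op match {
          case ">"  => actual > value
          case ">=" => actual >= value
          case "<"  => actual < value
          case "<=" => actual <= value
          case _    => actual == value // the regex only lets "==" reach this case
        }
      }
  }

  def run(source: String, row: Map[String, Int]): Either[String, Boolean] =
    parse(source).flatMap(eval(_, row))
}
```

Even this toy version has to own its error messages and its evaluation strategy, which is the extra work an external DSL brings with it.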
Internal DSL
• “Internal” or “embedded” in a general-purpose programming language
• Designed to “feel like” a different language that is more intuitive to the domain
• e.g. jOOQ in Java/Scala feels like SQL
Tradeoffs
Internal
• Benefit: less work
• Benefit: more completeness
• Cost: less syntax flexibility
• Cost: “global optimizations” are hard
• Cost: error messages are less meaningful
External
• Benefit: more freedom in syntax
• Benefit: more “global optimizations” possible
• Benefit: possible to have better error messages
• Cost: more work
• Cost: typically slower without good code generation
Why Scala for DSLs?
With Scala's infix notation:
select(BOOK.ID) from BOOK where BOOK.TITLE === "Scala scala"
With plain method chaining (Java style):
select(BOOK.ID).from(BOOK).where(BOOK.TITLE.equal("Java java"))
Scala Constructs Useful For DSLs
Vanishing Method Call
• apply() and update() can be called without naming them
df("age") == df.apply("age") == df.column("age")
df("age") = df("age") + 1
is shorthand for
df.update("age", df("age") + 1)
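A minimal, self-contained sketch of the same desugaring follows. `MiniFrame` is a hypothetical stand-in for a data frame (it is not the bigdf API), storing each column as a `Vector[Int]`; the comments show the calls the compiler generates.

```scala
import scala.collection.mutable

object VanishingCallDemo {
  // Hypothetical stand-in for a data frame: column name -> values.
  final class MiniFrame {
    private val columns = mutable.Map.empty[String, Vector[Int]]

    // Enables the read syntax frame("name")
    def apply(name: String): Vector[Int] = columns(name)

    // Enables the write syntax frame("name") = values
    def update(name: String, values: Vector[Int]): Unit = columns(name) = values
  }

  val df = new MiniFrame
  df("age") = Vector(20, 30)        // df.update("age", Vector(20, 30))
  df("age") = df("age").map(_ + 1)  // df.update("age", df.apply("age").map(_ + 1))
  val ages: Vector[Int] = df("age") // df.apply("age")
}
```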
Simulated dynamic typing
• extend the Dynamic trait (in Scala 2, enable it with import scala.language.dynamics)
• misspelled member names are no longer compile errors and only surface at run time; is that cost worth paying?
df.age == df("age") == df.selectDynamic("age")
def selectDynamic(colName: String) = {
  val col = self.column(colName)
  if (col == null) logger.error(s"$colName does not match any DF API or column name")
  col
}
def updateDynamic(colName: String)(that: Column): Unit = update(colName, that)
df.age = df.age + 1
is shorthand for
df.updateDynamic("age")(df.age + 1)
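The following sketch shows the Dynamic mechanics in isolation. `Row` is a hypothetical class with Int-valued fields, chosen only to keep the example short; unlike the snippet above, it throws on an unknown field instead of logging and returning null.

```scala
import scala.language.dynamics
import scala.collection.mutable

object DynamicDemo {
  // Hypothetical record type whose "members" are looked up by name at run time.
  final class Row extends Dynamic {
    private val fields = mutable.Map.empty[String, Int]

    // row.age is rewritten to row.selectDynamic("age")
    def selectDynamic(name: String): Int =
      fields.getOrElse(name, throw new NoSuchElementException(s"no field named '$name'"))

    // row.age = v is rewritten to row.updateDynamic("age")(v)
    def updateDynamic(name: String)(value: Int): Unit = fields(name) = value
  }

  val row = new Row
  row.age = 41
  row.age = row.age + 1
  val age: Int = row.age
}
```

A misspelling such as `row.agee` still compiles and fails only when it runs, which is the cost the slide asks about.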
Dial 0 for operator
• the dot is optional, so methods look like operators
• almost any symbol can be used as an operator name
• for example, to list people who can buy beer you can use
df.where(df("age").gt(21))
df where (df("age") > 21)
• an operator ending in a colon is invoked on its right operand, which provides more freedom (~active vs passive voice)
object orange {
  def eat_:(who: String): Unit = println(s"$who is eating orange")
}
"monkey" eat_: orange
• be careful with operators like ==, which every object already defines; use === instead
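The three points above (symbolic method names, colon-terminated operators, and `===` instead of `==`) can be combined in one small sketch. Everything here is hypothetical: `Column`, `Query`, and the predicate case classes are illustrative names, not jOOQ or bigdf types.

```scala
object OperatorDemo {
  sealed trait Predicate
  final case class GreaterThan(column: String, value: Int) extends Predicate
  final case class EqualTo(column: String, value: Int) extends Predicate

  final case class Column(name: String) {
    // Symbolic method names read like operators at the call site.
    def >(value: Int): Predicate = GreaterThan(name, value)
    // === builds a predicate; == keeps its usual meaning and returns a Boolean.
    def ===(value: Int): Predicate = EqualTo(name, value)
  }

  final case class Query(filters: List[Predicate]) {
    def where(p: Predicate): Query = copy(filters = filters :+ p)
    // The name ends in a colon, so `p +: q` is invoked on the right operand: q.+:(p)
    def +:(p: Predicate): Query = copy(filters = p :: filters)
  }

  val age = Column("age")
  val q1 = Query(Nil) where (age > 21) // Query(Nil).where(age.>(21))
  val q2 = (age === 30) +: q1          // q1.+:(age.===(30))
  val plainEquals: Boolean = age == Column("age") // ordinary case-class equality
}
```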
Free companion object
• hides “new” when creating an object
val df = DF.fromCSVFile(…)
• also a good place to put implicit conversions
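As a rough sketch of the factory idea, the hypothetical `Table` class below makes its constructor private so that callers must go through the companion object. The names `Table` and `fromCsvLines` are assumptions for illustration, and the CSV handling is deliberately naive (no quoting or escaping).

```scala
object CompanionDemo {
  // The constructor is private: only the companion object can call `new`.
  final class Table private (val header: Vector[String], val rows: Vector[Vector[String]])

  object Table {
    // Factory method; the first line is treated as the header.
    def fromCsvLines(lines: Seq[String]): Table = {
      val cells = lines.map(_.split(",").map(_.trim).toVector).toVector
      new Table(cells.headOption.getOrElse(Vector.empty), cells.drop(1))
    }
  }

  val t = Table.fromCsvLines(Seq("name, age", "ann, 31"))
}
```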
Reading between the lines
• implicit conversion to “enrich” types
• can also use “implicit class”
• the newer construct “Value Classes” can sometimes replace this
2.MB == Bytes(2L * 1024 * 1024)   // `2 MB` also works once scala.language.postfixOps is enabled
class BytesMaker(n: Long) {
  def MB = Bytes(n * 1024 * 1024)
  def GB = Bytes(n * 1024 * 1024 * 1024)   // Long arithmetic; with Int this overflows from 2 GB
}
case class Bytes(n: Long) { def print = println(s"bytes=$n") }
object Bytes {
  import scala.language.implicitConversions
  implicit def intToBytesMaker(n: Int): BytesMaker = new BytesMaker(n)
}
// callers bring the conversion into scope with: import Bytes._
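The “implicit class” alternative mentioned in the bullets can be sketched as follows. This is one possible shape, not the talk's own code: `ByteUnits` is a hypothetical name, and extending `AnyVal` makes it a value class so the wrapper can usually be elided by the compiler.

```scala
object ByteUnitsDemo {
  final case class Bytes(n: Long)

  // Implicit class: adds KB/MB/GB to Int without a separate conversion method.
  // Extending AnyVal makes it a value class (usually no wrapper allocation).
  implicit final class ByteUnits(private val n: Int) extends AnyVal {
    def KB: Bytes = Bytes(n * 1024L)
    def MB: Bytes = Bytes(n * 1024L * 1024L)
    def GB: Bytes = Bytes(n * 1024L * 1024L * 1024L) // Long math avoids Int overflow
  }

  val twoMb: Bytes = 2.MB
  val threeGb: Bytes = 3.GB
}
```

Outside `ByteUnitsDemo`, the extension methods become available with `import ByteUnitsDemo._`.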
My name is Bond
• Pass-by-name parameters are useful for passing in “blocks of code”
def doTwice(action: => Unit): Unit = { action; action }
doTwice { println("hey") }
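To make the evaluation behaviour visible, the sketch below counts how often the block runs. `doTwice` follows the slide; `never` and the two counters are hypothetical additions for illustration.

```scala
object ByNameDemo {
  // `action` is re-evaluated every time it is referenced in the body.
  def doTwice(action: => Unit): Unit = { action; action }

  // A by-name argument that is never referenced is never evaluated.
  def never(action: => Unit): Unit = ()

  var ran = 0
  doTwice { ran += 1 }   // the block runs twice

  var skipped = 0
  never { skipped += 1 } // the block does not run at all
}
```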
Multiple Parameter Lists
• Help avoid unwanted commas and allow more intuitive parentheses/braces
def when(predicate: => Boolean)(action: => Unit): Unit = {
  if (predicate) action
}
when(1 == 1) { println("1") }
instead of
when(1 == 1, println("1"))
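The same pattern extends to other home-made control structures. In this sketch `repeat` and `unless` are hypothetical helpers written in the style of `when`, and `log` only exists to record what ran.

```scala
import scala.collection.mutable.ListBuffer

object ControlDemo {
  // First list: ordinary arguments. Second list: a by-name block, so the call
  // site can use braces and looks like built-in syntax.
  def repeat(times: Int)(action: => Unit): Unit =
    (1 to times).foreach(_ => action)

  def unless(predicate: => Boolean)(action: => Unit): Unit =
    if (!predicate) action

  val log = ListBuffer.empty[String]
  repeat(3) { log += "tick" }
  unless(log.isEmpty) { log += "done" }
}
```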
FWIW - API
• Don't provide unnecessary options. You can always add more later, but like salt in a dish, an option cannot be taken out once it is in
• Names should be meaningful to the user, not the coder
• Consistent coding style
• Return Option[T] if a value is truly optional; otherwise throw an exception
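One way to read the Option[T] guideline is to offer both flavours and let the caller pick. The sketch below assumes a hypothetical `Frame` class; `columnOption` and `column` are illustrative names, not bigdf methods.

```scala
object OptionDemo {
  final class Frame(columns: Map[String, Vector[Int]]) {
    // Absence is a normal outcome here, so the result type says so.
    def columnOption(name: String): Option[Vector[Int]] = columns.get(name)

    // Here a missing column is a caller error, so fail loudly with a useful message.
    def column(name: String): Vector[Int] =
      columns.getOrElse(name, throw new NoSuchElementException(s"no column named '$name'"))
  }

  val frame = new Frame(Map("age" -> Vector(21, 35)))
}
```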
FWIW - DSL
• Don’t take it to extremes (like “Baysick”)
• Assume users know some (or a lot of) Scala
• Try to generate useful error messages
• Be careful with ==, right-associative operators, and implicit search order
• Aim to provide a gentle slope into Scala rather than none at all; basic Scala is simple, and your API users will appreciate it
Thanks!
We are hiring
http://engineering.ayasdi.com
http://www.ayasdi.com/careers