Staging Parser Combinators for Efficient Data Processing

Manohar Jonnalagedda

Parsing @ SLE, 14 September 2014


What are they good for?

● Composable
  ○ Each combinator builds a new parser from a previous one
● Context-sensitive
  ○ We can make decisions based on a specific parse result
● Easy to write
  ○ DSL-style of writing
  ○ Tight integration with host language


Example: HTTP Response

HTTP/1.1 200 OK
Date: Mon, 23 May 2013 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2012 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Content-Type: text/html; charset=UTF-8
Content-Length: 129
Connection: close

... payload ...

The response has three parts:

● Status: the first line
● Headers: the "Name: value" lines
● Content: the payload


Example: HTTP Response

// Transform parse results on the fly
def status = (
  ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
) map (_.toInt)

// Make decision based on parse result
def header = (headerName <~ ":") flatMap {
  key => (valueParser(key) <~ crlf) map {
    value => (key, value)
  }
}

// Make decision based on parse result
def respWithPayload = response flatMap {
  r => body(r.contentLength)
}
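The context-sensitive pattern in respWithPayload, where an earlier parse result decides what the next parser consumes, can be sketched in a few lines. This is a minimal model, not the library from the talk: the Parser type alias, number, take and lengthPrefixed are all hypothetical names, and the input format ("3:abcdef") is invented for illustration.

```scala
object ContextSensitiveSketch {
  // Assumed minimal model: a parser maps input to an optional (result, rest) pair.
  type Parser[T] = String => Option[(T, String)]

  // Run p, then use its result to choose the parser for the remaining input.
  def flatMap[T, U](p: Parser[T])(f: T => Parser[U]): Parser[U] =
    in => p(in).flatMap { case (t, rest) => f(t)(rest) }

  // Reads a run of leading digits as an Int.
  val number: Parser[Int] = in => {
    val digits = in.takeWhile(_.isDigit)
    if (digits.isEmpty) None else Some((digits.toInt, in.drop(digits.length)))
  }

  // Reads exactly n characters.
  def take(n: Int): Parser[String] = in =>
    if (in.length >= n) Some((in.take(n), in.drop(n))) else None

  // The length parsed first decides how many characters the last parser reads,
  // just as contentLength decides the size of the body.
  val lengthPrefixed: Parser[String] =
    flatMap(number)(n => flatMap(take(1))(_ => take(n)))
}
```

For example, ContextSensitiveSketch.lengthPrefixed("3:abcdef") reads the length 3, skips the separator, and returns "abc" with "def" left over.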


Parser combinators are slow

[Chart: Throughput of Standard Parser Combinators vs. Staged Parser Combinators, with a 20x difference; annotated "Topic of this talk."]


Parser Combinators are slow

def status: Parser[Int] = (
  ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
) map (_.toInt)

def header = (headerName <~ ":") flatMap {
  key => (valueParser(key) <~ crlf) map {
    value => (key, value)
  }
}

def respWithPayload = response flatMap {
  r => body(r.contentLength)
}

class Parser[T] extends (Input => ParseResult[T]) ...
  def ~[U](that: Parser[U]) = new Parser[(T,U)] { def apply(i: Input) = ... }
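To make the overhead concrete, here is a minimal sketch of unstaged combinators in the shape the slide outlines. It is not the Scala standard library implementation: Input as a String, the Success/Failure result types and the char parser are assumptions made for the example. Each call to ~ or map allocates a fresh Parser object, and every parse goes through a chain of apply calls.

```scala
object PlainCombinators {
  type Input = String // assumption: the remaining input is a plain String

  sealed trait ParseResult[+T]
  case class Success[T](value: T, rest: Input) extends ParseResult[T]
  case object Failure extends ParseResult[Nothing]

  abstract class Parser[T] extends (Input => ParseResult[T]) { self =>
    // Sequencing: allocates a new Parser that calls both operands at parse time.
    def ~[U](that: Parser[U]): Parser[(T, U)] = new Parser[(T, U)] {
      def apply(in: Input): ParseResult[(T, U)] = self(in) match {
        case Success(a, rest) => that(rest) match {
          case Success(b, rest2) => Success((a, b), rest2)
          case Failure           => Failure
        }
        case Failure => Failure
      }
    }

    // Result transformation: another allocation and another indirection.
    def map[U](f: T => U): Parser[U] = new Parser[U] {
      def apply(in: Input): ParseResult[U] = self(in) match {
        case Success(a, rest) => Success(f(a), rest)
        case Failure          => Failure
      }
    }
  }

  // Hypothetical primitive: accepts exactly the character c.
  def char(c: Char): Parser[Char] = new Parser[Char] {
    def apply(in: Input): ParseResult[Char] =
      if (in.nonEmpty && in.head == c) Success(c, in.tail) else Failure
  }
}
```

A parser such as (char('a') ~ char('b')).map { case (x, y) => s"$x$y" } builds four Parser objects before any input is read, and parsing "abc" walks through all of them.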


Parser Combinators are slow

● Prohibitive composition overhead
● But: composition is mostly static
  ○ Let us systematically remove it!


Staged Parser Combinators

Composition of Parsers

Composition of Code Generators


Staging (LMS)

‘Classic’ evaluation:

def add3(a: Int, b: Int, c: Int) = a + b + c
add3(1, 2, 3)   // evaluates to 6

Adding Rep types:

def add3(a: Rep[Int], b: Int, c: Int) = a + b + c

● a: Rep[Int] is an expression in the next stage
● b: Int is executed at staging time and is a constant in the next stage
● c: Int is executed at staging time and is a constant in the next stage

Code generation:

add3(x, 2, 3)   // generates: def add$3$2$3(a:Int) = a + 5

Evaluation of generated code:

add$3$2$3(1)
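The add3 example can be imitated without LMS to show where the constant 5 comes from. The sketch below is a hypothetical mini staging layer, not the LMS API: Rep is modelled as a small expression tree, plus is a hand-written smart constructor that folds staging-time constants, and generate prints the residual function as a string.

```scala
object StagingSketch {
  sealed trait Rep
  case class Sym(name: String) extends Rep     // value known only in the next stage
  case class Const(value: Int) extends Rep     // value known at staging time
  case class Plus(l: Rep, r: Rep) extends Rep  // addition deferred to the next stage

  // Smart constructor: folds constants at staging time where it can.
  def plus(l: Rep, r: Rep): Rep = (l, r) match {
    case (Const(a), Const(b))          => Const(a + b)
    case (Plus(x, Const(a)), Const(b)) => Plus(x, Const(a + b))
    case _                             => Plus(l, r)
  }

  // a is a next-stage expression; b and c are ordinary staging-time Ints.
  def add3(a: Rep, b: Int, c: Int): Rep = plus(plus(a, Const(b)), Const(c))

  // Pretty-prints an expression as next-stage source code.
  def show(e: Rep): String = e match {
    case Sym(n)     => n
    case Const(v)   => v.toString
    case Plus(l, r) => s"${show(l)} + ${show(r)}"
  }

  // Emits the source of the residual function.
  def generate(name: String, param: String, body: Rep): String =
    s"def $name($param: Int) = ${show(body)}"
}
```

Calling StagingSketch.generate("add3_2_3", "a", StagingSketch.add3(StagingSketch.Sym("a"), 2, 3)) yields the text of a function whose body is a + 5: the additions involving b and c have already happened at staging time.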


LMS

User-written code, may contain Rep types

LMS runtime code generation

Generated/optimized code.


Staging Parser Combinators

Standard parser combinators:

class Parser[T] extends (Input => ParseResult[T])
def ~[U](that: Parser[U])
def map[U](f: T => U): Parser[U]
def flatMap[U](f: T => Parser[U]): Parser[U]

Composition of Code Generators:

class Parser[T] extends (Rep[Input] => Rep[ParseResult[T]])
def ~[U](that: Parser[U])
def map[U](f: Rep[T] => Rep[U]): Parser[U]
def flatMap[U](f: Rep[T] => Parser[U]): Parser[U]   // still a code generator

Slide annotations on the staged signatures:

● dynamic inputs
● dynamic input/output
● static function: application == inlining for free
● still a code generator
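The annotation "static function: application == inlining for free" can be illustrated with a deliberately simplified stand-in for LMS. In this sketch Code is just a String playing the role of Rep[_], and Parser, anyChar and digit are hypothetical names; failure handling is left out entirely. The point is only that the function passed to map runs while code is being generated, so its body ends up inlined in the output.

```scala
object InliningSketch {
  type Code = String // stands in for Rep[_]: a fragment of next-stage code

  // A staged parser is an ordinary Scala function over code fragments.
  // It exists only at staging time and never appears in the generated code.
  final case class Parser(gen: Code => Code) {
    // f is applied during code generation, so its result is spliced into the
    // output instead of being called through a closure at parse time.
    def map(f: Code => Code): Parser = Parser(in => f(gen(in)))
  }

  // Generates code that reads the first character of the input.
  val anyChar: Parser = Parser(in => s"$in.charAt(0)")

  // Generates code that converts that character to its digit value.
  val digit: Parser = anyChar.map(c => s"($c - '0')")
}
```

InliningSketch.digit.gen("input") produces a single expression over input with no trace of Parser, map or the lambda in it.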


A closer look

User-written parser:

def respWithPayload: Parser[..] = response flatMap { r => body(r.contentLength) }

Generated code (after code generation):

// code for parsing response
val response = parseHeaders()
val n = response.contentLength
// parsing body
var i = 0
while (i < n) {
  readByte()
  i += 1
}


Gotchas

● Recursion
  ○ explicit recursion combinator (fix-point like)
● Diamond control flow
  ○ code generation blowup

General solution
  ○ generate staged functions (Rep[Input => ParseResult])
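The diamond control flow problem can be reproduced with a toy string-based code generator. Everything here is an assumption made for illustration (the continuation-passing Parser, lit, altInline, altShared, and the pseudo-code they emit); it is not how LMS represents code. When an alternative inlines the rest of the parser into both branches, each alternative in a sequence doubles the output. Emitting the rest once as a next-stage function and calling it from both branches keeps the output linear.

```scala
object DiamondSketch {
  type Code = String

  // gen(k) returns code that runs this parser and then the continuation code k.
  final case class Parser(gen: Code => Code) {
    def ~(that: Parser): Parser = Parser(k => gen(that.gen(k)))
  }

  def lit(c: Char): Parser = Parser(k => s"expect('$c'); $k")

  // Inlining alternative: the continuation k is pasted into both branches.
  def altInline(p: Parser, q: Parser): Parser =
    Parser(k => s"if (peek()) { ${p.gen(k)} } else { ${q.gen(k)} }")

  private var fresh = 0

  // Sharing alternative: k is emitted once as a function, then called twice.
  def altShared(p: Parser, q: Parser): Parser = Parser { k =>
    fresh += 1
    val call = s"k$fresh()"
    s"def $call = { $k }; if (peek()) { ${p.gen(call)} } else { ${q.gen(call)} }"
  }

  // Counts how often marker occurs in the generated code.
  def count(code: Code, marker: String): Int =
    code.sliding(marker.length).count(_ == marker)
}
```

With three alternatives in sequence, the inlining version repeats the final continuation eight times, while the sharing version emits it once.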


Performance: Parsing JSON

● 20 times faster than Scala’s parser combinators

● 3 times faster than Parboiled2


Performance

[Charts: HTTP Response, CSV]


If you want to know more

● Parser Combinators for Dynamic Programming [OOPSLA ‘14]
  ○ based on ADP
  ○ code gen for GPU
● Using Scala Macros [Scala ‘14]


Desirable Parser Properties

                    Hand-written   Parser Generators   Staged Parser Combinators
Composable          X              ✓                   ✓
Customizable        X              X                   ✓
Context-Sensitive   ✓              ~                   ✓
Fast                ✓              ✓                   ✓
Easy to write       X              ✓                   ✓


The people

● Eric Béguet
● Thierry Coppey
● Sandro Stucki
● Tiark Rompf
● Martin Odersky


Thanks! Questions?


Staging all the way down

● Staged structs
  ○ boxing of temporary results eliminated
● Staged strings
  ○ substring not computed all the time


Optimizing String handling

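class InputWindow[Input](val in: Input, val start: Int, val end: Int) {
  // Two windows are equal when they cover the same range of the same input.
  override def equals(x: Any) = x match {
    case s: InputWindow[_] =>   // the type argument is erased at runtime
      s.in == in &&
      s.start == start &&
      s.end == end
    case _ => super.equals(x)
  }
  // Keep hashCode consistent with equals.
  override def hashCode = (in, start, end).hashCode
}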


Beware!

● String.substring runs in linear time (since Java 7 update 6).
● Parsers on Strings are therefore inefficient.
● Need to use a FastCharSequence which mimics the original behaviour of substring.
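One way such a wrapper could look is sketched below. Only the name FastCharSequence comes from the slide; the constructor, fields and bounds handling are assumptions. The key property is that subSequence creates a new view over the same character array in constant time instead of copying the characters.

```scala
// Hypothetical sketch: a CharSequence view over a shared character array.
final class FastCharSequence(chars: Array[Char], start: Int, end: Int)
    extends CharSequence {

  // Copies the string once; every later subSequence shares this array.
  def this(s: String) = this(s.toCharArray, 0, s.length)

  def length(): Int = end - start

  def charAt(i: Int): Char = {
    if (i < 0 || i >= length()) throw new IndexOutOfBoundsException(i.toString)
    chars(start + i)
  }

  // Constant time: only the window boundaries change, no characters are copied.
  def subSequence(from: Int, until: Int): CharSequence = {
    if (from < 0 || until > length() || from > until)
      throw new IndexOutOfBoundsException(s"$from, $until")
    new FastCharSequence(chars, start + from, start + until)
  }

  // Materialises the window as a String (linear time, done only on demand).
  override def toString: String = new String(chars, start, length())
}
```

For instance, new FastCharSequence("HTTP/1.1 200 OK").subSequence(9, 12) is a three-character view that prints as "200" without the input having been copied for the slice.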


Key performance impactors

[Chart: one bar per configuration; the speedup labels are listed with the configuration they were introduced alongside]

● Standard Parser Combinators
● Standard Parser Combinators with FastCharSequence
● FastParsers with error reporting and without inlining: ~7-8x
● FastParsers without error reporting without inlining: ~2x
● FastParsers without error reporting with inlining: ~30%

