Scala collections wizardry - Scalapeño

Date post: 06-Jul-2015
Upload: sagie-davidovich
Transcript
Page 1: Scala collections wizardry - Scalapeño

Scala collections

Sagie Davidovich

@mesagie

singularityworld.com

linkedin.com/in/sagied

Page 2: Scala collections wizardry - Scalapeño

Warm up example:

Fibonacci sequence

val fibs: Stream[Int] = 0 #:: fibs.scanLeft(1)(_ + _)

Key concepts:

• Recursive values

• Streams

• Scan

• Binary place-holder notation
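A minimal sketch of how the warm-up line fits together, assuming a Scala 2.x standard library where Stream is available (from Scala 2.13 on, LazyList replaces the deprecated Stream). The object name FibDemo and the `lazy` modifier are additions for illustration; `lazy` only makes the recursive value usable as a local definition too.

```scala
object FibDemo {
  // scanLeft emits the seed followed by every running total.
  val runningTotals: List[Int] = List(1, 2, 3, 4).scanLeft(0)(_ + _)

  // Recursive value: the stream refers to itself, and each element is
  // computed only when it is first requested.
  lazy val fibs: Stream[Int] = 0 #:: fibs.scanLeft(1)(_ + _)

  // Forcing a finite prefix of the infinite stream.
  val firstTen: List[Int] = fibs.take(10).toList
}
```

Here `_ + _` is the binary placeholder notation for `(a, b) => a + b`.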

Page 3: Scala collections wizardry - Scalapeño

Immutable collections

You’ll know about

• Avoid memory allocation for empty collections

• Optimize for small collections

• Equals-hashCode contract

• Asymptotic behavior of common operations

Page 4: Scala collections wizardry - Scalapeño

Nil

List.empty and Nil are singletons.

No new memory is allocated
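The singleton claim can be checked with reference equality (`eq`). The following is a small sketch; the object and value names are illustrative only.

```scala
object NilDemo {
  val emptyInts: List[Int] = List.empty[Int]
  val emptyStrings: List[String] = List.empty[String]

  // `eq` compares references, not contents.
  val sameAsNil: Boolean = emptyInts eq Nil
  val sharedAcrossTypes: Boolean = emptyInts eq emptyStrings

  // The tail of a one-element list is that same Nil object.
  val tailIsNil: Boolean = List(42).tail eq Nil
}
```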

Page 5: Scala collections wizardry - Scalapeño

Option[A]

Page 6: Scala collections wizardry - Scalapeño

Immutable Sets – emptySet

emptySet is a singleton too

Page 7: Scala collections wizardry - Scalapeño

Immutable Sets – Set1

Optimized for sets of size 1

Page 8: Scala collections wizardry - Scalapeño

Immutable Sets – Set2

Optimized for sets of size 2

Page 9: Scala collections wizardry - Scalapeño

Immutable Sets – Set4

A HashSet is (finally) instantiated
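To see the size-specialized implementations at work, one can inspect the runtime class of small sets. This is a sketch assuming a Scala 2.x standard library; exact class names are an implementation detail and may differ between versions, and the object name is illustrative.

```scala
import scala.collection.immutable.HashSet

object SmallSetDemo {
  // Runtime class names for sets holding 0 to 5 elements.
  val classNames: Seq[String] =
    (0 to 5).map(n => Set((1 to n): _*).getClass.getName)

  // The empty set is shared, whatever the element type.
  val emptyIsShared: Boolean = Set.empty[Int] eq Set.empty[String]

  // Up to four elements a dedicated class is used; the fifth element
  // switches the representation to a hash-based set.
  val fourIsHashSet: Boolean = Set(1, 2, 3, 4).isInstanceOf[HashSet[_]]
  val fiveIsHashSet: Boolean = Set(1, 2, 3, 4, 5).isInstanceOf[HashSet[_]]
}
```

Printing `SmallSetDemo.classNames` shows the progression from the empty set through the small-set classes to the hash set.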

Page 10: Scala collections wizardry - Scalapeño

Immutable Collections

Page 11: Scala collections wizardry - Scalapeño

Mutable Collections

Page 12: Scala collections wizardry - Scalapeño

One liners

Page 13: Scala collections wizardry - Scalapeño

Computing a derivative

// sliding yields windows (sequences), not tuples, so match on the two elements
def derivative(nums: Iterable[Double]) =
  nums.toSeq.sliding(2)
    .collect { case Seq(a, b) => b - a }

What can be improved in this solution?

Bonus question: change a few characters to find the max slope
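One possible answer to both questions, offered as a sketch and not as the speaker's intended solution: `sliding` produces collection windows, not tuples, and a one-element input yields a single short window, so pairing each element with its successor through `zip` avoids both issues. The names DerivativeDemo and maxSlope are illustrative.

```scala
object DerivativeDemo {
  // Pair each element with its successor; drop(1) is safe on empty
  // input, unlike tail.
  def derivative(nums: Seq[Double]): Seq[Double] =
    nums.zip(nums.drop(1)).map { case (a, b) => b - a }

  // Bonus: the largest slope, or None with fewer than two points.
  def maxSlope(nums: Seq[Double]): Option[Double] =
    derivative(nums).reduceOption(_ max _)
}
```

For example, `DerivativeDemo.derivative(Seq(1.0, 4.0, 9.0, 16.0))` gives the differences 3.0, 5.0 and 7.0.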

Page 14: Scala collections wizardry - Scalapeño

Counting occurrences (histogram)

"encyclopedia" groupBy identity mapValues (_.size)

Map(
  e -> 2, n -> 1, y -> 1, a -> 1, i -> 1,
  l -> 1, p -> 1, c -> 2, o -> 1, d -> 1
)
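A variant of the histogram one-liner, sketched with `map` in place of `mapValues`. In Scala 2.12 and earlier `mapValues` returns a lazy view that recomputes the function on every access, and in 2.13 it is deprecated, so a strict `map` is one way to get a plain Map on any 2.x version. The helper name histogram is illustrative.

```scala
object HistogramDemo {
  // Group equal characters, then replace each group by its length.
  def histogram(s: String): Map[Char, Int] =
    s.groupBy(identity).map { case (ch, occurrences) => ch -> occurrences.length }

  val counts: Map[Char, Int] = histogram("encyclopedia")
}
```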

Page 15: Scala collections wizardry - Scalapeño

Word n-grams

val range = 1 to 3

val text = "hello sweet world"

val tokenize = (s: String) => s.split(" ")

range flatMap (size => tokenize(text) sliding size)

Result:

Vector(Array(hello), Array(sweet), Array(world), Array(hello, sweet), Array(sweet, world), Array(hello, sweet, world))
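The result above holds arrays, which print awkwardly and compare by reference. A sketch that joins each window back into a string is shown below; the function name ngrams is hypothetical. Note that `sliding(n)` returns one short window when `n` exceeds the number of words.

```scala
object NGramDemo {
  // All word n-grams of the text for every size in `sizes`, as strings.
  def ngrams(text: String, sizes: Range): Seq[String] = {
    val words = text.split(" ").toVector
    sizes.flatMap(n => words.sliding(n).map(_.mkString(" ")))
  }

  val result: Seq[String] = ngrams("hello sweet world", 1 to 3)
}
```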

Page 16: Scala collections wizardry - Scalapeño

Are all members of a greater than the corresponding members of b?

val a = List(2,3,4)

val b = List(1,2,3)

// O(n^2) and not very elegant.

(0 until a.size) forall (i => a(i) > b(i))

// O(n) but creates tuples and a temporary list. Yet, more elegant.

a zip b forall (x=> x._1 > x._2)

// same as above but doesn't create a temporary list (lazy)

a.view zip b forall (x=> x._1 > x._2)

// O(n), without tuple or temporary list creation, and even more elegant.

(a corresponds b)(_ > _)
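The variants also differ in behaviour when the lengths do not match: `zip` truncates to the shorter sequence, while `corresponds` requires equal lengths. A short sketch with illustrative names:

```scala
object CorrespondsDemo {
  val a = List(2, 3, 4)
  val shorter = List(1, 2)

  // zip stops at the shorter list, so the element 4 is never compared.
  val viaZip: Boolean = (a zip shorter) forall { case (x, y) => x > y }

  // corresponds is false as soon as one side runs out before the other.
  val viaCorresponds: Boolean = (a corresponds shorter)(_ > _)
}
```

Which behaviour is wanted depends on whether a length mismatch should count as a failed comparison.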

Page 17: Scala collections wizardry - Scalapeño

Strings are collections. How come?

"abc".max

@inline implicit def augmentString(x: String) = new StringOps(x)

String <% StringOps <: StringLike <: IndexedSeqOptimized …
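Because of that implicit conversion, the usual collection methods are available on plain strings. A few examples, as a sketch with illustrative value names:

```scala
object StringOpsDemo {
  val largest: Char = "abc".max                    // characters compared by code point
  val upper: String = "abc".map(_.toUpper)         // mapping Char to Char yields a String
  val unique: String = "mississippi".distinct
  val countO: Int = "hello world".count(_ == 'o')

  // The same call with the conversion written out explicitly.
  val explicit: Char = Predef.augmentString("abc").max
}
```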

Page 18: Scala collections wizardry - Scalapeño

Complexity of collection operations

• Linear:

– Unary: O(n):

• Mappers: map, collect

• Reducers: reduce, foldLeft, foldRight

• Others: foreach, filter, indexOf, reverse, find, mkString

– Binary: O(n+ m):

• union, diff, and intersect
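A few of the listed operations in use, as a sketch with illustrative names. The set operations are shown on Set; on sequences, `diff` and `intersect` treat duplicates as multiset occurrences.

```scala
object LinearOpsDemo {
  // Unary: each call makes a single pass over the input.
  val collected: List[Int] = List(1, 2, 3, 4).collect { case n if n % 2 == 0 => n * 10 }
  val folded: Int = List(1, 2, 3, 4).foldLeft(0)(_ + _)

  // Binary: both operands are traversed.
  val xs = Set(1, 2, 3)
  val ys = Set(2, 3, 4)
  val all: Set[Int] = xs union ys
  val common: Set[Int] = xs intersect ys
  val onlyInXs: Set[Int] = xs diff ys

  // Multiset semantics on sequences: one occurrence of 1 is removed.
  val seqDiff: List[Int] = List(1, 1, 2) diff List(1)
  val seqIntersect: List[Int] = List(1, 1, 2) intersect List(1, 3)
}
```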

Page 19: Scala collections wizardry - Scalapeño

Immutable Collections time complexity

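          head  tail  apply  update  prepend  append
List      C     C     L      L       C        L
Stream    C     C     L      L       C        L
Vector    eC    eC    eC     eC      eC       eC
Stack     C     C     L      L       C        L
Queue     aC    aC    L      L       L        C
Range     C     C     C      -       -        -
String    C     L     C      L       L        L

C = constant, eC = effectively constant, aC = amortized constant, L = linear, - = not supported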
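The constant-time prepend on List comes from structural sharing: the new list is one cell pointing at the existing one. A small sketch of the difference between prepend and append, with illustrative names:

```scala
object ImmutableCostDemo {
  val xs = List(1, 2, 3)

  // Constant time: one new cell whose tail is the existing list.
  val prepended: List[Int] = 0 :: xs
  val sharesTail: Boolean = prepended.tail eq xs

  // Linear time: the cells of xs are copied to attach a new last element.
  val appended: List[Int] = xs :+ 4

  // Vector supports both ends and indexed update in effectively constant time,
  // always returning a new vector and leaving the original untouched.
  val v = Vector(1, 2, 3)
  val vAppended: Vector[Int] = v :+ 4
  val vUpdated: Vector[Int] = v.updated(0, 99)
}
```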

Page 20: Scala collections wizardry - Scalapeño

Mutable Collections time complexity

               head  tail  apply  update  prepend  append  insert
ArrayBuffer    C     L     C      C       L        aC      L
ListBuffer     C     L     L      L       C        C       L
StringBuilder  C     L     C      C       L        aC      L
MutableList    C     L     L      L       C        C       L
Queue          C     L     L      L       C        C       L
ArraySeq       C     L     C      C       -        -       -
Stack          C     L     L      L       C        L       L
ArrayStack     C     L     C      C       aC       L       L
Array          C     L     C      C       -        -       -
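As a sketch of how the table guides the choice of buffer: ArrayBuffer suits appends and indexed updates, ListBuffer suits growth at either end. The object name is illustrative.

```scala
import scala.collection.mutable.{ArrayBuffer, ListBuffer}

object MutableCostDemo {
  val array = ArrayBuffer(1, 2, 3)
  array += 4        // append: amortized constant
  array(0) = 10     // indexed update: constant

  val list = ListBuffer(1, 2, 3)
  list.prepend(0)   // prepend: constant
  list += 4         // append: constant
}
```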

Page 21: Scala collections wizardry - Scalapeño

Bonus question

What’s the complexity of Range.sum?

Page 22: Scala collections wizardry - Scalapeño

Range
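A Range is fully described by its start, end and step, so its sum can in principle be computed with the arithmetic-series formula in constant time. The sketch below illustrates that idea; it is a hypothetical helper, not the library's implementation, and whether `Range.sum` itself takes this shortcut depends on the Scala version.

```scala
object RangeSumDemo {
  // Arithmetic series: count * (first + last) / 2, computed in Long
  // to avoid Int overflow on large ranges.
  def rangeSum(r: Range): Long =
    if (r.isEmpty) 0L
    else r.length.toLong * (r.head.toLong + r.last.toLong) / 2
}
```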

Page 23: Scala collections wizardry - Scalapeño

Equals-hashCode contract

(a equals b) ⇒ (a.hashCode == b.hashCode)

All Scala collections implement the contract

Bad idea: Set[Array[Int]]

Good idea: Set[Vector[Int]]

Bad idea: Set[ArrayBuffer[Int]]

Bad idea: Set[collection.mutable._]

Good idea: Set[collection.immutable._]
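The Array case can be seen directly: arrays are plain JVM arrays and compare by reference, so a set cannot find an equal-looking array, whereas Vector compares by content. A minimal sketch with illustrative names:

```scala
object EqualityDemo {
  // Arrays use reference equality; sameElements compares contents.
  val arraysEqual: Boolean = Array(1, 2) == Array(1, 2)
  val sameContents: Boolean = Array(1, 2) sameElements Array(1, 2)

  val arraySet: Set[Array[Int]] = Set(Array(1, 2))
  val vectorSet: Set[Vector[Int]] = Set(Vector(1, 2))

  // Lookup with a fresh but equal-looking element.
  val arrayFound: Boolean = arraySet.contains(Array(1, 2))
  val vectorFound: Boolean = vectorSet.contains(Vector(1, 2))
}
```

Mutable collections do compare by content, but their hash code changes when they are mutated, which is why they are also a poor choice as set elements.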

Page 24: Scala collections wizardry - Scalapeño

More on collections equality

val (a, b) = (1 to 3, List(1, 2, 3))

a == b // true

Q: Wait, how efficient is Range.hashCode?

A: O(n)

override def hashCode = util.hashing.MurmurHash3.seqHash(seq)

Challenge yourself:

Is there a closed-form (O(1)) formula for a Range's hashCode?
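Since sequences with the same elements in the same order are equal regardless of their concrete class, the contract forces their hash codes to agree as well. Any constant-time Range hash therefore has to produce the same value as the element-by-element hash of an equal List. A short sketch of that constraint, with illustrative names:

```scala
object SeqEqualityDemo {
  val range = 1 to 3
  val list = List(1, 2, 3)
  val vector = Vector(1, 2, 3)

  // Equality across different Seq implementations.
  val allEqual: Boolean = range == list && list == vector

  // Equal values must report equal hash codes.
  val sameHash: Boolean =
    range.hashCode == list.hashCode && list.hashCode == vector.hashCode

  // All three collapse into a single set element.
  val distinctCount: Int = Set[Seq[Int]](range, list, vector).size
}
```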

Page 25: Scala collections wizardry - Scalapeño

Java interoperability

Implicit (less boilerplate):

import collection.JavaConversions._

javaCollection.filter(…)

Explicit (better control):

import collection.JavaConverters._

javaCollection.asScala.filter(…)

scalaCollection.asJava
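A small round trip using the explicit converters, as a sketch. It assumes a Scala 2.x library where scala.collection.JavaConverters is available; in Scala 2.13 that object is deprecated in favour of scala.jdk.CollectionConverters, and the implicit JavaConversions variant was removed. The value names are illustrative.

```scala
import scala.collection.JavaConverters._

object InteropDemo {
  val javaList: java.util.List[String] = java.util.Arrays.asList("a", "bb", "ccc")

  // asScala wraps the Java list without copying, then Scala methods apply.
  val longWords: List[String] = javaList.asScala.filter(_.length > 1).toList

  // asJava wraps a Scala collection for Java APIs.
  val backToJava: java.util.List[Int] = List(1, 2, 3).asJava
}
```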

Page 26: Scala collections wizardry - Scalapeño

The power of type-level programming

Graph path-finding at compile time

import scala.language.implicitConversions

// Vertices
case class A(l: List[Char])
case class B(l: List[Char])
case class C(l: List[Char])
case class D(l: List[Char])
case class E(l: List[Char])

// Edges
implicit def ad[A1 <% A](x: A1) = D(x.l :+ 'A')
implicit def bc[B1 <% B](x: B1) = C(x.l :+ 'B')
implicit def ce[C1 <% C](x: C1) = E(x.l :+ 'C')
implicit def ea[E1 <% E](x: E1) = A(x.l :+ 'E')

def pathFrom(end:D) = end

pathFrom(B(Nil)) // res0: D = D(List(B, C, E, A))
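To make the result easier to follow, here is the same path written out by hand with ordinary functions, as a sketch. The compiler's implicit search effectively discovers this chain B → C → E → A → D on its own; the explicit composition below is an illustration of the outcome, not of the search mechanism, and the object name is hypothetical.

```scala
object PathDemo {
  // Vertices
  case class A(l: List[Char])
  case class B(l: List[Char])
  case class C(l: List[Char])
  case class D(l: List[Char])
  case class E(l: List[Char])

  // Edges as plain functions; each appends the label of the vertex it leaves.
  val bc: B => C = x => C(x.l :+ 'B')
  val ce: C => E = x => E(x.l :+ 'C')
  val ea: E => A = x => A(x.l :+ 'E')
  val ad: A => D = x => D(x.l :+ 'A')

  // The path from B to D, composed explicitly.
  val path: B => D = bc andThen ce andThen ea andThen ad

  val result: D = path(B(Nil))
}
```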

Page 27: Scala collections wizardry - Scalapeño

Want to go Pro?

• Shapeless (Miles Sabin)

– Polytypic programming & Heterogeneous lists

– github.com/milessabin/shapeless

• Scalaxy (Olivier Chafik)

– Macros for boosting performance of collections

– github.com/ochafik/Scalaxy
