Mind is Software

Ying’s thoughts about software and business

Learning Scala

This is a note on learning Scala, based on the Tour of Scala. The name “Scala” is short for SCAlable LAnguage. It was created in 2003 as a multi-paradigm (OO + functional) programming language that is concise, elegant, and type-safe.

Highlights

Scala is a pure OO language because every value is an object. Types and behavior of objects are described by classes and traits. A class can have one superclass and many mixins (traits) to add more data and behaviors.

Scala is a functional language in the sense that every function is a value. It has a concise syntax for defining anonymous functions. It supports higher-order functions, nested functions/methods, and currying. It uses case classes and pattern matching to model algebraic types. It uses singleton objects to group functions. It has built-in regular expression patterns, extractor objects, and for comprehensions for writing concise and elegant code.

Scala is statically typed with an expressive type system that has the following features:

  • generic classes
  • variance annotations
  • upper and lower type bounds
  • inner classes and abstract type members as object members
  • compound type
  • explicitly typed self reference
  • implicit parameters and conversions
  • polymorphic methods
  • type inference

Scala is extensible with meta-programming facilities such as implicit classes and string interpolation.

Basics

You can combine expressions in a block inside a pair of curly brackets {}.

Functions are expressions that take parameters. A function is defined as (parameter-list) => expression. It can be anonymous or assigned to a value. A function literal is an anonymous function. Scala uses the traits Function0, Function1, Function2, and so on to represent function literals, based on the number of input arguments. Placeholder syntax is a shortened form of function literals that replaces named parameters with the wildcard operator _. It can be used only when the explicit type of the function is specified outside the literal and each parameter is used only once. Instead of val doubler: Int => Int = x => x * 2, you write val doubler: Int => Int = _ * 2.
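The function forms described above can be sketched as follows. The object name FunctionBasics and the value names are made up for illustration; only doubler comes from the note itself.

```scala
object FunctionBasics {
  // A function literal assigned to a value; its type Int => Int is Function1[Int, Int].
  val doubler: Int => Int = x => x * 2

  // The same function with placeholder syntax; the type annotation supplies the parameter type.
  val doublerShort: Int => Int = _ * 2

  // A two-parameter function (Function2); each _ stands for one parameter, in order.
  val add: (Int, Int) => Int = _ + _
}
```

Calling FunctionBasics.doubler(4) and FunctionBasics.doublerShort(4) both give 8.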

Methods are defined with the def keyword followed by a name, parameter lists, a return type, and a body. A method can take multiple parameter lists or no parameter list at all. When a method is called with fewer parameter lists than it declares, the result is a function that takes the missing parameter lists as its arguments. This is called partial application. Methods can be nested. In places that expect a function, such as the function argument of a higher-order function, methods are coerced into functions. To explicitly convert a method to a function, use the wildcard operator _, as in val <identifier> = <method name> _.
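A minimal sketch of multiple parameter lists and method-to-function conversion, written in Scala 2 syntax. The names MethodBasics, scale, and increment are hypothetical.

```scala
object MethodBasics {
  // A method with two parameter lists.
  def scale(factor: Int)(x: Int): Int = factor * x

  // Only the first parameter list is supplied; the trailing underscore asks for
  // a function that takes the remaining list.
  val triple: Int => Int = scale(3) _

  // Explicit conversion of a method to a function value.
  def increment(x: Int): Int = x + 1
  val incrementFn: Int => Int = increment _

  // A method passed where a function is expected is converted automatically.
  val incremented: List[Int] = List(1, 2, 3).map(increment)
}
```

Here MethodBasics.triple(4) evaluates to 12, and incremented holds List(2, 3, 4).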

You define a class with the class keyword followed by its name, optional constructor parameters, and an optional body. A class parameter becomes a field when it is declared with val or var. An instance is created with the new keyword. A class can be marked sealed, which means all its direct subtypes must be declared in the same file.

A case class is immutable by default and is compared by value. You can instantiate a case class without the new keyword. Methods named “apply”, referred to as default methods or injector methods, can be invoked without the method name. Lazy values are evaluated only the first time they are accessed. A case class is mainly used for storing data and is a good fit for DTOs. By default, case classes turn their parameters into value fields, so there is no need for the val keyword. For a variable field, use the var keyword. Scala automatically generates the apply, copy, equals, hashCode, toString, and unapply methods for a case class. The unapply method decomposes an instance into its parts for use in pattern matching.
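The generated case class methods can be seen in a small sketch. Point and describe are hypothetical names chosen for this example.

```scala
object CaseClassBasics {
  // Parameters of a case class become immutable fields without writing val.
  case class Point(x: Int, y: Int)

  val p = Point(1, 2)                // generated apply, so the new keyword is not needed
  val q = p.copy(y = 5)              // generated copy with one field changed
  val sameValue = p == Point(1, 2)   // generated equals compares by value

  // The generated unapply lets a pattern take the instance apart.
  def describe(pt: Point): String = pt match {
    case Point(0, 0) => "origin"
    case Point(x, y) => s"($x, $y)"
  }
}
```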

Objects are single instances of their own definitions/classes. You define an object with the object <identifier> {...} syntax and access the object by its identifier. It is instantiated the first time it is accessed. The methods best suited to objects are pure functions and I/O functions that don’t use fields of an instance. An object may also have an apply method, which makes it possible to invoke the object by name. It is often used as a factory, especially to create new instances of a class from its companion object.
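As an illustration of a companion object used as a factory, here is a sketch with a hypothetical Circle class. The wrapper object CompanionBasics exists only to keep the example self-contained.

```scala
object CompanionBasics {
  class Circle(val radius: Double)

  // Companion object: same name as the class, defined in the same scope.
  object Circle {
    // apply acts as a factory, so Circle(1.0) works without new.
    def apply(radius: Double): Circle = new Circle(radius)

    // A pure helper function that belongs to the object rather than to an instance.
    def area(c: Circle): Double = math.Pi * c.radius * c.radius
  }

  val unitCircle: Circle = Circle(1.0)
}
```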

Traits are types containing fields and methods. Traits can have default method implementations. Traits can be sealed. You can extend a trait and override its methods. A trait can take type parameters but not class parameters. Multiple traits (mixins) can be combined. The multiple inheritance ordering is right to left, so the rightmost trait takes precedence.
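The right-to-left ordering is easiest to see when two mixins override the same method and call super. The traits below (Greeter, Loud, Emphatic) are hypothetical and only meant as a sketch.

```scala
object TraitBasics {
  trait Greeter {
    def greet(name: String): String = s"Hello, $name"
  }
  trait Loud extends Greeter {
    override def greet(name: String): String = super.greet(name).toUpperCase
  }
  trait Emphatic extends Greeter {
    override def greet(name: String): String = super.greet(name) + ", indeed"
  }

  // Emphatic is rightmost: its greet runs first, and its super call reaches Loud.
  class HostA extends Greeter with Loud with Emphatic

  // Loud is rightmost: it upper-cases whatever Emphatic produced.
  class HostB extends Greeter with Emphatic with Loud
}
```

With these definitions, new HostA().greet("ann") gives "HELLO, ANN, indeed", while new HostB().greet("ann") gives "HELLO, ANN, INDEED".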

Unified Types

The Any, AnyVal, and AnyRef types are the root of Scala’s type hierarchy. Any is the absolute root with two children: AnyVal and AnyRef (which corresponds to java.lang.Object). Value types are subtypes of AnyVal and include all numeric types plus Char, Boolean, and Unit. Value types are allocated either on the heap as objects or locally on the stack as JVM primitive values. AnyRef is the root of all reference types. Reference types are only allocated on the heap as objects. Null is a subtype of every reference type and has the single value null. Nothing is a subtype of every other type. Nothing is only used as a type, because it cannot be instantiated.

A Char can be converted to and from a number. The Unit type denotes the lack of data and has the literal (). It is used to define a function or an expression that doesn’t return data.

& and | evaluate both arguments, while && and || are lazy: they skip the second argument when the first one already determines the result.

The following methods are available on all types in Scala:

  • asInstanceOf[<type>]: converts the value to another type. Causes an error if the value is not compatible with the new type.
  • equals: checks if two objects are equal.
  • getClass: returns the type of a value.
  • hashCode: returns the hash code of the value.
  • to<type>: converts the value to a compatible type.
  • toString: renders the value to a string.

A tuple is an ordered container of two or more values. It is similar to a case class, but its elements have no names. Use ( <value 1>, <value 2>[, <value 3>...] ) to create a tuple. Use 1-based indexes (._1, ._2, etc.) to access its elements. Another way to create a 2-sized tuple is with the relation operator ->. A tuple can be taken apart using pattern matching.
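A short sketch of the tuple operations just listed; all names are hypothetical.

```scala
object TupleBasics {
  val pair: (String, Int) = ("apple", 3)

  // The -> operator builds the same 2-sized tuple.
  val arrowPair: (String, Int) = "apple" -> 3

  // 1-based accessors.
  val first: String = pair._1

  // Pattern matching takes the tuple apart into two values.
  val (name, count) = pair
}
```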

Expressions and Conditions

An expression is a unit of code that returns a value. One or more lines of code form an expression block if they are inside curly braces.

A statement is just an expression that doesn’t return a value, or in other terms, it returns a Unit value ().

Conditional expressions are if .. else and <expression> match { case <pattern match> => <expression>}. Use a wildcard pattern, case id => or case _ =>, to avoid a scala.MatchError. Matching supports pattern guards: case <pattern> if <Boolean expression> => .... Matching also supports type matching: case <identifier>: <type> => .... The identifier must start with a lowercase letter.
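The match forms above (literal pattern, guard, type pattern, wildcard) can be combined in one expression. The function describe is a hypothetical example.

```scala
object MatchBasics {
  def describe(value: Any): String = value match {
    case 0               => "zero"                            // literal pattern
    case n: Int if n < 0 => "negative int"                    // type pattern with a guard
    case n: Int          => s"int $n"                         // type pattern
    case s: String       => s"string of length ${s.length}"
    case _               => "something else"                  // wildcard avoids a MatchError
  }
}
```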

A range is defined by <starting integer> [to|until] <ending integer> [by increment]. The basic for loop is for (<identifier> <- <iterator>) [yield] [<expression>]. An iterator guard (also known as a filter) adds an if expression to an iterator. A for loop can have value bindings. There are do and while loops too.
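Ranges, guards, and value bindings might look like this in practice; the value names are made up for the sketch.

```scala
object LoopBasics {
  // Inclusive range with an iterator guard; yield collects the results.
  val evens = for (i <- 1 to 10 if i % 2 == 0) yield i

  // Exclusive range (until) with a value binding inside the loop.
  val squares = for { i <- 1 until 4; sq = i * i } yield sq

  // Range with an explicit increment.
  val steps: List[Int] = (0 to 10 by 5).toList
}
```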

Advanced Functions/Methods

A higher-order function takes other functions as parameters or returns a function. Another form of function type parameter is the by-name parameter, which can take either a value or a function that eventually returns the value. The syntax is <identifier>: => <type>. Each time a by-name parameter is used inside a function, it is evaluated into a value. A by-name parameter is not evaluated unless it is used.
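To make the evaluation rule for by-name parameters concrete, the sketch below counts how often the argument is evaluated. The counter and the method names are hypothetical.

```scala
object ByNameBasics {
  var evaluations = 0
  def expensive(): Int = { evaluations += 1; 42 }

  // value is a by-name parameter: it is evaluated each time it is referenced.
  def twice(value: => Int): Int = value + value

  // When enabled is false, the by-name argument is never referenced, so never evaluated.
  def maybe(enabled: Boolean)(value: => Int): Int = if (enabled) value else 0
}
```

Calling ByNameBasics.twice(ByNameBasics.expensive()) runs expensive twice, whereas maybe(false)(expensive()) does not run it at all.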

A method that defines some of its parameters as being implicit can be invoked by code that has a local implicit value, but can also be invoked with an explicit parameter.
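A small sketch of an implicit parameter, using Scala 2 syntax. Prefix and log are hypothetical names; a dedicated wrapper type is used so that the implicit value does not clash with other String values in scope.

```scala
object ImplicitParamBasics {
  case class Prefix(text: String)

  // The second parameter list is implicit.
  def log(message: String)(implicit prefix: Prefix): String = s"${prefix.text} $message"

  // A local implicit value that the compiler can supply.
  implicit val defaultPrefix: Prefix = Prefix("[info]")

  val viaImplicit: String = log("started")                    // uses defaultPrefix
  val viaExplicit: String = log("started")(Prefix("[debug]")) // explicit argument wins
}
```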

A total function supports all possible values of its parameter types. A partial function is a function literal that applies a series of case patterns to its input, requiring that the input match at least one of the given patterns. Invoking a partial function with data that doesn’t match at least one case pattern results in a scala.MatchError. Partial functions are useful when working with collections and pattern matching to collect items acceptable by a given partial function.
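The following sketch shows a partial function used with collect. The name half is hypothetical.

```scala
object PartialFunctionBasics {
  // Defined only for even numbers.
  val half: PartialFunction[Int, Int] = {
    case n if n % 2 == 0 => n / 2
  }

  // collect keeps only the elements the partial function accepts.
  val halves: List[Int] = List(1, 2, 3, 4).collect(half)

  // isDefinedAt checks an input without invoking the function.
  val definedAtThree: Boolean = half.isDefinedAt(3)
}
```

Here halves is List(1, 2), while calling half(3) directly would throw a scala.MatchError.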

Collections

Scala has high-performance, object-oriented, and type-parameterized collections, both mutable and immutable, with higher-order operations. By default, List, Set, and Map are immutable: they cannot be changed once they have been created.

A List is an immutable and recursive data structure. It has Nil (List[Nothing]) as its tail. Use the cons (short for construct) operator :: to build a list. Because List is implemented as a linked list, it is efficient for ::, drop, and take, but operations on the end of the list, such as :+, takeRight, and dropRight, may be slow for large lists. The toBuffer method converts a list to a mutable.Buffer.
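A brief sketch of the list operations mentioned above; the value names are hypothetical.

```scala
object ListBasics {
  // Built with the cons operator, ending in Nil.
  val numbers: List[Int] = 1 :: 2 :: 3 :: Nil

  // Prepending is cheap on a linked list.
  val prepended: List[Int] = 0 :: numbers

  // Appending has to walk the whole list; both calls return new lists.
  val appended: List[Int] = numbers :+ 4

  // toBuffer gives a mutable copy; the original list is unchanged.
  val buffer = numbers.toBuffer
  buffer += 4
}
```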

A collection builder can append elements to create a new collection.

An Array is a fixed-size, mutable, indexed collection that wraps around Java’s array type. It can be used like a sequence because of an implicit conversion.

Seq is the root type of all sequences, including List and Vector. Seq(...) can be used as a shortcut for creating a List, and IndexedSeq(...) is a shortcut for creating a Vector.

The LazyList type is a lazy collection, generated from one or more starting elements and a recursive function. Elements are added to the collection only when they are accessed for the first time. The generated elements are cached for later retrieval, ensuring that each element is only generated once. It can be unbounded. It can be terminated with LazyList.empty. You use LazyList.cons to construct a new lazy list from a head and a tail.
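A minimal sketch of an unbounded and a terminated LazyList, assuming Scala 2.13 or later (where LazyList is available). The method names from and countdown are hypothetical.

```scala
object LazyListBasics {
  // Unbounded: the tail is only computed when it is accessed.
  def from(n: Int): LazyList[Int] = n #:: from(n + 1)

  // Only the first five elements are ever generated.
  val firstFive: List[Int] = from(1).take(5).toList

  // Bounded: LazyList.cons builds head and tail, LazyList.empty terminates.
  def countdown(n: Int): LazyList[Int] =
    if (n < 0) LazyList.empty else LazyList.cons(n, countdown(n - 1))
}
```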

A monadic collection contains no more than one element. The Option type is itself abstract but has two subtypes: Some and None. util.Try has two subtypes: Success and Failure. concurrent.Future represents a potential value that will become available in the future.
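Option and Try can be handled with the same collection-style operations. The helper methods parse and firstEven below are hypothetical.

```scala
import scala.util.Try

object MonadicBasics {
  // Try wraps a computation that may throw: Success on a value, Failure on an exception.
  def parse(text: String): Try[Int] = Try(text.trim.toInt)

  // find returns Some(element) or None.
  def firstEven(xs: List[Int]): Option[Int] = xs.find(_ % 2 == 0)

  val doubled: Option[Int] = firstEven(List(1, 4, 5)).map(_ * 2) // map applies only to Some
  val fallback: Int = firstEven(List(1, 3)).getOrElse(0)         // None falls back to 0
  val recovered: Int = parse("abc").getOrElse(-1)                // Failure falls back to -1
}
```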

Advanced Typing

A self type is a trait annotation that asserts that the trait must be mixed in with a specific type, or its subtype, when it is added to a class. It has the syntax trait ..... { <identifier>: <type> => .... }. Usually the identifier is self, though it could be any identifier.
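A sketch of a self type, with hypothetical traits User and Greeting.

```scala
object SelfTypeBasics {
  trait User {
    def name: String
  }

  // The self type says: whoever mixes in Greeting must also be a User.
  trait Greeting { self: User =>
    def greeting: String = s"Hi, ${self.name}"
  }

  // Compiles because Member is a User; leaving out User would be a compile error.
  class Member(val name: String) extends User with Greeting
}
```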

Another way to use traits is to add them to a class when the class is instantiated. The real value of doing this is that it adds new functionality or configurations to existing classes. It is called dependency injection because the actual functionality is injected at instantiation. Two instances of the same class can operate under different configurations.
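Mixing in a trait at instantiation might look like the sketch below. Formatter, Upper, and Printer are hypothetical names.

```scala
object InstanceMixinBasics {
  trait Formatter {
    def format(s: String): String = s
  }
  trait Upper extends Formatter {
    override def format(s: String): String = s.toUpperCase
  }

  class Printer extends Formatter {
    def render(s: String): String = s"<${format(s)}>"
  }

  // Same class, two configurations: the second gets Upper injected at instantiation.
  val plain: Printer = new Printer
  val shouting: Printer = new Printer with Upper
}
```

With these definitions, plain.render("hi") gives "<hi>" and shouting.render("hi") gives "<HI>".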

You can import a single member of a class instance by name, or the entire set of fields and methods with the underscore character _.

Implicit classes provide a type-safe way to “monkey-patch” new methods and fields onto existing classes. By providing an automatic conversion from instances of type A to type B, an instance of type A can appear to have fields and methods as if it were an instance of type B.
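As a sketch of this idea, the hypothetical implicit class below makes a wordCount method appear on String.

```scala
object ImplicitClassBasics {
  // Wraps a String; the compiler applies the conversion when wordCount is called.
  implicit class RichWords(text: String) {
    def wordCount: Int = text.split("\\s+").count(_.nonEmpty)
  }

  // String has no wordCount method of its own; the implicit class supplies it.
  val count: Int = "mind is software".wordCount
}
```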

A class is an entity that may include data and methods, and has a single, specific definition. A type is a class specification, which may match a single class/trait or a range of classes/traits that conform to its requirements. A type could be a relation such as “class A or any of its descendants”. Objects are not considered types.

Use type <identifier>[type parameters] = <type name>[type parameters] to define a type alias.

Abstract types are specifications that may resolve to zero, one, or many classes.

A type parameter can be constrained by an upper bound <: (the type or its subtypes) or a lower bound >: (the type or its base types). The view-bound operator <% accepts anything that can be treated as that type, including through an implicit conversion.
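Upper and lower bounds can be sketched with a hypothetical Animal/Dog hierarchy. View bounds are left out of the example.

```scala
object TypeBoundBasics {
  class Animal(val name: String)
  class Dog(name: String) extends Animal(name)

  // Upper bound: A must be Animal or one of its subtypes, so name is available.
  def longerNamed[A <: Animal](a: A, b: A): A =
    if (a.name.length >= b.name.length) a else b

  // Lower bound: B must be Dog or one of its base types, so the list may hold any Animal.
  def prependTo[B >: Dog](dog: Dog, others: List[B]): List[B] = dog :: others
}
```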

Type variance makes type parameters less restrictive. Type variance specifies how a type parameter may adapt to meet a base type or subtype. By default, type parameters are invariant.

When a type parameter is specified as covariant (with +), it can change into a compatible base type. Covariance is helpful for morphing type parameters into their base types. A type parameter is contravariant (with -) when it may morph into a subtype. As an example, in trait Function1[-T1, +R], the input parameter is defined as contravariant and the return type is defined as covariant. The input parameter position is therefore a contravariant position, whereas the return position is a covariant position. The reason is that any function can be replaced by another function if the latter has fewer requirements for its input parameters (the - meaning) but more requirements for its return value (the + meaning). The key to understanding the concept is that covariance and contravariance are qualities of the class, not qualities of the parameters.
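The substitution rule for Function1 can be checked with a small sketch. Animal, Dog, and Box are hypothetical types.

```scala
object VarianceBasics {
  class Animal
  class Dog extends Animal

  // Covariant container: Box[Dog] can be used where Box[Animal] is expected.
  class Box[+A](val item: A)
  val dogBox: Box[Dog] = new Box(new Dog)
  val animalBox: Box[Animal] = dogBox

  // Function1[-T1, +R]: a function that accepts any Animal and returns a Dog
  // asks less of its input and promises more about its result, so it can
  // stand in for a Dog => Animal.
  val adopt: Animal => Dog = _ => new Dog
  val asDogToAnimal: Dog => Animal = adopt
}
```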

Many definitions, such as implicit parameters, implicit conversions, and type aliases, can only be defined within other types. There are two exceptions. The contents of the scala.Predef object are added automatically to the namespace in Scala. The other is through package objects, defined in package.scala using package object <identifier> {type ...}.

