Basic Scala Syntax for Java Programmers

Categories: Java

Introduction

This page contains a quick summary of the syntax of the Scala programming language. These are mostly notes I made from an online Scala tutorial, a Scala book, and a Scala introductory video course; they have been posted here mostly as a reference for myself, and are not expected to be of particular use to anyone else. If you are learning Scala, then I would recommend reading the online tutorial yourself, and making your own notes!

Learning any language requires mastering two parts: the syntax and the patterns-of-use. This applies to human and computer-programming languages. These notes really only address the syntax part, but might make it possible to read Scala code sufficiently to get a general idea of what it is doing. Maybe.

The home page for the Scala programming language contains an excellent introduction from which some of these notes come. I can also recommend the book “Programming Scala Second Edition”. The Wikipedia page on Scala also provides a good overview for Java developers.

Scala is a big complicated language, with both OO and Functional options; this is the price paid for having an “easy upgrade path” from Java, and the ability to use existing Java libraries. Having two choices makes writing code easier - the programmer can choose whichever style they are most familiar with. But it makes reading code harder - you need to know both styles in order to understand existing programs. I think the language may work well for a small experienced team, but with inexperienced programmers it would be hell. The Scala video course I watched did have some useful tips for starting with Scala; in particular the presenter recommended aiming for “functional in the small, OO in the large”. In other words, developers familiar with OO languages should not try to take on the most advanced “philosophies” of Scala immediately (eg the Scalaz library), but instead use the OO features of Scala at the component-level. At the class or method level, however, it is useful to make use of immutable structures, lambdas, etc.

Guidelines for functional vs object styles: program in the small with functional, program in the large with OO. The concept of interfaces works really well for decoupling code. However mutability and a multiplicity of classes can lead to bugs and overcomplex/verbose code at lower levels - where the functional approach shines. OO programs are also a good fit for GUI frameworks.

Why Use Scala?

In comparison to Java, Scala code provides a whole bunch of things which replace verbose Java usage with more compact sourcecode and remove annoying Java limitations:

  • Gets rid of constructor code which just copies params to fields
  • Automatically generates field accessors (getters/setters) where relevant (var/val constructor params)
  • Automatically generates sensible toString/equals/hashCode methods for DTO-like data structures (case classes)
  • Fields of an object can be accessed in sourcecode using property-like syntax but this is compiled to a call to a getter/setter (and can be overridden in the called code if desired)
  • Has type inference, to get rid of complicated type-declarations for variables/fields
  • Has tuples - saves a lot of trivial class definitions, and in particular makes it easy for a method to return multiple values
  • Semicolons usually optional (except in complicated cases)
  • Return keyword optional in most cases
  • Can omit braces around body of single-line methods (see example below)
  • Supports operator overloading (eg defining a method named “+” for a custom class)
  • Keyword “new” often not needed (technically, not a language feature but companion objects typically provide factory methods)
  • Operator “==” is equivalent to “.equals” ie usually does value-comparison, not identity comparison. Identity comparison is available as “obj1.eq(obj2)”.
  • Any method can be invoked with “named parameters”, ie params can be passed in any order. Example: swap(op2=12, op1=17)
  • Methods can have default-values for parameters.
  • If-statements are expressions which return a value, eg “val x = if (expr) 1 else 2”
  • Import statements can define an alias-name for the imported type(s)
  • Import statements can occur anywhere in source-code, not just at the top of a file (and they are lexically scoped)
  • Local aliases can be defined for complicated type-definitions
  • Triple-quoted strings are multi-line strings, allowing literal json, xml, etc to be easily expressed. In addition, backslashes are not interpreted in such strings, making it easy to write literal regexes and similar things.
  • Interpolated strings like s"my name is $name" are a more compact alternative to String.format.
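Several of the points in the list above can be seen together in one short sketch. The names SyntaxTour, Point, describe and minMax are invented purely for illustration:

```scala
object SyntaxTour {
  // A case class gets its constructor, accessors, toString, equals and hashCode generated.
  case class Point(x: Int, y: Int)

  // A default parameter value; callers may also pass arguments by name.
  def describe(p: Point, label: String = "point"): String =
    s"$label at (${p.x}, ${p.y})" // interpolated string

  // A tuple lets a method return two values without a dedicated class.
  def minMax(a: Int, b: Int): (Int, Int) =
    if (a <= b) (a, b) else (b, a) // if/else is an expression

  // Triple-quoted string: backslashes are taken literally, handy for regexes.
  val digits = """\d+"""

  def main(args: Array[String]): Unit = {
    println(Point(1, 2) == Point(1, 2)) // "==" compares values
    println(describe(label = "origin", p = Point(0, 0)))
    println(minMax(7, 3))
  }
}
```

Note that no “new”, “return” or semicolon appears anywhere in the sketch.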

The standard library provides a lot of functional-programming-style features if you wish to go that way. In particular, the immutable collection types make scalable multi-threaded applications easier to develop.

Support in Scala for generic types is significantly better than in Java, ie code can be expressed in a more type-safe way - fewer casts and nasty workarounds.

Using Scala also “opens a window” into the functional programming world while being more accessible than simply jumping into Haskell or similar, eg:

  • Pattern Matching (deconstruction-based switches)
  • Currying
  • Tuples
  • Explicit Tail Call Optimisation
  • Support for passing closures as parameters (more concise and flexible than Java8 Lambdas)
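A rough sketch of what these features look like in practice; all names here (FunctionalTour, describe, scale, twice, sum, below) are made up for the example:

```scala
import scala.annotation.tailrec

object FunctionalTour {
  // Pattern matching deconstructs the tuple and switches on its contents.
  def describe(pair: (String, Int)): String = pair match {
    case (name, 0)          => s"$name has nothing"
    case (name, n) if n < 0 => s"$name owes ${-n}"
    case (name, n)          => s"$name has $n"
  }

  // Currying: two parameter groups, which can be supplied one at a time.
  def scale(factor: Int)(value: Int): Int = factor * value
  val twice: Int => Int = scale(2)

  // @tailrec makes the compiler reject the method unless the recursive
  // call can be compiled into a loop.
  @tailrec
  def sum(xs: List[Int], acc: Int = 0): Int = xs match {
    case Nil          => acc
    case head :: tail => sum(tail, acc + head)
  }

  // A closure (capturing "limit") passed as a parameter to filter.
  def below(limit: Int, xs: List[Int]): List[Int] = xs.filter(x => x < limit)
}
```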

There are many external libraries for Scala which provide interesting functionality in functional-programming-style, including Akka and Play.

The primary disadvantages of Scala relative to Java are:

  • Much slower compilation times (particularly annoying within IDEs which try to provide as-you-type feedback - the experience is much less pleasant than Java where compile-times are faster)
  • Poorer binary compatibility (often code compiled with Scala is only compatible with code compiled with exactly the same release of Scala)
  • Calling Java code can be clumsy
  • Not as widely known as Java (ie finding other developers and advice is more challenging)
  • A more complex language than Java (more features, more syntax, and thus harder to learn)
  • Some libraries overuse Scala features, in particular implicit conversions and operator-chars-as-method-names, making code hard to read
  • For code distributed as a “fat jar”, the Scala standard libraries need to be included - and they are quite large. In other words, simple tool-like apps in Scala are huge in comparison to their Java equivalents.

The Scala Implementation

The Scala compiler, libraries, and tools are licensed under a “bsd-like” license (more recent releases have moved to the Apache 2.0 license); the code is hosted in a public Git repository on GitHub. Development appears to happen on open lists, which is good.

The Scala Environment

Scala 2.12.0 and later runs on any JVM v1.8 or later (ie Scala code compiles to Java bytecode); earlier versions of Scala required only Java 1.6. Scala code can call Java libraries (though sometimes some workarounds are needed, eg converting a Scala collection-type into its java.util.* equivalent before passing it as a parameter to a Java method).

The standard Scala distribution previously had support for compiling Scala code to the CLR environment, ie Microsoft .Net runtime. However this support was removed in Scala v2.11 (2014).

There are good Scala plugins for the Eclipse, IntelliJ and NetBeans IDEs.

Scala applications can be built with standard Java build-tools (eg Maven) or with the Scala-specific SBT (Scala Build Tool).

As with many languages, the standard Scala toolset provides a REPL (Read Eval Print Loop) tool for experimenting with Scala code; fragments of code can be entered and immediately executed. Methods defined via the REPL are added to a special “global class”.

Basic Application Structure

Scala code is stored in plain text files, using UTF-8 encoding. The standard file suffix is “.scala”. Scala code does not have to follow Java’s one-file-per-class approach; a single file can contain multiple definitions. Nevertheless, the recommended convention is that a separate file is used for each class, and the filename matches the classname except:

  • a class and its companion object must be in the same file
  • an ADT sum type (a sealed trait and its implementations) must have all component types in the same file
  • nested (inner) classes are obviously in the same file as their enclosing type
  • when code maintenance is improved by grouping types - in which case the filename should start with a lowercase letter.

The primary “structural” components are packages, classes, traits and objects:

  • Packages are similar to those in Java. However Scala code often names packages following conventions common in languages like C++, Python or Ruby rather than Java’s widely-spread “reverse domain name” structure. Package naming hierarchies are thus often “shallower” than in Java.

  • Classes are roughly similar to classes in Java. The “case class” is a variant of class that acts more like a “struct” or DTO (Data Transfer Object).

  • Traits are roughly similar to interfaces in Java.

  • Objects are somewhat like “singleton instances” in Java, ie are like declaring a class and applying the static singleton design pattern. An object-declaration in the same file as a class-declaration and using the same name as the class is called a “companion object” for the class and has access to the private members and methods of that class.
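A small sketch of how traits, classes and companion objects fit together. Shape, Circle and the wrapping ShapesDemo object are hypothetical names, not anything from the standard library:

```scala
object ShapesDemo {
  // A trait plays the role of a Java interface.
  trait Shape {
    def area: Double // abstract: no "= body"
  }

  // The primary constructor is private; only the companion object can call it.
  class Circle private (val radius: Double) extends Shape {
    def area: Double = math.Pi * radius * radius
  }

  // Companion object: same name as the class, defined alongside it.
  // It holds what Java code would declare as static members.
  object Circle {
    val UnitCircle: Circle = new Circle(1.0)

    def apply(radius: Double): Circle = {
      require(radius >= 0, "radius must not be negative")
      new Circle(radius)
    }
  }
}
```

With this in place, ShapesDemo.Circle(2.0) creates an instance via the factory method, with no “new” at the call site.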

The primary structural difference between Scala and Java is that Scala supports functions as first-class citizens, ie references to functions can be created and passed around as easily as references to other types. Functions can be partially-applied, and composed in the usual ways supported by other functional languages.

Basic Code Syntax

Source code blocks are delimited by braces, as with Java (not indentation-based as with Python). However semicolons are optional in most cases, and the conventional style is to omit them except in the rare situations where they really are required.

By convention, types start with upper-case letters while methods and variables start with lower-case letters, as in Java. Multi-word names use CamelCase. Constants are usually named like types, ie camelcase starting with an uppercase letter (unlike Java where constants are usually written in ALL_UPPER_CASE).

Packages, classes and traits (like interfaces) are declared reasonably similarly to Java. Singleton types can also be declared in a manner similar to classes, using the “object” keyword.

Methods are declared using keyword “def”. When an “= ..fnbody..” follows the declaration, then this is a concrete method; when no function body follows the declaration then the method is abstract.

Variables are declared using either “val” (immutable; roughly equivalent to Java final) or “var” (mutable).

Type inference may be used in variable and member declarations, ie declarations do not need to include an explicit type - the compiler can usually figure it out from the context. Method return types may also be omitted when they can be inferred by the compiler. However method parameter types must always be explicitly defined. Type inference does not make Scala radically different from Java, it is just a nice way to reduce clutter. Note that Java has a little bit of type inference itself: the “<>” syntax.

When types are present, they usually follow the name of some element rather than precede it, eg “var foo: String” rather than Java’s “String foo;”.

Generics in Scala are a bit different from those in Java, but roughly “[T]” means “<T>”.

Java’s primitive types are replaced by Scala “value types” which are accessed like objects, but can potentially be as efficient as the Java primitive equivalents. These types are named Byte, Int, Long, etc. Note that Scala does not provide its own reference types for the basic heap-allocated equivalents - it just reuses java.lang.Integer, java.lang.Byte etc. As with Java, primitives are boxed (copied to storage on the heap) automatically when needed (eg when adding a primitive to a collection).

Arrays in Scala do not use square-brackets for declaration or indexing. Arrays are real objects with methods, and creating an array and reading/writing its elements is done via methods, not special-case syntax as in Java. Reading is done via a special “apply” method which can be invoked without the method-name, and writing via the matching “update” method, which the compiler calls for an assignment of the form “x(i) = value”; thus copying an element from one array to another looks like “dest(i) = src(j)”.
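As a quick illustration (ArrayDemo, src and dest are just example names), the shorthand and the spelled-out method calls do the same thing:

```scala
object ArrayDemo {
  val src = Array(10, 20, 30)  // an Array[Int], created via Array.apply
  val dest = new Array[Int](3) // three elements, all initially 0

  dest(0) = src(2)             // sugar for dest.update(0, src.apply(2))
  dest.update(1, src.apply(0)) // the same kind of copy written out in full
}
```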

Scala allows nested function definitions, ie functions within functions.

Scala code can include inline xml (no escaping required).

Scala has no static fields or methods. The primary purpose of Java static fields and methods is to provide a single instance of a variable or constant, or a single implementation of a method. Scala singleton objects are “types with a single instance”, so provide a suitable home for such fields and methods.

Scala has no checked exceptions - only unchecked ones. Otherwise, exceptions work similarly to Java, with throw and catch keywords. Exceptions are used slightly less often in Scala than Java, given standard-library types such as Option and Either which make it easier to return error-indicating objects on failure. Catching of exceptions is also often done via the library-function Try (note the capital first letter) which converts an exception into a return-value.
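The following sketch shows roughly how Try, Option and Either can stand in for exceptions. The object and method names are invented for the example; Try, Success, Failure, Option and Either are the standard-library types:

```scala
import scala.util.{Try, Success, Failure}

object ErrorHandlingDemo {
  // Try(...) runs the block and captures a thrown exception as a Failure.
  def parse(text: String): Try[Int] = Try(text.trim.toInt)

  // Option expresses "maybe a value" without null or exceptions.
  def parseOption(text: String): Option[Int] = parse(text).toOption

  // Either carries an error description on the left, a result on the right.
  def parsePositive(text: String): Either[String, Int] = parse(text) match {
    case Success(n) if n > 0 => Right(n)
    case Success(n)          => Left(s"$n is not positive")
    case Failure(e)          => Left(s"not a number: ${e.getMessage}")
  }

  // Plain try/catch also works; "catch" takes pattern-matching cases.
  def parseOrZero(text: String): Int =
    try text.trim.toInt
    catch { case _: NumberFormatException => 0 }
}
```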

The Scala Type Hierarchy, Value Types and Java Primitive Types

Every type in Scala is a subtype of scala.Any. This has two subtypes:

  • scala.AnyVal is the parent of all “value types”
  • scala.AnyRef is the parent of all “reference types”

Value-types are the kinds of things allocated “on the stack” or embedded inline into other objects, and so garbage collection is not relevant for them. They have no identity, only a value. Equality comparisons therefore always are value-comparisons, and never identity-comparisons. Specifically, methods on the type-definition are not permitted to do anything that would require a “new” operation (ie heap allocation) at runtime. This ensures the methods can be compiled down to a set of helper functions rather than requiring real objects. The result is code that looks object-oriented but which is as efficient as procedural code. The Scala wrappers for the primitive Java types (int, double, char, etc) are all of this kind (called “value classes”), removing the clumsy Java distinction between primitives and objects without a performance hit. It is also allowed (and useful) for users to define their own “value classes” as no-overhead wrappers for their own datatypes.
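A user-defined value class might look something like the sketch below. Meters and toCentimeters are hypothetical names; the only fixed parts are the single “val” parameter and “extends AnyVal”:

```scala
object ValueClassDemo {
  // Extending AnyVal makes this a value class. It must have exactly one
  // "val" parameter, and the compiler can usually represent a Meters as
  // the bare Double at runtime.
  class Meters(val value: Double) extends AnyVal {
    def toCentimeters: Double = value * 100
  }

  val length = new Meters(1.5)
  val inCentimeters = length.toCentimeters
}
```

The benefit is type-safety: a method declared to take a Meters cannot accidentally be passed some unrelated Double.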

Reference-types are the kinds of things allocated “on the heap”. Local stack-frames and other objects may hold references (pointers) to them, but they are not “embedded” into other objects, and may be garbage-collected if no references to them remain.

Just about every other type in Scala subclasses scala.AnyRef, and acts much like a Java class.

The kinds of methods found on java.lang.Object (ie those common to every object instance) are spread between Any and AnyRef.

When primitives are embedded into a collection (eg List[Int]) then they are “boxed” ie converted to on-heap objects just as in Java.

Type “Unit” is the equivalent of Java “void”, and the single value of this type is written “()”. The keyword “null” is also used - but is not related to Unit.

Typecasts and Type Aliases

Typecasts are done via a method defined on a base Scala type (ie available on any object):

  var x = myref.asInstanceOf[Int]

On a similar subject, the equivalent of Java’s Foo.class is classOf[Foo].

Alternative names (“type aliases”) can be defined for existing types, just for convenience:

  type BarList = List[Bar]   // defines a "type alias"
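Putting casts, classOf and a type alias together in one sketch (CastDemo, NameList and the other identifiers are made up for illustration):

```scala
object CastDemo {
  type NameList = List[String] // alias, purely for readability

  val names: NameList = List("ann", "bob")
  val anything: Any = names

  val isList = anything.isInstanceOf[List[_]]      // like Java's instanceof
  val back = anything.asInstanceOf[NameList]       // like a Java cast
  val stringClass: Class[String] = classOf[String] // like Java's String.class

  // A pattern match is the more usual way to test and cast in one step.
  def length(x: Any): Int = x match {
    case s: String  => s.length
    case l: List[_] => l.size
    case _          => 0
  }
}
```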

Variables

Vals and Vars

Scala provides keyword “val” to declare references that cannot change. But the object pointed to can still change internally - unless the type is designed with no mutation methods. In short: just like a Java final field. Unlike Java, Scala developers tend to use vals very frequently.

When a variable is declared with var then it can be rebound to some other reference, as in Java.

Variable Declaration Syntax

Scala puts the variable-name before the type, eg

  val foo: Int = 3
  var bar: String = "hello" // initialized but mutable
  var baz: String = _ // field left at its default value (null); local variables must be initialised explicitly

instead of

  final int foo = 3;
  String bar = "hello"; // initialized but mutable
  String baz;  // uninitialized

Type inference often allows the explicit type declaration to be omitted:

  val pi = 3.14     // implicitly Double
  val foo = 3       // implicitly Int
  var bar = "hello" // implicitly String
  var baz: String = _ // nothing to infer the type from, so it must be declared

The Scala var/type ordering works better with type inference. I rather like it, as it mimics natural language better: “foo is an integer”.

A val declaration must be initialised, like final variables in Java. Obviously it makes no sense to have an uninitialised immutable reference (except when it is an abstract field definition in an abstract class).

A val or var declaration can be initialised with a code-block containing multiple statements (which is run immediately). The code-block acts like a function (ie the variable is set to the value of the last expression in the code-block):

  val myfunc = {.....}

A string written as s"..." is an interpolated-string; any $name or ${expression} text in the string will be expanded.
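Both block-initialisation and string interpolation can be seen in this small sketch (BlockInitDemo and its members are example names only):

```scala
object BlockInitDemo {
  // The block runs once, immediately; its last expression is the value.
  val greeting = {
    val parts = List("hello", "world") // visible only inside the block
    parts.mkString(", ")
  }

  val count = 3
  // $name inserts a variable, ${...} inserts any expression.
  val message = s"$greeting x$count, or ${greeting.toUpperCase}!"
}
```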

Instances of classes are created with “new” (as in Java). However it is very common to define a factory method on a “companion object”; this is described in more detail later but means that calls to new are less commonly seen when reading Scala code. As an example: val mylist = List(1,2,3) is using the “default apply method” on the List companion object to instantiate a new list, ie that method acts as a factory.

Almost Everything is an Expression

Almost every code-structure that can be used within a method returns a value. In Java, things like if/for/while/switch are statements without return-value; in Scala:

  • if/else is an expression, ie returns a value
  • for..yield is an expression
  • match (similar to switch) is an expression
  • try/catch is an expression
  • but for (without yield), while, and do..while are statements (have no return value)

The fact that “if/else” returns a value is great, and makes the ?: “ternary operator” unnecessary (Scala does not have a ternary operator):

  val x = if (a>0) a else b;
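The other expression forms from the list above, sketched with invented names:

```scala
object ExpressionDemo {
  val n = 7

  // if/else yields a value
  val parity = if (n % 2 == 0) "even" else "odd"

  // for..yield builds a new collection
  val squares = for (i <- 1 to 3) yield i * i

  // match yields the value of whichever case is selected
  val size = n match {
    case 0          => "none"
    case x if x < 5 => "few"
    case _          => "many"
  }

  // try/catch yields either the body's value or the handler's value
  val parsed = try "12x".toInt catch { case _: NumberFormatException => -1 }
}
```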

Methods

Functions, Closures and Methods

All these names refer to very similar things:

  • A function is a block of logic which takes zero or more parameters.
  • A closure is a function with associated “captured variables”.
  • A method is a function associated with an object instance.

Because a singleton object instance can always be found without needing a parameter or “captured” reference to it, a method defined on a singleton object type can effectively be used as a function.
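To make the three terms a little more concrete, here is a sketch; FunctionKinds, addOne, triple and adder are hypothetical names:

```scala
object FunctionKinds {
  // Method: declared with "def" on a class, trait or object.
  def addOne(x: Int): Int = x + 1

  // Function value: an object that can be stored and passed around.
  val triple: Int => Int = x => x * 3

  // Closure: the returned function captures the variable "amount".
  def adder(amount: Int): Int => Int = x => x + amount

  // A method on a singleton object used where a function is expected.
  val results = List(1, 2, 3).map(addOne)
}
```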

More information on functions, closures and methods is provided later.

Basic Method Declarations

Method declarations are permitted within class, trait and object declarations.

Methods are declared like:

   def helloTo(who: String, greeting: String): String = {return greeting + ", " + who}

This syntax mirrors natural language better than Java does: “helloTo returns String”, and puts the important fact (method name) earlier than the subordinate fact (returns String).

The “signature” of this method is (String,String) => String - see later for passing methods and functions as parameters.

Often the method return type can be deduced (see “type inference” above), in which case the explicit return-type declaration can be omitted. There is one catch: a method whose body uses the “return” keyword must declare its return type explicitly, otherwise compilation fails:

   // explicit return type is required here because "return" is used
   def min(op1: Int, op2: Int): Int = {
     if (op1 <= op2)
       return op1 // Int returned
     else
       return op2 // Int returned
   }

By default, the return-value of any method is the result of the last expression evaluated in that method, ie the “return” keyword is usually unnecessary, and without it the return type can be inferred:

   def helloTo(who: String, greeting: String) = {greeting + ", " + who}

   // Inferred return type is Int...
   def min(op1: Int, op2: Int) = {
     // an "if/else" is actually an expression which evaluates to one of its branches..
     if (op1 <= op2)
       op1
     else
       op2
   }

And when the method body is just one line, the braces can be omitted:

   def helloTo(who: String, greeting: String) = greeting + ", " + who
   def min(op1: Int, op2: Int) = if (op1 <= op2) op1 else op2

A method with no parameters may be defined without parentheses, in which case it must be invoked without parentheses. When defined with parentheses it may be invoked with or without them (inconsistency due to compatibility with Java). Convention: only methods without side-effects (pure functions) are written without parentheses. Example:

  def age = { currentYear - this.yearOfBirth }

An abstract method declaration uses the same form as above, but without the “=” sign and the body. In an abstract method declaration, the return-type must always be explicitly declared (obviously, it cannot be deduced from the method body as there is none).

When a method is declared as “def name(params) {...}” without a return type and without equals then the return-type is implicitly Unit (like Java’s void) - and any expression at the end of the method body will thus be ignored. This usage is not recommended; it is better style to use the equals and let Scala’s type inference figure out that no value is being returned from the method - or explicitly declare a return-type of Unit. A method returning Unit is also known as a procedure.

Declaring a method with a return-type and then forgetting the equals leads to a compilation failure - the Scala compiler sees an abstract method declaration whose return-type is being defined as an inline subclass (the method body is seen as subtyping the previous type-name). Inline subclasses are not allowed in declarations, so this is an error - but currently a rather confusing one.

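One exception to the def-requires-equals rule is when defining secondary constructors for a class; these have traditionally been written as def this(...) {this(..)} with no equals before the body. The form def this(...) = this(..) is also accepted, and newer compiler versions warn about the form without the equals.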
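The different method forms discussed in this section are sketched below. Greeter and WorldGreeter are invented names, and the secondary constructor is written with an equals sign, a form which as far as I know the compiler also accepts:

```scala
object MethodFormsDemo {
  abstract class Greeter(val greeting: String) {
    // secondary constructor, delegating to the primary one
    def this() = this("hello")

    // abstract: explicit return type, no "=" and no body
    def name: String

    // parameterless and free of side effects, so declared without parentheses
    def greet: String = s"$greeting, $name"

    // has a side effect, so parentheses and an explicit Unit return type
    def log(): Unit = println(greet)
  }

  class WorldGreeter extends Greeter {
    def name: String = "world"
  }

  val text = new WorldGreeter().greet
}
```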

Implicit/Explicit Typing

The fact that variable declarations inside a method are inferred, but that methods must have explicit param types seems a good compromise. It removes pointless text while keeping confusion manageable.

Having strict typing is great for refactoring and understanding existing code - mandatory for large projects.

More on Method Declarations

Method-names can use, or be exclusively formed from, punctuation chars. Such methods can be overridden in subclasses just like any other method. Example:

   def <<(x:Int) = {...}

Such method names are not limited to a predefined set of operators - sequences of punctuation chars may be used as a method-name. These methods are usually called “operators”, even if they don’t mirror traditional operator-names such as “+” or “<<”. Just about any character can be used in a method or variable name, except quote/apostrophe/backtick/comma/semicolon/dot and those that come in pairs, ie “()[]{}”. A name which begins with an “operator character” must contain only operator characters. A name which begins with an alphabetic character must include an underscore before any operator chars, eg “do_++”.

Absolutely any sequence of characters can be used as a method name when they are surrounded by backticks - even reserved-words from the language such as if.
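A sketch of the naming rules in action; Counter and its methods are hypothetical and exist only to show which names the compiler accepts:

```scala
object NamesDemo {
  class Counter(val n: Int) {
    // a name made purely of operator characters
    def +++(k: Int): Counter = new Counter(n + k)

    // letters, then an underscore, then operator characters
    def add_+(k: Int): Counter = this +++ k

    // a reserved word becomes a legal name inside backticks
    def `if`(cond: Boolean): Int = if (cond) n else 0
  }

  val c = new Counter(1) +++ 2
  val viaBackticks = c.`if`(true)
  val viaMixedName = c.add_+(3).n
}
```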

Methods with variable numbers of parameters (varargs) are supported as with Java, but with a different syntax:

  def add(operands: Int*) = ...

As in Java, a varargs-method can be invoked with a sequence of literal args. Unlike Java, an array cannot be passed directly; instead an array or any other collection (sequence) can be passed using a special syntax: add(mylist:_*). The strange syntax is actually a kind of inline “type cast” (type ascription) of variable mylist to type “_*”.
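A short sketch of declaring and calling a varargs method (VarargsDemo and the value names are invented; add is the method from the text):

```scala
object VarargsDemo {
  // Inside the method, "operands" is seen as a Seq[Int].
  def add(operands: Int*): Int = operands.sum

  val fromLiterals = add(1, 2, 3)

  val numbers = List(4, 5, 6)
  val fromList = add(numbers: _*) // expand a collection into the varargs

  val fromArray = add(Array(7, 8).toSeq: _*) // an array, converted to a Seq first

  val fromNothing = add() // zero arguments is allowed too
}
```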

Methods have public access by default (see later for more info on access control).

Method params can be given default values:

  def foo(reqdParam: String, optParam: String = "") = ...

Multiple Parameter Groups

A method (or function) can be declared with multiple groups of parameters:

  def myMethod(i: Int, s: String)(t: String): Float = {...}

Such a method can be invoked in various ways; the simplest is myMethod(1,"hello")("world"). See later for information on implicit parameters, partial function application and lazy parameters.

The “signature” of such a method looks like (Int,String)=>String=>Float (see later for information on passing functions as parameters).
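Here is a runnable sketch of a method with two parameter groups; repeat and the other names are made up, and the result type is String rather than Float simply to keep the example easy to check:

```scala
object ParamGroupsDemo {
  def repeat(times: Int, sep: String)(word: String): String =
    List.fill(times)(word).mkString(sep)

  // Both groups supplied at once.
  val direct = repeat(3, "-")("ab")

  // Only the first group supplied: the result is a function awaiting the second.
  val thrice: String => String = repeat(3, "-")

  // The whole method as a function value; note the shape of its type.
  val asFunction: (Int, String) => String => String = repeat
}
```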

Method Invocation - Named Parameters

Methods can be invoked using “named parameters”; given a standard method-declaration like:

  def swap(op1: Int, op2: Int)

it can be invoked as

  swap(op2=17, op1=12)
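Named parameters combine nicely with default values. In this sketch (subtract is an invented method) the argument order clearly matters, so the effect of naming is visible:

```scala
object NamedParamsDemo {
  def subtract(from: Int, amount: Int = 1): Int = from - amount

  val positional = subtract(10, 3)             // 10 - 3
  val named = subtract(amount = 3, from = 10)  // same call, arguments reordered
  val defaulted = subtract(from = 10)          // amount falls back to 1
}
```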

There is another feature with a similar name but quite different functionality: “call-by-name parameters”. These are described later.

Method Invocation - Infix Form

The dot can be omitted when calling a method on an object. And you can omit parentheses when invoking any method with one parameter. So instead of this:

  foo.bar(12)
  fix.fox(13,14)

you can write this:

  foo bar 12
  fix fox(13,14)

This is particularly useful when overriding operators:

  foo.+(13)   // invoke the method named "+" on object foo
  foo + 13    // same as above

Combining this with the ability to define method-names consisting of punctuation characters can lead to elegant-looking APIs, or simply unreadable ones. The widely-used Akka library uses this to allow code like “sender ! Response(errMsg)” - meaning invoke the method named “!” on object “sender”, passing the return value of the (special) method Response.apply(errMsg) as a parameter.

Interestingly, the source-code for standard class “scala.Int” includes this declaration:

   def +(x:Int): Int

which is what allows us to write

  val sum = 1 + 4
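The same trick works for user-defined types. A sketch, with Money and its methods invented for the purpose:

```scala
object InfixDemo {
  case class Money(cents: Long) {
    def +(other: Money): Money = Money(cents + other.cents)
    def times(factor: Int): Money = Money(cents * factor)
  }

  val a = Money(150)
  val b = Money(275)

  val dotted = a.+(b)    // ordinary method-call syntax
  val infix = a + b      // exactly the same call in infix form
  val scaled = a times 3 // infix form works for alphabetic names too
}
```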

Operator Precedence

When invoking a series of methods using infix form, the question of precedence arises, eg:

  val x1 = 1 + 3 * 8                        // traditional example
  val x2 = foo +- bar *^ baz                // some infix methods with symbolic names
  val x3 = foo plusminus bar starcarat baz  // some infix methods with alphabetic names

Mathematical operators have well-known precedence: multiplication has higher precedence than addition, so the first example is interpreted as 1 + (3 * 8).

In Scala, there is a standard set of operator-characters which have a fixed precedence (eg +-); the programmer cannot override these precedences. A method-name which starts with such a character has the fixed precedence value of that character, ie a method’s precedence is set by the first character of its name. While this is a loss of flexibility, making precedence dynamic (ie allowing functions to declare operator precedence) would probably lead to totally unreadable code.

Method names used in infix-form which do not start with a recognised operator-character all have the same precedence.

The first two of the above examples are therefore equivalent to:

  val x1 = 1 + (3 * 8)
  val x2 = foo.+-(bar.*^(baz))

In the third example, methods “plusminus” and “starcarat” have the same precedence, so the question of associativity arises - see below.

Operator Associativity

When invoking a series of methods using infix form, and the methods have the same precedence, then the question of associativity arises, eg:

  val x1 = 1 + 3 + 5                        // traditional example
  val x2 = foo +- bar +- baz                // some infix methods with symbolic names
  val x3 = foo plusminus bar starcarat baz  // some infix methods with alphabetic names

Mathematical operators have well-known associations: they almost all associate left-to-right, so the first example is interpreted as (1+3)+5.

In Scala, each standard operator-character has a fixed associativity; the programmer cannot override it. When used in infix-form, a method’s associativity is set by the last character of its name. Methods used in infix-form which do not end with a recognised operator-character all have left-to-right associativity.

The above examples are therefore equivalent to:

  val x1 = (1 + 3) + 5
  val x2 = (foo.+-(bar)).+-(baz)
  val x3 = (foo.plusminus(bar)).starcarat(baz)
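The grouping rules for precedence and associativity can be checked with a small experiment. In this sketch the hypothetical Term class simply records, as text, how each expression was grouped by the compiler:

```scala
object PrecedenceDemo {
  // Each method returns a Term whose text shows the grouping that was applied.
  case class Term(text: String) {
    def +-(other: Term): Term = Term(s"($text +- ${other.text})")
    def *^(other: Term): Term = Term(s"($text *^ ${other.text})")
    def plusminus(other: Term): Term = Term(s"($text plusminus ${other.text})")
    def starcarat(other: Term): Term = Term(s"($text starcarat ${other.text})")
  }

  val foo = Term("foo")
  val bar = Term("bar")
  val baz = Term("baz")

  val mixed = foo +- bar *^ baz                  // "*" binds tighter than "+"
  val sameSymbol = foo +- bar +- baz             // equal precedence: left to right
  val alphabetic = foo plusminus bar starcarat baz // equal precedence: left to right
}
```

Printing mixed.text, sameSymbol.text and alphabetic.text shows the parenthesisation that the compiler chose.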

Operator Binding and the Colon Character

One particularly important operator in Scala is :: which means “prepend an element to a list” (often called “cons”). The standard linked-list type provides a method with this name for building lists, and if you are familiar with functional programming you will know how often linked lists are used. Interestingly, :: is right-associative (ie different from most other operators). Actually, what is done in Scala is to define character : as right-associative, meaning any method ending in a colon (including ::) is right-associative.

In addition, for any method ending in a colon, the “binding” is reversed. Above, it was mentioned that “a op b” is equivalent to “a.op(b)”. However for any method-name ending with a colon, “a op b” is instead equivalent to “b.op(a)”. In particular, lists are commonly built like “1 :: 2 :: Nil” which is equivalent to “Nil.::(2).::(1)”.

Thus:

   val l1 = 1 :: 2 :: Nil;  // creates (1,2)  -- and equivalent to "val t1 = Nil.`::`(2); val l1 = t1.`::`(1);"
   val l1a = 0 :: l1;       // creates (0,1,2)
   val l2 = 3 :: 4:: Nil;   // creates (3,4)
   val all = l1 ::: l2;     // triple-colon is "flattening concatenation" which creates (1,2,3,4)

While this is something of an advanced topic, Nil possibly needs a little explanation. The object Nil is a singleton of type List[Nothing], ie the generic List type with a type-parameter of the special Nothing type, which is a subtype of every other type. The class List[A] has a method ::[B >: A](elem: B): List[B], ie a method which takes an object of type B and returns a list of that type. The type-constraint B >: A means that B must be A itself or a supertype of A. Because Nothing is a subtype of every type, a List[Nothing] can take a parameter of any type (eg an Int or String), and returns a list of that new type. From that point on, the new list object has type List[B]; prepending another element of the same type B gives another List[B], while prepending an element of an unrelated type widens the element type to a common supertype (often Any), which is rarely what is wanted. The result is that Nil can be used as the starting-point for lists of any type; the compiler infers the type of the list from the call b :: Nil to be whatever type b has. Scala’s type-inference has already been mentioned, and its support for type-parameters (generics) is discussed later.
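A brief sketch of these list-building rules (ConsDemo and its value names are invented for illustration):

```scala
object ConsDemo {
  val ints = 1 :: 2 :: Nil         // a List[Int]
  val spelledOut = Nil.::(2).::(1) // the same calls without the infix sugar
  val longer = 0 :: ints
  val joined = ints ::: List(3, 4) // ":::" prepends a whole list

  // Nil adapts to whatever element type is prepended to it.
  val strings: List[String] = "a" :: Nil

  // Prepending a String to a List[Int] widens the element type to Any.
  val mixed: List[Any] = "a" :: ints
}
```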

Point-free Style

Taken to extreme, this ability to use “infix” form for methods taking one parameter leads to something called “point free style”. For example:

  List(1,2,3,4) filter isEven foreach println

is equivalent to

  val input = List(1,2,3,4)
  input.filter(isEven).foreach(println)

In the above code, type List has a method filter which returns another List, which in turn has a method foreach which takes a function-reference.
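A self-contained variant of the example, with isEven defined and with map used in place of foreach so that the two forms produce a result that can be compared (all names are illustrative):

```scala
object PointFreeDemo {
  def isEven(n: Int): Boolean = n % 2 == 0

  val input = List(1, 2, 3, 4)

  val dotted = input.filter(isEven).map(_ * 10)
  val infix = input filter isEven map (_ * 10) // same calls, no dots
}
```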

Tuples

A tuple is somewhere between an array and a datastructure. Like an array, it is a fixed-size sequence of references to other objects. Unlike an array, each reference can have a different type. Tuples are defined simply like ("hello", 123, 12.45, "world"). The values in a tuple are usually accessed via pattern matching aka “destructuring” (see later). However the fields can also be accessed with sometuple._1, sometuple._2 etc, or by method sometuple.productElement(i). The total number of fields is available via method sometuple.productArity.

The operator -> builds tuples of two objects, eg 1->2 returns a tuple of two integers (ie a single object containing two subobjects).

Advanced topic: operator -> is actually defined on a type named ArrowAssoc, and there is an implicit conversion from any type to ArrowAssoc. When the compiler sees x -> y it can therefore generate the equivalent code ArrowAssoc(x).->(y) which returns a tuple. Implicit conversions are powerful and elegant - but finding the full set of such implicit conversions is not trivial for the compiler, and is possibly one of the reasons why the Scala compiler is so slow. It can also be very difficult when reading the code to figure out exactly what is going on.

Some quick tips:

  • a tuple can be converted to a string (for debugging) via: ("hello", 2, "world").toString
  • a tuple can be converted to a string (more controlled) via: ("hello", 2, "world").productIterator.mkString(",")

Classes

Basic Class Declarations

A Scala “class” declaration is roughly like a Java class declaration. However parameters to what is named the “primary constructor” are kind of squashed into the class declaration header, and the variable-declarations and inline anonymous code-blocks within the class body form the “body” of the primary constructor:

  class User(val id: Int, var name: String, private val yearOfBirth: Int, comment: String = "no comment") {
    var isActive = false
    private var foo = "something"

    println("in User constructor")

    def printme() = println(name + ":" + comment)
  }

Any “class parameters” which are prefixed with val or var are simultaneously constructor-parameters and members of the class; the members are automatically initialized to whatever the caller provided (saving the often boring boilerplate found in many Java class constructors). Parameters without val or var are available only during the execution of the constructor - but they can be used to initialise regular members, or “captured” as shown in the definition of method printme.

The primary constructor can be made private just like a Java constructor can be made private, via class User private (...) {..}.

The modifier on class-parameter id is “val”, so this becomes an immutable member with a public getter. The modifier on “name” is var, so this becomes a mutable member with auto-generated public getter and setter. There is no modifier on comment, so it is not stored on the class - but is “captured” by method printme so a reference to comment is effectively part of the class. As shown by parameter yearOfBirth, class-parameters can be explicitly declared “private var” or “private val” if desired.

As with method parameters, constructor-parameters can have default values.
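As a rough sketch of how such a class looks from the caller's side, here is a cut-down variant of the User class above. The methods describe and bornBefore are additions of mine for illustration only:

```scala
object UserDemo {
  class User(val id: Int, var name: String, private val yearOfBirth: Int, comment: String = "no comment") {
    // comment is captured here, so it stays reachable after construction
    def describe: String = name + ":" + comment
    def bornBefore(year: Int): Boolean = yearOfBirth < year
  }

  val u = new User(1, "fred", 1980)  // comment takes its default value
  u.name = "freddy"                   // var: uses the auto-generated setter
  // u.id = 2                         // would not compile: a val has no setter
  // u.yearOfBirth                    // would not compile here: private member

  // Named arguments can be used, as with ordinary methods
  val named = new User(id = 2, name = "ann", yearOfBirth = 1990, comment = "admin")
}
```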

The Scala compiler generates regular Java “.class” files as output, and running a Java decompiler on the classfile will show that Scala generates a Java class for each Scala class. The class has a constructor matching the parameters in the Scala class-definition, and getters/setters for those params and members declared with val/var.

Overriding the getter or setter for a member is not directly possible. Instead, you define the member as private, then define a read and/or write method which returns the private value:

  class Foo(private var idInternal: Int) {
    // Getter method usable like "var x = foo.id". Note that the method is declared with _no parameters_ (not even an empty-list).
    def id = {
      println("Getting id")
      idInternal
    }

    // Setter method callable like "foo.id = 12"
    def id_=(value: Int) = {
      println("Setting id")
      idInternal = value;
    }
  }

If defining a setter in this way, it is mandatory to also explicitly define a getter.
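One reason to write such a pair by hand is to validate values on assignment. A minimal sketch, using a hypothetical Temperature class of my own:

```scala
object AccessorDemo {
  class Temperature(private var celsiusInternal: Double) {
    // Getter: declared without a parameter list
    def celsius: Double = celsiusInternal

    // Setter: rejects physically impossible values
    def celsius_=(value: Double): Unit = {
      require(value >= -273.15, "below absolute zero")
      celsiusInternal = value
    }
  }

  val t = new Temperature(20.0)
  t.celsius = 25.5  // the compiler rewrites this to t.celsius_=(25.5)
}
```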

The setter method has an underscore between the name and the equals-sign because of the Scala rules on var/method names: a name may only include chars outside the ones allowed by Java if (a) it consists purely of such symbols (eg ::) or (b) the special chars are preceded by an underscore. Allowing things like “id=” or “id++” to be a valid variable or method name would obviously make additional whitespace mandatory in many places, which would be very inconvenient. The “special char is part of the name if-and-only-if preceded by underscore” rule is a reasonable compromise. In this case, though, it does force the id_= syntax which is a little odd.

A member which is declared without an initial value is “abstract” (obvious for immutable vals; not so obvious for vars). Null may be used as an initialization value.

There are several special method-names that can be defined on a class: apply, unapply, unapplySeq and update. See “special methods” later.

Traits (interfaces) and inheritance are discussed later.

More on Class Constructors

Here is an example of a class with some constructor-parameters and constructor logic:

  class User(val id: Int, var name: String, comment: String) {
    println("Primary constructor: running..")

    println("Primary constructor: Initialising internal var isActive")
    var isActive = false

    println("Primary constructor: Initialising internal var foo")
    private var foo = "something"

    // secondary (alternate) constructor
    def this(id: Int) {
      // delegate to the primary constructor above
      this(id, "unknown", "no comment")
    }

    def printme() = println(name + ":" + comment)

    println("Primary constructor: still running...")
  }

Note that the codeblock after the “class .. “ declaration is actually a kind of method-body, containing code. However variables declared in this method-body are also members of the class, and continue to exist after the constructor completes. This will feel somewhat familiar to Javascript developers, where a “constructor” actually creates variables and defines methods by storing data in the associated map. Scala is of course more statically-typed, but there are some similarities.

The above demonstrates a secondary constructor, which is declared as a method with name “this”. Note that there is no “=” between declaration and implementation (this is Scala 2 syntax; Scala 3 expects an “=” here). The implementation must itself call “this” to delegate to another constructor (eg the primary). Of course this particular secondary constructor, which just sets default values for parameters, can instead be implemented by defining default values directly in the primary constructor params.

The primary constructor may be made private via

  class User private (....) {..}

in which case instances can only be created via secondary constructors, or via factory-methods on a companion object (see later).

Case Classes

Scala has “case classes”, which are effectively a tuple with named elements. They are simple “data representation objects” or “data transfer objects” used for holding a bunch of related data.

A case-class instance is immutable by default, ie its constructor params become read-only (val) members unless they are explicitly declared as var. A case-class cannot be extended by another case-class, and it is common to declare case-classes final.

It automatically has a public field for each constructor param, even when the param has no val prefix. Standard functions equals, hashCode and toString are auto-generated, as is a copy method (which supports named params to change just specific fields). If there is no existing object declaration with the same name as the case-class, then one is also auto-generated which provides an apply method for creating new instances of that type (factory method) and an unapply method for use with pattern-matching. This all saves a lot of boilerplate code!

A case-class can extend existing classes and traits (implementing any abstract methods as usual), but not another case-class. Constructor params for the parent type(s) are passed along from the case-class constructor as usual in inheritance (see later).

You can define methods on a case-class, and can also define them on the companion object.

Creating an instance of a case-class can be done with the “new” keyword, but is usually done via a factory method on the companion object (which is auto-generated if not defined by the developer); see below. The default copy method can then be used similarly to the builder-pattern if desired to create variants of the original object.
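The auto-generated features can be seen in a short sketch (Point is a made-up example type):

```scala
object CaseClassDemo {
  case class Point(x: Int, y: Int)

  val p1 = Point(1, 2)          // auto-generated companion apply; no "new" needed
  val p2 = p1.copy(y = 5)       // copy with just one field changed
  val same = p1 == Point(1, 2)  // equals compares field values, not references
  val text = p1.toString        // auto-generated toString

  // The auto-generated unapply allows destructuring
  val Point(px, py) = p2
}
```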

Object Declarations (Singletons and Companion Objects)

A Scala “object” declaration defines a type of which there will be only one instance at runtime. The singleton instance is created automatically (lazily, the first time it is referenced); it is not possible to create more instances. The instance can be referenced via its “object name”.

  object MySingleton {
    val PI = 3.14       // declare members as usual
    def foo(...) = ...  // declare methods as usual
  }

  // Call it using its "object name" as the "instance"
  MySingleton.foo()

Such singleton objects can be used to do the kind of things which Java developers implement as static variables and static methods. It also replaces all use of the (old-fashioned) static-singleton-pattern from Java.

Singleton objects can extend a base class and have traits; they have a proper type after all (unlike Java statics). By default the base class is AnyRef (similar to Java’s Object base type) but the base class can be anything - what is special about an object is not its type, but the fact that there is only one instance of it.

Declaring an object also effectively declares a variable of the same name which points to the singleton instance. Given an object declaration like above, the name “MySingleton” is simply an immutable reference to the singleton instance, just like any other reference variable.

The “main” method of an application (which is static in Java) must be defined on a Scala singleton object as a method named “main(args: Array[String]): Unit”.
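A short sketch pulling these points together: a singleton which implements a trait, is passed around as an ordinary value, and is used from a main method. The names Greeter and English are invented for this example:

```scala
object ObjectDemo {
  trait Greeter { def greet(name: String): String }

  // A singleton object with a proper type: it implements the Greeter trait
  object English extends Greeter {
    def greet(name: String): String = "Hello, " + name
  }

  // The object name is an ordinary reference and can be assigned or passed around
  val g: Greeter = English

  // Application entry point, the equivalent of Java's static main
  def main(args: Array[String]): Unit = println(g.greet("world"))
}
```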

When an object-declaration has the same name as a Class definition, and is in the same file, then it is called a “companion object” to the class. A Scala class may call private methods of its companion object, and a companion object may call private methods on its companion class (including private constructors). Because of this, a companion object is commonly used to define factory methods for the type with the same name:

  // class with no public constructor
  class User private (val id: Int, var name: String, comment: String) {...}

  // companion object
  object User {

    // trivial factory method (but not following usual conventions; see below for info on the apply method)
    def create(id: Int, name: String, comment: String) = new User(id, name, comment)
  }

  // Use the factory-method
  var user1 = User.create(1, "fred", "no comment")

The example above defines a factory-method with name create that returns an instance of the (non-singleton) User class. However in practice, such factory-methods are usually defined as an apply method; see the following section for a quick overview and later for more complete information on apply methods.

If you use a Java decompiler on the generated code for a class with a “companion object”, you will see that the class has a method which returns the “companion object” instance. The implementation of this method returns the value of a static field on the JVM class of the companion object, rather in the way the static-singleton-pattern is traditionally implemented in Java.

The type of a singleton object and its companion non-singleton type are not related in any way - the only link between them is that they happen to share a common name at the Scala sourcecode level and that they have access to each other’s private methods and attributes.

When the compiler is searching for implicit methods and classes related to some class C then the companion object for class C is included in the list of places checked for such definitions. This means that the companion object is a convenient place to define implicit methods for type-casting and other interesting behaviour. See later for a discussion of implicit typecasts and implicit methods.

Scala object-types should not be overused. Statics are a pain when testing in Java; it is far nicer when a class under test has a “provider object” injected into it during construction. In production Java code, the object injected can be a singleton while in testing it can be a mock. The same approach should be taken in Scala, ie if you may need to mock instances for testing then don’t make them singleton “objects”.

When programming in Scala in a truly functional style, an application may have few or no class-declarations at all; instead case-classes are used for declaring data-structures and then methods on object declarations are used to manipulate those datastructures, with higher-order functions (functions as parameters) used for structuring dataflows rather than object-oriented interface/class relations.

The Apply Method as Factory

A method named “apply” on a class or singleton-object acts as a “default method” which can be invoked without specifying its name. If a class named Foo has a method named apply, then it can be invoked on an instance of Foo like someFoo.apply(12) or simply someFoo(12). As seen in the first example above, methods on an object-declaration SomeObject are invoked as SomeObject.method(..); thus an apply-method can be invoked simply as SomeObject(...). This makes the apply method a nicer way to provide factory methods for the class of the same name than the “create” method approach shown above:

  // class with no public constructor
  class User private (val id: Int, var name: String, comment: String) {...}

  // companion object
  object User {
    def apply(id: Int, name: String, comment: String) = new User(id, name, comment)
  }

  // Use the factory-method: invokes singleton-object-method User.apply(...)
  var user1 = User(1, "fred", "no comment")

This factory-pattern (companion with apply-method) can be found often in Scala, eg the standard List type: “val mylist = List(....)” invokes List.apply(...) on “object List”.

Quite often the companion object for an abstract class will define an apply-method which acts as a factory for the default implementation of that type. Example: Seq(1,2) returns a List (List of course is a subtype of Seq) - though maybe that is not the best possible example as that specific code-path is somewhat indirect:

  • Companion object Seq extends SeqFactory which extends GenSeqFactory which extends GenTraversableFactory which extends GenericCompanion
  • Method GenericCompanion.apply calls back into SeqFactory.newBuilder which returns new mutable.ListBuffer
  • Calling .result on the ListBuffer then returns an immutable list.

See later in this article for more information on apply methods.

Traits (interfaces)

Scala’s equivalent of Java interfaces is the trait. It works similarly to interfaces, and is pretty obvious.

A trait is meant to be used to define behaviour that is relevant to many different unrelated types, eg “serializable” could be a good trait, or “closeable”:

  trait Closeable {
    def close(): Unit
  }

Traits may (obviously) have abstract method definitions. Like class-based abstract methods, the return-type must be explicitly defined.

Like Java8 interfaces, traits may also have concrete method definitions - methods with bodies.

Unlike interfaces, traits can also declare members:

  • When the member is abstract (not initialised with a value) then any class which implements that trait must itself declare a member with that name. A similar effect can be obtained in Java by having the interface define abstract getter/setter methods and then use these in a concrete method implementation on the interface.

  • When the trait member is initialised then all classes which inherit from the trait automatically get a member of that name. This isn’t quite the same as inheriting a member from an ancestor class, ie this isn’t full “multiple inheritance”; see “linearization” below.

Traits do not have constructors.

Traits which declare members (whether abstract or not) are often referred to as mixins.
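A minimal sketch of a trait with an abstract member, an initialised member and a concrete method (Named, Person and Robot are made-up names):

```scala
object TraitDemo {
  trait Named {
    def name: String                              // abstract: implementers must supply it
    val greetingPrefix: String = "Hi, "           // initialised: implementers inherit it
    def greeting: String = greetingPrefix + name  // concrete method using both
  }

  // A val class-parameter satisfies the abstract member
  class Person(val name: String) extends Named

  // Members can also be supplied or overridden in the class body
  class Robot extends Named {
    val name = "R2"
    override val greetingPrefix = "Beep, "
  }
}
```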

To support traits with members (which are not allowed in Java interfaces), each Scala trait compiled with Scala 2.11 or earlier actually generates JVM bytecode for both an interface (named in the usual manner) and an abstract class named {traitname}$class; from Scala 2.12 onwards (which targets Java 8) a trait is instead compiled to an interface with default methods. This is normally irrelevant in pure Scala code, but can be useful to know when interacting with Scala libraries from Java code.

Inheritance

Scala inheritance works reasonably similarly to Java, but the syntax is a little different:

    class Derived (val arg1: Int, arg2: String)
      extends Base(arg2) with Closeable with Serializable {
      ...
    }

At most one base class may be specified, but multiple traits are permitted. The first ancestor type is indicated using “extends” and subsequent ones are indicated using “with”.

Any parameters required by the base-class are passed in the “extends” clause. They are usually forwarded directly from constructor-params of the subclass, but any expression is allowed there (eg extends Base(arg2.trim)); if more involved preparation is needed, then the constructor can be made private and that logic can be done in a factory method. Arguments which are only being forwarded to the base-class are normally declared without “val” or “var”; adding one would give the subclass a field duplicating one from the base class, which rarely makes sense.

Traits do not have constructors, so a class never needs to pass arguments to a trait it implements.

A problem with multiple-inheritance of traits or interfaces is what to do when the same method (or field in the case of traits) is defined in multiple ancestor types. In Scala, the order in which traits are listed in the “with” clause is significant in resolving these conflicts (see “linearization” later). Java8 simply reports a compiler-error in this case.

A trait may extend a class as well as another trait - as long as the ancestor class has a constructor with no arguments. In this case, the trait may use “super” in its method-definitions to call methods on the base class. When a concrete class mixes in multiple traits which extend the same base type and override the same method, then overridden methods in traits are invoked somewhat like a chain of superclasses, determined by the order in which the concrete class declares the mixins: the trait listed last is invoked first, and its “super” call reaches the trait listed before it.
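The effect of the mixin order can be sketched with two traits that both override the same method and call super. All names are invented for illustration:

```scala
object StackDemo {
  class Base { def describe: String = "base" }

  // Both traits extend Base and decorate the result of the next type in the chain
  trait Exclaim extends Base { override def describe: String = super.describe + "!" }
  trait Bracket extends Base { override def describe: String = "[" + super.describe + "]" }

  // Same traits, different order: the right-most trait runs first
  class A extends Base with Exclaim with Bracket
  class B extends Base with Bracket with Exclaim
}
```

With these definitions, new A().describe yields "[base!]" while new B().describe yields "[base]!".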

When a subclass reimplements a concrete method defined in an ancestor type, the Scala keyword “override” must be added to the “def” statement (similar to Java’s @Override annotation, but mandatory). As in Java, keyword “super” is used to invoke methods in the base class.

When a class does not declare an ancestor type then its default ancestor type is standard type AnyRef (similar to Java’s Object type).

Guidelines for subclassing:

  • never derive an abstract class from another abstract class
  • never derive a concrete class from a concrete class except to add mixins (pure logic without fields)
  • when a problem seems to require one of the above, try to use the adapter pattern.

Linearization of Traits

When the Scala compiler encounters a class which inherits from a trait, it actually uses the trait as a kind of “template” from which a new abstract class is generated.

For a class declared as “class Foo extends Base with Trait1 with Trait2” the compiler generates an abstract class C which extends Base and “pastes in” any non-abstract contents of Trait1, then another abstract class D which extends C and “pastes in” any non-abstract contents of Trait2. Class Foo then extends class D. The generated classes implement the JVM-level interfaces corresponding to each trait, but the implementations (methods and members) are “specialized” for the class which extends them.

This process is called “linearization”; obviously it does increase the number of classes in the classpath (runtime overhead) but this approach allows traits to have non-abstract members without the problems of full multiple-inheritance. Linearization provides some additional advantages over interfaces; in particular, a trait can override a method inherited from an ancestor class; this allows effects similar to AOP (aspect-oriented programming) method interceptors.

Anonymous Subclasses

The syntax “new Foo with Bar” creates a new instance of an anonymous class which is a subclass of Foo and implements Bar. Any custom code in the new anonymous class can reference methods defined in the “with” mixin. All code in the same scope also sees the variable as the anonymous subtype, ie can call the mixin methods. Obviously if the instance is passed into some other function or returned then it is only accessible as the specified param-type or return-type.
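A tiny sketch of this, with Foo and Bar given made-up methods:

```scala
object AnonDemo {
  class Foo { def base: String = "foo" }
  trait Bar { def extra: String = "bar" }

  // Anonymous subclass of Foo which also mixes in Bar
  val x = new Foo with Bar

  // In this scope both the Foo and the Bar methods are visible
  val both = x.base + x.extra

  // A function declared to take a Foo sees only the Foo part
  def useFoo(f: Foo): String = f.base
}
```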

Sealed Classes and Traits

A class or trait declared as “sealed” can only be subtyped in the same file. This allows the developer/compiler to know the full set of subtypes is fixed. This is particularly useful in pattern-matching statements (similar to Java switch-statements) where the compiler can determine whether a “default” clause is required or not.

Sealed traits are a very common pattern in the functional-programming community and are called “sum types” (a kind of Algebraic Data Type aka ADT). A trait (interface) is declared, and then the complete set of implementations of that interface are declared in the same file.
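A minimal sketch of a sealed trait used as a sum type (Shape and its subtypes are illustrative names). Because Shape is sealed, the compiler knows that the match below covers every case, and would warn if one were missing:

```scala
object SealedDemo {
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  // No default clause is needed: all subtypes of Shape are handled
  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
}
```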

As with Java, the keyword final prevents classes from being subclassed at all.

The Implicit Keyword

Here is a brief summary of the three kinds of functionality which use the ‘implicit’ keyword. For more information on implicits, see the official reference.

All kinds of implicit behaviour should be used sparingly - it can make the code flow hard to read.

Implicit Type Conversions

When an object of type T is passed as a parameter to a method which expects a parameter of type Q, and T does not inherit from Q then the compiler does not just give up, as would be the case in Java. Instead, the compiler looks for an implicit wrapper class or an implicit wrapper method which takes an instance of T and returns an instance of Q; if one is found then the necessary call to convert the T to a Q is inserted automatically.

A wrapper class can be defined as implicit class SomeName(arg:T) extends Q ... ie a class whose constructor takes a single instance of the type to be converted from, and which implements the target type.

Alternatively, a wrapper method can be defined as implicit def someName(t:T): Q = ...

The class or method can be declared locally, or be brought into scope via an import statement.

One limitation is that this does not work well for objects being passed around as some abstract type. The lookup of the implicit conversion method/class is done on the declared type of the variable, not its runtime type.
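A small sketch of the wrapper-method form, using a hypothetical Meters type. The import of scala.language.implicitConversions just acknowledges to the compiler that the feature is being used deliberately:

```scala
import scala.language.implicitConversions

object ConversionDemo {
  case class Meters(value: Int)

  // Implicit wrapper method: converts an Int into a Meters
  implicit def intToMeters(n: Int): Meters = Meters(n)

  def describe(m: Meters): String = m.value.toString + " m"

  // An Int is passed where Meters is expected; the compiler inserts intToMeters(5)
  val s = describe(5)
}
```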

Implicit Methods

When a method M is invoked on a type T which does not have any such method, the compiler searches for an implicit type conversion (see above) which converts T to a type which does have method M.

Interestingly, this effectively allows “adapters” to be defined for a type transparently, which is basically equivalent to adding methods to a type without modifying the definition of that type. This is similar to a Haskell “type class” or C# “extension methods”. Given some interface (trait), an implicit class can be defined that takes an instance of some type T and returns an adapter which implements that trait for that type.
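As a sketch, here is an implicit class which appears to add a method named shout (my own invention) to the standard String type:

```scala
object ExtensionDemo {
  // Wrapper class: the constructor takes the type being extended
  implicit class RichWord(val s: String) {
    def shout: String = s.toUpperCase + "!"
  }

  // String has no shout method, so the compiler rewrites this
  // to new RichWord("hello").shout
  val loud = "hello".shout
}
```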

These implicit conversions are used to handle mixed-type arithmetic, eg adding ints and longs. In most cases, the implicit-method-invocations are inlined in the generated code.

The standard-library type Option can be implicitly converted to a list of 0 or 1 items; this allows using an Option in a list-comprehension, ie “dynamically” makes all List operations also available on Option instances, as Option instances can be converted to instances of List.

Some plain Java types (eg String) have implicit conversions to Scala wrappers which provide additional related methods. The primitive Java types have implicit conversions to “objectify” them (ie a Java int can be treated as a scala.Int).

There is a Scala standard package which provides implicit converters for Java collection types, transparently mapping them directly to Scala standard library equivalents. However this package is no longer recommended; the Scala standard library also offers a different package which provides implicit conversions to various wrapper types each of which offers a method “asScala” - ie these types are not directly converted to a Scala collection, but instead converted to an intermediate form that provides the option to elegantly but explicitly convert to the Scala equivalent.

Implicit Parameters

A method can mark its final parameter-list with the keyword “implicit”; the keyword appears once, at the start of that list, and all parameters in the list are then implicit-parameters. An example is def foo(i: Int)(implicit w: Widget) = ... When such a method is invoked the caller can specify a value for implicit parameters, just like normal parameters. However if the caller does not specify a value (omits the implicit parameter-list) then the compiler looks for a declaration of an implicit value of the appropriate type somewhere in the current lexical scope, and passes that value automatically as the parameter.

The exact rules for where the compiler looks for “suitable declarations” are a little complex, but the following are among the options:

  • implicit val someobj = ... in the local method
  • implicit val .. on some type specified in an import-statement
  • implicit val ... on the companion object of the object on which the method is being invoked.
  • implicit object ... somewhere in scope; declaring an object-type effectively declares a variable of the same name, ie a val referencing the singleton instance of that type, and is thus conceptually the same as the previous options. Remember that singleton objects can implement traits…

The companion-object lookup is particularly useful when subclassing a type or trait which defines a method with an implicit param; the subtype can define an appropriate implicit object to use as a parameter for all instances of that subtype (when not overridden by the caller via a local implicit val or an imported implicit val).
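A minimal sketch of an implicit parameter being filled in from a local implicit val, and then being overridden explicitly (Widget and render are made-up names):

```scala
object ImplicitParamDemo {
  case class Widget(label: String)

  // The second parameter-list is implicit
  def render(text: String)(implicit w: Widget): String = "[" + w.label + "] " + text

  // An implicit value of the required type, in lexical scope
  implicit val defaultWidget: Widget = Widget("default")

  val viaImplicit = render("hi")                    // compiler passes defaultWidget
  val viaExplicit = render("hi")(Widget("custom"))  // caller supplies the value
}
```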

Type Classes (adding functionality to classes)

The ability to require an implicit parameter for a method provides an indirect, but very powerful, way to extend the functionality of existing classes without “wrapping” it as described in the sections on implicit type conversions and implicit methods.

The design pattern is as follows:

  • design a trait that could potentially be used with many different types of object
  • when defining methods which operate on objects of this trait, do not constrain the input parameter to be of a specific type - instead require an implicit parameter that implements that trait for the input argument type.
  • and in the method implementation, use methods on the implicit parameter to manipulate the parameter object(s) rather than invoking methods directly on the parameters (ie program in functional-style rather than object-oriented-style).

This looks something like the following:

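trait WantedTrait[T] {
  // an operation that can be implemented for many unrelated types T
  def doSomething(obj: T): Unit
}

// Scala declares parameters as "name: Type"; the implicit goes in its own parameter-list
def someMethod[P](obj: P)(implicit operations: WantedTrait[P]) = {
   operations.doSomething(obj)
}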

The code someMethod(arg) will compile when arg has type SomeType if (and only if) there is an implicit object in scope which implements WantedTrait[SomeType]. Or in other words, someMethod can be used on any type X by defining a suitable implicit implementation of WantedTrait[X]. That makes someMethod extensible to handling any type - as long as a sensible implementation of WantedTrait can be defined for it. This is the essence of the “type class” concept from the Haskell functional language. And importantly, it requires instantiating no wrapper classes around the parameter at runtime - the implementation of WantedTrait[X] for any X should be a singleton, ie the implicit method call just passes a reference to an already-existing and stateless object.

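Scala has a “shortcut syntax” for this pattern (called a “context bound”): def someMethod[P: WantedTrait](obj: P) is equivalent to the longer definition above, except that the implicit parameter then has no name and must be retrieved in the method body via implicitly[WantedTrait[P]].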
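To make the pattern concrete, here is a small sketch with a made-up type class Show, two instances placed in its companion object (so the compiler finds them without an import), and both the long and the shortcut form of a method that requires an instance:

```scala
object TypeClassDemo {
  // The "type class": an operation that can be defined for many types
  trait Show[T] { def show(value: T): String }

  // Instances are stateless singletons living in the companion object
  object Show {
    implicit object IntShow extends Show[Int] {
      def show(value: Int): String = "Int(" + value + ")"
    }
    implicit object StringShow extends Show[String] {
      def show(value: String): String = "'" + value + "'"
    }
  }

  // Long form: explicit implicit parameter-list
  def describe[P](obj: P)(implicit ops: Show[P]): String = ops.show(obj)

  // Shortcut form: the instance is fetched with implicitly
  def describeAll[P: Show](objs: List[P]): String =
    objs.map(o => implicitly[Show[P]].show(o)).mkString(", ")

  val a = describe(42)
  val b = describeAll(List("x", "y"))
  // describe(1.5) would not compile: there is no Show[Double] instance
}
```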

A more detailed explanation of the concept can be found here.

It is unlikely that a Scala beginner will implement such code, but this pattern is found sometimes in external libraries - and even in the standard Scala library, so it is good to be able to recognise what it is doing.

Structural Typing

Scala allows class-generic-types and method-parameter-types to be defined “inline”; such a declaration is called a structural type. It is then possible to use any type which has the appropriate properties.

Because the following method uses a structural type for its parameter, any object with a no-args method called “name” may be passed here:

  def onAnyNamedObject( target: { def name:String }) = {
    println(target.name)
  }

Structural typing is only available when an inline type is explicitly specified; it doesn’t mean that an instance of some class T1 can be cast to a different class T2 just because T1 is a superset of the properties/methods of T2.

Structural type usage is checked at compile-time for correctness. However at runtime, this is currently implemented via reflection - ie this has a performance impact.

Packages and Import Statements

Declaring Packages

Scala code is grouped into packages as with Java; a Scala sourcefile should start with a package-declaration like:

  package foo

All declarations following that line are within package foo. There is no absolute requirement on the directory-name in which the file is stored (Scala tools can work with any desired structure). However some tools do assume that Scala code in package X is in a directory with the same name, so that layout is still recommended.

Scala code often does not follow the Java package-naming convention of “reverse internet domain names” (eg com.example.projectname.componentname); instead much “flatter” structures are often used. The Scala standard libraries are usually just under package “scala”.

Packages can be defined with nested syntax:

   package foo
   package bar // nested inside foo, ie foo.bar

Importing Stuff

Imports work similarly to Java, ie provide a short alias for the full name of something. However there are a few special things about imports in Scala.

Wildcard imports use the “_” character rather than “*” (because “*” is a valid name in Scala). A single “_” is a reserved word and thus not a valid type or function-name.

The import statement can be used to enable certain Scala language features; an import starting with “scala.language.” is not a real package but rather an indication that the specified feature should be enabled for the following code.

An import-statement can define a local “alias” for an imported type to avoid name clashes.

An import-statement can import multiple types via import com.example.{Type1, Type2}.

An import-statement can be used to import the methods and members of a type (usually an object-declaration), making its members and methods accessible without a dot; import com.example.Widget._ where Widget is a singleton object will make all the methods of Widget available without needing Widget. as a prefix.

Unlike in Java, an import statement can be placed anywhere in a Scala file, and it is scoped exactly like variable declarations. When a type is used only in one function, the corresponding import statement can be placed in only that function.
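Several of these import forms can be sketched together, here using standard Java and Scala library types (the enclosing object and method names are made up):

```scala
object ImportDemo {
  // Multiple imports from one package, one of them with a local alias
  import java.util.{ArrayList => JArrayList, Arrays}

  def javaList(): JArrayList[String] = new JArrayList[String](Arrays.asList("a", "b"))

  def circleArea(r: Double): Double = {
    // Wildcard import of the members of an object, visible only inside this method
    import scala.math._
    Pi * pow(r, 2)
  }
}
```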

The compiler first tries to resolve an imported package relative to previous imports in the same file. Given:

import com.example.foo
import com.example2.bar
import foo.baz

the compiler resolves the last import as com.example.foo.baz.

Package Objects

Java supports a single file literally named package-info.java in each package (filesystem directory). However in Java, the only thing such a file can contain is the package declaration, with attached annotations and javadoc.

Scala similarly supports a single file in each package literally named package.scala. For a package com.example.foo, this file should contain:

package com.example

package object foo {
}

Scaladoc (javadoc-like documentation) and annotations can be attached to the “package object”, as in Java. However the package object can also define types, constants, and methods which are useful for all code in the package. The contents of the “package object” are implicitly imported into every other file in the package.

Access Control

Scala supports modifiers to control access to classes (and methods):

  • Modifier “public” works just like Java - but is seldom seen as it is the default in Scala.
  • Modifier “protected” is similar to Java (grant access to subclasses) but unlike Java it does not grant access from code in the same package.
  • Modifier “private” is similar to Java, but does not apply to any “companion object” in the same file.
  • Modifier “implicit” is complicated; see the dedicated section on this.
  • There is no “package scope” modifier (and omitting a modifier implies public). Instead package-access is granted via an attached “scope”; see below.

It feels a little strange for things to be public by default, but (a) Scala uses immutable variables and pure functions much more than Java, making it less dangerous and (b) what looks like direct access to a public field in Scala is actually a method-call with zero parameters, allowing an accessor method to be defined later if desired without breaking calling code.

The Java modifier “abstract” is not needed on methods in Scala; any method without a body is abstract. The abstract modifier can be applied to classes.

A modifier can have a “scope” applied to it, eg private[widgets]; this example makes the class private (ie it can only be accessed by the class itself, or a companion object) but then grants access to all code within the specified package, and its sub-packages. The scope must be the simple name of an enclosing package (or of an enclosing class or object), not a dotted path; eg when the current package is com.example.acme.widgets then private[widgets], private[acme] and private[example] are all valid, but private[com.example.widgets] is not.

Generics

Generic Types

Scala has generics somewhat similar to Java, though more powerful and with a different syntax. Type-params are specified in square brackets (“[]”).

To declare an instance of a generic type:

  val myStrings = List[String]()  // List is abstract, so use the companion-object factory rather than "new"

To declare a parameter of a generic type:

  def printAll(items: List[String]) = ...

Covariance, Contravariance and Type Bounds

The tricky Java-generics “covariant” and “contravariant” support (“<? extends P>” and “<? super P>”) is replaced by expressions like:

  class Stack[+A] {
    def push[B >: A](b: B): Stack[B] = ...
  }

More obvious? No, not obviously. However it is more powerful. The above declarations mean that Stack is covariant in A (a stack of some subtype of A can be used wherever a Stack[A] is expected), and that method push takes a parameter of some type B which must be A or a supertype of A, returning a Stack[B].

The equivalent of the Java generic “?” is Scala’s “Any” type. All Scala types are subtypes of Any (including Int, Boolean, etc). The wildcard “_” can also be used.

  • Syntax “A<:B” is an “upper type bound” (A must be B or a subtype of B). This is equivalent to Java “<? extends B>”. Defines “out” types, useful to constrain types returned from methods.

  • Syntax “B>:A” is a “lower type bound” (B must be A or a supertype of A). This is equivalent to Java “<? super A>”. Defines “in” types, useful to specify types passed to methods.

  • Syntax “List[_]” is equivalent to Java “List<?>”.
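To make the Stack declaration above concrete, here is a minimal sketch of an immutable covariant stack. The names StackDemo, Empty, NonEmpty and toList are invented for this example and are not part of any library:

```scala
object StackDemo {
  sealed trait Stack[+A] {
    // B may be A or any supertype of A; the result is a stack of the wider type
    def push[B >: A](b: B): Stack[B] = new NonEmpty[B](b, this)
    def toList: List[A]
  }

  case object Empty extends Stack[Nothing] {
    def toList: List[Nothing] = Nil
  }

  final class NonEmpty[+A](top: A, rest: Stack[A]) extends Stack[A] {
    def toList: List[A] = top :: rest.toList
  }

  val ints: Stack[Int] = Empty.push(1).push(2)
  val anys: Stack[Any] = ints.push("three")   // B is inferred as Any
}
```

Pushing a String onto a Stack[Int] is allowed because the lower bound lets B widen to a common supertype; the original stack is left untouched.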

Generic Type Parameters via Abstract Type Declarations

The simplest way to define a generic type is like class Foo[A] .....

An alternative way to express generic type parameters is to embed the types within the class body. An abstract class can include definition “type T” which is equivalent to Java generic syntax “<T>”. Some subclass then declares “type T = sometype” to complete the definition. As example:

  abstract class MyGenericClass {
    type Foo                 // abstract type member; no "abstract" keyword needed
    def templateFn(): Foo    // abstract method: it has no body
  }

The above approach does require that a subclass be declared. The alternative syntax class Foo[T] does not; a declaration can be made inline eg val foo: MyGenericClass[Integer]. However the abstract type syntax can be clearer in some cases.

  • adding upper-bounds or lower-bounds to a type-decl makes it more restrictive
  • adding variance declarations makes it less restrictive (since the default is invariance)
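A small sketch of abstract type members, including one with an upper bound. All names here (AbstractTypeDemo, Source, IntSource, BoundedSource, words) are hypothetical:

```scala
object AbstractTypeDemo {
  abstract class Source {
    type Item                 // abstract type member, plays the role of a type parameter
    def next(): Item
  }

  class IntSource extends Source {
    type Item = Int           // the subclass fixes the type
    private var n = 0
    def next(): Int = { n += 1; n }
  }

  abstract class BoundedSource {
    type Item <: AnyRef       // an upper bound makes the declaration more restrictive
    def next(): Item
  }

  // an anonymous subclass can complete the definition inline
  val words = new BoundedSource {
    type Item = String
    def next(): String = "word"
  }
}
```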

Variance

Given that B is a subtype of A, what is the relationship between List<A> and List<B>?

  • invariant: no relation, incompatible
  • covariant: List<B> can be cast to List<A>, ie someListOfA = someListOfB
  • contravariant: List<A> can be cast to List<B>, ie someListOfB = someListOfA.

In general, covariance is useful when reading values of the generic type, while contravariance is useful when writing values of the generic type. Or in other words:

  • when a type is declared as covariant (with +A) it can be used as a return-type for methods (out), but not a param-type (in).
  • when a type is declared as contravariant (with -A) it can be used as a param-type (in) but not a return-type (out).

Java classes are always invariant, except in two special cases: arrays and method-return-types.

An example of Java array covariance:

// Create a "view" of the underlying string array as a different type
Object[] objarray = new String[]{"s1", "s2"};

// Same principle as above, this time with a parameter instead of a variable
void methodTakingArray(Object[] data) {...}
methodTakingArray(new String[]{"p1", "p2"});

// Negative side of array covariance: covariance is good for _reading_ but bad for _writing_..
objarray[0] = Boolean.TRUE; // this write-operation compiles but throws exception at runtime!

And in Java, given a parent type defining a method returning A, a subtype can override the method and return a subtype of A.

Scala has very sophisticated facilities for expressing generic variance constraints, including:

  • List[+A] (covariance)
  • List[-A] (contravariance)

Immutable collections are “read only” and therefore can be safely declared as covariant on the datatype they contain; Scala’s standard immutable types do this.
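The following sketch shows both directions of variance. Animal, Cat and Printer are made-up types for illustration; List is the standard immutable list, which is declared covariant:

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Cat extends Animal { override def name: String = "cat" }

  // contravariant: A only appears as a parameter type ("in")
  trait Printer[-A] { def show(a: A): String }

  val animalPrinter: Printer[Animal] = new Printer[Animal] {
    def show(a: Animal): String = "I am " + a.name
  }

  // contravariance: something that prints any Animal can certainly print a Cat
  val catPrinter: Printer[Cat] = animalPrinter

  // covariance: a list of cats can be read as a list of animals
  val cats: List[Cat] = List(new Cat)
  val animals: List[Animal] = cats
}
```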

Generic Self Types

A trait can declare “this: sometype =>” which requires that any type extending that trait must be a subtype of sometype. Method implementations on the trait can then access methods defined on the specified type-bound. Example uses are traits which can only be used together with types that are “serializable” (have a write method) or “ordered” (have a compare method).
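A minimal sketch of a self type; the names Named, Greeter and Person are invented for this example:

```scala
object SelfTypeDemo {
  trait Named { def name: String }

  trait Greeter { this: Named =>          // may only be mixed into something that is also Named
    def greet: String = "Hello, " + name  // name comes from Named via the self type
  }

  class Person(val name: String) extends Named with Greeter

  val greeting = new Person("Ann").greet
}
```

Declaring class Person extends Greeter without also mixing in Named would be rejected by the compiler.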

Pattern Matching

Pattern-matching is like a “super switch statement”:

  val result = somevar match {
     case pattern1|pattern2 => val1
     case pattern3 => val3
     case _ => defaultval
  }

Most of the built-in types support pattern-matching. Case-classes automatically support pattern-matching. Custom classes are useable with pattern-matching when some class defines an “unapply” method which takes an instance of the custom type and returns a tuple representing the object’s matchable state; this method is often defined on the “companion object” for a type.

A statement of form “x match { case Foo(p1,p2) ..}” will result in a call to “Foo.unapply(x):Option[sometuple]”. When no such method exists, or the method returns “None”, then the case is considered not to match, otherwise the values in the returned tuple are compared to or bound to p1,p2,etc.

There is an “isInstanceOf” method defined on a base Scala type (and thus available on every object). However it is considered better style in Scala to use pattern-matching when the type is not known. Expression case v: SomeType => .. matches only if the value is of the specified type, and binds it to v with that type.
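As a sketch of the mechanism described above, here is a hand-written extractor used together with a type pattern. The class Email and the method describe are hypothetical:

```scala
object ExtractorDemo {
  class Email(val user: String, val domain: String)

  object Email {
    // returns the "matchable state" of an Email as a tuple
    def unapply(e: Email): Option[(String, String)] = Some((e.user, e.domain))
  }

  def describe(x: Any): String = x match {
    case Email(user, "example.com") => "internal user " + user   // second value is compared
    case Email(user, domain)        => user + " at " + domain    // both values are bound
    case n: Int if n > 0            => "positive int"            // type pattern with a guard
    case _                          => "something else"
  }
}
```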

The full details of pattern-matching are too complex to be described here. I have written a separate article dedicated to Scala pattern-matching.

Destructuring Bindings

A destructuring binding statement (also known as destructuring assignment) can extract values from a tuple:

val mytuple = (2, "hello", List(1,2,3))
val (a, b, _) = mytuple // a=2, b="hello"

An assignment statement can also extract values from a case-class (or other type for which an unapply method exists):

val myWidget = MyWidget(2, "hello", List(1,2,3))
val MyWidget(a, b, _) = myWidget // a=2, b="hello" - assuming MyWidget.unapply(..) is implemented in the usual manner..

The same destructuring-based access to fields within a tuple or case-class can be used in a map-function:

// below, variables "s" and "i" are assigned to components of the tuple as each is processed..
List(("tuple1", 45), ("tuple2", 67)).map { case (s, i) => "" + i + s }.foreach(println)

Functions and Lambdas

Anonymous Function Declaration

Method declarations use the “def” keyword and are statements; they have a side-effect and return no result. They are also “compile-time-only” structures.

Function definitions instead return a result that must be stored in a variable or passed as a parameter. They are partly compile-time; the compiler checks the syntax and generates code for them. However because they “return a result”, they also have runtime behaviour. Functions are also called anonymous functions, closures, or lambdas.

Function declaration syntax is similar to method-declaration syntax. However it:

  • does not use “def”
  • puts the args before a “=>” operator

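The full syntax, with the function type written explicitly on the variable, is:

   val f1: (Int, Int) => Int = (x:Int, y:Int) => { 0 }

Usually the type of the function can be deduced from the parameter types and the body:

   val f1 = (x:Int, y:Int) => { 0 }

The value of the last expression in the body is the result; a return-statement is not used (inside a function literal, “return” would try to return from the enclosing method). And when the body is just one expression, the braces can be omitted:

   val f1 = (x:Int, y:Int) => 0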

More examples of anonymous function definitions:

  var f1 = (x: Int) => x + 1
  var f2 = (x: Int, y:String) => {...}
  var f3 = () => println("Hello, World")

As with method declarations, braces are optional when the body is a single expression. Unlike method declarations, a function body should not use the “return” keyword; the value of the last expression is the result.

Because the “this” reference associated with a method defined on a Scala “singleton object” is always uniquely identifiable, methods on singleton objects can also be used as functions:

  var pln: Any => Unit = scala.Predef.println  // the declared type selects the one-argument overload

This is similar to the way Java can use static methods as lambdas via syntax like “Integer::parseInt”. Methods of a singleton object can actually be imported via an import-statement, making them callable without needing the name of the singleton object type - and all methods of scala.Predef are imported by default, thus “println” can simply be written directly.

Lambdas are sometimes called functions or anonymous functions.

The (formal params) part can be left out; the code can then refer to the parameters using “_”, eg the following are equivalent:

  var f1a = (msg: String) => println(msg)
  var f1b = println(_)

In f1b, the compiler detects that the right-hand-side of an expression contains an underscore, and realizes that this expression is not a simple method-call but instead an anonymous function definition. This allows the function to be defined without the leading (..) => syntax. The compiler can deduce that the type of the input parameter to this function is whatever type function println takes as parameter.

Some more examples:

  var f2a = (x:Int, y:Int) => x + y
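  var f2b: (Int, Int) => Int = (_ + _)  // the parameter types come from the declared type

  var f3a = list map { x => sqrt(x) }
  var f3b = list map { sqrt(_) }
  var f3c = list map sqrt

In f2b, there are two underscores, so the function assigned to f2b will have two parameters. The compiler cannot deduce the types of the arguments from “_ + _” alone, so they must come from the expected type (here the declared type of f2b; in other cases the parameter type of the method to which the function is passed). Without an expected type, compilation fails with a “missing parameter type” error.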

Anonymous functions can “capture” referenced variables in their context (ie the function body refers to a value which is in scope but not a parameter), in which case they are called closures.
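A small sketch of a closure; makeCounter is a made-up helper whose returned function captures the local variable count:

```scala
object ClosureDemo {
  def makeCounter(): () => Int = {
    var count = 0                  // captured by the function returned below
    () => { count += 1; count }
  }

  val counter = makeCounter()
  val first = counter()            // 1
  val second = counter()           // 2: the captured variable survives between calls
  val other = makeCounter()()      // 1: each counter has its own captured variable
}
```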

I personally find the “_” syntax rather ugly - something like “$1” might have been nicer. However using underscores for param-placeholders has a long tradition in functional programming languages.

Declaring Methods Which Take a Function as Parameter

When defining a method which takes a function as a parameter, the syntax looks like:

   // first param is a function with one Int parameter and a String return value
   def high1(f: Int => String): String = ....

   // first param is a function with two Int parameters and a Double return value
   def high2(f: (Int, Int) => Double): String = ....

   // second param is a function with no parameters and a Long return value
   def high3(a: String,  f: () => Long): Unit = ....
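For illustration, the three signatures can be given trivial bodies and invoked with lambdas. The bodies below are invented purely for this sketch:

```scala
object HigherOrderDemo {
  def high1(f: Int => String): String = f(42)
  def high2(f: (Int, Int) => Double): String = "result=" + f(3, 4)
  def high3(a: String, f: () => Long): Unit = println(a + f())

  val r1 = high1(n => "n" + n)               // parameter type Int is inferred from high1
  val r2 = high2((a, b) => a.toDouble / b)
  high3("millis: ", () => System.currentTimeMillis())
}
```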

Partial Function Application aka Currying

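Scala supports partial function application in several ways. The general concept of partial function application is that a function which has N parameters may be converted into a new function which has fewer parameters, with the other parameters now being bound to fixed values. The closely related term currying strictly means converting a function with N parameters into a chain of N functions that each take one parameter; the two terms are often used interchangeably.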

In one approach, method parameters are declared in multiple parameter-groups eg “SomeMethod(f:ftype)(g:gtype)”. Invoking SomeMethod(p1) returns a reference to something that now takes just a gtype parameter. Both calls can be immediately invoked via something like “SomeMethod(p1)(p2)”. As described in the previous section, Scala allows a method with just one parameter to be given a code block outside of the arguments parentheses; defining SomeMethod with two parameter-groups therefore allows it to be invoked like SomeMethod(f1){..} which would not be the case if SomeMethod had just been defined with one parameter-group containing f and g. Writing a code-block in this way is just syntactic sugar - but it can make code look elegant.

When executing code like val lhs = rhs with rhs being the name of a method that can be called without arguments, Scala’s type-inference assumes that lhs has a type which is the return-type of rhs, and thus rhs is invoked. If lhs should instead be a reference to rhs as a function then either lhs must be given a type (eg val lhs: ()=>Unit = rhs) or rhs must be given a wildcard parameter-list (val lhs = rhs _) to make it clear to the compiler what is desired. Unfortunately in this case, Scala’s support for invoking a method without a parameter list, together with its type-inference, makes this additional underscore necessary.

A function with multiple parameters can be partially applied with syntax like val multiplyBy3 = multiply(3, _:Int), which returns a function taking one parameter that delegates to the underlying function multiply.

Alternatively, a function defined with multiple parameters can be explicitly converted to curried-form via syntax “(fname _).curried”, after which it can be “partially applied” as desired.

  def runme(arg0: String)(arg1: String) = {   // curryable method with more than one argument list
    println(s"runme: $arg0, $arg1")
  }

  // arg1 is a normal param, not call-by-name, so the following code-block is evaluated
  // eagerly before runme is invoked.
  runme("hello") {
    "big" + " world"
  }

When a code-block is specified where a parameter is expected, and the parameter has a normal type, then the code-block is evaluated eagerly and the result is passed to the invoked function. When the parameter is marked as “call-by-name” then the code-block is passed as a closure; see later for call-by-name aka lazy parameters.
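The partial-application forms discussed in this section can be sketched together as follows. The methods multiply and scale are hypothetical, and the trailing-underscore syntax shown is the one used throughout these notes (newer Scala versions may not require it):

```scala
object PartialApplicationDemo {
  def multiply(x: Int, y: Int): Int = x * y

  val multiplyBy3 = multiply(3, _: Int)      // placeholder for the unbound parameter
  val curried = (multiply _).curried         // Int => (Int => Int)
  val double = curried(2)                    // first parameter bound to 2

  def scale(factor: Int)(value: Int): Int = factor * value   // two parameter groups
  val triple = scale(3) _                                     // supply only the first group
}
```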

Explicitly using the Function Classes

References to functions are actually represented as references to instances of one of the Function* types from the Scala standard library.

Type Function1[Arg1, Out] represents all functions which take one parameter; the second generic type in the signature is the return-type. Similar types are defined for functions with more parameters.

The following are equivalent:

val succ = (x:Int) => x + 1

val succFunction = new Function1[Int,Int] {
  def apply(x:Int): Int = x + 1
}

Here succFunction is an instance of an inline-defined anonymous subclass of Function1, and that subclass defines the “function logic” in the default “apply” method of that anonymous subclass. It isn’t usual to deal with Function* types directly in Scala sourcecode but they sometimes appear in stacktraces etc.

Methods vs Functions

Methods are logic that must be invoked in the context of an explicit object instance, ie instance.method(args). Functions are logic that has no explicit context, only parameters: function(args). However the two are interchangeable.

A method has an obvious functional equivalent: val somefn = (obj, args) => obj.method(args).

And in the section on currying/partial-function-application above, we have discussed how to take a function with N args and bind the first arg to a fixed value, resulting in a function with N-1 args. So applying that approach to the function above:

val obj = new SomeObject()
val somefn = (o: SomeObject, args: Args) => o.somemethod(args)  // Args stands for the method's parameter type
val somefn2 = somefn(obj, _: Args) // bind only the first arg

// and now method obj.somemethod has been converted to a function...
somefn2(args)

The function somefn2 has some “hidden context” - the object on which the method will be invoked. However this is not visible to the caller, who can treat somefn2 just like any other function with the same argument list.

The same effect can be achieved more directly:

val obj = new SomeObject()
val somefn2 = (args) => obj.somemethod(args)  // returns a closure capturing a reference to obj

or even more directly via:

val obj = new SomeObject()
val somefn3 = obj.somemethod _
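A runnable variant of the three conversions above, using a made-up class Greeter in place of SomeObject:

```scala
object MethodToFunctionDemo {
  class Greeter(greeting: String) {
    def greet(name: String): String = greeting + ", " + name
  }

  val obj = new Greeter("Hello")

  val f1 = (g: Greeter, name: String) => g.greet(name)   // method as a two-argument function
  val f2 = f1(obj, _: String)                            // bind the first argument
  val f3 = (name: String) => obj.greet(name)             // closure capturing obj
  val f4 = obj.greet _                                   // direct conversion of the method
}
```

All of f2, f3 and f4 have type String => String and behave identically from the caller's viewpoint.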

Lazy (call-by-name) Parameters

A method can be defined to take a block of code as a parameter (also known as a “call-by-name argument” or “thunk”):

  def printme(prefix: String, f: => String) = {
    println("in printme")
    println(prefix)
    val fval = f // triggers evaluation of the codeblock associated with f
    println(fval)
  }

Note that f looks somewhat like the examples in the previous section which take a function as a parameter - except that here no input-types are specified for the function f, not even an empty list.

Such a method can be invoked as:

  printme("someprefix", {println("evaluating lazy param"); "value-to-print"})

The code in braces here is not evaluated immediately; instead it forms a closure which is passed to method printme. In effect, param f is lazy.

If method printme had been declared like def printme(prefix:String, f:String) (ie just remove “=>”) then the calling code above would still compile - but the codeblock would be evaluated eagerly.

When the invoked method takes only one parameter, then the parentheses can be dropped (as already described above for normal non-lazy params):

  def printme(f: => String) = {
    println("in printme")
    println(f)
  }

  printme {
    println("evaluating lazy param")
    "value-to-print"
  }

A method taking multiple params can be made usable in the above manner by either rewriting the method to use multiple parameter groups where the last group has a single lazy parameter, or by currying (partially applying) the multi-param method to convert it to a single-param method (ie binding all other params to fixed values).

Do NOT write a lazy param block as {return "value-to-print"}; the return executes in the context of the calling method ie returns from the caller!
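One detail worth seeing in action: a call-by-name block is re-run every time the parameter is used, and is never run if the parameter is never used. A sketch with hypothetical methods ignore and twice:

```scala
object ByNameDemo {
  var evaluations = 0

  def ignore(f: => Int): Int = 0          // never touches f, so the block never runs
  def twice(f: => Int): Int = f + f       // each use of f re-runs the block

  val a = ignore { evaluations += 1; 5 }
  val afterIgnore = evaluations           // still 0

  val b = twice { evaluations += 1; 5 }
  val afterTwice = evaluations            // 2
}
```

This is why printme above copies f into a local val when the value is needed more than once.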

Dynamic Types

A class which “extends Dynamic” can be invoked with any method at all. When the compiler sees that source-code is invoking a method which is not defined at compile-time, and the target type implements Dynamic, then the compiler instead produces code that invokes a suitable generic method on the class; for example reading a simple no-params “getter” such as obj.fieldname triggers a call to obj.selectDynamic("fieldname"), while a call with an argument list such as obj.doit(1) triggers obj.applyDynamic("doit")(1).
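A minimal sketch of a Dynamic class. The class Props and its behaviour are invented for this example; selectDynamic and applyDynamic are the method names the compiler looks for, and defining such a class requires the scala.language.dynamics import:

```scala
object DynamicDemo {
  import scala.language.dynamics

  class Props(values: Map[String, String]) extends Dynamic {
    // called for field-style access such as props.colour
    def selectDynamic(name: String): String = values.getOrElse(name, "")

    // called for method-style access such as props.paint("blue", 2)
    def applyDynamic(name: String)(args: Any*): String =
      name + "(" + args.mkString(",") + ")"
  }

  val props = new Props(Map("colour" -> "red"))
  val colour = props.colour             // rewritten to props.selectDynamic("colour")
  val painted = props.paint("blue", 2)  // rewritten to props.applyDynamic("paint")("blue", 2)
}
```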

Loops

There is a for-loop that is similar to Java, though the syntax is slightly different:

   for(item <- list) { ...  }

which is equivalent to

  list.foreach(...)

Integers are objects, and have a method “to” which returns a Range object, which is iterable. This allows things like:

   for(i <- 0 to 9) ..

which is equivalent to

  (0 to 9).foreach(....)

The for-loop structure can actually be quite complex, supporting multiple nested expressions with embedded ifs, can be followed by a “yield” statement and other options. And for-loops are actually just “syntactical sugar” for calls to methods flatMap/filter/foreach/map. A more complete discussion of for-loops can be found in the section “for comprehensions” later, after the discussion on flatMap/filter/etc.

While and do-while loops are similar to Java.

Standard Library

Collections, Sequences and Lists

Scala provides a collections library which has a selection of both immutable and mutable types. The immutable ones should be used where possible. Java’s standard library is also available for communicating with Java code.

This library has helper methods for converting to and from Java’s collection types (for interacting with standard Java libraries). Collection types support typical functional operations such as foreach, map, flatMap:

   val mylist = List("a","b","c") // use factory method on List companion object to instantiate an immutable list
   mylist.foreach(value => println(value)) // function aka lambda
   mylist.foreach(println) // method reference (identical behaviour to previous line)

The standard List type is immutable. Like Lisp lists, prepending to such a list is efficient, creating a new list that has new elements at the start then points to the original list as its tail. Syntax: “val foo = newelement :: origlist”, which invokes method “origlist.::”.
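A tiny sketch showing that the prepended list really does share the original list as its tail (eq compares object references):

```scala
object ListPrependDemo {
  val orig = List(2, 3)
  val longer = 1 :: orig               // same as orig.::(1)
  val sharesTail = longer.tail eq orig // the new list points at the original, no copy made
}
```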

The Scala List type is a “strict” collection. Its transform methods (eg map, filter) immediately process the input list and produce another list. The Scala View and Stream collections are instead “on-demand” (pipeline or lazy) collections. Their transform methods (map,filter,etc) return closures which pull data from the original collection only as-needed (similar to Haskell). Method List.view produces a view over a list.
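The difference between strict and on-demand processing can be observed by counting how often the mapping function runs. This is a sketch with a made-up counter variable calls:

```scala
object ViewDemo {
  var calls = 0

  // building the pipeline on a view does not run the mapping function
  val pipeline = List(1, 2, 3, 4).view.map { x => calls += 1; x * 10 }
  val callsBeforeUse = calls

  // only the elements actually consumed are transformed
  val firstTwo = pipeline.take(2).toList
}
```

The same map call applied directly to the List would run the function four times immediately.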

A stream is an algorithmically-generated sequence where elements are only produced as needed. However once produced they are retained by the stream (not discarded) - so using a stream to generate 10,000 elements will result in an object which uses lots of memory.

The various standard collection types generally provide a public constructor. Usually they also have a companion-object which provides factory methods. Most also provide a builder API, eg Set.newBuilder[String], which allows objects to be added one at a time; the builder’s result method then returns the final immutable datastructure.
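A short sketch of the builder API for an immutable Set:

```scala
object BuilderDemo {
  val builder = Set.newBuilder[String]
  builder += "a"
  builder += "b"
  builder += "a"                               // duplicates are dropped, as for any Set

  val result: Set[String] = builder.result()   // the finished immutable collection
}
```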

The standard mutable datastructures provided by Scala, eg ArrayBuffer, also have methods to return their contents as an immutable collection. However if the point is simply to create an immutable collection then the builder APIs are considered preferable to creating a temporary mutable collection - and are more typesafe.

Arrays

Arrays use () for access, because square-brackets are used for generic-types:

  val foo: Array[String] = Array("hello", "world")
  val hello = foo(0)
  foo(0) = "hi"
  println(foo.mkString(","))

An Array is an unusual type which is immutable in size but mutable in content (like Java arrays). The Vector type is immutable in both length and content while still providing effectively constant-time (O(1)) access to any element like an array does.

The Option Type

Standard library type Option can hold either a value (Some) or nothing (None) - ie represents an “optional value”. It is the idiomatic replacement for a reference that might be null.

It provides not just a wrapper but also a way of saying “skip this value” when applying a function. Chains of calls like summarize(compute(getFoo())) can be tricky in Java when getFoo might return a null, or when compute might return null for a valid input. However Option(getFoo()).map(compute).map(summarize).getOrElse(default) works fine regardless of whether getFoo returned null; the other methods do not need to be option-aware.

Using Option like this is actually related to the functional concept of Monads - though understanding monads is not needed just to take advantage of option’s ability to skip transformations when the wrapped value is null.

Tip: Applying method flatMap to a list of Option objects returns a list of all the defined values, and discards all the None values (each Option is converted to a list of one value or an empty list, and these lists are then flattened).
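The chain described above, plus the flatMap tip, as a runnable sketch. The functions getFoo, compute and summarize are stand-ins invented here; getFoo deliberately returns null for unknown keys, as a Java API might:

```scala
object OptionDemo {
  def getFoo(key: String): String = if (key == "known") " 42 " else null
  def compute(s: String): Int = s.trim.toInt
  def summarize(n: Int): String = "value is " + n

  // Option(x) yields None when x is null, so the map steps are simply skipped
  def describe(key: String): String =
    Option(getFoo(key)).map(compute).map(summarize).getOrElse("no value")

  // flatMap over a list of Options keeps only the defined values
  val defined = List(Some(1), None, Some(3)).flatMap(o => o)
}
```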

There is an excellent article on Scala’s Option type here.

Exceptions

Java Style Exceptions

Scala has no checked exceptions (all are runtime exceptions). Otherwise, similar to Java.

When calling Java libraries that throw checked exceptions, no conversion takes place: the exception propagates unchanged, and the Scala compiler simply does not require it to be caught or declared.

A try/catch block in Scala is almost identical to Java. The only difference is in the catch clause, which uses case-clauses as in a match-expression:

   try {
     ...
   } catch {
      case e: SomeException => ...
   }

Because Scala does not check exceptions, there is no equivalent of Java’s “throws” clause on method declarations. This can cause problems if writing Scala code that is intended to be called from Java; the Java compiler will not complain about uncaught exceptions propagating from a Scala method. For Java compatibility, Scala methods can be annotated with @throws annotations; a Java compiler will then treat that method just like an equivalent Java method with a “throws” clause.
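A sketch of a catch block with several case-clauses, together with the @throws annotation. The methods risky and safe are hypothetical:

```scala
object TryCatchDemo {
  // the annotation only matters to Java callers; Scala callers are not forced to catch
  @throws(classOf[java.io.IOException])
  def risky(fail: Boolean): String =
    if (fail) throw new java.io.IOException("boom") else "ok"

  def safe(fail: Boolean): String =
    try {
      risky(fail)
    } catch {
      case e: java.io.IOException => "io error: " + e.getMessage   // most specific first
      case e: Exception           => "other error"
    }
}
```

Note that try/catch is an expression in Scala, so its value can be returned directly.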

The Try Type

While Scala supports a try/catch syntax for exceptions very similar to Java’s, it is generally considered better style to write methods that return objects representing value-or-error, eg Option, Try or Either, rather than throw exceptions. When invoking a method that does throw exceptions, the standard Try type can be used to handle that method as if it returned (value-or-exception).

Note the difference between try (built-in mechanism, similar to Java) and Try (a standard library type with an apply method which accepts a closure, executes it and maps any exception to a returned value)!

val result = Try {...}
if (result.isSuccess) {...} else {...}

Rather than simply using isSuccess, a number of possibilities exist, including:

  • using match with case-clauses for subtypes Success or Failure,
  • using getOrElse
  • using method map.

As described for the Option type above, invoking monadic methods like map on the result of Try will cause them to be run if the method returned success, and to be skipped if the method threw an exception.
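The three ways of handling a Try listed above, as a sketch. The helper parse is made up; Try, Success and Failure come from scala.util:

```scala
import scala.util.{Failure, Success, Try}

object TryDemo {
  def parse(s: String): Try[Int] = Try(s.trim.toInt)   // toInt throws on bad input

  def describe(s: String): String = parse(s) match {
    case Success(n) => "number " + n
    case Failure(e) => "not a number: " + e.getClass.getSimpleName
  }

  val doubled = parse("21").map(_ * 2).getOrElse(0)    // map runs on success
  val fallback = parse("x").map(_ * 2).getOrElse(0)    // map is skipped on failure
}
```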

There is an excellent article on using Try here

Try-with-resources

While discussing exception-handling and the try operator, there is a related feature in Java: try-with-resources. Scala does not have this syntax - it must be either implemented manually with a try/finally block or a helper method must be used. Fortunately, Scala’s first-class functions make this easy. See the discussions here for more information.

For the common case of handling just one resource, this solution appealed to me best:

  def autoClose[A <: AutoCloseable, B](resource: A)(code: A => B): Try[B] = {
    val tryResult = Try {code(resource)}
    resource.close()
    tryResult
  }
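A usage sketch for the helper. The autoClose definition is repeated so that the example is self-contained, and Probe is a made-up resource that merely records whether close() was called:

```scala
import scala.util.Try

object AutoCloseDemo {
  def autoClose[A <: AutoCloseable, B](resource: A)(code: A => B): Try[B] = {
    val tryResult = Try { code(resource) }
    resource.close()
    tryResult
  }

  class Probe extends AutoCloseable {
    var closed = false
    def read(): String = "data"
    def close(): Unit = { closed = true }
  }

  val okProbe = new Probe
  val good = autoClose(okProbe)(p => p.read().length)   // Success(4)

  val badProbe = new Probe
  val failed = autoClose(badProbe)(p => p.read().toInt) // toInt throws, captured as a Failure
}
```

In both cases the resource is closed before the Try is returned. An exception thrown by close() itself is not captured by this helper.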

List Efficiency

In order to support head/tail operations efficiently on lists, they really do need to be singly-linked-lists. It would be possible to have an array-type implementation plus offset; however freeing no-longer-needed head objects is then not possible.

At first glance, such linked lists look inefficient compared to things like Java’s ArrayList - it seems that they would generate lots of heap allocations and deallocations and make a lot of work for the garbage collector.

Fortunately, each list node is a small object of fixed size (two pointers, being “item” and “nextnode”), and the JVM is very good at allocating and reclaiming small short-lived objects: allocation is usually little more than a pointer bump, and a generational garbage collector reclaims dead nodes in bulk. The Scala library does not manage any pool of nodes itself; it simply relies on the JVM, and the memory-management overhead is really not too bad - it is certainly not a generic malloc/free for each discarded head node.

Note that although lists are immutable, the elements in the list may be mutable objects.

Special Methods

There are several special method-names in Scala: apply, unapply, unapplySeq and update.

Methods unapply and unapplySeq are used with pattern-matching; see the section on pattern-matching for more information.

Apply-method Overview

If a class defines a method named “apply”, then that method can be called without writing “.apply”. This can be considered “the default method” for the class:

   class User .. {
     def apply(prefix: String) {
     }
   }

   val user = new User()
   user.apply("hello, ")  // normal method invocation
   user("hello, ")  // same as above - but looks like a "function call" not a method-call.

There can be multiple apply methods with different parameter-lists.

Apply-methods are used in several ways; the two most common are:

  • defining an apply-method on a class to allow it to act as a “keyed collection” (discussed below)
  • defining an apply-method on a companion-object to provide a factory for the corresponding class (already discussed above)

Map-like Access with Apply and Update

The apply method can be used to provide elegant read-access for a class acting as a “keyed collection”. Similarly, magic method update can be used to also provide write-access:

  val i = a(somekey);   // read: compiler converts this to "i = a.apply(somekey)"
  a(somekey) = 1;       // write: compiler converts this to "a.update(somekey, 1)"

The standard Array class itself uses apply and update:

   val items = Array("first", "second")  // factory method: actually calls Array.apply on the companion object
   val item = items(1)  // actually calls items.apply(1)

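The standard mutable Map class also uses apply and update to provide access to the values it holds. The immutable Map provides apply for reading; for “writing” it has method updated, which returns a new map.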
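Both uses of apply, together with update, in one sketch. The class Scores is invented for this example:

```scala
object ApplyUpdateDemo {
  class Scores {
    private val data = scala.collection.mutable.Map[String, Int]()
    def apply(key: String): Int = data.getOrElse(key, 0)           // read access
    def update(key: String, value: Int): Unit = { data(key) = value }  // write access
  }

  object Scores {
    // factory on the companion object: Scores("a" -> 1) instead of new Scores
    def apply(initial: (String, Int)*): Scores = {
      val s = new Scores
      for ((k, v) <- initial) s(k) = v
      s
    }
  }

  val scores = Scores("alice" -> 3)
  scores("bob") = 5                 // rewritten to scores.update("bob", 5)
  val alice = scores("alice")       // rewritten to scores.apply("alice")
  val bob = scores("bob")
  val nobody = scores("carol")
}
```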

Standard Functions map, filter, withFilter, flatMap

The methods map, filter, and flatMap are common operations in functional languages, and Scala both supports them and uses them widely.

An iterator is an object or function which provides a series of objects one after another. A sequence is an object which can be iterated over (ie can provide an iterator which returns its contents). Collections such as lists are sequences, and thus are iterable. The Scala iterator type provides methods map, filter, withFilter and flatMap.

Iterator method map takes every element provided by the iterator and applies a caller-specified method to transform that element into another object. The result is a collection containing 1 element for every element provided by the iterator - never more or less (unless the mapping function throws an exception).

Iterator method filter takes every element provided by the iterator, and applies a caller-specified boolean method to decide whether to “accept” the object or not. The result is a collection containing a subset of the elements provided by the iterator.

Note that map cannot filter out elements - everything is mapped 1:1. And filter cannot modify elements - they are either accepted or not. In fact, both of these methods are based on a more general one - flatMap. Method flatMap works something like map but instead of being limited to 1:1 conversions it supports 1:N. The function that flatMap applies to each element takes one element as input and returns a list:

  • when the list is empty then the effect is like filter removing an object
  • when the list contains exactly one element then the effect is like map
  • and the list can also contain multiple elements if desired (something not possible with map or filter)

When invoked on standard collection types such as List, these methods are immediate (aka eager); they read all elements from the input iterator and write the results into memory. This is different from lazy-based languages such as Haskell, in which map/filter/flatMap are lazy and only take effect when data is needed.

Method withFilter is an always lazy version of filter; when invoked on an iterator it just returns another iterator. When an object is fetched from this second iterator, it repeatedly fetches elements from the underlying iterator until one is found which matches the filter condition. The effect from the caller viewpoint is identical: whether filter or withFilter is invoked, the set of objects returned from the iterator is just those for which the filter condition matches. However the lazy version can be more efficient - particularly if the entire sequence is not being consumed (eg when doing a find type operation that terminates when the first match is encountered).

Lazy versions of map, flatMap, etc are available via streams and views (eg List.view.map(...))

Two other related methods are collect and partition:

  • partition works like filter except that it returns two lists - the “accepted” objects and the “rejected” objects. Method filter can be seen as a special case of partition where the rejected list is thrown away.
  • collect works like filter followed by map, with both steps defined via a single partial function (eg case-clauses) rather than a boolean expression. Elements for which the partial function is not defined are dropped; the others are replaced by the function's result. Which is more elegant is a matter of taste.
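As a rough sketch of both methods (names invented for the example). Note that the case-clauses passed to collect also produce the output values, so collect can transform elements as well as select them.

```scala
object PartitionCollectDemo {
  val numbers = List(1, 2, 3, 4, 5, 6)

  // partition returns both halves as a pair: (accepted, rejected).
  val (evens, odds) = numbers.partition(_ % 2 == 0)

  // collect keeps only elements matched by a case clause, and maps them.
  val labels = numbers.collect {
    case 1 => "one"
    case n if n > 4 => s"big $n"
  }
}
```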

Standard Functions fold and reduce

The flatMap-related functions always treat each element in the input individually, independently of the others.

The fold family of functions are instead useful when the desired result requires interaction between the elements provided by the input iterator, eg summing a list of integers or finding the longest string in a list of strings. Method reduce is a special case of fold.
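The two examples mentioned above might look something like this; a sketch only, with invented names.

```scala
object FoldDemo {
  val numbers = List(3, 1, 4, 1, 5)
  val words = List("pear", "fig", "banana")

  // foldLeft takes a start value and combines it with each element in turn.
  val sum = numbers.foldLeft(0)((acc, n) => acc + n)

  // reduce uses the first element as the start value, so it fails on an empty list.
  val longest = words.reduce((a, b) => if (a.length >= b.length) a else b)

  // The same result expressed via foldLeft with an explicit start value.
  val longestViaFold = words.foldLeft("")((a, b) => if (a.length >= b.length) a else b)
}
```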

For-comprehensions

As noted earlier, for-loops can include multiple expressions (nested looping), embedded if-statements, and can optionally be followed by the yield keyword.

A for-loop without yield is equivalent to nested calls to foreach (one per expression), with withFilter applied for the if-statements. A for-loop with yield is equivalent to a sequence of calls to flatMap (every expression except the last), withFilter (the if-statements) and a final map (the last expression).

Although for-loops and flatMap/filter/foreach/map are equivalent, complicated for-loops with multiple nested expressions and ifs are often easier to read than their function-based equivalents. However which to use is purely a matter of taste.

A simple loop without yield is effectively a “foreach” call; the following are identical:

  for(i <- 0 to 5; if (i % 2 == 0)) {println(s"i is $i")}
  (0 to 5).withFilter(i => i % 2 == 0).foreach {i => println(s"i is $i")}

A simple loop with yield is effectively a “map” call; the following are identical:

  for(i <- 0 to 5) yield i*2
  (0 to 5).map(i => i*2)
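With more than one expression the translation involves nesting. The following sketch (names invented) shows a loop with two expressions and an if-statement, next to what I understand to be the hand-written equivalent:

```scala
object ForDemo {
  // for-loop with two expressions, an if-statement and yield
  val viaFor = for {
    i <- 1 to 3
    j <- 1 to 3
    if i < j
  } yield (i, j)

  // hand-written equivalent: flatMap for the outer expression,
  // withFilter for the if-statement, map for the innermost expression
  val viaCalls = (1 to 3).flatMap { i =>
    (1 to 3).withFilter(j => i < j).map(j => (i, j))
  }
}
```

Both values contain the pairs (1,2), (1,3) and (2,3).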

A for-loop can also be specified with curly-braces instead of parentheses; the syntax is very similar but the rules about using semicolons when specifying multiple expressions and if-statements are slightly different.

This stackoverflow answer has an excellent discussion on for-loops.

Annotations

Scala has annotations which resemble Java annotations. Some standard annotations include:

  • @transient, @volatile
  • @scala.beans.BeanProperty - can be added to any member of a class, and triggers generation of a getX method; mutable members also get a setX method.
  • @tailrec - requires the compiler to report an error if the annotated method cannot be compiled with “tail recursion optimisation”

Custom annotations are simply defined as classes which extend scala.annotation.Annotation or one of its subtypes (eg scala.annotation.StaticAnnotation).

Annotation types start with a lower-case letter by convention (unlike Java).

Tail Recursion

Functional programming style uses recursive code definitions more often than traditional OO programming styles do. When a recursive function is implemented correctly, and the compiler implements “tail recursion optimisation” then recursive calls can compile to code just as efficient as an imperative loop. The Scala compiler does implement tail recursion optimisation.

Unfortunately, it may be impossible to apply “tail recursion optimisation” to a slightly different recursive function; the code is then much less efficient and potentially leads to stack-overflow errors. To catch such errors, a recursive function may be annotated with @tailrec; the compiler will then report an error if tail recursion optimisation is not possible.
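A minimal sketch of the difference, using a made-up example that sums the numbers from 1 to n:

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // Not tail recursive: the addition happens after the recursive call returns,
  // so each call needs its own stack frame. Adding @tailrec here would be a compile error.
  def sumNaive(n: Long): Long =
    if (n == 0) 0 else n + sumNaive(n - 1)

  // Tail recursive: the recursive call is the last action, with the running
  // total carried along in an accumulator parameter.
  @tailrec
  def sumAcc(n: Long, acc: Long = 0): Long =
    if (n == 0) acc else sumAcc(n - 1, acc + n)
}
```

The accumulator version can be called with very large values of n, whereas the naive version is likely to overflow the stack long before that.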

Algebraic Data Types

An ADT is either a “product type” or “sum type”.

Product types are familiar to OO developers - basically tuples and standard classes.

A sum type is a finite set of classes implementing the same interface. If an instance of some “interface” type is known to be implemented by concrete type A or B or C, then together they form a sum type. Or in other words, the type is a union of a finite set of concrete implementations. In Scala this is usually written as a sealed trait with case-class implementations.

In OO terminology, different types can be told apart at runtime via “is instance”. In non-OO functional languages, specific instances of a sum type are described as having a “type tag” - the name of the concrete variant of the base interface.
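In Scala, the usual way to write this down is a sealed trait plus case classes. The shape example below is just an illustrative sketch; none of the names come from a real library.

```scala
object AdtDemo {
  // Sum type: a Shape is a Circle or a Rect, and nothing else.
  // "sealed" restricts implementations to this source file.
  sealed trait Shape
  // Each variant is itself a product type (a fixed group of fields).
  case class Circle(radius: Double) extends Shape
  case class Rect(width: Double, height: Double) extends Shape

  // Pattern matching dispatches on the concrete variant (the "type tag").
  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
}
```

Because the trait is sealed, the compiler can warn when a match does not cover every variant.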

Nested (Inner) Classes and Path-dependent Types

In Java, a class can be declared within another class. When the nested (aka inner) class is declared “static” then this nesting simply affects the name of the nested class (namespacing). When the nested class is not declared “static” then it includes an implicit member which is a reference to a “parent instance”; an instance of the nested type can only be created via a parent instance.

Scala allows classes to be nested within object declarations. These act like Java’s “static inner classes” - they can be instantiated like “new OuterType.InnerType”.

Scala also allows classes to be declared within classes. These can be instantiated either:

  • from outside the enclosing class via syntax “new someref.InnerType”
  • from within a method of the enclosing class via the normal syntax “new InnerType”.

In both cases, this returns an object with an internal reference to the outer instance whose type is OuterType#InnerType. Classes nested within classes therefore act like Java non-static inner classes.

When a method of class OuterType returns (or exposes in any way) an instance of InnerType to a caller, and the caller invoked that method via a val (constant) reference then the compiler creates a kind of “type alias” for that returned object, of form refname.InnerType.

Two references to the nested type created via two different constant references are not type-compatible at compile-time. When creating a generic collection (eg a list), adding the first element to that collection implicitly sets the type of that collection, and thereafter instances of the same nested class which are associated with different parents cannot be added to the list - a compile-time failure occurs.

These alias types are called “path-dependent types”. A reference of this type can always be “upcast” to the actual underlying type (OuterType#InnerType).

When the method returning an inner type is invoked via a var (mutable) reference then the return type is not refname.InnerType but the underlying type OuterType#InnerType, as the refname is not a stable value.

Nested types generated through two different references ref1 and ref2 are different (ref1.InnerType and ref2.InnerType) even if ref1 and ref2 point to the same object.

The most useful feature of dependent types is that when using a val reference to invoke a method on OuterType which takes an instance of InnerType as a parameter, the input parameter type is treated as refname.InnerType and therefore the compiler will only accept instances of InnerType which were obtained via the same val reference. Each instance of OuterType thus acts as a “closed system” that only accepts objects from its caller which are guaranteed to have been provided to the caller from the same OuterType instance (and in fact via the same constant reference to that instance).

Path-dependent types can sometimes be useful in preventing bugs (making incorrect combinations of objects impossible at compile time), but are not widely used.
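Here is a small sketch of the “closed system” idea. The Bank and Account classes are hypothetical, invented purely for this illustration.

```scala
object PathDemo {
  class Bank {
    class Account(var balance: Int)
    def open(initial: Int): Account = new Account(initial)
    // The parameter type is this.Account: only accounts of this Bank instance.
    def deposit(acct: Account, amount: Int): Unit = acct.balance += amount
  }

  val bankA = new Bank
  val bankB = new Bank

  val acctA: bankA.Account = bankA.open(10)   // path-dependent type
  val acctB: bankB.Account = bankB.open(20)

  bankA.deposit(acctA, 5)      // accepted: same path
  // bankA.deposit(acctB, 5)   // rejected by the compiler: type mismatch

  // Both can be upcast to the general type Bank#Account.
  val all: List[Bank#Account] = List(acctA, acctB)
}
```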

See the Scala Tour for some examples.

Multithreading

Standard class scala.concurrent.Future can be used to spawn threads almost explicitly:

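    import java.util.concurrent.TimeUnit
    import scala.concurrent.duration.Duration
    import scala.concurrent.ExecutionContext.Implicits.global
    val future = scala.concurrent.Future { Thread.sleep(5000); "hi"}
    // block the calling thread for at most 10 seconds while waiting for the result
    println(scala.concurrent.Await.result(future, Duration(10, TimeUnit.SECONDS)))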

However in general, multithreaded applications should instead be built using the Actor pattern. Since v2.10, the Akka library is part of the Scala standard distribution, and provides an Actor-based library which also supports distributed processing.

Further information on futures can be found here.

The standard library also includes “parallel collections” (since Scala 2.13 these live in the separate scala-parallel-collections module). Most standard collection types provide a “par” method returning a parallel version of the collection; its map/flatMap methods apply an anonymous function to the elements in parallel, using a global threadpool managed by the library.

When using multi-threading, synchronized access to shared data is of course needed. The equivalent of a Java synchronized method synchronized void foo() is def foo(..) = synchronized {..}.
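A rough sketch of a synchronized method in use. The Counter class and the thread counts are invented for the example, and the lambda-to-Runnable conversion assumes Scala 2.12 or later.

```scala
object SyncDemo {
  class Counter {
    private var count = 0
    // Equivalent of a Java "synchronized" method: locks on this instance.
    def increment(): Unit = synchronized { count += 1 }
    def current: Int = synchronized { count }
  }

  // Four threads each increment the shared counter 1000 times.
  def run(): Int = {
    val counter = new Counter
    val threads = (1 to 4).map { _ =>
      new Thread(() => for (_ <- 1 to 1000) counter.increment())
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    counter.current
  }
}
```

Without the synchronized blocks, some increments could be lost and run() might return less than 4000.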

Lazy Variables

A val may be marked as lazy, in which case the initialization-expression is only evaluated when the value is first read (the result is then cached), eg

  lazy val foo = scala.util.Random.nextInt()
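The following sketch (names invented) uses a counter to show that the initialization-expression runs on first access, and only once:

```scala
object LazyValDemo {
  var initCount = 0

  // The right-hand side runs on first access, and only once.
  lazy val expensive: Int = { initCount += 1; 42 }

  val before = initCount   // 0: not yet evaluated
  val first = expensive    // triggers evaluation
  val second = expensive   // reuses the cached value
  val after = initCount    // 1
}
```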

Partial Functions

A partial function is a function which is only defined for some inputs. Think of:

def partial1(i: Int): String = {
  if (i==0) "zero"
  else if (i==1) "one"
  else if (i==2) "two"
  else throw new MatchError(i)  // no result is defined for any other input
}

Clearly there are many parameter values for which no return value is defined.

When a “partial” expression is executed with a value which is not handled, what happens depends upon the context. It might trigger an exception, or the return value might be defaulted to None (ie an Option wrapping), or might be defaulted to false.

Partial functions are often written in the form of a sequence of case-statements (without the surrounding match):

  somelist.collect {
    case 1 | 2 => true
  }
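The different outcomes for an unhandled value can be sketched with the standard PartialFunction type. The function and value names below are invented for the example.

```scala
object PartialDemo {
  val name: PartialFunction[Int, String] = {
    case 0 => "zero"
    case 1 => "one"
    case 2 => "two"
  }

  val defined = name.isDefinedAt(1)      // true
  val undefined = name.isDefinedAt(7)    // false

  // lift turns "not defined" into None instead of an exception.
  val lifted = name.lift(7)              // None

  // collect silently skips elements for which the function is not defined.
  val names = List(0, 5, 2).collect(name)   // List("zero", "two")

  // Calling the function directly on an unsupported input throws MatchError.
  val failed = scala.util.Try(name(7)).isFailure   // true
}
```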

Manifests

When a Java class that uses another generic class is compiled, the type-parameter used with that other generic class is lost; this is called “type erasure”. As an example:

  import java.util.ArrayList;
  import java.util.List;

  class Foo {
    private List<String> mystrings = new ArrayList<>();
  }

is effectively compiled as

  class Foo {
    private List mystrings = new ArrayList();
  }

Inspecting object mystrings at runtime will provide no information about the type of object within the list.

This is actually a limitation of the JVM, and therefore Scala code has the same limitation. However Scala works around the issue by using parameters of type Manifest[T]. For any generic type, a manifest can be generated if requested. The usual way to request a manifest is for a generic function (or method on a generic type) to declare an implicit parameter of type Manifest; at each call-site the compiler will then ensure that a Manifest object is passed as part of the function call, describing the real type of the parameter (as known to the compiler at that callsite). Example:

  def foo[A](someParam:A)(implicit mf: Manifest[A]) = {..}

At each location where foo is called, the compiler generates code to pass some external value as someParam - and also generates code to pass a descriptor of the type of someParam as a second parameter. Within method foo, parameter mf can then be used to obtain information about what type someParam actually has - reflection will not provide that information because someParam may have undergone type-erasure (eg when someParam has type List[String]).
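The same implicit-parameter mechanism can be sketched with scala.reflect.ClassTag. Note this is a swapped-in relative of Manifest (introduced in Scala 2.10) rather than Manifest itself; it records only the erased class, so it would report List rather than List[String]. The method names are invented for the example.

```scala
import scala.reflect.ClassTag

object TagDemo {
  // The compiler supplies the implicit ClassTag at each call site.
  def describe[A](value: A)(implicit tag: ClassTag[A]): String =
    tag.runtimeClass.getSimpleName

  // Without the tag, a generic method could not create an Array[A].
  def singleton[A](value: A)(implicit tag: ClassTag[A]): Array[A] = {
    val arr = new Array[A](1)  // needs the ClassTag to pick the element class
    arr(0) = value
    arr
  }
}
```

A call such as TagDemo.describe("hi") needs no explicit second argument; the compiler fills in the tag for String.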

Other Minor Features

Operator && is a short-circuit boolean operator, as usual in other languages. When applied to Boolean operands, operator “&” is a non-short-circuit boolean operator, ie the same as “&&” but always evaluates both left and right sides. On integer operands, & is the usual bit-and operator (as in Java).
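The difference on Boolean operands can be seen with a method that has a side-effect; a sketch with invented names:

```scala
object AndDemo {
  var calls = 0
  def touch(): Boolean = { calls += 1; true }

  val shortCircuit = false && touch()   // right side skipped
  val callsAfterShort = calls           // 0

  val nonShort = false & touch()        // right side still evaluated
  val callsAfterNonShort = calls        // 1
}
```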

Scala does not have any direct equivalent to Java enums; there are various options with different tradeoffs.

Scala supports “atoms” (instances of scala.Symbol) using the form 'someatom. An atom is a unique identifier and so can sometimes be used where an Enum would be used in Java (ie instead of using unique integer constants). Note that this literal syntax is deprecated as of Scala 2.13.

The members of object scala.Predef are imported into scope by default (like java.lang in Java).

Due to the lack of semi-colons, line-wrapping in Scala is slightly more complicated than Java. When a single expression is split over multiple lines, each line must either have an unclosed parenthetical or end with an infix method in which the right parameter is not given (eg +).
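Both wrapping styles, as a small sketch (names invented):

```scala
object WrapDemo {
  // Each line ends with an infix operator, so the expression continues.
  val total = 1 +
    2 +
    3

  // Inside parentheses, a newline does not end the expression.
  val inParens = (1
    + 2
    + 3)
}
```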

The recommended code-style for Scala can be found here.

Ecosystem

Here is a short list of some frameworks which are commonly used with Scala applications:

  • Play, Lift, Scalatra, Akka
  • Scalaz (hard-core functional programming)
  • sbt - Scala’s build tool. But Scala projects can also be built with Maven etc.

IDEs

Some notes on using the IntelliJ IDE for Scala development:

  • The IntelliJ Scala plugin is available from the standard JetBrains repo - just open “settings” and search for / install the plugin.
  • When creating the first Scala project, you will be prompted for a “scala SDK” - select “create” then “download”. Or install Scala manually locally, then select that install location.
  • For IntelliJ on Mac, the “scala console” evaluates expressions only after cmd-enter has been pressed.

References