01: Scala Functional Programming basics – pure functions, referential transparency & side effects

Q1. What is a pure function?
A1. A pure function is a function where the following conditions are met:

1) The Input solely determines the output.

2) The function does not change its input.

3) The function does nothing other than compute the output. This means NO I/O: no reading from a database, file or input console, and no writing to a database, file or output console. It also means no modification of a global variable that lives outside the function.

A pure function takes the input, computes the output and returns it. If a function does anything beyond computing the output value from the input value, such as writing the output to a console, reading from a file or database, or changing a global variable, then these extra actions are known as side effects. A pure function is a function that is free from these side effects.

Typical signs of a side effect are:

1) A method returns Unit (e.g. println). Such a method can only be useful through its side effects.

2) A method changes the application state.

3) A method talks to the outside world: files, databases, networks, web services, etc.

4) and more

Q2. Is below function addOne(….) a pure function?
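The listing for this question is not reproduced here. A minimal sketch of the kind of function the answer describes, with the parameter name and the wrapper object as assumptions:

```scala
object PureAddOne {
  // The result depends only on the argument; nothing outside the function is read or changed.
  def addOne(input: Int): Int = input + 1
}
```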

A2. It is a pure function as it takes the input and computes the output by adding 1 to the input. For the same input, you will always get the same output. It is free from side effects as it does not modify any global variables or perform any I/O operations.

Q3. Is below function addOne(..) a pure function?
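The listing for this question is not reproduced here. A sketch that is consistent with the numbers in the answer (6, then 11, then 16), assuming "g" starts at 1 and the function adds its input to "g":

```scala
object ImpureAddOne {
  // Mutable state that lives outside the function.
  var g: Int = 1

  // Adds the input to g and returns the updated g,
  // so the result depends on every earlier call.
  def addOne(input: Int): Int = {
    g = g + input
    g
  }
}
```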

A3. No, as it mutates the global variable “g”. If you run addOne(5) the first time, you get 6, but if you run it again with the same input of 5, you get 11 because the value of “g” has changed from 1 to 6 after the first run. So, the above function has a side effect, hence it is NOT a pure function.

Q4. Is there an easy method to validate the purity of a function?
A4. Yes, by testing for referential transparency.

Q. What is referential transparency?
A. An expression is said to be referentially transparent if we can replace it with its value without changing the program’s behaviour. Calls to pure functions have this property because the same input always gives the same output.

The example in Q2 is referentially transparent as we can replace addOne(5) with 6.

The example in Q3 is NOT referentially transparent as the value of addOne(5) keeps changing: run 1 gives 6, run 2 gives 11, run 3 gives 16 and so on. The behaviour keeps changing.

Q5. Can you give some real world examples where there is no referential transparency?
A5. Here are a few examples where there is NO referential transparency:

1) Code that reads from or writes to databases, files or consoles.

2) Functions which depend on time like getDayOfWeek(), getHour(), System.currentTimeMillis etc.

3) Random number generation.

So, can you write real world applications without reading/writing to a database or file? The answer is NO. As a programmer you must strive to write code that follows referential transparency wherever possible.

Q6. Why is the below addOne(…) function with the println considered impure even though it returns the same output for a given input?

A6. As per the initial conditions, a pure function must not perform any I/O operation like reading from or writing to a database, file or console. println is an I/O operation on the console.
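The listing for this question is not reproduced here. A minimal sketch of such a function, with the printed message as an assumption:

```scala
object PrintingAddOne {
  def addOne(input: Int): Int = {
    val result = input + 1
    println(s"result = $result") // console output: a side effect
    result                       // the return value is still the same for the same input
  }
}
```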

Q7. What are the benefits of pure functions?

1) Easier to reason about, debug, test & combine, as a pure function has no side effects or hidden I/O, so you can get an idea of what it does just by looking at its signature.

Example 1:

Many methods on the Scala collections classes are pure functions like drop, filter, map, foldLeft, etc. The foreach method on collections classes is impure because it’s only used for its side effects, such as printing to STDOUT.

Example 2:

Using andThen & compose functions.
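The listing for this example is not reproduced here. A rough sketch of the idea, where the method names addHum and addAhum come from the note that follows but their bodies are assumptions made for illustration:

```scala
object ComposeExample {
  // Hypothetical bodies: each method appends a marker to its input.
  def addHum(s: String): String = s + " hum"
  def addAhum(s: String): String = s + " ahum"

  // Eta expansion turns each method into a String => String function value.
  // andThen: apply addHum first, then addAhum.
  val humThenAhum: String => String = (addHum _).andThen(addAhum _)

  // compose: apply addAhum first, then addHum.
  val humAfterAhum: String => String = (addHum _).compose(addAhum _)
}
```

With these bodies, ComposeExample.humThenAhum("x") gives "x hum ahum" and ComposeExample.humAfterAhum("x") gives "x ahum hum".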

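Note: addAhum _ is an eta expansion. It converts a method into a function value (i.e. Function1, Function2, etc). This is different from addHum(_), which is placeholder syntax for an anonymous function, meaning x => addHum(x).

So, it is incorrect to write addHum(_).andThen(addAhum(_)) because the placeholder expands over the whole expression, which evaluates to x => addHum(x).andThen(y => addAhum(y)). The andThen call is then applied to the result of addHum(x), not to the function itself.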

The methods andThen() and compose() are defined on the Function1 trait (Function2 and the higher arities do not provide them), as shown below:

Function1 Scala trait
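The image of the trait is not reproduced here. As a simplified stand-in (not the library source), the two methods can be sketched on a hypothetical trait Fn that mirrors the shape of Function1:

```scala
object Function1Sketch {
  // Simplified stand-in for scala.Function1; the name Fn is an assumption.
  trait Fn[-T, +R] {
    def apply(v: T): R

    // compose: run g first, then this function.
    def compose[A](g: A => T): A => R = x => apply(g(x))

    // andThen: run this function first, then g.
    def andThen[A](g: R => A): T => A = x => g(apply(x))
  }

  val addOne: Fn[Int, Int] = new Fn[Int, Int] { def apply(v: Int): Int = v + 1 }
  val double: Int => Int = _ * 2
}
```

Here addOne.andThen(double)(3) computes (3 + 1) * 2, while addOne.compose(double)(3) computes (3 * 2) + 1.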


2) Easier to parallelize. As shown below, input.par returns a parallel collection on which map, filter, etc run in parallel (since Scala 2.13, .par comes from the separate scala-parallel-collections module).

Scala is a hybrid language supporting both OOP & FP, so it supports both var (i.e. mutable) & val (i.e. immutable) variable assignments. When you use “val”, you cannot reassign the variable once it is initialised. How can you create real world apps with immutability? Instead of modifying an object in place, you create a modified copy and leave the original untouched. Immutable objects are thread-safe, hence work on them can be parallelized.
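The copy-instead-of-modify idea can be sketched with a case class. The Account type and its fields are hypothetical and used only for illustration:

```scala
object ImmutableCopy {
  // Hypothetical immutable data type.
  case class Account(owner: String, balance: Int)

  val original: Account = Account("Ann", 100)

  // copy returns a new Account; original is left untouched.
  val updated: Account = original.copy(balance = 150)
}
```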

3) Lazy evaluation, where a program may postpone evaluating an expression until its value is actually needed. Apache Spark, which is a big data computation engine, uses this technique at its core.

4) Memoizable. Memoization is an optimisation technique that reduces computation time at the expense of space by caching the results of computations, keyed by the values of the operands or arguments.
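A small sketch of deferred evaluation using Scala's lazy val. The counter exists only to make the moment of evaluation observable and is not something you would keep in real code:

```scala
object LazyExample {
  // Instrumentation for this sketch: counts how often the block below runs.
  var evaluations: Int = 0

  // The block is not run when the object is initialised,
  // only on the first access to `expensive`; the result is then cached.
  lazy val expensive: Int = {
    evaluations += 1
    (1 to 1000).sum
  }
}
```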

Memoization makes sense only if the result of the function will be the same for a given set of arguments or input. Since pure functions have this property, they’re readily memoizable. Here is an example of caching the isOdd(..) function value.
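The listing is not reproduced here. One possible sketch, assuming a simple mutable map as the cache (not thread-safe) and a counter added purely to show when the real computation runs:

```scala
import scala.collection.mutable

object MemoExample {
  // Instrumentation for this sketch: counts real computations.
  var computations: Int = 0

  // Cache keyed by the argument. A plain mutable map, so not thread-safe.
  private val cache = mutable.Map.empty[Int, Boolean]

  private def computeIsOdd(n: Int): Boolean = {
    computations += 1
    n % 2 != 0
  }

  // Looks up the cached result, computing and storing it only on a miss.
  def isOdd(n: Int): Boolean = cache.getOrElseUpdate(n, computeIsOdd(n))
}
```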


Can you build applications without side effects?

An application isn’t very useful if it can’t read from or write to the outside world like files & databases. So, you write the core of your application using pure functions and then write an impure “wrapper” around that core to interact with the outside world like files, databases, consoles, etc. There are ways to make impure interactions with the outside world feel a little more pure. You will later learn about Monads, where you can have an IO Monad for dealing with user input, files, networks, and databases. But in the end, FP applications have a core of pure functions combined with other functions that interact with the outside world.

Q. Is the below function pure?
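The listing for this question is not reproduced here. A sketch of what such a function might look like, assuming a hypothetical Student type with name and marks fields and the function name whoGotMore used later in this post:

```scala
object ImpureReport {
  // Hypothetical model: the field names are assumptions.
  case class Student(name: String, marks: Int)

  // Impure: decides the winner and prints the result in one step.
  def whoGotMore(s1: Student, s2: Student): Unit =
    if (s1.marks > s2.marks) println(s"${s1.name} got more marks")
    else if (s2.marks > s1.marks) println(s"${s2.name} got more marks")
    else println("Both got the same marks")
}
```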

A. No, as it prints output to the console.

Q. Can you refactor the above function into two, where one is pure & one has side effects?
A. Yes, the println(..) needs to be extracted out to a separate function.

This can be further refactored as shown below:
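The refactored listings are not reproduced here. A sketch of where the refactoring might end up, using the function names evaluate, evalMsg and whoGotMore from this post; the Student fields and the evalMsg signature are assumptions:

```scala
object PureCore {
  // Hypothetical model: the field names are assumptions.
  case class Student(name: String, marks: Int)

  // Pure: Some(winner), or None when the marks are equal.
  def evaluate(s1: Student, s2: Student): Option[Student] =
    if (s1.marks > s2.marks) Some(s1)
    else if (s2.marks > s1.marks) Some(s2)
    else None

  // Pure: turns the result into a message without printing it.
  def evalMsg(winner: Option[Student]): String =
    winner.map(s => s"${s.name} got more marks").getOrElse("Both got the same marks")

  // Impure wrapper: the only place that touches the console.
  def whoGotMore(s1: Student, s2: Student): Unit =
    println(evalMsg(evaluate(s1, s2)))
}
```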


The functions evaluate(..) and evalMsg() can be easily unit tested. They are pure functions that return the same output for the given inputs no matter how many times they are invoked.

Finally, the above code can be further abstracted by introducing an IO trait as shown below.
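The listing is not reproduced here. A minimal sketch of such an IO trait, where the helper name printLine, the Student fields and the evalMsg signature are assumptions:

```scala
object IoTrait {
  // Hypothetical model: the field names are assumptions.
  case class Student(name: String, marks: Int)

  // Describes an action; nothing happens until run() is called.
  trait IO { def run(): Unit }

  // Hypothetical helper: wraps a println in an IO value.
  def printLine(msg: String): IO = new IO { def run(): Unit = println(msg) }

  def evaluate(s1: Student, s2: Student): Option[Student] =
    if (s1.marks > s2.marks) Some(s1)
    else if (s2.marks > s1.marks) Some(s2)
    else None

  def evalMsg(winner: Option[Student]): String =
    winner.map(s => s"${s.name} got more marks").getOrElse("Both got the same marks")

  // Returns a description of the printing, not the printing itself.
  def whoGotMore(s1: Student, s2: Student): IO = printLine(evalMsg(evaluate(s1, s2)))
}
```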

Now the function whoGotMore(..) has no side effects as it returns a value of type IO. The IO value describes an action or effect that needs to take place, but does not print anything. The printing takes place only when you invoke the “run()” method on the IO instance.


Side effects to a composable IO Monad

Here is the code with side effects. The println statements & reading input from the user via console are side effects. The println returns a type Unit.
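The listing is not reproduced here. A minimal sketch of such code, with the prompt text and the method name greet as assumptions:

```scala
import scala.io.StdIn

object SideEffectingGreeter {
  // Every line in the body performs console I/O, and the method returns Unit.
  def greet(): Unit = {
    println("What is your name?") // side effect: console output
    val name = StdIn.readLine()   // side effect: console input
    println(s"Hello, $name")      // side effect: console output
  }
}
```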


Let’s capture the side effects in an IO Monad as shown below:

Let’s now compose the IO Monad values into a single program. The side effects are deferred until the program is executed with run():
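The listings are not reproduced here. A minimal hand-rolled sketch of an IO Monad and its use in a for-comprehension; this is an illustration only (it is not the cats.effect IO and it is not stack-safe), and the helper names printLine and readLine are assumptions:

```scala
import scala.io.StdIn

object IoMonad {
  // Wraps a deferred computation; nothing runs until run() is called.
  final class IO[A](thunk: () => A) {
    def run(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO(() => f(run()))
    def flatMap[B](f: A => IO[B]): IO[B] = new IO(() => f(run()).run())
  }

  object IO {
    // The by-name parameter keeps the effect from running at construction time.
    def apply[A](a: => A): IO[A] = new IO(() => a)
  }

  def printLine(msg: String): IO[Unit] = IO(println(msg))
  val readLine: IO[String] = IO(StdIn.readLine())

  // Building the program performs no I/O; it only describes the steps.
  val program: IO[Unit] = for {
    _    <- printLine("What is your name?")
    name <- readLine
    _    <- printLine(s"Hello, $name")
  } yield ()
}
```

Calling IoMonad.program.run() is the single place where the console is actually read and written.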


Bonus Question

This is for the advanced Scala functional programmers & you can come back to this later, once you have read Q&As on Monads.

Q. What do the terms effect or effectful mean in functional programming?
A. These terms have nothing to do with side effects. An effect is related to a concept known as the Monad, which is a container type that has functions like flatMap & map. The effect is the main purpose of each individual monad, i.e. what the monad handles. For example:

Option[T] – is a Monad that has the effect of optionality : You get a Some or None.

Try – abstracts the effect of failures as it manages exceptions as effects.

Future is a monad that models latency as an effect.

IO is a Monad from the Scala cats.effect library for encoding side effects as pure values, capable of expressing both synchronous and asynchronous computations.

Functions with effects return F[A] rather than A. F & A are types, where “A” can be an Int, String, etc and “F” can be an Option, Future, List, etc. For example, the above evaluate(…) function takes (s1: Student, s2: Student) as input & returns Option[Student] as output (i.e. not just Student). This can be written as (s1: Student, s2: Student) => Option[Student]. This can be written in an abstract way as (a1: A, a2: A) => F[A] (not just A, but F[A]).

Q. What is the benefit of returning monadic A => F[A]?
A. In Scala, you can sequence operations when they are Monadic with the for-comprehension.

When all 3 are valid values:

When there is a None:
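The listings for both cases are not reproduced here. A small sketch with Option, where the function name sum3 is hypothetical:

```scala
object OptionSequencing {
  // Sequences three optional values; the first None short-circuits the rest.
  def sum3(a: Option[Int], b: Option[Int], c: Option[Int]): Option[Int] =
    for {
      x <- a
      y <- b
      z <- c
    } yield x + y + z
}
```

OptionSequencing.sum3(Some(1), Some(2), Some(3)) gives Some(6), while OptionSequencing.sum3(Some(1), None, Some(3)) gives None.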

There are more examples of sequencing operations at: Q93 – Q98 Scala monads interview Q&As.

Unlike object oriented programming, in functional programming the Data & behaviour are separated. The Data is represented with Sum & Product types discussed later with Q109 – Q113 Scala ADT (Algebraic Data Types) Interview Q&As and the behaviour is achieved with pure functions & effects.

Functional Programming - Pure Vs Effects


Learn more at: Scala: What do “effect” and “effectful” mean in functional programming? by Alvin Alexander.
