Elixir's for-loops go beyond "comprehension"

By Vinicius Negrisolo

Elixir's for-loop packs so many features into a single statement that it's worth reviewing the major ones. In this post we will dig into the for-loop by covering mapping, filtering, and reducing.

Elixir's for-loop is not new to me. I have been using it for years, and still I didn't know about the reduce option. This made me take a break, go back to the docs, and review what is possible with this loop.

To start with, Elixir's for-loop is also known as a comprehension. The same word is used in other languages, such as Python or Haskell, to describe syntactic sugar for applying some common operations to a list. Comprehensions in Elixir take the form of a for-loop. So if you come from a language that does not have comprehensions, you may think that a for-loop only iterates over an enumerable, but it does much more than that.

Mapping

The most basic use of Elixir's comprehension is to map over an enumerable. This means iterating over each value of a list and building some other value from it. The return value is a list of the built values.

Here are some examples:

for n <- 1..5 do
  n * 2
end
# => [2, 4, 6, 8, 10]
for n <- [1, 2, 3, 4, 5] do
  n * 2
end
# => [2, 4, 6, 8, 10]
for << n <- "acb123" >> do
  n + 1
end
# => 'bdc234' (a charlist; printed as ~c"bdc234" since Elixir 1.15)

The element <- list part of the for-loop is called a generator in Elixir. For each element in the list, it binds a variable and runs the do block with it. The << n <- "acb123" >> form in the last example is a bitstring generator, which iterates over the bytes of a binary.

Multiple generators: Cartesian product

Elixir allows multiple generators per for-loop. In this case the do block is called once for each possible combination of the values produced by the generators. It's easier to look at an example to understand how multiple generators work:

names = ~w[James John Patricia]
surnames = ~w[Johnson Smith Williams]

for name <- names,
    surname <- surnames do
  "#{name} #{surname}"
end
# => [
# =>   "James Johnson",
# =>   "James Smith",
# =>   "James Williams",
# =>   "John Johnson",
# =>   "John Smith",
# =>   "John Williams",
# =>   "Patricia Johnson",
# =>   "Patricia Smith",
# =>   "Patricia Williams"
# => ]

The result is a flat list of all possible combinations of names and surnames. This is a remarkable feature of the language.
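A later generator can also read variables bound by an earlier one, which narrows the product. Here is a small sketch with made-up ranges (the name pairs is just for illustration), where the second range starts at the current value of i:

```elixir
# The second generator depends on `i`, bound by the first generator,
# so only pairs with j >= i are produced.
pairs =
  for i <- 1..3,
      j <- i..3 do
    {i, j}
  end

# => [{1, 1}, {1, 2}, {1, 3}, {2, 2}, {2, 3}, {3, 3}]
```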

Filtering

Filtering in Elixir for-loops can be done in two different ways:

Filtering by pattern matching

Filtering by pattern matching seems to be an unknown feature for some Elixir devs that I've talked to in the past. In most places where we can use a pattern match in Elixir, the default behavior for a non-match is to raise an error. The only exceptions in the language that I know of are the with and for statements when used with the <- left arrow operator. In a for-loop, if the left side of the generator does not match, the loop just ignores that value and continues processing. This can be surprising if we are not aware of the feature.

people = [
  %{name: "John", active: true},
  %{name: "Patricia", active: false}
]
for %{active: true, name: name} <- people do
  name
end
# => ["John"]

In this example we iterate over a list of people, keep only the active ones, and destructure the name value to return it.

A similar result can be achieved with guard clauses:
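To make the contrast with the = operator concrete, here is a sketch using a hypothetical list of tagged tuples (results, oks, and raised? are names I made up for the example). The generator silently skips whatever does not match, while = raises on the first mismatch:

```elixir
# Hypothetical mixed results, only some of which are {:ok, value} tuples
results = [{:ok, 1}, {:error, :timeout}, {:ok, 3}, :unexpected]

# The generator pattern skips every element that does not match
oks =
  for {:ok, value} <- results do
    value
  end

# => [1, 3]

# The same pattern with `=` raises a MatchError on {:error, :timeout}
raised? =
  try do
    Enum.map(results, fn item ->
      {:ok, value} = item
      value
    end)

    false
  rescue
    MatchError -> true
  end

# => true
```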

people = [
  %{name: "John", active: true},
  %{name: "Patricia", active: false}
]
for %{active: active, name: name} when active == true <- people do
  name
end
# => ["John"]

Filtering by truthiness

To explain this section we need to cover another part of the for-loop syntax in Elixir.

Elixir allows "non-generator" clauses in the for-loop, and they are called filters. They are called filters because each one is evaluated and, if it returns nil or false, the current value is discarded. But filters can also be a way to destructure element values or to build partial values for the for-loop processing. Let's take a look at some examples:

people = [
  %{name: "John", active: true, age: 30},
  %{name: "Patricia", active: false, age: 45}
]

for person <- people,
    person.age > 40 do
  person.name
end
# => ["Patricia"]

As you can see, person.age > 40 is evaluated for each person, and if this "filter" is falsy the person is discarded, as happened with John because he's just too young for the example.

Filters are regular Elixir expressions, so you can also use the = operator:

for person <- people,
    name = person.name do
  name
end
# => ["John", "Patricia"]

This also means that a non match in a filter will raise a MatchError:

for person <- people,
    %{wrong_name: name} = person do
  name
end
# ** (MatchError) no match of right hand side value: %{active: true, age: 30, name: "John"}

Finally, a reminder: a filter such as active = person.active evaluates to the value of person.active, so when that value is false or nil the "destructuring" implicitly removes that person from processing. So watch out for this:

for person <- people,
    name = person.name,
    active = person.active do
  "#{name} status is: #{active}"
end
# => ["John status is: true"]

for person <- people,
    name = person.name do
  "#{name} status is: #{person.active}"
end
# => ["John status is: true", "Patricia status is: false"]
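One way around this pitfall, sketched here with the same people data, is to destructure in the generator pattern instead of in a filter. A pattern only has to match; the bound values are never tested for truthiness, so Patricia is kept:

```elixir
people = [
  %{name: "John", active: true, age: 30},
  %{name: "Patricia", active: false, age: 45}
]

# `active` is bound by the generator pattern, not by a filter,
# so a false value does not discard the person.
statuses =
  for %{name: name, active: active} <- people do
    "#{name} status is: #{active}"
  end

# => ["John status is: true", "Patricia status is: false"]
```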

Order Matters

This section highlights that the order of generators and filters matters. Check this example:

for i <- 1..2, IO.inspect(i), j <- 5..6 do
  {i, j}
end
# 1
# 2
# => [{1, 5}, {1, 6}, {2, 5}, {2, 6}]

As we can see here, the i value is inspected twice, once for each value in the range 1..2. But in the following example:

for i <- 1..2, j <- 5..6, IO.inspect(i) do
  {i, j}
end
# 1
# 1
# 2
# 2
# => [{1, 5}, {1, 6}, {2, 5}, {2, 6}]

The i value is inspected 4 times. This happens because both the i and j generators come before IO.inspect(i). Even though I am not using the value of j in the inspection, the filter runs once for every combination produced by the "cartesian product" of the generators. Let's keep that in mind.
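A practical consequence, sketched with made-up values: a filter that depends only on the first generator is better placed right after it. In the example below the rem/2 check runs 4 times (once per i) instead of 8 times (once per pair), and the second generator is never expanded for the discarded values:

```elixir
# The filter sits between the generators, so odd values of `i`
# are dropped before `tag` is ever generated for them.
evens_with_tags =
  for i <- 1..4,
      rem(i, 2) == 0,
      tag <- [:a, :b] do
    {i, tag}
  end

# => [{2, :a}, {2, :b}, {4, :a}, {4, :b}]
```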

Unique

We can pass a uniq: true option to the loop, and it's pretty straightforward:

for n <- [1, 1, 2, 2, 2, 3] do
  n
end
# => [1, 1, 2, 2, 2, 3]
for n <- [1, 1, 2, 2, 2, 3], uniq: true do
  n
end
# => [1, 2, 3]
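Note that uniq: true applies to the values returned by the do block, not to the input elements. A small sketch with an input list that has no duplicates at all:

```elixir
# All inputs are distinct, but the block results repeat (1, 2, 0, 1, 2, 0),
# so only the first occurrence of each result is kept.
remainders =
  for n <- [1, 2, 3, 4, 5, 6], uniq: true do
    rem(n, 3)
  end

# => [1, 2, 0]
```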

Into

Another useful option is into, which works similarly to Enum.into/2:

people = [
  %{name: "John", active: true},
  %{name: "Patricia", active: false}
]
for person <- people, into: %{} do
  {person.name, person}
end
# => %{
# =>   "John" => %{active: true, name: "John"},
# =>   "Patricia" => %{active: false, name: "Patricia"}
# => }
for c <- [48, 49, 50, 51, 52], into: "" do
  <<c>>
end
# => "01234"
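The target of into does not have to be empty, and it can be any structure that implements the Collectable protocol. The sketch below uses hypothetical names (defaults, settings, unique_lengths) to collect into a map that already has a key and into a MapSet:

```elixir
# Collecting into a non-empty map: new pairs are added and
# an existing key ("role") is overwritten.
defaults = %{"role" => "guest"}

settings =
  for {key, value} <- [{"name", "John"}, {"role", "admin"}], into: defaults do
    {key, value}
  end

# => %{"name" => "John", "role" => "admin"}

# Collecting into a MapSet keeps each length only once
unique_lengths =
  for word <- ~w[one two three], into: MapSet.new() do
    String.length(word)
  end

MapSet.to_list(unique_lengths)
# => [3, 5]
```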

Reduce

I finally learned this week about the reduce option. The usage is very intuitive, but it comes with a new syntax element for this statement, the acc -> clause:

people = [
  %{name: "John", surname: "Smith", active: true},
  %{name: "John", surname: "Williams", active: true},
  %{name: "Patricia", surname: "Jones", active: false}
]
for person <- people, reduce: %{} do
  acc ->
    Map.update(acc, person.name, [person], &[person | &1])
end
# => %{
# =>   "John" => [
# =>     %{active: true, name: "John", surname: "Williams"},
# =>     %{active: true, name: "John", surname: "Smith"}
# =>   ],
# =>   "Patricia" => [%{active: false, name: "Patricia", surname: "Jones"}]
# => }

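As we can see, the acc -> clause acts as an anonymous function without the fn ... end words. The initial accumulator is the value given to reduce:, and whatever the block returns becomes the accumulator for the next iteration. This acc -> syntax is not too much to digest, but it certainly adds to the syntax complexity of the for-loop.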
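To see how the acc -> clause maps to an explicit anonymous function, here is the same grouping written both ways. The with_for and with_enum names are only for this comparison:

```elixir
people = [
  %{name: "John", surname: "Smith", active: true},
  %{name: "John", surname: "Williams", active: true},
  %{name: "Patricia", surname: "Jones", active: false}
]

# The comprehension with the reduce option
with_for =
  for person <- people, reduce: %{} do
    acc -> Map.update(acc, person.name, [person], &[person | &1])
  end

# The same fold written with Enum.reduce/3 and an explicit fn ... end
with_enum =
  Enum.reduce(people, %{}, fn person, acc ->
    Map.update(acc, person.name, [person], &[person | &1])
  end)

with_for == with_enum
# => true
```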

Number of Iterations

When working with Enum we usually apply more than one operation to a list, so we end up iterating multiple times, basically once per Enum function applied.

people = [
  %{name: "John", active: true},
  %{name: "Kevin", active: true},
  %{name: "Patricia", active: false},
  %{name: "Jennifer", active: true}
]

people
|> Enum.filter(& &1.active)
|> Enum.map(&{&1.name, &1})
|> Enum.into(%{})

In this case we're iterating 3 times: once over the people list and once over each intermediate result.

But when we work with a for-loop we can reduce that to a single iteration. We could rewrite the same code as:

for %{active: true} = person <- people, into: %{} do
  {person.name, person}
end

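This avoids the intermediate lists and the extra passes, which is a performance boost that comes essentially for free, although how much it matters depends on the size of the collection.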
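If you want to check this on your own machine, a rough sketch like the following compares both versions with :timer.tc/1. The generated data set and its size are assumptions of mine, and a single run like this is only an indication, not a proper benchmark:

```elixir
# Hypothetical data set, large enough for the timing to be visible
people =
  for i <- 1..100_000 do
    %{name: "person-#{i}", active: rem(i, 2) == 0}
  end

# :timer.tc/1 returns {elapsed_microseconds, result}
{enum_us, with_enum} =
  :timer.tc(fn ->
    people
    |> Enum.filter(& &1.active)
    |> Enum.map(&{&1.name, &1})
    |> Enum.into(%{})
  end)

{for_us, with_for} =
  :timer.tc(fn ->
    for %{active: true} = person <- people, into: %{} do
      {person.name, person}
    end
  end)

# Both approaches build the same map
with_enum == with_for
# => true

IO.puts("Enum pipeline: #{enum_us} us, for-loop: #{for_us} us")
```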

Conclusion

Elixir's comprehension is a great tool to use when working with enumerables. It brings features such as filtering, mapping, reducing, and unique lists together in a single statement. It's also possible to work with multiple generators if the app needs that, and the cartesian product is a very elegant and handy approach to that.

The syntax of the for-loop can be a bit more complex compared to regular for-loops in other programming languages. To be fair, that's very much expected, as Elixir's comprehension puts so many features together in a single statement.

If you want to read the docs, check Kernel.SpecialForms.for/1.

At Hashrocket, we love Elixir and Phoenix! Reach out if you need help with your Elixir projects!

Photo by Ross Sneddon on Unsplash
