Erlang/OTP 25 Highlights
OTP 25 is finally here. This post will introduce the new features that I am most excited about.
You can download the readme describing all the changes here: Erlang/OTP 25 Readme. Or, as always, look at the release notes of the application you are interested in. For instance here: Erlang/OTP 25 - Erts Release Notes - Version 13.0.
This years highlights are:
- New functions in the
maps
andlists
modules - Selectable features and the new
maybe_expr
feature - Dialyzer
- Improvements of the JIT
- Relocatable installation directory
- ETS-tables with adaptive support for write concurrency
- New option
short
forerlang:float_to_list/2
anderlang:float_to_binary/2
- The new module
peer
supersedes the slave module gen_xxx
modules has got a newformat_status/1
callback- The
timer
module has been modernized and made more efficient - Crypto and OpenSSL 3.0
- CA-certificates can be fetched from the OS standard place
- A new fast Pseudo Random Generator
New functions in the maps
and lists
modules #
Triggered by suggestions from the users we have introduced new functions in the maps
and lists
modules in stdlib
.
maps:groups_from_list/2,3
#
For short we can say that this function take a list of elements and group them. The result is a map #{Group1 => [Group1Elements], GroupN => [GroupNElements]}
.
Let us look at some examples from the shell:
> maps:groups_from_list(fun(X) -> X rem 2 end, [1,2,3]).
#{0 => [2], 1 => [1, 3]}
The provided fun calculates X rem 2
for every element X
in the input list and then group the elements in a map with the result of X rem 2
as key and the corresponding elements as a list value for that key.
> maps:groups_from_list(fun erlang:length/1, ["ant", "buffalo", "cat", "dingo"]).
#{3 => ["ant", "cat"], 5 => ["dingo"], 7 => ["buffalo"]}
In the example above the strings in the input list are grouped into a map based on their length.
There is also a variant of groups_from_list
with an additional fun by which the values can be converted before they are put into their groups.
> maps:groups_from_list(fun(X) -> X rem 2 end, fun(X) -> X*X end, [1,2,3]).
#{0 => [4], 1 => [1, 9]}
In the example above the elements X
in the list are grouped according the X rem 2
calculation but the values stored in the groups are the elements multiplied by themselves (X * X
).
> maps:groups_from_list(fun erlang:length/1, fun lists:reverse/1, ["ant", "buffalo", "cat", "dingo"]).
#{3 => ["tna","tac"],5 => ["ognid"],7 => ["olaffub"]}
In the example above the strings from the input list are grouped according to their length and they are reversed before they are stored in the groups.
For more details see the maps:groups_from_list/2
documentation.
lists:enumerate/1,2
#
Takes a list of elements and returns a new list where each element has been associated with its position in the original list. Returns a new list with tuples of the form {I, H}
where I
is the position of H
in the original list. The enumeration starts with 1 and increases by 1 in each step.
Example:
> lists:enumerate([a,b,c]).
[{1,a},{2,b},{3,c}]
There is also a enumerate/2
function which can be used to set the initial number to something else than 1. See example below:
> lists:enumerate(10, [a,b,c]).
[{10,a},{11,b},{12,c}]
For more details see the lists:enumerate/1
documentation.
lists:uniq/1,2
#
Removes duplicates from a list while preserving the order of the elements. The first occurrence of each element is kept.
We already have lists:usort
which also removes duplicates but returns a sorted list.
Examples:
> lists:uniq([3,3,1,2,1,2,3]).
[3,1,2]
> lists:uniq([a, a, 1, b, 2, a, 3]).
[a, 1, b, 2, 3]
lists:uniq/2
allows the user to specify with a fun how to determine that 2 elements in the list are equal. In the example below the provided fun is just testing the first element of the 2 tuples for equality.
Examples:
> lists:uniq(fun({X, _}) -> X end, [{b, 2}, {a, 1}, {c, 3}, {a, 2}]).
[{b, 2}, {a, 1}, {c, 3}]
For more details see the lists:uniq/1
documentation.
Selectable features and the new maybe_expr
feature #
Selectable features is a new mechanism and concept where a new potentially incompatible feature (language or runtime), can be introduced and tested without causing troubles for those that don’t use it.
When it comes to language features the intention is that they can be activated per module with no impact on modules where they are not activated.
Let’s use the new maybe_expr
feature as an example.
In module my_experiment
the feature is activated and used like this:
-module(my_experiment).
-export([foo/1]).
%% Enable the feature maybe_expr in this module only
%% Makes maybe a keyword which might be incompatible
%% in modules using maybe as a function name or an atom
-feature(maybe_expr,enable).
foo() ->
maybe
{ok, X} ?= f(Foo),
[H|T] ?= g([1,2,3]),
...
else
{error, Y} ->
{ok, "default"};
{ok, _Term} ->
{error, "unexpected wrapper"}
end.
The compiler will note that the feature maybe_expr
is enabled and will handle the maybe construct correctly. In the generated .beam
file it will also be noted that
the module has enabled the feature.
When starting an Erlang node the specific feature (or all) must be enabled otherwise the .beam
file with the feature will not be allowed for loading.
erl -enable-feature maybe_expr
Or
erl -enable-feature all
For more details see the feature section in the Erlang Reference Manual.
The new maybe_expr
feature EEP-49 #
The EEP-49 “Value-Based Error Handling Mechanisms”, was suggested by Fred Hebert already 2018 and now it has finally been implemented as the first feature within the new feature concept.
The maybe ... end
construct is similar to begin ... end
in that it is used to group multiple distinct expressions as a
single block. But there is one important difference in that the
maybe
block does not export its variables while begin
does
export its variables.
A new type of expressions (denoted MatchOrReturnExprs
) are introduced, which are only valid within a
maybe ... end
expression:
maybe
Exprs | MatchOrReturnExprs
end
MatchOrReturnExprs
are defined as having the following form:
Pattern ?= Expr
This definition means that MatchOrReturnExprs
are only allowed at the
top-level of maybe ... end
expressions.
The ?=
operator takes the value returned by Expr
and pattern matches
it against Pattern
.
If the pattern matches, all variables from Pattern
are bound in the local
environment, and the expression is equivalent to a successful Pattern = Expr
call. If the value does not match, the maybe ... end
expression returns the
failed expression directly.
A special case exists in which we extend maybe ... end
into the following form:
maybe
Exprs | MatchOrReturnExprs
else
Pattern -> Exprs;
...
Pattern -> Exprs
end
This form exists to capture non-matching expressions in a MatchOrReturnExprs
to handle failed matches rather than returning their value. In such a case, an
unhandled failed match will raise an else_clause
error, otherwise identical to
a case_clause
error.
This extended form is useful to properly identify and handle successful and unsuccessful matches within the same construct without risking to confuse happy and unhappy paths.
Given the structure described here, the final expression may look like:
maybe
Foo = bar(), % normal exprs still allowed
{ok, X} ?= f(Foo),
[H|T] ?= g([1,2,3]),
...
else
{error, Y} ->
{ok, "default"};
{ok, _Term} ->
{error, "unexpected wrapper"}
end
For more details see the maybe section in the Erlang Reference Manual.
Motivation #
With the maybe
construct it is possible to reduce deeply nested conditional expressions and make messy patterns found in the wild unnecessary. It also provides a better separation of concerns when implementing functions.
Reducing Nesting #
One common pattern that can be seen in Erlang is deep nesting of case
... end
expressions, to check complex conditionals.
Take the following code taken from Mnesia, for example:
commit_write(OpaqueData) ->
B = OpaqueData,
case disk_log:sync(B#backup.file_desc) of
ok ->
case disk_log:close(B#backup.file_desc) of
ok ->
case file:rename(B#backup.tmp_file, B#backup.file) of
ok ->
{ok, B#backup.file};
{error, Reason} ->
{error, Reason}
end;
{error, Reason} ->
{error, Reason}
end;
{error, Reason} ->
{error, Reason}
end.
The code is nested to the extent that shorter aliases must be introduced
for variables (OpaqueData
renamed to B
), and half of the code just
transparently returns the exact values each function was given.
By comparison, the same code could be written as follows with the new construct:
commit_write(OpaqueData) ->
maybe
ok ?= disk_log:sync(OpaqueData#backup.file_desc),
ok ?= disk_log:close(OpaqueData#backup.file_desc),
ok ?= file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
{ok, OpaqueData#backup.file}
end.
Or, to protect against disk_log
calls returning something else than ok |
{error, Reason}
, the following form could be used:
commit_write(OpaqueData) ->
maybe
ok ?= disk_log:sync(OpaqueData#backup.file_desc),
ok ?= disk_log:close(OpaqueData#backup.file_desc),
ok ?= file:rename(OpaqueData#backup.tmp_file, OpaqueData#backup.file),
{ok, OpaqueData#backup.file}
else
{error, Reason} -> {error, Reason}
end.
The semantics of these calls are identical, except that it is now much easier to focus on the flow of individual operations and either success or error paths.
Dialyzer #
-
Dialyzer now supports the
missing_return
andextra_return
options to raise warnings when specifications differ from inferred types. These are similar to, but not quite as verbose, asoverspecs
andunderspecs
. -
Dialyzer now better understands the types for
min/2
,max/2
, anderlang:raise/3
. Because of that, Dialyzer can potentially generate new warnings. In particular, functions that useerlang:raise/3
could now need a spec with ano_return()
return type to avoid an unwanted warning.
Improvements of the JIT #
The JIT compiler introduced in Erlang/OTP 24 improved the performance for Erlang applications.
Erlang/OTP 25 introduces some major improvements of the JIT:
-
The JIT now supports the AArch64 (ARM64) architecture, used by (for example) Apple Silicon Macs and newer Raspberry Pi devices.
-
Better code generated based on types provided by the Erlang compiler.
-
Better support for
perf
andgdb
with line numbers for Erlang code.
Support for AArch64 (ARM64) #
How much speedup one can expect from the JIT compared to the interpreter varies from nothing to up to four times.
To get some more concrete figures we have run three different benchmarks with the JIT disabled and enabled on a MacBook Pro (M1 processor; released in 2020).
First we ran the EStone benchmark. Without the JIT, 691,962 EStones were achieved and with the JIT 1,597,949 EStones. That is, more than twice as many EStones with the JIT.
Next we tried running Dialyzer to build a small PLT:
dialyzer --build_plt --apps erts kernel stdlib
With the JIT, the time for building the PLT was reduced from 18.38 seconds down to 9.64 seconds. That is, almost but not quite twice as fast.
Finally, we ran a benchmark for the base64 module included in this Github issue.
With the JIT:
== Testing with 1 MB ==
fun base64:encode/1: 1000 iterations in 11846 ms: 84 it/sec
fun base64:decode/1: 1000 iterations in 14617 ms: 68 it/sec
Without the JIT:
== Testing with 1 MB ==
fun base64:encode/1: 1000 iterations in 25938 ms: 38 it/sec
fun base64:decode/1: 1000 iterations in 20603 ms: 48 it/sec
Encoding with the JIT is almost two and half times as fast, while the decoding time with the JIT is about 75 percent of the decoding time without the JIT.
Type-based optimizations #
The JIT translates one BEAM instruction at the time to native code
without any knowledge of previous instructions. For example, the native
code for the +
operator must work for any operands: small integers that
fit in 64-bit word, large integers, floats, and non-numbers that should
result in raising an exception.
In Erlang/OTP 25, the compiler embeds type information in the BEAM file to the help the JIT generate better native code without unnecessary type tests.
For more details, see the blog post Type-Based Optimizations in the JIT.
Better support for perf
and gdb
#
It is now possible to profile Erlang systems with perf and get a mapping from the JIT code to the corresponding Erlang code. This will make it easy to find bottlenecks in the code.
The same goes for gdb
which also can show which line of Erlang code a specific address in the JIT code corresponds to.
Perf is a Linux command-line tool for lightweight CPU profiling; it checks CPU performance counters, trace points, uprobes, and kprobes, monitors program events, and creates reports.
An Erlang node running under perf
can be started like this:
perf record --call-graph fp -- erl +JPperf true
The result from perf could then be viewed like this:
perf report
It is also possible to attach perf
to an already running Erlang node like this:
# start Erlang at get the Pid
erl +JPperf true
And the pid for the node is 4711
You can then attach perf
to the node like this:
sudo perf record --call-graph fp -p 4711
Below is an example where perf
is run to analyze dialyzer
building a PLT like this:
ERL_FLAGS="+JPperf true +S 1" perf record --call-graph=fp \
dialyzer --build_plt -Wunknown --apps compiler crypto erts kernel stdlib \
syntax_tools asn1 edoc et ftp inets mnesia observer public_key \
sasl runtime_tools snmp ssl tftp wx xmerl tools
The above code is run using +S 1 to make the perf output easier to understand.
If you then run perf report -f --no-children
you may get something similar to this:
Frame pointers are enabled when the +JPperf true
option is passed, so you can
use perf record --call-graph=fp
to get more context.
Any Erlang function in the report is prefixed with a $
and all C functions have
their normal names. Any Erlang function that has the prefix $global::
refers
to a global shared fragment.
So in the above, we can see that we spend the most time doing eq
, i.e. comparing two terms.
By expanding it and looking at its parents we can see that it is the function
erl_types:t_is_equal/2
that contributes the most to this value. Go and have a look
at it in the source code to see if you can figure out why so much time is spent there.
After eq
we see the function erl_types:t_has_var/1
where we spend almost
5% of the entire execution in. A while further down you can see copy_struct_x
which is the function used to copy terms. If we expand it to view the parents
we find that it is mostly ets:lookup_element/3
that contributes to this time
via the Erlang function dialyzer_plt:ets_table_lookup/2
.
perf
tips and tricks #
You can do a lot of neat things with perf
. Below is a list of some of the options
we have found useful:
perf report --no-children
Do not include the accumulation of all children in a call.perf report --call-graph callee
Show the callee rather than the caller when expanding a function call.perf archive
Create an archive with all the artifacts needed to inspect the data on another host. In early version of perf this command does not work, instead you can use this bash script.perf report
gives “failed to process sample” and/or “failed to process type: 68” This probably means that you are running a buggy version of perf. We have seen this when running Ubuntu 18.04 with kernel version 4. If you update to Ubuntu 20.04 or use Ubuntu 18.04 with kernel version 5 the problem should go away.
Improved error information for failing binary construction #
Erlang/OTP 24 introduced improved BIF error information to provide more information when a call to a BIF failed.
In Erlang/OTP 25, improved error information is also given when the creation of a binary using the bit syntax fails.
Consider this function:
bin(A, B, C, D) ->
<<A/float,B:4/binary,C:16,D/binary>>.
If we call this function with incorrect arguments in past releases we will just be told that something was wrong and the line number:
1> t:bin(<<"abc">>, 2.0, 42, <<1:7>>).
** exception error: bad argument
in function t:bin/4 (t.erl, line 5)
But which part of line 5? Imagine that t:bin/4
was called from deep
within an application and we had no idea what the actual values for
the arguments were. It could take a while to figure out exactly what
went wrong.
Erlang/OTP 25 gives us more information:
1> c(t).
{ok,t}
2> t:bin(<<"abc">>, 2.0, 42, <<1:7>>).
** exception error: construction of binary failed
in function t:bin/4 (t.erl, line 5)
*** segment 1 of type 'float': expected a float or an integer but got: <<"abc">>
Note that the module must be compiled by the compiler in Erlang/OTP 25 in order to get the more informative error message. The old-style message will be shown if the module was compiled by a previous release.
Here the message tells us that first segment in the construction was given
the binary <<"abc">>
instead of a float or an integer, which is the expected
type for a float
segment.
It seems that we switched the first and second arguments for bin/4
,
so we try again:
3> t:bin(2.0, <<"abc">>, 42, <<1:7>>).
** exception error: construction of binary failed
in function t:bin/4 (t.erl, line 5)
*** segment 2 of type 'binary': the value <<"abc">> is shorter than the size of the segment
It seems that there was more than one incorrect argument. In this case, the message tells us that the given binary is shorter than the size of the segment.
Fixing that:
4> t:bin(2.0, <<"abcd">>, 42, <<1:7>>).
** exception error: construction of binary failed
in function t:bin/4 (t.erl, line 5)
*** segment 4 of type 'binary': the size of the value <<1:7>> is not a multiple of the unit for the segment
A binary
segment has a default unit of 8. Therefore, passing a bit string of
size 7 will fail.
Finally:
5> t:bin(2.0, <<"abcd">>, 42, <<1:8>>).
<<64,0,0,0,0,0,0,0,97,98,99,100,0,42,1>>
Improved error information for failed record matching #
Another improvement is the exceptions when matching of a record fails.
Consider this record and function:
-record(rec, {count}).
rec_add(R) ->
R#rec{count = R#rec.count + 1}.
In past releases, failure to match a record or retrieve an element from a record would result in the following exception:
1> t:rec_add({wrong,0}).
** exception error: {badrecord,rec}
in function t:rec_add/1 (t.erl, line 8)
Before Erlang/OTP 15 that introduced line numbers in exceptions, knowing which record that was expected could be useful if the error occurred in a large function.
Nowadays, unless several different records are accessed on the same line, the line number makes it obvious which record was expected.
Therefore, in Erlang/OTP 25 the badrecord
exception has been changed
to show the actual incorrect value:
2> t:rec_add({wrong,0}).
** exception error: {badrecord,{wrong,0}}
in function t:rec_add/1 (t.erl, line 8)
The new badrecord
exceptions will show up for code that has been compiled
with Erlang/OTP 25.
Relocatable installation directory #
Previously shell scripts (e.g., erl
and start
) and the RELEASES
file
for an Erlang installation depended on a hard coded absolute path to the
installation’s root directory. This made it cumbersome to move an
installation to a different directory which can be problematic for platforms
such as Android (#2879) where the
installation directory is unknown at compile time. This is fixed by:
-
Changing the shell scripts so that they can dynamically find the
ROOTDIR
. The dynamically foundROOTDIR
is selected if it differs from the hard-codedROOTDIR
and seems to point to a valid Erlang installation. Thedyn_erl
program has been changed so that it can return its absolute canonicalized path when given the –realpath argument (dyn_erl gets its absolute canonicalized path from the realpath POSIX function). The dyn_erl’s –realpath functionality is used by the scripts to get the root dir dynamically. -
Changing the release_handler module that reads and writes to the
RELEASES
file so that it prependscode:root_dir()
whenever it encounters relative paths. This is necessary since the current working directory can be changed so it is something different thancode:root_dir()
.
ETS-tables with adaptive support for write concurrency #
It has since long been possible to optimize an ETS table for write concurrency doing like this:
ets:new(my_table, [{write_concurrency, true}]).
Now we also introduce adaptive support for write concurrency which can be configured like this:
ets:new(my_table, [{write_concurrency, auto}]).
This option forces tables to automatically change the number of locks that are used at run-time depending on how much concurrency is detected. When you enable automatic write concurrency decentralized_counters
are also activated for even more scalable ETS tables. Use this option when you know that a lot of processes will be accessing an ETS table on systems with many number of cores.
For more details you can read PR 5208 that introduced the change and the blog post about decentralized counters.
New option short
for erlang:float_to_list/2
and erlang:float_to_binary/2
#
A new option called short
has been added to the functions erlang:float_to_list/2
and erlang:float_to_binary/2
. This option creates the shortest correctly rounded string representation of the given float that can be converted back to the same float again.
If option short
is specified, the float is formatted
with the smallest number of digits that still guarantees that
F =:= list_to_float(float_to_list(F, [short]))
When the float is inside the range (-2⁵³, 2⁵³), the notation that yields the smallest number of characters is used (scientific notation or normal decimal notation). Floats outside the range (-2⁵³, 2⁵³) are always formatted using scientific notation to avoid confusing results when doing arithmetic operations.
The implementation is contributed by Thomas Depierre and uses the Ryū algorithm.
Ryū, is a new algorithm to convert binary floating point numbers to their decimal representations using only fixed-size integer operations. Ryū is simpler and approximately three times faster than the previously fastest implementation. https://github.com/ulfjack/ryu
The new module peer
supersedes the slave module #
The peer
module provides functions for starting linked Erlang nodes. The Erlang node spawning new “peer” nodes is called origin
, and the newly started nodes are peers.
A peer node automatically terminates when it loses the control connection to the origin. This connection could be an Erlang distribution connection, or an alternative - TCP or standard I/O. The alternative connection provides a way to execute remote procedure calls even when Erlang Distribution is not available, allowing to test the distribution itself.
Peer node terminal input/output is relayed through the origin. If a standard I/O alternative connection is requested, console output also goes via the origin, allowing debugging of node startup and boot script execution (see -init_debug). File I/O is not redirected, contrary to slave
behavior.
The peer node can start on the same or a different host (via ssh) or in a separate container (for example Docker). When the peer starts on the same host as the origin, it inherits the current directory and environment variables from the origin.
Note #
This module is designed to facilitate multi-node testing with Common Test. Use the ?CT_PEER() macro to start a linked peer node according to Common Test conventions: crash dumps written to specific location, node name prefixed with module name, calling function, and origin OS process ID). Use random_name/1
to create sufficiently unique node names if you need more control.
A peer node started without alternative connection behaves similarly to slave(3)
.
gen_XXX
modules has got a new format_status/1
callback. #
The format_status/2
callback for gen_server
, gen_statem
and gen_event
has been deprecated in favor of the new format_status/1
callback.
The new callback adds the possibility to limit and change many more things than the just the state.
The purpose with both the old and the new format_status
callbacks are to let the user filter away sensitive information and possibly data of huge volume from the crash reports.
The timer
module has been modernized and made more efficient #
The timer module has been modernized and made more efficient, which makes the timer server less susceptible to being overloaded. The timer:sleep/1
function now accepts an arbitrarily large integer.
Crypto and OpenSSL 3.0 #
Some applications in OTP like SSL/TLS and SSH need cryptography to work. That is provided by the OTP application crypto, which interfaces Erlang to an external cryptolib in C using NIFs. The main example of such an external cryptolib is OpenSSL.
The OpenSSL cryptolib exists in many versions. OTP/crypto supports 0.9.8c and later, although only 1.1.1 is still maintained by OpenSSL.
OpenSSL has released its version 3.0 series, which is their future platform totally re-built with a new API. The APIs of previous versions (1.1.1 and older) are partly deprecated, although still available in 3.0. The support of 1.1.1 will also end in a future.
Since it is vital to get security patches in the cryptolib, and in a future only the 3.0 API might be available, OTP/crypto now from OTP-25.0 interfaces OpenSSL 3.0 using the new 3.0 API. A few functions from old APIs are still used, but they will be replaced as soon as possible.
You as a user will hopefully not notice any difference: if you have OpenSSL 1.1.1 (or older - not recommended) and build OTP, that one will be used as previously. If you have any OpenSSL 3.0 version installed, that one will be used without need of doing anything special except for normal handling of dynamic loading paths in the OS.
CA-certificates can be fetched from the OS standard place #
With the new functions public_key:cacerts_load/0,1
and public_key:cacerts_get/0
the CA certificates can be fetched from the standard place of the OS (or from a file).
They will then be cached in decoded form by use of persistent_term
which makes them available in an efficient way for the ssl
and httpc
modules. The intention with this is to make it unnecessary to depend on for example certifi
in many packages.
On Windows and MacOSx the certificate store is not an ordinary file so the information is fetched via an API using a NIF (Windows) or with an external program (MacOSx).
Example with ssl
%% makes the certificates available without copying
CaCerts = public_key:cacerts_get(),
% use the certificates when establishing a connection
{ok,Socket} = ssl:connect("erlang.org",443,[{cacerts,CaCerts}, {verify,verify_peer}]),
...
We also plan to update the http client (httpc
) to use this soon.
A new fast Pseudo Random Generator #
A new custom designed Pseudo Random Generator rand:mwc59
has been implemented. It is probably the fastest possible
generator with good quality that can be written in Erlang.
To do this it barely avoids bignums, allocating heap data,
and uses only a minimal number of fast operations.
Under the “right” circumstances: A number that takes 60 ns to generate
with the default generator can be generated in 4 ns with rand:mwc59
.
It is intended for applications in dire need for speed
in PRNG numbers, but not any of the comfort features
that rand
otherwise offers.