This EEP proposes a structured documentation API for Erlang where the documentation is handled as part of the language parser and included directly in the compiled .beam files, as a replacement for EDoc. Python, Elixir, and Clojure are examples of languages that follow this approach of treating documentation as data rather than code comments.
The main limitation in EDoc today is that the documentation is kept as code comments. This requires an explicit tool to parse said code comments, which complicates access to the docs by IDEs, from the shell, etc. There have been recent improvements in this area by making EDoc compile to EEP 48 but it still requires an explicit step.
Furthermore, the “code comments” approach is more complex implementation-wise, as it requires parsing the source code alongside the code comments, then parsing the comments themselves, and so on. Personally, I also enjoy the explicit distinction between documentation and code comments: they have different requirements and different audiences.
This EEP proposes the addition of two module attributes to Erlang: -doc and -moduledoc. It also includes an extra section on additional features that can aid writing documentation, but they are optional and some of them should likely be explored in more depth in other EEPs. Still, they are included to provide a long-term view of the features and challenges related to structured documentation.
As with EEP 48, this proposal pertains exclusively to API references and their documentation. It doesn’t cover guides, tutorials, and other documentation formats.
This EEP proposes two new attributes: -doc and -moduledoc. They could be used as follows:
-module(base64).
-moduledoc "
Convenience functions for encoding and decoding from base64.
".
-doc "
Encodes the given binary to base64.
".
-spec encode(binary()) -> binary().
encode(Binary) ->
% ....
-doc "
Decodes the given binary from base64.
".
-spec decode(binary()) -> {ok, binary()} | error.
decode(Binary) ->
% ....
The new -moduledoc attribute can be listed anywhere and it will contain the documentation for the given module. The -doc attribute must be listed before the function it documents, though not necessarily immediately before it, and it will contain the documentation for that function. For instance, the example below:
-doc "Example".
-spec example() -> ok.
example() -> ok.
is equivalent to:
-spec example() -> ok.
-doc "Example".
example() -> ok.
Listing multiple -doc attributes with string values for the same function should warn or error accordingly, unless the documentation is being set to hidden. For example, this is valid:
-doc "Example".
-doc hidden.
example() -> ok.
But this should warn/error:
-doc "Example".
-doc "Updated example".
example() -> ok.
This as well:
-doc "Example".
example(one) -> 1;
-doc "Updated example".
example(two) -> 2;
The -moduledoc attribute must be either a string or the atom hidden. Marking a module as hidden means it won’t be part of the generated documentation. For example, imagine the base64 module above delegates some of its logic to a private base64_impl module:
-module(base64_impl).
-moduledoc hidden.
Note a module may be hidden but individual functions can still be documented:
-module(base64_impl).
-moduledoc hidden.
-doc "
Some comments as if it was public.
".
decode64(Binary) ->
% ...
According to EEP 48, this is intentional. For example, base64_impl should be private for users of the base64 functionality, but a developer working directly on the base64 module may still want to access the docs for base64_impl functions directly from their IDE. Each documentation tool should honor hidden accordingly. If no -doc is provided, it defaults to none according to EEP 48.
The -doc attribute accepts the hidden atom too.
Some developers prefer not to place the documentation alongside the source code. For such cases, -doc and -moduledoc may also be given a {file, Path} tuple, where Path is a relative path from the root of the project to the documentation source:
-moduledoc({file, "doc/src/manual/my_module.asciidoc"}).
-doc({file, "doc/src/manual/my_module.my_function.asciidoc"}).
The file will be read by the compiler and embedded into the chunk at compilation time.
It is up for debate whether private functions should support the -doc attribute. Elixir warns if this is used, but this has been a source of complaints in the past. Given that Erlang declares the visibility of a function outside of its definition, via the -export attribute, I personally argue that Erlang should allow -doc for non-exported functions, especially to avoid warnings when flags such as -compile(export_all) are used. In such cases, however, the values of the -doc attribute should never go to the Docs chunk, as per EEP 48.
The -doc attribute can also be used to document types and callbacks. Here, we have two options: reuse the -doc attribute, or add dedicated -callbackdoc and -typedoc attributes. Those approaches are mostly equivalent.
The new module attributes must also support documentation metadata by passing a map as argument:
-module(beam64).
-moduledoc "
Convenience functions for encoding and decoding from base64.
".
-moduledoc #{
authors => [<<"The Erlang/OTP team">>],
license => <<"Apache 2 License">>,
cross_references => [binary]
}.
If -moduledoc is given a map multiple times, the maps will be merged. This comes with the added benefit that shared metadata can be moved to a header file:
%% prelude.hrl
-moduledoc #{
authors => [<<"The Erlang/OTP team">>],
license => <<"Apache 2 License">>
}.
which we can then include and augment:
-module(beam64).
-include("prelude.hrl").
-moduledoc "
Convenience functions for encoding and decoding from base64.
".
-moduledoc #{cross_references => [binary]}.
A list of built-in attributes is available in EEP 48.
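The merge behaviour described above could be sketched as a fold over the metadata maps in source order. This is only an illustration: the module and function names are hypothetical, and the rule that a later attribute wins on a key conflict is an assumption, since the proposal does not state how conflicts are resolved.

```erlang
-module(moduledoc_merge_sketch).
-export([merge_metadata/1]).

%% Fold the maps given to successive -moduledoc attributes, in source
%% order, into a single metadata map.
%% ASSUMPTION: on a key conflict, the later attribute wins. This is what
%% maps:merge/2 does when the later map is its second argument.
merge_metadata(Maps) ->
    lists:foldl(fun(Map, Acc) -> maps:merge(Acc, Map) end, #{}, Maps).
```

With the header file and module above, the list would hold the map from prelude.hrl followed by the cross_references map, giving one map with all three keys.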
Compiling a module with the -moduledoc or -doc attributes will automatically generate a Docs chunk into its .beam file, making the documentation directly accessible in the shell. A -nodocs flag can be added to erlc to skip the generation of the chunk.
Release tools should also prune the Documentation chunk out of .beam files by default. Note this is already done by beam_lib:strip_release/1 and beam_lib:strip_files/1.
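As a rough illustration of what a tool sees before and after pruning, the sketch below reads the Docs chunk from a .beam file (or from an in-memory BEAM binary) with beam_lib:chunks/3. The module and function names are hypothetical, and the sketch assumes the chunk holds a term in external term format, as EEP 48 describes.

```erlang
-module(docs_chunk_sketch).
-export([read_docs/1]).

%% Returns {ok, Docs} when the BEAM file or binary carries a Docs chunk,
%% none when the chunk is absent (for example after stripping), or
%% {error, Reason} when the input cannot be read as a BEAM file.
read_docs(Beam) ->
    case beam_lib:chunks(Beam, ["Docs"], [allow_missing_chunks]) of
        {ok, {_Module, [{"Docs", missing_chunk}]}} ->
            none;
        {ok, {_Module, [{"Docs", Bin}]}} ->
            %% ASSUMPTION: the chunk is a term_to_binary/1 encoded term.
            {ok, binary_to_term(Bin)};
        {error, beam_lib, Reason} ->
            {error, Reason}
    end.
```

A release tool could use a check like this to confirm that stripping removed the documentation from the files it ships.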
One important question is which format the documentation should adopt. Luckily, EEP 48 is agnostic to the format, but one must still be declared. There are a couple of options for this problem:
Erlang/OTP picks a documentation format as the default for the Erlang community, such as AsciiDoc.
The documentation format must be explicitly listed via a new attribute called -docformat, such as -docformat "text/asciidoc".
This proposal does not attempt to discuss the preferred documentation format, as it is a separate discussion. It is also possible to support both options above (pick a format but allow it to be overridden).
This section lists other topics and future extensions that are related to this proposal but are not required for the implementation of this EEP.
In the examples above, we have used char lists for the documentation. However, the Docs chunk requires the documentation to be a binary string. We have a couple of options:
The third option is likely the most reasonable for Erlang, since the string module treats both lists and binaries as strings.
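One way the compiler could bridge char lists and the binary required by the Docs chunk is to normalize the attribute value before writing it. The sketch below is a minimal illustration, not the proposed implementation: the module and function names are hypothetical, and it assumes that a value already given as a binary is valid UTF-8 and can pass through unchanged.

```erlang
-module(doc_string_sketch).
-export([normalize/1]).

%% Accept documentation written as a char list or as a binary and
%% return a binary suitable for the Docs chunk.
%% ASSUMPTION: binaries are already UTF-8 encoded and pass through as is.
normalize(Doc) when is_binary(Doc) ->
    Doc;
normalize(Doc) when is_list(Doc) ->
    case unicode:characters_to_binary(Doc) of
        Bin when is_binary(Bin) -> Bin;
        _Invalid -> error({invalid_doc, Doc})
    end.
```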
Note that similar concerns apply to documentation metadata. Attributes such as License, Authors, and so on, are required to be binaries. Since those attributes are known upfront, they can be validated/normalized by the parser/compiler.
With this in mind, it is important to remember that Erlang binaries are latin1 by default. Therefore, if the documentation contains non-latin1 characters, one would have to write:
-doc <<"Documentation about púnctuation"/utf8>>.
Similarly, the Authors metadata:
-doc #{
authors => [<<"José Valim"/utf8>>]
}.
For this reason, the Build and Packaging Working Group of the Erlang Ecosystem Foundation has also discussed the option of making Unicode binary strings easier to write in Erlang, such as u"Héllò World". While adding such a construct is not a direct part of this EEP, the point made in this subsection is that, if we require the documentation or its metadata to be a binary, we may need to provide conveniences for writing UTF-8 binary strings in Erlang.
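The difference between the two encodings can be seen at the byte level. The sketch below uses the code point 250 (the letter ú) written as an integer, so that it does not depend on the encoding of the source file; the module and function names are hypothetical.

```erlang
-module(utf8_binary_sketch).
-export([demo/0]).

demo() ->
    %% One byte per code point, which is what a plain binary string
    %% literal containing this character yields.
    Latin1 = <<250>>,
    %% The /utf8 type specifier encodes the same code point as UTF-8,
    %% which takes two bytes here.
    Utf8 = <<250/utf8>>,
    %% Converting the latin1 bytes to UTF-8 gives the same binary.
    Converted = unicode:characters_to_binary(Latin1, latin1),
    {byte_size(Latin1), byte_size(Utf8), Converted =:= Utf8}.
```

Calling utf8_binary_sketch:demo() returns {1, 2, true}: the two binaries differ in size, and an explicit conversion is needed to get from one to the other.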
Another feature often related to structured documentation is triple-quoted strings. To understand why they may be useful, consider the following example that has a double quote in its documentation:
-doc "Remove double-quotes (\") from the given string".
remove_double_quotes(String) ->
...
For this reason, we may support triple-quoted strings, which read better in multi-line format and reduce the need for escaping:
-doc """
Remove double-quotes (") from the given string
""".
remove_double_quotes(String) ->
...
Note, however, this is not a strict requirement. While Python and Elixir provide this feature, Clojure is a counterexample: a language with structured documentation but without triple-quoted strings. On the other hand, some features such as doctests, detailed below, may strongly rely on triple-quoted strings.
A direct consequence of making the documentation more structured and accessible is that Erlang can include doctests, which is the ability to run and validate the examples in your documentation. For example, someone could write this:
-doc """
Encodes the given binary to base64.
1> base64:encode("hello").
<<"aGVsbG8=">>
""".
-spec encode(binary()) -> binary().
encode(Binary) ->
% ....
And then in your test suite:
doctests(_Config) ->
ct_doctest:run(base64).
The doctest runner will access the documentation entries in the base64 Docs chunk, extract all of the examples, and run them. Of course, while there is nothing stopping doctests from being implemented on top of EDoc today, this EEP makes doctests considerably simpler to implement.
Doctests would benefit from a separate EEP, as there are some extra considerations, such as doctesting exceptions, unparseable formats, etc., but it is worth mentioning them given their benefits to users and documentation authors.
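To give a feel for how little machinery the core of a doctest runner needs once the documentation is available as data, here is a minimal sketch. It is not the ct_doctest module mentioned above; the module name, the function names, and the example format (a prompt line such as 1> followed by exactly one line holding the expected value) are all assumptions made for illustration.

```erlang
-module(doctest_sketch).
-export([examples/1, check/1]).

%% Extract {Expression, Expected} pairs from a documentation binary.
%% ASSUMPTION: an example is a line of the form "N> Expr." immediately
%% followed by a single line holding the expected value.
examples(Doc) when is_binary(Doc) ->
    Lines = [string:trim(L) || L <- binary:split(Doc, <<"\n">>, [global])],
    pair(Lines).

pair([Line, Next | Rest]) ->
    case re:run(Line, "^[0-9]+> (.*)$", [{capture, all_but_first, binary}]) of
        {match, [Expr]} -> [{Expr, Next} | pair(Rest)];
        nomatch -> pair([Next | Rest])
    end;
pair(_) ->
    [].

%% Evaluate both sides of an example and compare the resulting terms.
check({Expr, Expected}) ->
    eval(Expr) =:= eval(<<Expected/binary, ".">>).

%% Scan, parse, and evaluate a dot-terminated expression sequence.
eval(Source) ->
    {ok, Tokens, _End} = erl_scan:string(binary_to_list(Source)),
    {ok, Exprs} = erl_parse:parse_exprs(Tokens),
    {value, Value, _Bindings} = erl_eval:exprs(Exprs, erl_eval:new_bindings()),
    Value.
```

A real implementation would also need to handle multi-line expressions, bindings shared between prompts, and expected exceptions, which are among the considerations a separate EEP would have to settle.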
warn_missing_doc
Another optional feature is to support a -compile(warn_missing_doc) option that emits a warning whenever an exported function is not documented.
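The check behind such a warning amounts to a set difference between the exported functions and the documented ones. The sketch below illustrates this with plain lists of {Name, Arity} pairs; the module and function names are hypothetical, and whether a function marked with -doc hidden counts as documented is left open here, as the proposal does not say.

```erlang
-module(missing_doc_sketch).
-export([undocumented/2]).

%% Exports and Documented are lists of {Name, Arity} pairs.
%% Returns the exported functions that have no documentation entry,
%% that is, the ones that would each produce a warning.
undocumented(Exports, Documented) ->
    [FA || FA <- Exports, not lists:member(FA, Documented)].
```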
If this proposal is accepted, what happens with EDoc?
My proposal would be for EDoc to be repurposed with a new backend that updates the EDoc documentation in existing files to the new module attribute format. This way, EDoc can eventually be deprecated and phased out.
The HTML rendering part of EDoc can be repurposed to work on the Docs chunk, allowing the Docs chunk to benefit from it. Another option is to use ExDoc, which supports Erlang projects either by running it as an escript or via Rebar3 integration.
Similarly, if the goal is to eventually phase out EDoc, the Erlang/OTP team needs to discuss what happens with the existing EDoc to XML conversion in erl_docgen. Two options exist:
Generate EDoc to XML one last time and use the XML as the source from now on
Convert EDoc to the documentation style in this proposal and teach erl_docgen how to convert the Docs chunk to XML
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.