Architecture Of The Literate Code Reader Escript
Introduction
This script applies presentation-layer polish (e.g. typography, layout and whitespace) to code comments, so that architectural descriptions of software codebases are easier to read and comprehend.
(This page is simply a representation of this source file)
The Problem
Traditional documentation systems focus on documenting software as a library or subsystem - as a black box to be reused where the implementation is abstracted away - and by and large they are excellent at it.
But architectural readings are a different beast. If API/interface documentation is a set of descriptions of jigsaw pieces, with their particular indents and protrusions explored and described, then this approach seeks to provide the picture on the jigsaw box cover.
Not all files in the code base are equally useful in understanding an architecture. When building architecture documents, sometimes you just want to hide files whose contents are noise to a reader who is trying to understand the shape of the system.
The Constraints
The following constraints inform the design:
- this is a multi-language project, so it must be uncoupled from any particular language's tooling, hence the choice of a self-executing escript
- the architecture documentation must play nicely with API/interface documentation
- the documentation must be in sync (and hence in-repo) with the code
- most of the code that this could be useful for is on GitHub, so this script needs to play nicely with GitHub Pages, which uses Jekyll
Input Languages Supported
The following input languages are supported:
- erlang
- elixir
- markdown
- supercollider
- ruby
Output Formats Supported
The following output formats are supported:
- plain markdown
- markdown-for-jekyll (generates table of contents)
- html
The html mode is a bit sucky. It expects a highlight JavaScript package in each directory.
You can find that package in the git repository of this app.
Process flow
Overview
The process flow is shown below. The script is run with a series of options:
- where to find the inputs
- where to put the outputs
- what outputs to create
       Inputs                                                                                   Outputs
┌──────────────────┐                                                                      ╔══════════════════╗
│                  │─┐                                                                    ║                  ║
│      Elixir      │ │                                                                    ║     List of      ║
│      Files       │ │           ┌───────────────────────────────────────────────────────▶║      files       ║
│                  │ │           │                                                        ║                  ║
└──────────────────┘ │           │                                                        ╚══════════════════╝
┌──────────────────┐ │           │                                                        ╔══════════════════╗
│                  │ │           │                                                        ║                  ║
│      Erlang      │ │           │                                                        ║       HTML       ║
│      Files       │─┤ ┏━━━━━━━━━━━━━━━━━━┓  ┏━━━━━━━━━━━━━━━━━━┓  ┏━━━━━━━━━━━━━━━━━━┓ ┌▶║      output      ║
│                  │ │ ┃                  ┃  ┃                  ┃  ┃                  ┃ │ ║                  ║
└──────────────────┘ │ ┃     Traverse     ┃  ┃Exclude files from┃  ┃  Transform into  ┃ │ ╚══════════════════╝
┌──────────────────┐ ├▶┃ directories and  ┃─▶┃ the architecture ┃─▶┃  <comments> and  ┃─┤ ╔══════════════════╗
│                  │ │ ┃  read the files  ┃  ┃ documents build  ┃  ┃      <code>      ┃ │ ║                  ║
│  Supercollider   │ │ ┃                  ┃  ┃                  ┃  ┃                  ┃ │ ║     Markdown     ║
│      Files       │─┤ ┗━━━━━━━━━━━━━━━━━━┛  ┗━━━━━━━━━━━━━━━━━━┛  ┗━━━━━━━━━━━━━━━━━━┛ ├▶║      output      ║
│                  │ │                                                                  │ ║                  ║
└──────────────────┘ │                                                                  │ ╚══════════════════╝
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─  │                                                                  │ ╔══════════════════╗
                   │ │                                                                  │ ║                  ║
│      Other         │                                                                  │ ║      Jekyll      ║
     Languages     │─┘                                                                  └▶║    Extensions    ║
│                                                                                         ║                  ║
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘                                                                      ╚══════════════════╝
The same technique is used to make the inputs and outputs extensible and you can see it in operation wherever you see code like:
level = Kernel.apply(langmodule, :comment_level, [c])
When the files are read, the filetype is extracted and a module called extensions is called:
def get_lang_module(".ex"), do: LiterateCompiler.Languages.Elixir_lang
def get_lang_module(".exs"), do: LiterateCompiler.Languages.Elixir_lang
def get_lang_module(".elixir_architecture"), do: LiterateCompiler.Languages.Elixir_lang
def get_lang_module(".erl"), do: LiterateCompiler.Languages.Erlang
This function returns the name of the module that we use in Kernel.apply.
Notice that as well as the expected .ex and .exs we have a new filetype called .elixir_architecture.
This file type enables us to write documents like this one that contain code snippets which are invisible to the Elixir compilation tools.
To extend this script with another language, it is a simple matter of adding extra lines here and writing a new module in the LiterateCompiler.Languages namespace.
Caveat: we use a custom highlight JavaScript package to cover all the languages. You will need to generate a new version of it to add a new language. You can do that on the highlight.js download page.
(Read and compare LiterateCompiler.Languages.Elixir_lang and LiterateCompiler.Languages.Erlang)
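The lookup-then-apply pattern described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the Sketch.* module names are hypothetical, and only the comment_level/1 function name comes from the text; what each implementation returns here is invented for the example.

```elixir
# Hypothetical stand-ins for the real language modules. Only the name
# comment_level/1 comes from the document; its behaviour here is assumed.
defmodule Sketch.Languages.ElixirLang do
  # Assumption: a "##" comment is level 2, any other comment is level 1.
  def comment_level("##" <> _rest), do: 2
  def comment_level(_comment), do: 1
end

defmodule Sketch.Languages.Erlang do
  # Assumption: the level is the number of leading "%" characters.
  def comment_level(comment) do
    comment
    |> String.graphemes()
    |> Enum.take_while(&(&1 == "%"))
    |> length()
  end
end

defmodule Sketch.Extensions do
  # Map a file extension to the module that understands that language.
  def get_lang_module(".ex"), do: Sketch.Languages.ElixirLang
  def get_lang_module(".exs"), do: Sketch.Languages.ElixirLang
  def get_lang_module(".erl"), do: Sketch.Languages.Erlang
end

defmodule Sketch.Dispatch do
  # Look the module up from the file name, then call into it dynamically.
  def comment_level(path, comment) do
    langmodule = path |> Path.extname() |> Sketch.Extensions.get_lang_module()
    Kernel.apply(langmodule, :comment_level, [comment])
  end
end
```

Because the caller only ever sees the module returned by the lookup, supporting another language in this sketch means one more get_lang_module/1 clause and one more module exposing the same function names.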
On the way out, the same applies. We specify an output format in the command line args and then there is an outputter lookup.
grep for Kernel.apply to see where all this happens.
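The output side can be sketched the same way. Everything below is an assumption made for illustration: the Sketch.* modules, the render/1 function name, the format strings and the {:comment, text} / {:code, text} chunk shape are not taken from the real outputters.

```elixir
defmodule Sketch.Outputter.Markdown do
  # Hypothetical outputter: comments become prose, code becomes an
  # indented block.
  def render(chunks) do
    Enum.map_join(chunks, "\n", fn
      {:comment, text} -> text
      {:code, text} -> "    " <> text
    end)
  end
end

defmodule Sketch.Outputter.Html do
  # Hypothetical outputter. Note: no HTML escaping is done in this sketch.
  def render(chunks) do
    Enum.map_join(chunks, "\n", fn
      {:comment, text} -> "<p>" <> text <> "</p>"
      {:code, text} -> "<pre><code>" <> text <> "</code></pre>"
    end)
  end
end

defmodule Sketch.Outputter do
  # Map the format named on the command line to an outputter module.
  def get_outputter("markdown"), do: Sketch.Outputter.Markdown
  def get_outputter("html"), do: Sketch.Outputter.Html

  # Dispatch dynamically, mirroring the language-module lookup.
  def render(format, chunks) do
    Kernel.apply(get_outputter(format), :render, [chunks])
  end
end
```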
CLI Escripts
This is an escript running as a command-line tool.
An escript is just a standard Elixir application packaged into a single executable file. By default Elixir itself is embedded in that file, so the machine running it only needs Erlang/OTP installed. When run it invokes a function called main/1 with the command line arguments as a list.
The function that is called is specified in the mix.exs file. A module is nominated as the entry point and that module is expected to expose the function main/1.
In this escript the module in cli.ex is used. It uses the standard Elixir Args to process the arguments.
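As a rough sketch of the entry-point arrangement described above: the module name Sketch.CLI and the three switches are hypothetical, and OptionParser from the Elixir standard library is used here as an assumption about how arguments might be parsed. The script's real options are defined in its own source.

```elixir
defmodule Sketch.CLI do
  # In mix.exs the entry point would be nominated with something like
  #   escript: [main_module: Sketch.CLI]
  # inside the keyword list returned by project/0.

  # Hypothetical switches, invented for this example.
  @switches [input: :string, output: :string, format: :string]

  # Turn the raw argument list into a keyword list with a default format.
  def parse(argv) do
    {opts, _rest, _invalid} = OptionParser.parse(argv, strict: @switches)
    Keyword.merge([format: "markdown"], opts)
  end

  # Called by the escript with the command line arguments as a list of strings.
  def main(argv) do
    opts = parse(argv)
    IO.puts("reading #{opts[:input]}, writing #{opts[:format]} to #{opts[:output]}")
  end
end
```

Keeping the parsing in its own function, separate from main/1, makes it possible to exercise the option handling without building the escript.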
Args
The arguments that the script accepts are best observed by reading the code.
Walking The Tree
The code of the Elixir app is written in the filesystem as a set of directories that make up a tree:
.
├── index.elixir_architecture
└── literate_compiler
    ├── args.ex
    ├── cli.ex
    ├── extensions.ex
    ├── languages
    │   ├── elixir_lang.ex
    │   └── erlang.ex
    ├── languages.ex
    ├── outputter
    │   ├── html.ex
    │   └── markdown.ex
    ├── outputter.ex
    ├── process_files.ex
    ├── toc.ex
    └── tree.ex
The module tree.ex provides the functionality to walk the tree. We pass in functions from the module process_files.ex to actually do the work.
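The idea of a walker that is handed the work to do as functions can be sketched like this. It is not the code in tree.ex: the module name Sketch.Tree, the walk/3 signature and the skip? predicate are assumptions made for the example.

```elixir
defmodule Sketch.Tree do
  # Walk `dir` depth-first, calling `file_fun` on every regular file and
  # collecting the results in a flat list. `skip?` is a predicate that lets
  # the caller exclude paths (files or whole directories) from the walk.
  def walk(dir, file_fun, skip? \\ fn _path -> false end) do
    dir
    |> File.ls!()
    |> Enum.sort()
    |> Enum.map(&Path.join(dir, &1))
    |> Enum.reject(skip?)
    |> Enum.flat_map(fn path ->
      if File.dir?(path) do
        walk(path, file_fun, skip?)
      else
        [file_fun.(path)]
      end
    end)
  end
end
```

Passing IO.puts/1 as file_fun would give a "list files" style of output that prints as it goes, while passing a function that reads and transforms each file would return the material needed for the other outputs.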
Outputs
Some outputs (like list files) are self-contained: the function prints out the files it finds as it goes.
Others require outputs to be created. Once the cli module has invoked the tree walker it gets back the appropriate output from the walk, and it then uses that to generate the outputs.
The module Outputter does most of the work and uses the Kernel.apply trick to pick up the outputters for markdown or html as appropriate.
To be a good citizen the escript can emit a Jekyll table of contents using toc.ex.
Examination of either the source code of this page or elixir_lang.ex shows that a special markup is used for Jekyll.
The problem is that Jekyll treats two opening curly braces with no space between them (like '{ {' but without the space) as its own syntax, and Elixir code has a lot of them, so a special processing route has to be found.
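One possible way to handle the clash, shown purely as an illustration and not necessarily what this script does, is to wrap any chunk containing Liquid-looking sequences in Liquid's raw tags. The module and function names below are hypothetical.

```elixir
defmodule Sketch.Jekyll do
  # Wrap a chunk in Liquid raw tags only when it contains a sequence that
  # Liquid would otherwise try to interpret. Caveat: a chunk that itself
  # contains the endraw tag would still break; this sketch ignores that case.
  def protect(text) do
    if String.contains?(text, ["{{", "{%"]) do
      "{% raw %}" <> text <> "{% endraw %}"
    else
      text
    end
  end
end
```

Nested tuples such as a tuple inside a tuple are a typical trigger in Elixir source, which is why the check looks for the doubled opening brace rather than any brace at all.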
Because the needs of "document me as a library" and "show me the architecture" clash, this script can skip some comments if the language processor supports it.
See the Elixir language module for details. Elixir attaches documentation to modules and functions at compile time using the @moduledoc and @doc attributes. (See Writing Elixir Documentation for details.) With levels we can filter out API-focussed documentation (ExDoc will ignore our inline documentation too).
Finding a style (and uses of levels) for your project and language will be up to you.
Contents
- index.elixir_architecture
- literate_compiler - args.ex
- literate_compiler - cli.ex
- literate_compiler - extensions.ex
- literate_compiler - languages.ex
- literate_compiler - outputter.ex
- literate_compiler - process_files.ex
- literate_compiler - toc.ex
- literate_compiler - tree.ex
- literate_compiler/languages - elixir_lang.ex
- literate_compiler/languages - erlang.ex
- literate_compiler/languages - markdown.ex
- literate_compiler/languages - ruby.ex
- literate_compiler/languages - supercollider.ex
- literate_compiler/outputter - html.ex
- literate_compiler/outputter - markdown.ex