Issue 45: trees

Every 5th issue of this email I do a small dive into a specific topic somewhat related to Elm. You can read all these issues under the special tag. As I was thinking which area to cover this time, I wanted to return to testing again due to selfish motives. I was so blown away by the property-based testing, and it's such an interesting area that I wanted to learn more about. It took me almost two weeks to finally come in terms with myself and choose a different one. Meet trees.

Trees? Yes, trees 🌲, the tree data structure. There is nothing special about them, and it's not the biggest requirement for a frontend language. Yet, as I was browsing package list, I kept seeing implementations of different types of trees: rose tree, red black tree, quad tree, and others. And I got curious, why would anyone need this (just because I don't doesn't mean others don't), and what are these tree implementations useful for. So let's explore some of these packages and learn about each type of tree.

Trees are amazing data structure that are used everywhere: high-frequency trading (look for wavelet trees), filesystems, databases, browsers, and more. Trees allow us to store the data in some hierarchical form, be it for easier search, retrieval, or compression. If you look at the HTML and the way it's organized, it resembles a tree structure (and in fact it is). DOM is a tree. And no doubt you have heard of the virtual DOM (Elm, JS, React, Vue) which use tree-like data structures.

While knowing algorithms for (or implementing) virtual DOM might not be your day to day task, writing parsers is something that is slightly more common. Parsers build what is known as abstract syntax tree (AST) which is a tree representation of the parsed text, e.g. programming language. James Carlson wrote an excellent article about an implementing a new programming language parser in Elm. He also wrote about parsing Markdown. Dillon and Jeroen from Elm Radio Podcast did two episodes on parsing too: elm/parser and Parse, Don't Validate.

Now let's look at some of the tree packages on Elm package list and see what they do and where they can be used. The first one to catch my eyes (and somewhat familiar) was evelios/elm-geometry-quadtree which is an implementation of the Quadtree data structure. The first tech startup that I have built was called Flykk. It was the first platform for building location-based games, something similar to what Niantic Labs have initially built with Ingress. You would have a map with your location at the center, and go around the area and collect various items, fight with other game characters, and perform quests. Placing these items of the map was relatively easy, but loading these efficiently (especially when you have hundreds or thousands of objects with meta-data and images) in 2012 on a limited phone hardware was tricky. And so we made heavy use of quadtrees to take the visible area of the map and find the collision point with items that the user should see. Collision detection is probably the most common (at least that I know) use-case for this data structure. Take a look at elm-collision-detection (demo):

The elm-visualization package also relies on quadtrees for optimized performance.

Next tree type that caught my attention was the rose tree (which is also called multiway tree). You can find a fair bunch of libraries implementing this data structure: elm-rosetree, tree-with-zipper, elm-typed-tree, tea-tree, canopy. I have never heard of these trees, and having that many different implementations caused my interest and some research to learn about it. Funny enough, links to content from James Carlson (as is from Dillon Kearns and Jeroen Engels) appear almost in every Elm Bits newsletter. If you're want to learn about a particular topic, chances are that one of these three has already covered it, which is the case with multiway tree. For an extensive introduction and real-world examples you can read James' excellent On Outlines, Rose Trees, and Zippers article (and this short answer that makes a much better job at explaining rose trees than Wikipedia). In short, rose trees are generic tree-like data structures where each node can have a variable number of branches. Such trees are used in the construction of ASTs while parsing text, e.g. Markdown as in the example from James. Having your source structured as a tree, it becomes easy to navigate it with zipper.

Updated on 26th of November: right after this post went live, Carl Fredrik Herö made a post on Discourse explaining how he used rose trees to store the category tree for the site. It is very easy to grab the subtree on any level and render it.

Quadtree nodes have exactly four children. Rose tree nodes can have an arbitary number of nodes. And binary trees, as the name suggests, can have only up to two child nodes. There is an implementation of the red black tree, which is a self-balancing binary tree. When you insert a value into this structure, it re-balances itself so that the height of the tree is kept at low value. If you look for the RB-tree application, most of the use-cases come from some low-level programming: schedulers, and Map implementations. Tim DuBois, author of the Elm package, mentions that his library is more for the research purpose, but I was still curious if there is a use-case for these in building web interfaces. And, well, I couldn't really. If you have any examples, please do let me know.

There is another type of self-balancing binary tree, AVL, which also has an Elm package that implements Dict and Set structures. Possibly, the only real use-case for these algorithms is to implement other data structures? My sentiment might feel like disappointment, which it's not, but I also wanted to see if there is some wider use for these trees.

Take a look at conflict-free replicated data types. If you're not into distributed systems, this may sound complicated. CFRDTs are used to perform data updates without expensive synchronization. Macario Ortega, the author of the crdt-replicated-tree package, gives a simple use-case (with reference implementation): collaborative text editor. Personally, this is the best find for me because we might eventually add collaborative roadmap editing at Shipit, and while this might not be the best fit four our purposes, it gives some food for thought and consideration.

While this is not an exhaustive list of Elm packages that implement the tree data structure one way or another, hopefully this sparked your interest and might have given you some pointers how to solve your own code problems.

And as usual, I'll finish this one with a quote:

Although greed is considered one of the seven deadly sins, it turns out that greedy algorithms often perform quite well.

by Stuart Russell from "Artificial Intelligence: A Modern Approach".

Show Comments