State management in JavaScript
Managing the state of a browser-based JavaScript application is tough, and often it seems unnecessarily so.
My goal today is to outline a few simple principles that avoid the most common pitfalls that make state management unnecessarily difficult.
The principles
If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.
Edsger W. Dijkstra
- Represent all state data uniquely as variables/properties
- Represent all derived data in your application as functions/methods
- Explicitly define dependencies in your data and avoid circular dependencies
- When state data changes, recalculate derived data in dependency order (topological evaluation)
- Never change state while deriving, never derive while changing state
- Prefer abstractions for the native DOM over templates and DSLs
- Prefer state that is as localised in scope as possible.
Hopefully you will find (as I have) that these principles:
- Are effective across a broad range of frameworks (even jQuery)
- Follow solid principles researched and documented many years ago
- Will continue to be relevant year after year
- Are not tied to buzzwords or jargon from a specific tool
- Are applicable to many different programming paradigms and styles.
That’s it. 🚢
State vs. derived data
Let’s say we have a circle.
Circles are commonly represented (e.g. in SVG) with three values: x, y, and r (radius). These three values can represent the state of the circle; change any one of them and you have a new circle.
Every other property or way of representing that same circle can be derived from these three values, e.g.:
- The polar co-ordinates of the circle
- The circumference of the circle
- The area of the circle.
It’s meaningless to change a derived value for any given state. For example, a “circle” with a circumference other than 2 * PI * Radius is by definition no longer a circle.
Note that the polar co-ordinates could also have been selected as “state” for the position of our circle and then our x/y values could have been “derived” from this. Similarly, the circumference, radius and area of a circle are all interchangeable to describe the size of a circle, so the designation “state” and “derived” for these properties is a bit arbitrary.
Also note that many derivations will be “lossy”, such as extracting the title from an article of content. There’s simply no way to rebuild an article from its title alone. In this case it is sensible to use the full article as “state” and the title extraction process as a derived value only.
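The split between state and derived data can be sketched in a few lines. This is a minimal illustration rather than a prescribed design: the sample values and the names `circumference`, `area` and `polarPosition` are assumptions made for the example.

```javascript
// State: the three values that define this circle (sample values).
const circle = { x: 3, y: 4, r: 10 };

// Derived data: pure functions of the state, never stored alongside it.
const circumference = (c) => 2 * Math.PI * c.r;
const area = (c) => Math.PI * c.r * c.r;

// Polar coordinates of the circle's centre, relative to the origin.
const polarPosition = (c) => ({
  distance: Math.hypot(c.x, c.y),
  angle: Math.atan2(c.y, c.x),
});
```

Because nothing derived is ever stored, there is no way to end up with a “circle” whose circumference disagrees with its radius.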
Unique state
The position of a circle can be represented as either polar or cartesian coordinate state, but never both.
Duplicated state usually makes something more complicated.
This code returns the position of a hypothetical circle object in both coordinate systems:
const myCircle = {
  r: 100,
  theta: Math.PI / 2
};

function cartesian (c) {
  return [c.r * Math.cos(c.theta), c.r * Math.sin(c.theta)];
}
In this case polar coordinates are designated as state (properties of the myCircle object) and cartesian values are derived (using math).
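There are only two possible, equally simple strategies here: derive the polar coordinates from the cartesian ones, or derive the cartesian coordinates from the polar ones.
Both representations come from the same unique state (the polar values are read directly, the cartesian values are computed from them), so it should never be possible for tests to pass individually but fail when cross-referenced.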
This code also returns position data for a similar circle object:
function cartesian (c) {
  return [c.x, c.y];
}

function polar (c) {
  return [c.r, c.theta];
}
In this case, though, both polar and cartesian coordinates are being treated as state (expected as properties of the circle object).
The code might appear simpler at first, but the complexity has simply been moved somewhere else and probably amplified in the process.
Reading this code doesn’t show how or even if these four state properties are always internally consistent. It’s entirely possible that tests for cartesian and polar could pass independently but fail when cross-referenced using a circle object with inconsistent state.
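To make the failure mode concrete, here is a small sketch of such a cross-reference check. The `drifted` object and the `isConsistent` helper are hypothetical, added only for illustration.

```javascript
function cartesian(c) {
  return [c.x, c.y];
}

function polar(c) {
  return [c.r, c.theta];
}

// Hypothetical circle whose duplicated state has drifted apart:
// theta = 0 implies the point (100, 0), but x/y say (0, 100).
const drifted = { x: 0, y: 100, r: 100, theta: 0 };

// Cross-reference the two representations against each other.
function isConsistent(c, tolerance = 1e-9) {
  const [x, y] = cartesian(c);
  const [r, theta] = polar(c);
  return (
    Math.abs(x - r * Math.cos(theta)) < tolerance &&
    Math.abs(y - r * Math.sin(theta)) < tolerance
  );
}
```

Both `cartesian(drifted)` and `polar(drifted)` return exactly what they were given, so tests of either function alone would pass, yet `isConsistent(drifted)` is false. With unique state, a check like this would not be needed at all.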
Be wary of the potential for duplicated state hidden in:
- Cached/memoized functions that are not referentially transparent
- Getter/setter functions and “private” properties on objects
- Anything that reads and writes to the DOM
- Leaky abstractions over other state.
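As an illustration of the first item in that list, the sketch below shows a cache acting as a hidden second copy of the state. The `memoizeByIdentity` helper is hypothetical and deliberately naive: it keys the cache on object identity, so it cannot notice when the object is mutated.

```javascript
// Naive memoizer (hypothetical): caches one result per object identity.
function memoizeByIdentity(fn) {
  const cache = new WeakMap();
  return (obj) => {
    if (!cache.has(obj)) cache.set(obj, fn(obj));
    return cache.get(obj);
  };
}

const area = (c) => Math.PI * c.r * c.r;
const cachedArea = memoizeByIdentity(area);

const circle = { r: 1 };
const before = cachedArea(circle); // computed from r = 1

circle.r = 2; // the state changes in place...
const stale = cachedArea(circle); // ...but the cache still answers for r = 1
```

Here `stale` no longer agrees with `area(circle)`: the cached value has quietly become duplicated state that must be kept in sync by hand.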
Updating the DOM
Updating the DOM becomes important when we need to accommodate interactivity for users, which implies asynchronous logic.
Imagine that we have a perfect initial setup between our state and derived values. For any valid initial state we can derive a set of values and then build a DOM to visually represent those values relatively easily with “raw” JavaScript.
Now, imagine something in our state changes and we need to respond to that. Ideally we could model that by simply setting the new state, re-deriving everything and telling the browser what the new DOM should look like. If the DOM was a REST API (if only!) this would be a simple PUT request and we would be done.
Unfortunately, the DOM only offers the equivalent of PATCH endpoints. This means that we must track what needs patching ourselves, to simulate a PUT request.
Each time we update our state and derived values, we must therefore calculate the difference between the “before” and “after” values. This calculated difference, not simply the “after” values, is what we must pass to the DOM API.
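A hand-rolled version of that bookkeeping might look like the sketch below. It assumes the derived values are a flat object of attribute names and values, and the helper names `diff` and `applyPatch` are hypothetical; removed keys are ignored to keep the example short.

```javascript
// Return only the entries of `after` whose values differ from `before`.
function diff(before, after) {
  const patch = {};
  for (const key of Object.keys(after)) {
    if (!Object.is(before[key], after[key])) patch[key] = after[key];
  }
  return patch;
}

// Send just the changed attributes to the DOM element.
function applyPatch(element, patch) {
  for (const [name, value] of Object.entries(patch)) {
    element.setAttribute(name, String(value));
  }
}

const before = { cx: 10, cy: 20, r: 5 };
const after = { cx: 10, cy: 35, r: 5 };
const patch = diff(before, after); // only `cy` differs
```

Even in this tiny case the caller has to keep the previous values around and remember to call both functions in the right order for every element.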
Reliably patching the DOM by hand is infeasible for any large codebase.
To fix the issue we want a tool to handle some key tasks for us automatically:
- Track which value derivation functions correspond to each state change
- Re-run the derivation functions and track the changes in any return values
- Pass changed values to the appropriate DOM API update methods
- Avoid prematurely updating the DOM before our calculations are finished.
Luckily there are many tools available that do exactly this. The main difference is usually the method of feeding dependency information into the tool. There seem to be two popular styles at the moment:
- DAGs directly chaining values together, with DOM elements at the ends
- A Virtual DOM simulating the DOM, but optimised to calculate diffs.
Use whichever style seems most intuitive to you — I personally prefer the DAG approach as there is less “magic” involved.
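To give a feel for the DAG style, here is a deliberately small sketch that is not modelled on any particular library; `createGraph` and its methods are hypothetical names. It assumes every derived value declares its dependencies up front and may only depend on nodes that already exist, which rules out cycles and makes definition order a valid topological order.

```javascript
function createGraph() {
  const nodes = new Map(); // name -> { deps, compute, value }
  const order = []; // definition order, which is also a topological order

  function register(name, node) {
    if (nodes.has(name)) throw new Error(`Duplicate node: ${name}`);
    nodes.set(name, node);
    order.push(name);
  }

  const valuesOf = (deps) => deps.map((dep) => nodes.get(dep).value);

  return {
    state(name, value) {
      register(name, { deps: [], compute: null, value });
    },
    derived(name, deps, compute) {
      // Dependencies must already exist, so a cycle cannot be declared.
      for (const dep of deps) {
        if (!nodes.has(dep)) throw new Error(`Unknown dependency: ${dep}`);
      }
      register(name, { deps, compute, value: compute(...valuesOf(deps)) });
    },
    get(name) {
      return nodes.get(name).value;
    },
    // Set a state value, re-derive dependents in order, return changed names.
    set(name, value) {
      const target = nodes.get(name);
      if (!target || target.compute) throw new Error(`Not a state node: ${name}`);
      const changed = new Set();
      if (Object.is(target.value, value)) return changed;
      target.value = value;
      changed.add(name);
      for (const key of order) {
        const node = nodes.get(key);
        if (!node.compute || !node.deps.some((dep) => changed.has(dep))) continue;
        const next = node.compute(...valuesOf(node.deps));
        if (!Object.is(next, node.value)) {
          node.value = next;
          changed.add(key);
        }
      }
      return changed;
    },
  };
}

const graph = createGraph();
graph.state('r', 1);
graph.derived('diameter', ['r'], (r) => 2 * r);
graph.derived('circumference', ['diameter'], (d) => Math.PI * d);
const changed = graph.set('r', 2);
```

The set returned by `set` names every value that actually changed, which is the information a DOM update step would need. Real tools add batching, clean-up and much more, so treat this only as a model of the idea.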
Don’t template, abstract!
Templates remove access to tools and obscure their nature.
Abstractions combine and extend tools to make them more useful.
HTML templating is popular and sensible in server-side languages that represent HTML documents as strings. The templating system can fix/warn about invalid markup and provide a defence against XSS security issues.
JavaScript in the browser has direct access to the real DOM so the issues caused by the HTML-as-strings paradigm do not apply. The process of converting templated strings into new DOM elements is hidden away and could easily include subtle bugs or security issues.
Write functions to accept, compose and return native DOM elements.
Native DOM elements provide the ultimate interoperability between your code, your upstream libraries and the browser.
The majority of JavaScript libraries everywhere will happily accept raw DOM elements, regardless of how they are generated. Very few libraries will directly accept unprocessed template code.
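As a rough sketch of what such functions can look like, the helpers below wrap `document.createElement` and return real elements. The names `el`, `div` and `sidebar` are hypothetical, and the property handling is intentionally minimal (simple properties such as `className` or `id` only).

```javascript
// Create an element, assign simple properties, and append children.
function el(tag, props = {}, ...children) {
  const element = document.createElement(tag);
  Object.assign(element, props); // e.g. { className: 'sidebar' }
  element.append(...children); // accepts DOM nodes and plain strings
  return element;
}

// Single-element functions compose like any other functions.
const div = (props, ...children) => el('div', props, ...children);
const sidebar = (...children) => div({ className: 'sidebar' }, ...children);
```

In a browser, `document.body.append(sidebar('Hello'))` would add the element to the page, and because the return value is a native element it can be handed to any library that accepts DOM nodes.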
If you feel that you absolutely must use a templating system, use something that maps 1:1 with DOM element creation, like JSX.
OK template example (using JSX):
<div className="sidebar" />
Compiles to:
React.createElement(
  'div',
  {className: 'sidebar'},
  null
)
Personally I would prefer a simple React.div() method, rather than invent and learn an entire templating system. Many other frameworks adopt the single-element-function approach, with no templates required!
Prefer local scope over global scope
A bug can impact anything and everything in scope.
Pass huge global state objects to functions and expect huge global bugs.
Pass tiny derived values to functions and expect tiny localised bugs.
Pass global distributed state around and expect to lose $280M.
The scope of data you allow your functions to access is the scope of your bugs.
Avoid the temptation to reference external state from inside functions — force yourself to work with function arguments. Even if it feels like there are “too many” arguments (although, feel free to refactor in this case). Even when writing methods on objects.
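The difference is easy to see side by side. In this sketch the `appState` object and both function names are hypothetical, chosen only to contrast the two styles.

```javascript
// Hypothetical application state, for illustration only.
const appState = {
  user: { name: 'Ada' },
  circle: { x: 0, y: 0, r: 10 },
  settings: { units: 'px' },
};

// Wide scope: reaches into external state, so a bug here (or anywhere
// else that touches appState) can affect everything.
function describeRadiusFromGlobal() {
  return `r = ${appState.circle.r}${appState.settings.units}`;
}

// Narrow scope: can only see, and only get wrong, its two arguments.
function describeRadius(r, units) {
  return `r = ${r}${units}`;
}

const label = describeRadius(appState.circle.r, appState.settings.units);
```

Both functions return the same string today, but only the second can be tested with two plain values and trusted not to depend on anything else.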