(This is part of a series of posts on tools for thinking about programming — abstractions, patterns and smells, among other things — which I think make all kinds of programming tasks easier. See here for other posts like this one.)
The eventual value is an interesting abstraction for dealing with both
synchronous and asynchronous computations. The point of this tool is to
abstract away the time at which a value is evaluated.
Most of the concepts in this post are covered in the Wikipedia article on Futures and Promises, which is definitely worth reading in full. You’ll find a lot of interesting references there.
According to Wikipedia, the name eventual
was first used by Peter Hibbard.
Although I don’t know if the ideas here match those in his article, I chose to
use it because other common names for this pattern are used in several
different languages with slightly different meanings, bound to specific
implementations and strategies. Eventual is less well known, and therefore less
overloaded.
The eventual value
is most obviously useful in async scenarios, but allowing
the programmer to forget about the timing of things is a great way of
simplifying code in general.
First, let’s look at how this abstraction shows up in several forms in Clojure and JavaScript. Then, we’ll talk about how we could implement a simple version of this to refactor some Java code.
While we go through these hoops, I will point out important differences between these programming artifacts. Understanding the differences is key to grokking the unifying theme.
Clojure
Clojure has several interesting options to deal with concurrency (in fact,
that’s one of its selling points). Three of them are pretty closely related:
delay
, future
and promise
. All three eventually hold a single immutable
value, accessible by calling the deref
function or using the @
symbol.
They differ on how that value is evaluated.
Delays receive a code block to run, but do nothing until deref is
called. At that moment, the code block will be evaluated, and the result will
be the value of the delay. If a thread calls deref while another thread
is evaluating the delay, the calling thread will block until the other is
done, then receive the evaluated value. Notice that this means the code block
will only run once.
(def d (delay
         (Thread/sleep 1000)
         (* 6 7)))
(deref d) ; sleeps, then computes 6 * 7, then returns 42
(deref d) ; returns 42 immediately
Futures spawn a new thread, which will evaluate a code block. The result
of the block will be the value of the future
. If a thread calls deref
on
a future
before its code block has finished running, it will block until
the value is available.
(def f (future
         (Thread/sleep 1000)
         (* 6 7)))
(deref f) ; blocks until the future thread is finished, then returns 42
(deref f) ; returns 42 immediately
Promises are like a write-once variable. Calling the deliver
function
on a promise
sets its value, and any subsequent call to deliver
has no
effect. If a thread calls deref
on a promise
before it is delivered, it
will block until the value is available.
(def p (promise))
(.start (Thread. #(do
                    (Thread/sleep 1000)
                    (deliver p (* 6 7)))))
(deref p) ; blocks until the other thread delivers, then returns 42
(deref p) ; returns 42 immediately
The interesting thing here is that, even though these are three pretty
different evaluation strategies, the language gives us a single interface for
all of them: deref
. This means you can make an API that returns something
deref
able, while retaining the freedom to change whether the value is
pre-computed at startup, or by a background thread started at a convenient
moment, or only when it is first needed.
One important reminder here is that the deref
function in Clojure also works
on other entities besides the three mentioned above, for example atoms
and
agents
. These entities, however, can have their values changed after they
are set, so they don’t quite match the abstraction we are looking for (they
look more like references than values). Unfortunately, Clojure lacks a
specific interface for eventual values.
Javascript
Javascript code runs in a single thread. This forced the language to turn to an unusual approach for tasks that take too long (like IO): dispatch a task to some other process or thread on the runtime environment (which is not controlled by the JS virtual machine), and run a callback when that task completes. This approach leads to code like this:
navigator.geolocation.getCurrentPosition(location => {
alert(`lat: ${location.coords.latitude}, lon: ${location.coords.longitude}`)
})
The call to getCurrentPosition
returns undefined
and doesn’t block the JS
thread. The browser will asynchronously find out the location and, when
that’s done, notify the Javascript VM to call the callback function with the
location as an argument.
Another way to look at what’s going on here is to think that the
getCurrentPosition
call doesn’t return anything immediately, but eventually
returns the location via its callback. So another way this API could have been
written is to return (immediately) an object that will eventually hold the
position. In modern JS, these objects are called Promise
s.
Warning: do not confuse it with the Clojure promise. They are closely
related, but the JS concept is designed for a different execution model
(single-threaded), so it works a bit differently.
The Promise object represents the eventual completion (or failure) of an asynchronous operation, and its resulting value.
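As a rough illustration of that idea, here is a minimal sketch of a callback-style API wrapped so that it returns a Promise immediately. The names getPositionWithCallback and getPosition are hypothetical, and the coordinates are made-up sample data standing in for the browser's geolocation service. The .then method used at the end is described next.

```javascript
// Hypothetical callback-style API, standing in for getCurrentPosition.
// It "finds" a made-up position after 100 ms and passes it to the callback.
function getPositionWithCallback(callback) {
  setTimeout(() => {
    callback({ coords: { latitude: -23.55, longitude: -46.63 } });
  }, 100);
}

// Wrapper that returns right away with a Promise for the eventual position.
function getPosition() {
  return new Promise(resolve => getPositionWithCallback(resolve));
}

getPosition().then(location => {
  console.log(`lat: ${location.coords.latitude}, lon: ${location.coords.longitude}`);
});
```

The caller now receives an object it can pass around, store, or combine with others, instead of having to supply the callback at the call site.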
The JS Promise
exposes a .then
method, which expects a callback as its
parameter. Whenever the Promise
resolves (i.e. its value is delivered), any
callbacks registered by calling .then
will be called. After the Promise
is
resolved, any new call to .then
will end up with the callback being called
with that same value.
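The behavior described above can be sketched as follows. The 100 ms and 500 ms timers are arbitrary, chosen only so that one callback is registered before resolution and the other well after it.

```javascript
// A Promise that resolves after 100 ms (a stand-in for any async task).
const answer = new Promise(resolve => setTimeout(() => resolve(42), 100));

// Registered before resolution: called once the value is delivered.
answer.then(value => console.log(`early subscriber got ${value}`));

// Registered long after resolution: still receives the same value.
setTimeout(() => {
  answer.then(value => console.log(`late subscriber got ${value}`));
}, 500);
```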
One interesting feature of JS Promise
s is that each .then
call returns a
new Promise
. This derived Promise
will eventually resolve to the result of
the callback passed to the original .then
call. This allows for chaining
Promise
s, like in the snippet below:
doSomething()
.then(result => doSomethingElse(result))
.then(newResult => doThirdThing(newResult))
.then(finalResult => {
console.log(`Got the final result: ${finalResult}`);
})
Which looks very much like what we would do in Clojure with deref
:
(let [result (do-something)
      new-result (do-something-else (deref result))
      final-result (do-third-thing (deref new-result))]
  (println (str "Got the final result: " (deref final-result))))
The actual execution is different, because each step runs asynchronously in
the Javascript example: the Promise
implementation schedules each .then
callback to be run at a later time by the event loop of the only thread
available. In the Clojure example, the thread running the deref
calls will
be blocked until some thread delivers the promise
s, completes the future
s
or computes the delay
s (we need not know which one).
Promises in JS fit very nicely with the use cases that would require a
future or promise in Clojure. The delay use case, on the other hand, is
not supported by the standard JS implementation: the function passed to the
Promise constructor runs immediately, so one cannot use the constructor to
define a computation that runs if and only if the .then method is called
(see the MDN docs for details on the Promise constructor).
It’s not too difficult to build a simple implementation of that (check this
one, for example), but there’s no
current standard in the language.
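To give an idea of what such an implementation might look like, here is a minimal sketch (not the linked one, and not a standard API). LazyPromise is a hypothetical name; the class is a "thenable" that defers its computation until .then is first called and caches the resulting Promise afterwards.

```javascript
// Hypothetical helper: runs `compute` at most once, the first time
// `.then` is called, and reuses the resulting Promise afterwards.
class LazyPromise {
  constructor(compute) {
    this.compute = compute; // function returning a value or a Promise
    this.promise = null;    // created on first access
  }

  then(onFulfilled, onRejected) {
    if (this.promise === null) {
      // If `compute` throws, the Promise constructor turns it into a rejection.
      this.promise = new Promise(resolve => resolve(this.compute()));
    }
    return this.promise.then(onFulfilled, onRejected);
  }
}

let runs = 0;
const lazy = new LazyPromise(() => { runs += 1; return 6 * 7; });
// Nothing has been computed at this point: runs is still 0.
lazy.then(value => console.log(value)); // triggers the computation, logs 42
lazy.then(value => console.log(value)); // reuses the cached result, logs 42
```

Because the object has a then method, it can also be used with await, which is what makes it interchangeable with a regular Promise from the consumer's point of view.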
Extra: ReactiveX Observables
ReactiveX’s Observables
are an interesting abstraction to model streams of values. The linked page
mentions a distinction between hot and cold observables, which almost
matches the distinction between Clojure’s future/promise and delay.
A “hot” Observable may begin emitting items as soon as it is created, and so any observer who later subscribes to that Observable may start observing the sequence somewhere in the middle. A “cold” Observable, on the other hand, waits until an observer subscribes to it before it begins to emit items, and so such an observer is guaranteed to see the whole sequence from the beginning.
The important difference is that once a hot Observable has emitted a value,
it is lost to anyone that wasn’t subscribed at that moment. Once a Clojure
future
/promise
or a JS Promise
is resolved to a value, that value is
accessible forever (or until GC, of course).
A definition of sorts
So after looking at all these near-misses, here’s the abstraction we are looking for:
- It may have no value at first, but eventually resolves to some value
- Once resolved, the value never changes
- There’s some way to wait for or be notified of the value’s resolution
- It makes no guarantees about time of computation and resolution. In fact, it may even resolve synchronously at the time of access, or never resolve.
This is what I call an eventual value
.
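As a small sketch of what this buys the consuming code, the function below (report, a hypothetical name) uses await and never learns how or when its argument is computed. The three producers are illustrative assumptions: a precomputed value, a Promise resolved by a timer, and a bare-bones thenable that computes when awaited. Note that in JS await always resumes asynchronously, so this sketch does not cover the "resolve synchronously" case.

```javascript
// Consumer code: it only knows that `eventual` will resolve to some value.
async function report(eventual) {
  const value = await eventual; // works for plain values, Promises and thenables
  return `the answer is ${value}`;
}

// Three evaluation strategies behind the same interface:
const precomputed = 42; // already available
const background = new Promise(resolve => setTimeout(() => resolve(42), 100)); // computed elsewhere
const onDemand = { then(resolve) { resolve(6 * 7); } }; // computed when awaited

for (const eventual of [precomputed, background, onDemand]) {
  report(eventual).then(line => console.log(line)); // each logs "the answer is 42"
}
```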
Clojure deref comes close to this, because it is an interface that applies
to delays, futures and promises alike. On the other hand, it misses the
target by also applying to mutable entities like atoms and agents.
The JS Promise also isn’t an exact match, lacking support for the laziness
we find in Clojure’s delay.
ReactiveX cold Observable
s are a close match to Clojure’s delay
(if we
take a single value from them, instead of several values on a stream), but hot
Observable
s do not match the behavior we want, because the value is lost to a
late subscriber.
Again, the important part is not language support or the specific implementation. The important part is what we are abstracting away: time of execution. This means you can reach for this concept whenever you feel that some module shouldn’t have to decide, or even care about, the time when something is executed.
(Although of course language support would make everything easier.)
Aside: The power of not giving guarantees
Notice one of the characteristics described above is a negative: an eventual value makes no guarantees about time of computation and resolution.
When an abstraction gives a guarantee, it gives power to the code that uses it. The user code can rely on it. On the other hand, the guarantee reduces the space of possible implementations one could choose from.
For example, the JS Promise specification guarantees that then callbacks will
not be run synchronously, even if the result value is already available when
.then is called. This means you cannot choose an implementation that does that.
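This guarantee is easy to observe. In the sketch below, the Promise is already resolved when .then is called, yet the callback still runs only after the surrounding synchronous code has finished. The order array is just a device for recording what ran when.

```javascript
const order = [];
const resolved = Promise.resolve(42); // the value is already available

resolved.then(value => order.push(`then: ${value}`));
order.push('after then');

// By the time this timer fires, both entries have been recorded.
setTimeout(() => console.log(order), 0); // logs [ 'after then', 'then: 42' ]
```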
So, in theory, the less you assert about the behavior of your abstraction in your contract (interface and docs), the more you keep your ability to change the implementation later on. It’s a way of delaying decisions: only add something to your contract when you know that your user code needs that guarantee.
On the other hand, Hyrum’s Law:
With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.
Even though you may claim that “I never promised I’d resolve this asynchronously!“, a user of your API may still observe that you always resolve asynchronously, rely on that fact, and be mad when you break their software by changing that implementation “detail”.
The way I think about this is that the no-guarantees approach works best for internal APIs, where you have some control over how people are using the abstraction (so you can call them out on code review if they rely on implementation details).
It also works better when the aspects that you are leaving “undecided” are not extremely important (in the context they are being used). For example, in a web service the question of whether a computation is running on the same thread or on a different one is typically not very important. On the other hand, that’s a crucial matter when dealing with UI applications, because if you hang the main thread for too long, the UI will become unresponsive (and that’s why the JS Promise spec guarantees asynchronicity).
This means that you can only take the easy way out of “not giving guarantees” on aspects that don’t matter too much for the code that uses your API. As I said, it’s a way of delaying the decision of committing to certain aspects of an implementation, but there are things you cannot delay.
Why write all of this, then? Well, even while agreeing that Hyrum’s Law applies to many cases, I still think it is useful to start from a no-or-very-few guarantees approach when thinking about APIs and abstractions. I then add more items to the contract as I investigate the intended and possible use cases.
Exercise
Now, on to a small exercise. Take a look at this Java class:
class MoviesLibrary {
    private Map<UUID, Movie> map;
    private static Map<UUID, Movie> loadMapFromFile() { /* ... */ }
    public void load() {
        this.map = loadMapFromFile();
    }
    public boolean isLoaded() {
        return this.map != null;
    }
    public Movie getMovie(UUID id) {
        return this.map.get(id);
    }
}
There are a couple problems here:
- the getMovie method will crash with a NullPointerException if called before the load method; and
- if the load method is called several times, the MoviesLibrary content may change unexpectedly (because of changes to the underlying file). So really the client code has to call isLoaded() to decide whether it should call load().
Can you use the ideas above to get rid of these problems? Check my answer below:
Both problems go away if we make the load
method idempotent and make the
getMovie
method wait until load
has run at least once. This is pretty easy
to implement:
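class MoviesLibrary {
    // volatile: getMovie reads this field without holding the lock, so the
    // fully built map must be visible to other threads once it is assigned
    private volatile Map<UUID, Movie> map;
    private static Map<UUID, Movie> loadMapFromFile() { /* ... */ }
    public synchronized void load() {
        if (this.map == null) { // synchronized null-check
            this.map = loadMapFromFile();
        }
    }
    public Movie getMovie(UUID id) {
        if (this.map == null) { // unsynchronized fast path
            this.load();
        }
        return this.map.get(id);
    }
}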
The synchronized
keyword on load
guarantees that the null
-check inside
that method will only pass on the first call; all other calls will be blocked
until loadMapFromFile()
has finished and this.map
is assigned.
Now getMovie
can simply call load()
every time, and if necessary it will
block until the loading is done. The null
-check inside getMovie
avoids the
performance costs of calling a synchronized method if we know it’s not needed,
but that’s an optimization, not strictly necessary to get us the simpler API.
We can drop the isLoaded()
method because client code can just call load()
without worries.
Notice how this approach gives us the same behavior as Clojure’s delay
. We
can pre-load the data if we need to by calling load()
, or we can be totally
lazy and just wait until the first getMovie
. In fact, the current
implementation
for delay
looks much like the above example (except for the idiosyncratic
indentation).
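For comparison, here is a rough sketch of the same refactoring in a single-threaded setting such as JavaScript, where no lock is needed: caching the Promise returned by the loader is enough to make load idempotent. The loadMapFromFile function here is a hypothetical stand-in that returns sample data, and loadCount exists only to make the behavior observable.

```javascript
let loadCount = 0; // only here to show how many times the loader ran

// Hypothetical stand-in for reading the movies file.
async function loadMapFromFile() {
  loadCount += 1;
  return new Map([['id-1', { title: 'Alien' }]]);
}

class MoviesLibrary {
  #mapPromise = null; // eventual value of the movies map

  // Idempotent: the loader is started at most once, on the first call.
  load() {
    if (this.#mapPromise === null) {
      this.#mapPromise = loadMapFromFile();
    }
    return this.#mapPromise;
  }

  // Waits for the load if needed, and triggers it if nobody has yet.
  async getMovie(id) {
    const map = await this.load();
    return map.get(id);
  }
}

const library = new MoviesLibrary();
library.getMovie('id-1').then(movie => console.log(movie.title)); // logs "Alien"
library.load(); // does not reload: the first getMovie already started the load
```

As in the Java version, callers may pre-load by calling load() or simply call getMovie and let the first access pay the cost.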
Summary
Eventual values
abstract the time of evaluation of a value from the code
that uses that value. Turn to this tool whenever you feel that client code
shouldn’t worry about what your evaluation strategy is.
If you have language support, use it, even if the match isn’t perfect: in
Clojure, for example, an API can just return a deref
able entity, documenting
that it’s guaranteed to be immutable, but never say whether it’s a future
,
promise
or delay
(or a custom deref
able that you wrote).
If you don’t have language support, or what you have doesn’t match your needs,
just think about how you can make your API more eventual
-like, the way we
did in the exercise above.