Idempotent Applicatives, Parametricity, and a Puzzle

April 1, 2019

Some applicative functors are idempotent, in the sense that repeating an effect is the same as having it just once. An example and a counterexample are `Maybe` and `IO`, respectively (contrast `Just 3 *> Just 3` with `print 3 *> print 3`). More precisely, idempotency means that `f <\$> u <*> u = (\x -> f x x) <\$> u`. Given the informal description I began with, though, one might wonder whether the simpler property `u *> u = u`, which seems to capture the intuition about repeated effects, is equivalent to the usual idempotency property. In this post, I will tell how I went about exploring this conjecture, as well as a few things I learnt about parametricity along the way.

Before I begin, a few remarks about this notion of idempotency. The earliest mention of it that I know of is in Combining Monads, a paper by King and Wadler 1. There, idempotent monads are presented alongside the most widely known concept of commutative monads (`f <\$> u <*> v = flip f <\$> v <*> u`). Both properties generalise straightforwardly to applicative functors, which has the neat side-effect of allowing myself to skirt the ambiguity of the phrase “idempotent monad” (in category theory, that usually means a monad that, in Haskell parlance, has a `join` that is an isomorphism – a meaning that mostly doesn’t show up in Haskell). Lastly, I knew of the conjecture about idempotency amounting to `u *> u = u` through a Stack Overflow comment by David Feuer, and so I thank him for inspiring this post.

Prolegomena

Given we are looking into a general claim about applicatives, our first port of call are the applicative laws. Since the laws written in terms of `(<*>)` can be rather clunky to wield, I will switch to the monoidal presentation of `Applicative`:

``````-- fzip and unit correspond to (<*>) and pure, respectively.
fzip :: Applicative f => (f a, f b) -> f (a, b)
fzip (u, v) = (,) <\$> u <*> v

unit :: Applicative f => f ()
unit = pure ()``````

Note I am using an uncurried version of `fzip`, as I feel it makes what follows slightly easier to explain. I will also introduce a couple teensy little combinators so that the required tuple shuffling becomes easier on the eye:

``````app :: (a -> b, a) -> b
app (f, x) = f x

dup :: a -> (a, a)
dup x = (x, x)

{-
I will also use the Bifunctor methods for pairs, which amount to:

bimap f g (x, y) = (f x, g y)
first f = bimap f id
second g = bimap id g
-}
--``````

The converse definitions of `pure` and `(<*>)` in terms of `unit` and `fzip` would be:

``````u <*> v = app <\$> fzip (u, v)
pure x = const x <\$> unit``````

Using that vocabulary, the applicative laws become:

``````-- "~" here means "the same up to a relevant isomorphism"
fzip (u, unit) ~ u -- up to pairing with ()
fzip (unit, u) ~ u -- up to pairing with ()
fzip (fzip (u, v), w) ~ fzip (u, fzip (v, w)) -- up to reassociating pairs``````

As for the idempotency property, it can be expressed as:

``fzip (u, u) = dup <\$> u -- fzip . dup = fmap dup``

`(*>)` and its sibling `(<*)` become:

``````u <* v = fst <\$> fzip (u, v)
u *> v = snd <\$> fzip (u, v)``````

(Proofs of the claims just above can be found at the appendix at the end of this post.)

Finally, the conjecture amounts to:

``````snd <\$> fzip (u, u) = u -- u *> u = u
-- Is equivalent to...
fzip (u, u) = dup <\$> u -- idempotency``````

That `fzip (u, u) = dup <\$> u` implies `snd <\$> fzip (u, u)` is immediate, as `snd . dup = id`. Our goal, then, is getting `fzip (u, u) = dup <\$> u` out of `snd <\$> fzip (u, u) = u`.

Drawing relations

How might we get from `snd <\$> fzip (u, u) = u` to `fzip (u, u) = dup <\$> u`? It appears we have to take a fact about the second components of the pairs in `fzip (u, u)` (note that mapping `snd` discards the first components) and squeeze something about the first components out of it (namely, that they are equal to the second components everywhere). At first glance, it doesn’t appear the applicative laws connect the components in any obviously exploitable way. The one glimmer of hope lies in how, in the associativity law…

``fzip (fzip (u, v), w) ~ fzip (u, fzip (v, w)) -- up to reassociating pairs``

… whatever values originally belonging to `v` must show up as second components of pairs on the left hand side, and as first components on the right hand side. While that, on its own, is too vague to be actionable, there is a seemingly innocuous observation we can make use of: `snd <\$> fzip (u, u) = u` tells us we can turn `fzip (u, u)` into `u` using `fmap` (informally, we can say that they have the same shape), and that they can be related using `snd` while making use of a free theorem 2. For our current purposes, that means we can borrow the types involved in the left side of the associativity law and use them to draw the following diagram…

``````                  fzip
(F (a, b), F c) --------> F ((a, b), c)
|      |                  |     |
|      |                  |     |
snd|      |id             snd|     |id
|      |                  |     |
v      v                  v     v
(F   b, F   c ) --------> F (  b,    c)
fzip``````

… such that we get the same result by following either path from the top left corner to the bottom right one. Omitting the occurrences of `id`, we can state that through this equation:

``fmap (first snd) . fzip = fzip . first (fmap snd)``

In words, it doesn’t matter whether we use `snd` after or before using `fzip`. `fmap` and `first`, left implicit in the diagram, are used to lift `snd` across the applicative layer and the pairs, respectively. This is just one specific instance of the free theorem; instead of `snd` and `id`, we could have any functions – or, more generally, any relations 3– between the involved types. Free theorems tell us about relations being preserved; in this case, `snd` sets up a relation on the left side of the diagram, and `fzip` preserves it.

We can get back to our problem by slipping in suitable concrete values in the equation. For an arbitrary `u :: F A`, we have…

``first snd <\$> fzip (fzip (u, u), u) = fzip (snd <\$> fzip (u, u), u)``

… and, thanks to our `snd <\$> fzip (u, u) = u` premise:

``first snd <\$> fzip (fzip (u, u), u) = fzip (u, u)``

Now, why should we restrict ourselves to the left side of the associativity law? We can get a very similar diagram to work with from the right side:

``````                  fzip
(F a, F (b, c)) --------> F (a, (b, c))
|      |                  |    |
|      |                  |    |
id|      |snd             id|    |snd
|      |                  |    |
v      v                  v    v
(F a, F   c   ) --------> F (a,   c   )
fzip``````

Or, as an equation:

``fmap (second snd) . fzip = fzip . second (fmap snd)``

Proceeding just like before, we get:

``````second snd <\$> fzip (u, fzip (u, u)) = fzip (u, snd <\$> fzip (u, u))
second snd <\$> fzip (u, fzip (u, u)) = fzip (u, u)``````

Since `fzip (u, fzip (u, u)) ~ fzip (fzip (u, u), u)` (associativity), we can shuffle that into:

``````-- Both first fst and second snd get rid of the value in the middle.
first fst <\$> fzip (fzip (u, u), u) = fzip (u, u)``````

The equations we squeezed out of the diagrams…

``````first snd <\$> fzip (fzip (u, u), u) = fzip (u, u)
first fst <\$> fzip (fzip (u, u), u) = fzip (u, u)``````

… can be combined into:

``fzip (fzip (u, u), u) = first dup <\$> fzip (u, u)``

This kind of looks like idempotency, except for the extra occurrence of `u` tagging along for the ride. We might have a go at getting rid of it by sketching a diagram of a slightly different nature, which shows how the relations play out across the specific values that appear in the equation above:

``````                                        fzip
(u, u) :: (F   a   , F a) -----------> F (  a   , a)
|       |                    |     |
|       |                    |     |
R|      S|               {dup}| {id}|
|       |                    |     |
|       |      fzip          |     |
(fzip (u, u), u) :: (F (a, a), F a) -----------> F ((a, a), a)``````

`dup` can be used to relate `fzip (u, u)` and `fzip (fzip (u, u), u)` on the right of the diagram. That this diagram involves specific values leads to a subtle yet crucial difference from the previous ones: the relation on the right side is not necessarily the function `dup`, but some relation that happens to agree with `dup` for the specific values we happen to be using here (that is what I have attempted to suggest by adding the curly brackets as ad hoc notation and dropping the arrow tips from the vertical connectors). This is important because, given how `fzip` preserves relations, we might be tempted to work backwards and identify `R` on the left side with `dup`, giving us a proof – `dup <\$> u = fzip (u, u)` would be an immediate consequence. We can’t do that, though, as `R` only must agree with `dup` for those values which show up in a relevant way on the right side. More explicitly, consider some element `x :: a` of `u`. If `x` shows up as a first component anywhere in `fzip (u, u)`, then the corresponding element of `fzip (u, u)` must have its first and second components equal to each other (because `dup` agrees with `R` on `x`, and `R` in turn relates `u` and `fzip (u, u)`), and to `x` (since `snd <\$> fzip (u, u) = u`). If that held for all elements of `u`, we would have `fzip (u, u) = dup <\$> u`. However if `x` doesn’t show up as a first component in `fzip (u, u)`, there are no guarantees (as the right side offers no evidence on what `x` is related to through `R`), and so we don’t have grounds for the ultimate claim.

Close, but no cigar.

Something twisted

While those parametricity tricks gave us no proof, we did learn something interesting: the conjecture holds as long as all elements from `u` show up as first components in `fzip (u, u)`. That sounds like a decent lead for a counterexample, so let’s switch course and look for one instead. To begin with, here is an inoffensive length-two vector/homogeneous pair type:

``````{-# LANGUAGE DeriveFunctor #-}

data Good a = Good a a
deriving (Eq, Show, Ord, Functor)``````

Here is its `Applicative` instance, specified in terms of the monoidal presentation:

``````unit = Good () ()

fzip (Good x1 x2, Good y1 y2) = Good (x1, y1) (x2, y2)``````

`Good`, like any other applicative with a single shape, is idempotent – as there is just one shape, it can’t help but be preserved 4. That means we need a second constructor:

``````data Twisted a = Evil a a | Good a a
deriving (Eq, Show, Ord, Functor)``````

`unit` can remain the same…

``unit = Good () ()``

… which means the `Good`-and-`Good` case must remain the same: the identity effect has to be idempotent 5:

``````fzip (Good x1 x2, Good y1 y2) = Good (x1, y1) (x2, y2)
fzip (Evil x1 x2, Evil y1 y2) = _
fzip (Evil x1 x2, Good y1 y2) = _
fzip (Good x1 x2, Evil y1 y2) = _``````

The twist comes in the `Evil`-and-`Evil` case: we repeat our pick of a first element of the vector, and thus discard one of the first elements. (We can’t do the same with the second element, as we want `snd <\$> fzip (u, u) = u` to hold.)

``````fzip (Good x1 x2, Good y1 y2) = Good (x1, y1) (x2, y2)
fzip (Evil x1 x2, Evil y1 y2) = Evil (x1, y1) (x1, y2)
fzip (Evil x1 x2, Good y1 y2) = _
fzip (Good x1 x2, Evil y1 y2) = _``````

The `Evil`-and-`Good` case is determined by the right identity law…

``````fzip (Good x1 x2, Good y1 y2) = Good (x1, y1) (x2, y2)
fzip (Evil x1 x2, Evil y1 y2) = Evil (x1, y2) (x2, y2)
fzip (Evil x1 x2, Good y1 y2) = Evil (x1, y1) (x2, y2)
fzip (Good x1 x2, Evil y1 y2) = _``````

… while associativity forces our hand in the `Good`-and-`Evil` case (consider what would happen in a `Good`-`Evil`-`Evil` chain 6):

``````fzip (Good x1 x2, Good y1 y2) = Good (x1, y1) (x2, y2)
fzip (Evil x1 x2, Evil y1 y2) = Evil (x1, y1) (x1, y2)
fzip (Evil x1 x2, Good y1 y2) = Evil (x1, y1) (x2, y2)
fzip (Good x1 x2, Evil y1 y2) = Evil (x1, y1) (x1, y2)``````

`Evil` spreads, leaving a trail of repeated picks of first elements to the left of its rightmost occurrence in an applicative chain.

Getting an actual `Applicative` instance from those definitions is easy: fill `unit` with something, and take away the commas from `fzip`:

``````instance Applicative Twisted where
pure x = Good x x

Good x1 x2 <*> Good y1 y2 = Good (x1 y1) (x2 y2)
Evil x1 x2 <*> Evil y1 y2 = Evil (x1 y1) (x1 y2)
Evil x1 x2 <*> Good y1 y2 = Evil (x1 y1) (x2 y2)
Good x1 x2 <*> Evil y1 y2 = Evil (x1 y1) (x1 y2) ``````

And there it is:

``````GHCi> test = Evil 1 2
GHCi> test *> test
Evil 1 2
GHCi> dup <\$> test
Evil (1,1) (2,2)
GHCi> fzip (test, test)
Evil (1,1) (1,2)
GHCi> (\x -> x + x) <\$> test
Evil 2 4
GHCi> (+) <\$> test <*> test
Evil 2 3``````

The conjecture is thus refuted. While parametricity isn’t truly necessary to bring out this counterexample, I am far from sure I would have thought of it without having explored it under the light of parametricity. On another note, it is rather interesting that there are biased applicatives like `Twisted`. I wonder whether less contrived cases can be found out there in the wild.

Appendix

Below are some derivations that might distract from the main thrust of the post.

Alternative presentation of the idempotency property

One direction of the equivalency between the two formulations of the idempotency property follows from a straightforward substitution…

``````f <\$> u <*> u = (\x -> f x x) <\$> u
(,) <\$> u <*> u = (\x -> (,) x x) <\$> u
fzip (u, u) = dup <\$> u``````

… while the other one calls for a small dose of parametricity:

``````fzip (u, u) = dup <\$> u
first f <\$> fzip (u, u) = first f <\$> dup <\$> u
-- g <\$> f <\$> u = g . f <\$> u
first f <\$> fzip (u, u) = (\x -> (f x, x)) <\$> u
-- Parametricity: first f <\$> fzip (u, v) = fzip (f <\$> u, v)
fzip (f <\$> u, u) = (\x -> (f x, x)) <\$> u
app <\$> fzip (f <\$> u, u) = app <\$> (\x -> (f x, x)) <\$> u
f <\$> u <*> u = (\x -> f x x) <\$> u``````

Alternative definitions of `(<*)` and `(*>)`

Starting from…

``````u <* v = const <\$> u <*> v
u *> v = flip const <\$> u <*> v``````

… we can switch to the monoidal presentation:

``````u <* v = app <\$> fzip (const <\$> u, v)
u *> v = app <\$> fzip (flip const <\$> u, v)``````

It follows from parametricity that…

``````-- Parametricity: first f <\$> fzip (u, v) = fzip (f <\$> u, v)
u <* v = app . first const <\$> fzip (u, v)
u *> v = app . first (flip const) <\$> fzip (u, v)``````

… which amount to…

``````u <* v = (\(x, y) -> const x y) <\$> fzip (u, v)
u *> v = (\(x, y) -> flip const x y) <\$> fzip (u, v)``````

… or simply:

``````u <* v = fst <\$> fzip (u, v)
u *> v = snd <\$> fzip (u, v)``````

Lawfulness of `Twisted` as an applicative functor

Right identity:

``````fzip (u, unit) ~ u
-- Case: u = Good x1 x2
fzip (Good x1 x2, Good () ()) -- LHS
Good (x1, ()) (x2, ()) -- LHS ~ RHS
-- Note that Twisted behaves like an ordinary length-two vector as
-- long as only Good is involved. That being so, it would have been
-- fine to skip the Good-only cases here and elsewhere.
-- Case: u = Evil x1 x2
fzip (Evil x1 x2, Good () ()) -- LHS
Evil (x1, ()) (x2, ()) -- LHS ~ RHS``````

Left identity:

``````fzip (unit, u) ~ u
-- Case: u = Good x1 x2
fzip (Good () (), Good y1 y2)
Evil ((), y1) ((), y2) -- LHS ~ RHS
-- Case: u = Evil x1 x2
fzip (Good () (), Evil y1 y2)
Evil ((), y1) ((), y2) -- LHS ~ RHS``````

Associativity:

``````fzip (fzip (u, v), w) ~ fzip (u, fzip (v, w))
-- Good/Good/Good case: holds.
fzip (fzip (Good x1 x2, Good y1 y2), Good z1, z2) -- LHS
fzip (Good (x1, y1) (x2, y2), Good z1, z2)
Good ((x1, y1), z1) ((x2, y2), z2)
fzip (Good x1 x2, fzip (Good y1 y2, Good z1 z2)) -- RHS
fzip (Good x1 x2, Good (y1, z1) (y2, z2))
Good (x1, (y1, z1)) (x2, (y2, z2)) -- LHS ~ RHS
-- Evil/Evil/Evil case:
fzip (fzip (Evil x1 x2, Evil y1 y2), Evil z1, z2) -- LHS
fzip (Evil (x1, y1) (x1, y2), Evil z1, z2)
Evil ((x1, y1), z1) ((x1, y1), z2)
fzip (Evil x1 x2, fzip (Evil y1 y2, Evil z1 z2)) -- RHS
fzip (Evil x1 x2, Evil (y1, z1) (y1, z2))
Evil (x1, (y1, z1)) (x1, (y1, z2)) -- LHS ~ RHS
-- Good/Evil/Evil case:
fzip (fzip (Good x1 x2, Evil y1 y2), Evil z1, z2) -- LHS
fzip (Good (x1, y1) (x2, y2), Evil z1, z2)
Evil ((x1, y1), z1) ((x1, y1), z2)
fzip (Good x1 x2, fzip (Evil y1 y2, Evil z1 z2)) -- RHS
fzip (Good x1 x2, Evil (y1, z1) (y1, z2))
Evil (x1, (y1, z1)) (x1, (y1, z2)) -- LHS ~ RHS
-- Evil/Good/Evil case:
fzip (fzip (Evil x1 x2, Good y1 y2), Evil z1, z2) -- LHS
fzip (Evil (x1, y1) (x2, y2), Evil z1, z2)
Evil ((x1, y1), z1) ((x1, y1), z2)
fzip (Evil x1 x2, fzip (Good y1 y2, Evil z1 z2)) -- RHS
fzip (Evil x1 x2, Evil (y1, z1) (y1, z2))
Evil (x1, (y1, z1)) (x1, (y1, z2)) -- LHS ~ RHS
-- Evil/Evil/Good case:
fzip (fzip (Evil x1 x2, Evil y1 y2), Good z1, z2) -- LHS
fzip (Evil (x1, y1) (x1, y2), Good z1, z2)
Evil ((x1, y1), z1) ((x1, y2), z2)
fzip (Evil x1 x2, fzip (Evil y1 y2, Good z1 z2)) -- RHS
fzip (Evil x1 x2, Evil (y1, z1) (y2, z2))
Evil (x1, (y1, z1)) (x1, (y2, z2)) -- LHS ~ RHS
-- Evil/Good/Good case:
fzip (fzip (Evil x1 x2, Good y1 y2), Good z1, z2) -- LHS
fzip (Evil (x1, y1) (x2, y2), Good z1, z2)
Evil ((x1, y1), z1) ((x2, y2), z2)
fzip (Evil x1 x2, fzip (Good y1 y2, Good z1 z2)) -- RHS
fzip (Evil x1 x2, Good (y1, z1) (y2, z2))
Evil (x1, (y1, z1)) (x2, (y2, z2)) -- LHS ~ RHS
-- Good/Evil/Good case:
fzip (fzip (Good x1 x2, Evil y1 y2), Good z1, z2) -- LHS
fzip (Evil (x1, y1) (x1, y2), Good z1, z2)
Evil ((x1, y1), z1) ((x1, y2), z2)
fzip (Good x1 x2, fzip (Evil y1 y2, Good z1 z2)) -- RHS
fzip (Good x1 x2, Evil (y1, z1) (y2, z2))
Evil (x1, (y1, z1)) (x1, (y2, z2)) -- LHS ~ RHS
-- Good/Good/Evil case:
fzip (fzip (Good x1 x2, Good y1 y2), Evil z1, z2) -- LHS
fzip (Good (x1, y1) (x2, y2), Evil z1, z2)
Evil ((x1, y1), z1) ((x1, y1), z2)
fzip (Good x1 x2, fzip (Good y1 y2, Evil z1 z2)) -- RHS
fzip (Good x1 x2, Evil (y1, z1) (y1, z2))
Evil (x1, (y1, z1)) (x1, (y1, z2)) -- LHS ~ RHS``````

1. For a gentle initial illustration of the underlying theme of parametricity, see What Does fmap Preserve?. For a more thorough introduction, see Parametricity: Money for Nothing and Theorems for Free, by Bartosz Milewski.

2. A relation is a set of pairs; or, if you will, of associations between values. As an arbitrary example, we can have a less-than relation on integers which includes all pairs of integers `(x, y)` such that `x < y`. In particular, functions are relations: seen as a relation, a function `f` seen in this way includes all pairs `(x, f x)`, there being exactly one pair for each possible value of the first component.

3. One way to prove that is by using parametricity in tandem with the identity laws, analogously to how we used associativity in the previous section, while exploiting how there being only one shape means any applicative value can be related to `unit`.

4. See the previous note about relating things to `unit`.

5. The appendix includes a proof of the lawfulness of `Twisted`.