In this post: yet another code-deduplication trick with trade-offs regarding readability and documentation generation. Make of it what you will.

Many useful operations can be derived in terms of indexing functions:

module type Indexable1 = sig
type 'a t

val get    : 'a t -> int -> 'a
val length : _ t -> int
end

module Foldable_of_indexable1 (I : Indexable1) : sig
open I

val iter      :        ('a -> unit) -> 'a t -> unit
val iteri     : (int -> 'a -> bool) -> 'a t -> unit
val fold_left : ('acc -> 'a -> 'acc) -> 'acc -> 'a t -> 'acc
val exists    : ('a -> bool) -> 'a t -> bool
val for_all   : ('a -> bool) -> 'a t -> bool
val is_empty  : _ t -> bool
(* ... *)
end


For many types, including array, the get-based definitions are identical to their hand-optimised equivalents (modulo functor application). We can imagine avoiding a lot of standard-library boilerplate – and potential for API inconsistency – by using many such functors 1. We'd end up defining exactly one iter function that suffices for all Indexable types:

let iter f t =
for i = 0 to length t - 1 do
f (get t i)
done


All good so far. Now we reflect on the fact that string values are indexable containers of char values. Indeed, our definition of iter is exactly equal to the one in Stdlib.String.iter. Unfortunately, our Indexable1 module type can only describe parametric containers:

module _ : Indexable1 with type 'a t := string = Stdlib.String

Error: Signature mismatch:
...
Values do not match:
val get : t -> int -> char
is not included in
val get : t -> int -> 'a
File "string.mli", line 52, characters 0-57: Actual declaration


The problem is that it's impossible to tell the type system something like

'a t = string $\quad$ implies $\quad$ 'a = char

as part of our substitution. Roughly speaking, this means that any indexable type with unboxed elements can't benefit from our Foldable_of_iterable1 definitions, even if their own definitions would be identical! Strings, bytes, unboxed arrays and unboxed vectors all suffer from this problem.

As ever, we could solve the problem with copy-paste: a new Indexable0 module type for non-parametric containers, and a new functor Foldable_of_indexable0 with exactly the same implementations as our previous one.

(* Non-parametric indexable types *)
module type Indexable0 = sig
type t
type elt

val get    : t -> int -> elt
val length : t -> int
end

module Foldable_of_indexable0 (I : Indexable0) : sig
open I

(* All with the same implementation as before... *)
val iter      :        (elt -> unit) -> t -> unit
val iteri     : (int -> elt -> bool) -> t -> unit
val fold_left : ('acc -> elt -> 'acc) -> 'acc -> t -> 'acc
val exists    : (elt -> bool) -> t -> bool
val for_all   : (elt -> bool) -> t -> bool
val is_empty  : t -> bool
(* ... *)
end


This definition suffers from the dual problem when we try to apply it to parameterised containers like 'a array:

module _ : Indexable0 with type t := 'a array = Stdlib.Array

Error: The type variable 'a is unbound in this type declaration.


This time, we wanted to be able to say something like

elt = 'a $\quad$ implies $\quad$ t = 'a array $\quad$ (where 'a is universally quantified),

which is even less meaningful than our previous attempt since now we're expecting substitution to introduce new equalities and quantifications. We really do need both Indexable0 and Indexable1 with this approach.

## A better (?) solution

Interestingly, it's possible to generalise Indexable0 and Indexable1 with another layer of indirection by making elt a type operator:

module type IndexableN = sig
type 'a t
type 'a elt

val get    : 'a t -> int -> 'a elt
val length : _ t -> int
end


elt carries the type equalities needed for the Indexable1 case, without forbidding the non-parametric implementation needed for the Indexable0 case. Arrays can set 'a elt = 'a, and strings can set 'a elt = char. Indeed, we can do this in the general case:

(** [Indexable0] is a special-case of [IndexableN] *)
module Indexable0_to_N = functor
(T : Indexable0) ->
(T : IndexableN with type 'a t := T.t and type 'a elt := elt)

(** [Indexable1] is a special-case of [IndexableN] *)
module Indexable1_to_N = functor
(T : Indexable1) ->
(T : IndexableN with type 'a t := 'a T.t and type 'a elt := 'a)


Now we can define a single Foldable_of_indexableN functor (with exactly the same implementations as before), and it will work for polymorphic and monomorphic containers. Neat!

## What did it cost?

At time of writing, this trick has a cost in documentation complexity: Odoc doesn't inline the type substitutions introduced by our IndexableX_to_N functors, so the documentation for our Array and String modules will appear more general than necessary. The new Odoc model fixes this issue with better support for type substitutions, meaning that such trickery doesn't need to impact your users.

One unavoidable limitation is in what sort of operations we can put in the Foldable_of_indexable functor. Suppose our initial functor included a sum : int t -> int following function:

val sum : int t -> int


sum relies on the fact that it's possible to put int values in the container, which is clearly not possible for strings. Naturally, the type system prevents us from violating parametricity in this way:

module type Summable =
type 'a t
val sum : int t -> int
end

module _ (T : Summable) = (T : Summable with type 'a t := string)

Error: Values do not match:
val sum : int t -> int
is not included in
val sum : string -> int


To state the obvious, we can't rely on parametricity in our implementations if we want them to work on non-parametric containers.

The typeclasses in Haskell's base have the same "polymorphic-instances-only" property as our Indexable1 signature (unsurprising, since Base doesn't provide any unboxed container types).

class Indexable1 f where        -- Polymorphic instances only
get    :: f a -> Int -> a
length :: f a -> Int


A similar trick can be performed there to generalise the typeclass instances for monomorphic containers like Text:

{-# LANGUAGE TypeFamilies #-}

type family   Elt container     -- Relate containers to their element type
type instance Elt [a]  = a
type instance Elt Text = Char

class IndexableN c where
get    :: c -> Int -> Elt c
length :: c -> Int

instance IndexableN [a] where   -- Polymorphic instance
get    = (!!)
length = Prelude.length

instance IndexableN Text where  -- Monomorphic instance
get    = Text.index
length = Text.length


As in the OCaml version, we use an Elt type operator to carry the equality needed for the monomorphic case. This time we used type families to specify the relations explicitly, but we could have used multi-parameter type classes for something more akin to the OCaml functor implementation. See the mono-traversable package for more of this sort of trickery in Haskell.

1. This is the approach taken by Jane Street's base, and is very similar to the Haskell notion of building standard libraries from type-class instances.