Modes: An Ubiquitous yet Overlooked Concept in Programming
The first programming language that I learned seriously is Java.
I’ve quickly come out with an interpretation that whatever language features in Java correspond to parts of speech.
Variables are nouns;
this
is the subject and the arguments after the name of the method call are objects;
methods are verbs and interfaces are adjectives.
But what are the adverbs?
Well, it’s pretty obvious that programming languages aren’t natural languages so not everything has an analogy in natural languages.
But, hey, recently I realized there actually is something roughly corresponds to adverbs, which I called argument modes, modes, for short.
Modes are ways to call a function or to annotate the data with more information.
One may ask, “Is there more than one way to call a method in Java?”
Actually, there are 4 modes in Java, but arguments of 2 modes can only be called implicitly.
Now let’s see an example with all 4 modes in Java.
public interface Stream<T> {
// ...
<R> Stream<R> map(Function<? super T, ? extends R> mapper);
// ...
}
-
the normal mode
Here, the argumentmapper
is in the normal mode, which is the only mode in which arguments can always be supplied explicitly at runtime in Java. -
the type mode
They are arguments of generics, which are introduced in Java SE 5.
Examples are theT
inStream<T>
and theR
beforeStream<R>
.
They are enclosed in angle brackets (<>
),
and some of them can only be inferred,
such as the<R>
in front of the method declaration and the 2?
’s. -
the
super
mode
Appears asT
in? super T
in the source code.
Always have to be passed implicitly.
Defines the lower bound of the type. -
the
extends
mode
Appears asR
in? extends R
.
Similar to the previous mode.
We can see that modes provide ways to pass some information to methods and classes, either explicitly or implicitly.
However, is the concept of modes also presented in other languages?
From now on, let’s focus on another language, Haskell,
which provides a more advanced type system compared to Java.
In haskell, functions are first-class members, and classes can abstract over higher-kinded types.
With language extensions, we can use even more advanced features.
Haskell has 3 modes.
Now I’m going to show all of them in the follow example.
The following fmap
function is similar to the map
function in Java, but is more flexible by abstracting over the container type.
fmap :: forall f a b. Functor f => (a -> b) -> f a -> f b
-
the normal mode
The actual parameters passed at runtime, which are the values of typesa
,b
,f a
andf b
.
The same as the normal mode in Java. -
the
forall
mode
The arguments after the keywordforall
, which aref
,a
,b
.
Similar to the type mode in Java.
Can only be inferred until GHC 8.
After GHC 8, type arguments can be explicitly provided using the syntax@a
Types themselves also have types, which are called kinds.
for example,f
has kind* -> *
.
Ultimately, with extensionTypeInType
turned on,
kinds also have types, e.g.* :: *
. -
the constraint mode
The constraints before the fat arrow (=>
).
Constraints are adjectives.
In the example, sending an argument (f
) to the classFunctor
gives you a constraint, but classes can have zero or more arguments and constraints need not be classes with arguments passed in (with language extensions).
We’ve seen that Java and Haskell both have several modes, and many other languages also do.
However, the syntaxes among the modes are “unequal” in either Java and Haskell.
Every mode has its own syntax, and they don’t look similar at all.
In the following section, I will present a hypothetical dependently-typed system programming language with more unified syntax and view among the possible modes without subtyping.
In this hypothetical language, all modes of arguments can be called explicitly, either optionally or not.
And I don’t use angle brackets, because they could make parsing more difficult or ambiguous.
In former C++ versions, you cannot write A<B<C>>
instead of the uglier A<B<C> >
, because the trailing >>
would be parsed as an operator.
Rust also faced a similar problem.
Thus, the best solution in my opinion is to fully eliminate angle brackets.
In fact, I’m going to use only 2 kinds of delimiters, ()
and []
, preserving {}
for other uses.
-
normal mode
The same as above.
Unlike Haskell, arguments in normal mode cannot be curried because it’s closer to the behavior of current machines.
If arguments in normal mode could be curried, the compiler often has to generate several uncurried functions or return lambda.
Of course, functions can still be partially applied using explicit lambda.Examples:
fn factorial : (Uint) -> Uint; fn max : (Int, Int) -> Int;
-
const mode
Before introducing const mode, I’m going to talk about const functions.
Const functions are pure functions that always terminate, e.g. the abovefactorial
andmax
function.const fn factorial : (Uint) -> Uint; const fn max : (Int, Int) -> Int;
The idea is that const functions can simply be evaluated at compile time if all the arguments are known at compile time.
Const functions need not to be evaluated at compile time, though.
In fact, it must be able to be evaluated at runtime.
In most dependently-typed languages, including this hypothetical one, types are themselves first-class values, and user-defined data types are also constants (the type, not the term of the type).
All small types have type (or kind)Type
.Now, return to const mode.
Arguments in const mode must be evaluated in compile time, so they must be either a literal (including data types) or a const function applied with (recursively) constants.
For example,const fn apply : [From : Type, To : Type](From -> To, From) -> To;
This function simply applies its first argument to its second argument.
What’s the difference between the previous
max
function and the followingmax2
function, though?const fn max2 : [_ : Int, _ : Int]() -> Int;
First,
max2
can only be evaluated at compile time, not at runtime.
More importantly, arguments in const mode are supposed to be inferred, resembling the type mode in java, but much more powerful.
Arguments ofmax2
can never be inferred, so they should not be in const mode.// Invocations: max(0, 1); apply(factorial, 42); // Arguments in const mode are inferred. // Doesn't work: // max2(); // Cannot infer its arguments in const mode.
Moreover, arguments in const mode are able to be curried, so
max2[a, b]
andmax2[a][b]
are interchangeable.
There are 2 reasons for it:-
users are not supposed to do heavy calculation at compile time, so performance isn’t that important.
-
In Scala, generics cannot be curried, you need to use a hack called type lambda if you want to partially apply a generic data type.
The reason why it’s a hack is not only it’s syntactically verbose, but also it makes type inference unpredictable.
Now let’s consider another feature of the const mode:
The type of the arguments can also be inferred when declaring the function.
apply
can also be written as below:const fn apply : [From, To](From -> To, From) -> To;
And the well-known
id
function could be declared as:const fn id : [T](T) -> T;
-
-
instance mode
The hypothetical programming language is in fact heavily inspired by Agda.
The beauty of Agda in my opinion is that it provides a uniform way to define sorts of information, therefore noclass
versusinterface
in Java ordata
versusclass
in Haskell.
The boundary between types and classification of types disappeared, and the differences are represented with different modes.The idea of
data
andclass
doesn’t necessarily need to be unified in this language, though.
But in this language, instances are just another kind of argument, (usually automatically) being passed to functions in a similar syntax.
instance arguments are enclosed in[()]
, for example:const fn map[F, A, B][(Functor[F])](F[A], A -> B) -> F[B];
Notice that
[(Foo)]
should not be parsed (either by human beings or compilers) as[ (Foo) ]
because[A]
means[A : _]
and putting a pair of parentheses around the argument doesn’t make much sense.
Also notice that instance mode is more similar to normal mode than const mode is because(Bar)
and[(Bar)]
mean(_ : Bar)
and[(_ : Bar)]
, respectively. -
pi mode
In a dependently-typed language, types can depend on terms.
But in our language this could not be done … yet.
Being enclosed in([])
, arguments in pi mode are similar to ones in const mode such that([T])
represents([T : _])
, but also arguments in pi mode can be provided at runtime and cannot be curried.
Summary
normal | const | instance | pi | |
---|---|---|---|---|
T means |
(_ : T) |
[T : _] |
[(_ : T)] |
([T : _]) |
type inference | no | yes | no | yes |
phase (if not const fn ) |
runtime | compile time | compile time | runtime |
curryable | no | yes | yes | no |
argument inference | no | yes | proof search | ? |
can be dependent on | no | yes | yes | yes |
Dependent Modes
The 4 modes above are possibly enough for a language without subtyping, or maybe there could be 2 variants of pi mode, one’s argument can be inferred whereas the other’s cannot.
The syntax would likely be ([])
and ([[]])
then.
But I’m just going to assume there is only 1 kind of pi in the rest of the article.
If I want to achieve higher abstration, I would want to be able to be generic over these kinds of modes.
Now, introduce dependent modes.
First, there would be a special builtin type Mode
.
It’s definition could be something like :
#[lang = "mode"]
const data Mode = Normal | Const | Instance | Pi ;
Mode
is special that a term of that type can only be supplied in const mode, because Mode
s usually work at compile time; it makes little sense at runtime.
Functions are able to recieve arguments in the mode depending on the previous argument of type Mode
.
Let’s show an example:
const fn baz : [mode : Mode][[[mode] Int]] -> Int;
[[[mode] Int]]
means that the argument is in mode mode
You can call the function with all kinds of modes:
baz(1);
baz[1];
baz[(1)];
baz([1]);
mode
would be initialized to the mode in which you call the function or would be inferred.
If the argument in the dependent mode is being dependent on another argument, then the mode cannot be initialized as normal mode.
const fn wierdId : [mode : Mode][[[mode] T : Type]](T) -> T;
wierdId(1); // Ok, `mode` would probably be `Const`.
wierdId([Int])(1) // Also ok.
// wierdId(Int)(1) // Error.
Summary, Updated
normal | const | instance | pi | dependent | |
---|---|---|---|---|---|
T means |
(_ : T) |
[T : _] |
[(_ : T)] |
([T : _]) |
[[_ : T]] |
type inference | no | yes | no | yes | no |
phase (if not const fn ) |
runtime | compile time | compile time | runtime | depends |
curryable | no | yes | yes | no | depends |
argument inference | no | yes | proof search | ? | depends |
can be dependent on | no | yes | yes | yes | depends |