BEJOCAMA - Category Theory and C++
Table of Contents
1. Intro
A programming language consists on a bunch of elements - variables, functions, objects, structures - which behave like it is defined by this language. But there is perhaps a motivation to work with self defined elements, which behaves in a certain way. We can talk about constructing a domain specific language or constructing a theory, which needs to be proved.
It's difficult to argue, that there is a need of a new language, because there are not so much people, who solved the same complex programming problem in two different languages and finally he came to the result, that the first one was more suitable than the second one.
Most programmers are interested on structural problems - but most programmers must also produce solutions for existing problems. So an interesting task could be, to move step by step to more powerful working - directly usable - concepts.
So I want to use existing features of a programming lanuguage and give them an appropriate meaning. In constrast to object oriented programming, where the Liskov principle is most important - a circle is a shape - I want to give things an interpretation, which may vary from context to context. So a circle can be interpreted as a shape, but other interpretations are also possible.
Such considerations lead to a very different programming style and new constructions to work with.
Sure, this approach is motivated by category theory. Because programming is totally constructive, all ideas should here be representable by constructions - by code examples.
For this I'm using C++ to express some ideas. The result is not a C++ programm - even if it seems to be. The most important feature of C++ I'm using is the template system, which allows generic programming. This is the connection to category theory, which has a generic nature.
Let's start to find a solution for the problem avoiding loops.
2. Avoiding Loops
Why loops are a problem? Loops are a problem because of the body, which may contain portential programming errors. But if we consequently replace the loop body by a separated body function (perhaps a lambda), the problem is still in that function and the code may become perhaps worse to read
std::list<int> test_function(std::list<int> l) { std::list<int> rl; for(auto& e: l) { rl.emplace_back(e*e); } }
can be transformed to
std::list<int> test_function(std::list<int> l) { std::list<int> rl; auto square = [&rl] (int e) { rl.emplace_back(e*e); }; for(auto& e: l) { square(e); } return rl; }
In the c++ standard we will find functional structures for avoiding explicitely looping
std::list<int> test_function(std::list<int> l) { std::list<int> rl; auto square = [&rl] (int e) { rl.emplace_back(e*e); }; std::for_each(l.cbegin(), l.cend(), square); return rl; }
After a look on the code, can I deliver an argument, why we should do this? My answer is - at the moment it makes no sense. Because I want create a new language and we need some words of this language to construct something we can compare with the old language. This is an journey of construction experiments. So proclaiming a problem avoiding loops is at firsta pseudo argument and finally an introduction drug.
Let's now go a different direction and let's look on the functor concept
template<template<typename...> class G> struct functor; template<> struct functor<std::list> { teplate<typename F, typename A> decltype(auto) operator()(F&& f, const std::list<A>& l) { using type = decltype(std::forward<F>(f)(std::declval<A>())); if constexpr (std::is_same<void,type>::value) { std::for_each(l.cbegin(), l.cend(), std::forward<F>(f)); } else { std::list<type> rl; auto x = [&rl,ff=std::forward<F>(f)](const auto& e) { rl.emplace_back(std::forward<F>(f)(e)); }; std::for_each(l.cbegin(), l.cend(), x); return rl; } } };
The template functor is said here to be an interpretation and especially in this case an interpretation of a list. The list is also a template, which is understood here as an generator. For the moment lets state
An interpretaton is a mapping of a generator to a type. A generator is a mapping of a type to a type.
Next take a look on the code
std::list<int> test_function(std::list<int> l) { auto square = [] (int e) { return e*e; }; return functor<std::list>()(square,l); }
It's perhaps clear, that such container iterations can be generalized, so that the loop we only need to implement at one place.
But this is not enough. Important seems to me, to express the relation to an arithmetic calculation by an operator. First take a look on the wish
std::list<int> test_function(std::list<int> l) { auto square = [] (int e) { return e*e; }; return square <<a>> l; }
In functional languages we can find a \(\lt{}$\gt{}\) operator. In c++ our possiblities to define operators are limited. I'm trying to circumvent this by using an approach found on different sites in the web, looking for "custom operators in c++". But what would the \(\lt{}a\gt{}\) operator express - in contrast to \(\ll{}a\gg{}\) - simply
std::list<int> test_function(int i) { auto square = [] (int e) { return e*e; }; return square <a> i; }
So
\(square \lt{}a\gt{} i\)
is simply
\(square(i)\).
Using an operator instead of parenthesis is a very important method to express value application in functional languages. So let's apply the function square two times, using the application operator
std::list<int> test_function(int i) { auto square = [] (int e) { return e*e; }; return square <a> square <a> i; }
This looks much more elegent and readable than
std::list<int> test_function(int i) { auto square = [] (int e) { return e*e; }; return square(square(i)); }
I think functional programmers wouldn't like so much these languages without having these operators.
We can see, that the operator \(\ll{}a\gg{}\) is an extension of an ordinary function application, where the function is applied to a functorial value.
If we would generate a new function via the functor operator instead of directly applying the function to the value, we would get the following situation
std::list<int> test_function(std::list<int> l) { auto square = [] (int e) { return e*e; }; return functor<std::list>()(square) <a> l; }
We can directly apply the value to the new generated function. This approach of the functor implementation looks like
template<> struct functor<std::list> { template<typename F> constexpr decltype(auto) operator()(F&& f) { return [ff=std::forward<F>(f)] (auto&& l) { using type = decltype(ff(std::declval<decltype(*l.begin())>())); if constexpr (std::is_same<type,void>::value) { std::for_each(l.cbegin(), l.cend(), ff); } else { std::list<type> rl; auto x = [&rl,&ff](const auto& e) { rl.emplace_back(ff(e)); }; std::for_each(l.cbegin(), l.cend(), x); return rl; } }; } };
Function generation code can always declared as constexpr. Perhaps it is suprising - because of the auto in the lambda declaration, that is possible to do the following
std::list<int> test_function(std::list<int> l) { auto square = [] (int e) { return e*e; }; auto list_square = functor<std::list>()(square); return list_square <a> l; }
Because the lambda is a shortcut to an object with one call operator, the object list_square represents the object, not the operator function. The template resolution is done only at application time. Perhaps it is later better to replace the lambda by a dedicated function object.
Before continue with more interpretations and their coding, let's get in slide touch with some mathematical notations.
3. Functor in Category Theory
The whole theory on avoiding loops can be express in an abstract diagram
The function f transforms values of type X to values of type Y. The functor F represents a type transformation from X to F<X> and Y to F<Y> and a function transformation from F to F(f). The type transformation in our example in the last chapter is done by the list template, where an int is transformed to a list<int>. The function square from int to int becomes a function from list<int> to list<int>. So we can see, that the type transformation is completely defined by the argument and the return value of the function f and the type generator list.
Details can be read in books on category theory. Only one remark is perhaps important at this place. Category theory isn't a theory of programming, but can be applied to this area, if we look more in a functional way on this topic.
Now a first sidestep follows - custom operators – is comming soon.