BEJOCAMA - Category Theory and C++

Table of Contents

1. Intro

A programming language consists on a bunch of elements - variables, functions, objects, structures - which behave like it is defined by this language. But there is perhaps a motivation to work with self defined elements, which behaves in a certain way. We can talk about constructing a domain specific language or constructing a theory, which needs to be proved.

It's difficult to argue, that there is a need of a new language, because there are not so much people, who solved the same complex programming problem in two different languages and finally he came to the result, that the first one was more suitable than the second one.

Most programmers are interested on structural problems - but most programmers must also produce solutions for existing problems. So an interesting task could be, to move step by step to more powerful working - directly usable - concepts.

So I want to use existing features of a programming lanuguage and give them an appropriate meaning. In constrast to object oriented programming, where the Liskov principle is most important - a circle is a shape - I want to give things an interpretation, which may vary from context to context. So a circle can be interpreted as a shape, but other interpretations are also possible.

Such considerations lead to a very different programming style and new constructions to work with.

Sure, this approach is motivated by category theory. Because programming is totally constructive, all ideas should here be representable by constructions - by code examples.

For this I'm using C++ to express some ideas. The result is not a C++ programm - even if it seems to be. The most important feature of C++ I'm using is the template system, which allows generic programming. This is the connection to category theory, which has a generic nature.

Let's start to find a solution for the problem avoiding loops.

2. Avoiding Loops

Why loops are a problem? Loops are a problem because of the body, which may contain portential programming errors. But if we consequently replace the loop body by a separated body function (perhaps a lambda), the problem is still in that function and the code may become perhaps worse to read

std::list<int> test_function(std::list<int> l)
{
  std::list<int> rl;

  for(auto& e: l) {
    rl.emplace_back(e*e);
  }
}

can be transformed to

std::list<int> test_function(std::list<int> l)
{
  std::list<int> rl;

  auto square = [&rl] (int e) {
    rl.emplace_back(e*e);
  };

  for(auto& e: l) {
    square(e);
  }

  return rl;
}

In the c++ standard we will find functional structures for avoiding explicitely looping

std::list<int> test_function(std::list<int> l)
{
  std::list<int> rl;

  auto square = [&rl] (int e) {
    rl.emplace_back(e*e);
  };

  std::for_each(l.cbegin(), l.cend(), square); 

  return rl;
}

After a look on the code, can I deliver an argument, why we should do this? My answer is - at the moment it makes no sense. Because I want create a new language and we need some words of this language to construct something we can compare with the old language. This is an journey of construction experiments. So proclaiming a problem avoiding loops is at firsta pseudo argument and finally an introduction drug.

Let's now go a different direction and let's look on the functor concept

template<template<typename...> class G>
struct functor;

template<>
struct functor<std::list>
{
  teplate<typename F, typename A>
  decltype(auto) operator()(F&& f, const std::list<A>& l)
  {
    using type = decltype(std::forward<F>(f)(std::declval<A>()));

    if constexpr (std::is_same<void,type>::value) {

      std::for_each(l.cbegin(), l.cend(), std::forward<F>(f));

    } else {

      std::list<type> rl;

      auto x = [&rl,ff=std::forward<F>(f)](const auto& e) {
        rl.emplace_back(std::forward<F>(f)(e));
      };

      std::for_each(l.cbegin(), l.cend(), x);

      return rl;
    }
  }
};

The template functor is said here to be an interpretation and especially in this case an interpretation of a list. The list is also a template, which is understood here as an generator. For the moment lets state

An interpretaton is a mapping of a generator to a type. A generator is a mapping of a type to a type.

Next take a look on the code

std::list<int> test_function(std::list<int> l)
{
  auto square = [] (int e) {
    return e*e;
  };

  return functor<std::list>()(square,l);
}

It's perhaps clear, that such container iterations can be generalized, so that the loop we only need to implement at one place.

But this is not enough. Important seems to me, to express the relation to an arithmetic calculation by an operator. First take a look on the wish

std::list<int> test_function(std::list<int> l)
{
  auto square = [] (int e) {
    return e*e;
  };

  return square <<a>> l;
}

In functional languages we can find a \(\lt{}$\gt{}\) operator. In c++ our possiblities to define operators are limited. I'm trying to circumvent this by using an approach found on different sites in the web, looking for "custom operators in c++". But what would the \(\lt{}a\gt{}\) operator express - in contrast to \(\ll{}a\gg{}\) - simply

std::list<int> test_function(int i)
{
  auto square = [] (int e) {
    return e*e;
  };

  return square <a> i;
}

So

\(square \lt{}a\gt{} i\)

is simply

\(square(i)\).

Using an operator instead of parenthesis is a very important method to express value application in functional languages. So let's apply the function square two times, using the application operator

std::list<int> test_function(int i)
{
  auto square = [] (int e) {
    return e*e;
  };

  return square <a> square <a> i;
}

This looks much more elegent and readable than

std::list<int> test_function(int i)
{
  auto square = [] (int e) {
    return e*e;
  };

  return square(square(i));
}

I think functional programmers wouldn't like so much these languages without having these operators.

We can see, that the operator \(\ll{}a\gg{}\) is an extension of an ordinary function application, where the function is applied to a functorial value.

If we would generate a new function via the functor operator instead of directly applying the function to the value, we would get the following situation

std::list<int> test_function(std::list<int> l)
{
  auto square = [] (int e) {
    return e*e;
  };

  return functor<std::list>()(square) <a> l;
}

We can directly apply the value to the new generated function. This approach of the functor implementation looks like

template<>
struct functor<std::list>
{
  template<typename F>
  constexpr decltype(auto) operator()(F&& f)
  {
      return [ff=std::forward<F>(f)] (auto&& l) {
        using type = decltype(ff(std::declval<decltype(*l.begin())>()));

        if constexpr (std::is_same<type,void>::value) {

           std::for_each(l.cbegin(), l.cend(), ff);

        } else {

           std::list<type> rl;

           auto x = [&rl,&ff](const auto& e) {
              rl.emplace_back(ff(e));
           };

           std::for_each(l.cbegin(), l.cend(), x);

           return rl;
       }
    };
  }
};

Function generation code can always declared as constexpr. Perhaps it is suprising - because of the auto in the lambda declaration, that is possible to do the following

std::list<int> test_function(std::list<int> l)
{
  auto square = [] (int e) {
    return e*e;
  };

  auto list_square = functor<std::list>()(square);

  return list_square <a> l;
}

Because the lambda is a shortcut to an object with one call operator, the object list_square represents the object, not the operator function. The template resolution is done only at application time. Perhaps it is later better to replace the lambda by a dedicated function object.

Before continue with more interpretations and their coding, let's get in slide touch with some mathematical notations.

3. Functor in Category Theory

The whole theory on avoiding loops can be express in an abstract diagram

functor.png

The function f transforms values of type X to values of type Y. The functor F represents a type transformation from X to F<X> and Y to F<Y> and a function transformation from F to F(f). The type transformation in our example in the last chapter is done by the list template, where an int is transformed to a list<int>. The function square from int to int becomes a function from list<int> to list<int>. So we can see, that the type transformation is completely defined by the argument and the return value of the function f and the type generator list.

Details can be read in books on category theory. Only one remark is perhaps important at this place. Category theory isn't a theory of programming, but can be applied to this area, if we look more in a functional way on this topic.

Now a first sidestep follows - custom operators – is comming soon.