
Friday, September 16, 2022

C++ Decays into Lisp (Chapter 1)

I have been using Mathematica since about version 3.0 (roughly 1996).  I also have extensive Lisp experience beginning in the 1980s. I ported Portable Standard Lisp (https://github.com/blakemcbride/PSL and https://en.wikipedia.org/wiki/Portable_Standard_Lisp) to VMS for use at LEXEME Corporation.  LEXEME initially used Lisp to build what I considered to be a successful computer-language-to-computer-language translation system (more on this company in future posts).

Mathematica is a Lisp-ish offshoot in terms of its core Math Kernel functionality.  The UI is otherworldly in terms of power when compared to most IDEs - particularly in the areas of complex mathematical and textual formatting as well as 2D and 3D graphics.  I have routinely used Mathematica and its UI over the last few decades to construct many complex applications and prototypes.  (There is also the "Workbench", an Eclipse-based Mathematica IDE - I have not played with that.)

So, more or less, forty-some years of Lisp and Lisp-like language experience.

Key features of these languages: lambdas, rule processing, programs represented as data, all the other Lisp-ish features, and, most importantly, a lack of speed.

All that said, I have also used C professionally since about 1976 and C++ since the mid 1990s, as well as Python, Java, JavaScript, and many more.

The reason for this post is the disastrous direction "C++" has taken over the years.  Disastrous, you say?  Why, C++ is becoming the greatest thing in programming since sliced bread!  How can you say this?

As a for instance, let's take a look at this video:

The first part describes the explosion in "pages in the standard" from 500-ish to thousands over the years (I smell design by committee).  Not a good start.


The next chunk of video describes code of the form:

    #include <iostream>

    int f();   // some function that returns a new value on each call

    int main() {
        std::cout << f() << '\n';
        std::cout << f() << '\n';
        std::cout << f() << '\n';
        std::cout << f() << '\n';
    }

With a lengthy discussion of various ways (lambdas, generators, static variables) to configure the function f() to return sequential values.

    static int f_val = 0;

    int f() {
        return f_val++;   // return the current value, then advance the counter
    }

In the olden days the above function f() would accomplish this task.

However, the video takes you through ever more complex forms of the same functionality: first lambdas and finally a really complex "generator".
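To see just how far this goes, here is a minimal sketch of a hand-rolled C++20 coroutine generator for the same task.  This is my own version, not the one from the video (and std::generator proper only arrives in C++23), but the shape of the complexity is the same:

    #include <coroutine>
    #include <exception>
    #include <iostream>

    // A bare-bones generator type: all of this machinery exists solely to
    // let a function suspend itself and hand back one int at a time.
    struct IntGenerator {
        struct promise_type {
            int current_value{};
            IntGenerator get_return_object() {
                return IntGenerator{
                    std::coroutine_handle<promise_type>::from_promise(*this)};
            }
            std::suspend_always initial_suspend() { return {}; }
            std::suspend_always final_suspend() noexcept { return {}; }
            std::suspend_always yield_value(int v) {
                current_value = v;
                return {};
            }
            void return_void() {}
            void unhandled_exception() { std::terminate(); }
        };

        std::coroutine_handle<promise_type> handle;

        explicit IntGenerator(std::coroutine_handle<promise_type> h) : handle(h) {}
        IntGenerator(const IntGenerator&) = delete;
        ~IntGenerator() { if (handle) handle.destroy(); }

        // Resume the coroutine to its next co_yield and return the value.
        int next() {
            handle.resume();
            return handle.promise().current_value;
        }
    };

    // The generator itself: yields 0, 1, 2, ... forever.
    IntGenerator counter() {
        for (int i = 0;; ++i)
            co_yield i;
    }

    int main() {
        auto gen = counter();
        for (int n = 0; n < 4; ++n)
            std::cout << gen.next() << '\n';   // prints 0 1 2 3
    }

Roughly forty lines of machinery to do what the three-line static counter above does.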

All that said, the important takeaway is this: why keep adding ever more complex features to the language when it already has the necessary features to accomplish the task?

The "olden days" version generates code that looks like this on an Intel processor (using clang):

    push    rbp
    mov     rbp, rsp
    mov     eax, dword ptr [rip + f_val]
    mov     ecx, eax
    add     ecx, 1
    mov     dword ptr [rip + f_val], ecx
    pop     rbp
    ret

(Taken from Compiler Explorer at https://godbolt.org/)

We can use a lambda (one of many possible ways) to do the same thing:

    // inside an enclosing function, so f_val is a local captured by reference
    int f_val = 0;
    auto f = [&]() {
        return f_val++;
    };

which produces this code:

    push    rbp
    mov     rbp, rsp
    mov     qword ptr [rbp - 8], rdi
    mov     rax, qword ptr [rbp - 8]
    mov     rcx, qword ptr [rax]
    mov     eax, dword ptr [rcx]
    mov     edx, eax
    add     edx, 1
    mov     dword ptr [rcx], edx
    pop     rbp
    ret

Here we see seven instructions instead of four (counting only the body and ignoring the function prologue and epilogue), a 75% increase, while our language has grown in complexity, as measured by the size of the standard, by maybe 50% (at least up to the C++11 features used here).

Why is this okay?

Well, for one thing, processors today are insanely fast compared to the 1990s...  (A PDP-11/20, used in the early 1970s for Unix, had an instruction cycle time of perhaps 800ns; current processors have a cycle time of perhaps 1ns, as well as many advanced features like caches, and so forth.)

So we can be less efficient in our generated instructions and our thinking.  Yes, our thinking...

All the muddled C++ "standard" expansion is basically tied to the idea that it's "too hard" for the average Joe programmer to manage memory on his own - whether with malloc/free or new/delete.  Really?  Try figuring out some old, giant code base stuffed with "auto" that takes months to adequately decode before you can really make a knowledgeable fix.  How is this better?

So, to solve the problem, we turn a memory pointer language into one that tries to get the programmer to structure the code in such a way as to not need memory pointers.
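To be concrete, here is a minimal sketch of the trade (my example, not from the video): the explicit new/delete pair becomes a smart pointer whose scope does the freeing.

    #include <memory>

    struct Widget { int id = 0; };

    void old_style() {
        Widget* w = new Widget;   // explicit allocation
        w->id = 42;
        // ... use w ...
        delete w;                 // explicit deallocation - easy to forget
    }

    void new_style() {
        auto w = std::make_unique<Widget>();   // ownership encoded in the type
        w->id = 42;
        // ... use *w ...
    }   // freed automatically when w goes out of scope

The pointer is still there, of course; it is just hidden inside std::unique_ptr.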

The other sob story is that programmers need "efficiency" - Python and friends are too slow (being interpreted and garbage collected) - so, rather than re-evaluate our programming situation, we must move features of those languages into C++.

But let's weigh this against the real cost of software: maintenance over its lifetime.  Instead of one way of doing things, tedious and explicit as the old memory allocation may be, we have created generation after generation of standards "expansion".  The result: either a single application mixes forty-plus years of "coding standards", or it undergoes a continuous rewrite process in which the code base is constantly reworked to remove old "standard" code and replace it with new, less efficient "standard" code.

Today C++ aspires to have all the things that are "wrong" with Lisp: auto instead of explicit types, impossible-to-understand code (using auto, for example, to hide sins which only a compiler may know), explosive error messages (the same as losing a paren in Lisp), a thousand ways to do simple things, and on and on.

And some things that are not related to Lisp: an entire Turing-complete template model, interpreted by the compiler.
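The canonical demonstration of that template model - a compile-time factorial, computed entirely by recursive template instantiation (a classic sketch, not tied to any particular codebase):

    #include <iostream>

    // The compiler performs the recursion while instantiating templates.
    template <unsigned N>
    struct Factorial {
        static constexpr unsigned value = N * Factorial<N - 1>::value;
    };

    template <>
    struct Factorial<0> {   // base case stops the recursive instantiation
        static constexpr unsigned value = 1;
    };

    int main() {
        std::cout << Factorial<5>::value << '\n';   // 120, computed at compile time
    }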

Things like GPUs and graphics standards such as OpenGL and Vulkan make writing "portable" code impossible, because the libraries for these things are more complex than the base application.

Let's not forget WASM.  The new architecture to take your old, creaking C++ application into the modern operating system of a browser.

What's next in C++, piping like in shells?  Oh wait, it's already there!
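For the curious, a minimal sketch of what those pipes look like with C++20 ranges:

    #include <iostream>
    #include <ranges>

    int main() {
        // A shell-style pipeline: the first four even numbers from 0, 1, 2, ...
        for (int n : std::views::iota(0)
                   | std::views::filter([](int i) { return i % 2 == 0; })
                   | std::views::take(4))
            std::cout << n << '\n';   // prints 0 2 4 6
    }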

Perhaps a C++ REPL?

I can't wait!

I mentioned Mathematica at the start because, of all things, it does the best job of bringing everything you might want when developing complex applications into one place: code, graphical output, text formatting, documentation, handling complex math.  You can build prototypes quickly and with minimal effort, and they will run on any supported machine.

It's certainly no panacea but it's the most effective thing out there in many ways.

I think Mathematica needs a more Atom/Julia-like environment (though sadly Atom is being shelved in favor of VSCode) and a better interface to the volume of libraries out there (rather than just their own).

Things like VSCode are nice but they never quite work on large code bases.

No, instead of this ridiculous decades-long C++ "standards" effort, how about a global effort to design a real development environment that we, as humanity, can be proud of?

A language that has a debugger that always works.

An IDE and language that is designed to handle new hardware: GPU, AI.

A development environment that can handle large, complex at-scale systems.

A language that allows control over efficiency.

A language that runs "everywhere" - hardware, GPU, browser.

I think the focus on the language itself is, to a large degree, unimportant (sure, it needs to be readable).  What is important is a real development environment that is portable, easy to use, helpful, works across all platforms, and is designed to handle the life cycle of real software.


