I wrote this dumb little C++ test program to compare the use of simple STL vector stuff with a plain old C-style character array:
//
// main.cpp
// testtest
//

#ifdef BEST
#include <stdio.h>

// Plain C-style buffer: store a byte and bump the index.
void foo(char * x, size_t& ptr) {
    *(x + ptr++) = 23;
}

#define bar(a, b, c) *(a + b++) = c

void test()
{
    char a[100];
    size_t ptr = 0;
    foo(a, ptr);
    bar(a, ptr, 23);
}
#endif // BEST

#ifdef MIDDLE
#include <iostream>
#include <vector>
using namespace std;

// STL vector: push_back and bump the index.
void foo(vector<int>& x, size_t& ptr) {
    x.push_back(23);
    ptr++;
}

void test()
{
    vector<int> a{ 1, 3, 5 };
    size_t ptr = 0;
    foo(a, ptr);
}
#endif // MIDDLE

#ifdef WORST
#include <iostream>
#include <vector>
using namespace std;

// STL vector returned by value, then spliced into another vector.
vector<int> foo(void) {
    vector<int> somev;
    somev.push_back(23);
    return somev;
}

void test()
{
    vector<int> a{ 1, 3, 5 };
    vector<int> data = foo();
    a.insert(a.end(), data.begin(), data.end());
}
#endif // WORST

int main(int argc, const char * argv[]) {
    for (int i = 0; i < 1000000; i++) {
        test();
    }
}
If I run this code on a Mac M1 I get (using stock clang, etc.):
% clang -DBEST -std=c++14 -O2 testtest_all.cpp -o testtest_all -lstdc++
% time ./testtest_all
./testtest_all 0.00s user 0.00s system 1% cpu 0.231 total
% time ./testtest_all
./testtest_all 0.00s user 0.00s system 45% cpu 0.005 total
% time ./testtest_all
./testtest_all 0.00s user 0.00s system 54% cpu 0.005 total
% time ./testtest_all
./testtest_all 0.00s user 0.00s system 45% cpu 0.006 total
% clang -DWORST -std=c++14 -O2 testtest_worst.cpp -o testtest_worst -lstdc++
% time ./testtest_worst
./testtest_worst 0.09s user 0.00s system 30% cpu 0.290 total
% time ./testtest_worst
./testtest_worst 0.08s user 0.00s system 94% cpu 0.089 total
% time ./testtest_worst
./testtest_worst 0.08s user 0.00s system 97% cpu 0.086 total
% time ./testtest_worst
./testtest_worst 0.08s user 0.00s system 97% cpu 0.085 total
Feel free to plug the above C++ code into https://godbolt.org/ to see the gory details of the generated assembly (some of the timing output is shown above). (Note too that the slow initial run is most likely some first-run overhead on the M1, such as caching, rather than the loop itself, so ignore it for the purposes here.)
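As an aside, if you want to take the shell's `time` command (and that first-run noise) out of the picture, you can time the loop in-process instead. Here is a minimal sketch of what I mean; it assumes one of the test() variants above is compiled in, and the structure is my own, not part of the original test:

#include <chrono>
#include <cstdio>

void test();   // provided by whichever #ifdef block (BEST/MIDDLE/WORST) is enabled

int main() {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 1000000; i++) {
        test();
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double> elapsed = stop - start;
    printf("loop took %f seconds\n", elapsed.count());
}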
These results are quite interesting: I am paying roughly a 10x performance hit for the convenience of memory "safety", that is, for not running off the end of an array or something like that.
I am not worrying about cache lines or other esoteric stuff here; this is just a dumb, simple-minded performance test.
It surely seems like we are burning 10x the compute just so we can "write better code."
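One thing worth double-checking in a micro-benchmark like this is that the optimizer is not simply throwing the work away: with -O2 and a buffer that nobody ever reads, clang is free to elide most of the BEST loop (the 0.00s of user time hints at this). A sketch of one way to keep it honest; the volatile sink is my addition, not part of the test above:

#include <stdio.h>

volatile char sink;   // the compiler cannot prove writes feeding this are dead

void foo(char * x, size_t& ptr) {
    *(x + ptr++) = 23;
}

void test()
{
    char a[100];
    size_t ptr = 0;
    foo(a, ptr);
    foo(a, ptr);
    sink = a[0] + a[1];   // read the results back through a volatile so the writes must survive
}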
According to Quora (which is a dubious source, but a source nonetheless) I would estimate there are between half a trillion and a trillion processors running today's world.
Wikipedia thinks that about 10% of all human power use goes to computers (see this).
Let's estimate 2-3 watts, on average, per processor (most are small, many are not, so we guess).
I am sure this is not very precise, but it gives you the big-picture idea of how much computing costs in terms of global energy output.
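To make the back-of-envelope arithmetic explicit (the ~18 TW figure for total world power consumption is my own rough number; the rest are the guesses above):

#include <stdio.h>

int main() {
    double cpus_low   = 0.5e12;   // half a trillion processors (low guess)
    double cpus_high  = 1.0e12;   // a trillion processors (high guess)
    double watts_each = 2.5;      // 2-3 W average per processor, split the difference
    double world_tw   = 18.0;     // rough total world power consumption, in terawatts

    double low_tw  = cpus_low  * watts_each / 1e12;   // ~1.25 TW
    double high_tw = cpus_high * watts_each / 1e12;   // ~2.5 TW

    printf("computing: %.2f-%.2f TW, i.e. %.0f%%-%.0f%% of world power\n",
           low_tw, high_tw,
           100.0 * low_tw / world_tw, 100.0 * high_tw / world_tw);
}

That works out to roughly 1-3 terawatts, or somewhere around 7-14% of world power, which lands in the same ballpark as the ~10% figure above.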
Now let's imagine that we use inefficient coding systems (C++/STL) to write the code for all those processors.
If we fixed the code to be more efficient, i.e., removed STL as we saw in the tests above, we might be able to spend 1% of humanity's power on computers instead of 10%.
As it stands, that is 10% of the heat and corresponding power output of the entire world.
Seems like a steep price to pay for mere code-writing convenience.
C++ and STL seem like they are costing humanity a lot, just so nobody has to learn how to prevent memory leaks, or has to create some better language.
That means your "code base" of millions of lines is destroying the planet.