
Wednesday, May 2, 2012

Human Genetics - Barking up the Wrong Tree

Yesterday I wrote about how bacteria are symbiotic with the human body.  I also described how complex this system is, including not only the trillions of bacteria but also the systems the bacteria use for communication.

Today I came across this article at the WSJ.  It describes how human genetic mapping has "identified" 4,000 diseases with a "known genetic basis."

The article links, for example, various gene sequences such as BRCA2 to various risks of disease, e.g., breast, prostate, and pancreatic cancer.

Now, to be sure, genetic disruptions and mutations can directly cause a variety of human problems as well as death.

But this article is not talking about direct, causal linkages.  Instead it talks about risk.

I talked about medical risk in "Cholesterol, Heart Disease and Magical Thinking."

So here I'd like to take a somewhat different approach to this idea of genes and medical risk.

Like medicine, computer science can involve the study of complex systems, albeit on a much smaller scale.

For example, there are many very large software systems in the world, some with tens or hundreds of millions of lines of code (source here, m = millions of lines):

    Lucent 5ESS Switch        100m   
    Windows Vista              50m  
    Red Hat Linux 7.1          30m 
    Windows XP                 40m    
    Visual Studio              40m  
    MS Office                  30m 

Now these are just estimates, but the orders of magnitude are what's important.

We can think of the "source code" here as the DNA of the software system.  During the execution of the program there are various pieces of information stored in the computer's memory and on the file system - these are the "expression" of the software, which is kind of like the genes expressing things like arms or brains or whatever.  Input from the environment, such as user-created Word documents, is like the input your body receives from the outside world.

Now this is a very crude analogy but an important one, I think, when one looks at the issue of complexity.

Humans are hugely more complex because of a few basic factors:

1) Genes express proteins and the assembly of proteins - software typically does not express constructs used to build other constructs in the same way.  There are some similarities, for example software that constructs abstract notions, e.g., a linked list, and then other code that uses the linked list.  But it's not quite the same.

2) Software systems (at least those described above) are not self-assembling.  Humans assemble them directly.

3) Humans are four-dimensional physical constructs described by DNA, and DNA is more like a blueprint for a machine than a machine in and of itself. Software applications, on the other hand, are effectively two-dimensional and less machine-like.  (Taking out "time," software can be thought of as storage (one dimension) and program (another dimension).  Humans are three-dimensional objects described by a one-dimensional linear DNA sequence.)

Now at a code level human DNA is about 3 billion bases, a sequence of the letters G, T, A and C.

So that's about thirty times more basic "code" than the largest software application listed above (3 billion bases against 100 million lines).

Basically an order of magnitude but not two.

So though DNA itself is, as raw code, not much more complex than a very large software application, it's effectively a blueprint for a self-replicating machine as opposed to a mathematical description of algorithms.  For that reason I will argue that DNA is at least five orders of magnitude more "information dense" than software.  It is also physically "information dense" in the sense that one human cell is tiny compared to a thumb drive that could store 100m lines of code (again by orders of magnitude).
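As a rough check on the raw size comparison, here is a minimal sketch of the arithmetic.  It assumes the commonly cited figure of roughly 3 billion bases for the human genome, and it treats one base as comparable to one line of code, which is as crude as the analogy itself.  The names in it are only for illustration.

```python
import math

# Assumption: the human genome is roughly 3 billion bases (commonly cited figure).
GENOME_BASES = 3_000_000_000

# Line-count estimates taken from the list of large systems in this post.
SYSTEMS = {
    "Lucent 5ESS Switch": 100_000_000,
    "Windows Vista": 50_000_000,
    "Windows XP": 40_000_000,
    "Visual Studio": 40_000_000,
    "Red Hat Linux 7.1": 30_000_000,
    "MS Office": 30_000_000,
}


def size_ratio(bases, lines):
    """How many DNA bases there are per line of code."""
    return bases / lines


def orders_of_magnitude(ratio):
    """Express a ratio as a power of ten (1.0 means ten times)."""
    return math.log10(ratio)


for name, lines in SYSTEMS.items():
    ratio = size_ratio(GENOME_BASES, lines)
    print(f"{name:<20} {ratio:6.0f}x  "
          f"({orders_of_magnitude(ratio):.1f} orders of magnitude)")
```

Under those assumptions the genome comes out a few tens of times larger than the biggest system in the list, which is more than one order of magnitude but less than two.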

So what do we, as humans, know about fixing or repairing something like the Lucent 5ESS Switch?

Well, for one thing, we don't go about it the way doctors go about understanding human genetics.

No one goes into the code and randomly makes changes here or there in order to fix a symptom.

So let's imagine an example:  Suppose this system (which is used by the phone companies to switch voice calls over trunk lines - see this) has a problem - say a set of files on disk is not deleted at a certain point during the operation of the system.

Now in the medical world of allopathy the "solution" would be to simply go in, like with surgery, and remove the offending files through some manual means.

See - now the files are gone.  But what about what's causing the problem?

While we might search the code base looking for references to where these files are created, without the context of how the system operates we would be in the same position as doctors and modern medicine looking at genes.

For example, I might search for "unlink" commands in the source code (these remove files on 3B2 systems).  There might be 10,000 or 50,000 occurrences of them - or there may be only one, because everything else goes through a library of code that calls unlink.

So what can we deduce from the single instance of an unlink (which I argue is like looking at one gene like BRCA2)?
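To make the point about searching concrete, here is a small sketch in Python.  The three "source files" and the wrapper name remove_temp are made up for illustration and have nothing to do with the real 5ESS code.  The sketch only shows how a text search for unlink can find a single hit even though files get deleted from several places.

```python
import re

# Hypothetical miniature code base: file name -> source text.
# The file names and remove_temp() are invented for this example.
SOURCES = {
    "fileutil.c": "int remove_temp(const char *p) { return unlink(p); }\n",
    "billing.c": "void close_call(void) { remove_temp(cdr_path); }\n",
    "trunk.c": "void reset(void) { remove_temp(lock_path); remove_temp(log_path); }\n",
}


def count_calls(sources, function_name, skip=()):
    """Count textual call sites of function_name, ignoring files listed in skip."""
    pattern = re.compile(r"\b" + re.escape(function_name) + r"\s*\(")
    return sum(
        len(pattern.findall(text))
        for name, text in sources.items()
        if name not in skip
    )


# What a plain search for "unlink" finds.
direct = count_calls(SOURCES, "unlink")

# Where files are really deleted: every call to the wrapper.
# fileutil.c is skipped because it holds the wrapper's definition, not a call.
indirect = count_calls(SOURCES, "remove_temp", skip=("fileutil.c",))

print(f"direct unlink calls: {direct}")
print(f"calls that end up in unlink via the wrapper: {indirect}")
```

The search hit by itself says nothing about who deletes what, or when.  You only learn that by knowing how the pieces of the system call each other.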

Not much.

BRCA2 is about 81 thousand bases long.  pdfExpress, Lexigraph's product, is about five times that size (404 thousand lines of code), and it uses 206 unlink commands.
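For anyone who wants to check the numbers, the comparison can be worked out in a few lines.  This uses only the figures quoted in this post and again treats one base as roughly one line of code, which is an assumption of the analogy and not a measured equivalence.

```python
# Figures quoted in this post.
BRCA2_BASES = 81_000        # length of BRCA2 in bases
PDFEXPRESS_LINES = 404_000  # lines of code in pdfExpress
UNLINK_CALLS = 206          # unlink commands in pdfExpress

# How much larger pdfExpress is than BRCA2, base-for-line.
size_multiple = PDFEXPRESS_LINES / BRCA2_BASES

# On average, how many lines of code sit between one unlink and the next.
lines_per_unlink = PDFEXPRESS_LINES / UNLINK_CALLS

print(f"pdfExpress is {size_multiple:.1f} times the size of BRCA2")
print(f"one unlink per {lines_per_unlink:.0f} lines of code")
```

So even in a program only a few times the size of one gene, a single unlink is one small spot among a couple of thousand lines of surrounding context.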

So medicine is looking at a large, complex entity that plays a role in many aspects of the human body and comparing it to diseases which have other, non-genetic components, e.g., iodine and breast cancer.

In our 5ESS we cannot just look at the files that are not being deleted - we need to understand why they are not being deleted.  And while we can remove them they are probably going to reappear (like a cancer) until we figure out what's wrong.
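Here is a toy version of that situation, as a sketch only.  The function handle_call stands in for some piece of a system that writes a scratch file and, when buggy, forgets to delete it.  All of the names are hypothetical.  It compares the "surgery" (deleting the leftovers by hand) with fixing the code path that leaves them behind.

```python
import os
import tempfile


def handle_call(workdir, cleanup):
    """Hypothetical unit of work: creates a scratch file in workdir.

    The file is deleted only when cleanup is True; cleanup=False is the bug.
    """
    fd, path = tempfile.mkstemp(dir=workdir)
    os.close(fd)
    if cleanup:
        os.unlink(path)


def remove_leftovers(workdir):
    """The 'surgery': delete whatever files are lying around right now."""
    for name in os.listdir(workdir):
        os.unlink(os.path.join(workdir, name))


def simulate(calls, fixed):
    """Run the buggy system, clean up by hand, then keep running.

    Returns (files left right after the manual cleanup,
             files left after the system has run some more).
    """
    with tempfile.TemporaryDirectory() as workdir:
        for _ in range(calls):
            handle_call(workdir, cleanup=False)   # the bug at work
        remove_leftovers(workdir)                 # symptom removed
        after_surgery = len(os.listdir(workdir))
        for _ in range(calls):
            handle_call(workdir, cleanup=fixed)   # system keeps running
        later = len(os.listdir(workdir))
        return after_surgery, later


print("surgery only:      ", simulate(5, fixed=False))
print("root cause fixed:  ", simulate(5, fixed=True))
```

In both runs the directory is clean right after the manual cleanup.  Only in the run where the code itself was fixed does it stay clean.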

So what I am saying here is that medical science is not thinking like systems analysts working on a new, unknown software problem.  Certainly new hires are brought in to work on the 5ESS software, and they must be trained in two things: how it works outwardly, i.e., what it does at the "outer level," corresponding to what a healthy and unhealthy human might look like; and how it works inwardly, i.e., what the pathways inside the working system are, what they do and so on, like your veins, arteries, nerves and bones.

You don't just let the new hires go in and hack away unknowingly at what they do not understand.

Personally I think there is a missing human science in "debugging" and learning about complex systems (I described this in "Through the Keyhole").

Medical science is bogus for several reasons with regard to "understanding" genetics:

1) Even if someone gave me the 100 million lines of 5ESS software code, without understanding the context of its use, i.e., the switching and OS hardware that goes with it, what could I really understand?

2) If I don't have the proper environmental context to look at the code, i.e., not only the wires, power, cooling, physical connections, etc., but also how someone configures and sets it up, what happens while it's running, and so on, I could never hope to understand it, let alone fix it.

So these basic problems, let alone the fact that DNA is many orders of magnitude "more sophisticated" in terms of size and information density, make the attempt look even more foolish.

Now let's throw in that DNA builds a conscious self-replicating machine that can live one hundred years.  Software and programming as a technology have only been around for maybe 70 years (or maybe 150, depending on how you count).

So these medical scientists are using basic biology to try to understand a complex DNA blueprint for a conscious, self-replicating three-dimensional physical object.

Worse, they think about it at the most basic molecular level.

This is a systems problem.

While we need to understand the basic DNA and biological pieces and parts, we also have to have a high-level systems understanding of what these things do.

And that's where the failure is.

BRCA2 is 81 thousand base pairs of highly dense information describing a portion of a self-assembling conscious machine.

Humans don't understand consciousness (and this is important because of the placebo effect, which shows the mind affects the body) or self-replicating machines (though people are just fooling around now trying to build them).

We also don't fully understand all the physics or molecular biology involved.

I argue these medical scientists also ignore environmental factors like nutrition.

So all this boils down to the fact that we're barking up the wrong tree here.  We need to go "back to school" and understand basic science, biology, physics, and genetics from the ground up first before diving into fixing something like BRCA2.

The real question is this: given the complexity and density of genetic information, and all the rest I describe here - can a human mind really "understand" it all well enough to do anything useful?

(Yes you might say - look at antibiotics or surgery - I argue we're just deleting files that aren't supposed to be there as opposed to understanding anything...)
