Breaking the DNA Code November 28, 2006
Posted by Dr. Bertalan Meskó in genetics, Invention.
The genome is written with four types of nucleotides (uracil, the fifth, occurs only in RNA, so we can set it aside here). If you look at the raw sequence of a gene, you can't simply read off the protein it will be translated into. But now, Professor Simon Shepherd at the University of Bradford has constructed an algorithm that can unpick the sequences of adenines, guanines, cytosines and thymines.
Professor Shepherd originally tested his computer programme on the entire text of Emma by Jane Austen after removing all the spaces and punctuation, leaving just a long impenetrable line of letters. Despite having no knowledge of the English vocabulary or syntax, the programme managed to identify 80 per cent of the words and separate them back into sentences.
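To make the Emma test a bit more concrete, here is a minimal Python sketch of the two steps around the algorithm: stripping a text down to one unbroken line of letters, and scoring how many of the original words a guessed segmentation recovers. It does not reproduce Professor Shepherd's method, which is not described here. The function names and the sample guess are my own assumptions, used only for illustration.

```python
def strip_spaces_and_punctuation(text):
    """Collapse a text into one unbroken run of lowercase letters."""
    return "".join(ch.lower() for ch in text if ch.isalpha())


def word_spans(words):
    """Return the (start, end) character offsets of each word in sequence."""
    spans, pos = set(), 0
    for word in words:
        spans.add((pos, pos + len(word)))
        pos += len(word)
    return spans


def fraction_of_words_recovered(true_words, guessed_words):
    """Share of true words whose start and end both match the guess.

    This scoring rule is an assumption; the source only says that
    80 per cent of the words were identified.
    """
    true_spans = word_spans(true_words)
    guessed_spans = word_spans(guessed_words)
    return len(true_spans & guessed_spans) / len(true_spans)


sentence = "Emma Woodhouse, handsome, clever, and rich"
letters = strip_spaces_and_punctuation(sentence)

true_words = ["emma", "woodhouse", "handsome", "clever", "and", "rich"]
# Hypothetical output of some segmenter that wrongly splits "woodhouse".
guessed_words = ["emma", "wood", "house", "handsome", "clever", "and", "rich"]

score = fraction_of_words_recovered(true_words, guessed_words)
```

In this toy case the guess gets five of the six words right, because splitting "woodhouse" in two breaks only that one word.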
In the case of a genome, the problem is much more complicated, as we don't even know exactly how to define a sentence. There are introns, which are transcribed but spliced out before translation, and protein folding itself is one of the most difficult processes to predict. Maybe, after further development, this method can be applied to a genetic sequence.
We are treating DNA as we used to treat problems in intelligence. We want to break the code at the most fundamental level… The protein folding problem is regarded as one of the three grand challenge problems of 21st century science. Its resolution is crucial to the development of the new drugs and medical therapies that the Human Genome project promises one day to deliver.
Although results will not happen overnight, we can expect to see the promise of the Human Genome project bearing fruit within the next 20 to 50 years.
I’m just wondering what the other two grand challenge problems of 21st century science might be. Any comments are welcome.
References:
- DNA Code Breaker Tested Theory On Jane Austen Text
- Nature (pg 259, Vol. 444, 16 November 2006)
- Evolving code wiki

Wow, now that is very cool. There is so much information locked up in the genome that we don’t know how to decode.
I’m not sure what the other two problems are…perhaps finding a so-called theory of everything or at least a quantum theory of gravity?
BTW, I have plenty of concerns about it. The genome is not a text; it is much more complicated. What about introns? Protein folding is also more complicated than the grammar of a language. We’ll see. So much work left to do…