“Omg. u no how 2 do the bio hw?”
Texting uses a peculiar alphabet. It keeps
messages brief but still encodes enough
meaning for students to communicate about homework, coffee dates and
crushes — all while accommodating the
occasional typo.
The genetic alphabet, the letters used
as the blueprint for all life, balances brevity and clarity in a similar way. Just four
letters combine to spell out the more
than five dozen three-letter words that
encode the information needed to make
all the cells in the human body, and any
other body as well.
Figuring out how life’s code came to be
is nature’s original homework problem,
and it isn’t easy: It’s like studying how
people in Paris talk today to determine
what the first Latin alphabet must have
been. Attempts at deciphering the code’s
origins are also complicated by the fact
that, unless another iteration of life turns
up, say on a distant planet, scientists have
only one version to study.
“We only have one experiment, and
it’s extremely hard to repeat this experiment,” says physicist Tsvi Tlusty of
the Weizmann Institute of Science in
Rehovot, Israel.
Without a way to replicate life’s earliest
days, coming up with a theory to explain
the code is like interpreting a Rorschach
test. But that doesn’t stop scientists
from trying. Some are now finding that
outside pressures, such as a need to minimize error, may have driven the code to
evolve the same way as texting — through
en masse trial and error. Chemical attractions between molecules, others report,
could have set the code’s destiny.
“These discoveries make the whole
field legitimate rather than a matter of
pure speculation,” says molecular biologist Eugene Koonin of the National
Center for Biotechnology Information
in Bethesda, Md.
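The genetic alphabet’s four letters are the bases adenine (A), thymine (T), cytosine (C) and guanine (G). With sugar and a phosphate, these molecules form the basic units of DNA,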
called nucleotides. And the letters provide instructions for making proteins,
essential cellular players that kick-start
chemical reactions, serve as scaffolding
and act as messengers.
To make a protein, the two strands of
the double helix–shaped DNA unwind,
and a complementary strand called messenger RNA forms alongside one of the
exposed arms of the helix. Messenger
RNA is a copy of DNA with chemically
similar letters (though messenger RNA
has a U for uracil instead of thymine).
This RNA is moved to a cellular factory
called a ribosome where, through a process called translation, the RNA gets
decoded and proteins are constructed.
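The letter swapping behind messenger RNA can be sketched in a few lines of Python. This is only an illustration of the pairing rule, not a model of the cell's machinery: the function name `transcribe` and the sample strand are made up for the example, and the sketch ignores the direction in which the strands are read.

```python
# Pairing rule for building messenger RNA from an exposed DNA strand.
# RNA uses U (uracil) where DNA would use T (thymine).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna_strand: str) -> str:
    """Return the messenger RNA letters that pair with a DNA strand.

    Hypothetical helper for illustration; strand direction is ignored.
    """
    return "".join(COMPLEMENT[base] for base in dna_strand.upper())


# A made-up six-letter stretch of DNA.
print(transcribe("TACGCT"))
```

Each DNA letter is replaced by its partner, so the RNA is a mirror-image copy rather than a letter-for-letter duplicate.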
Every three nucleotides in the messenger
RNA spell a genetic word, called
a codon, that codes for a specific amino
acid. In the ribosome, amino acids are
linked to form protein molecules, the
way words come together into sentences.
The codon for the amino acid
methionine often serves as a start signal,
while three other codons serve as punctuation
marks, telling protein construction
to “stop.”
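The reading process described above can be sketched as a short Python routine. It is a simplified illustration rather than a description of what a ribosome really does: the name `translate` and the sample message are invented, amino acids are written with their standard one-letter abbreviations (M for methionine, R for arginine, F for phenylalanine), and `*` marks a stop codon.

```python
from itertools import product

# The standard genetic code, listed in U, C, A, G order for each of the
# three codon positions. "*" marks the three stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon (hypothetical helper)."""
    start = mrna.find("AUG")  # methionine codon used as the start signal
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # punctuation mark: stop building
            break
        protein.append(amino)
    return "".join(protein)


# Made-up message: two stray letters, then AUG CGA UUU UAA.
print(translate("GGAUGCGAUUUUAAGG"))
```

Reading starts at AUG, moves three letters at a time and halts at UAA, so the sample message yields methionine, arginine and phenylalanine.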
With an alphabet of four letters, 64
three-letter codons are possible. Yet
cells make their proteins from only
20 amino acids.
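The arithmetic is easy to check by brute force. The Python sketch below simply lists every possible word of a given length; the function name `count_words` is invented for the illustration.

```python
from itertools import product

BASES = "UCAG"  # the four letters of messenger RNA


def count_words(length: int) -> int:
    """Count the distinct words of a given length over the four letters."""
    return sum(1 for _ in product(BASES, repeat=length))


for length in (1, 2, 3):
    print(length, count_words(length))
```

Two-letter words would give only 16 combinations, too few to name 20 amino acids, while three-letter words give 64, leaving room to spare.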
Francis Crick, codiscoverer of the DNA
double helix, proposed in 1968 that the
code for these 20 amino acids was a “frozen accident,”
working well enough to get
passed down through generations like an
old tradition. If so, on a second go-round
life could look completely different, and
any life found elsewhere in the universe
would probably be unfamiliar as well. But
some of the alphabet’s features seem too
good to be an accident, so scientists have
tried to find logic beyond pure luck.
Some of the words, for example, act like
synonyms. Three or four codons, usually
identical except for one letter, can stand
for the same amino acid, just like hi and
hey both mean “hello.” This feature can
protect cells from errors: If messenger
RNA’s CGA mistakenly becomes CGU,
for example, the cell still selects the same
amino acid (in this case, arginine).
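One rough way to see how much protection the synonyms offer is to try every possible one-letter change and count how often the meaning survives. The Python sketch below does this for the standard code. The names `synonyms` and `silent_fraction` are invented for the example, and treating "stop" as just another word is a simplifying assumption.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group the codons that spell the same amino acid (or stop).
synonyms = defaultdict(list)
for codon, amino in CODON_TABLE.items():
    synonyms[amino].append(codon)


def silent_fraction(position: int) -> float:
    """Share of one-letter changes at a codon position that keep the meaning."""
    silent = total = 0
    for codon, amino in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            silent += CODON_TABLE[mutant] == amino
    return silent / total


print(sorted(synonyms["R"]))  # all the codons for arginine
for position in range(3):
    print(position + 1, round(silent_fraction(position), 3))
```

Under this counting, about two-thirds of changes to the third letter leave the amino acid untouched, while changes to the first or second letter almost always alter it.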
But even if a mistranslation leads to the
wrong amino acid, the code is arranged so
that the product will be chemically similar to the intended one. This logic isn’t
generally a rule in English: You’d have
trouble doing your homework with a
“hen” if you really needed a “pen.”
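Chemical similarity can be probed in the same spirit. The Python sketch below is a rough illustration under stated assumptions, not the method used by the researchers quoted here: it scores similarity with the Kyte-Doolittle hydropathy index, a common measure of how strongly an amino acid avoids water, and the names `neighbors` and `mean_shift` are invented for the example.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index (higher means more water-avoiding).
# Chosen for this sketch; other similarity measures would work too.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def neighbors(codon: str, position: int) -> list:
    """Amino acids reached by changing one letter of a codon."""
    return [CODON_TABLE[codon[:position] + base + codon[position + 1:]]
            for base in BASES if base != codon[position]]


def mean_shift(position: int) -> float:
    """Average hydropathy change for swaps that alter the amino acid."""
    shifts = []
    for codon, amino in CODON_TABLE.items():
        if amino == "*":
            continue
        for other in neighbors(codon, position):
            if other not in ("*", amino):
                shifts.append(abs(HYDROPATHY[amino] - HYDROPATHY[other]))
    return sum(shifts) / len(shifts)


print(neighbors("UUU", 0))  # first-letter neighbors of phenylalanine
for position in range(3):
    print(position + 1, round(mean_shift(position), 2))
```

For phenylalanine's UUU, a mistake in the first letter gives leucine, isoleucine or valine, all of which sit on the water-avoiding side of the scale along with phenylalanine itself.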
Related neighbors: The genetic code used by all life on Earth maps 64 three-letter words
to 20 corresponding amino acids and a stop signal, which serves as a punctuation mark. Since
similar amino acids are coded by similar three-letter words (degree of shading represents similarity, with “stop” signals in brown), some researchers think a pressure to reduce the havoc brought
about by errors may have been an important driving force during the code’s early development.
UUU Phenylalanine  UCU Serine     UAU Tyrosine       UGU Cysteine
UUC Phenylalanine  UCC Serine     UAC Tyrosine       UGC Cysteine
UUA Leucine        UCA Serine     UAA Terminate      UGA Terminate
UUG Leucine        UCG Serine     UAG Terminate      UGG Tryptophan

CUU Leucine        CCU Proline    CAU Histidine      CGU Arginine
CUC Leucine        CCC Proline    CAC Histidine      CGC Arginine
CUA Leucine        CCA Proline    CAA Glutamine      CGA Arginine
CUG Leucine        CCG Proline    CAG Glutamine      CGG Arginine

AUU Isoleucine     ACU Threonine  AAU Asparagine     AGU Serine
AUC Isoleucine     ACC Threonine  AAC Asparagine     AGC Serine
AUA Isoleucine     ACA Threonine  AAA Lysine         AGA Arginine
AUG Methionine     ACG Threonine  AAG Lysine         AGG Arginine

GUU Valine         GCU Alanine    GAU Aspartic acid  GGU Glycine
GUC Valine         GCC Alanine    GAC Aspartic acid  GGC Glycine
GUA Valine         GCA Alanine    GAA Glutamic acid  GGA Glycine
GUG Valine         GCG Alanine    GAG Glutamic acid  GGG Glycine

Adapted by Janel Kiley
Source: E. Koonin and A. Novozhilov/Life 2009
www.sciencenews.org
February 12, 2011 | Science News | 19