Genomics and the Human Genome Project
Genomes and genomics
A 'comb' for analysing DNA sequences. Each of the 97 slots can hold a different DNA sample
The study of the genomes of living things is known as genomics. It involves carefully analysing the genome to identify the position, structure and role of every gene. The simplest living things - bacteria - have small genomes. The genome of the bacterium Escherichia coli, one of many bacteria that live in the human gut, contains about 4 million pairs of bases. The human genome is several hundred times bigger, at about 3 billion pairs of bases. Genomics involves working with some very big numbers!
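To put those two numbers side by side, here is a minimal sketch in Python. It uses only the rounded figures quoted above, so the result is an approximation rather than an exact measurement, and the variable names are ours, chosen for illustration.

```python
# Rounded genome sizes in base pairs, as quoted in the text (approximate).
E_COLI_BASE_PAIRS = 4_000_000
HUMAN_BASE_PAIRS = 3_000_000_000

# How many times bigger is the human genome than the E. coli genome?
ratio = HUMAN_BASE_PAIRS / E_COLI_BASE_PAIRS
print(f"The human genome is about {ratio:.0f} times bigger than that of E. coli.")
```

With these rounded figures the ratio comes out in the hundreds.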
The human genome
Your genes are made of a chemical called DNA. DNA is a special chemical, because it contains a code - the genetic code - that is made up from four different bases or 'letters': adenine (A), thymine (T), guanine (G) and cytosine (C).
The human genome is the total DNA in a complete set of human chromosomes: that is, 22 pairs of ordinary chromosomes (or 'autosomes') and a pair of sex chromosomes (X and Y). (See DNA, genes and chromosomes for more detail.)
The human genome. (A photograph showing the chromosomes like this is called a 'karyotype'.)
The Human Genome Project (HGP)
In the Human Genome Project (HGP), scientists from around the world have worked together to decode the sequence of the entire human genome, that is, the complete sequence of As, Ts, Cs and Gs in the DNA molecules that make up every gene on every chromosome. The sequence was finished in April 2003, and the results are available over the internet. The scientists are still trying to identify the position and role of every gene, and it will take quite some time to work out what the 20,000 to 25,000 genes already discovered actually do.
The logo of the Human Genome Project
Scientists use a technique called DNA sequencing to read the order of the bases in the genetic code. It is a huge task. There are over 3 billion (3,000 million) bases to work out. They start by cutting the long threads of DNA into overlapping sections, to form a collection of fragments called a gene library. They analyse the sequence of the bases in these fragments in the laboratory, and then use powerful computers to work out the complete sequence. It's like a very complicated - and very big - jigsaw puzzle.
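The jigsaw idea can be tried out on a tiny scale. The sketch below is a toy, not the method the HGP scientists actually used: it takes a few short, overlapping fragments and repeatedly joins the two that overlap the most. The function names (overlap, assemble) and the example fragments are invented for illustration, and the sketch assumes the fragments contain no reading errors and that no fragment sits wholly inside another.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest end of `left` that matches the start of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments: list[str]) -> str:
    """Greedily join the two fragments with the biggest overlap until one is left."""
    if not fragments:
        raise ValueError("need at least one fragment")
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_size, best_i, best_j = -1, 0, 1
        # Try every ordered pair and remember the pair that overlaps the most.
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i != j:
                    size = overlap(left, right)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        # Join the best pair, keeping the shared bases only once.
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces[0]


# Three made-up fragments, given out of order.
fragments = ["CCTAGGA", "GGATCC", "GTGACCT"]
print(assemble(fragments))  # prints GTGACCTAGGATCC
```

Each pass compares every fragment with every other, which is fine for three fragments but far too slow for millions. That is one reason the real task needs powerful computers and more sophisticated methods.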
An example DNA sequence. The order of the bases in this section of DNA starts GTGA. Can you work out the rest of the sequence?
On its own, of course, the order of the bases doesn't tell us much. The next step is to work out how these bases are grouped into genes, and to find out where those genes are located on the chromosomes. It seems as though only about 3 per cent of our DNA is used for genes. It's not clear what the rest does, although there is some evidence that it may play a part in controlling gene activity.
The final, and most difficult, step will be to work out what these genes do, and the role they play in our bodies. We know that genes code for proteins, and that proteins do important jobs in our cells and bodies. The hope is that understanding this process in more detail may help us to develop new treatments for a range of diseases.
This work is licensed under a Creative Commons Licence.