An international team of scientists based in Cambridge, Singapore and California last Friday (26 July 2002) announced the publication in Science of their work describing the sequencing and preliminary analysis of the genome of the Japanese pufferfish, Fugu rubripes.

This marks the first publicly funded vertebrate genome to be published after the human genome, and is an important milestone for groups using the genomes of backboned animals to decode the human genome. The sequence and annotations have been made freely available to all, without restrictions.

The pufferfish was chosen over a decade ago as a potential model vertebrate genome because of its compact properties. The decoding of the pufferfish sequence in the work presented in Science has not only confirmed these properties, but has also shown how comparisons between the genomes of backboned animals can illuminate the human genome.

This process, often compared to the decipherment of the Rosetta stone, is contributing to our understanding of the human genome sequence.

Some of the key findings include:

  • The team found that while the number of genes in the fish is approximately the same as in man, when human and pufferfish genes were compared directly, as many as a quarter of all human proteins could not be recognised in the pufferfish sequence. When the human genome sequence was released in 2000, the unexpectedly low number of genes was a key feature - commentators speculated that human complexity must arise from differences in gene splicing or gene expression. These comparisons show that evolution of the protein sequences themselves is a significant component of the differences between fish and man. Direct comparisons between animals in this way help to define the most rapidly evolving human proteins for further study.
  • Comparisons of the pufferfish sequence with the human sequence allowed the team to predict the existence of human genes which so far have not been found with other methods. Using the pufferfish genes they found evidence for approximately 900 human genes which had not been found in other databases. Calculations suggested that direct comparison of the fish and human sequence could yield more human genes, although the team emphasise that the final numbers are unlikely to be much beyond the 28,000-35,000 currently thought to be in the human genome.
  • The work also revealed how the order of genes in the two genomes is shuffled over time. While shuffling of the order of genes along the chromosomes is known to occur, this study has revealed in detail for the first time the extent of this reordering. The study showed that many small groups of a few genes are found in the same order in man and fish, but over long distances the order of genes is scrambled.
  • The group also reported a number of unexpected findings, for example the presence of a relative handful of "giant" genes: genes which, unlike the majority of compact Fugu genes, appear bigger than their human counterparts.

The approach used to obtain the sequence of this animal was similar to that used by Celera in obtaining the human genome sequence: the so-called "whole genome shotgun" (WGS) method, in which DNA fragments are sequenced at random and their order is then assembled in a computer, without first making ordered segments or maps of the genome. There has been great controversy about the effectiveness of this method. This report shows that, at least for this complex genome, the WGS approach can produce genome sequence suitable for analysis. The great advantage is speed and cost: the consortium took only a few months to obtain all the sequence and estimate that the project cost a mere $12 million.
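As a rough illustration of the idea of assembling randomly sequenced fragments by computer, the toy sketch below merges short reads by their longest overlaps. It is not the consortium's assembler: the function names `overlap` and `greedy_assemble`, the `min_len` threshold and the example reads are all hypothetical, and the sketch assumes error-free reads from a single strand. Real assemblers also have to cope with sequencing errors, repeated sequence and both DNA strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy greedy assembly: repeatedly merge the best-overlapping pair of reads.

    Assumes error-free, same-strand reads (an assumption made for illustration).
    Returns the remaining contigs once no pair overlaps by `min_len` or more.
    """
    # Drop exact duplicates, then reads wholly contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]

    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Three made-up overlapping fragments, supplied in no particular order.
    fragments = ["CGTTAGC", "ATGGCGTA", "CGTACGTT"]
    print(greedy_assemble(fragments))
```

The point of the sketch is that no map of the genome is needed beforehand: the order of the fragments is inferred purely from the sequence they share.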

Dr Samuel Aparicio, whose group is based in the Cambridge University Department of Oncology, is the lead author of the study. Dr Aparicio, who also holds a visiting professorship with the IMCB in Singapore, was a founder member of the team led by Sydney Brenner which over a decade ago proposed the pufferfish as a model genome. Dr Aparicio's group are working in close collaboration with colleagues in Singapore to use the pufferfish sequence to find regulatory elements in the mouse and human genomes.

"It is very exciting to have completed a whole vertebrate genome," said Dr Aparicio. "The availability of complete genome sequences is proving a major step forward in helping us to focus our genome-wide screens for regulatory elements in mouse and human."

The team plan to 'finish' the genome on the same timescale as the human genome, by 26 April 2003, and in the interim will release sequence updates and improved sequence annotations on a regular basis throughout the project.

