a cracking combination

1
© 2001 Macmillan Magazines Ltd HIGHLIGHTS NATURE REVIEWS | GENETICS VOLUME 2 | DECEMBER 2001 | 915 As genome sequencing and high- throughput functional genomics approaches generate more and more data, researchers need new ways to tease out biologi- cally relevant information. Two important advances in bioinformatics techniques — both of which rely on the com- bination of data sets — are now reported that allow better predic- tions of gene and protein function. Ge et al. have shown how gene- expression and protein–protein inter- action data can be integrated to refine predictions of genetic interactions. And Dietmann and Holm have devised an improved computational method for detecting remote homol- ogy on the basis of structural similar- ities between proteins. Ge et al. began their study by investigating whether genes that are co-expressed at the transcriptional level are more likely to encode pro- teins that interact with each other. They tested this in yeast, using tran- scriptional microarray data and a combination of genome-wide two- hybrid data, as well as individual pro- tein data from literature searches. In general, clusters of co-expressed genes do indeed have a higher chance of giving a positive score in the two- hybrid assays and in experimental data derived from the literature. This lends weight to the idea that co- expression and protein interaction are both useful predictors of func- tional interaction. More importantly, the combination of these two types of data can also provide new insights into functionally related groups of genes. The authors provide an exam- ple in which a group of genes that is involved in a network of stress- response protein–protein interactions is divided into two co-expression clusters. This refines the view of how the genes interact and leads to the development of new, testable hypotheses. Important clues about function can come from protein structure, especially if the proteins in question are distantly related and their amino- acid sequences have diverged. Dietmann and Holm combine sequence, structural and functional information to create a tree of protein structures, in which the most structurally similar proteins lie on the same branch of the tree. This approach overcomes some of the shortcomings of other bioinfor- matics tools. Their analysis can also infer homology relationships between proteins and group them into ‘superfamilies’. The results com- pare very favourably with a protein classification system that is manually curated by experts (SCOP). Finally, the authors show that their automat- ed system could be used to predict the superfamily membership, and there- fore a putative function, of a new pro- tein structure that might be deter- mined as part of a structural genomics project. Such computational approaches are essential if we are to predict and understand complex genetic net- works from data generated by high- throughput genomic studies. And as the experimental technologies improve and generate more compre- hensive and complementary data sets, the prospects for further combi- natorial approaches to data analysis will be bright. Magdalena Skipper References and links ORIGINAL RESEARCH PAPERS Ge, H. et al. Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nature Genet. 10.1038/ng776 (2001) | Dietmann, S. & Holm, L. Identification of homology in protein structure classification. Nature Struct. Biol. 8, 953–957 (2001) FURTHER READING Lichtarge, O. Getting past appearances: the many-fold consequences of remote homology. Nature Struct. Biol. 8, 918–920 (2001) | Brenner, S. E. A tour of structural genomics. Nature Rev. Genet. 2, 801–809 (2001) A cracking combination FUNCTIONAL GENOMICS IN BRIEF Random monoallelic expression of three genes clustered within 60 kb of mouse t complex genomic DNA. Sano, Y. et al. Genome Res. 15, 1833–1841 (2001) There are three mechanisms of gene-dosage control in mammals: random X inactivation, parent-of-origin autosomal gene imprinting and random autosomal inactivation. This last is very rare, but Sano et al. now report a cluster of three genes Nubp2, Igfals and Jsap1 — on mouse chromosome 17 that undergo this process. In single cells, these genes show X-like, random monoallelic expression, but can switch between active and inactive states during cell division — monoallelic expression correlates with the 50% methylation status of the genomic region. The authors discuss possible mechanisms for such gene expression patterns and their involvement in the biology of the t complex. An apolipoprotein influencing triglycerides in humans and mice revealed by comparative sequencing. Pennacchio, L. A. et al. Science 294, 169–173 (2001) Apolipoproteins are known to affect plasma lipid levels in humans an important factor in susceptibility to heart disease. Pennacchio et al. explored the already known apolipoprotein gene cluster on chromosome 11 for new susceptibility loci. By comparing mouse and human sequence, they identified a new locus, which when knocked out in mice leads to a substantial increase in plasma lipid levels, but when overexpressed causes these levels to drop below wild-type levels. This marked effect prompted the authors to look for SNPs in the human locus all three rare alleles that they studied were associated with high lipid levels. Reciprocal mouse and human limb phenotypes caused by gain- and loss-of-function mutations affecting Lmbr1. Clark, R. M. et al. Genetics 159, 715–726 (2001) Most of the dominantly inherited preaxial polydactly and syndactyly phenotypes, which affect limb-digit number, map to human chromosome 7q36 and to a homologous region in mice. By using deletion chromosomes, these authors show that limb defects that map to this region in mice result from gain-of-function mutations, rather than by haploinsufficiency. Mice with loss-of-function mutations in limb region 1 (Lmbr1), which might be allelic to the known limb morphology mutants Hemimelic extra toes and Hammertoe, have fewer limb digits than normal. Because this phenotype is reciprocal to polydactly, the authors propose that levels of Lmbr1 activity control the morphology of the vertebrate limb skeleton. DEVELOPMENTAL BIOLOGY HUMAN GENETICS GENE EXPRESSION

Upload: magdalena

Post on 26-Jul-2016

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: A cracking combination

© 2001 Macmillan Magazines Ltd

H I G H L I G H T S

NATURE REVIEWS | GENETICS VOLUME 2 | DECEMBER 2001 | 915

As genome sequencing and high-throughput functional genomicsapproaches generate more andmore data, researchers neednew ways to tease out biologi-cally relevant information.Two important advances inbioinformatics techniques —both of which rely on the com-bination of data sets — are nowreported that allow better predic-tions of gene and protein function.Ge et al. have shown how gene-expression and protein–protein inter-action data can be integrated to refinepredictions of genetic interactions.And Dietmann and Holm havedevised an improved computationalmethod for detecting remote homol-ogy on the basis of structural similar-ities between proteins.

Ge et al. began their study byinvestigating whether genes that areco-expressed at the transcriptionallevel are more likely to encode pro-teins that interact with each other.They tested this in yeast, using tran-scriptional microarray data and acombination of genome-wide two-hybrid data, as well as individual pro-tein data from literature searches. Ingeneral, clusters of co-expressedgenes do indeed have a higher chanceof giving a positive score in the two-hybrid assays and in experimentaldata derived from the literature. Thislends weight to the idea that co-expression and protein interactionare both useful predictors of func-tional interaction. More importantly,the combination of these two types ofdata can also provide new insightsinto functionally related groups ofgenes. The authors provide an exam-ple in which a group of genes that isinvolved in a network of stress-response protein–protein interactionsis divided into two co-expressionclusters. This refines the view of howthe genes interact and leads to the development of new, testablehypotheses.

Important clues about functioncan come from protein structure,especially if the proteins in questionare distantly related and their amino-

acid sequenceshave diverged. Dietmann and Holmcombine sequence, structural andfunctional information to create atree of protein structures, in whichthe most structurally similar proteinslie on the same branch of the tree.This approach overcomes some ofthe shortcomings of other bioinfor-matics tools. Their analysis can alsoinfer homology relationshipsbetween proteins and group theminto ‘superfamilies’. The results com-pare very favourably with a proteinclassification system that is manuallycurated by experts (SCOP). Finally,the authors show that their automat-ed system could be used to predict thesuperfamily membership, and there-fore a putative function, of a new pro-tein structure that might be deter-mined as part of a structuralgenomics project.

Such computational approachesare essential if we are to predict andunderstand complex genetic net-works from data generated by high-throughput genomic studies. And asthe experimental technologiesimprove and generate more compre-hensive and complementary datasets, the prospects for further combi-natorial approaches to data analysiswill be bright.

Magdalena Skipper

References and linksORIGINAL RESEARCH PAPERS Ge, H. et al.Correlation between transcriptome andinteractome mapping data from Saccharomycescerevisiae. Nature Genet. 10.1038/ng776 (2001) |Dietmann, S. & Holm, L. Identification of homologyin protein structure classification. Nature Struct.Biol. 8, 953–957 (2001)FURTHER READING Lichtarge, O. Getting pastappearances: the many-fold consequences ofremote homology. Nature Struct. Biol. 8, 918–920(2001) | Brenner, S. E. A tour of structuralgenomics. Nature Rev. Genet. 2, 801–809 (2001)

A cracking combination

F U N C T I O N A L G E N O M I C S IN BRIEF

Random monoallelic expression of three genesclustered within 60 kb of mouse t complex genomic DNA.Sano, Y. et al. Genome Res. 15, 1833–1841 (2001)

There are three mechanisms of gene-dosage control inmammals: random X inactivation, parent-of-origin autosomalgene imprinting and random autosomal inactivation. This lastis very rare, but Sano et al. now report a cluster of three genes— Nubp2, Igfals and Jsap1 — on mouse chromosome 17 thatundergo this process. In single cells, these genes show X-like,random monoallelic expression, but can switch between activeand inactive states during cell division — monoallelicexpression correlates with the 50% methylation status of thegenomic region. The authors discuss possible mechanisms forsuch gene expression patterns and their involvement in thebiology of the t complex.

An apolipoprotein influencing triglycerides in humans and mice revealed by comparativesequencing.Pennacchio, L. A. et al. Science 294, 169–173 (2001)

Apolipoproteins are known to affect plasma lipid levels inhumans an important factor in susceptibility to heartdisease. Pennacchio et al. explored the already knownapolipoprotein gene cluster on chromosome 11 for newsusceptibility loci. By comparing mouse and human sequence,they identified a new locus, which when knocked out in miceleads to a substantial increase in plasma lipid levels, but whenoverexpressed causes these levels to drop below wild-typelevels. This marked effect prompted the authors to look forSNPs in the human locus all three rare alleles that theystudied were associated with high lipid levels.

Reciprocal mouse and human limb phenotypescaused by gain- and loss-of-function mutationsaffecting Lmbr1. Clark, R. M. et al. Genetics 159, 715–726 (2001)

Most of the dominantly inherited preaxial polydactly andsyndactyly phenotypes, which affect limb-digit number, mapto human chromosome 7q36 and to a homologous region inmice. By using deletion chromosomes, these authors show that limb defects that map to this region in mice result fromgain-of-function mutations, rather than by haploinsufficiency.Mice with loss-of-function mutations in limb region 1(Lmbr1), which might be allelic to the known limbmorphology mutants Hemimelic extra toes and Hammertoe,have fewer limb digits than normal. Because this phenotype is reciprocal to polydactly, the authors propose that levels of Lmbr1 activity control the morphology of the vertebratelimb skeleton.

D E V E LO P M E N TA L B I O LO G Y

H U M A N G E N E T I C S

G E N E E X P R E S S I O N