Sanger Institute - Publications 2006

Number of papers published in 2006: 264

  • Renin enhancer is critical for control of renin gene expression and cardiovascular function.

    Adams DJ, Head GA, Markus MA, Lovicu FJ, van der Weyden L, Köntgen F, Arends MJ, Thiru S, Mayorov DN and Morris BJ

    School of Medical Sciences and Bosch Institute, University of Sydney, Sydney, New South Wales 2006, Australia.

    The important cardiovascular regulator renin contains a strong in vitro enhancer 2.7 kb upstream of its gene. Here we tested the in vivo role of the mouse Ren-1c enhancer. In renin-expressing As4.1 cells stably transfected with Ren-1c promoter with or without enhancer, expression of linked beta-geo reporter, stable expression, and colony formation were dependent on the presence of the enhancer. We then generated mice carrying a targeted deletion of the enhancer (REKO mice) and found marked depletion of renin in renal juxtaglomerular and submandibular ductal cells, as well as hyperplasia of macula densa cells. Plasma creatinine was increased, but electrolytes were normal. Male REKO mice implanted with telemetry devices had 9 +/- 1 mm Hg lower mean arterial pressure (p < 0.001), which was partly normalized by a high NaCl diet. Locomotor activity was lower, and baroreflex sensitivity was normal. Markedly reduced mean arterial pressure variability in the midfrequency band indicated a contribution of reduced sympathetic vasomotor tone to the hypotension. In conclusion, the renin enhancer is critical for renin gene expression and physiological sequelae, including response to alteration in salt intake. The REKO mouse may be useful as a low renin expression model.

    The Journal of biological chemistry 2006;281;42;31753-61

  • Non-DNA binding, dominant-negative, human PPARgamma mutations cause lipodystrophic insulin resistance.

    Agostini M, Schoenmakers E, Mitchell C, Szatmari I, Savage D, Smith A, Rajanayagam O, Semple R, Luan J, Bath L, Zalin A, Labib M, Kumar S, Simpson H, Blom D, Marais D, Schwabe J, Barroso I, Trembath R, Wareham N, Nagy L, Gurnell M, O'Rahilly S and Chatterjee K

    Department of Medicine, University of Cambridge, United Kingdom.

    PPARgamma is essential for adipogenesis and metabolic homeostasis. We describe mutations in the DNA and ligand binding domains of human PPARgamma in lipodystrophic, severe insulin resistance. These receptor mutants lack DNA binding and transcriptional activity but can translocate to the nucleus, interact with PPARgamma coactivators and inhibit coexpressed wild-type receptor. Expression of PPARgamma target genes is markedly attenuated in mutation-containing versus receptor haploinsufficient primary cells, indicating that such dominant-negative inhibition operates in vivo. Our observations suggest that these mutants restrict wild-type PPARgamma action via a non-DNA binding, transcriptional interference mechanism, which may involve sequestration of functionally limiting coactivators.

    Funded by: Wellcome Trust: 080237

    Cell metabolism 2006;4;4;303-11

  • Autophagy occurs upstream or parallel to the apoptosome during histolytic cell death.

    Akdemir F, Farkas R, Chen P, Juhasz G, Medved'ová L, Sass M, Wang L, Wang X, Chittaranjan S, Gorski SM, Rodriguez A and Abrams JM

    Department of Cell Biology, UT Southwestern Medical Center, Dallas, TX 75390, USA.

    Histolysis refers to a widespread disintegration of tissues that is morphologically distinct from apoptosis and often associated with the stimulation of autophagy. Here, we establish that a component of the apoptosome, and pivotal regulator of apoptosis, is also required for histolytic cell death. Using in vivo and ex vivo assays, we demonstrate a global apoptogenic requirement for dark, the fly ortholog of Apaf1, and show that a required focus of dark(-) organismal lethality maps to the central nervous system. We further demonstrate that the Dark protein itself is a caspase substrate and find that alterations of this cleavage site produced the first hypermorphic point mutation within the Apaf1/Ced-4 gene family. In a model of 'autophagic cell death', dark was essential for histolysis but dispensable for characteristic features of the autophagic program, indicating that the induction of autophagy occurs upstream or parallel to histolytic cell death. These results demonstrate that stimulation of autophagy per se is not a 'killing event' and, at the same time, establish that common effector pathways, regulated by the apoptosome, can underlie morphologically distinct forms of programmed cell death.

    Funded by: NIGMS NIH HHS: GM072124, R01 GM072124-14A1

    Development (Cambridge, England) 2006;133;8;1457-65

  • Evolutionary history of the Coccolithoviridae.

    Allen MJ, Schroeder DC, Holden MT and Wilson WH

    Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth, United Kingdom.

    We recently determined the genome sequence of the Coccolithoviridae strain Emiliania huxleyi virus 86 (EhV-86), a giant double-stranded DNA (dsDNA) algal virus from the family Phycodnaviridae that infects the marine coccolithophorid E. huxleyi. Here, we determine the phylogenetic relationship between EhV-86 and other large dsDNA viruses. Twenty-five core genes common to nuclear-cytoplasmic large dsDNA virus genomes were identified in the EhV-86 genome; sequences from eight of these genes were used to create a phylogenetic tree in which EhV-86 was placed firmly with the two other members of the Phycodnaviridae. We have also identified a 100-kb region of the EhV-86 genome which appears to have transferred into this genome from an unknown source. Furthermore, the presence of six RNA polymerase subunits (unique among the Phycodnaviridae) suggests both a unique evolutionary history and a unique lifestyle for this intriguing virus.

    Molecular biology and evolution 2006;23;1;86-92

  • High throughput protein expression screening in the nervous system--needs and limitations.

    Anderson CN and Grant SG

    Genes to Cognition programme, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    The cellular complexity of the brain (some estimate that there are up to 10^3 different cell types) is exceeded by the synaptic complexity, with each of the approximately 10^11 neurons in the brain having around 10^3-10^4 synapses. Proteomic studies of the synapse have revealed that the postsynaptic density is the most complex multiprotein structure yet identified, with approximately 10^3 different proteins. Such studies, however, use brain tissue with many different regions and therefore different cell types, and there is clear potential for heterogeneity of protein content at different synapses within and between brain regions. Although large-scale mRNA-based assays are in progress to map this sort of complexity at the cellular level, and indeed all brain-expressed genes, analysis of protein distribution (at synapses and other structures) is still in the very early stages. We review existing large-scale protein expression studies and the specific technical obstacles that need to be overcome before applying the scaling used in nucleic acid based approaches.

    Funded by: Wellcome Trust

    The Journal of physiology 2006;575;Pt 2;367-72

  • Karyotype relationships of six bat species (Chiroptera, Vespertilionidae) from China revealed by chromosome painting and G-banding comparison.

    Ao L, Gu X, Feng Q, Wang J, O'Brien PC, Fu B, Mao X, Su W, Wang Y, Volleth M, Yang F and Nie W

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, The Chinese Academy of Sciences, Kunming, PR China.

    The Vespertilionidae is the largest family in the order Chiroptera and has a worldwide distribution in the temperate and tropical regions. In order to further clarify the karyotype relationships at the lower taxonomic level in Vespertilionidae, genome-wide comparative maps have been constructed between Myotis myotis (MMY, 2n = 44) and six vesper bats from China: Myotis altarium (MAL, 2n = 44), Hypsugo pulveratus (HPU, 2n = 44), Nyctalus velutinus (NVE, 2n = 36), Tylonycteris robustula (TRO, 2n = 32), Tylonycteris sp. (TSP, 2n = 30) and Miniopterus fuliginosus (MFU, 2n = 46) by cross-species chromosome painting with a set of painting probes derived from flow-sorted chromosomes of Myotis myotis. Each Myotis myotis autosomal probe detected a single homologous chromosomal segment in the genomes of these six vesper bats except for MMY chromosome 3/4 paint which hybridized onto two chromosomes in the genome of M. fuliginosus. Our results show that Robertsonian translocation is the main mode of karyotype evolution in Vespertilionidae and that the addition of heterochromatic material also plays an important role in the karyotypic evolution of the genera Tylonycteris and Nyctalus. Two conserved syntenic associations (MMY9 + 23 and 18 + 19) could be the synapomorphic features for the genus Tylonycteris. The integration of our maps with the published maps has enabled us to deduce chromosomal homologies between human and these six vesper bats and provided new insight into the karyotype evolution of the family Vespertilionidae.

    Cytogenetic and genome research 2006;115;2;145-53

  • Reconstructing protein complexes: from proteomics to systems biology.

    Armstrong JD, Pocklington AJ, Cumiskey MA and Grant SG

    School of Informatics, University of Edinburgh, Edinburgh, UK.

    Modern high throughput technologies in biological science often create lists of interesting molecules. The challenge is to reconstruct a descriptive model from these lists that reflects the underlying biological processes as accurately as possible. Once we have such a model or network, what can we learn from it? Specifically, given that we are interested in some biological process associated with the model, what new properties can we predict and subsequently test? Here, we describe, at an introductory level, a range of bioinformatics techniques that can be systematically applied to proteomic datasets. When combined, these methods give us a global overview of the network and the properties of the proteins and their interactions. These properties can then be used to predict functional pathways within the network and to examine substructure. To illustrate the application of these methods, we draw upon our own work concerning a complex of 186 proteins found in neuronal synapses in mammals. The techniques discussed are generally applicable and could be used to examine lists of proteins involved with the biological response to electric or magnetic fields.

    Proteomics 2006;6;17;4724-31

  • Gene Ontology annotation status of the fission yeast genome: preliminary coverage approaches 100%.

    Aslett M and Wood V

    Wellcome Trust Sanger Institute, Cambridge CB10 1HH, UK.

    In this review, we present an overview of the Gene Ontology (GO) structure and describe how the GO is implemented for Sz. pombe and made available via Sz. pombe GeneDB. We give a detailed progress report of Sz. pombe GO annotation, providing the current status of both manual and automatic annotations. Fission yeast has at least one GO annotation for 98.3% of its genes (excluding annotations to 'unknown' terms), greater than the current percentage coverage for any other organism. Approximately 65% (3225 gene products) have at least one annotation to each of the three ontologies (biological process, cellular component and molecular function). Approximately 30% (1443 gene products) have GO terms derived directly from small-scale experiments in fission yeast, supporting the validity of fission yeast as a model eukaryote and a reference organism.

    Yeast (Chichester, England) 2006;23;13;913-9

  • Genetic analysis of the LGI/Epitempin gene family in sporadic and familial lateral temporal lobe epilepsy.

    Ayerdi-Izquierdo A, Stavrides G, Sellés-Martínez JJ, Larrea L, Bovo G, López de Munain A, Bisulli F, Martí-Massó JF, Michelucci R, Poza JJ, Tinuper P, Stephani U, Striano P, Striano S, Staub E, Sarafidou T, Hinzmann B, Moschonas N, Siebert R, Deloukas P, Nobile C and Pérez-Tur J

    Unitat de Genètica Molecular, Dept. de Genòmica i Proteòmica, Institut de Biomedicina de València - CSIC, Jaume Roig, 11. E46010 València, Spain.

    Mutations in the LGI1/Epitempin gene cause autosomal dominant lateral temporal lobe epilepsy (ADLTE), a partial epilepsy characterized by the presence of auditory seizures. However, not all the pedigrees with a phenotype consistent with ADLTE show mutations in LGI1/Epitempin, or evidence for linkage to the 10q24 locus. Other authors as well as ourselves have found an internal repeat (EPTP, pfam# PF03736) that allowed the identification of three other genes sharing a sequence and structural similarity with LGI1/Epitempin. In this work, we present the sequencing of these genes in a set of ADLTE families without LGI1/Epitempin mutations and in sporadic cases. No analyzed polymorphisms modified susceptibility in either the familial or sporadic forms of this partial epilepsy.

    Funded by: Telethon: GGP02339

    Epilepsy research 2006;70;2-3;118-26

  • Autoregulation of ribosome biosynthesis by a translational response in fission yeast.

    Bachand F, Lackner DH, Bähler J and Silver PA

    Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Québec, Canada.

    Maintaining the appropriate balance between the small and large ribosomal subunits is critical for translation and cell growth. We previously identified the 40S ribosomal protein S2 (rpS2) as a substrate of the protein arginine methyltransferase 3 (RMT3) and reported a misregulation of the 40S/60S ratio in rmt3 deletion mutants of Schizosaccharomyces pombe. For this study, using DNA microarrays, we have investigated the genome-wide biological response of rmt3-null cells to this ribosomal subunit imbalance. Whereas little change was observed at the transcriptional level, a number of genes showed significant alterations in their polysomal-to-monosomal ratios in rmt3Delta mutants. Importantly, nearly all of the 40S ribosomal protein-encoding mRNAs showed increased ribosome density in rmt3 disruptants. Sucrose gradient analysis also revealed that the ribosomal subunit imbalance detected in rmt3-null cells is due to a deficit in small-subunit levels and can be rescued by rpS2 overexpression. Our results indicate that rmt3-null fission yeast compensate for the reduced levels of small ribosomal subunits by increasing the ribosome density, and likely the translation efficiency, of 40S ribosomal protein-encoding mRNAs. Our findings support the existence of autoregulatory mechanisms that control ribosome biosynthesis and translation as an important layer of gene regulation.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Molecular and cellular biology 2006;26;5;1731-42

  • Synaptic Ras GTPase activating protein regulates pattern formation in the trigeminal system of mice.

    Barnett MW, Watson RF, Vitalis T, Porter K, Komiyama NH, Stoney PN, Gillingwater TH, Grant SG and Kind PC

    Centre for Integrative Physiology, University of Edinburgh, Edinburgh EH8 9XD, United Kingdom.

    The development of ordered connections or "maps" within the nervous system is a common feature of sensory systems and is crucial for their normal function. NMDA receptors are known to play a key role in the formation of these maps; however, the intracellular signaling pathways that mediate the effects of glutamate are poorly understood. Here, we demonstrate that SynGAP, a synaptic Ras GTPase activating protein, is essential for the anatomical development of whisker-related patterns in the developing somatosensory pathways in rodent forebrain. Mice lacking SynGAP show only partial segregation of barreloids in the thalamus, and thalamocortical axons segregate into rows but do not form whisker-related patches. In cortex, layer 4 cells do not aggregate to form barrels. In Syngap(+/-) animals, barreloids develop normally, and thalamocortical afferents segregate in layer 4, but cell segregation is retarded. SynGAP is not necessary for the development of whisker-related patterns in the brainstem. Immunoelectron microscopy for SynGAP from layer 4 revealed a postsynaptic localization with labeling in developing postsynaptic densities (PSDs). Biochemically, SynGAP associates with the PSD in a PSD-95-independent manner, and Psd-95(-/-) animals develop normal barrels. These data demonstrate an essential role for SynGAP signaling in the activity-dependent development of whisker-related maps selectively in forebrain structures indicating that the intracellular pathways by which NMDA receptor activation mediates map formation differ between brain regions and developmental stage.

    Funded by: Wellcome Trust

    The Journal of neuroscience : the official journal of the Society for Neuroscience 2006;26;5;1355-65

  • Meta-analysis of the Gly482Ser variant in PPARGC1A in type 2 diabetes and related phenotypes.

    Barroso I, Luan J, Sandhu MS, Franks PW, Crowley V, Schafer AJ, O'Rahilly S and Wareham NJ

    The Wellcome Trust Sanger Institute, Metabolic Disease Group, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Aims/hypothesis: Peroxisome proliferator-activated receptor-gamma co-activator-1alpha (PPARGC1A) is a transcriptional co-activator with a central role in energy expenditure and glucose metabolism. Several studies have suggested that the common PPARGC1A polymorphism Gly482Ser may be associated with risk of type 2 diabetes, with conflicting results. To clarify the role of Gly482Ser in type 2 diabetes and related human metabolic phenotypes we genotyped this polymorphism in a case-control study and performed a meta-analysis of relevant published data.

    Methods: Gly482Ser was genotyped in a type 2 diabetes case-control study (N=1,096) using MassArray technology. A literature search revealed publications that examined Gly482Ser for association with type 2 diabetes and related metabolic phenotypes. Meta-analysis of the current study and relevant published data was undertaken.

    Results: In the pooled meta-analysis, including data from this study and seven published reports (3,718 cases, 4,818 controls), there was evidence of between-study heterogeneity (p<0.1). In the fixed-effects meta-analysis, the pooled odds ratio for risk of type 2 diabetes per Ser482 allele was 1.07 (95% CI 1.00-1.15, p=0.044). Elimination of one of the studies from the meta-analysis gave a summary odds ratio of 1.11 (95% CI 1.04-1.20, p=0.004), with no between-study heterogeneity (p=0.475). For quantitative metabolic traits in normoglycaemic subjects, we also found significant between-study heterogeneity. However, no significant association was observed between Gly482Ser and BMI, fasting glucose or fasting insulin.

    Conclusions/interpretation: This meta-analysis of data from the current and published studies supports a modest role for the Gly482Ser PPARGC1A variant in type 2 diabetes risk.

    Funded by: Wellcome Trust

    Diabetologia 2006;49;3;501-5

  • Chromosome localization of microsatellite markers in the shrews of the Sorex araneus group.

    Basset P, Yannic G, Yang F, O'Brien PC, Graphodatsky AS, Ferguson-Smith MA, Balmus G, Volobouev VT and Hausser J

    Department of Ecology and Evolution, Lausanne University, Biophore, 1015, Lausanne, Switzerland.

    The extremely high rate of karyotypic evolution that characterizes the shrews of the Sorex araneus group makes this group an exceptionally interesting model for population genetics and evolutionary studies. Here, we attempted to map 46 microsatellite markers at the chromosome arm level using flow-sorted chromosomes from three karyotypically different taxa of the Sorex araneus group (S. granarius and the chromosome races Cordon and Novosibirsk of S. araneus). The most likely localizations were provided for 35 markers, among which 25 were each unambiguously mapped to a single locus on the corresponding chromosomes in the three taxa, covering the three sex chromosomes (XY1Y2) and nine of the 18 autosomal arms of the S. araneus group. The results provide further evidence for a high degree of conservation in genome organization in the S. araneus group despite the presence of numerous Robertsonian rearrangements. These markers can therefore be used to compare the genetic structure among taxa of the S. araneus group at the chromosome level and to study the role of chromosomal rearrangements in the genetic diversification and speciation process of this group.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2006;14;3;253-62

  • Genetic analysis of the capsular biosynthetic locus from all 90 pneumococcal serotypes.

    Bentley SD, Aanensen DM, Mavroidi A, Saunders D, Rabbinowitsch E, Collins M, Donohoe K, Harris D, Murphy L, Quail MA, Samuel G, Skovsted IC, Kaltoft MS, Barrell B, Reeves PR, Parkhill J and Spratt BG

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    Several major invasive bacterial pathogens are encapsulated. Expression of a polysaccharide capsule is essential for survival in the blood, and thus for virulence, but also is a target for host antibodies and the basis for effective vaccines. Encapsulated species typically exhibit antigenic variation and express one of a number of immunochemically distinct capsular polysaccharides that define serotypes. We provide the sequences of the capsular biosynthetic genes of all 90 serotypes of Streptococcus pneumoniae and relate these to the known polysaccharide structures and patterns of immunological reactivity of typing sera, thereby providing the most complete understanding of the genetics and origins of bacterial polysaccharide diversity, laying the foundations for molecular serotyping. This is the first time, to our knowledge, that a complete repertoire of capsular biosynthetic genes has been available, enabling a holistic analysis of a bacterial polysaccharide biosynthesis system. Remarkably, the total size of alternative coding DNA at this one locus exceeds 1.8 Mbp, almost equivalent to the entire S. pneumoniae chromosomal complement.

    Funded by: Wellcome Trust

    PLoS genetics 2006;2;3;e31

  • Sequence analysis of the protein kinase gene family in human testicular germ-cell tumors of adolescents and adults.

    Bignell G, Smith R, Hunter C, Stephens P, Davies H, Greenman C, Teague J, Butler A, Edkins S, Stevens C, O'Meara S, Parker A, Avis T, Barthorpe S, Brackenbury L, Buck G, Clements J, Cole J, Dicks E, Edwards K, Forbes S, Gorton M, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jones D, Kosmidou V, Laman R, Lugg R, Menzies A, Perry J, Petty R, Raine K, Shepherd R, Small A, Solomon H, Stephens Y, Tofts C, Varian J, Webb A, West S, Widaa S, Yates A, Gillis AJ, Stoop HJ, van Gurp RJ, Oosterhuis JW, Looijenga LH, Futreal PA, Wooster R and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, United Kingdom.

    The protein kinase gene family is the most frequently mutated in human cancer. Previous work has documented activating mutations in the KIT receptor tyrosine kinase in testicular germ-cell tumors (TGCT). To investigate further the potential role of mutated protein kinases in the development of TGCT and to characterize the prevalence and patterns of point mutations in these tumors, we have sequenced the coding exons and splice junctions of the annotated protein kinase family of 518 genes in a series of seven seminomas and six nonseminomas. Our results show a remarkably low mutation frequency, with only a single somatic point mutation, a K277E mutation in the STK10 gene, being identified in a total of more than 15 megabases of sequence analyzed. Sequencing of STK10 in an additional 40 TGCTs revealed no further mutations. Comparative genomic hybridization and LOH analysis using SNP arrays demonstrated that the 13 TGCTs mutationally screened through the 518 protein kinase genes were uniformly aneuploid with consistent chromosomal gains on 12p, 8q, 7, and X and losses on 13q, 18q, 11q, and 4q. Our results do not provide evidence for a mutated protein kinase implicated in the development of TGCT other than KIT. Moreover, they demonstrate that the general prevalence of point mutations in TGCT is low, in contrast to the high frequency of copy number changes.

    Funded by: Wellcome Trust

    Genes, chromosomes & cancer 2006;45;1;42-6

  • Functional variation and evolution of non-coding DNA.

    Bird CP, Stranger BE and Dermitzakis ET

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK.

    The focus of large genomic studies has shifted from only looking at genes and protein-coding sequences to exploring the full set of elements in each genome. The explosion of comparative sequencing data has led to an increase in methodologies, approaches and ideas on how to analyze the unknown fraction of the genome, namely the non-protein-coding fraction. The main issues relate to the discovery, evolutionary analysis and natural variation of non-coding DNA, and the parameters that prevent us from fully understanding the properties of non-coding DNA.

    Current opinion in genetics & development 2006;16;6;559-64

  • Ensembl 2006.

    Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Gräf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kähäri A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, Parker A, Proctor G, Prlic A, Rae M, Rios D, Redmond S, Schuster M, Sealy I, Searle S, Severin J, Slater G, Smedley D, Smith J, Stabenau A, Stalker J, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C and Hubbard TJ

    European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    The Ensembl project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased from 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.

    Funded by: Biotechnology and Biological Sciences Research Council: BBS/B/13446, BBS/B/13470; Wellcome Trust

    Nucleic acids research 2006;34;Database issue;D556-61

  • RNA editing of human microRNAs.

    Blow MJ, Grocock RJ, van Dongen S, Enright AJ, Dicks E, Futreal PA, Wooster R and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: MicroRNAs (miRNAs) are short RNAs of around 22 nucleotides that regulate gene expression. The primary transcripts of miRNAs contain double-stranded RNA and are therefore potential substrates for adenosine to inosine (A-to-I) RNA editing.

    Results: We have conducted a survey of RNA editing of miRNAs from ten human tissues by sequence comparison of PCR products derived from matched genomic DNA and total cDNA from the same individual. Six out of 99 (6%) miRNA transcripts from which data were obtained were subject to A-to-I editing in at least one tissue. Four out of seven edited adenosines were in the mature miRNA and were predicted to change the target sites in 3' untranslated regions. For a further six miRNAs, we identified A-to-I editing of transcripts derived from the opposite strand of the genome to the annotated miRNA. These miRNAs may have been annotated to the wrong genomic strand.

    Conclusion: Our results indicate that RNA editing increases the diversity of miRNAs and their targets, and hence may modulate miRNA function.

    Funded by: Wellcome Trust

    Genome biology 2006;7;4;R27

  • Just one cross appears capable of dramatically altering the population biology of a eukaryotic pathogen like Toxoplasma gondii.

    Boyle JP, Rajasekar B, Saeij JP, Ajioka JW, Berriman M, Paulsen I, Roos DS, Sibley LD, White MW and Boothroyd JC

    Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA.

    Toxoplasma gondii, an obligate intracellular protozoan of the phylum Apicomplexa, is estimated to infect over a billion people worldwide as well as a great many other mammalian and avian hosts. Despite this ubiquity, the vast majority of human infections in Europe and North America are thought to be due to only three genotypes. Using a genome-wide analysis of single-nucleotide polymorphisms, we have constructed a genealogy for these three lines. The data indicate that types I and III are second- and first-generation offspring, respectively, of a cross between a type II strain and one of two ancestral strains. An extant T. gondii strain (P89) appears to be the modern descendant of the non-type II parent of type III, making the full genealogy of the type III clonotype known. The simplicity of this family tree demonstrates that even a single cross can lead to the emergence and dominance of a new clonal genotype that completely alters the population biology of a sexual pathogen.

    Funded by: NIAID NIH HHS: AI045806, AI05093, AI21423, AI41014, F32AI60306, R01 AI036629

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;27;10514-9

  • Evidence against a major genetic basis for combined breast and colorectal cancer susceptibility.

    Brinkman H, Barwell J, Rose S, Tinworth L, Sodha N, Langman C, Brooks L, Payne S, Fisher S, Rowan A, Tomlinson I and Hodgson S

    Clinical genetics 2006;70;6;526-9

  • Notch ligands with contrasting functions: Jagged1 and Delta1 in the mouse inner ear.

    Brooker R, Hozumi K and Lewis J

    Vertebrate Development Laboratory, Cancer Research UK London Research Institute, 44 Lincoln's Inn Fields, London WC2A 3PX, UK.

    Each of the sensory patches in the epithelium of the inner ear is a mosaic of hair cells and supporting cells. Notch signalling is thought to govern this pattern of differentiation through lateral inhibition. Recent experiments in the chick suggest, however, that Notch signalling also has a prior function - inductive rather than inhibitory - in defining the prosensory patches from which the differentiated cells arise. Several Notch ligands are expressed in each patch, but their individual roles in relation to the two functions of Notch signalling are unclear. We have used a Cre-LoxP approach to knock out two of these ligands, Delta1 (Dll1) and Jagged1 (Jag1), in the mouse ear. In the absence of Dll1, auditory hair cells develop early and in excess, in agreement with the lateral inhibition hypothesis. In the absence of Jag1, by contrast, the total number of these cells is strongly reduced, with complete loss of cochlear outer hair cells and some groups of vestibular hair cells, indicating that Jag1 is required for the prosensory inductive function of Notch. The number of cochlear inner hair cells, however, is almost doubled. This correlates with loss of expression of the cell cycle inhibitor p27(Kip1) (Cdkn1b), suggesting that signalling by Jag1 is also needed to limit proliferation of prosensory cells, and that there is a core part of this population whose prosensory character is established independently of Jag1-Notch signalling. Our findings confirm that Notch signalling in the ear has distinct prosensory and lateral-inhibitory functions, for which different ligands are primarily responsible.

    Development (Cambridge, England) 2006;133;7;1277-86

  • Hedgehog lipid modifications are required for Hedgehog stabilization in the extracellular matrix.

    Callejo A, Torroja C, Quijada L and Guerrero I

    Centro de Biología Molecular Severo Ochoa, CSIC, Universidad Autónoma de Madrid, Cantoblanco, Spain.

    The Hedgehog (Hh) family of morphogenetic proteins has important instructional roles in metazoan development. Despite Hh being modified by Ct-cholesterol and Nt-palmitate adducts, Hh migrates far from its site of synthesis and programs cellular outcomes, depending on its local concentrations. We show that in the receiving cells of the Drosophila wing imaginal disc, lipid-unmodified Hh spreads across many more cell diameters than the wild type and this spreading leads to the activation of low but not high threshold responses. Unlipidated Hh forms become internalized through the apical plasma membrane, while wild-type Hh enters through the basolateral cell surface - in all cases via a dynamin-dependent mechanism. Full activation of the Hh pathway and the spread of Hh throughout the extracellular matrix depend on the ability of lipid-modified Hh to interact with heparan sulfate proteoglycans (HSPG). However, neither Hh-lipid modifications nor HSPG function are required to activate the targets that respond to low levels of Hh. All these data show that the interaction of lipid-modified Hh with HSPG is important both for precise Hh spreading through the epithelium surface and for correct Hh reception.

    Development (Cambridge, England) 2006;133;3;471-83

  • Notch, epidermal growth factor receptor, and beta1-integrin pathways are coordinated in neural stem cells.

    Campos LS, Decker L, Taylor V and Skarnes W

    INSERM U368, Biologie Moléculaire du Développement, Ecole Normale Supérieure, Paris, France.

    Notch1 and beta1-integrins are cell surface receptors involved in the recognition of the niche that surrounds stem cells through cell-cell and cell-extracellular matrix interactions, respectively. Notch1 is also involved in the control of cell fate choices in the developing central nervous system (Lewis, J. (1998) Semin. Cell Dev. Biol. 9, 583-589). Here we report that Notch and beta1-integrins are co-expressed and that these proteins cooperate with the epidermal growth factor receptor in neural progenitors. We describe data that suggest that beta1-integrins may affect Notch signaling through 1) physical interaction (sequestration) of the Notch intracellular domain fragment by the cytoplasmic tail of the beta1-integrin and 2) affecting trafficking of the Notch intracellular domain via caveolin-mediated mechanisms. Our findings suggest that caveolin 1-containing lipid rafts play a role in the coordination and coupling of beta1-integrin, Notch1, and tyrosine kinase receptor signaling pathways. We speculate that this will require the presence of the adequate beta1-activating extracellular matrix or growth factors in restricted regions of the central nervous system, namely in neurogenic niches.

    Funded by: Wellcome Trust

    The Journal of biological chemistry 2006;281;8;5300-9

  • Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes.

    Cannon SB, Sterck L, Rombauts S, Sato S, Cheung F, Gouzy J, Wang X, Mudge J, Vasdewani J, Schiex T, Spannagl M, Monaghan E, Nicholson C, Humphray SJ, Schoof H, Mayer KF, Rogers J, Quétier F, Oldroyd GE, Debellé F, Cook DR, Retzel EF, Roe BA, Town CD, Tabata S, Van de Peer Y and Young ND

    Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108, USA.

    Genome sequencing of the model legumes, Medicago truncatula and Lotus japonicus, provides an opportunity for large-scale sequence-based comparison of two genomes in the same plant family. Here we report synteny comparisons between these species, including details about chromosome relationships, large-scale synteny blocks, microsynteny within blocks, and genome regions lacking clear correspondence. The Lotus and Medicago genomes share a minimum of 10 large-scale synteny blocks, each with substantial collinearity and frequently extending the length of whole chromosome arms. The proportion of genes syntenic and collinear within each synteny block is relatively homogeneous. Medicago-Lotus comparisons also indicate similar and largely homogeneous gene densities, although gene-containing regions in Mt occupy 20-30% more space than Lj counterparts, primarily because of larger numbers of Mt retrotransposons. Because the interpretation of genome comparisons is complicated by large-scale genome duplications, we describe synteny, synonymous substitutions and phylogenetic analyses to identify and date a probable whole-genome duplication event. There is no direct evidence for any recent large-scale genome duplication in either Medicago or Lotus but instead a duplication predating speciation. Phylogenetic comparisons place this duplication within the Rosid I clade, clearly after the split between legumes and Salicaceae (poplar).

    Funded by: Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;40;14959-64

  • A high-resolution map of synteny disruptions in gibbon and human genomes.

    Carbone L, Vessere GM, ten Hallers BF, Zhu B, Osoegawa K, Mootnick A, Kofler A, Wienberg J, Rogers J, Humphray S, Scott C, Harris RA, Milosavljevic A and de Jong PJ

    BACPAC Resources, Children's Hospital of Oakland Research Institute, Oakland, California, United States of America.

    Gibbons are part of the same superfamily (Hominoidea) as humans and great apes, but their karyotype has diverged faster from the common hominoid ancestor. At least 24 major chromosome rearrangements are required to convert the presumed ancestral karyotype of gibbons into that of the hominoid ancestor. Up to 28 additional rearrangements distinguish the various living species from the common gibbon ancestor. Using the northern white-cheeked gibbon (2n = 52) (Nomascus leucogenys leucogenys) as a model, we created a high-resolution map of the homologous regions between the gibbon and human. The positions of 100 synteny breakpoints relative to the assembled human genome were determined at a resolution of about 200 kb. Interestingly, 46% of the gibbon-human synteny breakpoints occur in regions that correspond to segmental duplications in the human lineage, indicating a common source of plasticity leading to a different outcome in the two species. Additionally, the full sequences of 11 gibbon BACs spanning evolutionary breakpoints reveal either segmental duplications or interspersed repeats at the exact breakpoint locations. No specific sequence element appears to be common among independent rearrangements. We speculate that the extraordinarily high level of rearrangements seen in gibbons may be due to factors that increase the incidence of chromosome breakage or fixation of the derivative chromosomes in a homozygous state.

    Funded by: NHGRI NIH HHS: HG02523-02

    PLoS genetics 2006;2;12;e223

  • Genetic and genomic prospects for Xenopus tropicalis research.

    Carruthers S and Stemple DL

    Vertebrate Development and Genetics, The Morgan Building, Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK.

    Research using Xenopus laevis has made enormous contributions to our understanding of vertebrate development, control of the eukaryotic cell cycle and the cytoskeleton. One limitation, however, has been the lack of systematic genetic studies in Xenopus to complement molecular and cell biological investigations. Work with the closely related diploid frog Xenopus tropicalis is beginning to address this limitation. Here, we review the resources that will make genetic studies using X. tropicalis a reality.

    Funded by: NICHD NIH HHS: 1 R01 HD4 2276-01; Wellcome Trust

    Seminars in cell & developmental biology 2006;17;1;146-53

  • Vertebrate gene finding from multiple-species alignments using a two-level strategy.

    Carter D and Durbin R

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Background: One way in which the accuracy of gene structure prediction in vertebrate DNA sequences can be improved is by analyzing alignments with multiple related species, since functional regions of genes tend to be more conserved.

    Results: We describe DOGFISH, a vertebrate gene finder consisting of a cleanly separated site classifier and structure predictor. The classifier scores potential splice sites and other features, using sequence alignments between multiple vertebrate species, while the structure predictor hypothesizes coding transcripts by combining these scores using a simple model of gene structure. This also identifies and assigns confidence scores to possible additional exons. Performance is assessed on the ENCODE regions. We predict transcripts and exons across the whole human genome, and identify over 10,000 high confidence new coding exons not in the Ensembl gene set.

    Conclusion: We present a practical multiple species gene prediction method. Accuracy improves as additional species, up to at least eight, are introduced. The novel predictions of the whole-genome scan should support efficient experimental verification.

    Funded by: Wellcome Trust

    Genome biology 2006;7 Suppl 1;S6.1-12

  • Ancient Indian roots?

    Carvalho-Silva DR, Zerjal T and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs., UK.

    Journal of biosciences 2006;31;1;1-2

  • Animal, vegetable or mineral?

    Cerdeño-Tárraga AM and Bentley S

    Nature reviews. Microbiology 2006;4;10;725-6

  • Multiplexed expression and screening for recombinant protein production in mammalian cells.

    Chapple SD, Crofts AM, Shadbolt SP, McCafferty J and Dyson MR

    The Atlas of Protein Expression Project, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Background: A variety of approaches to understanding protein structure and function require production of recombinant protein. Mammalian based expression systems have advantages over bacterial systems for certain classes of protein but can be slower and more laborious. Thus the availability of a simple system for production and rapid screening of constructs or conditions for mammalian expression would be of great benefit. To this end we have coupled an efficient recombinant protein production system based on transient transfection in HEK-293 EBNA1 (HEK-293E) suspension cells with a dot blot method allowing pre-screening of proteins expressed in cells in a high throughput manner.

    Results: A nested PCR approach was used to clone 21 extracellular domains of mouse receptors as CD4 fusions within a mammalian GATEWAY expression vector system. Following transient transfection, HEK-293E cells grown in 2 ml cultures in 24-deep well blocks showed similar growth kinetics, viability and recombinant protein expression profiles, to those grown in 50 ml shake flask cultures as judged by western blotting. Following optimisation, fluorescent dot blot analysis of transfection supernatants was shown to be a rapid method for analysing protein expression yielding similar results as western blot analysis. Addition of urea enhanced the binding of glycoproteins to a nitrocellulose membrane. A good correlation was observed between the results of a plate based small scale transient transfection dot blot pre-screen and successful purification of proteins expressed at the 50 ml scale.

    Conclusion: The combination of small scale multi-well plate culture and dot blotting described here will allow the multiplex analysis of different mammalian expression experiments enabling a faster identification of high yield expression constructs or conditions prior to large scale protein production. The methods for parallel GATEWAY cloning and expression of multiple constructs in cell culture will also be useful for applications such as the generation of receptor protein microarrays.

    Funded by: Wellcome Trust

    BMC biotechnology 2006;6;49

  • Homozygous mutation of focal adhesion kinase in embryonic stem cell derived neurons: normal electrophysiological and morphological properties in vitro.

    Charlesworth P, Komiyama NH and Grant SG

    Centre for Neuroscience Research, University of Edinburgh, Edinburgh, UK.

    Background: Genetically manipulated embryonic stem (ES) cell derived neurons (ESNs) provide a powerful system with which to study the consequences of gene manipulation in mature, synaptically connected neurons in vitro. Here we report a study of focal adhesion kinase (FAK), which has been implicated in synapse formation and regulation of ion channels, using the ESN system to circumvent the embryonic lethality of homozygous FAK mutant mice.

    Results: Mouse ES cells carrying homozygous null mutations (FAK-/-) were generated and differentiated in vitro into neurons. FAK-/- ESNs extended axons and dendrites and formed morphologically and electrophysiologically intact synapses. A detailed study of NMDA receptor gated currents and voltage sensitive calcium currents revealed no difference in their magnitude, or modulation by tyrosine kinases.

    Conclusion: FAK does not have an obligatory role in neuronal differentiation, synapse formation or the expression of NMDA receptor or voltage-gated calcium currents under the conditions used in this study. The use of genetically modified ESNs has great potential for rapidly and effectively examining the consequences of neuronal gene manipulation and is complementary to mouse studies.

    Funded by: Wellcome Trust

    BMC neuroscience 2006;7;47

  • Data sharing and intellectual property in a genomic epidemiology network: policies for large-scale research collaboration.

    Chokshi DA, Parker M and Kwiatkowski DP

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, England.

    Genomic epidemiology is a field of research that seeks to improve the prevention and management of common diseases through an understanding of their molecular origins. It involves studying thousands of individuals, often from different populations, with exacting techniques. The scale and complexity of such research has required the formation of research consortia. Members of these consortia need to agree on policies for managing shared resources and handling genetic data. Here we consider data-sharing and intellectual property policies for an international research consortium working on the genomic epidemiology of malaria. We outline specific guidelines governing how samples and data are transferred among its members; how results are released into the public domain; when to seek protection for intellectual property; and how intellectual property should be managed. We outline some pragmatic solutions founded on the basic principles of promoting innovation and access.

    Funded by: Medical Research Council: G0200454, G19/9; Wellcome Trust

    Bulletin of the World Health Organization 2006;84;5;382-7

  • Developmental timing in Dictyostelium is regulated by the Set1 histone methyltransferase.

    Chubb JR, Bloomfield G, Xu Q, Kaller M, Ivens A, Skelton J, Turner BM, Nellen W, Shaulsky G, Kay RR, Bickmore WA and Singer RH

    Department of Anatomy and Structural Biology, Albert Einstein College of Medicine, The Bronx, NY 10461, USA.

    Histone-modifying enzymes have enormous potential as regulators of the large-scale changes in gene expression occurring during differentiation. It is unclear how different combinations of histone modification coordinate regimes of transcription during development. We show that different methylation states of lysine 4 of histone H3 (H3K4) mark distinct developmental phases of the simple eukaryote, Dictyostelium. We demonstrate that the enzyme responsible for all mono, di and tri-methylation of H3K4 is the Dictyostelium homolog of the Set1 histone methyltransferase. In the absence of Set1, cells display unusually rapid development, characterized by precocious aggregation of amoebae into multicellular aggregates. Early differentiation markers are abundantly expressed in growing set1 cells, indicating the differentiation program is ectopically activated during growth. This phenotype is caused specifically by the loss of Set1 catalytic activity. Set1 mutants induce premature differentiation in wild-type cells, indicating Set1 regulates production of an extra-cellular factor required for the correct perception of growth conditions. Microarray analysis of the set1 mutants reveals genomic clustering of mis-expressed genes, suggesting a requirement for Set1 in the regulation of chromatin-mediated events at gene clusters.

    Funded by: Medical Research Council: G120/1013(75407)

    Developmental biology 2006;292;2;519-32

  • Multireplicon genome architecture of Lactobacillus salivarius.

    Claesson MJ, Li Y, Leahy S, Canchaya C, van Pijkeren JP, Cerdeño-Tárraga AM, Parkhill J, Flynn S, O'Sullivan GC, Collins JK, Higgins D, Shanahan F, Fitzgerald GF, van Sinderen D and O'Toole PW

    Department of Microbiology, Alimentary Pharmabiotic Centre, University College Cork, Cork, Ireland.

    Lactobacillus salivarius subsp. salivarius strain UCC118 is a bacteriocin-producing strain with probiotic characteristics. The 2.13-Mb genome was shown by sequencing to comprise a 1.83 Mb chromosome, a 242-kb megaplasmid (pMP118), and two smaller plasmids. Megaplasmids previously have not been characterized in lactic acid bacteria or intestinal lactobacilli. Annotation of the genome sequence indicated an intermediate level of auxotrophy compared with other sequenced lactobacilli. No single-copy essential genes were located on the megaplasmid. However, contingency amino acid metabolism genes and carbohydrate utilization genes, including two genes for completion of the pentose phosphate pathway, were megaplasmid encoded. The megaplasmid also harbored genes for the Abp118 bacteriocin, a bile salt hydrolase, a presumptive conjugation locus, and other genes potentially relevant for probiotic properties. Two subspecies of L. salivarius are recognized, salivarius and salicinius, and we detected megaplasmids in both subspecies by pulsed-field gel electrophoresis of sizes ranging from 100 kb to 380 kb. The discovery of megaplasmids of widely varying size in L. salivarius suggests a possible mechanism for genome expansion or contraction to adapt to different environments.

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;17;6718-23

  • Unique organisation of tRNA genes in Entamoeba histolytica.

    Clark CG, Ali IK, Zaki M, Loftus BJ and Hall N

    Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK.

    The genome sequence of the protistan parasite Entamoeba histolytica HM-1:IMSS has been completed recently. Among the findings has been a unique organisation for the tRNA genes in this organism. Forty-two of the tRNA isoacceptor types are encoded in tandem arrays that vary in unit length from 490 to 1775 basepairs and contain from 1 to 5 tRNA genes. In three cases a 5S RNA gene is also present in the unit. An estimated 10% of the genome is made up of these arrays. Interspersed between RNA-encoding sequences are short tandem repeats that are polymorphic between isolates and, in some cases, within isolates. The number and organisation of tRNA genes in E. histolytica is unprecedented. In addition to encoding the tRNAs of the organism we propose that the arrays may fulfil a structural role in the genome.

    Funded by: NIAID NIH HHS: 5R01AI046516-03; Wellcome Trust: 064057

    Molecular and biochemical parasitology 2006;146;1;24-9

  • Molecular characterization and comparison of the components and multiprotein complexes in the postsynaptic proteome.

    Collins MO, Husi H, Yu L, Brandon JM, Anderson CN, Blackstock WP, Choudhary JS and Grant SG

    Genes to Cognition, The Wellcome Trust Sanger Institute, Hinxton, UK.

    Characterization of the composition of the postsynaptic proteome (PSP) provides a framework for understanding the overall organization and function of the synapse in normal and pathological conditions. We have identified 698 proteins from the postsynaptic terminal of mouse CNS synapses using a series of purification strategies and analysis by liquid chromatography tandem mass spectrometry and large-scale immunoblotting. Some 620 proteins were found in purified postsynaptic densities (PSDs), nine in AMPA-receptor immuno-purifications, 100 in isolates using an antibody against the NMDA receptor subunit NR1, and 170 by peptide-affinity purification of complexes with the C-terminus of NR2B. Together, the NR1 and NR2B complexes contain 186 proteins, collectively referred to as membrane-associated guanylate kinase-associated signalling complexes. We extracted data from six other synapse proteome experiments and combined these with our data to provide a consensus on the composition of the PSP. In total, 1124 proteins are present in the PSP, of which 466 were validated by their detection in two or more studies, forming what we have designated the Consensus PSD. These synapse proteome data sets offer a basis for future research in synaptic biology and will provide useful information in brain disease and mental disorder studies.

    Funded by: Wellcome Trust

    Journal of neurochemistry 2006;97 Suppl 1;16-23

  • A high-resolution survey of deletion polymorphism in the human genome.

    Conrad DF, Andrews TD, Carter NP, Hurles ME and Pritchard JK

    Department of Human Genetics, The University of Chicago, 920 East 58th Street, Chicago, Illinois 60637, USA.

    Recent work has shown that copy number polymorphism is an important class of genetic variation in human genomes. Here we report a new method that uses SNP genotype data from parent-offspring trios to identify polymorphic deletions. We applied this method to data from the International HapMap Project to produce the first high-resolution population surveys of deletion polymorphism. Approximately 100 of these deletions have been experimentally validated using comparative genome hybridization on tiling-resolution oligonucleotide microarrays. Our analysis identifies a total of 586 distinct regions that harbor deletion polymorphisms in one or more of the families. Notably, we estimate that typical individuals are hemizygous for roughly 30-50 deletions larger than 5 kb, totaling around 550-750 kb of euchromatic sequence across their genomes. The detected deletions span a total of 267 known and predicted genes. Overall, however, the deleted regions are relatively gene-poor, consistent with the action of purifying selection against deletions. Deletion polymorphisms may well have an important role in the genetics of complex traits; however, they are not directly observed in most current gene mapping studies. Our new method will permit the identification of deletion polymorphisms in high-density SNP surveys of trio or other family data.

    Funded by: NIGMS NIH HHS: GM07197; Wellcome Trust

    Nature genetics 2006;38;1;75-81

  • Fluoroquinolone resistance in Salmonella Typhi.

    Cooke FJ, Wain J and Threlfall EJ

    BMJ (Clinical research ed.) 2006;333;7563;353-4

  • Deep-seated resistance in relapsed paratyphoid fever.

    Cooke GS, Cooke FJ, Stone M, Turner K, Al-Nahhas A, Win Z, Wain J, Rogers TR, Friedland JS and Bamford KB

    Department of Infectious Diseases, Hammersmith Hospital, London, United Kingdom.

    We describe a case of relapsed paratyphoid fever in which the isolate had reduced susceptibility to ciprofloxacin due to a rare mutation within the gyrA gene. 18F-fluorodeoxyglucose positron emission tomography scanning identified deep-seated infection including unsuspected aortitis and highlights the utility of novel imaging techniques to improve our understanding and treatment of this disease.

    Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 2006;42;11;e92-4

  • Boosting of cellular immunity against Mycobacterium tuberculosis and modulation of skin cytokine responses in healthy human volunteers by Mycobacterium bovis BCG substrain Moreau Rio de Janeiro oral vaccine.

    Cosgrove CA, Castello-Branco LR, Hussell T, Sexton A, Giemza R, Phillips R, Williams A, Griffin GE, Dougan G and Lewis DJ

    CMM (Infectious Diseases), St George's Hospital Medical School, Cranmer Terrace, London SW17 0RE, United Kingdom.

    Oral immunization of healthy adults with 10(7) CFU BCG Moreau Rio de Janeiro was well tolerated and significantly boosted gamma interferon responses to purified protein derivative, Ag85, and MPB70 from previous childhood intradermal BCG immunization. Oral BCG offers the possibility of a needle-free tuberculosis vaccine and of boosting the protective immunity from intradermal tuberculosis vaccines.

    Funded by: Wellcome Trust: 043139

    Infection and immunity 2006;74;4;2449-52

  • Budget genome.

    Crossman L

    Nature reviews. Microbiology 2006;4;5;326-7

  • Peddling the nitrogen cycle.

    Crossman L and Thomson N

    Nature reviews. Microbiology 2006;4;7;494-5

  • TranscriptSNPView: a genome-wide catalog of mouse coding variation.

    Cunningham F, Rios D, Griffiths M, Smith J, Ning Z, Cox T, Flicek P, Marin-Garcin P, Herrero J, Rogers J, van der Weyden L, Bradley A, Birney E and Adams DJ

    Funded by: Wellcome Trust: 062023, 077187

    Nature genetics 2006;38;8;853

  • Polymorphisms in the glucokinase-associated, dual-specificity phosphatase 12 (DUSP12) gene under chromosome 1q21 linkage peak are associated with type 2 diabetes.

    Das SK, Chu WS, Hale TC, Wang X, Craig RL, Wang H, Shuldiner AR, Froguel P, Deloukas P, McCarthy MI, Zeggini E, Hasstedt SJ and Elbein SC

    Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.

    Linkage of type 2 diabetes to chromosome 1q21-q23 is well replicated across populations. In an initial 50-kb marker map (580 markers) across the linked region, one of the two strongest associations observed in Utah Caucasians was at marker rs1503814 (P < 0.00001 in pools, P < 0.004 in individuals). Based on this association, we typed additional markers and screened for sequence variation in the nearby DUSP12 gene. The strongest associations mapped to a highly conserved nongenic sequence just telomeric to rs1503814 and extended 10 kb telomeric through the DUSP12 gene and into the 5' end of the adjacent ATF6 gene. No coding variant could explain the association in the DUSP12 gene. An extended haplotype encompassing markers from -8,379 to +10,309 bp relative to the ATG start was more common in Caucasian case (0.381) than control subjects (0.285, P = 0.005) and was uniquely tagged by a 194-bp allele at either of two simple tandem repeat variants or by the T allele at marker +7,580. Markers -8,379 and +7,580 were nominally associated with type 2 diabetes in African-American subjects (P < 0.05), but with different alleles. Marker rs1503814 was strongly associated with postchallenge insulin levels among family members (P = 0.000002), but sequence variation in this region was not associated with type 2 diabetes in three other populations of European ancestry. Our data suggest that sequences in or upstream of DUSP12 may contribute to type 2 diabetes susceptibility, but the lack of replication suggests a small effect size.

    Funded by: NCRR NIH HHS: M01RR14288; NIDDK NIH HHS: DK39311, R01 DK039311-24, U01-DK58026

    Diabetes 2006;55;9;2631-9

  • High throughput DNA sequence variant detection by conformation sensitive capillary electrophoresis and automated peak comparison.

    Davies H, Dicks E, Stephens P, Cox C, Teague J, Greenman C, Bignell G, O'Meara S, Edkins S, Parker A, Stevens C, Menzies A, Blow M, Bottomley B, Dronsfield M, Futreal PA, Stratton MR and Wooster R

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    We report the development of a heteroduplex-based mutation detection method using multicapillary automated sequencers, known as conformation-sensitive capillary electrophoresis (CSCE). Our optimized CSCE protocol detected 93 of 95 known base substitution sequence variants. Since the optimization of the method, we have analyzed 215 Mb of DNA and identified 3397 unique variants. An analysis of this data set indicates that the sensitivity of CSCE is above 95% in the central 56% of the average PCR product. To fully exploit the mutation detection capacity of this method, we have developed software, canplot, which automatically compares normal and test results to prioritize samples that are most likely to contain variants. Using multiple fluorescent dyes, CSCE has the capacity to screen over 2.2 Mb on one ABI3730 each day. Therefore this technique is suitable for projects where a rapid and sensitive DNA mutation detection system is required.

    Genomics 2006;87;3;427-32

  • A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC.

    de Bakker PI, McVean G, Sabeti PC, Miretti MM, Green T, Marchini J, Ke X, Monsuur AJ, Whittaker P, Delgado M, Morrison J, Richardson A, Walsh EC, Gao X, Galver L, Hart J, Hafler DA, Pericak-Vance M, Todd JA, Daly MJ, Trowsdale J, Wijmenga C, Vyse TJ, Beck S, Murray SS, Carrington M, Gregory S, Deloukas P and Rioux JD

    Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Seven Cambridge Center, Cambridge, Massachusetts 02142, USA.

    The proteins encoded by the classical HLA class I and class II genes in the major histocompatibility complex (MHC) are highly polymorphic and are essential in self versus non-self immune recognition. HLA variation is a crucial determinant of transplant rejection and susceptibility to a large number of infectious and autoimmune diseases. Yet identification of causal variants is problematic owing to linkage disequilibrium that extends across multiple HLA and non-HLA genes in the MHC. We therefore set out to characterize the linkage disequilibrium patterns between the highly polymorphic HLA genes and background variation by typing the classical HLA genes and >7,500 common SNPs and deletion-insertion polymorphisms across four population samples. The analysis provides informative tag SNPs that capture much of the common variation in the MHC region and that could be used in disease association studies, and it provides new insight into the evolutionary dynamics and ancestral origins of the HLA loci and their haplotypes.

    Funded by: Medical Research Council: G9800943; NCI NIH HHS: N01-CO-12400; NIAID NIH HHS: U19 AI050864; Wellcome Trust: 077011

    Nature genetics 2006;38;10;1166-72

  • Expressed sequence tag (EST) analysis of the erythrocytic stages of Babesia bovis.

    de Vries E, Corton C, Harris B, Cornelissen AW and Berriman M

    Division of Infection Biology, Department of Infectious Diseases and Immunology, Utrecht University, P.O. Box 80165, 3508 TD Utrecht, The Netherlands.

    Expressed sequence tags (ESTs) provide an efficient way to identify large numbers of genes expressed in a specific stage of the life cycle of an organism. Here we analysed approximately 13,000 ESTs derived from the erythrocytic stage of the apicomplexan parasite Babesia bovis. The ESTs were clustered in order to obtain information on the expression level of a gene and to increase sequence length and reliability. A total of 3522 clusters were obtained and annotated using BLAST algorithms. The clusters were estimated to represent approximately 2600 genes, for which approximately 2.1 Mbp of sequence information was obtained in total. Expression levels of the genes, as determined by the numbers of ESTs contained within a cluster, were compared to those of their closest homologs in the erythrocytic stage of Plasmodium falciparum and Toxoplasma gondii tachyzoites. Pathways that are relatively abundantly represented in B. bovis are, amongst others, the purine salvage pathway (displaying characteristics not identified before in apicomplexans), isoprenoid biosynthesis in the apicoplast and many genes encoding mitochondrial proteins. Especially remarkable in the latter group are the F-type ATPases - which are hardly expressed in P. falciparum and T. gondii - and two highly expressed glycerol-3-phosphate dehydrogenases creating a shuttle possibly controlling the cytoplasmic NADH/NAD+ ratio. A comparison of known antigenic proteins from Australian and American strains of B. bovis with the Israel strain used here identifies considerable sequence variation in the rhoptry associated protein-1 (RAP-1), merozoite surface proteins of the variable merozoite surface antigen (VMSA) family and spherical body proteins. Analysis of the EST clusters representing the variable erythrocyte surface antigen family reveals many variant transcripts of which a few are dominant. Two putative pseudogenes also seem to be transcribed at high levels.

    Funded by: NIAID NIH HHS: AI05093; Wellcome Trust

    Veterinary parasitology 2006;138;1-2;61-74

  • Evolution and comparative analysis of the MHC Class III inflammatory region.

    Deakin JE, Papenfuss AT, Belov K, Cross JG, Coggill P, Palmer S, Sims S, Speed TP, Beck S and Graves JA

    ARC Centre for Kangaroo Genomics, Research School of Biological Sciences, The Australian National University, Canberra, ACT 0200, Australia.

    Background: The Major Histocompatibility Complex (MHC) is essential for immune function. Historically, it has been subdivided into three regions (Class I, II, and III), but a cluster of functionally related genes within the Class III region has also been referred to as the Class IV region or "inflammatory region". This group of genes is involved in the inflammatory response, and includes members of the tumour necrosis family. Here we report the sequencing, annotation and comparative analysis of a tammar wallaby BAC containing the inflammatory region. We also discuss the extent of sequence conservation across the entire region and identify elements conserved in evolution.

    Results: Fourteen Class III genes from the tammar wallaby inflammatory region were characterised and compared to their orthologues in other vertebrates. The organisation and sequence of genes in the inflammatory region of both the wallaby and South American opossum are highly conserved compared to known genes from eutherian ("placental") mammals. Some minor differences separate the two marsupial species. Eight genes within the inflammatory region have remained tightly clustered for at least 360 million years, predating the divergence of the amphibian lineage. Analysis of sequence conservation identified 354 elements that are conserved. These range in size from 7 to 431 bases and cover 15.6% of the inflammatory region, representing approximately a 4-fold increase compared to the average for vertebrate genomes. About 5.5% of this conserved sequence is marsupial-specific, including three cases of marsupial-specific repeats. Highly Conserved Elements were also characterised.

    Conclusion: Using comparative analysis, we show that a cluster of MHC genes involved in inflammation, including TNF, LTA (or its putative teleost homolog TNF-N), APOM, and BAT3 have remained together for over 450 million years, predating the divergence of mammals from fish. The observed enrichment in conserved sequences within the inflammatory region suggests conservation at the transcriptional regulatory level, in addition to the functional level.

    Funded by: Wellcome Trust

    BMC genomics 2006;7;281

  • Genetic variation in human gene expression.

    Dermitzakis ET and Stranger BE

    Division of Informatics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Gene expression variation has been the focus of many studies in the past few years. The relevance of gene regulation and gene expression to disease and the development of the technologies used to screen large numbers of genes simultaneously have allowed this rapid development. In this review we discuss issues relating to the biological information one obtains from such studies and the biological significance and use of signals from mapping of gene expression variation.

    Mammalian genome : official journal of the International Mammalian Genome Society 2006;17;6;503-8

  • Analysis of ESTs from Lutzomyia longipalpis sand flies and their contribution toward understanding the insect-parasite relationship.

    Dillon RJ, Ivens AC, Churcher C, Holroyd N, Quail MA, Rogers ME, Soares MB, Bonaldo MF, Casavant TL, Lehane MJ and Bates PA

    Liverpool School of Tropical Medicine, Pembroke Place, Liverpool L3 5QA, UK.

    An expressed sequence tag library has been generated from a sand fly vector of visceral leishmaniasis, Lutzomyia longipalpis. A normalized cDNA library was constructed from whole adults and 16,608 clones were sequenced from both ends and assembled into 10,203 contigs and singlets. Of these 58% showed significant similarity to known genes from other organisms, <4% were identical to described sand fly genes, and 42% had no match to any database sequence. Our analyses revealed putative proteins involved in the barrier function of the gut (peritrophins, microvillar proteins, glutamine synthase), digestive physiology (secreted and membrane-anchored hydrolytic enzymes), and the immune response (gram-negative binding proteins, thioester proteins, scavenger receptors, galectins, signaling pathway factors, caspases, serpins, and peroxidases). Sequence analysis of this transcriptome dataset has provided new insights into genes that might be associated with the response of the vector to the development of Leishmania.

    Funded by: Wellcome Trust: 073405

    Genomics 2006;88;6;831-40

  • Functional analysis of luxS in Staphylococcus aureus reveals a role in metabolism but not quorum sensing.

    Doherty N, Holden MT, Qazi SN, Williams P and Winzer K

    Institute of Infections, Immunity, and Inflammation, University of Nottingham, Centre for Biomolecular Sciences, Nottingham NG7 2RD, United Kingdom.

    The function of AI-2 in many bacteria and the physiological role of LuxS, the enzyme responsible for its production, remain matters of debate. Here, we show that in Staphylococcus aureus the luxS gene forms a monocistronic transcriptional unit under the control of a sigma(70)-dependent promoter. The gene was transcribed throughout growth under a variety of conditions, including intracellular growth in MAC-T cells. AI-2 was produced in rich media under aerobic and anaerobic conditions, peaking during the transition to stationary phase, but was hardly detectable in a sulfur-limited defined medium. In the presence of glucose or under anaerobic conditions, cultures retained considerable AI-2 activity after entry into stationary phase. Inactivation of luxS in various S. aureus strains did not affect virulence-associated traits, such as production of hemolysins and extracellular proteases, biofilm formation, and the agr signaling system. Conversely, AI-2 production remained unchanged in an agr mutant. However, luxS mutants grown in a sulfur-limited defined medium exhibited a growth defect. When grown together with the wild type in mixed culture, luxS mutants of various S. aureus strains showed reduced ability to compete for growth under these conditions. In contrast, a complemented luxS mutant grew as well as the parent strain, suggesting that the observed growth defect was of an intracellular nature and had not been caused by either second-site mutations or the lack of a diffusible factor. However, the LuxS/AI-2 system does not appear to contribute to the overall fitness of S. aureus RN6390B during intracellular growth in epithelial cells: the wild type and a luxS mutant showed very similar growth patterns after their internalization by MAC-T cells.

    Journal of bacteriology 2006;188;8;2885-97

  • How bacteria and their products provide clues to vaccine and adjuvant development.

    Dougan G and Hormaeche C

    The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Evidence has emerged that both vertebrates and invertebrates share innate immune pathways involved in the recognition of and the response to micro-organisms, including bacteria and their products. As a consequence, particular degenerate products of bacteria can stimulate and modulate immune responses and influence acquired immunity and, potentially, protection against disease. New knowledge in this field is beginning to explain how vaccine adjuvants work and will facilitate the future development of novel adjuvants and vaccines.

    Funded by: Wellcome Trust

    Vaccine 2006;24 Suppl 2;S2-13-9

  • A machine learning strategy to identify candidate binding sites in human protein-coding sequence.

    Down T, Leong B and Hubbard TJ

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Background: The splicing of RNA transcripts is thought to be partly promoted and regulated by sequences embedded within exons. Known sequences include binding sites for SR proteins, which are thought to mediate interactions between splicing factors bound to the 5' and 3' splice sites. It would be useful to identify further candidate sequences; however, identifying them computationally is hard since exon sequences are also constrained by their functional role in coding for proteins.

    Results: This strategy identified a collection of motifs including several previously reported splice enhancer elements. Although only trained on coding exons, the model discriminates both coding and non-coding exons from intragenic sequence.

    Conclusion: We have trained a computational model able to detect signals in coding exons which seem to be orthogonal to the sequences' primary function of coding for proteins. We believe that many of the motifs detected here represent binding sites for both previously unrecognized proteins which influence RNA splicing as well as other regulatory elements.

    Funded by: Wellcome Trust: 077198

    BMC bioinformatics 2006;7;419

  • Multi-genome biology.

    Down TA

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    A report on the Genome Informatics meeting held at Cold Spring Harbor Laboratory, Cold Spring Harbor, USA, 28 October-1 November 2005.

    Funded by: Wellcome Trust

    Genome biology 2006;7;2;305

  • Conserved noncoding sequences are selectively constrained and not mutation cold spots.

    Drake JA, Bird C, Nemesh J, Thomas DJ, Newton-Cheh C, Reymond A, Excoffier L, Attar H, Antonarakis SE, Dermitzakis ET and Hirschhorn JN

    Program in Genomics and Division of Endocrinology, Children's Hospital, Boston, Massachusetts 02115, USA.

    Noncoding genetic variants are likely to influence human biology and disease, but recognizing functional noncoding variants is difficult. Approximately 3% of noncoding sequence is conserved among distantly related mammals, suggesting that these evolutionarily conserved noncoding regions (CNCs) are selectively constrained and contain functional variation. However, CNCs could also merely represent regions with lower local mutation rates. Here we address this issue and show that CNCs are selectively constrained in humans by analyzing HapMap genotype data. Specifically, new (derived) alleles of SNPs within CNCs are rarer than new alleles in nonconserved regions (P = 3 x 10^-18), indicating that evolutionary pressure has suppressed CNC-derived allele frequencies. Intronic CNCs and CNCs near genes show greater allele frequency shifts, with magnitudes comparable to those for missense variants. Thus, conserved noncoding variants are more likely to be functional. Allele frequency distributions highlight selectively constrained genomic regions that should be intensively surveyed for functionally important variation.

    Funded by: Wellcome Trust

    Nature genetics 2006;38;2;223-7

  • Proteomic and microarray characterization of the AggR regulon identifies a pheU pathogenicity island in enteroaggregative Escherichia coli.

    Dudley EG, Thomson NR, Parkhill J, Morin NP and Nataro JP

    Center for Vaccine Development, University of Maryland School of Medicine, 685 W. Baltimore St., Baltimore, MD 21201, USA.

    Enteroaggregative Escherichia coli (EAEC) is defined by aggregative adherence (AA) to HEp-2 cells, where bacteria display adherence to cell surfaces and also to the intervening substratum in a stacked-brick configuration. We previously showed that an AraC homologue designated AggR is required for the expression of plasmid-encoded genes that mediate AA of EAEC strain 042. In this study, we hypothesized that AggR also controls the expression of other virulence determinants in EAEC 042. Using proteomic and microarray analysis, we identified for the first time that AggR activates the expression of chromosomal genes, including 25 contiguous genes (aaiA-Y), which are localized to a 117 kb pathogenicity island (PAI) inserted at pheU. Many of these genes have homologues in other Gram-negative bacteria and were recently proposed to constitute a type VI secretion system (T6SS). AaiC was identified as a secreted protein that has no apparent homologues within GenBank. EAEC strains carrying in-frame deletions of aaiB, aaiG, aaiO or aaiP still synthesized AaiC; however, AaiC secretion was abolished. Cloning of aai genes into E. coli HB101 suggested that aaiA-P are sufficient for AaiC secretion. A second T6SS was identified within the pheU PAI that secretes a protein unrelated by sequence identity to AaiC. Distribution studies indicated that aaiA and aaiC are commonly found in EAEC isolates worldwide, particularly in strains defined as typical EAEC. These data support the hypothesis that AggR is a global regulator of EAEC virulence determinants, and build on the hypothesis that T6SS is an important mediator of pathogenesis.

    Funded by: NIAID NIH HHS: AI033096; Wellcome Trust

    Molecular microbiology 2006;61;5;1267-82

  • Evaluation of new-generation serologic tests for the diagnosis of typhoid fever: data from a community-based surveillance in Calcutta, India.

    Dutta S, Sur D, Manna B, Sen B, Deb AK, Deen JL, Wain J, Von Seidlein L, Ochiai L, Clemens JD and Kumar Bhattacharya S

    National Institute of Cholera and Enteric Diseases, P-33 CIT Road, Scheme XM, Beliaghata, P.O. Box 177, Calcutta 700010, India.

    Although typhoid fever is confirmed by culture of Salmonella enterica serotype Typhi, rapid and simple diagnostic serologic tests would be useful in developing countries. We examined the performance of the Widal test in a community field site and compared it with Typhidot and Tubex tests for diagnosis of typhoid fever. Blood samples were collected from 6697 patients with fever for ≥3 days for microscopy, culture, and serologic testing and from 172 randomly selected consenting healthy individuals to assess the baseline Widal anti-Typhi O lipopolysaccharide antibody (anti-TO) and anti-Typhi H flagellar antibody (anti-TH) titers. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the 3 serologic tests were calculated using culture-confirmed typhoid fever cases as "true positives" and paratyphoid fever and malaria cases as "true negatives". Comparing cutoff values for the Widal test, an anti-TO titer of 1/80 was optimal with 58% sensitivity, 85% specificity, 69% PPV, and 77% NPV. Sensitivity was increased to 67% when the Widal test was done on the 5th day of illness and thereafter. The sensitivity, specificity, PPV, and NPV of Typhidot and Tubex were not better than those of the Widal test. There is a need for a more efficient rapid diagnostic test for typhoid fever, especially during the acute stage of the disease. Until then, culture remains the method of choice.

    Diagnostic microbiology and infectious disease 2006;56;4;359-65
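
    The four measures named in the typhoid serology abstract above (sensitivity, specificity, PPV, NPV) follow from a 2x2 table of test results against the culture-based reference. The sketch below shows the standard definitions only; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures from raw counts.

    tp/fn: reference-positive cases that tested positive/negative.
    tn/fp: reference-negative cases that tested negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly excluded
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts, for illustration only (not data from the paper).
metrics = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    PPV and NPV depend on how common the disease is in the tested group, so values from one setting do not transfer directly to a population with a different prevalence.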

  • Minimizing the risk of reporting false positives in large-scale RNAi screens.

    Echeverri CJ, Beachy PA, Baum B, Boutros M, Buchholz F, Chanda SK, Downward J, Ellenberg J, Fraser AG, Hacohen N, Hahn WC, Jackson AL, Kiger A, Linsley PS, Lum L, Ma Y, Mathey-Prévôt B, Root DE, Sabatini DM, Taipale J, Perrimon N and Bernards R

    Cenix BioScience GmbH, Tatzberg 47, Dresden, 10307, Germany.

    Large-scale RNA interference (RNAi)-based analyses, very much as other 'omic' approaches, have inherent rates of false positives and negatives. The variability in the standards of care applied to validate results from these studies, if left unchecked, could eventually begin to undermine the credibility of RNAi as a powerful functional approach. This Commentary is an invitation to an open discussion started among various users of RNAi to set forth accepted standards that would insure the quality and accuracy of information in the large datasets coming out of genome-scale screens.

    Nature methods 2006;3;10;777-9

  • DNA methylation profiling of human chromosomes 6, 20 and 22.

    Eckhardt F, Lewin J, Cortese R, Rakyan VK, Attwood J, Burger M, Burton J, Cox TV, Davies R, Down TA, Haefliger C, Horton R, Howe K, Jackson DK, Kunde J, Koenig C, Liddle J, Niblett D, Otto T, Pettett R, Seemann S, Thompson C, West T, Rogers J, Olek A, Berlin K and Beck S

    Epigenomics AG, Kleine Präsidentstrasse 1, 10178 Berlin, Germany.

    DNA methylation is the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of six annotation categories showed that evolutionarily conserved regions are the predominant sites for differential DNA methylation and that a core region surrounding the transcriptional start site is an informative surrogate for promoter methylation. We find that 17% of the 873 analyzed genes are differentially methylated in their 5' UTRs and that about one-third of the differentially methylated 5' UTRs are inversely correlated with transcription. Despite the fact that our study controlled for factors reported to affect DNA methylation such as sex and age, we did not find any significant attributable effects. Our data suggest DNA methylation to be ontogenetically more stable than previously thought.

    Funded by: Wellcome Trust: 084071

    Nature genetics 2006;38;12;1378-85

  • Recurrent KRAS codon 146 mutations in human colorectal cancer.

    Edkins S, O'Meara S, Parker A, Stevens C, Reis M, Jones S, Greenman C, Davies H, Dalgliesh G, Forbes S, Hunter C, Smith R, Stephens P, Goldstraw P, Nicholson A, Chan TL, Velculescu VE, Yuen ST, Leung SY, Stratton MR and Futreal PA

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK.

    An activating point mutation in codon 12 of the HRAS gene was the first somatic point mutation identified in a human cancer and established the role of somatic mutations as the common driver of oncogenesis. Since then, there have been over 11,000 mutations in the three RAS (HRAS, KRAS and NRAS) genes in codons 12, 13 and 61 reported in the literature. We report here the identification of recurrent somatic missense mutations at alanine 146, a highly conserved residue in the guanine nucleotide binding domain. In two independent series of colorectal cancers from Hong Kong and the United States we detected KRAS A146 mutations in 7/126 and 2/94 cases, respectively, giving a combined frequency of 4%. We also detected KRAS A146 mutations in 2/40 (5%) colorectal cell lines, including the NCI-60 colorectal cancer line HCC2998. Codon 146 mutations thus are likely to make an equal or greater contribution to colorectal cancer than codon 61 mutations (4.2% in our combined series, 1% in the literature). Lung adenocarcinomas and large cell carcinomas did not show codon 146 mutations. We did, however, identify a KRAS A146 mutation in the ML-2 acute myeloid leukemia cell line and an NRAS A146 mutation in the NALM-6 B-cell acute lymphoblastic leukemia line, suggesting that the contribution of codon 146 mutations is not entirely restricted to colorectal cancers or to KRAS.

    Funded by: NCI NIH HHS: CA062924, CA121113; Wellcome Trust: 077012

    Cancer biology & therapy 2006;5;8;928-32
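
    The combined frequency quoted in the KRAS abstract above can be checked by pooling the two colorectal series (7/126 and 2/94). This is a minimal sketch of that arithmetic; the function name is hypothetical and simple pooling is an assumption about how the figure was derived.

```python
def combined_frequency(series):
    """Pool (mutated, total) counts across several case series."""
    mutated = sum(m for m, _ in series)
    total = sum(n for _, n in series)
    return mutated / total


# Counts as reported in the abstract: Hong Kong and United States series.
colorectal_series = [(7, 126), (2, 94)]
freq = combined_frequency(colorectal_series)
print(f"{freq:.1%}")  # 9/220, i.e. about 4%
```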

  • Synapse-specific and developmentally regulated targeting of AMPA receptors by a family of MAGUK scaffolding proteins.

    Elias GM, Funke L, Stein V, Grant SG, Bredt DS and Nicoll RA

    Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California 94143, USA.

    Trafficking of AMPA receptors (AMPA-Rs) to and from synapses controls the strength of excitatory synaptic transmission. However, proteins that cluster AMPA-Rs at synapses remain poorly understood. Here we show that PSD-95-like membrane-associated guanylate kinases (PSD-MAGUKs) mediate this synaptic targeting, and we uncover a remarkable functional redundancy within this protein family. By manipulating endogenous neuronal PSD-MAGUK levels, we find that both PSD-95 and PSD-93 independently mediate AMPA-R targeting at mature synapses. We also reveal unanticipated synapse heterogeneity as loss of either PSD-95 or PSD-93 silences largely nonoverlapping populations of excitatory synapses. In adult PSD-95 and PSD-93 double knockout animals, SAP-102 is upregulated and compensates for the loss of synaptic AMPA-Rs. At immature synapses, PSD-95 and PSD-93 play little role in synaptic AMPA-R clustering; instead, SAP-102 dominates. These studies establish a PSD-MAGUK-specific regulation of AMPA-R synaptic expression that establishes and maintains glutamatergic synaptic transmission in the mammalian central nervous system.

    Neuron 2006;52;2;307-20

  • New insights into the role of pendrin (SLC26A4) in inner ear fluid homeostasis.

    Everett LA

    Audiovestibular Genomics Group, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    For over 100 years after the first description of the disorder, the molecular pathology underlying the deafness and thyroid pathology in Pendred syndrome (PS) remained unknown. In 1997, early progress towards understanding the molecular basis of the disorder was made when we identified the PS gene and found it to belong to the SLC26 family of anion transporters. The realization that an anion transporter was responsible for these clinical features soon highlighted a potential role for pendrin in thyroid hormone biosynthesis. The role of pendrin in deafness, however, remained unclear. Our determination of its expression pattern in the inner ear along with the development of a mouse with a targeted disruption of the Slc26a4 gene has revealed that Slc26a4 is expressed in areas of the endolymphatic compartment known to play a role in endolymph reabsorption and that absence of this protein leads to a profound prenatal endolymphatic hydrops and destruction of many of the epithelial cells surrounding the scala media. The precise mechanisms underlying endolymph reabsorption in the inner ear are not yet known; these studies, however, provide some of the groundwork for allowing the future delineation of these processes.

    Novartis Foundation symposium 2006;273;213-25; discussion 225-30, 261-4

  • Micro-array analyses decipher exceptional complex familial chromosomal rearrangement.

    Fauth C, Gribble SM, Porter KM, Codina-Pascual M, Ng BL, Kraus J, Uhrig S, Leifheit J, Haaf T, Fiegler H, Carter NP and Speicher MR

    Institut für Humangenetik, Technische Universität München, Trogerstr. 32, 81675 München, Germany.

    Recently there has been an increased interest in large-scale genomic variation and clinically in the consequences of haploinsufficiency of genomic segments or disruption of normal gene function by chromosome rearrangements. Here, we present an extraordinary case in which both mother and daughter presented with unexpected chromosomal rearrangement complexity, which we characterized with array-CGH, array painting and multicolor large insert clone hybridizations. We found the same 12 breakpoints involving four chromosomes in both mother and daughter. In addition, the daughter inherited a microdeletion from her father. We mapped all breakpoints to the resolution level of breakpoint spanning clones. Genes were found within 7 of the 12 breakpoint regions, some of which were disrupted by the chromosome rearrangement. One of the rearrangements disrupted a locus, which has been discussed as a quantitative trait locus for fetal hemoglobin expression in adults. Interestingly, both mother and daughter show persistent fetal hemoglobin levels. We detail the most complicated familial complex chromosomal rearrangement reported to date and thus an extreme example of inheritance of chromosomal rearrangements without error in meiotic segregation.

    Human genetics 2006;119;1-2;145-53

  • PARL Leu262Val is not associated with fasting insulin levels in UK populations.

    Fawcett KA, Wareham NJ, Luan J, Syddall H, Cooper C, O'Rahilly S, Day IN, Sandhu MS and Barroso I

    Metabolic Disease Group, Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    PARL, the gene encoding presenilins-associated rhomboid-like protein, maps to chromosome 3q27 within a quantitative trait locus that influences components of the metabolic syndrome. Recently, an amino acid substitution (Leu262Val, rs3732581) in PARL was associated with fasting plasma insulin levels in a US white population (N=1031). This variant was also found to modify the positive association between age and fasting insulin. The aim of this study was to test whether these findings could be replicated in two UK population-based cohorts.

    Methods: Participants from the Medical Research Council Ely and Hertfordshire cohort studies were genotyped for this variant using a SNaPshot primer extension assay and Taqman assay respectively. Full phenotypic and genotypic data were available for 3,666 study participants.

    Results: Based on a dominant model, we found no association between the Leu262Val polymorphism and fasting insulin levels (p=0.79) or BMI (p=0.98). We did not observe the previously reported interaction between age and genotype on fasting insulin (p=0.14).

    Despite having greater statistical power, our data do not support the previously reported association between PARL Leu262Val and fasting plasma insulin levels, a measure of insulin resistance. Our findings indicate that this variant is unlikely to be an important contributor to insulin resistance in UK populations.

    Funded by: Medical Research Council: MC_U106179471, MC_U147585824, MC_UP_A620_1014, U.1061.00.001 (79471); Wellcome Trust: 077016

    Diabetologia 2006;49;11;2649-52

  • Phylogenetic relationships of the Wolbachia of nematodes and arthropods.

    Fenn K, Conlon C, Jones M, Quail MA, Holroyd NE, Parkhill J and Blaxter M

    Institutes of Evolutionary Biology and Immunology and Infection Research, University of Edinburgh, Edinburgh, United Kingdom.

    Wolbachia are well known as bacterial symbionts of arthropods, where they are reproductive parasites, but have also been described from nematode hosts, where the symbiotic interaction has features of mutualism. The majority of arthropod Wolbachia belong to clades A and B, while nematode Wolbachia mostly belong to clades C and D, but these relationships have been based on analysis of a small number of genes. To investigate the evolution and relationships of Wolbachia symbionts we have sequenced over 70 kb of the genome of wOvo, a Wolbachia from the human-parasitic nematode Onchocerca volvulus, and compared the genes identified to orthologues in other sequenced Wolbachia genomes. In comparisons of conserved local synteny, we find that wBm, from the nematode Brugia malayi, and wMel, from Drosophila melanogaster, are more similar to each other than either is to wOvo. Phylogenetic analysis of the protein-coding and ribosomal RNA genes on the sequenced fragments supports reciprocal monophyly of nematode and arthropod Wolbachia. The nematode Wolbachia did not arise from within the A clade of arthropod Wolbachia, and the root of the Wolbachia clade lies between the nematode and arthropod symbionts. Using the wOvo sequence, we identified a lateral transfer event whereby segments of the Wolbachia genome were inserted into the Onchocerca nuclear genome. This event predated the separation of the human parasite O. volvulus from its cattle-parasitic sister species, O. ochengi. The long association between filarial nematodes and Wolbachia symbionts may permit more frequent genetic exchange between their genomes.

    PLoS pathogens 2006;2;10;e94

  • Accurate and reliable high-throughput detection of copy number variation in the human genome.

    Fiegler H, Redon R, Andrews D, Scott C, Andrews R, Carder C, Clark R, Dovey O, Ellis P, Feuk L, French L, Hunt P, Kalaitzopoulos D, Larkin J, Montgomery L, Perry GH, Plumb BW, Porter K, Rigby RE, Rigler D, Valsesia A, Langford C, Humphray SJ, Scherer SW, Lee C, Hurles ME and Carter NP

    The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    This study describes a new tool for accurate and reliable high-throughput detection of copy number variation in the human genome. We have constructed a large-insert clone DNA microarray covering the entire human genome in tiling path resolution that we have used to identify copy number variation in human populations. Crucial to this study has been the development of a robust array platform and analytic process for the automated identification of copy number variants (CNVs). The array consists of 26,574 clones covering 93.7% of euchromatic regions. Clones were selected primarily from the published "Golden Path," and mapping was confirmed by fingerprinting and BAC-end sequencing. Array performance was extensively tested by a series of validation assays. These included determining the hybridization characteristics of each individual clone on the array by chromosome-specific add-in experiments. Estimation of data reproducibility and false-positive/negative rates was carried out using self-self hybridizations, replicate experiments, and independent validations of CNVs. Based on these studies, we developed a variance-based automatic copy number detection analysis process (CNVfinder) and have demonstrated its robustness by comparison with the SW-ARRAY method.

    Funded by: Wellcome Trust

    Genome research 2006;16;12;1566-74

  • Mutations in the RSK2(RPS6KA3) gene cause Coffin-Lowry syndrome and nonsyndromic X-linked mental retardation.

    Field M, Tarpey P, Boyle J, Edkins S, Goodship J, Luo Y, Moon J, Teague J, Stratton MR, Futreal PA, Wooster R, Raymond FL and Turner G

    The NSW GOLD Service, Hunter Genetics, Newcastle, Australia.

    We describe three families with X-linked mental retardation, two with a deletion of a single amino acid and one with a missense mutation in the proximal domain of the RSK2(RPS6KA3) (ribosomal protein S6 kinase, 90 kDa, polypeptide 3) protein similar to mutations found in Coffin-Lowry syndrome (CLS). In two families, the clinical diagnosis had been nonsyndromic X-linked mental retardation. In the third family, although CLS had been suspected, the clinical features were atypical and the degree of intellectual disability much less than expected. These families show that strict reliance on classical clinical criteria for mutation testing may result in a missed diagnosis. A less targeted screening approach to mutation testing is advocated.

    Funded by: Wellcome Trust: 077010

    Clinical genetics 2006;70;6;509-15

  • Pfam: clans, web tools and services.

    Finn RD, Mistry J, Schuster-Böckler B, Griffiths-Jones S, Hollich V, Lassmann T, Moxon S, Marshall M, Khanna A, Durbin R, Eddy SR, Sonnhammer EL and Bateman A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Pfam is a database of protein families that currently contains 7973 entries (release 18.0). A recent development in Pfam has enabled the grouping of related families into clans. Pfam clans are described in detail, together with the new associated web pages. Improvements to the range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the database and associated tools are also presented. Pfam is available on the web in the UK, the USA, France and Sweden.

    Funded by: Wellcome Trust: 087656

    Nucleic acids research 2006;34;Database issue;D247-51

  • The diversity and complexity of the cyanobacterial thioredoxin systems.

    Florencio FJ, Pérez-Pérez ME, López-Maury L, Mata-Cabana A and Lindahl M

    Instituto de Bioquímica Vegetal y Fotosíntesis, Universidad de Sevilla-CSIC, Centro de Investigaciones Científicas Isla de la Cartuja, Avda Américo Vespucio 49, Seville, 41092, Spain.

    Cyanobacteria perform oxygenic photosynthesis, which makes them unique among the prokaryotes, and this feature together with their abundance and worldwide distribution renders them a central ecological role. Cyanobacteria and chloroplasts of plants and algae are believed to share a common ancestor and the modern chloroplast would thus be the remnant of an endosymbiosis between a eukaryotic cell and an ancestral oxygenic photosynthetic prokaryote. Chloroplast metabolic processes are coordinated with those of the other cellular compartments and are strictly controlled by means of regulatory systems that commonly involve redox reactions. Disulphide/dithiol exchange catalysed by thioredoxin is a fundamental example of such regulation and represents the molecular mechanism for light-dependent redox control of an ever-increasing number of chloroplast enzymatic activities. In contrast to chloroplast thioredoxins, the functions of the cyanobacterial thioredoxins have long remained elusive, despite their common origin. The sequenced genomes of several cyanobacterial species together with novel experimental approaches involving proteomics have provided new tools for re-examining the roles of the thioredoxin systems in these organisms. Thus, each cyanobacterial genome encodes between one and eight thioredoxins and all components necessary for the reduction of thioredoxins. Screening for thioredoxin target proteins in cyanobacteria indicates that assimilation and storage of nutrients, as well as some central metabolic pathways, are regulated by mechanisms involving disulphide/dithiol exchange, which could be catalysed by thioredoxins or related thiol-containing proteins.

    Photosynthesis research 2006;89;2-3;157-71

  • Identifying gene regulatory elements by genomic microarray mapping of DNaseI hypersensitive sites.

    Follows GA, Dhami P, Göttgens B, Bruce AW, Campbell PJ, Dillon SC, Smith AM, Koch C, Donaldson IJ, Scott MA, Dunham I, Janes ME, Vetrie D and Green AR

    Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 2XY, United Kingdom.

    The identification of cis-regulatory elements is central to understanding gene transcription. Hypersensitivity of cis-regulatory elements to digestion with DNaseI remains the gold-standard approach to locating such elements. Traditional methods used to identify DNaseI hypersensitive sites are cumbersome and can only be applied to short stretches of DNA at defined locations. Here we report the development of a novel genomic array-based approach to DNaseI hypersensitive site mapping (ADHM) that permits precise, large-scale identification of such sites from as few as 5 million cells. Using ADHM we identified all previously recognized hematopoietic regulatory elements across 200 kb of the mouse T-cell acute lymphocytic leukemia-1 (Tal1) locus, and, in addition, identified two novel elements within the locus, which show transcriptional regulatory activity. We further validated the ADHM protocol by mapping the DNaseI hypersensitive sites across 250 kb of the human TAL1 locus in CD34+ primary stem/progenitor cells and K562 cells and by mapping the previously known DNaseI hypersensitive sites across 240 kb of the human alpha-globin locus in K562 cells. ADHM provides a powerful approach to identifying DNaseI hypersensitive sites across large genomic regions.

    Genome research 2006;16;10;1310-9

  • COSMIC 2005.

    Forbes S, Clements J, Dawson E, Bamford S, Webb T, Dogan A, Flanagan A, Teague J, Wooster R, Futreal PA and Stratton MR

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

    The Catalogue Of Somatic Mutations In Cancer (COSMIC) database and web site was developed to preserve somatic mutation data and share it with the community. Over the past 25 years, approximately 350 cancer genes have been identified, of which 311 are somatically mutated. COSMIC has been expanded and now holds data previously reported in the scientific literature for 28 known cancer genes. In addition, there is data from the systematic sequencing of 518 protein kinase genes. The total gene count in COSMIC stands at 538; 25 have a mutation frequency above 5% in one or more tumour type, no mutations were found in 333 genes and 180 are rarely mutated with frequencies <5% in any tumour set. The COSMIC web site has been expanded to give more views and summaries of the data and provide faster query routes and downloads. In addition, there is a new section describing mutations found through a screen of known cancer genes in 728 cancer cell lines including the NCI-60 set of cancer cell lines.

    Funded by: Wellcome Trust

    British journal of cancer 2006;94;2;318-22

  • Copy number variation: new insights in genome diversity.

    Freeman JL, Perry GH, Feuk L, Redon R, McCarroll SA, Altshuler DM, Aburatani H, Jones KW, Tyler-Smith C, Hurles ME, Carter NP, Scherer SW and Lee C

    Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA.

    DNA copy number variation has long been associated with specific chromosomal rearrangements and genomic disorders, but its ubiquity in mammalian genomes was not fully realized until recently. Although our understanding of the extent of this variation is still developing, it seems likely that, at least in humans, copy number variants (CNVs) account for a substantial amount of genetic variation. Since many CNVs include genes that result in differential levels of gene expression, CNVs may account for a significant proportion of normal phenotypic variation. Current efforts are directed toward a more comprehensive cataloging and characterization of CNVs that will provide the basis for determining how genomic diversity impacts biological function, evolution, and common human diseases.

    Funded by: Wellcome Trust

    Genome research 2006;16;8;949-61

  • Pseudo-messenger RNA: phantoms of the transcriptome.

    Frith MC, Wilming LG, Forrest A, Kawaji H, Tan SL, Wahlestedt C, Bajic VB, Kai C, Kawai J, Carninci P, Hayashizaki Y, Bailey TL and Huminiecki L

    Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan.

    The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.

    PLoS genetics 2006;2;4;e23

  • Are molecular cytogenetics and bioinformatics suggesting diverging models of ancestral mammalian genomes?

    Froenicke L, Caldés MG, Graphodatsky A, Müller S, Lyons LA, Robinson TJ, Volleth M, Yang F and Wienberg J

    Department of Population Health and Reproduction, School of Veterinary Medicine, University of California Davis, Davis, California 95616, USA.

    Genome research 2006;16;3;306-10

  • Combined array-comparative genomic hybridization and single-nucleotide polymorphism-loss of heterozygosity analysis reveals complex changes and multiple forms of chromosomal instability in colorectal cancers.

    Gaasenbeek M, Howarth K, Rowan AJ, Gorman PA, Jones A, Chaplin T, Liu Y, Bicknell D, Davison EJ, Fiegler H, Carter NP, Roylance RR and Tomlinson IP

    Molecular and Population Genetics Laboratory, London Research Institute, Cancer Research UK, London WC2A 3PX, UK.

    Cancers with chromosomal instability (CIN) are held to be aneuploid/polyploid with multiple large-scale gains/deletions, but the processes underlying CIN are unclear and different types of CIN might exist. We investigated colorectal cancer cell lines using array-comparative genomic hybridization (CGH) for copy number changes and single-nucleotide polymorphism (SNP) microarrays for allelic loss (LOH). Many array-based CGH changes were not found by LOH because they did not cause true reduction-to-homozygosity. Conversely, many regions of SNP-LOH occurred in the absence of copy number change, comprising an average per cell line of 2 chromosomes with complete LOH; 1-2 terminal regions of LOH (mitotic recombination); and 1 interstitial region of LOH. SNP-LOH detected many novel changes, representing possible locations of uncharacterized tumor suppressor loci. Microsatellite unstable (MSI+) lines infrequently showed gains/deletions or whole-chromosome LOH, but their near-diploid karyotypes concealed mitotic recombination frequencies similar to those of MSI- lines. We analyzed p53 and chromosome 18q (SMAD4) in detail, including mutation screening. Almost all MSI- lines showed LOH and/or deletion of p53 and 18q; some near-triploid lines had acquired three independent changes at these loci. We found consistent results in primary colorectal cancers. Overall, the distributions of mitotic recombination and whole-chromosome LOH in the MSI- cell lines differed significantly from random, with some lines having much higher than expected levels of these changes. Moreover, lines with more LOH changes had significantly fewer copy number changes. These data suggest that CIN is not synonymous with copy number change and some cancers have a specific tendency to whole-chromosome deletion and regain or to mitotic recombination.

    Cancer research 2006;66;7;3471-9

  • The Gene Ontology (GO) project in 2006.

    Gene Ontology Consortium

    The Gene Ontology (GO) project develops and uses a set of structured, controlled vocabularies for community use in annotating genes, gene products and sequences. The GO Consortium continues to improve the vocabulary content, reflecting the impact of several novel mechanisms of incorporating community input. A growing number of model organism databases and genome annotation groups contribute annotation sets using GO terms to GO's public repository. Updates to the AmiGO browser have improved access to contributed genome annotations. As the GO project continues to grow, the use of the GO vocabularies is becoming more varied as well as more widespread. The GO project provides an ontological annotation system that enables biologists to infer knowledge from large amounts of data.

    Funded by: NHGRI NIH HHS: HG02273

    Nucleic acids research 2006;34;Database issue;D322-6

  • Zebrafish MiR-430 promotes deadenylation and clearance of maternal mRNAs.

    Giraldez AJ, Mishima Y, Rihel J, Grocock RJ, Van Dongen S, Inoue K, Enright AJ and Schier AF

    Developmental Genetics Program, Skirball Institute of Biomolecular Medicine, and Department of Cell Biology, New York University School of Medicine, New York, NY 10016, USA.

    MicroRNAs (miRNAs) comprise 1 to 3% of all vertebrate genes, but their in vivo functions and mechanisms of action remain largely unknown. Zebrafish miR-430 is expressed at the onset of zygotic transcription and regulates morphogenesis during early development. By using a microarray approach and in vivo target validation, we find that miR-430 directly regulates several hundred target messenger RNA molecules (mRNAs). Most targets are maternally expressed mRNAs that accumulate in the absence of miR-430. We also show that miR-430 accelerates the deadenylation of target mRNAs. These results suggest that miR-430 facilitates the deadenylation and clearance of maternal mRNAs during early embryogenesis.

    Science (New York, N.Y.) 2006;312;5770;75-9

  • Genetic screens for mutations affecting development of Xenopus tropicalis.

    Goda T, Abu-Daya A, Carruthers S, Clark MD, Stemple DL and Zimmerman LB

    Division of Developmental Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London, United Kingdom.

    We present here the results of forward and reverse genetic screens for chemically-induced mutations in Xenopus tropicalis. In our forward genetic screen, we have uncovered 77 candidate phenotypes in diverse organogenesis and differentiation processes. Using a gynogenetic screen design, which minimizes time and husbandry space expenditures, we find that if a phenotype is detected in the gynogenetic F2 of a given F1 female twice, it is highly likely to be a heritable abnormality (29/29 cases). We have also demonstrated the feasibility of reverse genetic approaches for obtaining carriers of mutations in specific genes, and have directly determined an induced mutation rate by sequencing specific exons from a mutagenized population. The Xenopus system, with its well-understood embryology, fate map, and gain-of-function approaches, can now be coupled with efficient loss-of-function genetic strategies for vertebrate functional genomics and developmental genetics.

    Funded by: Medical Research Council: MC_U117560482; NICHD NIH HHS: 1 R01 HD4 2276-01; Wellcome Trust

    PLoS genetics 2006;2;6;e91

  • The portability of tagSNPs across populations: a worldwide survey.

    González-Neira A, Ke X, Lao O, Calafell F, Navarro A, Comas D, Cann H, Bumpstead S, Ghori J, Hunt S, Deloukas P, Dunham I, Cardon LR and Bertranpetit J

    Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain.

    In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen whether tagSNPs defined in one population efficiently capture LD in other populations; that is, how portable tagSNPs are. Indeed, tagSNP portability is a challenge for the applicability of HapMap results. We analyzed 144 SNPs in a 1-Mb region of chromosome 22 in 1055 individuals from 38 worldwide populations, classified into seven continental groups. We measured tagSNP portability by choosing three reference populations (to approximate the three HapMap populations), defining tagSNPs, and applying them to other populations independently of the availability of information on the tagSNPs in the compared population. We found that tagSNPs are highly informative in other populations within each continental group. Moreover, tagSNPs defined in Europeans are often efficient for Middle Eastern and Central/South Asian populations. TagSNPs defined in the three reference populations are also efficient for more distant and differentiated populations (Oceania, Americas), in which the impact of their special demographic history on the genetic structure does not interfere with successfully detecting the most common haplotype variation. This high degree of portability lends promise to the search for disease association in different populations, once tagSNPs are defined in a few reference populations like those analyzed in the HapMap initiative.

    Genome research 2006;16;3;323-30

  • Geminin is essential to prevent endoreduplication and to form pluripotent cells during mammalian development.

    Gonzalez MA, Tachibana KE, Adams DJ, van der Weyden L, Hemberger M, Coleman N, Bradley A and Laskey RA

    Medical Research Council Cancer Cell Unit, Hutchison/MRC Research Centre, UK.

    In multicellular eukaryotes, geminin prevents overreplication of DNA in proliferating cells. Here, we show that genetic ablation of geminin in the mouse prevents formation of inner cell mass (ICM) and causes premature endoreduplication at eight cells, rather than 32 cells. All cells in geminin-deficient embryos commit to the trophoblast cell lineage and consist of trophoblast giant cells (TGCs) only. Geminin is also down-regulated in TGCs of wild-type blastocysts during S and gap-like phases by proteasome-mediated degradation, suggesting that loss of geminin is part of the mechanism regulating endoreduplication.

    Funded by: Medical Research Council: G120/824, MC_U105359878

    Genes & development 2006;20;14;1880-4

  • The synapse proteome and phosphoproteome: a new paradigm for synapse biology.

    Grant SG

    Genes to Cognition Programme, Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK.

    Synapse proteomics has recently resulted in a quantum leap in knowledge of the protein composition of brain synapses and its phosphorylation. We now have the first draft picture of the synapse, comprising approximately 1000 proteins. This is not matched by available methods of functional analysis either in reduced systems or in whole animals. Fewer than 20% of synapse proteome proteins have a known function in the nervous system. A concerted effort is required to establish new technical approaches before we can understand the diversity of functions conferred by the synapse proteome on the synapse, the neuron and the animal. This review will highlight this change in knowledge and discuss current technical and interpretative limitations challenged by synapse proteomics.

    Funded by: Wellcome Trust

    Biochemical Society transactions 2006;34;Pt 1;59-63

  • Statistical analysis of pathogenicity of somatic mutations in cancer.

    Greenman C, Wooster R, Futreal PA, Stratton MR and Easton DF

    Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, United Kingdom.

    Recent large-scale sequencing studies have revealed that cancer genomes contain variable numbers of somatic point mutations distributed across many genes. These somatic mutations most likely include passenger mutations that are not cancer causing and pathogenic driver mutations in cancer genes. Establishing a significant presence of driver mutations in such data sets is of biological interest. Whereas current techniques from phylogeny are applicable to large data sets composed of singly mutated samples, recently exemplified with a p53 mutation database, methods for smaller data sets containing individual samples with multiple mutations need to be developed. By constructing distinct models of both the mutation process and selection pressure upon the cancer samples, exact statistical tests to examine this problem are devised. Tests to examine the significance of selection toward missense, nonsense, and splice site mutations are derived, along with tests assessing variation in selection between functional domains. Maximum-likelihood methods facilitate parameter estimation, including levels of selection pressure and minimum numbers of pathogenic mutations. These methods are illustrated with 25 breast cancers screened across the coding sequences of 518 kinase genes, revealing 90 base substitutions in 71 genes. Significant selection pressure upon truncating mutations was established. Furthermore, an estimated minimum of 29.8 mutations were pathogenic.

    Funded by: Wellcome Trust

    Genetics 2006;173;4;2187-98

  • The DNA sequence and biological annotation of human chromosome 1.

    Gregory SG, Barlow KF, McLay KE, Kaul R, Swarbreck D, Dunham A, Scott CE, Howe KL, Woodfine K, Spencer CC, Jones MC, Gillson C, Searle S, Zhou Y, Kokocinski F, McDonald L, Evans R, Phillips K, Atkinson A, Cooper R, Jones C, Hall RE, Andrews TD, Lloyd C, Ainscough R, Almeida JP, Ambrose KD, Anderson F, Andrew RW, Ashwell RI, Aubin K, Babbage AK, Bagguley CL, Bailey J, Beasley H, Bethel G, Bird CP, Bray-Allen S, Brown JY, Brown AJ, Buckley D, Burton J, Bye J, Carder C, Chapman JC, Clark SY, Clarke G, Clee C, Cobley V, Collier RE, Corby N, Coville GJ, Davies J, Deadman R, Dunn M, Earthrowl M, Ellington AG, Errington H, Frankish A, Frankland J, French L, Garner P, Garnett J, Gay L, Ghori MR, Gibson R, Gilby LM, Gillett W, Glithero RJ, Grafham DV, Griffiths C, Griffiths-Jones S, Grocock R, Hammond S, Harrison ES, Hart E, Haugen E, Heath PD, Holmes S, Holt K, Howden PJ, Hunt AR, Hunt SE, Hunter G, Isherwood J, James R, Johnson C, Johnson D, Joy A, Kay M, Kershaw JK, Kibukawa M, Kimberley AM, King A, Knights AJ, Lad H, Laird G, Lawlor S, Leongamornlert DA, Lloyd DM, Loveland J, Lovell J, Lush MJ, Lyne R, Martin S, Mashreghi-Mohammadi M, Matthews L, Matthews NS, McLaren S, Milne S, Mistry S, Moore MJ, Nickerson T, O'Dell CN, Oliver K, Palmeiri A, Palmer SA, Parker A, Patel D, Pearce AV, Peck AI, Pelan S, Phelps K, Phillimore BJ, Plumb R, Rajan J, Raymond C, Rouse G, Saenphimmachak C, Sehra HK, Sheridan E, Shownkeen R, Sims S, Skuce CD, Smith M, Steward C, Subramanian S, Sycamore N, Tracey A, Tromans A, Van Helmond Z, Wall M, Wallis JM, White S, Whitehead SL, Wilkinson JE, Willey DL, Williams H, Wilming L, Wray PW, Wu Z, Coulson A, Vaudin M, Sulston JE, Durbin R, Hubbard T, Wooster R, Dunham I, Carter NP, McVean G, Ross MT, Harrow J, Olson MV, Beck S, Rogers J, Bentley DR, Banerjee R, Bryant SP, Burford DC, Burrill WD, Clegg SM, Dhami P, Dovey O, Faulkner LM, Gribble SM, Langford CF, Pandian RD, Porter KM and Prigmore E

    The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.

    Funded by: Medical Research Council: G0000107; Wellcome Trust

    Nature 2006;441;7091;315-21

  • miRBase: the microRNA sequence database.

    Griffiths-Jones S

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    The miRBase Sequence database is the primary repository for published microRNA (miRNA) sequence and annotation data. miRBase provides a user-friendly web interface for miRNA data, allowing the user to search using key words or sequences, trace links to the primary literature referencing the miRNA discoveries, analyze genomic coordinates and context, and mine relationships between miRNA sequences. miRBase also provides a confidential gene-naming service, assigning official miRNA names to novel genes before their publication. The methods outlined in this chapter describe these functions. miRBase is freely available to all.

    Funded by: Wellcome Trust

    Methods in molecular biology (Clifton, N.J.) 2006;342;129-38

  • miRBase: microRNA sequences, targets and gene nomenclature.

    Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A and Enright AJ

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    The miRBase database aims to provide integrated interfaces to comprehensive microRNA sequence data, annotation and predicted gene targets. miRBase takes over functionality from the microRNA Registry and fulfils three main roles: the miRBase Registry acts as an independent arbiter of microRNA gene nomenclature, assigning names prior to publication of novel miRNA sequences. miRBase Sequences is the primary online repository for miRNA sequence data and annotation. miRBase Targets is a comprehensive new database of predicted miRNA target genes. miRBase is available online.

    Funded by: Wellcome Trust: 077044

    Nucleic acids research 2006;34;Database issue;D140-4

  • EGASP: the human ENCODE Genome Annotation Assessment Project.

    Guigó R, Flicek P, Abril JF, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic VB, Birney E, Castelo R, Eyras E, Ucla C, Gingeras TR, Harrow J, Hubbard T, Lewis SE and Reese MG

    Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain.

    Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment.

    Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified.

    Conclusion: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.

    Funded by: Medical Research Council: G8225539

    Genome biology 2006;7 Suppl 1;S2.1-31

  • A shared Y-chromosomal heritage between Muslims and Hindus in India.

    Gutala R, Carvalho-Silva DR, Jin L, Yngvadottir B, Avadhanula V, Nanne K, Singh L, Chakraborty R and Tyler-Smith C

    Department of Medicine, University of Texas Health Science Center, San Antonio, TX, USA.

    Arab forces conquered the Indus Delta region in 711 AD and, although a Muslim state was established there, their influence was barely felt in the rest of South Asia at that time. By the end of the tenth century, Central Asian Muslims moved into India from the northwest and expanded throughout the subcontinent. Muslim communities are now the largest minority religion in India, comprising more than 138 million people in a predominantly Hindu population of over one billion. It is unclear whether the Muslim expansion in India was a purely cultural phenomenon or had a genetic impact on the local population. To address this question from a male perspective, we typed eight microsatellite loci and 16 binary markers from the Y chromosome in 246 Muslims from Andhra Pradesh, and compared them to published data on 4,204 males from East Asia, Central Asia, other parts of India, Sri Lanka, Pakistan, Iran, the Middle East, Turkey, Egypt and Morocco. We find that the Muslim populations in general are genetically closer to their non-Muslim geographical neighbors than to other Muslims in India, and that there is a highly significant correlation between genetics and geography (but not religion). Our findings indicate that, despite the documented practice of marriage between Muslim men and Hindu women, Islamization in India did not involve large-scale replacement of Hindu Y chromosomes. The Muslim expansion in India was predominantly a cultural change and was not accompanied by significant gene flow, as seen in other places, such as China and Central Asia.

    Funded by: Wellcome Trust: 077009

    Human genetics 2006;120;4;543-51

  • Molecular characterization of the porcine deleted in malignant brain tumors 1 gene (DMBT1).

    Haase B, Humphray SJ, Lyer S, Renner M, Poustka A, Mollenhauer J and Leeb T

    Institute of Genetics, Vetsuisse Faculty, University of Berne, Bremgartenstrasse 109a, 3001 Berne, Switzerland.

    The human gene deleted in malignant brain tumors 1 (DMBT1) is considered to play a role in tumorigenesis and pathogen defense. It encodes a protein with multiple scavenger receptor cysteine-rich (SRCR) domains, which are involved in recognition and binding of a broad spectrum of bacterial pathogens. The SRCR domains are encoded by highly homologous repetitive exons, whose number in humans may vary from 8 to 13 due to genetic polymorphism. Here, we characterized the porcine DMBT1 gene on the mRNA and genomic level. We assembled a 4.5 kb porcine DMBT1 cDNA sequence from RT-PCR amplified seminal vesicle RNA. The porcine DMBT1 cDNA contains an open reading frame of 4050 nt. The transcript gives rise to a putative polypeptide of 1349 amino acids with a calculated mass of 147.9 kDa. Compared to human DMBT1, it contains only four N-terminal SRCR domains. Northern blotting revealed transcripts of approximately 4.7 kb in size in the tissues analyzed. Analysis of ESTs suggested the existence of secreted and transmembrane variants. The porcine DMBT1 gene spans about 54 kb on chromosome 14q28-q29. In contrast to the characterized cDNA, the genomic BAC clone only contained 3 exons coding for N-terminal SRCR domains. In different mammalian DMBT1 orthologs large interspecific differences in the number of SRCR exons and utilization of the transmembrane exon exist. Our data suggest that the porcine DMBT1 gene may share with the human DMBT1 gene additional intraspecific variations in the number of SRCR-coding exons.

    Gene 2006;376;2;184-91

  • A conserved sequence motif in 3' untranslated regions of ribosomal protein mRNAs in nematodes.

    Hajarnavis A and Durbin R

    The 3' untranslated regions (3' UTR) of eukaryotic genes can contain motifs involved in regulation of gene expression or localization at the post-transcriptional level. This study concerns the identification of novel, conserved elements in 3' UTRs of many ribosomal protein mRNAs in Caenorhabditis elegans and Caenorhabditis briggsae. Analysis of the region around the polyadenylation signal in many ribosomal protein mRNAs indicates the conservation of a sequence motif UUGUU occurring both before and immediately after the polyadenylation signal. Building a statistical model of this motif and searching a database of C. elegans 3' UTRs reveals that this motif is also present in the 3' UTR of some genes involved in translation and ribosome maturation, among others. We suggest that this signal may be involved in translation or other message-level regulation of ribosomal genes in C. elegans.

    RNA (New York, N.Y.) 2006;12;10;1786-9
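    The motif search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' statistical model: it uses exact string matching for UUGUU, and it assumes the canonical AAUAAA hexamer as the polyadenylation signal (the abstract does not name the hexamer). The function name and the `window` parameter are hypothetical.

```python
import re

MOTIF = "UUGUU"            # motif reported in the abstract
POLY_A_SIGNAL = "AAUAAA"   # assumption: canonical hexamer, not stated in the abstract


def motif_offsets_around_signal(utr, window=30):
    """Return offsets of MOTIF relative to the start of the first poly(A) signal.

    Negative offsets lie upstream of the signal, positive offsets downstream.
    Only occurrences starting within `window` bases of the signal are reported.
    """
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA alphabets
    sig = utr.find(POLY_A_SIGNAL)
    if sig == -1:
        return []
    lo = max(0, sig - window)
    hi = min(len(utr), sig + len(POLY_A_SIGNAL) + window)
    # A lookahead pattern reports overlapping occurrences as well.
    pattern = re.compile(f"(?={MOTIF})")
    return [m.start() + lo - sig for m in pattern.finditer(utr[lo:hi])]


if __name__ == "__main__":
    example = "CCUUGUUCCAAUAAAUUGUUGG"  # made-up sequence for illustration
    print(motif_offsets_around_signal(example))
```

    A probabilistic model (for example a position weight matrix) would replace the exact match when degenerate copies of the motif need to be scored.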

  • Evaluation of a novel Vi conjugate vaccine in a murine model of salmonellosis.

    Hale C, Bowe F, Pickard D, Clare S, Haeuw JF, Powers U, Menager N, Mastroeni P and Dougan G

    The Wellcome Trust Genome Campus, The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Immunisation of BALB/c mice with a vaccine containing Vi polysaccharide conjugated to the Klebsiella pneumoniae outer membrane 40 kDa protein (rP40), in combination with Escherichia coli heat-labile toxin adjuvant (LT), elicited anti-Vi IgG antibodies after administration using different routes. Testing of the immune serum in opsonisation assays demonstrated the specific enhancement of Vi-positive bacterial uptake by cultured murine bone marrow derived macrophages. Intra-peritoneal challenge of mice immunised with the Vi-based vaccine elicited a degree of protection against virulent Vi+ Salmonella enterica serovar Typhimurium (S. Typhimurium). In contrast, Vi vaccination did not confer protection against oral challenge with virulent Vi-positive S. Typhimurium or S. Dublin.

    Funded by: Wellcome Trust

    Vaccine 2006;24;20;4312-20

  • Isolation of Salmonella enterica subspecies enterica serovar Paratyphi B dT+, or Salmonella Java, from Indonesia and alteration of the d-tartrate fermentation phenotype by disrupting the ORF STM 3356.

    Han KH, Choi SY, Lee JH, Lee H, Shin EH, Agtini MD, von Seidlein L, Ochiai RL, Clemens JD, Wain J, Hahn JS, Lee BK, Song M, Chun J and Kim DW

    International Vaccine Institute, San 4-8 Bongcheon 7 dong, Kwanak gu, Seoul, 151-818, Republic of Korea.

    Salmonella enterica subspecies enterica serovar Paratyphi B [O1,4,(5),12 : Hb : 1,2] can cause either an enteric fever (paratyphoid fever) or self-limiting gastroenteritis in humans. The d-tartrate non-fermenting variant S. enterica subsp. enterica serovar Paratyphi B dT- (S. Paratyphi B) is the causative agent of paratyphoid fever, and the d-tartrate fermenting variant S. enterica subsp. enterica serovar Paratyphi B dT+ (S. Paratyphi B dT+; formerly called Salmonella Java) causes gastroenteritis. S. Java is currently recognized as an emerging problem worldwide. Twelve dT+ S. Java isolates were collected in Indonesia between 2000 and 2002. One-third of them contained Salmonella genomic island 1 (SGI1), which gives the multidrug-resistant phenotype to the bacteria. In this study, a PCR-based method to detect a single nucleotide difference responsible for the inability to ferment d-tartrate, reported elsewhere, was validated. The d-tartrate fermenting phenotype of S. Java was converted to the non-fermenting phenotype by the disruption of the ORF STM 3356, and the d-tartrate non-fermenting phenotype of the ORF STM 3356-disrupted strain and the dT- reference strain was changed to the dT+ phenotype by complementing ORF STM 3356 in trans. The results show that the dT+ phenotype requires a functional product encoded by STM 3356, and support the use of the PCR-based discrimination method for S. Paratyphi B and S. Java as the standard differentiation method.

    Funded by: Wellcome Trust

    Journal of medical microbiology 2006;55;Pt 12;1661-5

  • Polymorphisms in the gene encoding sterol regulatory element-binding factor-1c are associated with type 2 diabetes.

    Harding AH, Loos RJ, Luan J, O'Rahilly S, Wareham NJ and Barroso I

    MRC Epidemiology Unit, Cambridge, UK.

    The sterol regulatory element-binding factor (SREBF)-1c is a transcription factor involved in the regulation of lipid and glucose metabolism. We have previously found evidence that a common SREBF1c single-nucleotide polymorphism (SNP), located between exons 18c and 19c, is associated with an increased risk of type 2 diabetes. The present study aimed to replicate our previously reported association in a larger case-control study and to examine an additional five SREBF1c SNPs for their association with diabetes risk and plasma glucose concentrations.

    Methods: We genotyped six SREBF1c SNPs in two case-control studies (n=1,938) and in a large cohort study (n=1,721) and tested for association with type 2 diabetes and with plasma glucose concentrations (fasting and 120-min post-glucose load), respectively.

    Results: In the case-control studies, carriers of the minor allele of the previously reported SNP (rs11868035) had a significantly increased diabetes risk (odds ratio [OR]=1.20 [95% CI 1.04-1.38], p=0.015). Also, three other SNPs (rs2236513, rs6502618 and rs1889018), located in the 5' region, were significantly associated with diabetes risk (OR ≥ 1.21, p ≤ 0.006). Furthermore, two SNPs (rs2236513 and rs1889018) in the 5' region were weakly (p<0.09) associated with plasma glucose concentrations in the cohort study. Rare homozygotes had increased (p ≤ 0.05) 120-min post-load glucose concentrations compared with carriers of the wild-type allele. Haplotype analyses showed significant (p=0.04) association with diabetes risk and confirmed the single SNP analyses.

    In summary, we replicated our previous finding and found evidence for SNPs in the 5' region of the SREBF1c gene to be associated with the risk of type 2 diabetes and plasma glucose concentration.

    Funded by: Medical Research Council: MC_U106179471, MC_U106188470; Wellcome Trust: 077016

    Diabetologia 2006;49;11;2642-8

  • GENCODE: producing a reference annotation for ENCODE.

    Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D, Rossier C, Ucla C, Hubbard T, Antonarakis SE and Guigo R

    Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambridge CB10 1SA, UK.

    Background: The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results.

    Results: The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions.

    Conclusion: In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of GENCODE annotation with RefSeq and ENSEMBL shows only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP, and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.

    Genome biology 2006;7 Suppl 1;S4.1-9

  • Genome-wide characterization of fission yeast DNA replication origins.

    Heichinger C, Penkett CJ, Bähler J and Nurse P

    Laboratory of Yeast Genetics and Cell Biology, The Rockefeller University, New York, NY 10021, USA.

    Eukaryotic DNA replication is initiated from multiple origins of replication, but little is known about the global regulation of origins throughout the genome or in different types of cell cycles. Here, we identify 401 strong origins and 503 putative weaker origins spaced in total every 14 kb throughout the genome of the fission yeast Schizosaccharomyces pombe. The same origins are used during premeiotic and mitotic S-phases. We found that few origins fire late in mitotic S-phase and that activating the Rad3 dependent S-phase checkpoint by inhibiting DNA replication had little effect on which origins were fired. A genome-wide analysis of eukaryotic origin efficiencies showed that efficiency was variable, with large chromosomal domains enriched for efficient or inefficient origins. Average efficiency is twice as high during mitosis compared with meiosis, which can account for their different S-phase lengths. We conclude that there is a continuum of origin efficiency and that there is differential origin activity in the mitotic and meiotic cell cycles.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    The EMBO journal 2006;25;21;5171-9

  • Novel lethal mouse mutants produced in balancer chromosome screens.

    Hentges KE, Nakamura H, Furuta Y, Yu Y, Thompson DM, O'Brien W, Bradley A and Justice MJ

    Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.

    Mutagenesis screens are a valuable method to identify genes that are required for normal development. Previous mouse mutagenesis screens for lethal mutations were targeted at specific time points or for developmental processes. Here we present the results of lethal mutant isolation from two mutagenesis screens that use balancer chromosomes. One screen was localized to mouse chromosome 4, between the STS markers D4Mit281 and D4Mit51. The second screen covered the region between Trp53 and Wnt3 on mouse chromosome 11. These screens identified all lethal mutations in the balancer regions, without bias towards any phenotype or stage of death. We have isolated 19 lethal lines on mouse chromosome 4, and 59 lethal lines on chromosome 11, many of which are distinct from previous mutants that map to these regions of the genome. We have characterized the mutant lines to determine the time of death, and performed a pair-wise complementation cross to determine if the mutations are allelic. Our data suggest that the majority of mouse lethal mutants die during mid-gestation, after uterine implantation, with a variety of defects in gastrulation, heart, neural tube, vascular, or placental development. This initial group of mutants provides a functional annotation of mouse chromosomes 4 and 11, and indicates that many novel developmental phenotypes can be quickly isolated in defined genomic intervals through balancer chromosome mutagenesis screens.

    Funded by: NICHD NIH HHS: F32 HD42436, U01 HD39372; Wellcome Trust: 077187

    Gene expression patterns : GEP 2006;6;6;653-65

  • The grapes of wrath.

    Holden M, Lindsay J and Bentley S

    Nature reviews. Microbiology 2006;4;11;806-7

  • Insights into social insects from the genome of the honeybee Apis mellifera.

    Honeybee Genome Sequencing Consortium

    Here we report the genome sequence of the honeybee Apis mellifera, a key model for social behaviour and essential to global ecology through pollination. Compared with other sequenced insect genomes, the A. mellifera genome has high A+T and CpG contents, lacks major transposon families, evolves more slowly, and is more similar to vertebrates for circadian rhythm, RNA interference and DNA methylation genes, among others. Furthermore, A. mellifera has fewer genes for innate immunity, detoxification enzymes, cuticle-forming proteins and gustatory receptors, more genes for odorant receptors, and novel genes for nectar and pollen utilization, consistent with its ecology and social organization. Compared to Drosophila, genes in early developmental pathways differ in Apis, whereas similarities exist for functions that differ markedly, such as sex determination, brain function and behaviour. Population genetics suggests a novel African origin for the species A. mellifera and insights into whether Africanized bees spread throughout the New World via hybridization or displacement.

    Funded by: Medical Research Council: MC_U137761447; NIGMS NIH HHS: R01 GM058634, R01 GM058634-08, R01 GM067317-03, R37 GM041247-23; NINDS NIH HHS: R01 NS040296-06, R01 NS043244; Wellcome Trust: 062023

    Nature 2006;443;7114;931-49

  • The LRC haplotype project: a resource for killer immunoglobulin-like receptor-linked association studies.

    Horton R, Coggill P, Miretti MM, Sambrook JG, Traherne JA, Ward R, Sims S, Palmer S, Sehra H, Harrow J, Rogers J, Carrington M, Trowsdale J and Beck S

    Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    There is increasing evidence for epistatic interactions between gene products (e.g. KIR) encoded within the Leukocyte Receptor Complex (LRC) and those (e.g. HLA) of the Major Histocompatibility Complex (MHC), resulting in susceptibility to disease. Identification of such associations at the DNA level requires comprehensive knowledge of the genetic variation and haplotype structure of the underlying loci. The LRC haplotype project aims to provide this knowledge by sequencing common LRC haplotypes.

    Funded by: Medical Research Council: G0401569, G9800943; NCI NIH HHS: N01-CO-12400; Wellcome Trust: 077198

    Tissue antigens 2006;68;5;450-2

  • Phylogenomics of several deer species revealed by comparative chromosome painting with Chinese muntjac paints.

    Huang L, Chi J, Nie W, Wang J and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, The Chinese Academy of Sciences, 650223, Kunming, Yunnan, PR China.

    A set of Chinese muntjac (Muntiacus reevesi) chromosome-specific paints has been hybridized onto the metaphases of sika deer (Cervus nippon, CNI, 2n = 66), red deer (Cervus elaphus, CEL, 2n = 62) and tufted deer (Elaphodus cephalophus, ECE, 2n = 47). Thirty-three homologous autosomal segments were detected in the genomes of sika deer and red deer, while 31 autosomal homologous segments were delineated in the genome of tufted deer. The Chinese muntjac chromosome X probe painted to the whole X chromosome, and the chromosome Y probe gave signals on the Y chromosome as well as the distal region of the X chromosome of each species. Our results confirmed that exclusive Robertsonian translocations have contributed to the karyotypic evolution of sika deer and red deer. In addition to Robertsonian translocation, tandem fusions have played a more important role in the karyotypic evolution of tufted deer. Different types of chromosomal rearrangements have led to great differences in the genome organization between Cervinae and Muntiacinae species. Our analysis showed that six chromosomal fissions in the proposed 2n = 58 ancestral pecoran karyotype led to the formation of the 2n = 70 ancestral cervid karyotype and that the deer karyotypes are more derived compared with those of bovid species. Combining previous cytogenetic and molecular systematic studies, we analyzed the genome phylogeny for 11 cervid species.

    Genetica 2006;127;1-3;25-33

  • High-density comparative BAC mapping in the black muntjac (Muntiacus crinifrons): molecular cytogenetic dissection of the origin of MCR 1p+4 in the X1X2Y1Y2Y3 sex chromosome system.

    Huang L, Chi J, Wang J, Nie W, Su W and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, People's Republic of China.

    The black muntjac (Muntiacus crinifrons, 2n = 8♀/9♂) is a critically endangered mammalian species that is confined to a narrow region of southeastern China. Male black muntjacs have an astonishing X1X2Y1Y2Y3 sex chromosome system, unparalleled in eutherian mammals, involving approximately half of the entire genome. A high-resolution comparative map between the black muntjac (M. crinifrons) and the Chinese muntjac (M. reevesi, 2n = 46) has been constructed based on the chromosomal localization of 304 clones from a genomic BAC (bacterial artificial chromosome) library of the Indian muntjac (M. muntjak vaginalis, 2n = 6♀/7♂). In addition to validating the chromosomal homologies between M. reevesi and M. crinifrons defined previously by chromosome painting, the comparative BAC map demonstrates that all tandem fusions that have occurred in the karyotypic evolution of M. crinifrons are centromere-telomere fusions. The map also allows for a more detailed reconstruction of the chromosomal rearrangements leading to this unique and complex sex chromosome system. Furthermore, we have identified 46 BAC clones that could be used to study the molecular evolution of the unique sex chromosomes of the male black muntjacs.

    Genomics 2006;87;5;608-15

  • Tandem chromosome fusions in karyotypic evolution of Muntiacus: evidence from M. feae and M. gongshanensis.

    Huang L, Wang J, Nie W, Su W and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, and the Graduate School of the Chinese Academy of Sciences, Jiaochang Dong Lu 32#, Kunming, Yunnan 650223, PR China.

    The muntjacs (Muntiacus, Cervidae) are famous for their rapid and radical karyotypic diversification via repeated tandem chromosome fusions, constituting a paradigm for the studies of karyotypic evolution. Of the five muntjac species with defined karyotypes, three species (i.e. Muntiacus reevesi, 2n = 46; M. m. vaginalis, 2n = 6/7; and M. crinifrons, 2n = 8/9) have so far been investigated by a combined approach of comparative chromosome banding, chromosome painting and BAC mapping. The results demonstrated that extensive centromere-telomere fusions and a few centric fusions are the chromosomal mechanisms underlying the karyotypic evolution of muntjacs. Here we have applied the same approach to two additional muntjac species with less well-characterized karyotypes, M. feae (2n = 14 male) and M. gongshanensis (2n = 8 female). High-resolution G-banded karyotypes for M. feae and M. gongshanensis are provided. The integrated analysis of hybridization results led to the establishment of a high-resolution comparative map between M. reevesi, M. feae, and M. gongshanensis, proving that all tandem fusions underpinning the karyotypic evolution of these two muntjac species are also centromere-telomere fusions. Furthermore, the results have improved our understanding of the karyotypic relationships of extant muntjac species and provided compelling cytogenetic evidence that supports the view that M. crinifrons, M. feae, and M. gongshanensis should each be treated as a distinct species.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2006;14;6;637-47

  • A hypermutation phenotype and somatic MSH6 mutations in recurrent human malignant gliomas after alkylator chemotherapy.

    Hunter C, Smith R, Cahill DP, Stephens P, Stevens C, Teague J, Greenman C, Edkins S, Bignell G, Davies H, O'Meara S, Parker A, Avis T, Barthorpe S, Brackenbury L, Buck G, Butler A, Clements J, Cole J, Dicks E, Forbes S, Gorton M, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jenkinson A, Jones D, Kosmidou V, Laman R, Lugg R, Menzies A, Perry J, Petty R, Raine K, Richardson D, Shepherd R, Small A, Solomon H, Tofts C, Varian J, West S, Widaa S, Yates A, Easton DF, Riggins G, Roy JE, Levine KK, Mueller W, Batchelor TT, Louis DN, Stratton MR, Futreal PA and Wooster R

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Malignant gliomas have a very poor prognosis. The current standard of care for these cancers consists of extended adjuvant treatment with the alkylating agent temozolomide after surgical resection and radiotherapy. Although a statistically significant increase in survival has been reported with this regimen, nearly all gliomas recur and become insensitive to further treatment with this class of agents. We sequenced 500 kb of genomic DNA corresponding to the kinase domains of 518 protein kinases in each of nine gliomas. Large numbers of somatic mutations were observed in two gliomas recurrent after alkylating agent treatment. The pattern of mutations in these cases showed strong similarity to that induced by alkylating agents in experimental systems. Further investigation revealed inactivating somatic mutations of the mismatch repair gene MSH6 in each case. We propose that inactivating somatic mutations of MSH6 confer resistance to alkylating agents in gliomas in vivo and concurrently unleash accelerated mutagenesis in resistant clones as a consequence of continued exposure to alkylating agents in the presence of defective mismatch repair. The evidence therefore suggests that when MSH6 is inactivated in gliomas, alkylating agents convert from induction of tumor cell death to promotion of neoplastic progression. These observations highlight the potential of large scale sequencing for revealing and elucidating mutagenic processes operative in individual human cancers.

    Funded by: Wellcome Trust

    Cancer research 2006;66;8;3987-91

  • Recombination hotspots in nonallelic homologous recombination

    Hurles ME and Lupski JR

    Genomic Disorders: The Genomic Basis of Disease 2006;Chapter 24;341-355

  • Y-chromosomal rearrangements and azoospermia

    Hurles ME and Tyler-Smith C

    Genomic Disorders: The Genomic Basis of Disease 2006;19;273-288

  • Small regions of overlapping deletions on 6q26 in human astrocytic tumours identified using chromosome 6 tile path array-CGH.

    Ichimura K, Mungall AJ, Fiegler H, Pearson DM, Dunham I, Carter NP and Collins VP

    Department of Pathology, Division of Molecular Histopathology, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK.

    Deletions of chromosome 6 are a common abnormality in diverse human malignancies including astrocytic tumours, suggesting the presence of tumour suppressor genes (TSG). In order to help identify candidate TSGs, we have constructed a chromosome 6 tile path microarray. The array contains 1,780 clones (778 P1-derived artificial chromosome and 1,002 bacterial artificial chromosome) that cover 98.3% of the published chromosome 6 sequences. A total of 104 adult astrocytic tumours (10 diffuse astrocytomas, 30 anaplastic astrocytomas (AA), 64 glioblastomas (GB)) were analysed using this array. Single copy number change was successfully detected and the result was in general concordant with a microsatellite analysis. The pattern of copy number change was complex with multiple interstitial deletions/gains. However, a predominance of telomeric 6q deletions was seen. Two small common and overlapping regions of deletion at 6q26 were identified. One was 1,002 kb in size and contained PACRG and QKI, while the second was 199 kb and harbours a single gene, ARID1B. The data show that the chromosome 6 tile path array is useful in mapping copy number changes with high resolution and accuracy. We confirmed the high frequency of chromosome 6 deletions in AA and GB, and identified two novel commonly deleted regions that may harbour TSGs.

    Funded by: Cancer Research UK: A6618

    Oncogene 2006;25;8;1261-71

  • Mutation analysis of 24 known cancer genes in the NCI-60 cell line set.

    Ikediobi ON, Davies H, Bignell G, Edkins S, Stevens C, O'Meara S, Santarius T, Avis T, Barthorpe S, Brackenbury L, Buck G, Butler A, Clements J, Cole J, Dicks E, Forbes S, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Hunter C, Jenkinson A, Jones D, Kosmidou V, Lugg R, Menzies A, Mironenko T, Parker A, Perry J, Raine K, Richardson D, Shepherd R, Small A, Smith R, Solomon H, Stephens P, Teague J, Tofts C, Varian J, Webb T, West S, Widaa S, Yates A, Reinhold W, Weinstein JN, Stratton MR, Futreal PA and Wooster R

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    The panel of 60 human cancer cell lines (the NCI-60) assembled by the National Cancer Institute for anticancer drug discovery is a widely used resource. The NCI-60 has been characterized pharmacologically and at the molecular level more extensively than any other set of cell lines. However, no systematic mutation analysis of genes causally implicated in oncogenesis has been reported. This study reports the sequence analysis of 24 known cancer genes in the NCI-60 and an assessment of 4 of the 24 genes for homozygous deletions. One hundred thirty-seven oncogenic mutations were identified in 14 (APC, BRAF, CDKN2, CTNNB1, HRAS, KRAS, NRAS, SMAD4, PIK3CA, PTEN, RB1, STK11, TP53, and VHL) of the 24 genes. All lines have at least one mutation among the cancer genes examined, with most lines (73%) having more than one. Identification of those cancer genes mutated in the NCI-60, in combination with pharmacologic and molecular profiles of the cells, will allow for more informed interpretation of anticancer agent screening and will enhance the use of the NCI-60 cell lines for molecularly targeted screens.

    Funded by: Wellcome Trust: 077012

    Molecular cancer therapeutics 2006;5;11;2606-12

  • Comparative genomics and concerted evolution of beta-tubulin paralogs in Leishmania spp.

    Jackson AP, Vaughan S and Gull K

    Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK.

    Background: Tubulin isotypes and expression patterns are highly regulated in diverse organisms. The genome sequence of the protozoan parasite Leishmania major contains three distinct beta-tubulin loci. To investigate the diversity of beta-tubulin genes, we have compared the published genome sequence to draft genome sequences of two further species L. infantum and L. braziliensis. Untranscribed regions and coding sequences for each isoform were compared within and between species in relation to the known diversity of beta-tubulin transcripts in Leishmania spp.

    Results: All three beta-tubulin loci were present in L. infantum and L. braziliensis, showing conserved synteny with the L. major sequence, hence confirming that these loci are paralogous. Flanking regions suggested that the chromosome 21 locus is an amastigote-specific isoform and more closely related (either structurally or functionally) to the chromosome 33 'array' locus than the chromosome 8 locus. A phylogenetic network of all isoforms indicated that paralogs from L. braziliensis and L. mexicana were monophyletic, rather than clustering by locus.

    Conclusion: L. braziliensis and L. mexicana sequences appeared more similar to each other than each did to its closest relative in another species; this indicates that these sequences have evolved convergently in each species, perhaps through ectopic gene conversion; a process not yet evident among the more recently derived L. major and L. infantum isoforms. The distinctive non-coding regions of each beta-tubulin locus showed that it is the regulatory regions of these loci that have evolved most during the diversification of these genes in Leishmania, while the coding regions have been conserved and concerted. The various loci in Leishmania satisfy a need for innovative expression of beta-tubulin, rather than elaboration of its structural role.

    Funded by: Wellcome Trust

    BMC genomics 2006;7;137

  • Evolution of tubulin gene arrays in Trypanosomatid parasites: genomic restructuring in Leishmania.

    Jackson AP, Vaughan S and Gull K

    Background: alpha- and beta-tubulin are fundamental components of the eukaryotic cytoskeleton and cell division machinery. While overall tubulin expression is carefully controlled, most eukaryotes express multiple tubulin genes in specific regulatory or developmental contexts. The genomes of the human parasites Trypanosoma brucei and Leishmania major reveal that these unicellular kinetoplastids possess arrays of tandem-duplicated tubulin genes, but with differences in organisation. While L. major possesses monotypic alpha and beta arrays in trans, an array of alternating alpha- and beta tubulin genes occurs in T. brucei. Polycistronic transcription in these organisms makes the chromosomal arrangement of tubulin genes important with respect to gene expression.

    Results: We investigated the genomic architecture of tubulin tandem arrays among these parasites, establishing which character state is derived, and the timing of character transition. Tubulin loci in T. brucei and L. major were compared to examine the relationship between the two character states. Intergenic regions between tubulin genes were sequenced from several trypanosomatids and related, non-parasitic bodonids to identify the ancestral state. Evidence of alternating arrays was found among non-parasitic kinetoplastids and all Trypanosoma spp.; monotypic arrays were confirmed in all Leishmania spp. and close relatives.

    Conclusion: Alternating and monotypic tubulin arrays were found to be mutually exclusive through comparison of genome sequences. The presence of alternating gene arrays in non-parasitic kinetoplastids confirmed that separate, monotypic arrays are the derived state and evolved through genomic restructuring in the lineage leading to Leishmania. This fundamental reorganisation accounted for the dissimilar genomic architectures of T. brucei and L. major tubulin repertoires.

  • Array-based comparative genomic hybridisation identifies high frequency of cryptic chromosomal rearrangements in patients with syndromic autism spectrum disorders.

    Background: Autism spectrum disorders (ASD) refer to a broader group of neurobiological conditions, pervasive developmental disorders. They are characterised by a symptomatic triad associated with qualitative changes in social interactions, defect in communication abilities, and repetitive and stereotyped interests and activities. ASD is prevalent in 1 to 3 per 1000 people. Despite several arguments for a strong genetic contribution, the molecular basis of most cases remains unexplained. About 5% of patients with autism have a chromosome abnormality visible with cytogenetic methods. The most frequent are 15q11-q13 duplication, 2q37 and 22q13.3 deletions. Many other chromosomal imbalances have been described. However, most of them remain undetectable using routine karyotype analysis, thus impeding diagnosis and genetic counselling.

    29 patients presenting with syndromic ASD were investigated using a DNA microarray constructed from large insert clones spaced at approximately 1 Mb intervals across the genome. Eight clinically relevant rearrangements were identified in 8 (27.5%) patients: six deletions and two duplications. Altered segments ranged in size from 1.4 to 16 Mb (2-19 clones). No recurrent abnormality was identified.

    Conclusion: These results clearly show that array comparative genomic hybridisation should be considered to be an essential aspect of the genetic analysis of patients with syndromic ASD. Moreover, besides their importance for diagnosis and genetic counselling, they may allow the delineation of new contiguous gene syndromes associated with ASD. Finally, the detailed molecular analysis of the rearranged regions may pave the way for the identification of new ASD genes.

  • Differential gene expression of pathogens inside infected hosts.

    Jansen A and Yu J

    DNA microarray is a useful technology for studying differential gene expression in the context of microbe-host interactions. This review concentrates on recent findings of the survival strategies of three intracellular pathogens: Shigella flexneri, Salmonella enterica serovar Typhimurium and Mycobacterium tuberculosis.

  • Detailed assessment of chromosome 22 aberrations in sporadic pheochromocytoma using array-CGH.

    Pheochromocytoma is a predominantly sporadic neuroendocrine tumor derived from the adrenal medulla. Previous low resolution LOH and metaphase-CGH studies reported the loss of chromosomes 1p, 3q, 17p and 22q at various frequencies. However, the molecular mechanism(s) behind development of sporadic pheochromocytoma remains largely unknown. We have applied high-resolution tiling-path microarray-CGH with the primary aim to characterize copy number imbalances affecting chromosome 22 in 66 sporadic pheochromocytomas. We detected copy number alterations on 22q at a frequency of 44%. The predominant finding was monosomy 22 (30%), followed by terminal deletions in 8 samples (12%) and a single interstitial deletion. We further applied a chromosome 1 tiling-path array in 7 tumors with terminal deletions of 22q and found deletions of 1p in all cases. Our overall results suggest that at least 2 distinct regions on both 22q and 1p are important in the tumorigenesis of sporadic pheochromocytoma. A large proportion of pheochromocytomas also displayed indications of cellular heterogeneity. Our study is to our knowledge the first array-CGH study of sporadic pheochromocytoma. Future analysis of this tumor type should preferably be performed in the context of the entire human genome using genome-wide array-CGH, which is a superior methodological approach. Supplemental material for this article can be found on the International Journal of Cancer website at

  • The biology of intron gain and loss.

    Jeffares DC, Mourier T and Penny D

    Intron density in eukaryote genomes varies by more than three orders of magnitude, so there must have been extensive intron gain and/or intron loss during evolution. A favored and partial explanation for this range of intron densities has been that introns have accumulated stochastically in large eukaryote genomes during their evolution from an intron-poor ancestor. However, recent studies have shown that some eukaryotes lost many introns, whereas others accumulated and/or gained many introns. In this article, we discuss the growing evidence that these differences are subject to selection acting on introns depending on the biology of the organism and the gene involved.

  • [X]uniqMAP: unique gene sequence regions in the human and mouse genomes.

    Jiménez JL and Durbin R

    Background: Current approaches for genome-wide functional analyses, such as microarray and RNA interference studies, rely on the specificity of oligonucleotide sequences to selectively target cellular transcripts. The design of specific oligos involves the determination of unique DNA regions in the gene/transcripts of interest from the targeted organism. This process is tedious and time-consuming, and it does not scale up for high-throughput studies.

    Description: Taking advantage of the availability of complete genome sequence information for mouse and human, the most widely used systems for the study of mammalian genetics, we have built a database, [X]uniqMAP, that stores the precalculated unique regions for all transcripts of these two organisms. For each gene, the database discriminates between those unique regions that are shared by all transcripts and those exclusive to single transcripts. In addition, it also provides those unique regions that are shared between orthologous genes from the two organisms. The database is updated regularly to reflect changes in genome assemblies and gene builds.

    Conclusion: Over 85% of genes have unique regions at least 19 bases long, with the majority being unique over 60% of their lengths. 14,482 human genes share at least one exactly matching unique region with mouse genes, though such regions are typically under 40 bases long. The full data are publicly accessible online both interactively and for download. They should facilitate (i) the design of probes, primers and siRNAs for both small- and large-scale projects; and (ii) the identification of regions for the design of oligos that could be re-used to target equivalent gene/transcripts from human and mouse.
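
    The idea of a precalculated "unique region" can be illustrated with a small k-mer sketch. This is not the [X]uniqMAP pipeline, whose method is not described in the abstract; the function name `unique_regions`, the toy sequences and the choice of exact k-mer matching on one strand are all assumptions made for illustration.

```python
from collections import defaultdict

def unique_regions(transcripts, k=19):
    """Return, per transcript, half-open (start, end) intervals covered by
    consecutive k-mers that occur in no other transcript (hypothetical helper)."""
    owners = defaultdict(set)  # k-mer -> names of transcripts containing it
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    regions = {}
    for name, seq in transcripts.items():
        runs = []
        for i in range(len(seq) - k + 1):
            if len(owners[seq[i:i + k]]) != 1:
                continue  # k-mer is shared with another transcript
            if runs and runs[-1][1] == i + k - 1:
                runs[-1][1] = i + k  # extend the current run by one base
            else:
                runs.append([i, i + k])  # start a new run
        regions[name] = [tuple(r) for r in runs]
    return regions

# Toy input (made up): the two sequences share the k-mers CCCC, CCCG and CCGG.
toy = {"A": "AAAACCCCGGGG", "B": "TTTTCCCCGGTT"}
toy_regions = unique_regions(toy, k=4)
```

    With k = 19, the default here, each reported interval is at least 19 bases long, which corresponds to the minimum length quoted in the abstract. A real implementation would presumably also need to consider the reverse strand and near-matches.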

  • Forensic genetic analysis of mitochondrial DNA hypervariable region I/II sequences: an expanded Korean population database.

    Jin HJ, Kwak KD, Hong SB, Shin DJ, Han MS, Tyler-Smith C and Kim W

    We have analyzed variation of the mitochondrial DNA (mtDNA) hypervariable segments I and II (HVS-I and HVS-II) in 185 randomly chosen individuals from Korea to provide an expanded and reliable Korean database. Combined sequence comparison of HVS-I and HVS-II led to the identification of 167 different haplotypes characterized by 154 variable sites. One hundred and fifty-one of the haplotypes were individual-specific, 14 were found in two individuals and 2 were found in three individuals. A pairwise comparison of the 185 HVS-I/II sequences found an average of 10.11 +/- 4.63 differences between individuals. The random match probability and gene diversity for the combined hypervariable regions were estimated at 0.66% and 0.9988, respectively. Analyzing the expanded database including three previously reported data sets and the present data using haplogroup-based comparisons and comparison with closely related sequences allowed errors to be detected and eliminated, thus considerably improving data quality. Sample division comparisons based on PhiST genetic distance measures revealed no significant population differentiation in the distribution of mtDNA sequence variations between the present data set and a database in The Scientific Working Group on DNA Analysis Methods (SWGDAM), but did indicate differences from other sets of data. Based on the results of mtDNA profiles, almost all of the mtDNA types studied here could be classified into subsets of haplogroups common in east Asia, and show that the Koreans possess lineages from both the southern and the northern haplogroup complexes of east Asian populations. The new data, combined with other mtDNA sequences, demonstrate how useful comparison with closely related mtDNA sequences can be for improving database quality, as well as providing haplotype information for forensic and population genetic analyses in the Korean population.
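
    The two summary statistics above can be reproduced from the haplotype counts given in the abstract. The sketch below assumes the standard definitions, which the abstract does not state: random match probability as the sum of squared haplotype frequencies, and gene diversity as n(1 - sum of squared frequencies)/(n - 1). The function name is hypothetical.

```python
def match_statistics(haplotype_counts):
    """Random match probability and gene diversity from per-haplotype counts.

    Assumes the standard definitions (not stated in the abstract):
    RMP = sum(p_i^2), diversity = n / (n - 1) * (1 - sum(p_i^2)).
    """
    n = sum(haplotype_counts)
    sum_sq = sum((c / n) ** 2 for c in haplotype_counts)
    diversity = n / (n - 1) * (1 - sum_sq)
    return n, sum_sq, diversity

# Counts from the abstract: 151 haplotypes seen once, 14 seen twice, 2 seen three times.
counts = [1] * 151 + [2] * 14 + [3] * 2
n, rmp, diversity = match_statistics(counts)
print(n, f"{rmp:.2%}", f"{diversity:.4f}")
```

    Under these assumptions the counts sum to 185 individuals and give a random match probability of about 0.66% and a gene diversity of about 0.9988, in agreement with the reported values.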

  • Identification of the REST regulon reveals extensive transposable element-mediated binding site duplication.

    Johnson R, Gamblin RJ, Ooi L, Bruce AW, Donaldson IJ, Westhead DR, Wood IC, Jackson RM and Buckley NJ

    The genome-wide mapping of gene-regulatory motifs remains a major goal that will facilitate the modelling of gene-regulatory networks and their evolution. The repressor element 1 is a long, conserved transcription factor-binding site which recruits the transcriptional repressor REST to numerous neuron-specific target genes. REST plays important roles in multiple biological processes and disease states. To map RE1 sites and target genes, we created a position specific scoring matrix representing the RE1 and used it to search the human and mouse genomes. We identified 1301 and 997 RE1s in human and mouse genomes, respectively, of which >40% are novel. By employing an ontological analysis we show that REST target genes are significantly enriched in a number of functional classes. Taking the novel REST target gene CACNA1A as an experimental model, we show that it can be regulated by multiple RE1s of different binding affinities, which are only partially conserved between human and mouse. A novel BLAST methodology indicated that many RE1s belong to closely related families. Most of these sequences are associated with transposable elements, leading us to propose that transposon-mediated duplication and insertion of RE1s has led to the acquisition of novel target genes by REST during evolution.
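
    A position specific scoring matrix search of the kind mentioned above can be sketched as follows. The four-base motif, the pseudocount and the uniform background are toy assumptions; this is not the authors' RE1 matrix or scoring threshold, and the function names are hypothetical.

```python
import math

BASES = "ACGT"

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds matrix from equal-length aligned sites (toy construction)."""
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm

def scan(sequence, pssm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pssm)
    return [
        (i, sum(pssm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]

# Made-up aligned sites and target sequence, for illustration only.
toy_sites = ["ACGT", "ACGT", "ACGA", "TCGT"]
toy_pssm = build_pssm(toy_sites)
hits = scan("GGACGTGG", toy_pssm)
best_position, best_score = max(hits, key=lambda hit: hit[1])
```

    In a genome-scale search, windows scoring above a chosen cutoff on either strand would be reported as candidate sites; the cutoff used in the study is not given in the abstract.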

  • Immunization with the iron uptake ABC transporter proteins PiaA and PiuA prevents respiratory infection with Streptococcus pneumoniae.

    Jomaa M, Terry S, Hale C, Jones C, Dougan G and Brown J

    Previous studies show that vaccination with the recombinant Streptococcus pneumoniae lipoproteins PiuA and PiaA protects mice against systemic S. pneumoniae disease. The aim of this study was to assess the level of conservation of PiaA and PiuA and a third iron uptake ABC transporter lipoprotein, PitA, between common S. pneumoniae capsular serotypes by sequencing the corresponding genes, and to investigate whether these antigens can protect against respiratory infection. The nucleotide sequences of piuA and piaA were highly conserved in all strains, whereas pitA had significant variation in its nucleotide sequence making PitA an unattractive vaccine candidate. Mucosal vaccination of mice with PiuA and PiaA elicited specific antibody responses in serum and respiratory secretions, and protected against intranasal challenge with S. pneumoniae. These results provide further data indicating that PiuA and PiaA would be suitable candidates for a S. pneumoniae protein antigen vaccine.

  • A conserved supergene locus controls colour pattern diversity in Heliconius butterflies.

    We studied whether similar developmental genetic mechanisms are involved in both convergent and divergent evolution. Mimetic insects are known for their diversity of patterns as well as their remarkable evolutionary convergence, and they have played an important role in controversies over the respective roles of selection and constraints in adaptive evolution. Here we contrast three butterfly species, all classic examples of Müllerian mimicry. We used a genetic linkage map to show that a locus, Yb, which controls the presence of a yellow band in geographic races of Heliconius melpomene, maps precisely to the same location as the locus Cr, which has very similar phenotypic effects in its co-mimic H. erato. Furthermore, the same genomic location acts as a "supergene", determining multiple sympatric morphs in a third species, H. numata. H. numata is a species with a very different phenotypic appearance, whose many forms mimic different unrelated ithomiine butterflies in the genus Melinaea. Other unlinked colour pattern loci map to a homologous linkage group in the co-mimics H. melpomene and H. erato, but they are not involved in mimetic polymorphism in H. numata. Hence, a single region from the multilocus colour pattern architecture of H. melpomene and H. erato appears to have gained control of the entire wing-pattern variability in H. numata, presumably as a result of selection for mimetic "supergene" polymorphism without intermediates. Although we cannot at this stage confirm the homology of the loci segregating in the three species, our results imply that a conserved yet relatively unconstrained mechanism underlying pattern switching can affect mimicry in radically different ways. We also show that adaptive evolution, both convergent and diversifying, can occur by the repeated involvement of the same genomic regions.

    PLoS biology 2006;4;10;e303

  • DNA copy number alterations and expression of relevant genes in mouse thymic lymphomas induced by gamma-irradiation and N-methyl-N-nitrosourea.

    Kang HM, Jang JJ, Langford C, Shin SH, Park SY and Chung YJ

    The genetic mechanism for the development and progression of a lymphoma is unclear. This study investigated the alterations in the DNA copy number and the expression profiles of the genes located in the altered regions in mouse thymic lymphomas that were induced by two mutagens, gamma-irradiation and N-methyl-N-nitrosourea (MNU). Microarray-based comparative genomic hybridization was used to precisely delineate the boundaries of the altered region. The copy number gains of chromosomes 4 and 5 were observed only in the radiation-induced lymphomas, and gains of chromosomes 10 and 14 were observed only in the MNU-induced lymphomas. Regional copy number losses in chromosomes 11, 16, and 19 appeared frequently in the radiation-induced lymphomas. The cancer-related genes Pten, Ikaros/Znfn1a1, Ercc4, and Top3b were located in the minimal deletion regions. In particular, the expression levels of the Pten, Top3b, and Ikaros genes were downregulated in both lymphoma groups, but the expression level of Ercc4 was downregulated only in the MNU group. This study also examined the expression levels of Sparc, Cxcl1, and Myc (synonym: c-Myc), which are located in the copy number gained chromosomes. Sparc was upregulated specifically in the radiation group, and Cxcl1 in the MNU group. c-Myc was upregulated in both groups. There was limited correlation between the DNA copy number profiles and the expression of the cancer-related genes in mouse lymphomagenesis. The chromosome aberrations and novel expression profiles of the cancer-related genes within the altered regions may provide important clues to the genetic mechanism for the development of lymphoma.

  • Dual mutations in the Autographa californica nucleopolyhedrovirus FP-25 and p35 genes result in plasma-membrane blebbing in Trichoplusia ni cells.

    Kelly BJ, King LA, Possee RD and Chapple SD

    Spodoptera frugiperda cells infected with Autographa californica nucleopolyhedrovirus (AcMNPV) lacking a functional anti-apoptotic p35 protein undergo apoptosis. However, such mutants replicate normally in Trichoplusia ni (TN-368) cells. An AcMNPV plaque isolate (AcdefrT) was identified during propagation of a virus deficient in p35 in TN-368 cells. This virus exhibited enhanced budded-particle formation in TN-368 cells, but was partially defective for polyhedra production in the same cells. Virus replication in AcdefrT-infected TN-368 cells was accompanied by extensive plasma-membrane blebbing and caspase activation late in infection, both features of apoptosis. Rescue of the p35 locus of AcdefrT continued to result in a reduction in polyhedra and increase in budded virus production in TN-368 cells, but no plasma-membrane blebbing was observed. The mutation was mapped to the FP-25 gene locus. This gene mutation combined with the non-functional p35 was found to be responsible for the cell-blebbing effect observed in AcdefrT-infected TN-368 cells.

  • p63 heterozygous mutant mice are not prone to spontaneous or chemically induced tumors.

    Keyes WM, Vogel H, Koster MI, Guo X, Qi Y, Petherbridge KM, Roop DR, Bradley A and Mills AA

    Homology between p63 and p53 has suggested that these proteins might function similarly. However, the majority of data from human tumors have not supported a similar role for p63 in tumor suppression. To investigate this issue, we studied spontaneous tumorigenesis in p63+/- mice in both WT and p53-compromised backgrounds. We found that p63+/- mice were not tumor prone and mice heterozygous for both p63 and p53 had fewer tumors than p53+/- mice. The rare tumors that developed in mice with compromised p63 were also distinct from those of p53+/- mice. Furthermore, p63+/- mice were not prone to chemically induced tumorigenesis, and p63 expression was maintained in carcinomas. These findings demonstrate that, in agreement with data from human tumors, p63 plays a markedly different biological role in cancer than p53.

  • Genome assembly comparison identifies structural variants in the human genome.

    Khaja R, Zhang J, MacDonald JR, He Y, Joseph-George AM, Wei J, Rafiq MA, Qian C, Shago M, Pantano L, Aburatani H, Jones K, Redon R, Hurles M, Armengol L, Estivill X, Mural RJ, Lee C, Scherer SW and Feuk L

    Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

  • Common inheritance of chromosome Ia associated with clonal expansion of Toxoplasma gondii.

    Khan A, Böhme U, Kelly KA, Adlem E, Brooks K, Simmonds M, Mungall K, Quail MA, Arrowsmith C, Chillingworth T, Churcher C, Harris D, Collins M, Fosker N, Fraser A, Hance Z, Jagels K, Moule S, Murphy L, O'Neil S, Rajandream MA, Saunders D, Seeger K, Whitehead S, Mayr T, Xuan X, Watanabe J, Suzuki Y, Wakaguri H, Sugano S, Sugimoto C, Paulsen I, Mackey AJ, Roos DS, Hall N, Berriman M, Barrell B, Sibley LD and Ajioka JW

    Toxoplasma gondii is a globally distributed protozoan parasite that can infect virtually all warm-blooded animals and humans. Despite the existence of a sexual phase in the life cycle, T. gondii has an unusual population structure dominated by three clonal lineages that predominate in North America and Europe (Types I, II, and III). These lineages were founded by common ancestors approximately 10,000 yr ago. The recent origin and widespread distribution of the clonal lineages is attributed to the circumvention of the sexual cycle by a new mode of transmission: asexual transmission between intermediate hosts. Asexual transmission appears to be multigenic and although the specific genes mediating this trait are unknown, it is predicted that all members of the clonal lineages should share the same alleles. Genetic mapping studies suggested that chromosome Ia was unusually monomorphic compared with the rest of the genome. To investigate this further, we sequenced chromosome Ia and chromosome Ib in the Type I strain, RH, and the Type II strain, ME49. Comparative genome analyses of the two chromosomal sequences revealed that the same copy of chromosome Ia was inherited in each lineage, whereas chromosome Ib maintained the same high frequency of between-strain polymorphism as the rest of the genome. Sampling of chromosome Ia sequence in seven additional representative strains from the three clonal lineages supports a monomorphic inheritance, which is unique within the genome. Taken together, our observations implicate a specific combination of alleles on chromosome Ia in the recent origin and widespread success of the clonal lineages of T. gondii.

  • The systematic functional characterisation of Xq28 genes prioritises candidate disease genes.

    Kolb-Kokocinski A, Mehrle A, Bechtel S, Simpson JC, Kioschis P, Wiemann S, Wellenreuther R and Poustka A

    Background: Well known for its gene density and the large number of mapped diseases, the human sub-chromosomal region Xq28 has long been a focus of genome research. Over 40 of approximately 300 X-linked diseases map to this region, and systematic mapping, transcript identification, and mutation analysis has led to the identification of causative genes for 26 of these diseases, leaving another 17 diseases mapped to Xq28, where the causative gene is still unknown. To expedite disease gene identification, we have initiated the functional characterisation of all known Xq28 genes.

    Results: By using a systematic approach, we describe the Xq28 genes by RNA in situ hybridisation and Northern blotting of the mouse orthologs, as well as subcellular localisation and data mining of the human genes. We have developed a relational web-accessible database with comprehensive query options integrating all experimental data. Using this database, we matched gene expression patterns with affected tissues for 16 of the 17 remaining Xq28 linked diseases, where the causative gene is unknown.

    Conclusion: By using this systematic approach, we have prioritised genes in linkage regions of Xq28-mapped diseases to an amenable number for mutational screens. Our database can be queried by any researcher performing highly specified searches including diseases not listed in OMIM or diseases that might be linked to Xq28 in the future.

  • Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays.

    Komura D, Shen F, Ishikawa S, Fitch KR, Chen W, Zhang J, Liu G, Ihara S, Nakamura H, Hurles ME, Lee C, Scherer SW, Jones KW, Shapero MH, Huang J and Aburatani H

    Recent reports indicate that copy number variations (CNVs) within the human genome contribute to nucleotide diversity to a larger extent than single nucleotide polymorphisms (SNPs). In addition, the contribution of CNVs to human disease susceptibility may be greater than previously expected, although understanding of the phenotypic consequences of CNVs remains incomplete. We have recently reported a comprehensive view of CNVs among 270 HapMap samples using high-density SNP genotyping arrays and BAC array CGH. In this report, we describe a novel algorithm using Affymetrix GeneChip Human Mapping 500K Early Access (500K EA) arrays that identified 1203 CNVs ranging in size from 960 bp to 3.4 Mb. The algorithm consists of three steps: (1) Intensity pre-processing to improve the resolution between pairwise comparisons by directly estimating the allele-specific affinity as well as to reduce signal noise by incorporating probe and target sequence characteristics via an improved version of the Genomic Imbalance Map (GIM) algorithm; (2) CNV extraction using an adapted SW-ARRAY procedure to automatically and robustly detect candidate CNV regions; and (3) copy number inference in which all pairwise comparisons are summarized to more precisely define CNV boundaries and accurately estimate CNV copy number. Independent testing of a subset of CNVs by quantitative PCR and mass spectrometry demonstrated a >90% verification rate. The use of high-resolution oligonucleotide arrays relative to other methods may allow more precise boundary information to be extracted, thereby enabling a more accurate analysis of the relationship between CNVs and other genomic features.
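
    The extraction step (2) can be pictured as a search for the highest-scoring contiguous run of probes. The sketch below is a simplified maximal-scoring-segment search written for illustration; it is not the authors' adapted SW-ARRAY procedure, and the function name, the threshold value and the log2 ratios are all made up.

```python
def best_gain_segment(log2_ratios, threshold=0.3):
    """Find the contiguous probe run with the highest sum of (ratio - threshold).

    Returns (start, end, score) with a half-open interval, or (0, 0, 0.0)
    if no probe exceeds the threshold. Simplified illustration only.
    """
    best = (0, 0, 0.0)
    current_sum = 0.0
    current_start = 0
    for i, ratio in enumerate(log2_ratios):
        adjusted = ratio - threshold
        if current_sum <= 0:
            # A non-positive running sum cannot help: restart the segment here.
            current_sum = adjusted
            current_start = i
        else:
            current_sum += adjusted
        if current_sum > best[2]:
            best = (current_start, i + 1, current_sum)
    return best

# Hypothetical per-probe log2 ratios with an apparent gain at probes 3 to 5.
toy_ratios = [0.0, 0.1, -0.1, 0.8, 0.9, 0.7, 0.0, -0.2]
start, end, score = best_gain_segment(toy_ratios)
```

    Subtracting a threshold before summing makes probes near zero count against a segment, so the search favours compact runs of consistently elevated signal. Losses could be handled the same way after negating the ratios.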

  • Recording long-term potentiation of synaptic transmission by three-dimensional multi-electrode arrays.

    Kopanitsa MV, Afinowi NO and Grant SG

    Background: Multi-electrode arrays (MEAs) have become popular tools for recording spontaneous and evoked electrical activity of excitable tissues. The majority of previous studies of synaptic transmission in brain slices employed MEAs with planar electrodes that had limited ability to detect signals coming from deeper, healthier layers of the slice. To overcome this limitation, we used three-dimensional (3D) MEAs with tip-shaped electrodes to probe plasticity of field excitatory synaptic potentials (fEPSPs) in the CA1 area of hippocampal slices of 129S5/SvEvBrd and C57BL/6J-TyrC-Brd mice.

    Results: Using 3D MEAs, we were able to record larger fEPSPs compared to signals measured by planar MEAs. Several stimulation protocols were used to induce long-term potentiation (LTP) of synaptic responses in the CA1 area recorded following excitation of Schäffer collateral/commissural fibres. Either two trains of high frequency tetanic stimulation or three trains of theta-burst stimulation caused a persistent, pathway specific enhancement of fEPSPs that remained significantly elevated for at least 60 min. A third LTP induction protocol that comprised 150 pulses delivered at 5 Hz evoked moderate LTP if excitation strength was increased to 1.5x of the baseline stimulus. In all cases, we observed a clear spatial plasticity gradient with maximum LTP levels detected in proximal apical dendrites of pyramidal neurones. No significant differences in the manifestation of LTP were observed between 129S5/SvEvBrd and C57BL/6J-TyrC-Brd mice with the three protocols used. All forms of plasticity were sensitive to inhibition of N-methyl-D-aspartate (NMDA) receptors.

    Conclusion: Principal features of LTP (magnitude, pathway specificity, NMDA receptor dependence) recorded in the hippocampal slices using MEAs were very similar to those seen in conventional glass electrode experiments. Advantages of using MEAs are the ability to record from different regions of the slice and the ease of conducting several experiments on a multiplexed platform which could be useful for efficient screening of novel transgenic mice.

  • A complex rearrangement on chromosome 22 affecting both homologues; haplo-insufficiency of the Cat eye syndrome region may have no clinical relevance.

    Kriek M, Szuhai K, Kant SG, White SJ, Dauwerse H, Fiegler H, Carter NP, Knijnenburg J, den Dunnen JT, Tanke HJ, Breuning MH and Rosenberg C

    The presence of highly homologous sequences, known as low copy repeats, predisposes to unequal recombination within the 22q11 region. This can lead to genomic imbalances associated with several known genetic disorders. We report here a developmentally delayed patient carrying different rearrangements on both chromosome 22 homologues, including a previously unreported rearrangement within the 22q11 region. One homologue carries a deletion of the proximal part of chromosome band 22q11. To our knowledge, a 'pure' deletion of this region has not been described previously. Four copies of this 22q11 region, however, are associated with Cat eye syndrome (CES). While the phenotypic impact of this deletion is unclear, familial investigation revealed five normal relatives carrying this deletion, suggesting that haplo-insufficiency of the CES region has little clinical relevance. The other chromosome 22 homologue carries a duplication of the Velocardiofacial/DiGeorge syndrome (VCFS/DGS) region. In addition, a previously undescribed deletion of 22q12.1, located in a relatively gene-poor region, was identified. As the clinical features of patients suffering from a duplication of the VCFS/DGS region have proven to be extremely variable, it is impossible to determine the contribution of the 22q12.1 deletion to the phenotype of the patient. Additional patients with a deletion within this region are needed to establish the consequences of this copy number alteration. This study highlights the value of using different genomic approaches to unravel chromosomal alterations in order to study their phenotypic impact.

  • Dramatic reorganisation of Trichomonas endomembranes during amoebal transformation: a possible role for G-proteins.

    Lal K, Noel CJ, Field MC, Goulding D and Hirt RP

  • The leukocyte receptor complex in chicken is characterized by massive expansion and diversification of immunoglobulin-like loci.

    Laun K, Coggill P, Palmer S, Sims S, Ning Z, Ragoussis J, Volpi E, Wilson N, Beck S, Ziegler A and Volz A

    The innate and adaptive immune systems of vertebrates possess complementary, but intertwined functions within immune responses. Receptors of the mammalian innate immune system play an essential role in the detection of infected or transformed cells and are vital for the initiation and regulation of a full adaptive immune response. The genes for several of these receptors are clustered within the leukocyte receptor complex (LRC). The purpose of this study was to carry out a detailed analysis of the chicken (Gallus gallus domesticus) LRC. Bacterial artificial chromosomes containing genes related to mammalian leukocyte immunoglobulin-like receptors were identified in a chicken genomic library and shown to map to a single microchromosome. Sequencing revealed 103 chicken immunoglobulin-like receptor (CHIR) loci (22 inhibitory, 25 activating, 15 bifunctional, and 41 pseudogenes). A very complex splicing pattern was found using transcript analyses and seven hypervariable regions were detected in the external CHIR domains. Phylogenetic and genomic analysis showed that CHIR genes evolved mainly by block duplications from an ancestral inhibitory receptor locus, with transformation into activating receptors occurring more than once. Evolutionary selection pressure has led not only to an exceptional expansion of the CHIR cluster but also to a dramatic diversification of CHIR loci and haplotypes. This indicates that CHIRs have the potential to complement the adaptive immune system in fighting pathogens.

  • Replication of the association of HLA-B7 with Alzheimer's disease: a role for homozygosity?

    Lehmann DJ, Barnardo MC, Fuggle S, Quiroga I, Sutherland A, Warden DR, Barnetson L, Horton R, Beck S and Smith AD

    Background: There are reasons to expect an association with Alzheimer's disease (AD) within the HLA region. The HLA-B & C genes have, however, been relatively understudied. A geographically specific association with HLA-B7 & HLA-Cw*0702 had been suggested by our previous, small study.

    Methods: We studied the HLA-B & C alleles in 196 cases of 'definite' or 'probable' AD and 199 elderly controls of the OPTIMA cohort, the largest full study of these alleles in AD to date.

    Results: We replicated the association of HLA-B7 with AD (overall, adjusted odds ratio = 2.3, 95% confidence interval = 1.4-3.7, p = 0.001), but not the previously suggested interaction with the epsilon4 allele of apolipoprotein E. Results for HLA-Cw*0702, which is in tight linkage disequilibrium with HLA-B7, were consistent with those for the latter. Homozygotes of both alleles appeared to be at particularly high risk of AD.

    Conclusion: HLA-B7 and HLA-Cw*0702 are associated with AD in the Oxford population. Because of the contradictions between cohorts in our previous study, we suggest that these results may be geographically specific. This might be because of differences between populations in the structure of linkage disequilibrium or in interactions with environmental, genetic or epigenetic factors. A much larger study will be needed to clarify the role of homozygosity of HLA alleles in AD risk.

  • Loss of LIN-35, the Caenorhabditis elegans ortholog of the tumor suppressor p105Rb, results in enhanced RNA interference.

    Lehner B, Calixto A, Crombie C, Tischler J, Fortunato A, Chalfie M and Fraser AG

    Background: Genome-wide RNA interference (RNAi) screening is a very powerful tool for analyzing gene function in vivo in Caenorhabditis elegans. The effectiveness of RNAi varies from gene to gene, however, and neuronally expressed genes are largely refractory to RNAi in wild-type worms.

    Results: We found that C. elegans strains carrying mutations in lin-35, the worm ortholog of the tumor suppressor gene p105Rb, or a subset of the genetically related synMuv B family of chromatin-modifying genes, show increased strength and penetrance for many germline, embryonic, and post-embryonic RNAi phenotypes, including neuronal RNAi phenotypes. Mutations in these same genes also enhance somatic transgene silencing via an RNAi-dependent mechanism. Two genes, mes-4 and zfp-1, are required both for the vulval lineage defects resulting from mutations in synMuv B genes and for RNAi, suggesting a common mechanism for the function of synMuv B genes in vulval development and in regulating RNAi. Enhanced RNAi in the germline of lin-35 worms suggests that misexpression of germline genes in somatic cells cannot alone account for the enhanced RNAi observed in this strain.

    Conclusion: A worm strain with a null mutation in lin-35 is more sensitive to RNAi than any other previously described single mutant strain, and so will prove very useful for future genome-wide RNAi screens, particularly for identifying genes with neuronal functions. As lin-35 is the worm ortholog of the mammalian tumor suppressor gene p105Rb, misregulation of RNAi may be important during human oncogenesis.

  • Systematic mapping of genetic interactions in Caenorhabditis elegans identifies common modifiers of diverse signaling pathways.

    Lehner B, Crombie C, Tischler J, Fortunato A and Fraser AG

    Most heritable traits, including disease susceptibility, are affected by interactions between multiple genes. However, we understand little about how genes interact because very few possible genetic interactions have been explored experimentally. We have used RNA interference in Caenorhabditis elegans to systematically test approximately 65,000 pairs of genes for their ability to interact genetically. We identify approximately 350 genetic interactions between genes functioning in signaling pathways that are mutated in human diseases, including components of the EGF/Ras, Notch and Wnt pathways. Most notably, we identify a class of highly connected 'hub' genes: inactivation of these genes can enhance the phenotypic consequences of mutation of many different genes. These hub genes all encode chromatin regulators, and their activity as genetic hubs seems to be conserved across animals. We propose that these genes function as general buffers of genetic variation and that these hub genes may act as modifier genes in multiple, mechanistically unrelated genetic diseases in humans.

  • RNAi screens in Caenorhabditis elegans in a 96-well liquid format and their application to the systematic identification of genetic interactions.

    Lehner B, Tischler J and Fraser AG

    We describe a protocol for performing RNA interference (RNAi) screens in Caenorhabditis elegans in liquid culture in 96-well plates. The procedure allows a single researcher to set up and score RNAi experiments at approximately 2,000 genes per day. By comparing RNAi phenotypes between wild-type worms and worms carrying a defined genetic mutation, we have used this protocol to identify synthetic lethal interactions between genes systematically. We also describe how the protocol can be adapted to target two genes simultaneously by combinatorial RNAi.

  • Clinical factors and ABCB1 polymorphisms in prediction of antiepileptic drug response: a prospective cohort study.

    Leschziner G, Jorgensen AL, Andrew T, Pirmohamed M, Williamson PR, Marson AG, Coffey AJ, Middleditch C, Rogers J, Bentley DR, Chadwick DW, Balding DJ and Johnson MR

    Background: The ABCB1 3435C-->T single-nucleotide polymorphism (SNP) or a three-SNP haplotype containing 3435C-->T has been implicated in multidrug resistance in epilepsy in three retrospective case-control studies, but a further three have failed to replicate the association. We aimed to determine the effect of the ABCB1 gene on epilepsy drug response, using a unique large cohort of epilepsy patients with prospectively measured seizure and drug response outcomes.

    Methods: The ABCB1 3435C-->T polymorphism and three-SNP haplotype, plus a comprehensive set of tag SNPs across ABCB1 and adjacent ABCB4, were genotyped in a cohort of 503 epilepsy patients with prospectively measured seizure and drug response outcomes. Clinical, demographic, and genetic data were analysed. Treatment outcome was measured in terms of time to 12-month remission, time to first seizure, and time to drug withdrawal due to inadequate seizure control or side-effects. Randomly selected genome-wide HapMap SNPs (n=129) were genotyped in all patients for genomic control.

    Findings: Number of seizures before treatment was the dominant feature predicting seizure outcome after starting antiepileptic drug therapy, measured by both time to first seizure (hazard ratio 1.34, 95% CI 1.21-1.49, p<0.0001) and time to 12-month remission (0.83, 0.73-0.94, p=0.003). There was no association of the ABCB1 3435C-->T polymorphism, the three-SNP haplotype, or any gene-wide tag SNP with time to first seizure after starting drug therapy, time to 12-month remission, or time to drug withdrawal due to unacceptable side-effects or to lack of seizure control.

    Interpretation: We found no evidence that ABCB1 common variation influences either seizure or drug withdrawal outcomes after initiation of antiepileptic drug therapy.

  • Exon sequencing and high resolution haplotype analysis of ABC transporter genes implicated in drug resistance.

    Leschziner G, Zabaneh D, Pirmohamed M, Owen A, Rogers J, Coffey AJ, Balding DJ, Bentley DB and Johnson MR

    Background: The ATP-binding cassette (ABC) proteins are a superfamily of efflux pumps implicated as a mechanism for multidrug resistance in cytotoxic chemotherapy, immunosuppressive therapy, HIV and epilepsy. Genetic variation in P-glycoprotein, the product of the ABCB1 gene, is proposed to mediate de novo drug resistance, but associations between polymorphisms in ABCB1 and pharmacoresistance have produced conflicting results. Potential explanations for the inconsistency of results include inadequate characterization of gene structure, variation and linkage disequilibrium (LD) in ABCB1, as well as overlap in substrate specificity between ABCB1 and the various other drug transporters.

    We undertook a fundamental analysis of gene structure, variation and LD in ABCB1 and four other drug transporter genes implicated in pharmacoresistance: ABCC1, ABCC2, ABCC5 and ABCB4. Manual annotation of the five genes revealed nine shorter alternative transcripts with new untranslated regions and one novel region of coding sequence, demonstrating that on-line annotations are incomplete. Sequencing of exons in 47 Caucasian individuals identified 75 novel single nucleotide polymorphisms (SNPs) previously undescribed in any public database, including 14 new coding sequence SNPs. Genotyping of 502 SNPs in 842 Caucasian individuals across the five genes revealed large blocks of high LD, and low haplotype diversity across all five genes that could be characterized by between 67 and 114 tagging SNPs, depending on the tagging criteria.

    Conclusion: The study illustrates that publicly available data resources on genomic organization of genes and common variation can have important gaps and limitations, and establishes a comprehensive set of tagging SNPs for future association studies in pharmacoresistance.

  • TreeFam: a curated database of phylogenetic trees of animal gene families.

    Li H, Coghlan A, Ruan J, Coin LJ, Hériché JK, Osmotherly L, Li R, Liu T, Zhang Z, Bolund L, Wong GK, Zheng W, Dehal P, Wang J and Durbin R

    TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam. Release 1.1 of TreeFam contains curated trees for 690 families and automatically generated trees for another 11 646 families. These represent over 128 000 genes from nine fully sequenced animal genomes and over 45 000 other animal proteins from UniProt; approximately 40-85% of proteins encoded in the fully sequenced animal genomes are included in TreeFam. TreeFam is freely available online.

  • Chromosomal mechanisms underlying the karyotype evolution of the oriental voles (Muridae, Eothenomys).

    Li T, Wang J, Su W and Yang F

    We have investigated the karyotype relationships of two oriental voles, i.e. the Yulong vole (Eothenomys proditor, 2n = 32) and the large oriental vole (Eothenomys miletus, 2n = 56) as well as the Clarke's vole (Microtus clarkei, 2n = 52), by a combined approach of cross-species chromosome painting and high-resolution G-banding comparison. Chromosome-specific painting probes were generated from flow-sorted chromosomes of E. proditor and hybridized onto metaphases of E. proditor, E. miletus and M. clarkei, leading to the establishment of genome-wide comparative chromosome maps. Our results demonstrate that Robertsonian translocations (centric fusions) have played a major role in the karyotype evolution of oriental voles with no obvious evidence for the involvement of tandem fusions as proposed previously and that the genome organizations of vole species are highly conserved. The comparative chromosome maps of these three vole species belonging to two phylogenetically distinct genera provide a framework for future studies on the karyotype evolution in voles.

  • Karyotypic evolution of the family Sciuridae: inferences from the genome organizations of ground squirrels.

    Li T, Wang J, Su W, Nie W and Yang F

    Cross-species chromosome painting has made a great contribution to our understanding of the evolution of karyotypes and genome organizations of mammals. Several recent papers of comparative painting between tree and flying squirrels have shed some light on the evolution of the family Sciuridae and the order Rodentia. In the present study we have extended the comparative painting to the Himalayan marmot (Marmota himalayana) and the African ground squirrel (Xerus cf. erythropus), i.e. representative species from another important squirrel group, the ground squirrels, and have established genome-wide comparative chromosome maps between human, eastern gray squirrel, and these two ground squirrels. The results show that 1) the squirrels so far studied all have conserved karyotypes that resemble the ancestral karyotype of the order Rodentia; and 2) the African ground squirrels could have retained the ancestral karyotype of the family Sciuridae. Furthermore, we have mapped the evolutionary rearrangements onto a molecular-based consensus phylogenetic tree of the family Sciuridae.

  • Applying whole-genome studies of epigenetic regulation to study human disease.

    Lieb JD, Beck S, Bulyk ML, Farnham P, Hattori N, Henikoff S, Liu XS, Okumura K, Shiota K, Ushijima T and Greally JM

  • Understanding the rise of the superbug: investigation of the evolution and genomic variation of Staphylococcus aureus.

    Lindsay JA and Holden MT

    The bacterium Staphylococcus aureus is a common cause of human infection, and it is becoming increasingly virulent and resistant to antibiotics. Our understanding of the evolution of this species has been greatly enhanced by the recent sequencing of the genomes of seven strains of S. aureus. Comparative genomic analysis allows us to identify variation in the chromosomes and understand the mechanisms by which this versatile bacterium has accumulated diversity within its genome structure.

  • A chromosomal rearrangement hotspot can be identified from population genetic variation and is coincident with a hotspot for allelic recombination.

    Lindsay SJ, Khajavi M, Lupski JR and Hurles ME

    Insights into the origins of structural variation and the mutational mechanisms underlying genomic disorders would be greatly improved by a genomewide map of hotspots of nonallelic homologous recombination (NAHR). Moreover, our understanding of sequence variation within the duplicated sequences that are substrates for NAHR lags far behind that of sequence variation within the single-copy portion of the genome. Perhaps the best-characterized NAHR hotspot lies within the 24-kb-long Charcot-Marie-Tooth disease type 1A (CMT1A)-repeats (REPs) that sponsor deletions and duplications that cause peripheral neuropathies. We investigated structural and sequence diversity within the CMT1A-REPs, both within and between species. We discovered a high frequency of retroelement insertions, accelerated sequence evolution after duplication, extensive paralogous gene conversion, and a greater than twofold enrichment of SNPs in humans relative to the genome average. We identified an allelic recombination hotspot underlying the known NAHR hotspot, which suggests that the two processes are intimately related. Finally, we used our data to develop a novel method for inferring the location of an NAHR hotspot from sequence variation within segmental duplications and applied it to identify a putative NAHR hotspot within the LCR22 repeats that sponsor velocardiofacial syndrome deletions. We propose that a large-scale project to map sequence variation within segmental duplications would reveal a wealth of novel chromosomal-rearrangement hotspots.

  • Chromosome-engineered mouse models

  • Carbon partitioning and export in transgenic Arabidopsis thaliana with altered capacity for sucrose synthesis grown at low temperature: a role for metabolite transporters.

    Lundmark M, Cavaco AM, Trevanion S and Hurry V

    We investigated the role of metabolite transporters in cold acclimation by comparing the responses of wild-type (WT) Arabidopsis thaliana (Heynh.) with that of transgenic plants over-expressing sucrose-phosphate synthase (SPSox) or with that of antisense repression of cytosolic fructose-1,6-bisphosphatase (FBPas). Plants were grown at 23 degrees C and then shifted to 5 degrees C. We compared the leaves shifted to 5 degrees C for 3 and 10 d with new leaves that developed at 5 degrees C with control leaves on plants at 23 degrees C. At 23 degrees C, ectopic expression of SPS resulted in 30% more carbon being fixed per day and an increase in sucrose export from source leaves. This increase in fixation and export was supported by increased expression of the plastidic triose-phosphate transporter AtTPT and, to a lesser extent, the high-affinity Suc transporter AtSUC1. The improved photosynthetic performance of the SPSox plants was maintained after they were shifted to 5 degrees C and this was associated with further increases in AtSUC1 expression but with a strong repression of AtTPT mRNA abundance. Similar responses were shown by WT plants during acclimation to low temperature and this response was attenuated in the low sucrose producing FBPas plants. These data suggest that a key element in recovering flux through carbohydrate metabolism in the cold is to control the partitioning of metabolites between the chloroplast and the cytosol, and Arabidopsis modulates the expression of AtTPT to maintain balanced carbon flow. Arabidopsis also up-regulates the expression of AtSUC1, and to lesser extent AtSUC2, as down-stream components facilitate sucrose transport in leaves that develop at low temperatures.

  • A novel role for cyclic guanosine 3',5'monophosphate signaling in synaptic plasticity: a selective suppressor of protein kinase A-dependent forms of long-term potentiation.

    Makhinson M, Opazo P, Carlisle HJ, Godsil B, Grant SG and O'Dell TJ

    At excitatory synapses onto hippocampal CA1 pyramidal cells, activation of cyclic AMP-dependent protein kinase and subsequent down-regulation of protein phosphatases has a crucial role in the induction of long-term potentiation by low-frequency patterns of synaptic stimulation. Because the second messenger cyclic guanosine 3',5'monophosphate can regulate the activity of different forms of the cyclic AMP degrading enzyme phosphodiesterase, we examined whether increases in cyclic guanosine 3',5'monophosphate can modulate long-term potentiation induction in the mouse hippocampal CA1 region through effects on cyclic AMP signaling. Using the cyclic guanosine 3',5'monophosphate-specific phosphodiesterase inhibitor zaprinast or the nitric oxide donor S-nitroso-D,L-penicillamine to elevate cyclic guanosine 3',5'monophosphate levels we found that increases in cyclic guanosine 3',5'monophosphate strongly inhibit the induction of long-term potentiation by low-frequency patterns of synaptic stimulation where protein kinase A activation is required for long-term potentiation induction. In contrast, zaprinast and S-nitroso-D,L-penicillamine had no effect on the induction of long-term potentiation by high-frequency patterns of synaptic stimulation that induce long-term potentiation in a protein kinase A-independent manner. Directly activating protein kinase A with the phosphodiesterase-resistant cyclic AMP analog 8-Br-cAMP, blocking all phosphodiesterases with 3-isobutyl-1-methylxanthine, or inhibiting protein phosphatases rescued long-term potentiation induction in zaprinast-treated slices. Together, these results suggest that increases in cyclic guanosine 3',5'monophosphate inhibit long-term potentiation by activating phosphodiesterases that interfere with the protein kinase A-mediated suppression of protein phosphatases needed for long-term potentiation induction. Consistent with the notion that this cyclic guanosine 3',5'monophosphate-mediated inhibitory pathway is recruited by some patterns of synaptic activity, blocking cyclic guanosine 3',5'monophosphate production strongly facilitated the induction of long-term potentiation by long trains of theta-frequency synaptic stimulation. Together, our results indicate that increases in cyclic guanosine 3',5'monophosphate can act as a long-term potentiation suppressor mechanism that selectively constrains the induction of protein kinase A-dependent forms of long-term potentiation induced by low-frequency patterns of synaptic stimulation.

  • Helicobacter pylori has stimulatory effects on naive T cells.

    Malfitano AM, Cahill R, Mitchell P, Frankel G, Dougan G, Bifulco M, Lombardi G, Lechler RI and Bamford KB

    Background: Despite an apparently active host response, Helicobacter pylori infection can persist for life. Unexpectedly, T cells from apparently uninfected individuals respond to H. pylori antigen by proliferating. Also, the T-cell proliferative response appears to be less in infected compared with uninfected individuals.

    We have investigated the T-cell response of isolated human peripheral blood, naive, and memory CD4+ T cells to H. pylori antigen in infected and uninfected subjects.

    Results: In agreement with previous findings, the peripheral blood proliferative response was higher in uninfected compared with infected subjects. Interestingly, there was a response in CD4+ CD45RO+ (memory) and CD4+CD45RA+ (naive) subsets. The RO/RA ratio of the response to H. pylori antigen was 0.8-2.1 in both H. pylori-positive and H. pylori-negative subjects, which was similar to that of a known superantigen (2.5 and 2.2 in Helicobacter-positive and -negative subjects, respectively) whereas the RO/RA response ratio to a recall antigen (tetanus toxoid) was 9.8 and 18.7 in Helicobacter-positive and -negative subjects, respectively. Mononuclear cells isolated from cord blood also responded to H. pylori antigen, whereas there was no response to tetanus toxoid. The cord blood response and CD4+ CD45RA+ cell response to H. pylori antigen were inhibited predominantly by anti-HLA-DR and to some extent by anti-HLA-DQ antibodies. Investigation of the response to five different recombinant H. pylori antigens identified two that produced a response in naive T cells.

    Conclusions: These data suggest that H. pylori possesses molecules that cause higher than expected proliferation of naive T cells.

  • Identification of novel deletion breakpoints bordered by segmental duplications in the NF1 locus using high resolution array-CGH.

    Mantripragada KK, Thuresson AC, Piotrowski A, Díaz de Ståhl T, Menzel U, Grigelionis G, Ferner RE, Griffiths S, Bolund L, Mautner V, Nordling M, Legius E, Vetrie D, Dahl N, Messiaen L, Upadhyaya M, Bruder CE and Dumanski JP

    Background: Segmental duplications flanking the neurofibromatosis type 1 (NF1) gene locus on 17q11 mediate most gene deletions in NF1 patients. However, the large size of the gene and the complexity of the locus architecture pose difficulties in deletion analysis. We report the construction and application of the first NF1 locus specific microarray, covering 2.24 Mb of 17q11, using a non-redundant approach for array design. The average resolution of analysis for the array is approximately 12 kb per measurement point with an increased average resolution of 6.4 kb for the NF1 gene.

    Methods: We performed a comprehensive array-CGH analysis of 161 NF1 derived samples and identified heterozygous deletions of various sizes in 39 cases. The typical deletion was identified in 26 cases, whereas 13 samples showed atypical deletion profiles.

    Results: The size of the atypical deletions, contained within the segment covered by the array, ranged from 6 kb to 1.6 Mb and their breakpoints could be accurately determined. Moreover, 10 atypical deletions were observed to share a common breakpoint either on the proximal or distal end of the deletion. The deletions identified by array-CGH were independently confirmed using multiplex ligation-dependent probe amplification. Bioinformatic analysis of the entire locus identified 33 segmental duplications.

    Conclusions: We show that at least one of these segmental duplications, which borders the proximal breakpoint located within the NF1 intron 1 in five atypical deletions, might represent a novel hot spot for deletions. Our array constitutes a novel and reliable tool offering significantly improved diagnostics for this common disorder.

  • Tmc1 is necessary for normal functional maturation and survival of inner and outer hair cells in the mouse cochlea.

    Marcotti W, Erven A, Johnson SL, Steel KP and Kros CJ

    The deafness (dn) and Beethoven (Bth) mutant mice are models for profound congenital deafness (DFNB7/B11) and progressive hearing loss (DFNA36), respectively, caused by recessive and dominant mutations of transmembrane cochlear-expressed gene 1 (TMC1), which encodes a transmembrane protein of unknown function. In the mouse cochlea Tmc1 is expressed in both outer (OHCs) and inner (IHCs) hair cells from early stages of development. Immature hair cells of mutant mice seem normal in appearance and biophysical properties. From around P8 for OHCs and P12 for IHCs, mutants fail to acquire (dn/dn) or show reduced expression (Bth/Bth and, to a lesser extent Bth/+) of the K+ currents which contribute to their normal functional maturation (the BK-type current IK,f in IHCs, and the delayed rectifier IK,n in both cell types). Moreover, the exocytotic machinery in mutant IHCs does not develop normally as judged by the persistence of immature features of the Ca2+ current and exocytosis into adulthood. Mutant mice exhibited progressive hair cell damage and loss. The compound action potential (CAP) thresholds of Bth/+ mice were raised and correlated with the degree of hair cell loss. Homozygous mutants (dn/dn and Bth/Bth) never showed CAP responses, even at ages where many hair cells were still present in the apex of the cochlea, suggesting their hair cells never function normally. We propose that Tmc1 is involved in trafficking of molecules to the plasma membrane or serves as an intracellular regulatory signal for differentiation of immature hair cells into fully functional auditory receptors.

  • The more the merrier: comparative analysis of microarray studies on cell cycle-regulated genes in fission yeast.

    Marguerat S, Jensen TS, de Lichtenberg U, Wilhelm BT, Jensen LJ and Bähler J

    The last two years have seen the publication of three genome-wide gene expression studies of the fission yeast cell cycle. While these microarray papers largely agree on the main patterns of cell cycle-regulated transcription and its control, there are discrepancies with regard to the identity and numbers of periodically expressed genes. We present benchmark and reproducibility analyses showing that the main discrepancies do not reflect differences in the data themselves (microarray or synchronization methods seem to lead only to minor biases) but rather in the interpretation of the data. Our reanalysis of the three datasets reveals that combining all independent information leads to an improved identification of periodically expressed genes. These evaluations suggest that the available microarray data do not allow reliable identification of more than about 500 cell cycle-regulated genes. The temporal expression pattern of the top 500 periodically expressed genes is generally consistent across experiments and the three studies, together with our integrated analysis, provide a coherent and rich source of information on cell cycle-regulated gene expression in Schizosaccharomyces pombe. The reanalysed datasets and other supplementary information are available from an accompanying website. We hope that this paper will resolve the apparent discrepancies between the previous studies and be useful both for wet-lab biologists and for theoretical scientists who wish to take advantage of the data for follow-up work.

  • Cip1 and Cip2 are novel RNA-recognition-motif proteins that counteract Csx1 function during oxidative stress.

    Martín V, Rodríguez-Gabriel MA, McDonald WH, Watt S, Yates JR, Bähler J and Russell P

    Eukaryotic cells reprogram their global patterns of gene expression in response to stress. Recent studies in Schizosaccharomyces pombe showed that the RNA-binding protein Csx1 plays a central role in controlling gene expression during oxidative stress. It does so by stabilizing atf1(+) mRNA, which encodes a subunit of a bZIP transcription factor required for gene expression during oxidative stress. Here, we describe two related proteins, Cip1 and Cip2, that were identified by multidimensional protein identification technology (MudPIT) as proteins that coprecipitate with Csx1. Cip1 and Cip2 are cytoplasmic proteins that have RNA recognition motifs (RRMs). Neither protein is essential for viability, but a cip1Delta cip2Delta strain grows poorly and has altered cellular morphology. Genetic epistasis studies and whole genome expression profiling show that Cip1 and Cip2 exert posttranscriptional control of gene expression in a manner that is counteracted by Csx1. Notably, the sensitivity of csx1Delta cells to oxidative stress and their inability to induce expression of Atf1-dependent genes are partially rescued by cip1Delta and cip2Delta mutations. This study emphasizes the importance of a modulated mRNA stability in the eukaryotic stress response pathways and adds new information to the role of RNA-binding proteins in the oxidative stress response.

  • A simple vector system to improve performance and utilisation of recombinant antibodies.

    Martin CD, Rojas G, Mitchell JN, Vincent KJ, Wu J, McCafferty J and Schofield DJ

    Background: Isolation of recombinant antibody fragments from antibody libraries is well established using technologies such as phage display. Phage display vectors are ideal for efficient display of antibody fragments on the surface of bacteriophage particles. However, they are often inefficient for expression of soluble antibody fragments, and sub-cloning of selected antibody populations into dedicated soluble antibody fragment expression vectors can enhance expression.

    Results: We have developed a simple vector system for expression, dimerisation and detection of recombinant antibody fragments in the form of single chain Fvs (scFvs). Expression is driven by the T7 RNA polymerase promoter in conjunction with the inducible lysogen strain BL21 (DE3). The system is compatible with a simple auto-induction culture system for scFv production. As an alternative to periplasmic expression, expression directly in the cytoplasm of a mutant strain with a more oxidising cytoplasmic environment (Origami 2 (DE3)) was investigated and found to be inferior to periplasmic expression in BL21 (DE3) cells. The effect on yield and binding activity of fusing scFvs to the N terminus of maltose binding protein (a solubility enhancing partner), bacterial alkaline phosphatase (a naturally dimeric enzymatic reporter molecule), or the addition of a free C-terminal cysteine was determined. Fusion of scFvs to the N-terminus of maltose binding protein increased scFv yield but binding activity of the scFv was compromised. In contrast, fusion to the N-terminus of bacterial alkaline phosphatase led to an improved performance. Alkaline phosphatase provides a convenient tag allowing direct enzymatic detection of scFv fusions within crude extracts without the need for secondary reagents. Alkaline phosphatase also drives dimerisation of the scFv leading to an improvement in performance compared to monovalent constructs. This is illustrated by ELISA, western blot and immunohistochemistry.

    Conclusion: Nine scFv expression vectors have been generated and tested. Three vectors showed utility for expression of functional scFv fragments. One vector, pSANG14-3F, produces scFv-alkaline phosphatase fusion molecules, which offer a simple, convenient and sensitive way of determining the reactivity of recombinant antibody fragments in a variety of common assay systems.

  • Two quantitative trait loci affecting progressive hearing loss in 101/H mice.

    Mashimo T, Erven AE, Spiden SL, Guénet JL and Steel KP

    Although recent progress in identifying genes involved in deafness has been remarkable, the genetic basis of progressive hearing loss (or age-related hearing loss) is poorly understood because of the extreme difficulty in studying such a late-onset, complex disease in human populations. Several inbred strains of mice such as 129P1/ReJ, C57BL/6J, DBA/2J, and BALB/cByJ have been reported to exhibit age-related hearing loss and provide valuable models for human nonsyndromic progressive deafness. In this article we show that 101/H mice also exhibit progressive deafness with early onset. Linkage analysis of F(2) populations derived from crosses between the 101/H and the MAI/Pas and MBT/Pas wild-derived mice suggested at least two major quantitative trait loci (QTLs) that influence progressive hearing loss. A first QTL, designated Phl1, was mapped with a maximum LOD score of 6.7 to the centromeric region of Chromosome 17, where no deafness-related QTL has been mapped so far. A second QTL, designated Phl2, mapped to Chromosome 10 and exhibited a maximum LOD score of 5.3. The map position of Phl2 near the well-known QTL of age-related hearing loss (Ahl) suggested the possibility of allelism, although the Ahl mutation itself did not segregate in these crosses. Finally, we found some evidence of epistatic interaction between Phl1 and Phl2.

  • Global roles of Ste11p, cell type, and pheromone in the control of gene expression during early sexual differentiation in fission yeast.

    Mata J and Bähler J

    Fission yeast cells belong to one of two specialized cell types, M or P. Specific environmental conditions trigger sexual differentiation, which leads to an internal program starting with pheromone signaling between M and P cells, followed by mating, meiosis, and sporulation. The initial steps of this process are controlled by Ste11p, a master transcriptional regulator that activates the expression of cell type-specific genes (only expressed in either M or P cells) as well as genes expressed in both M and P cells. Pheromone signaling is activated by Ste11p-dependent transcription and, in turn, enhances some of this transcription in a positive feedback. To obtain a genomewide view of Ste11p target genes, their cell-type specificity, and their dependence on pheromone, we used DNA microarrays along with different genetic and environmental manipulations of fission yeast cells. We identified 78 Ste11p-dependent genes, 12 and 4 of which are only expressed in M and P cells, respectively. These genes show differing grades of pheromone dependencies for Ste11p-activated transcription, ranging from complete independence to complete dependence on pheromone. We systematically deleted all novel cell type-specific genes and characterized their phenotype during sexual differentiation. A comparison with a similar data set from the distantly related budding yeast reveals striking conservation in both number and types of the proteins that define cell types. Given the divergent mechanisms regulating cell type-specific gene expression, our results highlight the plasticity of regulatory circuits, which evolve to allow adaptation to changing environments and lifestyles.

  • Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis.

    McEwen GK, Woolfe A, Goode D, Vavouri T, Callaway H and Elgar G

    Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in transcriptional regulation of development. Although most of these elements share little sequence identity to each other, a small number are remarkably similar and appear to be the product of duplication events. Here, we searched for duplicated conserved noncoding elements in the human genome, using comparisons with Fugu to select putative cis-regulatory sequences. We identified 124 families of duplicated elements, each containing between two and five members, that are highly conserved within and between vertebrate genomes. In 74% of cases, we were able to assign a specific set of paralogous genes with annotation relating to transcriptional regulation and/or development to each family, thus removing much of the ambiguity in identifying associated genes. We find that duplicate elements have the potential to up-regulate reporter gene expression in a tissue-specific manner and that expression domains often overlap, but are not necessarily identical, between family members. Over two thirds of the families are conserved in duplicate in fish and appear to predate the large-scale duplication events thought to have occurred at the origin of vertebrates. We propose a model whereby gene duplication and the evolution of cis-regulatory elements can be considered in the context of increased morphological diversity and the emergence of the modern vertebrate body plan.

  • Canine RPGRIP1 mutation establishes cone-rod dystrophy in miniature longhaired dachshunds as a homologue of human Leber congenital amaurosis.

    Mellersh CS, Boursnell ME, Pettitt L, Ryder EJ, Holmes NG, Grafham D, Forman OP, Sampson J, Barnett KC, Blanton S, Binns MM and Vaudin M

    Cone-rod dystrophy 1 (cord1) is a recessive condition that occurs naturally in miniature longhaired dachshunds (MLHDs). We mapped the cord1 locus to a region of canine chromosome CFA15 that is syntenic with a region of human chromosome 14 (HSA14q11.2) containing the retinitis pigmentosa GTPase regulator-interacting protein 1 (RPGRIP1) gene. Mutations in RPGRIP1 have been shown to cause Leber congenital amaurosis, a group of retinal dystrophies that represent the most common genetic causes of congenital visual impairment in infants and children. Using the newly available canine genome sequence we sequenced RPGRIP1 in affected and carrier MLHDs and identified a 44-nucleotide insertion in exon 2 that alters the reading frame and introduces a premature stop codon. All affected and carrier dogs within an extended inbred pedigree were homozygous and heterozygous, respectively, for the mutation. We conclude the mutation is responsible for cord1 and demonstrate that this canine disease is a valuable model for exploring disease mechanisms and potential therapies for human Leber congenital amaurosis.

  • Defective postnatal development of the male reproductive tract in LGR4 knockout mice.

    Mendive F, Laurent P, Van Schoore G, Skarnes W, Pochet R and Vassart G

    Institut de Recherche en Biologie Humaine et Moléculaire (IRIBHM), University of Brussels (ULB), Campus Erasme, 808 Route de Lennik, B-1070 Brussels, Belgium.

  • Mapping trait loci by use of inferred ancestral recombination graphs.

    Minichiello MJ and Durbin R

    Large-scale association studies are being undertaken with the hope of uncovering the genetic determinants of complex disease. We describe a computationally efficient method for inferring genealogies from population genotype data and show how these genealogies can be used to fine map disease loci and interpret association signals. These genealogies take the form of the ancestral recombination graph (ARG). The ARG defines a genealogical tree for each locus, and, as one moves along the chromosome, the topologies of consecutive trees shift according to the impact of historical recombination events. There are two stages to our analysis. First, we infer plausible ARGs, using a heuristic algorithm, which can handle unphased and missing data and is fast enough to be applied to large-scale studies. Second, we test the genealogical tree at each locus for a clustering of the disease cases beneath a branch, suggesting that a causative mutation occurred on that branch. Since the true ARG is unknown, we average this analysis over an ensemble of inferred ARGs. We have characterized the performance of our method across a wide range of simulated disease models. Compared with simpler tests, our method gives increased accuracy in positioning untyped causative loci and can also be used to estimate the frequencies of untyped causative alleles. We have applied our method to Ueda et al.'s association study of CTLA4 and Graves disease, showing how it can be used to dissect the association signal, giving potentially interesting results of allelic heterogeneity and interaction. Similar approaches analyzing an ensemble of ARGs inferred using our method may be applicable to many other problems of inference from population genotype data.

  • Immunogenomics: molecular hide and seek.

    Miretti MM and Beck S

    Similar to other classical science disciplines, immunology has been embracing novel technologies and approaches giving rise to specialised sub-disciplines such as immunogenetics and, more recently, immunogenomics, which, in many ways, is the genome-wide application of immunogenetic approaches. Here, recent progress in the understanding of the immune sub-genome will be reviewed, and the ways in which immunogenomic datasets consisting of genetic and epigenetic variation, linkage disequilibrium and recombination can be harnessed for disease association and evolutionary studies will be discussed. The discussion will focus on data available for the major histocompatibility complex and the leukocyte receptor complex, the two most polymorphic regions of the human immune sub-genome.

  • Mammographic density and breast cancer risk in BRCA1 and BRCA2 mutation carriers.

    Mitchell G, Antoniou AC, Warren R, Peock S, Brown J, Davies R, Mattison J, Cook M, Warsi I, Evans DG, Eccles D, Douglas F, Paterson J, Hodgson S, Izatt L, Cole T, Burgess L, Eeles R and Easton DF

    High breast density as measured on mammograms is a strong risk factor for breast cancer in the general population, but its effect in carriers of germline BRCA1 and BRCA2 mutations is unclear. We obtained mammograms from 206 female carriers of BRCA1 or BRCA2 mutations, 96 of whom were subsequently diagnosed with breast cancer and 136 relatives of carriers who were themselves noncarriers. We compared the mammographic densities of affected carriers (cases) and unaffected carriers (controls), and of mutation carriers and noncarriers, using a computer-assisted method of measurement and visual assessment by two observers. Analyses were adjusted for age, parity, body mass index, menopausal status, and hormone replacement therapy use. There was no difference in the mean percent density between noncarriers and carriers. Among carriers, increasing mammographic density was associated with an increased risk of breast cancer (P(trend) = 0.024). The odds ratio (OR; 95% confidence interval) for breast cancer associated with a density of ≥50% was 2.29 (1.23-4.26; P = 0.009). The OR did not differ between BRCA1 and BRCA2 carriers or between premenopausal and postmenopausal carriers. The results suggest that the distribution of breast density in BRCA1 and BRCA2 carriers is similar to that in non-carriers. High breast density in carriers is associated with an increased risk of breast cancer, with the relative risk being similar to that observed in the general population. Use of mammographic density could improve individual risk prediction in carriers.

    Cancer research 2006;66;3;1866-72
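
    The odds ratio and 95% confidence interval quoted in the abstract above were adjusted for several covariates, which requires a regression model and the study data. As a rough illustration only, the unadjusted version of the same quantity can be sketched from a 2x2 table using the standard log-odds (Woolf) interval. The function name and the counts below are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    All four cell counts must be positive. z=1.96 gives an approximate
    95% interval.
    """
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: high-density cases/controls, low-density cases/controls.
    or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

    An adjusted estimate such as the one reported would instead come from a logistic regression that includes the covariates listed in the abstract.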

  • Detection of novel Y SNPs provides further insights into Y chromosomal variation in Pakistan.

    Mohyuddin A, Ayub Q, Underhill PA, Tyler-Smith C and Mehdi SQ

    Biomedical and Genetic Engineering Laboratories, G. P. O Box 2891, 44000, Islamabad, Pakistan.

    Biallelic polymorphisms on the Y chromosome have been extensively used to study the history, evolution, and migration patterns of world populations. In this study we screened 8.5 kb of Y chromosomal DNA for single nucleotide polymorphisms (SNPs) in a panel of 95 male individuals belonging to different haplogroups. Five novel Y-SNPs (PK1-5) were identified, four in the Pakistani sample and one in an African sample. The ancestral state of each SNP was determined in two chimpanzee samples and a variety of Pakistani ethnic groups. In addition to these novel Y-SNPs, 77 additional markers on the Y chromosome were analyzed to place the SNPs on the phylogenetic tree of Y chromosomal lineages and to further investigate extant human Y chromosomal variation within Pakistan. BATWING analysis gave an estimate of between 2,500 and 7,300 YBP for population expansion in Pakistan, which coincides with the period of the Indus Valley civilizations.

    Journal of human genetics 2006;51;4;375-8

  • Genomic profiling identifies discrete deletions associated with translocations in glioblastoma multiforme.

    Mulholland PJ, Fiegler H, Mazzanti C, Gorman P, Sasieni P, Adams J, Jones TA, Babbage JW, Vatcheva R, Ichimura K, East P, Poullikas C, Collins VP, Carter NP, Tomlinson IP and Sheer D

    Human Cytogenetics Laboratory, Cancer Research UK.

    Glioblastoma multiforme is the most common tumor arising in the central nervous system. Patients with these tumors have limited treatment options and their disease is invariably fatal. Molecularly targeted agents offer the potential to improve patient treatment; however, their use will require a fuller understanding of the genetic changes in these complex tumors. In this study, we identify copy number changes in a series of glioblastoma multiforme tumors and cell lines by applying high-resolution microarray comparative genomic hybridization. Molecular cytogenetic characterization of the cell lines revealed that copy number changes define translocation breakpoints. We focused on chromosome 6 and further characterized three regions of copy number change associated with translocations including a discrete deletion involving IGF2R, PARK2, PACRG and QKI and an unbalanced translocation involving POLH, GTPBP2 and PTPRZ1.

    Funded by: Cancer Research UK: A3585, A6618

    Cell cycle (Georgetown, Tex.) 2006;5;7;783-91

  • Molecular analysis of fluoroquinolone-resistant Salmonella Paratyphi A isolate, India.

    Nair S, Unnikrishnan M, Turner K, Parija SC, Churcher C, Wain J and Harish N

    The Wellcome Trust Sanger Institute, Cambridgeshire, United Kingdom.

    Salmonella enterica serovar Paratyphi A is increasingly a cause of enteric fever. Sequence analysis of an Indian isolate showed a unique strain with high-level resistance to ciprofloxacin associated with double mutations in the DNA gyrase subunit gyrA (Ser83→Phe and Asp87→Gly) and a mutation in topoisomerase IV subunit parC (Ser80→Arg).

    Funded by: Wellcome Trust

    Emerging infectious diseases 2006;12;3;489-91

  • Array CGH profiling of favourable histology Wilms tumours reveals novel gains and losses associated with relapse.

    Natrajan R, Williams RD, Hing SN, Mackay A, Reis-Filho JS, Fenwick K, Iravani M, Valgeirsson H, Grigoriadis A, Langford CF, Dovey O, Gregory SG, Weber BL, Ashworth A, Grundy PE, Pritchard-Jones K and Jones C

    Paediatric Oncology, Institute of Cancer Research/Royal Marsden NHS Trust, Sutton, Surrey SM2 5NG, UK.

    Despite the excellent survival of Wilms tumour patients treated with multimodality therapy, approximately 15% will suffer from tumour relapse, where response rates are markedly reduced. We have carried out microarray-based comparative genomic hybridisation on a series of 76 Wilms tumour samples, enriched for cases which recurred, to identify changes in DNA copy number associated with clinical outcome. Using 1Mb-spaced genome-wide BAC arrays, the most significantly different genomic changes between favourable histology tumours that did (n = 37), and did not (n = 39), subsequently relapse were gains on 1q, and novel deletions at 12q24 and 18q21. Further relapse-associated loci included losses at 1q32.1, 2q36.3-2q37.1, and gain at 13q31. 1q gains correlated strongly with loss of 1p and/or 16q. In 3 of 11 cases with concurrent 1p(-)/1q(+), a breakpoint was identified at 1p13. Multiple low-level sub-megabase gains along the length of 1q were identified using chromosome 1 tiling-path arrays. One such recurrent region at 1q22-q23.1 included candidate genes RAB25, NES, CRABP2, HDGF and NTRK1, which were screened for mRNA expression using quantitative RT-PCR. These data provide a high-resolution catalogue of genomic copy number changes in relapsing favourable histology Wilms tumours.

    The Journal of pathology 2006;210;1;49-58

  • Identification of inhibitors of the kinase activity of oncogenic V600E BRAF in an enzyme cascade high-throughput screen.

    Newbatt Y, Burns S, Hayward R, Whittaker S, Kirk R, Marshall C, Springer C, McDonald E, Cancer Genome Project, Marais R, Workman P and Aherne W

    Cancer Research UK Centre for Cancer Therapeutics, Haddow Laboratories, The Institute of Cancer Research, Sutton, UK.

    The Cancer Genome Project has identified several oncogenic mutations in BRAF that represent important opportunities for cancer drug discovery. The V600E BRAF mutation accounts for approximately 90% of the mutations identified. A strong case has emerged from molecular, cellular, and structural studies for the identification and development of inhibitors of this mutated BRAF protein. The authors have developed and run a high-throughput screen to find inhibitors of V600E BRAF using an enzyme cascade assay in which oncogenic BRAF activates MEK1, which in turn activates ERK2, which then phosphorylates the transcription factor ELK1. A phosphospecific antibody, Europium-labeled secondary antibody, and a time-resolved fluorescent readout were used to measure phosphorylation of ELK1. Overall assay variation was 12.4%. The assay was used to screen 64,000 compounds with an overall Z' factor of 0.58 +/- 0.12. A series of 3,5-disubstituted pyridines were identified as inhibitors of the cascade assay. These compounds did not inhibit a shortened activated MEK1 to ELK1 cascade but were active (0.5-27.9 µM) in a V600E BRAF assay and represent a potential starting point for future drug discovery and development.

    Journal of biomolecular screening 2006;11;2;145-54
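
    The Z' factor reported in the abstract above is a standard measure of assay quality in high-throughput screening, usually computed per plate from positive and negative control wells as Z' = 1 - 3(SD_pos + SD_neg) / |mean_pos - mean_neg|. The sketch below shows that calculation; the function name and the control readings are hypothetical, and the abstract does not say how the authors' own values were computed or aggregated.

```python
from statistics import mean, stdev


def z_prime(positive_controls, negative_controls):
    """Z' factor from control-well signals (each list needs >= 2 values).

    Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
    """
    mu_pos, mu_neg = mean(positive_controls), mean(negative_controls)
    sd_pos, sd_neg = stdev(positive_controls), stdev(negative_controls)
    signal_window = abs(mu_pos - mu_neg)
    if signal_window == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (sd_pos + sd_neg) / signal_window


if __name__ == "__main__":
    # Hypothetical plate: uninhibited cascade signal vs. background wells.
    uninhibited = [100, 110, 90]
    background = [10, 12, 8]
    print(round(z_prime(uninhibited, background), 2))  # 0.6
```

    Values above roughly 0.5 are conventionally taken to indicate an assay with good separation between controls.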

  • A deletion defining a common Asian lineage of Mycobacterium tuberculosis associates with immune subversion.

    Newton SM, Smith RJ, Wilkinson KA, Nicol MP, Garton NJ, Staples KJ, Stewart GR, Wain JR, Martineau AR, Fandrich S, Smallie T, Foxwell B, Al-Obaidi A, Shafi J, Rajakumar K, Kampmann B, Andrew PW, Ziegler-Heitbrock L, Barer MR and Wilkinson RJ

    Wellcome Trust Center for Research in Clinical Tropical Medicine, Center for Molecular Microbiology and Infection, and Kennedy Institute of Rheumatology, Faculty of Medicine, Imperial College London, London W2 1PG, United Kingdom.

    Six major lineages of Mycobacterium tuberculosis appear preferentially transmitted amongst distinct ethnic groups. We identified a deletion affecting Rv1519 in CH, a strain isolated from a large outbreak in Leicester, U.K., that coincidentally defines the East African-Indian lineage matching a major ethnic group in this city. In broth media, CH grew less rapidly and was less acidic and H2O2-tolerant than reference sequenced strains (CDC1551 and H37Rv). Nevertheless, CH was not impaired in its ability to grow in human monocyte-derived macrophages. When compared with CDC1551 and H37Rv, CH induced less protective IL-12p40 and more antiinflammatory IL-10 and IL-6 gene transcription and secretion from monocyte-derived macrophages. It thus appears that CH compensates microbiological attenuation by skewing the innate response toward phagocyte deactivation. Complementation of Rv1519, but none of nine additional genes absent from CH compared with the type strain, H37Rv, reversed the capacity of CH to elicit antiinflammatory IL-10 production by macrophages. The Rv1519 polymorphism in M. tuberculosis confers an immune subverting phenotype that contributes to the persistence and outbreak potential of this lineage.

    Funded by: Wellcome Trust: 072070, 077273

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;42;15594-8

  • Factors affecting flow karyotype resolution.

    Ng BL and Carter NP

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Background: One of the major factors which influences the chromosome purity achievable particularly during high speed sorting is the analytical resolution of individual chromosome peaks in the flow karyotype, as well as the amount of debris and fragmented chromosomes. We have investigated the factors involved in the preparation of chromosome suspensions that influence karyotype resolution.

    Methods: Chromosomes were isolated from various human and animal cell types using a series of polyamine buffer isolation protocols modified with respect to pH, salt concentration, and chromosome staining time. Each preparation was analyzed on a MoFlo sorter (DAKO) configured for high speed sorting and the resolution of the flow karyotypes compared.

    Results: High resolution flow cytometric data was obtained with chromosomes optimally isolated using hypotonic solution buffered at pH 8.0 and polyamine isolation buffer (with NaCl excluded) between pH 7.50 and 8.0. Extending staining time to more than 8 h with chromosome suspensions isolated from cell lines subjected to sufficient metaphase arrest times gave the best result with the lowest percentage of debris generated, tighter chromosome peaks with overall lower coefficients of variation, and a 1- to 5-fold increase in the yield of isolated chromosomes.

    Conclusions: Optimization of buffer pH and the length of staining improved karyotype resolution particularly for larger chromosomes and reduced the presence of chromosome fragments (debris). However, the most interesting and surprising finding was that the exclusion of NaCl in PAB buffer improved the yield and resolution of larger chromosomes.

    Cytometry. Part A : the journal of the International Society for Analytical Cytology 2006;69;9;1028-36

  • Secretin receptor-deficient mice exhibit impaired synaptic plasticity and social behavior.

    Nishijima I, Yamagata T, Spencer CM, Weeber EJ, Alekseyenko O, Sweatt JD, Momoi MY, Ito M, Armstrong DL, Nelson DL, Paylor R and Bradley A

    Center for Molecular and Human Genetics, Columbus Children's Research Institute, The Ohio State University, Columbus, OH 43205, USA.

    Secretin is a peptide hormone released from the duodenum to stimulate the secretion of digestive juice by the pancreas. Secretin also functions as a neuropeptide hormone in the brain, and exogenous administration has been reported to alleviate symptoms in some patients with autism. We have generated secretin receptor-deficient mice to explore the relationship between secretin signaling in the brain and behavioral phenotypes. Secretin receptor-deficient mice are overtly normal and fertile; however, synaptic plasticity in the hippocampus is impaired and there are slightly fewer dendritic spines in the CA1 hippocampal pyramidal cells. Furthermore, secretin receptor-deficient mice show abnormal social and cognitive behaviors. These findings suggest that the secretin receptor system has an important role in the central nervous system relating to social behavior.

    Funded by: NICHD NIH HHS: HD24064, HD29256; Wellcome Trust: 077187

    Human molecular genetics 2006;15;21;3241-50

  • The International Gene Trap Consortium Website: a portal to all publicly available gene trap cell lines in mouse.

    Nord AS, Chang PJ, Conklin BR, Cox AV, Harper CA, Hicks GG, Huang CC, Johns SJ, Kawamoto M, Liu S, Meng EC, Morris JH, Rossant J, Ruiz P, Skarnes WC, Soriano P, Stanford WL, Stryke D, von Melchner H, Wurst W, Yamamura K, Young SG, Babbitt PC and Ferrin TE

    University of California San Francisco, 600 16th Street, San Francisco, CA 94143-2240, USA.

    Gene trapping is a method of generating murine embryonic stem (ES) cell lines containing insertional mutations in known and novel genes. A number of international groups have used this approach to create sizeable public cell line repositories available to the scientific community for the generation of mutant mouse strains. The major gene trapping groups worldwide have recently joined together to centralize access to all publicly available gene trap lines by developing a user-oriented Website for the International Gene Trap Consortium (IGTC). This collaboration provides an impressive public informatics resource comprising approximately 45 000 well-characterized ES cell lines which currently represent approximately 40% of known mouse genes, all freely available for the creation of knockout mice on a non-collaborative basis. To standardize annotation and provide high confidence data for gene trap lines, a rigorous identification and annotation pipeline has been developed combining genomic localization and transcript alignment of gene trap sequence tags to identify trapped loci. This information is stored in a new bioinformatics database accessible through the IGTC Website interface. The IGTC Website allows users to browse and search the database for trapped genes, BLAST sequences against gene trap sequence tags, and view trapped genes within biological pathways. In addition, IGTC data have been integrated into major genome browsers and bioinformatics sites to provide users with outside portals for viewing this data. The development of the IGTC Website marks a major advance by providing the research community with the data and tools necessary to effectively use public gene trap resources for the large-scale characterization of mammalian gene function.

    Funded by: NCRR NIH HHS: P41 RR01081; NHLBI NIH HHS: U01 HL66600; Wellcome Trust

    Nucleic acids research 2006;34;Database issue;D642-8

  • Generation and maintenance of Dmbx1 gene-targeted mutant alleles.

    Ohtoshi A, Bradley A, Behringer RR and Nishijima I

    Center of Molecular and Human Genetics, Children's Research Institute, 700 Children's Drive, Columbus, Ohio 43205, USA.

    Dmbx1 encodes a paired-like homeodomain protein that is expressed in neural tissues at mouse embryonic and postnatal stages. We previously generated two Dmbx1 mutant alleles, Dmbx1 (-) and Dmbx1 (z), by homologous recombination in mouse embryonic stem (ES) cells. In this article we report the generation of three novel Dmbx1 mutant alleles, Dmbx1 (tauZ), Dmbx1 (tauG), and Dmbx1 (Cre), that carry the intronic insertion of tau (τ)-lacZ, tau-eGFP, and Cre reporter genes, respectively. Dmbx1 (tauZ) and Dmbx1 (tauG) recapitulated the Dmbx1 expression, and the reporter gene expression was detected in the diencephalon and mesencephalon during embryogenesis. The crossing of Dmbx1 (Cre) mice with Rosa26 reporter mice identified the Cre-mediated DNA excision in the postnatal midbrain, cerebellum, medulla oblongata, and spinal cord. To maintain the Dmbx1 mutant alleles without genotyping, we crossed Dmbx1 mutant mice with Inv4(1) (Brd) mice that possess the inversion between D4Mit117 and D4Mit281 on Chromosome 4, where Dmbx1 is located. The intercrossing of the non-agouti (a/a) albino (Tyr (c-Brd)/Tyr (c-Brd)) Dmbx1 mutant mice carrying Inv4(1) (Brd) tagged with K14-Agouti and Tyrosinase coat-color markers resulted in the generation of dark brown Dmbx1 wild-type [Inv4(1) (Brd)/Inv4(1) (Brd)], light brown Dmbx1 heterozygous [Dmbx1 (tm)/Inv4(1) (Brd)], and albino Dmbx1 homozygous (Dmbx1 (tm)/Dmbx1 (tm)) mutant mice. To our knowledge, this is the first demonstration of the proof-of-principle of the maintenance of viable gene-targeted alleles using coat-color-tagged nonlethal balancer chromosomes.

    Funded by: NICHD NIH HHS: HD30284

    Mammalian genome : official journal of the International Mammalian Genome Society 2006;17;7;744-50

  • Hot and sexy moulds!

    Pain A, Böhme U and Berriman M

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Nature reviews. Microbiology 2006;4;4;244-5

  • Gene expression profiles of CD34+ cells in myelodysplastic syndromes: involvement of interferon-stimulated genes and correlation to FAB subtype and karyotype.

    Pellagatti A, Cazzola M, Giagounidis AA, Malcovati L, Porta MG, Killick S, Campbell LJ, Wang L, Langford CF, Fidler C, Oscier D, Aul C, Wainscoat JS and Boultwood J

    Leukaemia Research Fund (LRF) Molecular Haematology Unit, Nuffield Department of Clinical Laboratory Sciences (NDCLS), John Radcliffe Hospital, Oxford OX3 9DU, United Kingdom.

    To gain insight into the poorly understood pathophysiology of the myelodysplastic syndromes (MDSs), we have determined the gene expression profiles of the CD34+ cells of 55 patients with MDS by using a comprehensive array platform. These profiles showed many similarities to reported interferon-gamma-induced gene expression in normal CD34+ cells; indeed the 2 most up-regulated genes, IFIT1 and IFITM1, are interferon-stimulated genes (ISGs). Alterations in the expression of ISGs may play a role in the hematologic features of MDS, such as peripheral blood cytopenias. Up-regulation of IFIT1 is a potential diagnostic marker for MDS. We determined whether distinct gene expression profiles were associated with specific FAB and cytogenetic groups. CD34+ cells from patients with refractory anemia with ringed sideroblasts (RARS) showed a particular gene expression profile characterized by up-regulation of mitochondrial-related genes and, in particular, of those of heme synthesis (eg, ALAS2). CD34+ cells from patients with the del(5q) had a distinct gene expression profile, characterized by down-regulation of genes assigned to 5q, and up-regulation of the histone HIST1 gene cluster at chromosome 6p21 and of genes related to the actin cytoskeleton. This study provides important and new insights into the pathophysiology of MDS.

    Blood 2006;108;1;337-45

  • Simplified primer design for PCR-based gene targeting and microarray primer database: two web tools for fission yeast.

    Penkett CJ, Birtle ZE and Bähler J

    Cancer Research UK Fission Yeast Functional Genomics Group, Wellcome Trust Sanger Institute, Cambridge CB10 1HH, UK.

    PCR-based gene targeting is a popular method for manipulating yeast genes in their normal chromosomal locations. The manual design of primers, however, can be cumbersome and error-prone. We have developed a straightforward web-based tool that applies user-specified inputs to automate and simplify the task of primer selection for deletion, tagging and/or regulated expression of genes in Schizosaccharomyces pombe. This tool, named PPPP (for Pombe PCR Primer Programs), is available online. We also present a searchable Microarray Primer Database to retrieve the sequences and accompanying information for primers and PCR products used to build our in-house Sz. pombe microarrays. This database contains information on both coding and intergenic regions to provide context for the microarray data, and it should be useful also for other applications, such as quantitative PCR. The database can also be accessed online.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Yeast (Chichester, England) 2006;23;13;921-8
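
    As a rough illustration of the kind of primer selection the abstract above describes for gene deletion, the sketch below builds a primer pair in the general style of PCR-based targeting: each primer carries a stretch of homology flanking the open reading frame at its 5' end and a short cassette-annealing sequence at its 3' end. This is not the PPPP implementation. The function names, the default homology length, and the annealing sequences in the usage example are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def deletion_primers(chromosome, orf_start, orf_end,
                     fwd_anneal, rev_anneal, homology=80):
    """Return (forward, reverse) primers, both written 5'->3'.

    orf_start: 0-based index of the first base of the start codon.
    orf_end:   0-based index one past the last base of the stop codon.
    fwd_anneal / rev_anneal: cassette-annealing sequences (hypothetical here).
    homology:  flank length in nucleotides (assumed default).
    """
    if orf_start < homology or orf_end + homology > len(chromosome):
        raise ValueError("not enough flanking sequence for the requested homology")
    upstream = chromosome[orf_start - homology:orf_start].upper()
    downstream = chromosome[orf_end:orf_end + homology]
    forward = upstream + fwd_anneal
    # The reverse primer anneals to the top strand, so the downstream flank
    # is reverse-complemented before the cassette-annealing tail is appended.
    reverse = reverse_complement(downstream) + rev_anneal
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence: 4 nt upstream flank, a 6 nt "ORF", 4 nt downstream flank.
    toy = "AAAA" + "ATGTAA" + "CCGT"
    print(deletion_primers(toy, 4, 10, "GG", "TT", homology=4))
```

    A real design would also check primer quality and use the annealing sequences that match the chosen cassette plasmid.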

  • YOGY: a web-based, integrated database to retrieve protein orthologs and associated Gene Ontology terms.

    Penkett CJ, Morris JA, Wood V and Bähler J

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK.

    We present YOGY, a web-based resource for orthologous proteins from nine eukaryotic organisms: Homo sapiens, Mus musculus, Rattus norvegicus, Arabidopsis thaliana, Drosophila melanogaster, Caenorhabditis elegans, Plasmodium falciparum, Schizosaccharomyces pombe and Saccharomyces cerevisiae. Using a gene name from any of these organisms as a query, this database provides comprehensive, combined information on orthologs in other species using data from five independent resources: KOGs, Inparanoid, HomoloGene, OrthoMCL and a table of curated fission and budding yeast orthologs. Associated Gene Ontology (GO) terms of orthologs can also be retrieved for functional inference. Integrating these different and complementary datasets provides a straightforward tool to identify known and predicted orthologs of proteins from a variety of species. This resource should be useful for bench scientists looking for functional clues for their genes of interest as well as for curators looking for information that can be transferred based on orthology and for rapidly identifying the relevant GO terms as an aid to literature curation. YOGY is accessible online.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Nucleic acids research 2006;34;Web Server issue;W330-4

  • Hotspots for copy number variation in chimpanzees and humans.

    Perry GH, Tchinda J, McGrath SD, Zhang J, Picker SR, Cáceres AM, Iafrate AJ, Tyler-Smith C, Scherer SW, Eichler EE, Stone AC and Lee C

    School of Human Evolution and Social Change, Arizona State University, Tempe, AZ 85287, USA.

    Copy number variation is surprisingly common among humans and can be involved in phenotypic diversity and variable susceptibility to complex diseases, but little is known of the extent of copy number variation in nonhuman primates. We have used two array-based comparative genomic hybridization platforms to identify a total of 355 copy number variants (CNVs) in the genomes of 20 wild-born chimpanzees (Pan troglodytes) and have compared the identified chimpanzee CNVs to known human CNVs from previous studies. Many CNVs were observed in the corresponding regions in both chimpanzees and humans, especially those CNVs of higher frequency. Strikingly, these loci are enriched 20-fold for ancestral segmental duplications, which may facilitate CNV formation through nonallelic homologous recombination mechanisms. Therefore, some of these regions may be unstable "hotspots" for the genesis of copy number variation, with recurrent duplications and deletions occurring across and within species.

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;21;8006-11

  • DNA detection using recombination proteins.

    Piepenburg O, Williams CH, Stemple DL and Armes NA

    ASM Scientific Ltd, Cambridge, United Kingdom.

    DNA amplification is essential to most nucleic acid testing strategies, but established techniques require sophisticated equipment or complex experimental procedures, and their uptake outside specialised laboratories has been limited. Our novel approach, recombinase polymerase amplification (RPA), couples isothermal recombinase-driven primer targeting of template material with strand-displacement DNA synthesis. It achieves exponential amplification with no need for pretreatment of sample DNA. Reactions are sensitive, specific, and rapid and operate at constant low temperature. We have also developed a probe-based detection system. Key aspects of the combined RPA amplification/detection process are illustrated by a test for the pathogen methicillin-resistant Staphylococcus aureus. The technology proves to be sensitive to fewer than ten copies of genomic DNA. Furthermore, products can be detected in a simple sandwich assay, thereby establishing an instrument-free DNA testing system. This unique combination of properties is a significant advance in the development of portable and widely accessible nucleic acid-based tests.

    PLoS biology 2006;4;7;e204

  • Deciphering past human population movements in Oceania: provably optimal trees of 127 mtDNA genomes.

    Pierson MJ, Martinez-Arias R, Holland BR, Gemmell NJ, Hurles ME and Penny D

    Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.

    The settlement of the many island groups of Remote Oceania occurred relatively late in prehistory, beginning approximately 3,000 years ago when people sailed eastwards into the Pacific from Near Oceania, where evidence of human settlement dates from as early as 40,000 years ago. Archeological and linguistic analyses have suggested the settlers of Remote Oceania had ancestry in Taiwan, as descendants of a proposed Neolithic expansion that began approximately 5,500 years ago. Other researchers have suggested that the settlers were descendants of peoples from Island Southeast Asia or the existing inhabitants of Near Oceania alone. To explore patterns of maternal descent in Oceania, we have assembled and analyzed a data set of 137 mitochondrial DNA (mtDNA) genomes from Oceania, Australia, Island Southeast Asia, and Taiwan that includes 19 sequences generated for this project. Using the MinMax Squeeze Approach (MMS), we report the consensus network of 165 most parsimonious trees for the Oceanic data set, increasing by many orders of magnitude the numbers of trees for which a provable minimal solution has been found. The new mtDNA sequences highlight the limitations of partial sequencing for assigning sequences to haplogroups and dating recent divergence events. The provably optimal trees found for the entire mtDNA sequences using the MMS method provide a reliable and robust framework for the interpretation of evolutionary relationships and confirm that the female settlers of Remote Oceania descended from both the existing inhabitants of Near Oceania and more recent migrants into the region.

    Funded by: Wellcome Trust: 077014

    Molecular biology and evolution 2006;23;10;1966-75

  • Transcriptional link between blood and bone: the stem cell leukemia gene and its +19 stem cell enhancer are active in bone cells.

    Pimanda JE, Silberstein L, Dominici M, Dekel B, Bowen M, Oldham S, Kallianpur A, Brandt SJ, Tannahill D, Göttgens B and Green AR

    Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, United Kingdom.

    Blood and vascular cells are generated during early embryogenesis from a common precursor, the hemangioblast. The stem cell leukemia gene (SCL/tal 1) encodes a basic helix-loop-helix transcription factor that is essential for the normal development of blood progenitors and blood vessels. We have previously characterized a panel of SCL enhancers including the +19 element, which directs expression to hematopoietic stem cells and endothelium. Here we demonstrate that SCL is expressed in bone primordia during embryonic development and in adult osteoblasts. Despite consistent expression in cells of the osteogenic lineage, SCL protein is not required for bone specification of embryonic stem cells. In transgenic mice, the SCL +19 core enhancer directed reporter gene expression to vascular smooth muscle and bone in addition to blood and endothelium. A 644-bp fragment containing the SCL +19 core enhancer was active in both blood and bone cell lines and was bound in vivo by a common array of Ets and GATA transcription factors. Taken together with the recent observation that a common progenitor can give rise to blood and bone cells, our results suggest that the SCL +19 enhancer targets a mesodermal progenitor capable of generating hematopoietic, vascular, and osteoblastic progeny.

    Funded by: Wellcome Trust

    Molecular and cellular biology 2006;26;7;2615-25

  • Arc/Arg3.1 is essential for the consolidation of synaptic plasticity and memories.

    Plath N, Ohana O, Dammermann B, Errington ML, Schmitz D, Gross C, Mao X, Engelsberg A, Mahlke C, Welzl H, Kobalz U, Stawrakakis A, Fernandez E, Waltereit R, Bick-Sander A, Therstappen E, Cooke SF, Blanquet V, Wurst W, Salmen B, Bösl MR, Lipp HP, Grant SG, Bliss TV, Wolfer DP and Kuhl D

    Molecular Neurobiology, Department of Biology-Chemistry-Pharmacy, Freie Universität Berlin, 14195 Berlin, Germany.

    Arc/Arg3.1 is robustly induced by plasticity-producing stimulation and specifically targeted to stimulated synaptic areas. To investigate the role of Arc/Arg3.1 in synaptic plasticity and learning and memory, we generated Arc/Arg3.1 knockout mice. These animals fail to form long-lasting memories for implicit and explicit learning tasks, despite intact short-term memory. Moreover, they exhibit a biphasic alteration of hippocampal long-term potentiation in the dentate gyrus and area CA1 with an enhanced early and absent late phase. In addition, long-term depression is significantly impaired. Together, these results demonstrate a critical role for Arc/Arg3.1 in the consolidation of enduring synaptic plasticity and memory storage.

    Neuron 2006;52;3;437-44

  • Organization of brain complexity--synapse proteome form and function.

    Pocklington AJ, Armstrong JD and Grant SG

    Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK.

    Proteomic study of the synapse has generated an extensive list of molecular components, revealing one of the most complex functional systems currently known to cell biology. While fundamental to neural information processing, behaviour and disease, the molecular organisation of the synapse and its relation to higher-level function has yet to be clearly understood. Neurotransmitter receptor complexes, such as the N-methyl-D-aspartate receptor complex (NRC/MASC), are major components of the synaptic proteome. We have recently completed a detailed study of MASC, its functional organisation and involvement in behaviour and disease. This pointed to simple design principles underlying synaptic organisation. Drawing together the results of proteomic and analytical study, we sketch out a model for synaptic functional organisation.

    Funded by: Wellcome Trust

    Briefings in functional genomics & proteomics 2006;5;1;66-73

  • The proteomes of neurotransmitter receptor complexes form modular networks with distributed functionality underlying plasticity and behaviour.

    Pocklington AJ, Cumiskey M, Armstrong JD and Grant SG

    School of Informatics, Edinburgh University, Edinburgh, UK.

    Neuronal synapses play fundamental roles in information processing, behaviour and disease. Neurotransmitter receptor complexes, such as the mammalian N-methyl-D-aspartate receptor complex (NRC/MASC) comprising 186 proteins, are major components of the synapse proteome. Here we investigate the organisation and function of NRC/MASC using a systems biology approach. Systematic annotation showed that the complex contained proteins implicated in a wide range of cognitive processes, synaptic plasticity and psychiatric diseases. Protein domains were evolutionarily conserved from yeast, but enriched with signalling domains associated with the emergence of multicellularity. Mapping of protein-protein interactions to create a network representation of the complex revealed that simple principles underlie the functional organisation of both proteins and their clusters, with modularity reflecting functional specialisation. The known functional roles of NRC/MASC proteins suggest the complex co-ordinates signalling to diverse effector pathways underlying neuronal plasticity. Importantly, using quantitative data from synaptic plasticity experiments, our model correctly predicts robustness to mutations and drug interference. These studies of synapse proteome organisation suggest that molecular networks with simple design principles underpin synaptic signalling properties with important roles in physiology, behaviour and disease.

    Funded by: Medical Research Council: G90/93; Wellcome Trust

    Molecular systems biology 2006;2;2006.0023

  • Essential and overlapping roles for laminin alpha chains in notochord and blood vessel formation.

    Pollard SM, Parsons MJ, Kamei M, Kettleborough RN, Thomas KA, Pham VN, Bae MK, Scott A, Weinstein BM and Stemple DL

    Division of Developmental Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London, UK.

    Laminins are major constituents of basement membranes and have wide ranging functions during development and in the adult. They are a family of heterotrimeric molecules created through association of an alpha, beta and gamma chain. We previously reported that two zebrafish loci, grumpy (gup) and sleepy (sly), encode laminin beta1 and gamma1, which are important both for notochord differentiation and for proper intersegmental blood vessel (ISV) formation. In this study we show that bashful (bal) encodes laminin alpha1 (lama1). Although the strongest allele, bal(m190), is fully penetrant, when compared to gup or sly mutant embryos, bal mutants are not as severely affected, as only anterior notochord fails to differentiate and ISVs are unaffected. This suggests that other alpha chains, and hence other isoforms, act redundantly to laminin 1 in posterior notochord and ISV development. We identified cDNA sequences for lama2, lama4 and lama5 and disrupted the expression of each alone or in mutant embryos also lacking laminin alpha1. When expression of laminin alpha4 and laminin alpha1 are simultaneously disrupted, notochord differentiation and ISVs are as severely affected as sly or gup mutants. Moreover, live imaging of transgenic embryos expressing enhanced green fluorescent protein in forming ISVs reveals that the vascular defects in these embryos are due to an inability of ISV sprouts to migrate correctly along the intersegmental, normally laminin-rich regions.

    Developmental biology 2006;289;1;64-76

  • Epigenetic variation and inheritance in mammals.

    Rakyan VK and Beck S

    Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    What determines phenotype is one of the most fundamental questions in biology. Historically, the search for answers had focused on genetic or environmental variants, but recent studies in epigenetics have revealed a third mechanism that can influence phenotypic outcomes, even in the absence of genetic or environmental heterogeneity. Even more surprisingly, some epigenetic variants, or epialleles, can be inherited by the offspring, indicating the existence of a mechanism for biological heredity that is not based on DNA sequence. Recent work from mouse models, human monozygotic twin studies, and large-scale epigenetic profiling suggests that epigenetically determined phenotypes and epigenetic inheritance are more common than previously appreciated.

    Funded by: Wellcome Trust

    Current opinion in genetics & development 2006;16;6;573-7

  • MEROPS: the peptidase database.

    Rawlings ND, Morton FR and Barrett AJ

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    Peptidases (proteolytic enzymes) and their natural, protein inhibitors are of great relevance to biology, medicine and biotechnology. The MEROPS database aims to fulfil the need for an integrated source of information about these proteins. The organizational principle of the database is a hierarchical classification in which homologous sets of proteins of interest are grouped into families and the homologous families are grouped in clans. The most important addition to the database has been newly written, concise text annotations for each peptidase family. Other forms of information recently added include highlighting of active site residues (or the replacements that render some homologues inactive) in the sequence displays and BlastP search results, dynamically generated alignments and trees at the peptidase or inhibitor level, and a curated list of human and mouse homologues that have been experimentally characterized as active. A new way to display information at taxonomic levels higher than species has been devised. In the Literature pages, references have been flagged to draw attention to particularly 'hot' topics.

    Funded by: Wellcome Trust

    Nucleic acids research 2006;34;Database issue;D270-2

  • The genetics of mental retardation.

    Raymond FL and Tarpey P

    Department of Medical Genetics, Cambridge Institute of Medical Research, University of Cambridge, Addenbrookes Hospital, Cambridge CB2 2XY, UK.

    Genetic abnormalities frequently give rise to a mental retardation phenotype. Recent advances in resolution of comparative genomic hybridization and genomic sequence annotation have identified new syndromes at chromosome 3q29 and 9q34. The finding of a significant number of copy number polymorphisms in the genome in the normal population means that assigning pathogenicity to deletions and duplications in patients with mental retardation can be difficult, but pathogenicity has been established for duplications of MECP2 and L1CAM. Novel autosomal genes that cause mental retardation have been identified recently, including CC2D1A, identified by homozygosity mapping. Several new genes and pathways have been identified in the field of X-linked mental retardation but many more still await identification. Analysis of families where only a single male is affected reveals that the chance of this being due to a single X-linked gene abnormality is significantly less than would be expected if the excess of males in the population is entirely due to X-linked disease. Recent identification of novel X-linked mental retardation genes has identified components of the post-synaptic density and multiple zinc finger transcription factors as disease-causing, suggesting new mechanisms of disease causation. The first therapeutic treatments of animal models of mental retardation have been reported: a Drosophila model of Fragile X syndrome has been treated with lithium or metabotropic glutamate receptor (mGluR) antagonists, and a mouse model of NF1 has been treated with the HMG-CoA reductase inhibitor lovastatin; these treatments improve the learning and memory skills of the models.

    Funded by: Wellcome Trust

    Human molecular genetics 2006;15 Spec No 2;R110-6

  • Interstitial 9q22.3 microdeletion: clinical and molecular characterisation of a newly recognised overgrowth syndrome.

    Redon R, Baujat G, Sanlaville D, Le Merrer M, Vekemans M, Munnich A, Carter NP, Cormier-Daire V and Colleaux L

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    In the course of a systematic whole genome screening of patients with unexplained overgrowth syndrome by microarray-based comparative genomic hybridisation (array-CGH), we have identified two children with nearly identical 6.5 Mb-long de novo interstitial deletions at 9q22.32-q22.33. The clinical phenotype includes macrocephaly, overgrowth and trigonocephaly. In addition, both children present with psychomotor delay, hyperactivity and distinctive facial features. Further analysis with a high-resolution custom microarray covering the whole breakpoint intervals with fosmids mapped the deletion breakpoints within 100-kb intervals: although the deletion boundaries are different for the two patients, nearly the same genes are deleted in both cases. We suggest therefore that microdeletion of 9q22.32-q22.33 is a novel cause of overgrowth and mental retardation. Its association with distinctive facial features should help in recognising this novel phenotype.

    Funded by: Wellcome Trust

    European journal of human genetics : EJHG 2006;14;6;759-67

  • Global variation in copy number in the human genome.

    Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W, Cho EK, Dallaire S, Freeman JL, González JR, Gratacòs M, Huang J, Kalaitzopoulos D, Komura D, MacDonald JR, Marshall CR, Mei R, Montgomery L, Nishimura K, Okamura K, Shen F, Somerville MJ, Tchinda J, Valsesia A, Woodwark C, Yang F, Zhang J, Zerjal T, Zhang J, Armengol L, Conrad DF, Estivill X, Tyler-Smith C, Carter NP, Aburatani H, Lee C, Jones KW, Scherer SW and Hurles ME

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single-nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. A total of 1,447 copy number variable regions (CNVRs), which can encompass overlapping or adjacent gains or losses, covering 360 megabases (12% of the genome) were identified in these populations. These CNVRs contained hundreds of genes, disease loci, functional elements and segmental duplications. Notably, the CNVRs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal marked variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.

    Funded by: NHLBI NIH HHS: T32 HL007627; Wellcome Trust: 077008, 077009, 077014

    Nature 2006;444;7118;444-54

  • Tectonic, a novel regulator of the Hedgehog pathway required for both activation and inhibition.

    Reiter JF and Skarnes WC

    Developmental and Stem Cell Biology Program, and Diabetes Center, University of California, San Francisco, 94143-0525, USA.

    We report the identification of a novel protein that participates in Hedgehog-mediated patterning of the neural tube. This protein, named Tectonic, is the founding member of a previously undescribed family of evolutionarily conserved secreted and transmembrane proteins. During neural tube development, mouse Tectonic is required for formation of the most ventral cell types and for full Hedgehog (Hh) pathway activation. Epistasis analyses reveal that Tectonic modulates Hh signal transduction downstream of Smoothened (Smo) and Rab23. Interestingly, characterization of Tectonic Shh and Tectonic Smo double mutants indicates that Tectonic plays an additional role in repressing Hh pathway activity.

    Genes & development 2006;20;1;22-7

  • The genomic sequence and analysis of the swine major histocompatibility complex.

    Renard C, Hart E, Sehra H, Beasley H, Coggill P, Howe K, Harrow J, Gilbert J, Sims S, Rogers J, Ando A, Shigenari A, Shiina T, Inoko H, Chardon P and Beck S

    LREG INRA CEA, Jouy en Josas, France.

    We describe the generation and analysis of an integrated sequence map of a 2.4-Mb region of pig chromosome 7, comprising the classical class I region, the extended and classical class II regions, and the class III region of the major histocompatibility complex (MHC), also known as swine leukocyte antigen (SLA) complex. We have identified and manually annotated 151 loci, of which 121 are known genes (predicted to be functional), 18 are pseudogenes, 8 are novel CDS loci, 3 are novel transcripts, and 1 is a putative gene. Nearly all of these loci have homologues in other mammalian genomes but orthologues could be identified with confidence for only 123 genes. The 28 genes (including all the SLA class I genes) for which unambiguous orthology to genes within the human reference MHC could not be established are of particular interest with respect to porcine-specific MHC function and evolution. We have compared the porcine MHC to other mammalian MHC regions and identified the differences between them. In comparison to the human MHC, the main differences include the absence of HLA-A and other class I-like loci, the absence of HLA-DP-like loci, and the separation of the extended and classical class II regions from the rest of the MHC by insertion of the centromere. We show that the centromere insertion has occurred within a cluster of BTNL genes located at the boundary of the class II and III regions, which might have resulted in the loss of an orthologue to human C6orf10 from this region.

    Funded by: Wellcome Trust

    Genomics 2006;88;1;96-110

  • ATM mutations that cause ataxia-telangiectasia are breast cancer susceptibility alleles.

    Renwick A, Thompson D, Seal S, Kelly P, Chagtai T, Ahmed M, North B, Jayatilake H, Barfoot R, Spanova K, McGuffog L, Evans DG, Eccles D, Breast Cancer Susceptibility Collaboration (UK), Easton DF, Stratton MR and Rahman N

    Section of Cancer Genetics, Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey, SM2 5NG, UK.

    We screened individuals from 443 familial breast cancer pedigrees and 521 controls for ATM sequence variants and identified 12 mutations in affected individuals and two in controls (P = 0.0047). The results demonstrate that ATM mutations that cause ataxia-telangiectasia in biallelic carriers are breast cancer susceptibility alleles in monoallelic carriers, with an estimated relative risk of 2.37 (95% confidence interval (c.i.) = 1.51-3.78, P = 0.0003). There was no evidence that other classes of ATM variant confer a risk of breast cancer.

    Funded by: Medical Research Council: G0000934; Wellcome Trust: 068545/Z/02

    Nature genetics 2006;38;8;873-5

  • Prenatal detection of unbalanced chromosomal rearrangements by array CGH.

    Rickman L, Fiegler H, Shaw-Smith C, Nash R, Cirigliano V, Voglino G, Ng BL, Scott C, Whittaker J, Adinolfi M, Carter NP and Bobrow M

    University of Cambridge, Department of Medical Genetics, Addenbrooke's Hospital, Hills Road, Cambridge,UK.

    Background: Karyotype analysis has been the standard method for prenatal cytogenetic diagnosis since the 1970s. Although highly reliable, the major limitation remains the requirement for cell culture, resulting in a delay of as much as 14 days to obtaining test results. Fluorescent in situ hybridisation (FISH) and quantitative fluorescent PCR (QF-PCR) rapidly detect common chromosomal abnormalities but do not provide a genome wide screen for unexpected imbalances. Array comparative genomic hybridisation (CGH) has the potential to combine the speed of DNA analysis with a large capacity to scan for genomic abnormalities. We have developed a genomic microarray of approximately 600 large insert clones designed to detect aneuploidy, known microdeletion syndromes, and large unbalanced chromosomal rearrangements.

    Methods: This array was tested alongside an array with an approximate resolution of 1 Mb in a blind study of 30 cultured prenatal and postnatal samples with microscopically confirmed unbalanced rearrangements.

    Results: At 1 Mb resolution, 22/30 rearrangements were identified, whereas 29/30 aberrations were detected using the custom designed array, owing to the inclusion of specifically chosen clones to give increased resolution at genomic loci clinically implicated in known microdeletion syndromes. Both arrays failed to identify a triploid karyotype. Thirty normal control samples produced no false positive results.

    Conclusions: Analysis of 30 uncultured prenatal samples showed that array CGH is capable of detecting aneuploidy in DNA isolated from as little as 1 ml of uncultured amniotic fluid; 29/30 samples were correctly diagnosed, the exception being another case of triploidy. These studies demonstrate the potential for array CGH to replace conventional cytogenetics in the great majority of prenatal diagnosis cases.

    Funded by: Wellcome Trust

    Journal of medical genetics 2006;43;4;353-61

  • Escherichia coli K-12: a cooperatively developed annotation snapshot--2005.

    Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett G, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D and Wanner BL

    Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA.

    The goal of this group project has been to coordinate and bring up-to-date information on all genes of Escherichia coli K-12. Annotation of the genome of an organism entails identification of genes, the boundaries of genes in terms of precise start and end sites, and description of the gene products. Known and predicted functions were assigned to each gene product on the basis of experimental evidence or sequence analysis. Since both kinds of evidence are constantly expanding, no annotation is complete at any moment in time. This is a snapshot analysis based on the most recent genome sequences of two E. coli K-12 bacteria. An accurate and up-to-date description of E. coli K-12 genes is of particular importance to the scientific community because experimentally determined properties of its gene products provide fundamental information for annotation of innumerable genes of other organisms. Availability of the complete genome sequence of two K-12 strains allows comparison of their genotypes and mutant status of alleles.

    Funded by: NIGMS NIH HHS: 1 R13 GM74562-01; Wellcome Trust

    Nucleic acids research 2006;34;1;1-9

  • Comparisons of dN/dS are time dependent for closely related bacterial genomes.

    Rocha EP, Smith JM, Hurst LD, Holden MT, Cooper JE, Smith NH and Feil EJ

    Atelier de BioInformatique, Université Paris VI, 75005 Paris, France.

    The ratio of non-synonymous (dN) to synonymous (dS) changes between taxa is frequently computed to assay the strength and direction of selection. Here we note that for comparisons between closely related strains and/or species a second parameter needs to be considered, namely the time since divergence of the two sequences under scrutiny. We demonstrate that a simple time lag model provides a general, parsimonious explanation of the extensive variation in the dN/dS ratio seen when comparing closely related bacterial genomes. We explore this model through simulation and comparative genomics, and suggest a role for hitch-hiking in the accumulation of non-synonymous mutations. We also note taxon-specific differences in the change of dN/dS over time, which may indicate variation in selection, or in population genetics parameters such as population size or the rate of recombination. The effect of comparing intra-species polymorphism and inter-species substitution, and the problems associated with these concepts for asexual prokaryotes, are also discussed. We conclude that, because of the critical effect of time since divergence, inter-taxa comparisons are only possible by comparing trajectories of dN/dS over time and it is not valid to compare taxa on the basis of single time points.

    Journal of theoretical biology 2006;239;2;226-35

  • Upf1, an RNA helicase required for nonsense-mediated mRNA decay, modulates the transcriptional response to oxidative stress in fission yeast.

    Rodríguez-Gabriel MA, Watt S, Bähler J and Russell P

    Department of Molecular Biology, MB-3, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA.

    In the fission yeast Schizosaccharomyces pombe, oxidative stress triggers the activation of the Spc1/Sty1 mitogen-activated protein kinase, which in turn phosphorylates the Atf1/Pcr1 heterodimeric transcription factor to effect global changes in the patterns of gene expression. This transcriptional response is also controlled by Csx1, an RNA-binding protein that directly associates with and stabilizes atf1(+) mRNA. Here we report the surprising observation that this response also requires Upf1, a component of the nonsense-mediated mRNA decay (NMD) system. Accordingly, upf1Delta and csx1Delta strains are similarly sensitive to oxidative stress, and the effects of the mutations are not additive, suggesting that Upf1 and Csx1 work in the same pathway to stabilize atf1(+) mRNA during oxidative stress. Consistent with these observations, whole-genome expression profiling studies have shown that Upf1 controls the expression of more than 100 genes that are transcriptionally induced in response to oxidative stress, the large majority of which are also controlled by Atf1 and Csx1. The unexpected connection between an NMD factor and the oxidative stress response in fission yeast may provide important new clues about the physiological function of NMD in other species.

    Funded by: Cancer Research UK: A6517; NIEHS NIH HHS: ES10337; Wellcome Trust: 077118

    Molecular and cellular biology 2006;26;17;6347-56

  • Reciprocal chromosome painting between three laboratory rodent species.

    Romanenko SA, Perelman PL, Serdukova NA, Trifonov VA, Biltueva LS, Wang J, Li T, Nie W, O'Brien PC, Volobouev VT, Stanyon R, Ferguson-Smith MA, Yang F and Graphodatsky AS

    Institute of Cytology and Genetics, Siberian Branch, Russian Academy of Sciences, Novosibirsk, 630090, Russia.

    The laboratory mouse (Mus musculus, 2n = 40), the Chinese hamster (Cricetulus griseus, 2n = 22), and the golden (Syrian) hamster (Mesocricetus auratus, 2n = 44) are common laboratory animals, extensively used in biomedical research. In contrast with the mouse genome, which has been sequenced and well characterized, the hamster species have been set aside. We constructed a chromosome paint set for the golden hamster, which for the first time allowed us to perform multidirectional chromosome painting between the golden hamster and the mouse and between the two species of hamster. From these data we constructed a detailed comparative chromosome map of the laboratory mouse and the two hamster species. The golden hamster painting probes revealed 25 autosomal segments in the Chinese hamster and 43 in the mouse. Using the Chinese hamster probes, 23 conserved segments were found in the golden hamster karyotype. The mouse probes revealed 42 conserved autosomal segments in the golden hamster karyotype. The two largest chromosomes of the Chinese hamster (1 and 2) are homologous to seven and five chromosomes of the golden hamster, respectively. The golden hamster karyotype can be transformed into the Chinese hamster karyotype by 15 fusions and 3 fissions. Previous reconstructions of the ancestral murid karyotype proposed diploid numbers from 2n = 52 to 2n = 54. By integrating the new multidirectional chromosome painting data presented here with previous comparative genomics data, we propose that syntenies to mouse Chrs 6 and 16 were both present and hypothesize a diploid number of 2n = 48 for the ancestral Murinae/Cricetinae karyotype.

    Funded by: Wellcome Trust

    Mammalian genome : official journal of the International Mammalian Genome Society 2006;17;12;1183-92

  • Array-CGH detection of micro rearrangements in mentally retarded individuals: clinical significance of imbalances present both in affected children and normal parents.

    Rosenberg C, Knijnenburg J, Bakker E, Vianna-Morgante AM, Sloos W, Otto PA, Kriek M, Hansson K, Krepischi-Santos AC, Fiegler H, Carter NP, Bijlsma EK, van Haeringen A, Szuhai K and Tanke HJ

    Background: The underlying causes of mental retardation remain unknown in about half the cases. Recent array-CGH studies demonstrated cryptic imbalances in about 25% of patients previously thought to be chromosomally normal.

    Methods: Array-CGH with approximately 3500 large insert clones spaced at approximately 1 Mb intervals was used to investigate DNA copy number changes in 81 mentally impaired individuals.

    Results: Imbalances never observed in control chromosomes were detected in 20 patients (25%): seven were de novo, nine were inherited, and four could not have their origin determined. Six other alterations detected by array were disregarded because they were shown by FISH either to hybridise to both homologues similarly in a presumptive deletion (one case) or to involve clones that hybridised to multiple sites (five cases). All de novo imbalances were assumed to be causally related to the abnormal phenotypes. Among the others, a causal relation between the rearrangements and an aberrant phenotype could be inferred in six cases, including two imbalances of the X chromosome, where the associated clinical features segregated as X linked recessive traits.

    Conclusions: In all, 13 of 81 patients (16%) were found to have chromosomal imbalances probably related to their clinical features. The clinical significance of the seven remaining imbalances remains unclear. The limited ability to differentiate between inherited copy number variations which cause abnormal phenotypes and rare variants unrelated to clinical alterations currently constitutes a limitation in the use of CGH-microarray for guiding genetic counselling.

    Journal of medical genetics 2006;43;2;180-6

  • The sequences of the human sex chromosomes.

    Ross MT, Bentley DR and Tyler-Smith C

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The sequences of both of the human sex chromosomes and of a substantial part of the chimpanzee Y chromosome have now been determined, and most of the protein-coding genes have been identified. The X chromosome codes for more than 800 proteins but the Y chromosome for only approximately 60, illustrating their very different evolutionary histories since their origin from an autosomal pair approximately 300 million years ago and explaining their differential importance in disease. These sequences have provided the basis for understanding normal patterns of variation, such as the distribution of SNPs, and patterns of linkage disequilibrium. In addition, they have been useful for identifying variants associated with simple Mendelian disorders such as microphthalmia or mental retardation, and more complex disorders such as osteoporosis.

    Funded by: Wellcome Trust

    Current opinion in genetics & development 2006;16;3;213-8

  • Evolutionary history of Salmonella typhi.

    Roumagnac P, Weill FX, Dolecek C, Baker S, Brisse S, Chinh NT, Le TA, Acosta CJ, Farrar J, Dougan G and Achtman M

    Max-Planck-Institut für Infektionsbiologie, Department of Molecular Biology, Charitéplatz 1, 10117 Berlin, Germany.

    For microbial pathogens, phylogeographic differentiation seems to be relatively common. However, the neutral population structure of Salmonella enterica serovar Typhi reflects the continued existence of ubiquitous haplotypes over millennia. In contrast, clinical use of fluoroquinolones has yielded at least 15 independent gyrA mutations within a decade and stimulated clonal expansion of haplotype H58 in Asia and Africa. Yet, antibiotic-sensitive strains and haplotypes other than H58 still persist despite selection for antibiotic resistance. Neutral evolution in Typhi appears to reflect the asymptomatic carrier state, and adaptive evolution depends on the rapid transmission of phenotypic changes through acute infections.

    Funded by: Wellcome Trust: 076962

    Science (New York, N.Y.) 2006;314;5803;1301-4

  • Otoferlin, defective in a human deafness form, is essential for exocytosis at the auditory ribbon synapse.

    Roux I, Safieddine S, Nouvian R, Grati M, Simmler MC, Bahloul A, Perfettini I, Le Gall M, Rostaing P, Hamard G, Triller A, Avan P, Moser T and Petit C

    Inserm UMRS587, Unité de Génétique des Déficits Sensoriels, Collège de France, Institut Pasteur, 25 rue du Dr Roux, 75015 Paris, France.

    The auditory inner hair cell (IHC) ribbon synapse operates with exceptional temporal precision and maintains a high level of neurotransmitter release. However, the molecular mechanisms underlying IHC synaptic exocytosis are largely unknown. We studied otoferlin, a predicted C2-domain transmembrane protein, which is defective in a recessive form of human deafness. We show that otoferlin expression in the hair cells correlates with afferent synaptogenesis and find that otoferlin localizes to ribbon-associated synaptic vesicles. Otoferlin binds Ca(2+) and displays Ca(2+)-dependent interactions with the SNARE proteins syntaxin1 and SNAP25. Otoferlin-deficient mice (Otof(-/-)) are profoundly deaf. Exocytosis in Otof(-/-) IHCs is almost completely abolished, despite normal ribbon synapse morphogenesis and Ca(2+) current. Thus, otoferlin is essential for a late step of synaptic vesicle exocytosis and may act as the major Ca(2+) sensor triggering membrane fusion at the IHC ribbon synapse.

    Cell 2006;127;2;277-89

  • A comprehensive study of chromosome 16q in invasive ductal and lobular breast carcinoma using array CGH.

    Roylance R, Gorman P, Papior T, Wan YL, Ives M, Watson JE, Collins C, Wortham N, Langford C, Fiegler H, Carter N, Gillett C, Sasieni P, Pinder S, Hanby A and Tomlinson I

    Molecular and Population Genetics Laboratory, Cancer Research UK, Lincoln's Inn Fields, London, UK.

    We analysed chromosome 16q in 106 breast cancers using tiling-path array-comparative genomic hybridization (aCGH). About 80% of ductal cancers (IDCs) and all lobular cancers (ILCs) lost at least part of 16q. Grade I (GI) IDCs and ILCs often lost the whole chromosome arm. Grade II (GII) and grade III (GIII) IDCs showed less frequent whole-arm loss, but often had complex changes, typically small regions of gain together with larger regions of loss. The boundaries of gains/losses tended to cluster, common sites being 54.5-55.5 Mb and 57.4-58.8 Mb. Overall, the peak frequency of loss (83% of cancers) occurred at 61.9-62.9 Mb. We also found several 'minimal' regions of loss/gain. However, no mutations in candidate genes (TRADD, CDH5, CDH8 and CDH11) were detected. Cluster analysis based on copy number changes identified a large group of cancers that had lost most of 16q, and two smaller groups (one with few changes, one with a tendency to show copy number gain). Although all morphological types occurred in each cluster group, IDCs (especially GII/GIII) were relatively overrepresented in the smaller groups. Cluster groups were not independently associated with survival. Use of tiling-path aCGH prompted re-evaluation of the hypothetical pathways of breast carcinogenesis. ILCs have the simplest changes on 16q and probably diverge from the IDC lineage close to the stage of 16q loss. Higher-grade IDCs probably develop from low-grade lesions in most cases, but there remains evidence that some GII/GIII IDCs arise without a GI precursor.

    Oncogene 2006;25;49;6544-53

  • Cell-cell signaling in Xanthomonas campestris involves an HD-GYP domain protein that functions in cyclic di-GMP turnover.

    Ryan RP, Fouhy Y, Lucey JF, Crossman LC, Spiro S, He YW, Zhang LH, Heeb S, Cámara M, Williams P and Dow JM

    BIOMERIT Research Centre, Department of Microbiology, BioSciences Institute, National University of Ireland, Cork, Ireland.

    HD-GYP is a protein domain of unknown biochemical function implicated in bacterial signaling and regulation. In the plant pathogen Xanthomonas campestris pv. campestris, the synthesis of virulence factors and dispersal of biofilms are positively controlled by a two-component signal transduction system comprising the HD-GYP domain regulatory protein RpfG and cognate sensor RpfC and by cell-cell signaling mediated by the diffusible signal molecule DSF (diffusible signal factor). The RpfG/RpfC two-component system has been implicated in DSF perception and signal transduction. Here we show that the role of RpfG is to degrade the unusual nucleotide cyclic di-GMP, an activity associated with the HD-GYP domain. Mutation of the conserved H and D residues of the isolated HD-GYP domain resulted in loss of both the enzymatic activity against cyclic di-GMP and the regulatory activity in virulence factor synthesis. Two other protein domains, GGDEF and EAL, are already implicated in the synthesis and degradation respectively of cyclic di-GMP. As with GGDEF and EAL domains, the HD-GYP domain is widely distributed in free-living bacteria and occurs in plant and animal pathogens, as well as beneficial symbionts and organisms associated with a range of environmental niches. Identification of the role of the HD-GYP domain thus increases our understanding of a signaling network whose importance to the lifestyle of diverse bacteria is now emerging.

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;17;6712-7

  • An explosive-degrading cytochrome P450 activity and its targeted application for the phytoremediation of RDX.

    Rylott EL, Jackson RG, Edwards J, Womack GL, Seth-Smith HM, Rathbone DA, Strand SE and Bruce NC

    CNAP, Department of Biology, University of York, PO Box 373, York, YO10 5YW, UK.

    The widespread presence in the environment of hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX), one of the most widely used military explosives, has raised concern owing to its toxicity and recalcitrance to degradation. To investigate the potential of plants to remove RDX from contaminated soil and water, we engineered Arabidopsis thaliana to express a bacterial gene xplA encoding an RDX-degrading cytochrome P450 (ref. 1). We demonstrate that the P450 domain of XplA is fused to a flavodoxin redox partner and catalyzes the degradation of RDX in the absence of oxygen. Transgenic A. thaliana expressing xplA removed and detoxified RDX from liquid media. As a model system for RDX phytoremediation, A. thaliana expressing xplA was grown in RDX-contaminated soil and found to be resistant to RDX phytotoxicity, producing shoot and root biomasses greater than those of wild-type plants. Our work suggests that expression of xplA in landscape plants may provide a suitable remediation strategy for sites contaminated by this class of explosives.

    Nature biotechnology 2006;24;2;216-9

  • Identification of the ancestral killer immunoglobulin-like receptor gene in primates.

    Sambrook JG, Bashirova A, Andersen H, Piatak M, Vernikos GS, Coggill P, Lifson JD, Carrington M and Beck S

    Immunogenomics Laboratory, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Background: Killer Immunoglobulin-like Receptors (KIR) are essential immuno-surveillance molecules. They are expressed on natural killer and T cells, and interact with human leukocyte antigens. KIR genes are highly polymorphic and contribute vital variability to our immune system. Numerous KIR genes, belonging to five distinct lineages, have been identified in all primates examined thus far and shown to be rapidly evolving. Since few KIR remain orthologous between species, with only one of them, KIR2DL4, shown to be common to human, apes and monkeys, the evolution of the KIR gene family in primates remains unclear.

    Results: Using comparative analyses, we have identified the ancestral KIR lineage (provisionally named KIR3DL0) in primates. We show KIR3DL0 to be highly conserved with the identification of orthologues in human (Homo sapiens), common chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), rhesus monkey (Macaca mulatta) and common marmoset (Callithrix jacchus). We predict KIR3DL0 to encode a functional molecule in all primates by demonstrating expression in human, chimpanzee and rhesus monkey. Using the rhesus monkey as a model, we further show the expression profile to be typical of KIR by quantitative measurement of KIR3DL0 from an enriched population of natural killer cells.

    Conclusion: One reason why KIR3DL0 may have escaped discovery for so long is that, in human, it maps in between two related leukocyte immunoglobulin-like receptor clusters outside the known KIR gene cluster on Chromosome 19. Based on genomic, cDNA, expression and phylogenetic data, we report a novel lineage of immunoglobulin receptors belonging to the KIR family, which is highly conserved throughout 50 million years of primate evolution.

    Funded by: NCI NIH HHS: N01-CO-12400; Wellcome Trust

    BMC genomics 2006;7;209

  • WormBase: better software, richer content.

    Schwarz EM, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Canaran P, Chan J, Chen N, Chen WJ, Davis P, Fiedler TJ, Girard L, Harris TW, Kenny EE, Kishore R, Lawson D, Lee R, Müller HM, Nakamura C, Ozersky P, Petcherski A, Rogers A, Spooner W, Tuli MA, Van Auken K, Wang D, Durbin R, Spieth J, Stein LD and Sternberg PW

    Division of Biology, 156-29 California Institute of Technology, Pasadena, CA, 91125, USA.

    WormBase, the public database for genomics and biology of Caenorhabditis elegans, has been restructured for stronger performance and expanded for richer biological content. Performance was improved by accelerating the loading of central data pages such as the omnibus Gene page, by rationalizing internal data structures and software for greater portability, and by making the Genome Browser highly customizable in how it views and exports genomic subsequences. Arbitrarily complex, user-specified queries are now possible through Textpresso (for all available literature) and through WormMart (for most genomic data). Biological content was enriched by reconciling all available cDNA and expressed sequence tag data with gene predictions, clarifying single nucleotide polymorphism and RNAi sites, and summarizing known functions for most genes studied in this organism.

    Funded by: NHGRI NIH HHS: P41-HG02223

    Nucleic acids research 2006;34;Database issue;D475-8

  • Truncating mutations in the Fanconi anemia J gene BRIP1 are low-penetrance breast cancer susceptibility alleles.

    Seal S, Thompson D, Renwick A, Elliott A, Kelly P, Barfoot R, Chagtai T, Jayatilake H, Ahmed M, Spanova K, North B, McGuffog L, Evans DG, Eccles D, Breast Cancer Susceptibility Collaboration (UK), Easton DF, Stratton MR and Rahman N

    Section of Cancer Genetics, Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey, SM2 5NG, UK.

    We identified constitutional truncating mutations of the BRCA1-interacting helicase BRIP1 in 9/1,212 individuals with breast cancer from BRCA1/BRCA2 mutation-negative families but in only 2/2,081 controls (P = 0.0030), and we estimate that BRIP1 mutations confer a relative risk of breast cancer of 2.0 (95% confidence interval = 1.2-3.2, P = 0.012). Biallelic BRIP1 mutations were recently shown to cause Fanconi anemia complementation group J. Thus, inactivating truncating mutations of BRIP1, similar to those in BRCA2, cause Fanconi anemia in biallelic carriers and confer susceptibility to breast cancer in monoallelic carriers.

    Funded by: Medical Research Council: G0000934; Wellcome Trust: 068545/Z/02

    Nature genetics 2006;38;11;1239-41
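
    The carrier counts quoted in the BRIP1 entry above (9 of 1,212 cases, 2 of 2,081 controls) can be laid out as a 2x2 table. The sketch below is illustrative only: it computes a crude odds ratio and a two-sided Fisher exact P value from those counts. The function name and the choice of Fisher's test are assumptions made here; the abstract does not say which test produced its P value. The crude odds ratio is also not the abstract's relative risk of 2.0, which the authors estimated separately.

```python
from math import comb

# Counts taken from the abstract; the variable names are illustrative.
carriers_cases, total_cases = 9, 1212
carriers_controls, total_controls = 2, 2081


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


noncarriers_cases = total_cases - carriers_cases
noncarriers_controls = total_controls - carriers_controls

# Crude (unadjusted) odds ratio from the same table.
odds_ratio = (carriers_cases * noncarriers_controls) / (
    noncarriers_cases * carriers_controls
)
p_value = fisher_exact_two_sided(
    carriers_cases, noncarriers_cases, carriers_controls, noncarriers_controls
)

print(f"carrier frequency, cases:    {carriers_cases / total_cases:.4%}")
print(f"carrier frequency, controls: {carriers_controls / total_controls:.4%}")
print(f"crude odds ratio: {odds_ratio:.2f}")
print(f"Fisher exact P (two-sided): {p_value:.4f}")
```

    With so few carriers in either group, an exact test is a safer choice than a chi-squared approximation, which is why it is used in this sketch.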

  • Colonic irritation.

    Sebaihia M and Thomson NR

    Nature reviews. Microbiology 2006;4;12;882-3

  • Comparison of the genome sequence of the poultry pathogen Bordetella avium with those of B. bronchiseptica, B. pertussis, and B. parapertussis reveals extensive diversity in surface structures associated with host interaction.

    Sebaihia M, Preston A, Maskell DJ, Kuzmiak H, Connell TD, King ND, Orndorff PE, Miyamoto DM, Thomson NR, Harris D, Goble A, Lord A, Murphy L, Quail MA, Rutter S, Squares R, Squares S, Woodward J, Parkhill J and Temple LM

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    Bordetella avium is a pathogen of poultry and is phylogenetically distinct from Bordetella bronchiseptica, Bordetella pertussis, and Bordetella parapertussis, which are other species in the Bordetella genus that infect mammals. In order to understand the evolutionary relatedness of Bordetella species and further the understanding of pathogenesis, we obtained the complete genome sequence of B. avium strain 197N, a pathogenic strain that has been extensively studied. With 3,732,255 base pairs of DNA and 3,417 predicted coding sequences, it has the smallest genome and gene complement of the sequenced bordetellae. In this study, the presence or absence of previously reported virulence factors from B. avium was confirmed, and the genetic bases for growth characteristics were elucidated. Over 1,100 genes present in B. avium but not in B. bronchiseptica were identified, and most were predicted to encode surface or secreted proteins that are likely to define an organism adapted to the avian rather than the mammalian respiratory tracts. These include genes coding for the synthesis of a polysaccharide capsule, hemagglutinins, a type I secretion system adjacent to two very large genes for secreted proteins, and unique genes for both lipopolysaccharide and fimbrial biogenesis. Three apparently complete prophages are also present. The BvgAS virulence regulatory system appears to have polymorphisms at a poly(C) tract that is involved in phase variation in other bordetellae. A number of putative iron-regulated outer membrane proteins were predicted from the sequence, and this regulation was confirmed experimentally for five of these.

    Funded by: NIDCR NIH HHS: T32 DE007034

    Journal of bacteriology 2006;188;16;6002-15

  • The multidrug-resistant human pathogen Clostridium difficile has a highly mobile, mosaic genome.

    Sebaihia M, Wren BW, Mullany P, Fairweather NF, Minton N, Stabler R, Thomson NR, Roberts AP, Cerdeño-Tárraga AM, Wang H, Holden MT, Wright A, Churcher C, Quail MA, Baker S, Bason N, Brooks K, Chillingworth T, Cronin A, Davis P, Dowd L, Fraser A, Feltwell T, Hance Z, Holroyd S, Jagels K, Moule S, Mungall K, Price C, Rabbinowitsch E, Sharp S, Simmonds M, Stevens K, Unwin L, Whithead S, Dupuy B, Dougan G, Barrell B and Parkhill J

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    We determined the complete genome sequence of Clostridium difficile strain 630, a virulent and multidrug-resistant strain. Our analysis indicates that a large proportion (11%) of the genome consists of mobile genetic elements, mainly in the form of conjugative transposons. These mobile elements are putatively responsible for the acquisition by C. difficile of an extensive array of genes involved in antimicrobial resistance, virulence, host interaction and the production of surface structures. The metabolic capabilities encoded in the genome show multiple adaptations for survival and growth within the gut environment. The extreme genome variability was confirmed by whole-genome microarray analysis; it may reflect the organism's niche in the gut and should provide information on the evolution of virulence in this organism.

    Funded by: Wellcome Trust

    Nature genetics 2006;38;7;779-86

  • Metazoan Scc4 homologs link sister chromatid cohesion to cell and axon migration guidance.

    Seitan VC, Banks P, Laval S, Majid NA, Dorsett D, Rana A, Smith J, Bateman A, Krpic S, Hostert A, Rollins RA, Erdjument-Bromage H, Tempst P, Benard CY, Hekimi S, Newbury SF and Strachan T

    Institute of Human Genetics, University of Newcastle, Newcastle upon Tyne, United Kingdom.

    Saccharomyces cerevisiae Scc2 binds Scc4 to form an essential complex that loads cohesin onto chromosomes. The prevalence of Scc2 orthologs in eukaryotes emphasizes a conserved role in regulating sister chromatid cohesion, but homologs of Scc4 have not hitherto been identified outside certain fungi. Some metazoan orthologs of Scc2 were initially identified as developmental gene regulators, such as Drosophila Nipped-B, a regulator of cut and Ultrabithorax, and delangin, a protein mutant in Cornelia de Lange syndrome. We show that delangin and Nipped-B bind previously unstudied human and fly orthologs of Caenorhabditis elegans MAU-2, a non-axis-specific guidance factor for migrating cells and axons. PSI-BLAST shows that Scc4 is evolutionarily related to metazoan MAU-2 sequences, with the greatest homology evident in a short N-terminal domain, and protein-protein interaction studies map the site of interaction between delangin and human MAU-2 to the N-terminal regions of both proteins. Short interfering RNA knockdown of human MAU-2 in HeLa cells resulted in precocious sister chromatid separation and in impaired loading of cohesin onto chromatin, indicating that it is functionally related to Scc4, and RNAi analyses show that MAU-2 regulates chromosome segregation in C. elegans embryos. Using antisense morpholino oligonucleotides to knock down Xenopus tropicalis delangin or MAU-2 in early embryos produced similar patterns of retarded growth and developmental defects. Our data show that sister chromatid cohesion in metazoans involves the formation of a complex similar to the Scc2-Scc4 interaction in the budding yeast. The very high degree of sequence conservation between Scc4 homologs in complex metazoans is consistent with increased selection pressure to conserve additional essential functions, such as regulation of cell and axon migration during development.

    Funded by: NIGMS NIH HHS: R01 GM055683, R01 GM063403, R01 GM063403-01, R01 GM063403-02, R01 GM063403-03; Wellcome Trust: 087656

    PLoS biology 2006;4;8;e242

  • Where there's muck there's microbes.

    Seth-Smith H and Bentley S

    Nature reviews. Microbiology 2006;4;9;646-7

  • The fission yeast Rpb4 subunit of RNA polymerase II plays a specialized role in cell separation.

    Sharma N, Marguerat S, Mehta S, Watt S and Bähler J

    University School of Biotechnology, G.G.S. Indraprastha University, Kashmere Gate, Delhi, 110006, India.

    RNA polymerase II is a complex of 12 subunits, Rpb1 to Rpb12, whose specific roles are only partly understood. Rpb4 is essential in mammals and fission yeast, but not in budding yeast. To learn more about the roles of Rpb4, we expressed the rpb4 gene under the control of regulatable promoters of different strength in fission yeast. We demonstrate that below a critical level of transcription, Rpb4 affects cellular growth in proportion to its expression level: cells expressing lower levels of rpb4 grew more slowly than cells expressing higher levels. Lowered rpb4 expression did not affect cell survival under several stress conditions, but it caused specific defects in cell separation similar to sep mutants. Microarray analysis revealed that lowered rpb4 expression causes a global reduction in gene expression, but the transcript levels of a distinct subset of genes were particularly responsive to changes in rpb4 expression. These genes show some overlap with those regulated by the Sep1-Ace2 transcriptional cascade required for cell separation. Most notably, the gene expression signature of cells with lowered rpb4 expression was highly similar to those of mcs6, pmh1, sep10 and sep15 mutants. Mcs6 and Pmh1 encode orthologs of metazoan TFIIH-associated cyclin-dependent kinase (CDK)-activating kinase (Cdk7-cyclin H-Mat1), while Sep10 and Sep15 encode mediator components. Our results suggest that Rpb4, along with some other general transcription factors, plays a specialized role in a transcriptional pathway that controls the cell cycle-regulated transcription of a specific subset of genes involved in cell division.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Molecular genetics and genomics : MGG 2006;276;6;545-54

  • Microdeletion encompassing MAPT at chromosome 17q21.3 is associated with developmental delay and learning disability.

    Shaw-Smith C, Pittman AM, Willatt L, Martin H, Rickman L, Gribble S, Curley R, Cumming S, Dunn C, Kalaitzopoulos D, Porter K, Prigmore E, Krepischi-Santos AC, Varela MC, Koiffmann CP, Lees AJ, Rosenberg C, Firth HV, de Silva R and Carter NP

    University of Cambridge Department of Medical Genetics, Addenbrooke's Hospital, Cambridge CB2 2QQ, UK.

    Recently, the application of array-based comparative genomic hybridization (array CGH) has improved rates of detection of chromosomal imbalances in individuals with mental retardation and dysmorphic features. Here, we describe three individuals with learning disability and a heterozygous deletion at chromosome 17q21.3, detected in each case by array CGH. FISH analysis demonstrated that the deletions occurred as de novo events in each individual and were between 500 kb and 650 kb in size. A recently described 900-kb inversion that suppresses recombination between ancestral H1 and H2 haplotypes encompasses the deletion. We show that, in each trio, the parent of origin of the deleted chromosome 17 carries at least one H2 chromosome. This region of 17q21.3 shows complex genomic architecture with well-described low-copy repeats (LCRs). The orientation of LCRs flanking the deleted segment in inversion heterozygotes is likely to facilitate the generation of this microdeletion by means of non-allelic homologous recombination.

    Funded by: Medical Research Council: G0501560, G0501560(76517); Wellcome Trust

    Nature genetics 2006;38;9;1032-7

  • A genome wide linkage search for breast cancer susceptibility genes.

    Smith P, McGuffog L, Easton DF, Mann GJ, Pupo GM, Newman B, Chenevix-Trench G, kConFab Investigators, Szabo C, Southey M, Renard H, Odefrey F, Lynch H, Stoppa-Lyonnet D, Couch F, Hopper JL, Giles GG, McCredie MR, Buys S, Andrulis I, Senie R, BCFS, BRCAX Collaborators Group, Goldgar DE, Oldenburg R, Kroeze-Jansema K, Kraan J, Meijers-Heijboer H, Klijn JG, van Asperen C, van Leeuwen I, Vasen HF, Cornelisse CJ, Devilee P, Baskcomb L, Seal S, Barfoot R, Mangion J, Hall A, Edkins S, Rapley E, Wooster R, Chang-Claude J, Eccles D, Evans DG, Futreal PA, Nathanson KL, Weber BL, Breast Cancer Susceptibility Collaboration (UK), Rahman N and Stratton MR

    CR-UK Genetic Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.

    Mutations in known breast cancer susceptibility genes account for a minority of the familial aggregation of the disease. To search for further breast cancer susceptibility genes, we performed a combined analysis of four genome-wide linkage screens, which included a total of 149 multiple case breast cancer families. All families included at least three cases of breast cancer diagnosed below age 60 years, at least one of whom had been tested and found not to carry a BRCA1 or BRCA2 mutation. Evidence for linkage was assessed using parametric linkage analysis, assuming both a dominant and a recessive mode of inheritance, and using nonparametric methods. The highest LOD score obtained in any analysis of the combined data was 1.80 under the dominant model, in a region on chromosome 4 close to marker D4S392. Three further LOD scores over 1 were identified in the parametric analyses and two in the nonparametric analyses. A maximum LOD score of 2.40 was found on chromosome arm 2p in families with four or more cases of breast cancer diagnosed below age 50 years. The number of linkage peaks did not differ from the number expected by chance. These results suggest regions that may harbor novel breast cancer susceptibility genes. They also indicate that no single gene is likely to account for a large fraction of the familial aggregation of breast cancer that is not due to mutations in BRCA1 or BRCA2.

    Funded by: Cancer Research UK: A3353, A5260; NCI NIH HHS: CA-95-003; Wellcome Trust: 077012

    Genes, chromosomes & cancer 2006;45;7;646-55
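
    The LOD scores quoted in the linkage entry above (1.80 and 2.40) are base-10 logarithms of a likelihood ratio, so they convert directly to odds in favour of linkage. The sketch below uses the textbook two-point definition for phase-known meioses. It is a simplification and not the parametric or nonparametric analyses the authors ran on their families; the function names and the example counts (1 recombinant in 10 meioses) are hypothetical.

```python
from math import log10


def lod_to_odds(lod):
    """Convert a LOD score to the likelihood ratio (odds) it represents."""
    return 10 ** lod


def lod_score(recombinants, meioses, theta):
    """Two-point LOD for phase-known meioses at recombination fraction theta.

    Compares the likelihood at theta (0 < theta < 0.5) with the likelihood
    under no linkage (theta = 0.5). Textbook simplification only.
    """
    non_recombinants = meioses - recombinants
    log_l_theta = recombinants * log10(theta) + non_recombinants * log10(1 - theta)
    log_l_null = meioses * log10(0.5)
    return log_l_theta - log_l_null


# LOD values reported in the abstract, expressed as odds.
for reported in (1.80, 2.40):
    print(f"LOD {reported:.2f} corresponds to odds of about {lod_to_odds(reported):.0f}:1")

# Hypothetical counts, for illustration only.
print(f"example LOD: {lod_score(1, 10, 0.1):.2f}")
```

    Read this way, both reported peaks fall below the conventional LOD threshold of 3 (odds of 1000:1), which fits the abstract's statement that the number of linkage peaks did not differ from chance expectation.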

  • Genomic anatomy of the Tyrp1 (brown) deletion complex.

    Smyth IM, Wilming L, Lee AW, Taylor MS, Gautier P, Barlow K, Wallis J, Martin S, Glithero R, Phillimore B, Pelan S, Andrew R, Holt K, Taylor R, McLaren S, Burton J, Bailey J, Sims S, Squares J, Plumb B, Joy A, Gibson R, Gilbert J, Hart E, Laird G, Loveland J, Mudge J, Steward C, Swarbreck D, Harrow J, North P, Leaves N, Greystrong J, Coppola M, Manjunath S, Campbell M, Smith M, Strachan G, Tofts C, Boal E, Cobley V, Hunter G, Kimberley C, Thomas D, Cave-Berry L, Weston P, Botcherby MR, White S, Edgar R, Cross SH, Irvani M, Hummerich H, Simpson EH, Johnson D, Hunsicker PR, Little PF, Hubbard T, Campbell RD, Rogers J and Jackson IJ

    Medical Research Council Human Genetics Unit, Edinburgh EH4 2XU, United Kingdom.

    Chromosome deletions in the mouse have proven invaluable in the dissection of gene function. The brown deletion complex comprises >28 independent genome rearrangements, which have been used to identify several functional loci on chromosome 4 required for normal embryonic and postnatal development. We have constructed a 172-bacterial artificial chromosome contig that spans this 22-megabase (Mb) interval and have produced a contiguous, finished, and manually annotated sequence from these clones. The deletion complex is strikingly gene-poor, containing only 52 protein-coding genes (of which only 39 are supported by human homologues) and has several further notable genomic features, including several segments of >1 Mb, apparently devoid of a coding sequence. We have used sequence polymorphisms to finely map the deletion breakpoints and identify strong candidate genes for the known phenotypes that map to this region, including three lethal loci (l4Rn1, l4Rn2, and l4Rn3) and the fitness mutant brown-associated fitness (baf). We have also characterized misexpression of the basonuclin homologue, Bnc2, associated with the inversion-mediated coat color mutant white-based brown (B(w)). This study provides a molecular insight into the basis of several characterized mouse mutants, which will allow further dissection of this region by targeted or chemical mutagenesis.

    Funded by: Medical Research Council: MC_U123160651, MC_U127561112; Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;10;3704-9

  • The influence of recombination on human genetic diversity.

    Spencer CC, Deloukas P, Hunt S, Mullikin J, Myers S, Silverman B, Donnelly P, Bentley D and McVean G

    Department of Statistics, University of Oxford, Oxford, United Kingdom.

    In humans, the rate of recombination, as measured on the megabase scale, is positively associated with the level of genetic variation, as measured at the genic scale. Despite considerable debate, it is not clear whether these factors are causally linked or, if they are, whether this is driven by the repeated action of adaptive evolution or molecular processes such as double-strand break formation and mismatch repair. We introduce three innovations to the analysis of recombination and diversity: fine-scale genetic maps estimated from genotype experiments that identify recombination hotspots at the kilobase scale, analysis of an entire human chromosome, and the use of wavelet techniques to identify correlations acting at different scales. We show that recombination influences genetic diversity only at the level of recombination hotspots. Hotspots are also associated with local increases in GC content and the relative frequency of GC-increasing mutations but have no effect on substitution rates. Broad-scale association between recombination and diversity is explained through covariance of both factors with base composition. To our knowledge, these results are the first evidence of a direct and local influence of recombination hotspots on genetic variation and the fate of individual mutations. However, that hotspots have no influence on substitution rates suggests that they are too ephemeral on an evolutionary time scale to have a strong influence on broader scale patterns of base composition and long-term molecular evolution.

    Funded by: Wellcome Trust

    PLoS genetics 2006;2;9;e148

  • Stabilization of a plasmid coding for a heterologous antigen in Salmonella enterica serotype typhi vaccine strain CVD908-htrA by using site-specific recombination.

    Stephens JC, Darsley MJ and Turner AK

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    A gene cassette incorporating the crs-rsd site-specific recombination system from the Salmonella enterica subsp. enterica serovar Dublin virulence plasmid improved the inheritance in S. enterica serotype Typhi strain CVD908-htrA of a multicopy plasmid expression vector. Use of this recombination cassette may improve expression of heterologous antigens from multicopy plasmid expression vectors in attenuated bacterial vaccine strains.

    Infection and immunity 2006;74;7;4383-6

  • The Genetics of Type 2 Diabetes

    Stevenson C, Barroso I and Wareham N

    Nutritional Genomics: Impact on Health and Disease 2006;Chapter 13;222-65

  • From genome to vaccines for leishmaniasis: screening 100 novel vaccine candidates against murine Leishmania major infection.

    Stober CB, Lange UG, Roberts MT, Gilmartin B, Francis R, Almeida R, Peacock CS, McCann S and Blackwell JM

    Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK.

    The genomic sequence of Leishmania major provides a rich source of vaccine candidates. One hundred randomly selected amastigote-expressed genes were screened as DNA vaccines, and efficacy determined following high-dose L. major footpad challenge in BALB/c mice. Fourteen protective novel vaccine candidates were identified; seven vaccines exacerbated disease. There were no differences in the number of predicted MHC H-2d class I or II epitopes mapping to protective versus exacerbatory antigens. A proportion of both protective (7/14; 50%) and exacerbatory (4/7; 57%) proteins showed short (8- to 18-mer) 100% amino acid sequence identities to human, mouse or gut flora proteins. A high proportion of these (4/7 protective; 3/4 exacerbatory) showed full or partial overlap with RANKPEP-predicted H-2d classes I and II epitopes. Our data suggest, therefore, that there may be little difference between antigens/epitopes that drive regulatory versus effector CD4 T cell populations. The best novel protective antigen was an amastin-like gene that maps to a 17-gene tandem array on Leishmania chromosome 8 and is closely related to 37 other amastin-like genes. Two ribosomal proteins, a V-ATPase subunit, and a dynein light chain orthologue were the only other protective genes with putative functions.

    Funded by: Wellcome Trust

    Vaccine 2006;24;14;2602-16

  • From DNA to RNA to disease and back: the 'central dogma' of regulatory disease variation.

    Stranger BE and Dermitzakis ET

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Much of the focus of human disease genetics is directed towards identifying nucleotide variants that contribute to disease phenotypes. This is a complex problem, often involving contributions from multiple loci and their interactions, as well as effects due to environmental factors. Although some diseases with a genetic basis are caused by nucleotide changes that alter an amino acid sequence, in other cases, disease risk is associated with altered gene regulation. This paper focuses on how studies of gene expression variation might complement disease studies and provide crucial links between genotype and phenotype.

    Human genomics 2006;2;6;383-90

  • Expression of transgenes targeted to the Gt(ROSA)26Sor locus is orientation dependent.

    Strathdee D, Ibbotson H and Grant SG

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom; Centre for Neuroscience Research, University of Edinburgh, Edinburgh, Scotland.

    Background: Targeting transgenes to a chosen location in the genome has a number of advantages. A single copy of the DNA construct can be inserted by targeting into regions of chromatin that allow the desired developmental and tissue-specific expression of the transgene.

    Methodology: In order to develop a reliable system for reproducibly expressing transgenes it was decided to insert constructs at the Gt(ROSA)26Sor locus. A cytomegalovirus (CMV) promoter was used to drive expression of the Tetracycline (tet) transcriptional activator, rtTA2(s)-M2, and test the effectiveness of using the ROSA26 locus to allow transgene expression. The tet operator construct was inserted into one allele of ROSA26 and a tet responder construct controlling expression of EGFP was inserted into the other allele.

    Conclusions: Expression of the targeted transgenes was shown to be affected by both the presence of selectable marker cassettes and by the orientation of the transgenes with respect to the endogenous ROSA26 promoter. These results suggest that transcriptional interference from the endogenous gene promoter or from promoters in the selectable marker cassettes may be affecting transgene expression at the locus. Additionally we have been able to determine the optimal orientation for transgene expression at the ROSA26 locus.

    PLoS one 2006;1;e4

  • Staking claims in the biotechnology Klondike.

    Sulston J

    The Human Genetics Commission, London, England.

    Bulletin of the World Health Organization 2006;84;5;412-3

  • Her2-targeted therapies in non-small cell lung cancer.

    Swanton C, Futreal A and Eisen T

    Cancer Research UK London Research Institute, Signal Transduction Laboratory, UK.

    Sensitivity to Her2-directed therapies is complex and involves expression not only of Her2 but also of other epidermal growth factor receptor (EGFR) family members, their ligands, and molecules that influence pathway activity, such as insulin-like growth factor-1 receptor, PTEN, and p27. The EGFR experience has taught us that responses can easily be diluted in an unselected cohort of patients. To date, trials of Her2-targeted therapies, such as trastuzumab, have been insufficiently powered to determine whether patients with non-small cell lung cancer (NSCLC) with Her2 gene amplification (rather than overexpression by immunohistochemistry) may benefit from these agents. It is unclear whether agents targeting Her2 might prove successful in future clinical trials in a highly selected patient cohort, either with Her2 amplification or Her2 gene mutations. The frequency of Her2 mutations in NSCLC may be too low to justify a prospective clinical trial in this patient group. The frequency of Her2 amplification (2-23%) in NSCLC and the widespread availability of Her2 fluorescence in situ hybridization analysis may justify a final study of trastuzumab monotherapy in this patient population. The role played by Her2 as the obligate heterodimerization partner for the other EGFR family members renders Her2 an attractive target irrespective of receptor overexpression. The most promising Her2-targeted strategy will likely prove to be combinatorial approaches using an EGFR tyrosine kinase inhibitor together with Her2 dimerization inhibitors.

    Clinical cancer research : an official journal of the American Association for Cancer Research 2006;12;14 Pt 2;4377s-4383s

  • Mutations in FRMD7, a newly identified member of the FERM family, cause X-linked idiopathic congenital nystagmus.

    Tarpey P, Thomas S, Sarvananthan N, Mallya U, Lisgo S, Talbot CJ, Roberts EO, Awan M, Surendran M, McLean RJ, Reinecke RD, Langmann A, Lindner S, Koch M, Jain S, Woodruff G, Gale RP, Bastawrous A, Degg C, Droutsas K, Asproudis I, Zubcov AA, Pieh C, Veal CD, Machado RD, Backhouse OC, Baumber L, Constantinescu CS, Brodsky MC, Hunter DG, Hertle RW, Read RJ, Edkins S, O'Meara S, Parker A, Stevens C, Teague J, Wooster R, Futreal PA, Trembath RC, Stratton MR, Raymond FL and Gottlob I

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Idiopathic congenital nystagmus is characterized by involuntary, periodic, predominantly horizontal oscillations of both eyes. We identified 22 mutations in FRMD7 in 26 families with X-linked idiopathic congenital nystagmus. Screening of 42 singleton cases of idiopathic congenital nystagmus (28 males, 14 females) yielded three mutations (7%). We found restricted expression of FRMD7 in human embryonic brain and developing neural retina, suggesting a specific role in the control of eye movement and gaze stability.

    Funded by: Medical Research Council: G9826762; Wellcome Trust: 050211

    Nature genetics 2006;38;11;1242-4

  • Mutations in the gene encoding the Sigma 2 subunit of the adaptor protein 1 complex, AP1S2, cause X-linked mental retardation.

    Tarpey PS, Stevens C, Teague J, Edkins S, O'Meara S, Avis T, Barthorpe S, Buck G, Butler A, Cole J, Dicks E, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jones D, Menzies A, Mironenko T, Perry J, Raine K, Richardson D, Shepherd R, Small A, Tofts C, Varian J, West S, Widaa S, Yates A, Catford R, Butler J, Mallya U, Moon J, Luo Y, Dorkins H, Thompson D, Easton DF, Wooster R, Bobrow M, Carpenter N, Simensen RJ, Schwartz CE, Stevenson RE, Turner G, Partington M, Gecz J, Stratton MR, Futreal PA and Raymond FL

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    In a systematic sequencing screen of the coding exons of the X chromosome in 250 families with X-linked mental retardation (XLMR), we identified two nonsense mutations and one consensus splice-site mutation in the AP1S2 gene on Xp22 in three families. Affected individuals in these families showed mild-to-profound mental retardation. Other features included hypotonia early in life and delay in walking. AP1S2 encodes an adaptin protein that constitutes part of the adaptor protein complex found at the cytoplasmic face of coated vesicles located at the Golgi complex. The complex mediates the recruitment of clathrin to the vesicle membrane. Aberrant endocytic processing through disruption of adaptor protein complexes is likely to result from the AP1S2 mutations identified in the three XLMR-affected families, and such defects may plausibly cause abnormal synaptic development and function. AP1S2 is the first reported XLMR gene that encodes a protein directly involved in the assembly of endocytic vesicles.

    Funded by: NICHD NIH HHS: HD26202; Wellcome Trust

    American journal of human genetics 2006;79;6;1119-24

  • Human chromosome 11 DNA sequence and analysis including novel gene identification.

    Taylor TD, Noguchi H, Totoki Y, Toyoda A, Kuroki Y, Dewar K, Lloyd C, Itoh T, Takeda T, Kim DW, She X, Barlow KF, Bloom T, Bruford E, Chang JL, Cuomo CA, Eichler E, FitzGerald MG, Jaffe DB, LaButti K, Nicol R, Park HS, Seaman C, Sougnez C, Yang X, Zimmer AR, Zody MC, Birren BW, Nusbaum C, Fujiyama A, Hattori M, Rogers J, Lander ES and Sakaki Y

    RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.

    Chromosome 11, although average in size, is one of the most gene- and disease-rich chromosomes in the human genome. Initial gene annotation indicates an average gene density of 11.6 genes per megabase, including 1,524 protein-coding genes, some of which were identified using novel methods, and 765 pseudogenes. One-quarter of the protein-coding genes shows overlap with other genes. Of the 856 olfactory receptor genes in the human genome, more than 40% are located in 28 single- and multi-gene clusters along this chromosome. Out of the 171 disorders currently attributed to the chromosome, 86 remain for which the underlying molecular basis is not yet known, including several mendelian traits, cancer and susceptibility loci. The high-quality data presented here--nearly 134.5 million base pairs representing 99.8% coverage of the euchromatic sequence--provide scientists with a solid foundation for understanding the genetic basis of these disorders and other biological phenomena.

    Funded by: Medical Research Council: G0000107; Wellcome Trust

    Nature 2006;440;7083;497-500

  • Expression of mammalian GPCRs in C. elegans generates novel behavioural responses to human ligands.

    Teng MS, Dekkers MP, Ng BL, Rademakers S, Jansen G, Fraser AG and McCafferty J

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, UK.

    Background: G-protein-coupled receptors (GPCRs) play a crucial role in many biological processes and represent a major class of drug targets. However, purification of GPCRs for biochemical study is difficult and current methods of studying receptor-ligand interactions involve in vitro systems. Caenorhabditis elegans is a soil-dwelling, bacteria-feeding nematode that uses GPCRs expressed in chemosensory neurons to detect bacteria and environmental compounds, making this an ideal system for studying in vivo GPCR-ligand interactions. We sought to test this by functionally expressing two medically important mammalian GPCRs, somatostatin receptor 2 (Sstr2) and chemokine receptor 5 (CCR5) in the gustatory neurons of C. elegans.

    Results: Expression of Sstr2 and CCR5 in gustatory neurons allows C. elegans to specifically detect and respond to somatostatin and MIP-1alpha respectively in a robust avoidance assay. We demonstrate that mammalian heterologous GPCRs can signal via different endogenous Galpha subunits in C. elegans, depending on which cells they are expressed in. Furthermore, pre-exposure of GPCR transgenic animals to the receptor's ligand leads to receptor desensitisation and behavioural adaptation to subsequent ligand exposure, providing further evidence of integration of the mammalian GPCRs into the C. elegans sensory signalling machinery. In structure-function studies using a panel of somatostatin-14 analogues, we identified key residues involved in the interaction of somatostatin-14 with Sstr2.

    Conclusion: Our results illustrate a remarkable evolutionary plasticity in interactions between mammalian GPCRs and C. elegans signalling machinery, spanning 800 million years of evolution. This in vivo system, which imparts novel avoidance behaviour on C. elegans, thus provides a simple means of studying and screening interaction of GPCRs with extracellular agonists, antagonists and intracellular binding partners.

    Funded by: Wellcome Trust

    BMC biology 2006;4;22

  • A multicenter study of cancer incidence in CHEK2 1100delC mutation carriers.

    Thompson D, Seal S, Schutte M, McGuffog L, Barfoot R, Renwick A, Eeles R, Sodha N, Houlston R, Shanley S, Klijn J, Wasielewski M, Chang-Claude J, Futreal PA, Weber BL, Nathanson KL, Stratton M, Meijers-Heijboer H, Rahman N and Easton DF

    Genetic Epidemiology Unit, Strangeways Research Laboratories, University of Cambridge, Worts Causeway, Cambridge, CB1 8RN, United Kingdom.

    The CHEK2 1100delC protein-truncating mutation has a carrier frequency of approximately 0.7% in Northern and Western European populations and confers an approximately 2-fold increased risk of breast cancer. It has also been suggested to increase risks of colorectal and prostate cancer, but its involvement with these or other types of cancer has not been confirmed. The incidence of cancer other than breast cancer in 11,116 individuals from 734 non-BRCA1/2 breast cancer families from the United Kingdom, Germany, Netherlands, and the United States was compared with that predicted by population rates. Relative risks (RR) to carriers and noncarriers were estimated by maximum likelihood, via the expectation-maximization algorithm to allow for unknown genotypes. Sixty-seven families contained at least one tested CHEK2 1100delC mutation carrier. There was evidence of underreporting of cancers in male relatives (422 cancers observed, 860 expected) but not in females (322 observed, 335 expected); hence, we focused on cancer risks in female carriers. The risk of cancers other than breast cancer in female carriers was not significantly elevated, although a modest increase in risk could not be excluded (RR, 1.18; 95% confidence interval, 0.64-2.17). The carrier risk was not significantly raised for any individual cancer site, including colorectal cancer (RR, 1.60; 95% confidence interval, 0.54-4.71). However, between ages 20 to 50 years, the risks of colorectal and lung cancer were both higher in female carriers than noncarriers (P = 0.041 and 0.0001, respectively). There was no evidence of a higher prostate cancer risk in carriers than noncarriers (P = 0.26), although underreporting of male cancers limited our power to detect such a difference. Our results suggest that the risk of cancer associated with CHEK2 1100delC mutations is restricted to breast cancer, although we cannot rule out a small increase in overall cancer risk.

    Funded by: Cancer Research UK: A3353, A5260; Wellcome Trust: 077012

    Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 2006;15;12;2542-5

  • Bacterial home from home.

    Thomson N, Crossman L and Bentley S

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2006;4;3;168-70

  • The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Thomson NR, Howard S, Wren BW, Holden MT, Crossman L, Challis GL, Churcher C, Mungall K, Brooks K, Chillingworth T, Feltwell T, Abdellah Z, Hauser H, Jagels K, Maddison M, Moule S, Sanders M, Whitehead S, Quail MA, Dougan G, Parkhill J and Prentice MB

    The Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the genome evolution of other human enteropathogens.

    Funded by: Wellcome Trust

    PLoS genetics 2006;2;12;e206

  • Cells expressing murine RAD52 splice variants favor sister chromatid repair.

    Thorpe PH, Marrero VA, Savitzky MH, Sunjevaric I, Freeman TC and Rothstein R

    Department of Genetics and Development, Columbia University Medical Center, HHSC 1608, 701 West 168th St., New York, New York 10032, USA.

    The RAD52 gene is essential for homologous recombination in the yeast Saccharomyces cerevisiae. RAD52 is the archetype in an epistasis group of genes essential for DNA damage repair. By catalyzing the replacement of replication protein A with Rad51 on single-stranded DNA, Rad52 likely promotes strand invasion of a double-stranded DNA molecule by single-stranded DNA. Although the sequence and in vitro functions of mammalian RAD52 are conserved with those of yeast, one difference is the presence of introns and consequent splicing of the mammalian RAD52 pre-mRNA. We identified two novel splice variants from the RAD52 gene that are expressed in adult mouse tissues. Expression of these splice variants in tissue culture cells elevates the frequency of recombination that uses a sister chromatid template. To characterize this dominant phenotype further, the RAD52 gene from the yeast Saccharomyces cerevisiae was truncated to model the mammalian splice variants. The same dominant sister chromatid recombination phenotype seen in mammalian cells was also observed in yeast. Furthermore, repair from a homologous chromatid is reduced in yeast, implying that the choice of alternative repair pathways may be controlled by these variants. In addition, a dominant DNA repair defect induced by one of the variants in yeast is suppressed by overexpression of RAD51, suggesting that the Rad51-Rad52 interaction is impaired.

    Molecular and cellular biology 2006;26;10;3752-63

  • Combinatorial RNA interference in Caenorhabditis elegans reveals that redundancy between gene duplicates can be maintained for more than 80 million years of evolution.

    Tischler J, Lehner B, Chen N and Fraser AG

    The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

    Background: Systematic analyses of loss-of-function phenotypes have been carried out for most genes in Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Although such studies vastly expand our knowledge of single gene function, they do not address redundancy in genetic networks. Developing tools for the systematic mapping of genetic interactions is thus a key step in exploring the relationship between genotype and phenotype.

    Results: We established conditions for RNA interference (RNAi) in C. elegans to target multiple genes simultaneously in a high-throughput setting. Using this approach, we can detect the great majority of previously known synthetic genetic interactions. We used this assay to examine the redundancy of duplicated genes in the genome of C. elegans that correspond to single orthologs in S. cerevisiae or D. melanogaster and identified 16 pairs of duplicated genes that have redundant functions. Remarkably, 14 of these redundant gene pairs were duplicated before the divergence of C. elegans and C. briggsae 80-110 million years ago, suggesting that there has been selective pressure to maintain the overlap in function between some gene duplicates.

    Conclusion: We established a high throughput method for examining genetic interactions using combinatorial RNAi in C. elegans. Using this technique, we demonstrated that many duplicated genes can retain redundant functions for more than 80 million years of evolution. This provides strong support for evolutionary models that predict that genetic redundancy between duplicated genes can be actively maintained by natural selection and is not just a transient side effect of recent gene duplication events.

    Funded by: Wellcome Trust

    Genome biology 2006;7;8;R69

  • Genetic analysis of completely sequenced disease-associated MHC haplotypes identifies shuffling of segments in recent human history.

    Traherne JA, Horton R, Roberts AN, Miretti MM, Hurles ME, Stewart CA, Ashurst JL, Atrazhev AM, Coggill P, Palmer S, Almeida J, Sims S, Wilming LG, Rogers J, de Jong PJ, Carrington M, Elliott JF, Sawcer S, Todd JA, Trowsdale J and Beck S

    Department of Pathology, Immunology Division, University of Cambridge, Cambridge, United Kingdom.

    The major histocompatibility complex (MHC) is recognised as one of the most important genetic regions in relation to common human disease. Advancement in identification of MHC genes that confer susceptibility to disease requires greater knowledge of sequence variation across the complex. Highly duplicated and polymorphic regions of the human genome such as the MHC are, however, somewhat refractory to some whole-genome analysis methods. To address this issue, we are employing a bacterial artificial chromosome (BAC) cloning strategy to sequence entire MHC haplotypes from consanguineous cell lines as part of the MHC Haplotype Project. Here we present 4.25 Mb of the human haplotype QBL (HLA-A26-B18-Cw5-DR3-DQ2) and compare it with the MHC reference haplotype and with a second haplotype, COX (HLA-A1-B8-Cw7-DR3-DQ2), that shares the same HLA-DRB1, -DQA1, and -DQB1 alleles. We have defined the complete gene, splice variant, and sequence variation contents of all three haplotypes, comprising over 259 annotated loci and over 20,000 single nucleotide polymorphisms (SNPs). Certain coding sequences vary significantly between different haplotypes, making them candidates for functional and disease-association studies. Analysis of the two DR3 haplotypes allowed delineation of the shared sequence between two HLA class II-related haplotypes differing in disease associations and the identification of at least one of the sites that mediated the original recombination event. The levels of variation across the MHC were similar to those seen for other HLA-disparate haplotypes, except for a 158-kb segment that contained the HLA-DRB1, -DQA1, and -DQB1 genes and showed very limited polymorphism compatible with identity-by-descent and relatively recent common ancestry (<3,400 generations). These results indicate that the differential disease associations of these two DR3 haplotypes are due to sequence variation outside this central 158-kb segment, and that shuffling of ancestral blocks via recombination is a potential mechanism whereby certain DR-DQ allelic combinations, which presumably have favoured immunological functions, can spread across haplotypes and populations.

    Funded by: NCI NIH HHS: N01-CO-12400; Wellcome Trust: 048880

    PLoS genetics 2006;2;1;e9

  • A systematic analysis of human CHMP protein interactions: additional MIT domain-containing proteins bind to multiple components of the human ESCRT III complex.

    Tsang HT, Connell JW, Brown SE, Thompson A, Reid E and Sanderson CM

    Department of Medical Genetics and Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK.

    In Saccharomyces cerevisiae 6 closely related proteins (Did2p, Vps2p, Vps24p, Vps32p, Vps60p, Vps20p) form part of the extended ESCRT III complex. This complex is required for the formation of multivesicular bodies and the degradation of internalized transmembrane receptor proteins. In contrast the human genome encodes 10 homologous proteins (CHMP1A (approved gene symbol PCOLN3), 1B, 2A, 2B, 3 (approved gene symbol VPS24), 4A, 4B, 4C, 5, and 6). In this study we have performed a series of protein interaction experiments to generate a more comprehensive picture of the human CHMP protein-interaction network. Our results describe novel interactions between known components of the human ESCRT III complex and identify a range of putative binding partners, which may indicate new ways in which the function of human CHMP proteins may be regulated. In particular, we show that two further MIT domain-containing proteins (AMSH/STAMBP and LOC129531) interact with multiple components of the human ESCRT III complex.

    Genomics 2006;88;3;333-46

  • The acquisition of full fluoroquinolone resistance in Salmonella Typhi by accumulation of point mutations in the topoisomerase targets.

    Turner AK, Nair S and Wain J

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SA, UK.

    Objectives: To determine the contribution to fluoroquinolone resistance of point mutations in the gyrA and parC genes of Salmonella Typhi.

    Methods: Point mutations that result in Ser-83-->Phe, Ser-83-->Tyr and Asp-87-->Asn amino acid substitutions in GyrA and Glu-84-->Lys in ParC were introduced into a quinolone-susceptible, attenuated strain of Salmonella Typhi using suicide vector technology. This is the first time that this approach has been used in Salmonella and abrogates the need for selection with quinolone antibacterials in the investigation of resistance mutations.

    Results: A panel of mutants was created using this methodology and tested for quinolone resistance. The ParC substitution alone made no difference to quinolone susceptibility. Any single GyrA substitution resulted in resistance to nalidixic acid (MIC ≥ 512 mg/L) and increased by up to 23-fold the MIC of the fluoroquinolones ofloxacin (MIC ≤ 2 mg/L), ciprofloxacin (MIC ≤ 1 mg/L) and gatifloxacin (MIC ≤ 0.38 mg/L). Among the double substitution mutants, those with a substitution in ParC were less prone to killing with ciprofloxacin. The triple substitution mutants (Ser-83-->Phe or Tyr and Asp-87-->Asn in GyrA with Glu-84-->Lys in ParC) showed high levels of resistance to all the fluoroquinolones tested (MICs: gatifloxacin, 3-4 mg/L; ofloxacin, 32 mg/L; ciprofloxacin, 32-64 mg/L).

    Conclusions: In Salmonella Typhi the fluoroquinolones tested act on GyrA and, at higher concentrations, on ParC. The point mutations conferred reduced susceptibility to ofloxacin and ciprofloxacin, and also reduced susceptibility to gatifloxacin. Three mutations conferred resistance to ofloxacin (32 mg/L), ciprofloxacin (32 mg/L) and to the more active fluoroquinolone gatifloxacin (MIC ≥ 3 mg/L). These results predict that the use of ofloxacin or ciprofloxacin will select for resistance to gatifloxacin in nature.

    Funded by: Wellcome Trust

    The Journal of antimicrobial chemotherapy 2006;58;4;733-40

  • Harnessing asymmetrical substrate recognition by thermostable EndoV to achieve balanced linear amplification in multiplexed SNP typing.

    Turner DJ, Pingle MR and Barany F

    Department of Microbiology and Immunology, Weill Medical College of Cornell University, New York, NY 10021, USA.

    Multiplexed amplification of specific DNA sequences, by PCR or by strand-displacement amplification, is an intrinsically biased process. The relative abundance of amplified DNA can be altered significantly from the original representation and, in extreme cases, allele dropout can occur. In this paper, we present a method of linear amplification of DNA that relies on the cooperative, sequence-dependent functioning of the DNA mismatch-repair enzyme endonuclease V (EndoV) from Thermotoga maritima (Tma) and Bacillus stearothermophilus (Bst) DNA polymerase. Tma EndoV can nick one strand of unmodified duplex DNA, allowing extension by Bst polymerase. By controlling the bases surrounding a mismatch and the mismatch itself, the efficiency of nicking by EndoV and extension by Bst polymerase can be controlled. The method currently allows 100-fold multiplexed amplification of target molecules to be performed isothermally, with an average change of <1.3-fold in their original representation. Because only a single primer is necessary, primer artefacts and nonspecific amplification products are minimized.

    Biochemistry and cell biology = Biochimie et biologie cellulaire 2006;84;2;232-42

  • Assaying chromosomal inversions by single-molecule haplotyping.

    Turner DJ, Shendure J, Porreca G, Church G, Green P, Tyler-Smith C and Hurles ME

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    Inversions are an important form of structural variation, but they are difficult to characterize, as their breakpoints often fall within inverted repeats. We have developed a method called 'haplotype fusion' in which an inversion breakpoint is genotyped by performing fusion PCR on single molecules of human genomic DNA. Fusing single-copy sequences bracketing an inversion breakpoint generates orientation-specific PCR products, exemplified by a genotyping assay for the int22 hemophilia A inversion on Xq28. Furthermore, we demonstrated that inversion events with breakpoints embedded within long (>100 kb) inverted repeats can be genotyped by haplotype-fusion PCR followed by bead-based single-molecule haplotyping on repeat-specific markers bracketing the inversion breakpoint. We illustrate this method by genotyping a Yp paracentric inversion sponsored by >300-kb-long inverted repeats. The generality of our methods to survey for and genotype chromosomal inversions should help our understanding of the contribution of inversions to genomic variation, inherited diseases and cancer.

    Funded by: Wellcome Trust

    Nature methods 2006;3;6;439-45

  • The rise and fall of the ape Y chromosome?

    Tyler-Smith C, Howe K and Santos FR

    Nature genetics 2006;38;2;141-3

  • Mouse chromosome engineering for modeling human disease.

    van der Weyden L and Bradley A

    Mouse Genomics Lab, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.

    Chromosomal rearrangements occur frequently in humans and can be disease-associated or phenotypically neutral. Recent technological advances have led to the discovery of copy-number changes previously undetected by cytogenetic techniques. To understand the genetic consequences of such genomic changes, these mutations need to be modeled in experimentally tractable systems. The mouse is an excellent organism for this analysis because of its biological and genetic similarity to humans, and the ease with which its genome can be manipulated. Through chromosome engineering, defined rearrangements can be introduced into the mouse genome. The resulting mouse models are leading to a better understanding of the molecular and cellular basis of dosage alterations in human disease phenotypes, in turn opening new diagnostic and therapeutic opportunities.

    Funded by: Wellcome Trust: 077187

    Annual review of genomics and human genetics 2006;7;247-76

  • Loss of TSLC1 causes male infertility due to a defect at the spermatid stage of spermatogenesis.

    van der Weyden L, Arends MJ, Chausiaux OE, Ellis PJ, Lange UC, Surani MA, Affara N, Murakami Y, Adams DJ and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    Tumor suppressor of lung cancer 1 (TSLC1), also known as SgIGSF, IGSF4, and SynCAM, is strongly expressed in spermatogenic cells undergoing the early and late phases of spermatogenesis (spermatogonia to zygotene spermatocytes and elongating spermatids to spermiation). Using embryonic stem cell technology to generate a null mutation of Tslc1 in mice, we found that Tslc1 null male mice were infertile. Tslc1 null adult testes showed that spermatogenesis had arrested at the spermatid stage, with degenerating and apoptotic spermatids sloughing off into the lumen. In adult mice, Tslc1 null round spermatids showed evidence of normal differentiation (an acrosomal cap and F-actin polarization indistinguishable from that of wild-type spermatids); however, the surviving spermatozoa were immature, malformed, found at very low levels in the epididymis, and rarely motile. Analysis of the first wave of spermatogenesis in Tslc1 null mice showed a delay in maturation by day 22 and degeneration of round spermatids by day 28. Expression profiling of the testes revealed that Tslc1 null mice showed increases in the expression levels of genes involved in apoptosis, adhesion, and the cytoskeleton. Taken together, these data show that Tslc1 is essential for normal spermatogenesis in mice.

    Molecular and cellular biology 2006;26;9;3595-609

  • Functional knockout of the matrilin-3 gene causes premature chondrocyte maturation to hypertrophy and increases bone mineral density and osteoarthritis.

    van der Weyden L, Wei L, Luo J, Yang X, Birk DE, Adams DJ, Bradley A and Chen Q

    Mouse Genomics Lab, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom.

    Mutations in the gene encoding matrilin-3 (MATN3), a noncollagenous extracellular matrix protein, have been reported in a variety of skeletal diseases, including multiple epiphyseal dysplasia, which is characterized by irregular ossification of the epiphyses and early-onset osteoarthritis, spondylo-epimetaphyseal dysplasia, and idiopathic hand osteoarthritis. To assess the role of matrilin-3 in the pathogenesis of these diseases, we generated Matn3 functional knockout mice using embryonic stem cell technology. In the embryonic growth plate of the developing long bones, Matn3 null chondrocytes prematurely became prehypertrophic and hypertrophic, forming an expanded zone of hypertrophy. This expansion was attenuated during the perinatal period, and Matn3 homozygous null mice were viable and showed no gross skeletal malformations at birth. However, by 18 weeks of age, Matn3 null mice had a significantly higher total body bone mineral density than Matn1 null mice or wild-type littermates. Aged Matn3 null mice were much more predisposed to develop severe osteoarthritis than their wild-type littermates. Here, we show that matrilin-3 plays a role in modulating chondrocyte differentiation during embryonic development, in controlling bone mineral density in adulthood, and in preventing osteoarthritis during aging. The lack of Matn3 does not lead to postnatal chondrodysplasia but accounts for higher incidence of osteoarthritis.

    Funded by: NIAMS NIH HHS: R01 AR044745

    The American journal of pathology 2006;169;2;515-27

  • Defining a genomic radius for long-range enhancer action: duplicated conserved non-coding elements hold the key.

    Vavouri T, McEwen GK, Woolfe A, Gilks WR and Elgar G

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, CB10 1SA.

    Many conserved non-coding elements (CNEs) in vertebrate genomes have been shown to function as tissue-specific enhancers. However, the target genes of most CNEs are unknown. Here we show that the target genes of duplicated CNEs can be predicted by considering their neighbouring paralogous genes. This enables us to provide the first systematic estimate of the genomic range for distal cis-regulatory interactions in the human genome: half of CNEs are >250 kb away from their associated gene.

    Trends in genetics : TIG 2006;22;1;5-10

  • Dietary Cl(-) restriction upregulates pendrin expression within the apical plasma membrane of type B intercalated cells.

    Verlander JW, Kim YH, Shin W, Pham TD, Hassell KA, Beierwaltes WH, Green ED, Everett L, Matthews SW and Wall SM

    Department of Medicine, University of Florida College of Medicine, Gainesville, Florida, USA.

    Pendrin, encoded by Slc26a4, is a Cl(-)/HCO(3)(-) exchanger expressed in the apical region of type B and non-A, non-B intercalated cells, which regulates renal NaCl excretion. Dietary Cl(-) restriction upregulates total pendrin protein expression. Whether the subcellular expression of pendrin and the apparent vascular volume contraction observed in Slc26a4 null mice are Cl(-) dependent but Na(+) independent is unknown. Thus the subcellular distribution of pendrin and its role in acid-base and fluid balance were explored using immunogold cytochemistry and balance studies of mice ingesting a NaCl-replete or a Na(+)-replete, Cl(-)-restricted diet, achieved through substitution of NaCl with NaHCO(3). Boundary length and apical plasma membrane pendrin label density each increased by approximately 60-70% in type B intercalated cells, but not in non-A, non-B cells, whereas cytoplasmic pendrin immunolabel increased approximately 60% in non-A, non-B intercalated cells, but not in type B cells. Following either NaCl restriction or Cl(-) restriction alone, Slc26a4 null mice excreted more Cl(-) and had a higher arterial pH than pair-fed wild-type mice. In conclusion, 1) following dietary Cl(-) restriction, apical plasma membrane pendrin immunolabel increases in type B intercalated cells, but not in non-A, non-B intercalated cells; and 2) pendrin participates in the regulation of renal Cl(-) excretion and arterial pH during dietary Cl(-) restriction.

    Funded by: NIDDK NIH HHS: DK-52935

    American journal of physiology. Renal physiology 2006;291;4;F833-9

  • Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands.

    Vernikos GS and Parkhill J

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SA, UK.

    Motivation: There is a growing literature on the detection of Horizontal Gene Transfer (HGT) events by means of parametric, non-comparative methods. Such approaches rely only on sequence information and utilize different low and high order indices to capture compositional deviation from the genome backbone; the superiority of the latter over the former has been shown elsewhere. However, even high order k-mers may be poor estimators of HGT when insufficient information is available, e.g. in short sliding windows. Most of the current HGT prediction methods require pre-existing annotation, which may restrict their application to newly sequenced genomes.

    Results: We introduce a novel computational method, Interpolated Variable Order Motifs (IVOMs), which exploits compositional biases using variable order motif distributions and captures more reliably the local composition of a sequence compared with fixed-order methods. For optimal localization of the boundaries of each predicted region, a second order, two-state hidden Markov model (HMM) is implemented in a change-point detection framework. We applied the IVOM approach to the genome of Salmonella enterica serovar Typhi CT18, a well-studied prokaryote in terms of HGT events, and we show that the IVOMs outperform state-of-the-art low and high order motif methods predicting not only the already characterized Salmonella Pathogenicity Islands (SPI-1 to SPI-10) but also three novel SPIs (SPI-15, SPI-16, SPI-17) and other HGT events.
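
    As an illustration of the variable order idea described in the Results above, the following Python sketch interpolates motif frequencies of several orders within a sliding window and scores each window by its relative entropy against the whole genome. It is not the authors' implementation: the weighting scheme (count times 4**order), the uniform spreading of lower-order estimates over the unseen prefix, the function names and the parameter defaults are all assumptions made for this example, and the HMM boundary refinement is omitted.

```python
from collections import Counter
from itertools import product
from math import log

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def interpolated_distribution(seq, max_order):
    """Return a normalised score for every motif of length max_order.

    For each motif, the frequency of each suffix (order 1..max_order) is
    converted into an estimate for the full motif by spreading it uniformly
    over the unseen prefix. Estimates are averaged with weights
    count * 4**order (an assumption of this sketch), so well-sampled long
    motifs dominate and shorter motifs take over when data are sparse.
    """
    counts = {i: kmer_counts(seq, i) for i in range(1, max_order + 1)}
    totals = {i: sum(c.values()) for i, c in counts.items()}
    scores = {}
    for letters in product(ALPHABET, repeat=max_order):
        motif = "".join(letters)
        weighted_sum = 0.0
        weight_total = 0.0
        for i in range(1, max_order + 1):
            count = counts[i][motif[-i:]]
            if count == 0:
                continue
            weight = count * 4 ** i
            estimate = (count / totals[i]) / 4 ** (max_order - i)
            weighted_sum += weight * estimate
            weight_total += weight
        scores[motif] = weighted_sum / weight_total if weight_total else 0.0
    norm = sum(scores.values())
    if norm == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {m: s / norm for m, s in scores.items()}


def relative_entropy(window_dist, genome_dist, floor=1e-12):
    """Kullback-Leibler divergence of a window from the genome background."""
    return sum(
        p * log(p / max(genome_dist[m], floor))
        for m, p in window_dist.items()
        if p > 0
    )


def window_divergences(genome, window, step, max_order=3):
    """Score each sliding window; returns a list of (start, divergence)."""
    genome_dist = interpolated_distribution(genome, max_order)
    results = []
    for start in range(0, len(genome) - window + 1, step):
        dist = interpolated_distribution(genome[start:start + window], max_order)
        results.append((start, relative_entropy(dist, genome_dist)))
    return results
```

    A call such as window_divergences(genome, window=80, step=40) returns (start, divergence) pairs; windows whose composition departs from the genome backbone receive larger values, and a real analysis would need longer windows and a calibrated threshold.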

    Availability: The software is available under a GPL license as a standalone application at


    Supplementary data are available at Bioinformatics online.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2006;22;18;2196-203

  • Nonrandom distribution of Burkholderia pseudomallei clones in relation to geographical location and virulence.

    Vesaratchavest M, Tumapa S, Day NP, Wuthiekanun V, Chierakul W, Holden MT, White NJ, Currie BJ, Spratt BG, Feil EJ and Peacock SJ

    Faculty of Tropical Medicine, Mahidol University, 420/6 Rajvithi Road, Bangkok 10400, Thailand, and Churchill Hospital, Oxford, UK.

    Burkholderia pseudomallei is a soil-dwelling saprophyte and the causative agent of melioidosis, a life-threatening human infection. Most cases are reported from northeast Thailand and northern Australia. Using multilocus sequence typing (MLST), we have compared (i) soil and invasive isolates from northeast Thailand and (ii) invasive isolates from Thailand and Australia. A total of 266 Thai B. pseudomallei isolates were characterized (83 soil and 183 invasive). These corresponded to 123 sequence types (STs), the most abundant being ST70 (n=21), ST167 (n=15), ST54 (n=12), and ST58 (n=11). Two clusters of related STs (clonal complexes) were identified; the larger clonal complex (CC48) did not conform to a simple pattern of radial expansion from an assumed ancestor, while a second (CC70) corresponded to a simple radial expansion from ST70. Despite the large number of STs, overall nucleotide diversity was low. Of the Thai isolates, those isolated from patients with melioidosis were overrepresented in the 10 largest clones (P<0.0001). There was a significant difference in the classification index between environmental and disease isolates (P<0.001), confirming that genotypes were not distributed randomly between the two samples. MLST profiles for 158 isolates from Australia (mainly disease associated) contained a number of STs (96) similar to that seen with the Thai invasive isolates, but no ST was found in both populations. There were also differences in diversity and allele frequency distribution between the two populations. This analysis reveals strong genetic differentiation on the basis of geographical isolation and a significant differentiation on the basis of virulence potential.

    Funded by: Wellcome Trust

    Journal of clinical microbiology 2006;44;7;2553-7

  • Neurone specific regulation of dendritic spines in vivo by post synaptic density 95 protein (PSD-95).

    Vickers CA, Stephens B, Bowen J, Arbuthnott GW, Grant SG and Ingham CA

    Department of Pre-Clinical Veterinary Sciences (RDSVS), Summerhall, University of Edinburgh, Edinburgh EH9 1QH, UK.

    Post synaptic density protein 95 (PSD-95) is a postsynaptic adaptor protein coupling the NMDA receptor to downstream signalling pathways underlying plasticity. Mice carrying a targeted gene mutation of PSD-95 show altered behavioural plasticity including spatial learning, neuropathic pain, orientation preference in visual cortical cells, and cocaine sensitisation. These behavioural effects are accompanied by changes in long-term potentiation of synaptic transmission. In vitro studies of PSD-95 signalling indicate that it may play a role in regulating dendritic spine structure. Here, we show that PSD-95 mutant mice have alterations in dendritic spine density in the striatum (a 15% decrease along the dendritic length) and in the hippocampus (a localised 40% increase) without changes in dendritic branch patterns or gross neuronal architecture. These changes in spine density were accompanied by altered expression of proteins known to interact with PSD-95, including NR2B and SAP102, suggesting that PSD-95 plays a role in regulating the expression and activation of proteins found within the NMDA receptor complex. Thus, PSD-95 is an important regulator of neuronal structure as well as plasticity in vivo.

    Funded by: Wellcome Trust

    Brain research 2006;1090;1;89-98

  • Decoding the fine-scale structure of a breast cancer genome and transcriptome.

    Volik S, Raphael BJ, Huang G, Stratton MR, Bignel G, Murnane J, Brebner JH, Bajsarowicz K, Paris PL, Tao Q, Kowbel D, Lapuk A, Shagin DA, Shagina IA, Gray JW, Cheng JF, de Jong PJ, Pevzner P and Collins C

    Department of Urology, and Cancer Research Institute, University of California San Francisco Comprehensive Cancer Center, San Francisco, California 94115, USA.

    A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we extend the ESP methodology in several directions using the breast cancer cell line MCF-7. First, targeted ESP is applied to multiple amplified loci, revealing a complex process of rearrangement and co-amplification in these regions reminiscent of breakage/fusion/bridge cycles. Second, genome breakpoints identified by ESP are confirmed using a combination of DNA sequencing and PCR. Third, in vitro functional studies assign biological function to a rearranged tumor BAC clone, demonstrating that it encodes anti-apoptotic activity. Finally, ESP is extended to the transcriptome identifying four novel fusion transcripts and providing evidence that expression of fusion genes may be common in tumors. These results demonstrate the distinct advantages of ESP including: (1) the ability to detect all types of rearrangements and copy number changes; (2) straightforward integration of ESP data with the annotated genome sequence; (3) immortalization of the genome; (4) ability to generate tumor-specific reagents for in vitro and in vivo functional studies. Given these properties, ESP could play an important role in a tumor genome project.
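
    The core idea of mapping clone end sequences to find rearrangement breakpoints can be sketched in a few lines of Python. This is only an illustrative sketch, not the published ESP pipeline: it assumes each clone end has already been mapped to a reference chromosome, position and strand, and the MappedEnd type, the category names and the insert-size limits are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedEnd:
    """One clone end sequence placed on the reference genome."""
    chrom: str
    pos: int
    strand: str  # "+" or "-"


def classify_pair(a, b, min_insert=50_000, max_insert=250_000):
    """Label a pair of clone ends as concordant or as a breakpoint candidate.

    The insert-size limits are placeholder values; in practice they would
    be chosen from the size distribution of the clone library.
    """
    # Ends on different chromosomes cannot come from one contiguous
    # stretch of the reference genome.
    if a.chrom != b.chrom:
        return "interchromosomal"
    left, right = sorted((a, b), key=lambda end: end.pos)
    # A clone consistent with the reference has its ends facing each other.
    if not (left.strand == "+" and right.strand == "-"):
        return "orientation-discordant"
    # Ends too close together or too far apart suggest a size change.
    span = right.pos - left.pos
    if span < min_insert or span > max_insert:
        return "size-discordant"
    return "concordant"
```

    Pairs labelled other than "concordant" are only candidates; as the abstract notes, breakpoints still need confirmation by sequencing and PCR.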

    Funded by: NCI NIH HHS: R01 CA69044, R33 CA103068; NHLBI NIH HHS: U01 HL66728; NIEHS NIH HHS: R01 ES008427

    Genome research 2006;16;3;394-404

  • Single Nucleotide Polymorphism Analysis by Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry

    Whittaker P, Bumpstead S, Downes K, Ghori J and Deloukas P

    Cell Biology 2006;3;Chapter 48;463–470

  • Tools and resources for Sz. pombe: a report from the 2006 European Fission Yeast Meeting.

    Wixon J and Wood V

    John Wiley and Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, UK.

    Schizosaccharomyces pombe has always suffered from a relative paucity of tools and resources, particularly when compared to Saccharomyces cerevisiae. The European Fission Yeast Meeting, held in March 2006, brought together a significant proportion of the Sz. pombe research community, so it was an ideal opportunity to hold a discussion session on the future needs of those working on this model organism. While the session generated a consensus on the most essential requirements, it also demonstrated the frustrations and concerns of those working with Sz. pombe. The community was also briefed regarding the future transition of the current database (Sz. pombe GeneDB) to a fully-fledged Model Organism Database (MOD) to support the needs of both fission yeast and the broader scientific community.

    Yeast (Chichester, England) 2006;23;13;901-3

  • Phosphodiesterase genes are associated with susceptibility to major depression and antidepressant treatment response.

    Wong ML, Whelan F, Deloukas P, Whittaker P, Delgado M, Cantor RM, McCann SM and Licinio J

    Center on Pharmacogenomics, Department of Psychiatry and Behavioral Sciences, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.

    Cyclic nucleotide phosphodiesterases (PDEs) constitute a family of enzymes that degrade cAMP and cGMP. Intracellular cyclic nucleotide levels increase in response to extracellular stimulation by hormones, neurotransmitters, or growth factors and are down-regulated through hydrolysis catalyzed by PDEs, which are therefore candidate therapeutic targets. cAMP is a second messenger implicated in learning, memory, and mood, and cGMP modulates nervous system processes that are controlled by the nitric oxide (NO)/cGMP pathway. To investigate an association between genes encoding PDEs and susceptibility to major depressive disorder (MDD), we genotyped SNPs in 21 genes of this superfamily in 284 depressed Mexican Americans who participated in a prospective, double-blind, pharmacogenetic study of antidepressant response, and 331 matched controls. Polymorphisms in PDE9A and PDE11A were found to be associated with the diagnosis of MDD. Our data are also suggestive of the association between SNPs in other PDE genes and MDD. Remission on antidepressants was significantly associated with polymorphisms in PDE1A and PDE11A. Thus, we found significant associations with both the diagnosis of MDD and remission in response to antidepressants with SNPs in the PDE11A gene. We show here that PDE11A haplotype GAACC is significantly associated with MDD. We conclude that PDE11A has a role in the pathophysiology of MDD. This study identifies a potential CNS role for the PDE11 family. The hypothesis that drugs affecting PDE function, particularly cGMP-related PDEs, represent a treatment strategy for major depression should therefore be tested.

    Funded by: NCRR NIH HHS: K12RR17611, RR000865, RR017365, RR16996; NHGRI NIH HHS: HG002500; NHLBI NIH HHS: K30HL04526; NIDDK NIH HHS: DK063240; NIGMS NIH HHS: GM61394; NIMH NIH HHS: MH062777; Wellcome Trust

    Proceedings of the National Academy of Sciences of the United States of America 2006;103;41;15124-9

  • Identification of physicochemical selective pressure on protein encoding nucleotide sequences.

    Wong WS, Sainudiin R and Nielsen R

    Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA.

    Background: Statistical methods for identifying positively selected sites in protein coding regions are among the most commonly used tools in evolutionary bioinformatics. However, they have been limited by not taking the physicochemical properties of amino acids into account.

    Results: We develop a new codon-based likelihood model for detecting site-specific selection pressures acting on specific physicochemical properties. Nonsynonymous substitutions are divided into substitutions that differ with respect to the physicochemical properties of interest, and those that do not. The substitution rates of these two types of changes, relative to the synonymous substitution rate, are then described by two parameters, gamma and omega respectively. The new model allows us to perform likelihood ratio tests for positive selection acting on specific physicochemical properties of interest. The new method is first used to analyze simulated data and is shown to have good power and accuracy in detecting physicochemical selective pressure. We then re-analyze data from the class-I alleles of the human Major Histocompatibility Complex (MHC) and from the abalone sperm lysin.
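
    The likelihood ratio test mentioned in the Results can be illustrated with a short Python sketch. It assumes that the maximised log-likelihoods of a null model and a nested alternative are already available from a model-fitting program, and that a chi-square approximation with 1 or 2 degrees of freedom is acceptable; the function names are hypothetical, and the appropriate null distribution for a given selection test is not specified in the abstract.

```python
from math import erfc, exp, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return erfc(sqrt(x / 2.0))
    if df == 2:
        return exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, approximate p-value) for nested models.

    lnl_null and lnl_alt are maximised log-likelihoods; the statistic is
    twice their difference, truncated at zero to absorb optimiser noise.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)
```

    With this sketch, a small p-value would favour the alternative model, for example one in which the rate of property-altering substitutions is allowed to exceed the synonymous rate.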

    Conclusion: Our new method allows a more flexible framework to identify selection pressure on particular physicochemical properties.

    Funded by: PHS HHS: 0201037

    BMC bioinformatics 2006;7;148

  • How to get the most from fission yeast genome data: a report from the 2006 European Fission Yeast Meeting computing workshop.

    Wood V

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK.

    A fission yeast computing workshop 'How to get the most from the fission yeast genome data' was run as a satellite to the European Fission Yeast Meeting. The broad aims of the workshop were to provide fission yeast bench biologists with a set of tools and protocols to query the fission yeast genome data in specific ways, in order to extract biologically meaningful information of interest, which can be tailored to the needs of individual research projects. A description of the workshop content is provided and a selection of the tools presented is reviewed.

    Yeast (Chichester, England) 2006;23;13;905-12

  • Sequencing analysis of BRAF mutations in human cancers.

    Wooster R, Futreal AP and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    Cancers arise because of the accumulation of mutations in critical genes that alter normal programs of cell proliferation, differentiation, and death. The RAS-RAF-MEK-ERK-MAP kinase pathway mediates cellular responses to growth signals. RAS is mutated to an oncogenic form in approximately 15% of human cancer. The three RAF genes code for cytoplasmic serine/threonine kinases that are regulated by binding RAS. ARAF and c-RAF are infrequently mutated in human cancer. However, BRAF is mutated in a wide range of human cancers. Most mutations are within the kinase domain, with a single amino acid substitution (V600E) accounting for most mutations.

    Methods in enzymology 2006;407;218-24

  • Schistosoma mansoni (Platyhelminthes, Trematoda) nuclear receptors: sixteen new members and a novel subfamily.

    Wu W, Niles EG, El-Sayed N, Berriman M and LoVerde PT

    Department of Microbiology and Immunology, and Center for Microbial Pathogenesis, School of Medicine and Biomedical Science, State University of New York, Buffalo, NY 14214, USA.

    Nuclear receptors (NRs) are important transcriptional modulators in metazoans. Sixteen new NRs were identified in the Platyhelminth trematode, Schistosoma mansoni. Three were found to possess novel tandem DNA-binding domains that identify a new subfamily of NR. Two NRs are homologues of the thyroid hormone receptor that previously were thought to be restricted to chordates. This study brings the total number of identified NRs in S. mansoni to 21. Phylogenetic and comparative genomic analyses demonstrate that S. mansoni NRs share an evolutionary lineage with that of arthropods and vertebrates. Phylogenetic analysis shows that more than half of the S. mansoni nuclear receptors evolved from a second gene duplication. As the second gene duplication of NRs was thought to be specific to vertebrates, our data challenge the current theory of NR evolution.

    Funded by: NIAID NIH HHS: AI046762, U01 AI48828

    Gene 2006;366;2;303-15

  • Spread of an inactive form of caspase-12 in humans is due to recent positive selection.

    Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, Kim Y, Sabeti P, Chen Y, Stalker J, Huckle E, Burton J, Leonard S, Rogers J and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, United Kingdom.

    The human caspase-12 gene is polymorphic for the presence or absence of a stop codon, which results in the occurrence of both active (ancestral) and inactive (derived) forms of the gene in the population. It has been shown elsewhere that carriers of the inactive gene are more resistant to severe sepsis. We have now investigated whether the inactive form has spread because of neutral drift or positive selection. We determined its distribution in a worldwide sample of 52 populations and resequenced the gene in 77 individuals from the HapMap Yoruba, Han Chinese, and European populations. There is strong evidence of positive selection from low diversity, skewed allele-frequency spectra, and the predominance of a single haplotype. We suggest that the inactive form of the gene arose in Africa approximately 100-500 thousand years ago (KYA) and was initially neutral or almost neutral but that positive selection beginning approximately 60-100 KYA drove it to near fixation. We further propose that its selective advantage was sepsis resistance in populations that experienced more infectious diseases as population sizes and densities increased.

    Funded by: Wellcome Trust

    American journal of human genetics 2006;78;4;659-70

  • Male demography in East Asia: a north-south contrast in human population expansion times.

    Xue Y, Zerjal T, Bao W, Zhu S, Shu Q, Xu J, Du R, Fu S, Li P, Hurles ME, Yang H and Tyler-Smith C

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    The human population has increased greatly in size in the last 100,000 years, but the initial stimuli to growth, the times when expansion started, and their variation between different parts of the world are poorly understood. We have investigated male demography in East Asia, applying a Bayesian full-likelihood analysis to data from 988 men representing 27 populations from China, Mongolia, Korea, and Japan typed with 45 binary and 16 STR markers from the Y chromosome. According to our analysis, the northern populations examined all started to expand in number between 34 (18-68) and 22 (12-39) thousand years ago (KYA), before the last glacial maximum at 21-18 KYA, while the southern populations all started to expand between 18 (6-47) and 12 (1-45) KYA, but then grew faster. We suggest that the northern populations expanded earlier because they could exploit the abundant megafauna of the "Mammoth Steppe," while the southern populations could increase in number only when a warmer and more stable climate led to more plentiful plant resources such as tubers.

    Funded by: Wellcome Trust

    Genetics 2006;172;4;2431-9

  • Comparative genome maps of the pangolin, hedgehog, sloth, anteater and human revealed by cross-species chromosome painting: further insight into the ancestral karyotype and genome evolution of eutherian mammals.

    Yang F, Graphodatsky AS, Li T, Fu B, Dobigny G, Wang J, Perelman PL, Serdukova NA, Su W, O'Brien PC, Wang Y, Ferguson-Smith MA, Volobouev V and Nie W

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, 650223, PR China.

    To better understand the evolution of genome organization of eutherian mammals, comparative maps based on chromosome painting have been constructed between human and representative species of three eutherian orders: Xenarthra, Pholidota, and Eulipotyphla, as well as between representative species of the Carnivora and Pholidota. These maps demonstrate the conservation of such syntenic segment associations as HSA3/21, 4/8, 7/16, 12/22, 14/15 and 16/19 in Eulipotyphla, Pholidota and Xenarthra and thus further consolidate the notion that they form part of the ancestral karyotype of the eutherian mammals. Our study has revealed many potential ancestral syntenic associations of human chromosomal segments that serve to link the families as well as orders within the major superordinial eutherian clades defined by molecular markers. The HSA2/8 and 7/10 associations could be the cytogenetic signatures that unite the Xenarthrans, while the HSA1/19p could be a putative signature that links the Afrotheria and Xenarthra. But caution is required in the interpretation of apparently shared syntenic associations as detailed analyses also show examples of apparent convergent evolution that differ in breakpoints and extent of the involved segments.

    Funded by: Wellcome Trust

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2006;14;3;283-96

  • Combined multicolor-FISH and immunostaining.

    Ye CJ, Stevens JB, Liu G, Ye KJ, Yang F, Bremer SW and Heng HH

    SeeDNA Biotech Inc, Windsor, Ontario, Canada.

    The combination of multicolor-FISH and immunostaining produces a powerful visual method to analyze in situ DNA-protein interactions and dynamics. Representing one of the major technical improvements of FISH technology, this method has been used extensively in the field of chromosome and genome research, as well as in clinical studies, and serves as an important tool to bridge molecular analysis and cytological description. In this short review, the development and significance of this method will be briefly summarized using a limited number of examples to illustrate the large body of literature. In addition to descriptions of technical considerations, future applications and perspectives are also discussed, focusing specifically on the areas of genome organization, gene expression and medical research. We anticipate that this versatile method will play an important role in the study of the structure and function of the dynamic genome and for the development of potential applications for medical research.

    Cytogenetic and genome research 2006;114;3-4;227-34

  • Cross-species chromosome painting unveils cytogenetic signatures for the Eulipotyphla and evidence for the polyphyly of Insectivora.

    Ye J, Biltueva L, Huang L, Nie W, Wang J, Jing M, Su W, Vorobieva NV, Jiang X, Graphodatsky AS and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, The Chinese Academy of Sciences, Kunming, Yunnan, 650223, PR China.

    Insectivore-like animals are traditionally believed to be among the first eutherian mammals to have appeared on Earth. The modern insectivores are thus crucial for understanding the systematics and phylogeny of eutherian mammals as a whole. Here cross-species chromosome painting, with probes derived from flow-sorted chromosomes of human, was used to delimit the homologous chromosomal segments in two Soricidae species, the common shrew (Sorex araneus, 2n = 20/21), and Asiatic short-tailed shrew (Blarinella griselda, 2n = 44), and one Erinaceidae species, the shrew-hedgehog (Neotetracus sinensis, 2n = 32), and human. We report herewith the first comparative maps for the Asiatic short-tailed shrew and the shrew-hedgehog, in addition to a refined comparative map for the common shrew. In total, the 22 human autosomal paints detected 40, 51 and 58 evolutionarily conserved segments in the genomes of common shrew, Asiatic short-tailed shrew, and shrew-hedgehog, respectively, demonstrating that the common shrew has retained a conserved genome organization while the Asiatic short-tailed shrew and shrew-hedgehog have relatively rearranged genomes. In addition to confirming the existence of such ancestral human segmental combinations as HSA 3/21, 12/22, 14/15 and 7/16 that are shared by most eutherian mammals, our study reveals a shared human segmental combination, HSA 4/20, that could phylogenetically unite the Eulipotyphlan (i.e., the core insectivores) species. Our results provide cytogenetic evidence for the polyphyly of the order Insectivora and additional data for the eventual reconstruction of the ancestral eutherian karyotype.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2006;14;2;151-9

  • Epigenetic disruption of two proapoptotic genes MAPK10/JNK3 and PTPN13/FAP-1 in multiple lymphomas and carcinomas through hypermethylation of a common bidirectional promoter.

    Ying J, Li H, Cui Y, Wong AH, Langford C and Tao Q

    Leukemia : official journal of the Leukemia Society of America, Leukemia Research Fund, U.K 2006;20;6;1173-5

  • Functional epigenetics identifies a protocadherin PCDH10 as a candidate tumor suppressor for nasopharyngeal, esophageal and multiple other carcinomas with frequent methylation.

    Ying J, Li H, Seng TJ, Langford C, Srivastava G, Tsao SW, Putti T, Murray P, Chan AT and Tao Q

    Cancer Epigenetics Laboratory, Sir YK Pao Cancer Center, Department of Clinical Oncology, The Chinese University of Hong Kong, Hong Kong, and Department of Pathology, National University Hospital, Singapore.

    Protocadherins constitute the largest subgroup in the cadherin superfamily of cell adhesion molecules. Their major functions are poorly understood, although some are implicated in nervous system development. As tumor-specific promoter methylation is a marker for tumor suppressor genes (TSG), we searched for epigenetically inactivated TSGs using methylation-subtraction combined with pharmacologic demethylation, and identified the PCDH10 CpG island as a methylated sequence in nasopharyngeal carcinoma (NPC). PCDH10 is broadly expressed in all normal adult and fetal tissues including the epithelia, though at different levels. It resides at 4q28.3, a region with hemizygous deletion detected by array-CGH in NPC cell lines; however, PCDH10 itself is not located within the deletion. In contrast, its transcriptional silencing and promoter methylation were frequently detected in multiple carcinoma cell lines in a biallelic way, including 12/12 nasopharyngeal, 13/16 esophageal, 3/4 breast, 5/5 colorectal, 3/4 cervical, 2/5 lung and 2/8 hepatocellular carcinoma cell lines, but not in any immortalized normal epithelial cell line. Aberrant methylation was further frequently detected in multiple primary carcinomas (82% in NPC, 42-51% for other carcinomas), but not normal tissues. The transcriptional silencing of PCDH10 could be reversed by pharmacologic demethylation with 5-aza-2'-deoxycytidine or genetic demethylation with double knockout of DNMT1 and DNMT3B, indicating a direct epigenetic mechanism. Ectopic expression of PCDH10 strongly suppressed tumor cell growth, migration, invasion and colony formation. Although the epigenetic and genetic disruptions of several classical cadherins as TSGs have been well documented in tumors, this is the first report that a widely expressed protocadherin can also function as a TSG that is frequently inactivated epigenetically in multiple carcinomas.

    Oncogene 2006;25;7;1070-80

  • The genome of Rhizobium leguminosarum has recognizable core and accessory components.

    Young JP, Crossman LC, Johnston AW, Thomson NR, Ghazoui ZF, Hull KH, Wexler M, Curson AR, Todd JD, Poole PS, Mauchline TH, East AK, Quail MA, Churcher C, Arrowsmith C, Cherevach I, Chillingworth T, Clarke K, Cronin A, Davis P, Fraser A, Hance Z, Hauser H, Jagels K, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Sanders M, Simmonds M, Whitehead S and Parkhill J

    Department of Biology, University of York, York, UK.

    Background: Rhizobium leguminosarum is an alpha-proteobacterial N2-fixing symbiont of legumes that has been the subject of more than a thousand publications. Genes for the symbiotic interaction with plants are well studied, but the adaptations that allow survival and growth in the soil environment are poorly understood. We have sequenced the genome of R. leguminosarum biovar viciae strain 3841.

    Results: The 7.75 Mb genome comprises a circular chromosome and six circular plasmids, with 61% G+C overall. All three rRNA operons and 52 tRNA genes are on the chromosome; essential protein-encoding genes are largely chromosomal, but most functional classes occur on plasmids as well. Of the 7,263 protein-encoding genes, 2,056 had orthologs in each of three related genomes (Agrobacterium tumefaciens, Sinorhizobium meliloti, and Mesorhizobium loti), and these genes were over-represented in the chromosome and had above average G+C. Most supported the rRNA-based phylogeny, confirming A. tumefaciens to be the closest among these relatives, but 347 genes were incompatible with this phylogeny; these were scattered throughout the genome but were over-represented on the plasmids. An unexpectedly large number of genes were shared by all three rhizobia but were missing from A. tumefaciens.

    Conclusion: Overall, the genome can be considered to have two main components: a 'core', which is higher in G+C, is mostly chromosomal, is shared with related organisms, and has a consistent phylogeny; and an 'accessory' component, which is sporadic in distribution, lower in G+C, and located on the plasmids and chromosomal islands. The accessory genome has a different nucleotide composition from the core despite a long history of coexistence.

    Funded by: Wellcome Trust

    Genome biology 2006;7;4;R34

  • A deficiency in the region homologous to human 17q21.33-q23.2 causes heart defects in mice.

    Yu YE, Morishima M, Pao A, Wang DY, Wen XY, Baldini A and Bradley A

    Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.

    Several constitutional chromosomal rearrangements occur on human chromosome 17. Patients who carry constitutional deletions of 17q21.3-q24 exhibit distinct phenotypic features. Within the deletion interval, there is a genomic segment that is bounded by the myeloperoxidase and homeobox B1 genes. This genomic segment is syntenically conserved on mouse chromosome 11 and is bounded by the mouse homologs of the same genes (Mpo and HoxB1). To attain functional information about this syntenic segment in mice, we have generated a 6.9-Mb deletion [Df(11)18], the reciprocal duplication [Dp(11)18] between Mpo and Chad (the chondroadherin gene), and a 1.8-Mb deletion between Chad and HoxB1. Phenotypic analyses of the mutant mouse lines showed that the Dp(11)18/Dp(11)18 genotype was responsible for embryonic or adolescent lethality, whereas the Df(11)18/+ genotype was responsible for heart defects. The cardiovascular phenotype of the Df(11)18/+ fetuses was similar to those of patients who carried the deletions of 17q21.3-q24. Since heart defects were not detectable in Df(11)18/Dp(11)18 mice, the haplo-insufficiency of one or more genes located between Mpo and Chad may be responsible for the abnormal cardiovascular phenotype. Therefore, we have identified a new dosage-sensitive genomic region that may be critical for normal heart development in both mice and humans.

    Funded by: Wellcome Trust

    Genetics 2006;173;1;297-307

  • Comparative genomic analysis links karyotypic evolution with genomic evolution in the Indian muntjac (Muntiacus muntjak vaginalis).

    Zhou Q, Huang L, Zhang J, Zhao X, Zhang Q, Song F, Chi J, Yang F and Wang W

    CAS-Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, 650223, People's Republic of China.

    The karyotype of the Indian muntjac (Muntiacus muntjak vaginalis) has been greatly shaped by chromosomal fusion, which has given it the lowest diploid number among extant known mammals. We present here comparative results based on draft sequences of 37 bacterial artificial chromosome (BAC) clones selected by chromosome painting for this special muntjac species. Sequence comparison of these BAC clones uncovered sequence syntenic relationships between the muntjac genome and those of other mammals. We found that the muntjac genome has peculiar features with respect to intron size and evolutionary rates of genes. Inspection of more than 80 pairs of orthologous introns from 15 genes reveals a significant reduction in intron size in the Indian muntjac compared to that of human, mouse, and dog. Evolutionary analysis using 19 genes indicates that the muntjac genes have evolved rapidly compared to other mammals. In addition, we identified and characterized sequence composition of the first BAC clone containing a chromosomal fusion site. Our results shed new light on the genome architecture of the Indian muntjac and suggest that chromosomal rearrangements have been accompanied by other salient genomic changes.

    Chromosoma 2006;115;6;427-36

  • DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage.

    Zody MC, Garber M, Adams DJ, Sharpe T, Harrow J, Lupski JR, Nicholson C, Searle SM, Wilming L, Young SK, Abouelleil A, Allen NR, Bi W, Bloom T, Borowsky ML, Bugalter BE, Butler J, Chang JL, Chen CK, Cook A, Corum B, Cuomo CA, de Jong PJ, DeCaprio D, Dewar K, FitzGerald M, Gilbert J, Gibson R, Gnerre S, Goldstein S, Grafham DV, Grocock R, Hafez N, Hagopian DS, Hart E, Norman CH, Humphray S, Jaffe DB, Jones M, Kamal M, Khodiyar VK, LaButti K, Laird G, Lehoczky J, Liu X, Lokyitsang T, Loveland J, Lui A, Macdonald P, Major JE, Matthews L, Mauceli E, McCarroll SA, Mihalev AH, Mudge J, Nguyen C, Nicol R, O'Leary SB, Osoegawa K, Schwartz DC, Shaw-Smith C, Stankiewicz P, Steward C, Swarbreck D, Venkataraman V, Whittaker CA, Yang X, Zimmer AR, Bradley A, Hubbard T, Birren BW, Rogers J, Lander ES and Nusbaum C

    Broad Institute of MIT and Harvard, 7 Cambridge Center, Massachusetts 02142, USA.

    Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.

    Funded by: Medical Research Council: G0000107; Wellcome Trust: 077187

    Nature 2006;440;7087;1045-9

  • Negative and positive regulation of gene expression by mouse histone deacetylase 1.

    Zupkovitz G, Tischler J, Posch M, Sadzak I, Ramsauer K, Egger G, Grausenburger R, Schweifer N, Chiocca S, Decker T and Seiser C

    Max F. Perutz Laboratories, Department of Medical Biochemistry, Medical University of Vienna, A-1030 Vienna, Austria.

    Histone deacetylases (HDACs) catalyze the removal of acetyl groups from core histones. Because of their capacity to induce local condensation of chromatin, HDACs are generally considered repressors of transcription. In this report, we analyzed the role of the class I histone deacetylase HDAC1 as a transcriptional regulator by comparing the expression profiles of wild-type and HDAC1-deficient embryonic stem cells. A specific subset of mouse genes (7%) was deregulated in the absence of HDAC1. We identified several putative tumor suppressors (JunB, Prss11, and Plagl1) and imprinted genes (Igf2, H19, and p57) as novel HDAC1 targets. The majority of HDAC1 target genes showed reduced expression accompanied by recruitment of HDAC1 and local reduction in histone acetylation at regulatory regions. At some target genes, the related deacetylase HDAC2 partially masks the loss of HDAC1. A second group of genes was found to be downregulated in HDAC1-deficient cells, predominantly by additional recruitment of HDAC2 in the absence of HDAC1. Finally, a small set of genes (Gja1, Irf1, and Gbp2) was found to require HDAC activity and recruitment of HDAC1 for their transcriptional activation. Our study reveals a regulatory cross talk between HDAC1 and HDAC2 and a novel function for HDAC1 as a transcriptional coactivator.

    Molecular and cellular biology 2006;26;21;7913-28

  • Proceedings of the 2006 European Fission Yeast Meeting. March 16-18, 2006. Hinxton, United Kingdom.

    No authors listed

    Yeast (Chichester, England) 2006;23;13;899-1043

