Sanger Institute - Publications 2005

Number of papers published in 2005: 240

  • WebACT--an online companion for the Artemis Comparison Tool.

    Abbott JC, Aanensen DM, Rutherford K, Butcher S and Spratt BG

    Centre for Bioinformatics, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK. admin@webact.org

    WebACT is an online resource which enables the rapid provision of simultaneous BLAST comparisons between up to five genomic sequences in a format amenable for visualization with the well-known Artemis Comparison Tool (ACT). Comparisons can be generated on-the-fly using sequences directly retrieved via EMBL database queries, or by entering or uploading user sequences. Furthermore, pre-computed comparisons are available between all publicly available, completed prokaryotic genomes and plasmids currently contained within the Genome Reviews database (372 sequences, representing 175 different species). The system is designed to minimize the volume of downloaded data and maximize performance. Genome sequences, annotation and pre-computed comparisons are stored in a relational database allowing flexible querying based on user-defined sequence regions, from whole genome to a defined region flanking a specified gene. Comparison and sequence files, whether computed online or retrieved from the database of pre-computed genome comparisons, can be viewed online using ACT and are available for download. AVAILABILITY: Freely accessible at http://www.webact.org. SUPPLEMENTARY INFORMATION: User guide and worked examples are available at http://www.webact.org/WebACT/docs.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2005;21;18;3665-6

  • A genome-wide, end-sequenced 129Sv BAC library resource for targeting vector construction.

    Adams DJ, Quail MA, Cox T, van der Weyden L, Gorick BD, Su Q, Chan WI, Davies R, Bonfield JK, Law F, Humphray S, Plumb B, Liu P, Rogers J and Bradley A

    The Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK.

    The majority of gene-targeting experiments in mice are performed in 129Sv-derived embryonic stem (ES) cell lines, which are generally considered to be more reliable at colonizing the germ line than ES cells derived from other strains. Gene targeting is reliant on homologous recombination of a targeting vector with the host ES cell genome. The efficiency of recombination is affected by many factors, including the isogenicity (H. te Riele et al., 1992, Proc. Natl. Acad. Sci. USA 89, 5128-5132) and the length of homologous sequence of the targeting vector and the location of the target locus. Here we describe the double-end sequencing and mapping of 84,507 bacterial artificial chromosomes (BACs) generated from AB2.2 ES cell DNA (129S7/SvEvBrd-Hprtb-m2). We have aligned these BACs against the mouse genome and displayed them on the Ensembl genome browser, DAS: 129S7/AB2.2. This library has an average insert size of 110.68 kb and average depth of genome coverage of 3.63- and 1.24-fold across the autosomes and sex chromosomes, respectively. Over 97% of the mouse genome and 99.1% of Ensembl genes are covered by clones from this library. This publicly available BAC resource can be used for the rapid construction of targeting vectors via recombineering. Furthermore, we show that targeting vectors containing DNA recombineered from this BAC library can be used to target genes efficiently in several 129-derived ES cell lines.

    Funded by: Wellcome Trust

    Genomics 2005;86;6;753-8
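
    The library figures quoted in the abstract above can be sanity-checked with simple arithmetic. The sketch below multiplies the stated clone count by the stated average insert size and divides by a genome size. The genome size (about 2.6 Gb) and all function names are assumptions made here for illustration; they do not come from the paper. The result is a nominal whole-genome figure and is not expected to match the per-chromosome-class depths reported by the authors exactly.

```python
# Rough clone-coverage arithmetic for a BAC library.
# Clone count and mean insert size come from the abstract above;
# ASSUMED_GENOME_KB is an illustrative assumption, not a value from the paper.

N_CLONES = 84_507              # end-sequenced BACs (from the abstract)
MEAN_INSERT_KB = 110.68        # average insert size in kb (from the abstract)
ASSUMED_GENOME_KB = 2_600_000  # ~2.6 Gb mouse genome; assumption for illustration


def total_cloned_kb(n_clones: int, mean_insert_kb: float) -> float:
    """Total DNA held in the library, in kb."""
    return n_clones * mean_insert_kb


def fold_coverage(n_clones: int, mean_insert_kb: float, genome_kb: float) -> float:
    """Nominal depth of coverage: cloned bases divided by genome size."""
    return total_cloned_kb(n_clones, mean_insert_kb) / genome_kb


if __name__ == "__main__":
    cloned_gb = total_cloned_kb(N_CLONES, MEAN_INSERT_KB) / 1e6
    depth = fold_coverage(N_CLONES, MEAN_INSERT_KB, ASSUMED_GENOME_KB)
    print(f"cloned DNA: {cloned_gb:.2f} Gb")
    print(f"nominal coverage: {depth:.2f}-fold")
```

    Under the assumed genome size, the nominal depth falls close to the autosomal figure quoted in the abstract; changing ASSUMED_GENOME_KB shifts the result proportionally.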

  • BRCTx is a novel, highly conserved RAD18-interacting protein.

    Adams DJ, van der Weyden L, Gergely FV, Arends MJ, Ng BL, Tannahill D, Kanaar R, Markus A, Morris BJ and Bradley A

    The Wellcome Trust Sanger Institute, Hinxton, Cambs CB10 1SA, United Kingdom.

    The BRCT domain is a highly conserved module found in many proteins that participate in DNA damage checkpoint regulation, DNA repair, and cell cycle control. Here we describe the cloning, characterization, and targeted mutagenesis of Brctx, a novel gene with a BRCT motif. Brctx was found to be expressed ubiquitously in adult tissues and during development, with the highest levels found in testis. Brctx-deficient mice develop normally, show no pathological abnormalities, and are fertile. BRCTx binds to the C terminus of hRAD18 in yeast two-hybrid and immunoprecipitation assays and colocalizes with this protein in the nucleus. Despite this, Brctx-deficient murine embryonic fibroblasts (MEFs) do not show overt sensitivity to DNA-damaging agents. MEFs from Brctx-deficient embryos grow at a similar rate to wild-type MEFs, and the CD4/CD8 expression and cell cycle parameters of thymocytes from wild-type and Brctx knockout animals are indistinguishable. Intriguingly, the BRCT domain of BRCTx is responsible for mediating its localization to the nucleus and centrosome in interphase cells. We conclude that, although highly conserved, Brctx is not essential for the above-mentioned processes and may be redundant.

    Molecular and cellular biology 2005;25;2;779-88

  • A case for a Glossina genome project.

    Aksoy S, Berriman M, Hall N, Hattori M, Hide W and Lehane MJ

    Yale University School of Medicine, 60 College Street, 606 LEPH, New Haven, CT 06520, USA. serap.aksoy@yale.edu

    Given the medical and agricultural significance of Glossina, knowledge of the genomic aspects of the vector and vector-pathogen interactions is a high priority. In preparation for a full genome sequence initiative, an extensive set of expressed sequence tags (ESTs) has been generated from tissue-specific normalized libraries. In addition, bacterial artificial chromosome (BAC) libraries are being constructed, and information on the genome structure and size from different species has been obtained. An international consortium is now in place to further efforts to lead to a full genome project.

    Trends in parasitology 2005;21;3;107-11

  • Mutations of the catalytic subunit of RAB3GAP cause Warburg Micro syndrome.

    Aligianis IA, Johnson CA, Gissen P, Chen D, Hampshire D, Hoffmann K, Maina EN, Morgan NV, Tee L, Morton J, Ainsworth JR, Horn D, Rosser E, Cole TR, Stolte-Dijkstra I, Fieggen K, Clayton-Smith J, Mégarbané A, Shield JP, Newbury-Ecob R, Dobyns WB, Graham JM, Kjaer KW, Warburg M, Bond J, Trembath RC, Harris LW, Takai Y, Mundlos S, Tannahill D, Woods CG and Maher ER

    Section of Medical and Molecular Genetics, University of Birmingham, Birmingham, B15 2TT, UK.

    Warburg Micro syndrome (WARBM1) is a severe autosomal recessive disorder characterized by developmental abnormalities of the eye and central nervous system and by microgenitalia. We identified homozygous inactivating mutations in RAB3GAP, encoding RAB3 GTPase activating protein, a key regulator of the Rab3 pathway implicated in exocytic release of neurotransmitters and hormones, in 12 families with Micro syndrome. We hypothesize that the underlying pathogenesis of Micro syndrome is a failure of exocytic release of ocular and neurodevelopmental trophic factors.

    Nature genetics 2005;37;3;221-3

  • Identification of core and variable components of the Salmonella enterica subspecies I genome by microarray.

    Anjum MF, Marooney C, Fookes M, Baker S, Dougan G, Ivens A and Woodward MJ

    Department of Food and Environmental Safety, Veterinary Laboratories Agency-Weybridge, New Haw, Addlestone, Surrey KT15 3NB, United Kingdom. m.anjum@vla.defra.gsi.gov.uk

    We have performed microarray hybridization studies on 40 clinical isolates from 12 common serovars within Salmonella enterica subspecies I to identify the conserved chromosomal gene pool. We were able to separate the core invariant portion of the genome by a novel mathematical approach using a decision tree based on genes ranked by increasing variance. All genes within the core component were confirmed using available sequence and microarray information for S. enterica subspecies I strains. The majority of genes within the core component had conserved homologues in Escherichia coli K-12 strain MG1655. However, many genes present in the conserved set which were absent or highly divergent in K-12 had close homologues in pathogenic bacteria such as Shigella flexneri and Pseudomonas aeruginosa. Genes within previously established virulence determinants such as SPI1 to SPI5 were conserved. In addition, several genes within SPI6, all of SPI9, and three fimbrial operons (fim, bcf, and stb) were conserved within all S. enterica strains included in this study. Although many phage and insertion sequence elements were missing from the core component, approximately half the pseudogenes present in S. enterica serovar Typhi were conserved. Furthermore, approximately half the genes conserved in the core set encoded hypothetical proteins. Separation of the core and variant gene sets within S. enterica subspecies I has offered fundamental biological insight into the genetic basis of phenotypic similarity and diversity across S. enterica subspecies I and shown how the core genome of these pathogens differs from the closely related E. coli K-12 laboratory strain.

    Infection and immunity 2005;73;12;7894-905

  • The Vertebrate Genome Annotation (Vega) database.

    Ashurst JL, Chen CK, Gilbert JG, Jekosch K, Keenan S, Meidl P, Searle SM, Stalker J, Storey R, Trevanion S, Wilming L and Hubbard T

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK. jla1@sanger.ac.uk

    The Vertebrate Genome Annotation (Vega) database (http://vega.sanger.ac.uk) has been designed to be a community resource for browsing manual annotation of finished sequences from a variety of vertebrate genomes. Its core database is based on an Ensembl-style schema, extended to incorporate curation-specific metadata. In collaboration with the genome sequencing centres, Vega attempts to present consistent high-quality annotation of the published human chromosome sequences. In addition, it is also possible to view various finished regions from other vertebrates, including mouse and zebrafish. Vega displays only manually annotated gene structures built using transcriptional evidence, which can be examined in the browser. Attempts have been made to standardize the annotation procedure across each vertebrate genome, which should aid comparative analysis of orthologues across the different finished regions.

    Nucleic acids research 2005;33;Database issue;D459-65

  • Integration of tools and resources for display and analysis of genomic data for protozoan parasites.

    Aslett M, Mooney P, Adlem E, Berriman M, Berry A, Hertz-Fowler C, Ivens AC, Kerhornou A, Parkhill J, Peacock CS, Wood V, Rajandream MA, Barrell B and Tivey A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. maa@sanger.ac.uk

    Centralisation of tools for analysis of genomic data is paramount in ensuring that research is always carried out on the latest available data. As such, World Wide Web sites providing a range of online analyses and displays of data can play a crucial role in guaranteeing consistency of in silico work. In this respect, the protozoan parasite research community is served by several resources, either focussing on data and tools for one species or taking a broader view and providing tools for analysis of data from many species, thereby facilitating comparative studies. In this paper, we give a broad overview of the online resources available. We then focus on the GeneDB project, detailing the features and tools currently available through it. Finally, we discuss data curation and its importance in keeping genomic data 'relevant' to the research community.

    International journal for parasitology 2005;35;5;481-93

  • A transcriptional pathway for cell separation in fission yeast.

    Bähler J

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK. jurg@sanger.ac.uk

    Numerous genes are transcriptionally activated and repressed in a cell cycle-dependent manner. We have recently reported the global gene expression program during the cell cycle in fission yeast (S. pombe). Among the periodically expressed fission yeast genes, a large proportion shows peak transcript levels during mitosis. Many of these genes are regulated by a transcriptional cascade involving two transcription factors: the forkhead protein Sep1p which activates the zinc finger protein Ace2p. A main function of the Sep1p-Ace2p transcriptional pathway is to trigger the separation of daughter cells after cytokinesis. Absence of Sep1p, Ace2p, or some of their target genes leads to a hyphal-like growth pattern with chains of connected cells. Yeast cells probably evolved from filamentous fungi. It is possible that the Sep1p-Ace2p pathway contributed to the emergence of proliferation through single cells, and that this regulatory pathway can still be modulated to adjust growth modes depending on environmental conditions. Here, various properties of the Sep1p-Ace2p transcriptional pathway and mechanisms for cell separation are discussed in the context of recent findings.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Cell cycle (Georgetown, Tex.) 2005;4;1;39-41

  • A logical circuit for the regulation of fission yeast growth modes.

    Bähler J and Svetina S

    The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK. jurg@sanger.ac.uk

    Growth of fission yeast at the ends of its cylindrical cells switches from a monopolar to a bipolar mode, before it ceases during mitosis and cell division. Here we assume that these growth modes correspond to three stable states of an underlying regulatory circuit, which is a relatively simple and to a large degree autonomous subsystem of an otherwise complex cellular control system. We develop a switch-like logical circuit based on three elements defined as binary variables. Effects of circuit variables on each other are expressed in terms of logical operations. We analyse this circuit for its behavior ("phenotypes") after removing single or multiple operations ("mutants"). Known fission yeast polarity mutants such as those defective in the switch to bipolar growth can be classified based on these predicted "phenotypes". Differences in growth patterns between daughter cells in different bipolar growth mutants are also predicted by the circuit model. The model presented here should provide a useful framework to guide future experiments into mechanisms of cellular polarity. This paper illustrates the usefulness of simple logical circuits to describe and dissect features of complex regulatory processes such as the fission yeast growth patterns in both wild type and mutant cells.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Journal of theoretical biology 2005;237;2;210-8
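
    The general idea in the abstract above (three binary elements, logical update rules, stable states, and "mutants" made by removing an operation) can be illustrated with a toy example. The rules below are hypothetical and chosen only because they give three stable states; they are not the circuit published in the paper, and the names x, y, z, WILD_TYPE and MUTANT are placeholders introduced here.

```python
from itertools import product

# Hypothetical three-element circuit: each element is ON only when both
# others are OFF (mutual inhibition). These rules are illustrative and are
# NOT the circuit published in the paper.
WILD_TYPE = {
    "x": lambda s: int(not s["y"] and not s["z"]),
    "y": lambda s: int(not s["x"] and not s["z"]),
    "z": lambda s: int(not s["x"] and not s["y"]),
}

# Toy "mutant": the inhibition of x by z has been removed.
MUTANT = dict(WILD_TYPE, x=lambda s: int(not s["y"]))


def step(rules, state):
    """Apply every rule once, synchronously, to the current state."""
    return {name: rule(state) for name, rule in rules.items()}


def stable_states(rules):
    """Return all states (tuples in x, y, z order) that map to themselves."""
    names = sorted(rules)
    found = []
    for values in product((0, 1), repeat=len(names)):
        state = dict(zip(names, values))
        if step(rules, state) == state:
            found.append(values)
    return found


if __name__ == "__main__":
    print("wild type:", stable_states(WILD_TYPE))
    print("mutant:   ", stable_states(MUTANT))
```

    In this toy version the intact rules have three stable states and the single-operation mutant has two, which mirrors, in a very simplified way, how removing an operation can change the predicted "phenotype".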

  • Detection of Vi-negative Salmonella enterica serovar Typhi in the peripheral blood of patients with typhoid fever in the Faisalabad region of Pakistan.

    Baker S, Sarwar Y, Aziz H, Haque A, Ali A, Dougan G, Wain J and Haque A

    The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK. sgb@sanger.ac.uk

    The synthesis and transportation proteins of the Vi capsular polysaccharide of Salmonella enterica serovar Typhi (serovar Typhi) are encoded by the viaB operon, which resides on a 134-kb pathogenicity island known as SPI-7. In recent years, Vi-negative strains of serovar Typhi have been reported in regions where typhoid fever is endemic. However, because Vi negativity can arise during in vitro passage, the clinical significance of Vi-negative serovar Typhi is not clear. To investigate the loss of Vi expression at the genetic level, 60 stored strains of serovar Typhi from the Faisalabad region of Pakistan were analyzed by PCR for the presence of SPI-7 and two genes essential for Vi production: tviA and tviB. Nine of the sixty strains analyzed (15%) tested negative for both tviA and tviB; only two of these strains lacked SPI-7. In order to investigate whether this phenomenon occurred in vivo, blood samples from patients with the clinical symptoms of typhoid fever were also investigated. Of 48 blood samples tested, 42 tested positive by fliC PCR for serovar Typhi; 4 of these were negative for tviA and tviB. Three of these samples tested positive for SPI-7. These results demonstrate that viaB-negative, SPI-7-positive serovar Typhi is naturally occurring and can be detected by PCR in the peripheral blood of typhoid patients in this region. The method described here can be used to monitor the incidence of Vi-negative serovar Typhi in regions where the Vi vaccine is used.

    Journal of clinical microbiology 2005;43;9;4418-25

  • Molecular cytogenetic analyses of breakpoints in apparently balanced reciprocal translocations carried by phenotypically normal individuals.

    Baptista J, Prigmore E, Gribble SM, Jacobs PA, Carter NP and Crolla JA

    Wessex Regional Genetics Laboratory, Salisbury District Hospital, Salisbury, Wiltshire, UK. Julia.baptista@salisbury.nhs.uk

    To test the hypothesis that translocation breakpoints in normal individuals are simple and do not disrupt genes, we characterised the breakpoints in 13 phenotypically normal individuals incidentally ascertained with an apparently balanced reciprocal translocation. Cases were karyotyped, and the breakpoints were refined by fluorescence in situ hybridisation until breakpoint-spanning clones were identified. 1 Mb array-CGH was performed as a whole genome analysis tool to detect any imbalances in chromatin not directly involved in the breakpoints. Breakpoint-associated imbalances were not found in any of the patients analysed in this study. However, breakpoints which disrupted known genes were identified in two patients, with RYR2 disrupted in one patient and COL13A1 in the other. In a further eight patients, Ensembl mapping data suggested that a gene might be disrupted by a breakpoint. In one further patient, the translocation was shown to be nonreciprocal. This study shows that apparently balanced reciprocal translocations in phenotypically normal patients do not have imbalances at the breakpoints, in contrast to phenotypically abnormal patients where the translocation breakpoints are often associated with cryptic imbalances. However, both phenotypically normal and phenotypically abnormal individuals may have genes disrupted, and therefore inactivated, by one of the breakpoints. The significance of these disruptions remains to be determined.

    Funded by: Wellcome Trust

    European journal of human genetics : EJHG 2005;13;11;1205-12

  • Complex disease: pleiotropic gene effects in obesity and type 2 diabetes.

    Barroso I

    European journal of human genetics : EJHG 2005;13;12;1243-4

  • Genetics of Type 2 diabetes.

    Barroso I

    Metabolic Disease Group, The Wellcome Trust Sanger Institute, Cambridge, UK. ib1@sanger.ac.uk

    Type 2 diabetes (T2D) has become a health-care problem worldwide, with the rise in disease prevalence being all the more worrying as it not only affects the developed world but also developing nations with fewer resources to cope with yet another major disease burden. Furthermore, the problem is no longer restricted to the ageing population, as young adults and children are also being diagnosed with T2D. In recent years, there has been a surge in the number of genetic studies of T2D in attempts to identify some of the underlying risk factors. In this review, I highlight the main genes known to cause uncommon monogenic forms of diabetes (e.g. maturity-onset diabetes of the young--MODY--and insulin resistance syndromes), as well as describe some of the main approaches used to identify genes involved in the more common forms of T2D that result from the interaction between environmental risk factors and predisposing genotypes. Linkage and candidate gene studies have been highly successful in the identification of genes that cause the monogenic variants of diabetes and, although progress in the more common forms of T2D has been slow, a number of genes have now been reproducibly associated with T2D risk in multiple studies. These are discussed, as well as the main implications that the diabetes gene discoveries will have in diabetes treatment and prevention.

    Diabetic medicine : a journal of the British Diabetic Association 2005;22;5;517-35

  • What the genome sequence is revealing about trypanosome antigenic variation.

    Barry JD, Marcello L, Morrison LJ, Read AF, Lythgoe K, Jones N, Carrington M, Blandin G, Böhme U, Caler E, Hertz-Fowler C, Renauld H, El-Sayed N and Berriman M

    University of Glasgow, Glasgow G12 8QQ, Scotland, UK. j.d.barry@bio.gla.ac.uk

    African trypanosomes evade humoral immunity through antigenic variation, whereby they switch expression of the gene encoding their VSG (variant surface glycoprotein) coat. Switching proceeds by duplication of silent VSG genes into a transcriptionally active locus. The genome project has revealed that most of the silent archive consists of hundreds of subtelomeric VSG tandem arrays, and that most of these are not functional genes. Precedent suggests that they can contribute combinatorially to the formation of expressed, functional genes through segmental gene conversion. These findings from the genome project have major implications for evolution of the VSG archive and for transmission of the parasite in the field.

    Biochemical Society transactions 2005;33;Pt 5;986-9

  • The G5 domain: a potential N-acetylglucosamine recognition domain involved in biofilm formation.

    Bateman A, Holden MT and Yeats C

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. agb@sanger.ac.uk

    Summary: Biofilms are complex microbial communities found at surfaces that are often associated with extracellular polysaccharides. Biofilm formation is a complex process that has only recently begun to be understood at the molecular level. We have identified a novel domain that we call the G5 domain (named after its conserved glycine residues), which is found in a variety of enzymes such as Streptococcal IgA peptidases and various glycosyl hydrolases in bacteria. The G5 domain is found in the Accumulation Associated Protein (AAP), which is an important component in biofilm formation in Staphylococcus aureus. A common feature of the proteins containing G5 domains is N-acetylglucosamine binding, and we attribute this function to the G5 domain.

    Bioinformatics (Oxford, England) 2005;21;8;1301-3

  • Endoplasmic reticulum stress induction of the Grp78/BiP promoter: activating mechanisms mediated by YY1 and its interactive chromatin modifiers.

    Baumeister P, Luo S, Skarnes WC, Sui G, Seto E, Shi Y and Lee AS

    Department of Biochemistry and Molecular Biology, USC/Norris Comprehensive Cancer Center, Keck School of Medicine, 1441 Eastlake Ave., Room 5308, MC-9176, Los Angeles, CA 90089-9176, USA.

    The unfolded protein response (UPR) is an evolutionarily conserved mechanism whereby cells respond to stress conditions that target the endoplasmic reticulum (ER). The transcriptional activation of the promoter of GRP78/BiP, a prosurvival ER chaperone, has been used extensively as an indicator of the onset of the UPR. YY1, a constitutively expressed multifunctional transcription factor, activates the Grp78 promoter only under ER stress conditions. Previously, in vivo footprinting analysis revealed that the YY1 binding site of the ER stress response element of the Grp78 promoter exhibits ER stress-induced changes in occupancy. Toward understanding the underlying mechanisms of these unique phenomena, we performed chromatin immunoprecipitation analyses, revealing that YY1 occupies the Grp78 promoter only upon ER stress and that this occupancy is mediated in part by the nuclear form of ATF6. We show that YY1 is an essential coactivator of ATF6 and uncover their specific interactive domains. Using small interfering RNA against YY1 and insertional mutation of the gene encoding ATF6alpha, we provide direct evidence that YY1 and ATF6 are required for optimal stress induction of Grp78. We also discovered enhancement of the ER stress induction of the Grp78 promoter through the interaction of YY1 with the arginine methyltransferase PRMT1 and evidence of its action through methylation of the arginine 3 residue on histone H4. Furthermore, we detected ER stress-induced binding of the histone acetyltransferase p300 to the Grp78 promoter and histone H4 acetylation. A model for the ER stress-mediated transcription factor binding and chromatin modifications at the Grp78 promoter leading to its activation is proposed.

    Funded by: NCI NIH HHS: CA 27607; NIGMS NIH HHS: GM 53874, GM 58486, GM 64850

    Molecular and cellular biology 2005;25;11;4529-40

  • Acquired mutation of the tyrosine kinase JAK2 in human myeloproliferative disorders.

    Baxter EJ, Scott LM, Campbell PJ, East C, Fourouclas N, Swanton S, Vassiliou GS, Bench AJ, Boyd EM, Curtin N, Scott MA, Erber WN, Green AR and Cancer Genome Project

    Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.

    Background: Human myeloproliferative disorders form a range of clonal haematological malignant diseases, the main members of which are polycythaemia vera, essential thrombocythaemia, and idiopathic myelofibrosis. The molecular pathogenesis of these disorders is unknown, but tyrosine kinases have been implicated in several related disorders. We investigated the role of the cytoplasmic tyrosine kinase JAK2 in patients with a myeloproliferative disorder.

    Methods: We obtained DNA samples from patients with polycythaemia vera, essential thrombocythaemia, or idiopathic myelofibrosis. The coding exons of JAK2 were bidirectionally sequenced from peripheral-blood granulocytes, T cells, or both. Allele-specific PCR, molecular cytogenetic studies, microsatellite PCR, Affymetrix single nucleotide polymorphism array analyses, and colony assays were undertaken on subgroups of patients.

    Findings: A single point mutation (Val617Phe) was identified in JAK2 in 71 (97%) of 73 patients with polycythaemia vera, 29 (57%) of 51 with essential thrombocythaemia, and eight (50%) of 16 with idiopathic myelofibrosis. The mutation is acquired, is present in a variable proportion of granulocytes, alters a highly conserved valine present in the negative regulatory JH2 domain, and is predicted to dysregulate kinase activity. It was heterozygous in most patients, homozygous in a subset as a result of mitotic recombination, and arose in a multipotent progenitor capable of giving rise to erythroid and myeloid cells. The mutation was present in all erythropoietin-independent erythroid colonies.

    Interpretation: A single acquired mutation of JAK2 was noted in more than half of patients with a myeloproliferative disorder. Its presence in all erythropoietin-independent erythroid colonies demonstrates a link with growth factor hypersensitivity, a key biological feature of these disorders.

    Identification of the Val617Phe JAK2 mutation lays the foundation for new approaches to the diagnosis, classification, and treatment of myeloproliferative disorders.

    Funded by: Wellcome Trust: 088340

    Lancet 2005;365;9464;1054-61
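
    The percentages in the Findings paragraph above follow directly from the stated counts, and the same counts give a pooled proportion consistent with the "more than half" statement in the Interpretation. The short sketch below reproduces that arithmetic. The pooled figure is derived here from the quoted counts and is not itself stated in the abstract; the names COUNTS, percent and pooled are illustrative.

```python
# Counts as stated in the Findings paragraph:
# (patients carrying Val617Phe, patients tested) per diagnosis.
COUNTS = {
    "polycythaemia vera": (71, 73),
    "essential thrombocythaemia": (29, 51),
    "idiopathic myelofibrosis": (8, 16),
}


def percent(mutated: int, tested: int) -> int:
    """Percentage of tested patients carrying the mutation, rounded to an integer."""
    return round(100 * mutated / tested)


def pooled(counts: dict) -> tuple:
    """Sum mutated and tested patients across all diagnoses."""
    mutated = sum(m for m, _ in counts.values())
    tested = sum(t for _, t in counts.values())
    return mutated, tested


if __name__ == "__main__":
    for diagnosis, (mutated, tested) in COUNTS.items():
        print(f"{diagnosis}: {mutated}/{tested} = {percent(mutated, tested)}%")
    total_mutated, total_tested = pooled(COUNTS)
    print(f"pooled: {total_mutated}/{total_tested} = {percent(total_mutated, total_tested)}%")
```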

  • The interferon-inducible p47 (IRG) GTPases in vertebrates: loss of the cell autonomous resistance mechanism in the human lineage.

    Bekpen C, Hunn JP, Rohde C, Parvanova I, Guethlein L, Dunn DM, Glowalla E, Leptin M and Howard JC

    Institute for Genetics, University of Cologne, Zülpicher Strasse 47, 50674 Cologne, Germany. c.bekpen@uni-koeln.de

    Background: Members of the p47 GTPase family (immunity-related GTPases, IRG) are essential, interferon-inducible resistance factors in mice that are active against a broad spectrum of important intracellular pathogens. Surprisingly, there are no reports of p47 function in humans.

    Results: Here we show that the p47 GTPases are represented by 23 genes in the mouse, whereas humans have only a single full-length p47 GTPase and an expressed, truncated presumed pseudogene. The human full-length gene is orthologous to an isolated mouse p47 GTPase that carries no interferon-inducible elements in the promoter of either species and is expressed constitutively in the mature testis of both species. Thus, there is no evidence for a p47 GTPase-based resistance system in humans. Dogs have several interferon-inducible p47s, and so the primate lineage that led to humans appears to have lost an ancient function. Multiple p47 GTPases are also present in the zebrafish, but there is only a tandem p47 gene pair in pufferfish.

    Conclusion: Mice and humans must deploy their immune resources against vacuolar pathogens in radically different ways. This carries significant implications for the use of the mouse as a model of human infectious disease. The absence of the p47 resistance system in humans suggests that possession of this resistance system carries significant costs that, in the primate lineage that led to humans, are not outweighed by the benefits. The origin of the vertebrate p47 system is obscure.

    Genome biology 2005;6;11;R92

  • Analyses of murine postsynaptic density-95 identify novel isoforms and potential translational control elements.

    Bence M, Arbuckle MI, Dickson KS and Grant SG

    Division of Neuroscience, University of Edinburgh, Edinburgh EH8 9JZ, UK.

    Postsynaptic density-95 (PSD-95) is an evolutionarily conserved synaptic adaptor protein that is known to bind many proteins including the NMDA receptor. This observation has implicated it in many NMDA receptor-dependent processes including spatial learning and synaptic plasticity. We have cloned and characterised the murine PSD-95 gene. In addition, we have identified two previously uncharacterised splice variants of the major murine PSD-95 transcript (PSD-95alpha): PSD-95alpha-2b results from an extension of exon 2 and PSD-95alpha-Delta18 from the temporal exclusion of exon 18. The presence of PSD-95alpha-2b sequences in other PSD-95 family members implicates this peptide stretch as functionally significant. Another potential transcript (PSD-95gamma) was also identified based on examination of EST databases. Immunoprecipitation assays demonstrate that proteins corresponding in size to PSD-95alpha-Delta18 and PSD-95gamma interact with the NMDA receptor, suggesting an important biological role for these isoforms. Finally, we have performed bioinformatics analyses of the PSD-95 mRNA untranslated regions, identifying multiple translational control elements that suggest protein production could be regulated post-transcriptionally. The variety of mRNA isoforms and regulatory elements identified provides for a high degree of diversity in the structure and function of PSD-95 proteins.

    Brain research. Molecular brain research 2005;133;1;143-52

  • Enemies within.

    Bentley S, Sebaihia M and Crossman L

    Nature reviews. Microbiology 2005;3;1;8-9

  • Binding specificity and mRNA targets of a C. elegans PUF protein, FBF-1.

    Bernstein D, Hook B, Hajarnavis A, Opperman L and Wickens M

    Department of Biochemistry, University of Wisconsin, 433 Babcock Drive, Madison, WI 53706, USA.

    Sequence-specific RNA-protein interactions underlie regulation of many mRNAs. Here we analyze the RNA sequence specificity of Caenorhabditis elegans FBF-1, a founding member of the PUF protein family. Like other PUF proteins, FBF-1 binds to the 3' UTR of target mRNAs and decreases expression of those target genes. Here, we show that FBF-1 and its close relative, FBF-2, bind with similar affinity to multiple RNA sites. We use mutagenesis and in vivo selection experiments to identify nucleotides that are essential for FBF-1 binding. The binding elements comprise a "core" central region and flanking sequences. The core region is similar to but distinct from the binding sites of other PUF proteins. We combine the identification of binding elements with informatics to predict new FBF-1 binding sites in a C. elegans 3' UTR database. These data identify a set of new candidate mRNA targets of FBF-1 and FBF-2.

    Funded by: NIGMS NIH HHS: GM31892, GM50942

    RNA (New York, N.Y.) 2005;11;4;447-58

  • The genome of the African trypanosome Trypanosoma brucei.

    Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B, Böhme U, Hannick L, Aslett MA, Shallom J, Marcello L, Hou L, Wickstead B, Alsmark UC, Arrowsmith C, Atkin RJ, Barron AJ, Bringaud F, Brooks K, Carrington M, Cherevach I, Chillingworth TJ, Churcher C, Clark LN, Corton CH, Cronin A, Davies RM, Doggett J, Djikeng A, Feldblyum T, Field MC, Fraser A, Goodhead I, Hance Z, Harper D, Harris BR, Hauser H, Hostetler J, Ivens A, Jagels K, Johnson D, Johnson J, Jones K, Kerhornou AX, Koo H, Larke N, Landfear S, Larkin C, Leech V, Line A, Lord A, Macleod A, Mooney PJ, Moule S, Martin DM, Morgan GW, Mungall K, Norbertczak H, Ormond D, Pai G, Peacock CS, Peterson J, Quail MA, Rabbinowitsch E, Rajandream MA, Reitter C, Salzberg SL, Sanders M, Schobel S, Sharp S, Simmonds M, Simpson AJ, Tallon L, Turner CM, Tait A, Tivey AR, Van Aken S, Walker D, Wanless D, Wang S, White B, White O, Whitehead S, Woodward J, Wortman J, Adams MD, Embley TM, Gull K, Ullu E, Barry JD, Fairlamb AH, Opperdoes F, Barrell BG, Donelson JE, Hall N, Fraser CM, Melville SE and El-Sayed NM

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK. mb4@sanger.ac.uk

    African trypanosomes cause human sleeping sickness and livestock trypanosomiasis in sub-Saharan Africa. We present the sequence and analysis of the 11 megabase-sized chromosomes of Trypanosoma brucei. The 26-megabase genome contains 9068 predicted genes, including approximately 900 pseudogenes and approximately 1700 T. brucei-specific genes. Large subtelomeric arrays contain an archive of 806 variant surface glycoprotein (VSG) genes used by the parasite to evade the mammalian immune system. Most VSG genes are pseudogenes, which may be used to generate expressed mosaic genes by ectopic recombination. Comparisons of the cytoskeleton and endocytic trafficking systems with those of humans and other eukaryotic organisms reveal major differences. A comparison of metabolic pathways encoded by the genomes of T. brucei, T. cruzi, and Leishmania major reveals the least overall metabolic capability in T. brucei and the greatest in L. major. Horizontal transfer of genes of bacterial origin has contributed to some of the metabolic differences in these parasites, and a number of novel potential drug targets have been identified.

    Funded by: NIAID NIH HHS: AI43062; Wellcome Trust

    Science (New York, N.Y.) 2005;309;5733;416-22

  • XPACE4 is a localized pro-protein convertase required for mesoderm induction and the cleavage of specific TGFbeta proteins in Xenopus development.

    Birsoy B, Berg L, Williams PH, Smith JC, Wylie CC, Christian JL and Heasman J

    Division of Developmental Biology, Cincinnati Children's Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229, USA.

    XPACE4 is a member of the subtilisin/kexin family of pro-protein convertases. It cleaves many pro-proteins to release their active proteins, including members of the TGFbeta family of signaling molecules. Studies in mouse suggest it may have important roles in regulating embryonic tissue specification. Here, we examine the role of XPACE4 in Xenopus development and make three novel observations: first, XPACE4 is stored as maternal mRNA localized to the mitochondrial cloud and vegetal hemisphere of the oocyte; second, it is required for the endogenous mesoderm inducing activity of vegetal cells before gastrulation; and third, it has substrate-specific activity, cleaving Xnr1, Xnr2, Xnr3 and Vg1, but not Xnr5, Derriere or ActivinB pro-proteins. We conclude that maternal XPACE4 plays an important role in embryonic patterning by regulating the production of a subset of active mature TGFbeta proteins in specific sites.

    Funded by: NICHD NIH HHS: HD038272, HD042598, HD37976

    Development (Cambridge, England) 2005;132;3;591-602

  • Analysis of the hypervariable region of the Salmonella enterica genome associated with tRNA(leuX).

    Bishop AL, Baker S, Jenks S, Fookes M, Gaora PO, Pickard D, Anjum M, Farrar J, Hien TT, Ivens A and Dougan G

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom. gd1@sanger.ac.uk.

    The divergence of Salmonella enterica and Escherichia coli is estimated to have occurred approximately 140 million years ago. Despite this evolutionary distance, the genomes of these two species still share extensive synteny and homology. However, there are significant differences between the two species in terms of genes putatively acquired via various horizontal transfer events. Here we report on the composition and distribution across the Salmonella genus of a chromosomal region designated SPI-10 in Salmonella enterica serovar Typhi and located adjacent to tRNA(leuX). We find that across the Salmonella genus the tRNA(leuX) region is a hypervariable hot spot for horizontal gene transfer; different isolates from the same S. enterica serovar can exhibit significant variation in this region. Many P4 phage, plasmid, and transposable element-associated genes are found adjacent to tRNA(leuX) in both Salmonella and E. coli, suggesting that these mobile genetic elements have played a major role in driving the variability of this region.

    Journal of bacteriology 2005;187;7;2469-82

  • Evolutionary comparison provides evidence for pathogenicity of RMRP mutations.

    Bonafé L, Dermitzakis ET, Unger S, Greenberg CR, Campos-Xavier BA, Zankl A, Ucla C, Antonarakis SE, Superti-Furga A and Reymond A

    Division of Molecular Pediatrics, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland. Luisa.Bonafe@chuv.ch

    Cartilage-hair hypoplasia (CHH) is a pleiotropic disease caused by recessive mutations in the RMRP gene that result in a wide spectrum of manifestations including short stature, sparse hair, metaphyseal dysplasia, anemia, immune deficiency, and increased incidence of cancer. Molecular diagnosis of CHH has implications for management, prognosis, follow-up, and genetic counseling of affected patients and their families. We report 20 novel mutations in 36 patients with CHH and describe the associated phenotypic spectrum. Given the high mutational heterogeneity (62 mutations reported to date), the high frequency of variations in the region (eight single nucleotide polymorphisms in and around RMRP), and the fact that RMRP is not translated into protein, prediction of mutation pathogenicity is difficult. We addressed this issue by a comparative genomic approach and aligned the genomic sequences of the RMRP gene in the entire class of mammals. We found that putative pathogenic mutations are located in highly conserved nucleotides, whereas polymorphisms are located in non-conserved positions. We conclude that the abundance of variations in this small gene is remarkable and at odds with its high conservation through species; it is unclear whether these variations are caused by a high local mutation rate, a failure of repair mechanisms, or a relaxed selective pressure. The marked diversity of mutations in RMRP and the low homozygosity rate in our patient population indicate that CHH is more common than previously estimated, but may go unrecognized because of its variable clinical presentation. Thus, RMRP molecular testing may be indicated in individuals with isolated metaphyseal dysplasia, anemia, or immune dysregulation.

    PLoS genetics 2005;1;4;e47

  • Multiple mutations in mouse Chd7 provide models for CHARGE syndrome.

    Bosman EA, Penn AC, Ambrose JC, Kettleborough R, Stemple DL and Steel KP

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Mouse ENU mutagenesis programmes have yielded a series of independent mutations on proximal chromosome 4 leading to dominant head-bobbing and circling behaviour due to truncations of the lateral semicircular canal of the inner ear. Here, we report the identification of mutations in the Chd7 gene in nine of these mutant alleles including six nonsense and three splice site mutations. The human CHD7 gene is known to be involved in CHARGE syndrome, which also shows inner ear malformations and a variety of other features with varying penetrance and appears to be due to frequent de novo mutation. We found widespread expression of Chd7 in early development of the mouse in organs affected in CHARGE syndrome including eye, olfactory epithelium, inner ear and vascular system. Closer inspection of heterozygous mutant mice revealed a range of defects with reduced penetrance, such as cleft palate, choanal atresia, septal defects of the heart, haemorrhages, prenatal death, vulva and clitoral defects and keratoconjunctivitis sicca. Many of these defects mimic the features of CHARGE syndrome. There were no obvious features of the gene that might make it more mutable than other genes. We conclude that the large number of mouse mutants and human de novo mutations may be due to the combination of the Chd7 gene being a large target and the fact that many heterozygous carriers of the mutations are viable individuals with a readily detectable phenotype.

    Funded by: Wellcome Trust

    Human molecular genetics 2005;14;22;3463-76

  • A human-curated annotation of the Candida albicans genome.

    Braun BR, van Het Hoog M, d'Enfert C, Martchenko M, Dungan J, Kuo A, Inglis DO, Uhl MA, Hogues H, Berriman M, Lorenz M, Levitin A, Oberholzer U, Bachewich C, Harcus D, Marcil A, Dignard D, Iouk T, Zito R, Frangeul L, Tekaia F, Rutherford K, Wang E, Munro CA, Bates S, Gow NA, Hoyer LL, Köhler G, Morschhäuser J, Newport G, Znaidi S, Raymond M, Turcotte B, Sherlock G, Costanzo M, Ihmels J, Berman J, Sanglard D, Agabian N, Mitchell AP, Johnson AD, Whiteway M and Nantel A

    Department of Microbiology and Immunology, University of California, San Francisco, California, USA.

    Recent sequencing and assembly of the genome for the fungal pathogen Candida albicans used simple automated procedures for the identification of putative genes. We have reviewed the entire assembly, both by hand and with additional bioinformatic resources, to accurately map and describe 6,354 genes and to identify 246 genes whose original database entries contained sequencing errors (or possibly mutations) that affect their reading frame. Comparison with other fungal genomes permitted the identification of numerous fungus-specific genes that might be targeted for antifungal therapy. We also observed that, compared to other fungi, the protein-coding sequences in the C. albicans genome are especially rich in short sequence repeats. Finally, our improved annotation permitted a detailed analysis of several multigene families, and comparative genomic studies showed that C. albicans has a far greater catabolic range, encoding respiratory Complex 1, several novel oxidoreductases and ketone body degrading enzymes, malonyl-CoA and enoyl-CoA carriers, several novel amino acid degrading enzymes, a variety of secreted catabolic lipases and proteases, and numerous transporters to assimilate the resulting nutrients. The results of these efforts will ensure that the Candida research community has uniform and comprehensive genomic information for medical research as well as for future diagnostic and therapeutic applications.

    PLoS genetics 2005;1;1;36-57

  • Comprehensive DNA copy number profiling of meningioma using a chromosome 1 tiling path microarray identifies novel candidate tumor suppressor loci.

    Buckley PG, Jarbo C, Menzel U, Mathiesen T, Scott C, Gregory SG, Langford CF and Dumanski JP

    Rudbeck Laboratory, Department of Genetics and Pathology, Uppsala University, Uppsala, Sweden.

    Meningiomas are common neoplasms of the meninges lining of the central nervous system. Deletions of 1p have been established as important for the initiation and/or progression of meningioma. The rationale of this array-CGH study was to characterize copy number imbalances of chromosome 1 in meningioma, using a full-coverage genomic microarray containing 2,118 distinct measurement points. In total, 82 meningiomas were analyzed, making this the most detailed analysis of chromosome 1 in a comprehensive series of tumors. We detected a broad range of aberrations, such as deletions and/or gains of various sizes. Deletions were the predominant finding and ranged from monosomy to a 3.5-Mb terminal 1p homozygous deletion. Although multiple aberrations were observed across chromosome 1, every meningioma in which imbalances were detected harbored 1p deletions. Tumor heterogeneity was also observed in three recurrent meningiomas, which most likely reflects a progressive loss of chromosomal segments at different stages of tumor development. The distribution of aberrations supports the existence of at least four candidate loci on chromosome 1, which are important for meningioma tumorigenesis. In one of these regions, our results already allow the analysis of a number of candidate genes. In a large series of cases, we observed an association between the presence of segmental duplications and deletion breakpoints, which suggests their role in the generation of these tumor-specific aberrations. As 1p is the site of the genome most frequently affected by tumor-specific aberrations, our results indicate loci of general importance for cancer development and progression.

    Cancer research 2005;65;7;2653-61

  • Identification of genetic aberrations on chromosome 22 outside the NF2 locus in schwannomatosis and neurofibromatosis type 2.

    Buckley PG, Mantripragada KK, Díaz de Ståhl T, Piotrowski A, Hansson CM, Kiss H, Vetrie D, Ernberg IT, Nordenskjöld M, Bolund L, Sainio M, Rouleau GA, Niimura M, Wallace AJ, Evans DG, Grigelionis G, Menzel U and Dumanski JP

    Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, Sweden.

    Schwannomatosis is characterized by multiple peripheral and cranial nerve schwannomas that occur in the absence of bilateral 8th cranial nerve schwannomas. The latter is the main diagnostic criterion of neurofibromatosis type 2 (NF2), which is a related but distinct disorder. The genetic factors underlying the differences between schwannomatosis and NF2 are poorly understood, although available evidence implicates chromosome 22 as the primary location of the gene(s) of interest. To investigate this, we comprehensively profiled the DNA copy number in samples from sporadic and familial schwannomatosis, NF2, and a large cohort of normal controls. Using a tiling-path chromosome 22 genomic array, we identified two candidate regions of copy number variation, which were further characterized by a PCR-based array with higher resolution. The latter approach allows the detection of minute alterations in total genomic DNA, with as little as 1.5 kb per measurement point of nonredundant sequence on the array. In DNA derived from peripheral blood from a schwannomatosis patient and a sporadic schwannoma sample, we detected rearrangements of the immunoglobulin lambda (IGL) locus, which is unlikely to be due to a B-cell specific somatic recombination of IGL. Analysis of normal controls indicated that these IGL rearrangements were restricted to schwannomatosis/schwannoma samples. In the second candidate region spanning GSTT1 and CABIN1 genes, we observed a frequent copy number polymorphism at the GSTT1 locus. We further describe missense mutations in the CABIN1 gene that are specific to samples from schwannomatosis and NF2 and make this gene a plausible candidate for contributing to the pathogenesis of these disorders.

    Human mutation 2005;26;6;540-9

  • Plasmodium falciparum variant surface antigen expression patterns during malaria.

    Bull PC, Berriman M, Kyes S, Quail MA, Hall N, Kortok MM, Marsh K and Newbold CI

    Nuffield Department of Clinical Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom. pbull@kilifi.mimcom.net

    The variant surface antigens expressed on Plasmodium falciparum-infected erythrocytes are potentially important targets of immunity to malaria and are encoded, at least in part, by a family of var genes, about 60 of which are present within every parasite genome. Here we use semi-conserved regions within short var gene sequence "tags" to make direct comparisons of var gene expression in 12 clinical parasite isolates from Kenyan children. A total of 1,746 var clones were sequenced from genomic and cDNA and assigned to one of six sequence groups using specific sequence features. The results show the following. (1) The relative numbers of genomic clones falling in each of the sequence groups were similar between parasite isolates and corresponded well with the numbers of genes found in the genome of a single, fully sequenced parasite isolate. In contrast, the relative numbers of cDNA clones falling in each group varied considerably between isolates. (2) Expression of sequences belonging to a relatively conserved group was negatively associated with the repertoire of variant surface antigen antibodies carried by the infected child at the time of disease, whereas expression of sequences belonging to another group was associated with the parasite "rosetting" phenotype, a well established virulence determinant. Our results suggest that information on the state of the host-parasite relationship in vivo can be provided by measurements of the differential expression of different var groups, and need only be defined by short stretches of sequence data.

    Funded by: Wellcome Trust: 060678, 631342

    PLoS pathogens 2005;1;3;e26

  • Plasmodium falciparum antigenic variation: relationships between in vivo selection, acquired antibody response, and disease severity.

    Bull PC, Pain A, Ndungu FM, Kinyanjui SM, Roberts DJ, Newbold CI and Marsh K

    Nuffield Department of Clinical Medicine, John Radcliffe Hospital, University of Oxford, Oxford, United Kingdom. pbull@kilifi.mimcom.net

    Background: Variant surface antigens (VSA) on Plasmodium falciparum-infected erythrocytes are potentially important targets of immunity to malaria. We previously identified a VSA phenotype--VSA with a high frequency of antibody recognition (VSA(FoRH))--that is associated with young host age and severe malaria. We hypothesized that VSA(FoRH) are positively selected by host molecules such as intercellular adhesion molecule 1 (ICAM1) and CD36 and dominate in the absence of an effective immune response. Here, we assessed, in 115 Kenyan children, the potential role played by in vivo selection pressures in either favoring or selecting against VSA(FoRH) among parasites that cause malaria.

    Methods: We tested for associations between VSA(FoRH) and (1) the repertoire of VSA antibodies carried by children at the time of acute malaria and (2) polymorphisms in ICAM1 (K29M) and CD36 (T188G) that could potentially reduce the positive selection of VSA(FoRH).

    Results: An expected negative association between VSA antibody repertoire and VSA(FoRH) was observed in children with nonsevere malaria. However, this association did not extend to children with severe malaria, many of whom apparently had well-developed VSA antibody responses despite being infected by parasites expressing VSA(FoRH). There was no evidence for involvement of CD36 or ICAM1 in positive selection of VSA(FoRH). On the contrary, a weak positive association between carriage of the CD36 (T188G) allele and VSA(FoRH) was observed in children with severe malaria.

    Conclusion: The association between the VSA(FoRH) parasite phenotype and severe malaria cannot be explained simply in terms of the total repertoire of VSA antibodies carried at the time of acute disease.

    Funded by: Wellcome Trust

    The Journal of infectious diseases 2005;192;6;1119-26

  • Universal DNA primers amplify bacterial DNA from human fetal membranes and link Fusobacterium nucleatum with prolonged preterm membrane rupture.

    Cahill RJ, Tan S, Dougan G, O'Gaora P, Pickard D, Kennea N, Sullivan MH, Feldman RG and Edwards AD

    Department of Biological Sciences, Centre for Molecular Microbiology and Infection, Flowers Building, London, UK.

    A large number of bacterial species have been identified in fetal membranes after preterm labour (PTL) associated with intrauterine infection by microbiological culture. In this study, we have investigated a molecular and bioinformatic approach to organism identification which surmounts the need for specific and diverse microbiological culture conditions required by conventional methods. Samples of fetal membranes were taken from 37 preterm infants, and 6 normal term controls delivered by caesarean section, in which bacteria had been detected by in situ hybridization of 16S ribosomal RNA using a generic probe. Degenerate primers were designed to amplify bacterial 16S ribosomal DNA by PCR and used to amplify bacterial DNA from human fetal membranes. Amplicons were cloned, sequenced and bacteria were identified bioinformatically by comparison of sequences with known bacterial DNA genomes. In situ hybridization using an organism specific probe was then used to confirm the presence of the commonest identified organism in tissue samples. Bacterial DNA was amplified from 15/43 samples, all from preterm deliveries, and the bioinformatic approach identified organisms in all cases. Multiple bacteria were identified including Mycoplasma hominis, Pasteurella multocida, Pseudomonas PH1, Escherichia coli and Prevotella bivia. The commonest organism, Fusobacterium nucleatum, was found in 9/15 (60%) of samples. Ten of the 12 samples obtained after prolonged membrane rupture were positive for bacterial DNA, and 7 of these (70%) contained DNA from F. nucleatum. Bacteria from fetal membranes may be identified by molecular and bioinformatic methods. Further work is warranted to investigate the apparent linkage between F. nucleatum, fetal membrane rupture and preterm delivery.

    Molecular human reproduction 2005;11;10;761-6

  • The genome of model malaria parasites, and comparative genomics.

    Carlton J, Silva J and Hall N

    The Institute for Genomic Research, Rockville, MD 20850, USA. carlton@tigr.org

    The field of comparative genomics of malaria parasites has recently come of age with the completion of the whole genome sequences of the human malaria parasite Plasmodium falciparum and a rodent malaria model, Plasmodium yoelii yoelii. With several other genome sequencing projects of different model and human malaria parasite species underway, comparing genomes from multiple species has necessitated the development of improved informatics tools and analyses. Results from initial comparative analyses reveal striking conservation of gene synteny between malaria species within conserved chromosome cores, in contrast to reduced homology within subtelomeric regions, in line with previous findings on a smaller scale. Genes that elicit a host immune response are frequently found to be species-specific, although a large variant multigene family is common to many rodent malaria species and Plasmodium vivax. Sequence alignment of syntenic regions from multiple species has revealed the similarity between species in coding regions to be high relative to non-coding regions, and phylogenetic footprinting studies promise to reveal conserved motifs in the latter. Comparison of non-synonymous substitution rates between orthologous genes is proving a powerful technique for identifying genes under selection pressure, and may be useful for vaccine design. This is a stimulating time for comparative genomics of model and human malaria parasites, which promises to produce useful results for the development of antimalarial drugs and vaccines.

    Current issues in molecular biology 2005;7;1;23-37

  • The transcriptional landscape of the mammalian genome.

    Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest AR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SP, Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schönbach C, Sekiguchi K, Semple CA, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y, FANTOM Consortium and RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group)

    This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.

    Funded by: Telethon: TGM03P17, TGM06S01

    Science (New York, N.Y.) 2005;309;5740;1559-63

  • Microarray-based comparative genomic analyses of the human malaria parasite Plasmodium falciparum using Affymetrix arrays.

    Carret CK, Horrocks P, Konfortov B, Winzeler E, Qureshi M, Newbold C and Ivens A

    Pathogen Microarrays Group, The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK. ckc@sanger.ac.uk

    Microarray-based comparative genomic hybridization (CGH) provides a powerful tool for whole genome analyses and the rapid detection of genomic variation that underlies virulence and disease. In the field of Plasmodium research, many of the parasite genomes that one might wish to study in a high throughput manner are not laboratory clones, but clinical isolates. One of the key limitations to the use of clinical samples in CGH, however, is the minuscule amount of genomic DNA available. Here we describe the successful application of multiple displacement amplification (MDA), a non-PCR-based amplification method that exhibits clear advantages over all other currently available methods. Using MDA, CGH was performed on a panel of NF54 and IT/FCR3 clones, identifying previously published deletions on chromosomes 2 and 9 as well as polymorphism in genes associated with disease pathology.

    Funded by: Wellcome Trust

    Molecular and biochemical parasitology 2005;144;2;177-86

  • JAE: Jemboss Alignment Editor.

    Carver TJ and Mullan LJ

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. tjc@sanger.ac.uk

    One of the most basic methods of understanding the biological significance of a sequence is to produce an alignment with related sequences. A vital aspect of correctly aligning sequences is to apply biological intuition through manual editing of an alignment produced by multiple-sequence alignment software. As part of the European Molecular Biology Open Source Software Suite (EMBOSS), a new alignment editor in the Jemboss package is freely available for download. The Jemboss Alignment Editor (JAE) incorporates standard methods of editing, and colouring residues and nucleotides to highlight important regions of interest. JAE also makes use of scoring matrices (PAM and BLOSUM), selected by the user, to display regions of high degrees of similarity and identity. Other tools include the ability to calculate a consensus, a consensus plot (using a selected scoring matrix) and pairwise identities. AVAILABILITY: The JAE can be launched from the webpage (http://emboss.sourceforge.net/Jemboss/).

    Applied bioinformatics 2005;4;2;151-4
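
    The consensus and pairwise-identity calculations mentioned in the JAE abstract above can be illustrated with a minimal Python sketch. This is not JAE's code (JAE is part of the Java-based Jemboss package); the function names, the simple majority-rule consensus, and the choice to skip gapped columns when computing identity are all assumptions made for illustration, and no scoring matrix is used.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Majority-rule consensus of equal-length aligned sequences.

    Hypothetical helper for illustration only; ties go to the residue
    seen first in the column.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        # A column made only of gaps stays a gap in the consensus.
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


def pairwise_identity(a, b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment, invented for this example.
    aligned = ["ACGT-A", "ACGTTA", "ACCTTA"]
    print(consensus(aligned))
    print(pairwise_identity(aligned[0], aligned[2]))
```

    Other conventions for percent identity exist (for example, dividing by the full alignment length), so the denominator used here is one choice among several.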

  • ACT: the Artemis Comparison Tool.

    Carver TJ, Rutherford KM, Berriman M, Rajandream MA, Barrell BG and Parkhill J

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. artemis@sanger.ac.uk

    The Artemis Comparison Tool (ACT) allows an interactive visualisation of comparisons between complete genome sequences and associated annotations. The comparison data can be generated with several different programs: BLASTN, TBLASTX or Mummer comparisons between genomic DNA sequences, or orthologue tables generated by reciprocal FASTA comparison between protein sets. It is possible to identify regions of similarity, insertions and rearrangements at any level from the whole genome to base-pair differences. ACT uses Artemis components to display the sequences and so inherits powerful searching and analysis tools. ACT is part of the Artemis distribution and is similarly open source, written in Java and can run on any Java enabled platform, including UNIX, Macintosh and Windows.

    Bioinformatics (Oxford, England) 2005;21;16;3422-3

  • GI genomes.

    Cerdeño-Tárraga A, Claesson MJ, Sebaihia M and Thomson NR

    Nature reviews. Microbiology 2005;3;5;368-9

  • Extensive DNA inversions in the B. fragilis genome control variable gene expression.

    Cerdeño-Tárraga AM, Patrick S, Crossman LC, Blakely G, Abratt V, Lennard N, Poxton I, Duerden B, Harris B, Quail MA, Barron A, Clark L, Corton C, Doggett J, Holden MT, Larke N, Line A, Lord A, Norbertczak H, Ormond D, Price C, Rabbinowitsch E, Woodward J, Barrell B and Parkhill J

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    The obligately anaerobic bacterium Bacteroides fragilis, an opportunistic pathogen and inhabitant of the normal human colonic microbiota, exhibits considerable within-strain phase and antigenic variation of surface components. The complete genome sequence has revealed an unusual breadth (in number and in effect) of DNA inversion events that potentially control expression of many different components, including surface and secreted components, regulatory molecules, and restriction-modification proteins. Invertible promoters of two different types (12 group 1 and 11 group 2) were identified. One group has inversion crossover (fix) sites similar to the hix sites of Salmonella typhimurium. There are also four independent intergenic shufflons that potentially alter the expression and function of varied genes. The composition of the 10 different polysaccharide biosynthesis gene clusters identified (7 with associated invertible promoters) suggests a mechanism of synthesis similar to the O-antigen capsules of Escherichia coli.

    Science (New York, N.Y.) 2005;307;5714;1463-5

  • VFDB: a reference database for bacterial virulence factors.

    Chen L, Yang J, Yu J, Yao Z, Sun L, Shen Y and Jin Q

    State Key Laboratory for Molecular Virology and Genetic Engineering, Beijing 100052, China.

    Bacterial pathogens continue to pose a major threat to public health worldwide in the 21st century. Intensified studies on bacterial pathogenesis have greatly expanded our knowledge about the mechanisms of the disease processes at the molecular level over the last decades. To facilitate future research, it becomes necessary to form a database collectively presenting the virulence factors (VFs) of various medically significant bacterial pathogens. The aim of the virulence factor database (VFDB) (http://www.mgc.ac.cn/VFs/) is to provide such a source for scientists to rapidly access current knowledge about VFs from various bacterial pathogens. VFDB is comprehensive and user-friendly. One can search VFDB by browsing each genus or by typing keywords. Furthermore, a BLAST search tool against all known VF-related genes is also available. VFDB provides a unified gateway to store, search, retrieve and update information about VFs from various bacterial pathogens.

    Nucleic acids research 2005;33;Database issue;D325-8

  • Gamma-glutamyl carboxylase (GGCX) microsatellite and warfarin dosing.

    Chen LY, Eriksson N, Gwilliam R, Bentley D, Deloukas P and Wadelius M

    Blood 2005;106;10;3673-4

  • WormBase: a comprehensive data resource for Caenorhabditis biology and genomics.

    Chen N, Harris TW, Antoshechkin I, Bastiani C, Bieri T, Blasiar D, Bradnam K, Canaran P, Chan J, Chen CK, Chen WJ, Cunningham F, Davis P, Kenny E, Kishore R, Lawson D, Lee R, Muller HM, Nakamura C, Pai S, Ozersky P, Petcherski A, Rogers A, Sabo A, Schwarz EM, Van Auken K, Wang Q, Durbin R, Spieth J, Sternberg PW and Stein LD

    Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA. chenn@cshl.org

    WormBase (http://www.wormbase.org), the model organism database for information about Caenorhabditis elegans and related nematodes, continues to expand in breadth and depth. Over the past year, WormBase has added multiple large-scale datasets including SAGE, interactome, 3D protein structure datasets and NCBI KOGs. To accommodate this growth, the International WormBase Consortium has improved the user interface by adding new features to aid in navigation, visualization of large-scale datasets, advanced searching and data mining. Internally, we have restructured the database models to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year.

    Funded by: NHGRI NIH HHS: P41-HG02223

    Nucleic acids research 2005;33;Database issue;D383-9

  • Defining the orientation of the tandem fusions that occurred during the evolution of Indian muntjac chromosomes by BAC mapping.

    Chi JX, Huang L, Nie W, Wang J, Su B and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, The Chinese Academy of Sciences, Kunming, Yunnan 650223, People's Republic of China.

    The Indian muntjac (Muntiacus muntjak vaginalis) has a karyotype of 2n=6 in the female and 7 in the male, the karyotypic evolution of which through extensive tandem fusions and several centric fusions has been well-documented by recent molecular cytogenetic studies. In an attempt to define the fusion orientations of conserved chromosomal segments and the molecular mechanisms underlying the tandem fusions, we have constructed a highly redundant (more than sixfold whole-genome coverage) bacterial artificial chromosome (BAC) library of Indian muntjac. The BAC library contains 124,800 clones with no chromosome bias and has an average insert DNA size of 120 kb. A total of 223 clones have been mapped by fluorescent in situ hybridization onto the chromosomes of both Indian muntjac and Chinese muntjac and a high-resolution comparative map has been established. Our mapping results demonstrate that all tandem fusions that occurred during the evolution of Indian muntjac karyotype from the acrocentric 2n=70 hypothetical ancestral karyotype are centromere-telomere (head-tail) fusions.

    Chromosoma 2005;114;3;167-72

  • Nematode genome evolution.

    Coghlan A

    Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin 4, Ireland. alc@sanger.ac.uk

    Nematodes are the most abundant type of animal on earth, and live in hot springs, polar ice, soil, fresh and salt water, and as parasites of plants, vertebrates, insects, and other nematodes. This extraordinary ability to adapt, which hints at an underlying genetic plasticity, has long fascinated biologists. The fully sequenced genomes of Caenorhabditis elegans and Caenorhabditis briggsae, and ongoing sequencing projects for eight other nematodes, provide an exciting opportunity to investigate the genomic changes that have enabled nematodes to invade many different habitats. Analyses of the C. elegans and C. briggsae genomes suggest that these include major changes in gene content, as well as in chromosome number, structure and size. Here I discuss how the data set of ten genomes will be ideal for tackling questions about nematode evolution, as well as questions relevant to all eukaryotes.

    Funded by: Wellcome Trust

    WormBook : the online review of C. elegans biology 2005;1-15

  • Chromosome evolution in eukaryotes: a multi-kingdom perspective.

    Coghlan A, Eichler EE, Oliver SG, Paterson AH and Stein L

    Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland.

    In eukaryotes, chromosomal rearrangements, such as inversions, translocations and duplications, are common and range from part of a gene to hundreds of genes. Lineage-specific patterns are also seen: translocations are rare in dipteran flies, and angiosperm genomes seem prone to polyploidization. In most eukaryotes, there is a strong association between rearrangement breakpoints and repeat sequences. Current data suggest that some repeats promoted rearrangements via non-allelic homologous recombination; for others, the association might not be causal but might instead reflect the instability of particular genomic regions. Rearrangement polymorphisms in eukaryotes are correlated with phenotypic differences, so are thought to confer varying fitness in different habitats. Some seem to be under positive selection because they either trap favorable allele combinations together or alter the expression of nearby genes. There is little evidence that chromosomal rearrangements cause speciation, but they probably intensify reproductive isolation between species that have formed by another route.

    Funded by: NHGRI NIH HHS: HG02639; NIGMS NIH HHS: GM58815; Wellcome Trust

    Trends in genetics : TIG 2005;21;12;673-82

  • Proteomic analysis of in vivo phosphorylated synaptic proteins.

    Collins MO, Yu L, Coba MP, Husi H, Campuzano I, Blackstock WP, Choudhary JS and Grant SG

    Division of Neuroscience, University of Edinburgh, Edinburgh EH8 9JZ, UK.

    In the nervous system, protein phosphorylation is an essential feature of synaptic function. Although protein phosphorylation is known to be important for many synaptic processes and in disease, little is known about global phosphorylation of synaptic proteins. Heterogeneity and low abundance make protein phosphorylation analysis difficult, particularly for mammalian tissue samples. Using a new approach, combining both protein and peptide immobilized metal affinity chromatography and mass spectrometry data acquisition strategies, we have produced the first large scale map of the mouse synapse phosphoproteome. We report over 650 phosphorylation events corresponding to 331 sites (289 have been unambiguously assigned), 92% of which are novel. These represent 79 proteins, half of which are novel phosphoproteins, and include several highly phosphorylated proteins such as MAP1B (33 sites) and Bassoon (30 sites). An additional 149 candidate phosphoproteins were identified by profiling the composition of the protein immobilized metal affinity chromatography enrichment. All major synaptic protein classes were observed, including components of important pre- and postsynaptic complexes as well as low abundance signaling proteins. Bioinformatic and in vitro phosphorylation assays of peptide arrays suggest that a small number of kinases phosphorylate many proteins and that each substrate is phosphorylated by many kinases. These data substantially increase existing knowledge of synapse protein phosphorylation and support a model where the synapse phosphoproteome is functionally organized into a highly interconnected signaling network.

    The Journal of biological chemistry 2005;280;7;5972-82

  • Robust enrichment of phosphorylated species in complex mixtures by sequential protein and peptide metal-affinity chromatography and analysis by tandem mass spectrometry.

    Collins MO, Yu L, Husi H, Blackstock WP, Choudhary JS and Grant SG

    The Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK. moc@sanger.ac.uk

    Reversible protein phosphorylation mediated by kinases, phosphatases, and regulatory molecules is an essential mechanism of signal transduction in living cells. Although phosphorylation is the most intensively studied of the several hundred known posttranslational modifications on proteins, until recently the rate of identification of phosphorylation sites has remained low. The use of tandem mass spectrometry has greatly accelerated the identification of phosphorylation sites, although progress was limited by difficulties in phosphoresidue enrichment techniques. We have improved upon existing immobilized metal-affinity chromatography (IMAC) techniques for capturing phosphopeptides, to selectively purify phosphoproteins from complex mixtures. Combinations of phosphoprotein and phosphopeptide enrichment were more effective than current single phosphopeptide purification approaches. We have also implemented iterative mass spectrometry-based scanning techniques to improve detection of phosphorylated peptides in these enriched samples. Here, we provide detailed instructions for implementing and validating these methods together with analysis by tandem mass spectrometry for the study of phosphorylation at the mammalian synapse. This strategy should be widely applicable to the characterization of protein phosphorylation in diverse tissues, organelles, and in cell culture.

    Science's STKE : signal transduction knowledge environment 2005;2005;298;pl6

  • The genome of the heartwater agent Ehrlichia ruminantium contains multiple tandem repeats of actively variable copy number.

    Collins NE, Liebenberg J, de Villiers EP, Brayton KA, Louw E, Pretorius A, Faber FE, van Heerden H, Josemans A, van Kleef M, Steyn HC, van Strijp MF, Zweygarth E, Jongejan F, Maillard JC, Berthier D, Botha M, Joubert F, Corton CH, Thomson NR, Allsopp MT and Allsopp BA

    Department of Veterinary Tropical Diseases, Faculty of Veterinary Science, University of Pretoria, Private Bag X04, Onderstepoort 0110, South Africa.

    Heartwater, a tick-borne disease of domestic and wild ruminants, is caused by the intracellular rickettsia Ehrlichia ruminantium (previously known as Cowdria ruminantium). It is a major constraint to livestock production throughout sub-Saharan Africa, and it threatens to invade the Americas, yet there is no immediate prospect of an effective vaccine. A shotgun genome sequencing project was undertaken in the expectation that access to the complete protein coding repertoire of the organism will facilitate the search for vaccine candidate genes. We report here the complete 1,516,355-bp sequence of the type strain, the stock derived from the South African Welgevonden isolate. Only 62% of the genome is predicted to be coding sequence, encoding 888 proteins and 41 stable RNA species. The most striking feature is the large number of tandemly repeated and duplicated sequences, some of continuously variable copy number, which contributes to the low proportion of coding sequence. These repeats have mediated numerous translocation and inversion events that have resulted in the duplication and truncation of some genes and have also given rise to new genes. There are 32 predicted pseudogenes, most of which are truncated fragments of genes associated with repeats. Rather than being the result of the reductive evolution seen in other intracellular bacteria, these pseudogenes appear to be the product of ongoing sequence duplication events.

    Funded by: NIAID NIH HHS: R01 AI47885

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;3;838-43

  • Postgraduate training in infectious diseases: investigating the current status in the international community.

    Cooke FJ, Choubina P and Holmes AH

    Department of Infectious Diseases and Microbiology, Hammersmith Hospital, London, UK. fiona@sanger.ac.uk

    International collaboration and understanding is becoming increasingly important as we face a soaring number of emerging and re-emerging infectious diseases. Management of these conditions calls for a cohesive international effort, with contributions from many infectious disease specialists. To optimise collaborative efforts, an international understanding of training, capabilities, and skills would be valuable. An investigation of postgraduate training programmes in the infectious disease specialties around the world was done. 33 countries contributed information. 26 of these countries had established training programmes--one of which was changing its duration and research component; three were in the process of setting up programmes, two provided specialist training that had no official recognition, and two had no specialist training. In addition to promoting international understanding and collaboration, this article should catalyse a global assessment of postgraduate training programmes within the field of infectious diseases.

    The Lancet infectious diseases 2005;5;7;440-9

  • The lrp gene and its role in type I fimbriation in Citrobacter rodentium.

    Cordone A, Mauriello EM, Pickard DJ, Dougan G, De Felice M and Ricca E

    Dipartimento di Biologia Strutturale e Funzionale, Università Federico II, via Cinthia, Complesso Monte S. Angelo, 80126, Naples, Italy.

    Citrobacter rodentium is a murine pathogen that is now widely used as an in vivo model for gastrointestinal infections due to its similarities with human enteropathogens, such as the possession of a locus for enterocyte effacement (the LEE island). We studied the lrp gene of C. rodentium and found that it encodes a product highly similar to members of the Lrp (leucine-responsive regulatory protein) family of transcriptional regulators, able to recognize leucine as an effector and to repress the expression of its own structural gene. In enterobacteria, Lrp is a global regulator of gene expression, as it controls a large variety of genes, including those coding for cell appendages and other potential virulence factors. Based on the well-established role of Lrp on the expression of pilus genes in Escherichia coli, we also studied the role of Lrp in controlling the formation of the type I pilus in C. rodentium. Type I pili, produced by the fim system, are virulence factors of uropathogens, involved in mediating bacterial adhesion to bladder epithelial cells. Yeast agglutination assays showed that Lrp is needed for type I pilus formation and real-time PCR experiments indicated that Lrp has a strong leucine-mediated effect on the expression of the fimAICDFGH operon. Mutant studies indicated that this positive action is exerted mainly through a positive control of Lrp on the phase variation mechanism that regulates fimAICDFGH expression. A quantitative analysis of its expression suggested that this operon may also be negatively regulated at the level of transcription.

    Journal of bacteriology 2005;187;20;7009-17

  • The mSin3A chromatin-modifying complex is essential for embryogenesis and T-cell development.

    Cowley SM, Iritani BM, Mendrysa SM, Xu T, Cheng PF, Yada J, Liggitt HD and Eisenman RN

    Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle WA 98109-1024, USA.

    The corepressor mSin3A is the core component of a chromatin-modifying complex that is recruited by multiple gene-specific transcriptional repressors. In order to understand the role of mSin3A during development, we generated constitutive germ line as well as conditional msin3A deletions. msin3A deletion in the developing mouse embryo results in lethality at the postimplantation stage, demonstrating that it is an essential gene. Blastocysts derived from preimplantation msin3A null embryos and mouse embryo fibroblasts (MEFs) lacking msin3A display a significant reduction in cell division. msin3A null MEFs also show mislocalization of the heterochromatin protein, HP1alpha, without alterations in global histone acetylation. Heterozygous msin3A(+/-) mice with a systemic twofold decrease in mSin3A protein develop splenomegaly as well as kidney disease indicative of a disruption of lymphocyte homeostasis. Conditional deletion of msin3A from developing T cells results in reduced thymic cellularity and a fivefold decrease in the number of cytotoxic (CD8) T cells, while helper (CD4) T cells are unaffected. We show that CD8 development is dependent on mSin3A at a step downstream of T-cell receptor signaling and that loss of mSin3A specifically decreases survival of double-positive and CD8 T cells. Thus, msin3A is a pleiotropic gene which, in addition to its role in cell cycle progression, is required for the development and homeostasis of cells in the lymphoid lineage.

    Funded by: NCI NIH HHS: R01CA57138; NIAID NIH HHS: R01 AI053568-01A1, R01 AI053568-02, R01 AI053568-03, R01 AI053568-04, R01 AI053568-05, R01AI0535468; NICHD NIH HHS: 5R011HD18184; PHS HHS: 5R01JL65898

    Molecular and cellular biology 2005;25;16;6990-7004

  • A survey of homozygous deletions in human cancer genomes.

    Cox C, Bignell G, Greenman C, Stabenau A, Warren W, Stephens P, Davies H, Watt S, Teague J, Edkins S, Birney E, Easton DF, Wooster R, Futreal PA and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.

    Homozygous deletions of recessive cancer genes and fragile sites are known to occur in human cancers. We identified 281 homozygous deletions in 636 cancer cell lines. Of these deletions, 86 were homozygous deletions of known recessive cancer genes, 17 were of sequenced common fragile sites, and 178 were in genomic regions that do not overlap known recessive oncogenes or fragile sites ("unexplained" homozygous deletions). Some cancer cell lines have multiple homozygous deletions whereas others have none, suggesting intrinsic variation in the tendency to develop this type of genetic abnormality (P < 0.001). The 178 unexplained homozygous deletions clustered into 131 genomic regions, 27 of which exhibit homozygous deletions in more than one cancer cell line. This degree of clustering indicates that the genomic positions of the unexplained homozygous deletions are not randomly determined (P < 0.001). Many homozygous deletions, including those that are in multiple clusters, do not overlap known genes and appear to be in intergenic DNA. Therefore, to elucidate further the pathogenesis of homozygous deletions in cancer, we investigated the genome landscape within unexplained homozygous deletions. The gene count within homozygous deletions is low compared with the rest of the genome. There are also fewer short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs), and low-copy-number repeats (LCRs). However, DNA within homozygous deletions has higher flexibility. These features may signal the presence of currently unrecognized zones of susceptibility to DNA rearrangement. They may also reflect a tendency to reduce the adverse effects of homozygous deletions by minimizing the number of genes removed.

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;12;4542-7

  • Treating genetic disease through RNA interference.

    Crombie C and Fraser A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Lancet 2005;365;9467;1288-90

  • Analysis of the genomic region containing the tammar wallaby (Macropus eugenii) orthologues of MHC class III genes.

    Cross JG, Harrison GA, Coggill P, Sims S, Beck S, Deakin JE and Graves JA

    Comparative Genomics Unit, ARC Centre for Kangaroo Genomics, Research School of Biological Sciences, Australian National University, Canberra, Australia. cross@rsbs.anu.edu.au

    Major histocompatibility complex (MHC) molecules are central to development and regulation of the immune system in all jawed vertebrates. MHC class III cytokine genes from the tumor necrosis factor core family, including tumor necrosis factor (TNF) and lymphotoxin alpha and beta (LTA, LTB), are well studied in human and mouse. Orthologues have been identified in several other eutherian species and the cDNA sequences have been reported for a model marsupial, the tammar wallaby. Comparative genomics can help to determine gene function, to understand the evolution of a gene or gene family, and to identify potential regulatory regions. We therefore cloned the genomic region containing the tammar LTB, TNF, and LTA orthologues by "genome walking", using primers designed from known tammar sequences and regions conserved in other species. We isolated two tammar BAC clones containing all three genes. These tammar genes show similar intergenic distances and the same transcriptional orientation as in human and mouse. Gene structures and sequences are also highly conserved. By comparing the tammar, human and mouse genomic sequences we were able to identify candidate regulatory regions for these genes in mammals. Full length sequencing of BACs containing the three genes has been partially completed, and reveals the presence of a number of other tammar MHC III orthologues in this region.

    Cytogenetic and genome research 2005;111;2;110-7

  • Diversity at every level.

    Crossman L, Cerdeño-Tárraga A and Parkhill J

    Nature reviews. Microbiology 2005;3;3;196-7

  • Plasmid replicons of Rhizobium.

    Crossman LC

    Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK. lcc@sanger.ac.uk

    Rhizobium spp. are found in soil. They are both free-living and found symbiotically associated with the nodules of leguminous plants. Traditionally, studies have focused on the association of these organisms with plants in nitrogen-fixing nodules, since this is regarded as the most important role of these bacteria in the environment. Rhizobium spp. are known to possess several replicons. Some, like the Rhizobium etli symbiotic plasmid p42d and the plasmid pNGR234b of Rhizobium NGR234, have been sequenced and characterized. The plasmids from these organisms are the focus of this short review.

    Biochemical Society transactions 2005;33;Pt 1;157-8

  • The immunoglobulin heavy-chain locus in zebrafish: identification and expression of a previously unknown isotype, immunoglobulin Z.

    Danilova N, Bussmann J, Jekosch K and Steiner LA

    Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA.

    The only immunoglobulin heavy-chain classes known so far in teleosts have been mu and delta. We identify here a previously unknown class, immunoglobulin zeta, expressed in zebrafish and other teleosts. In the zebrafish heavy-chain locus, variable (V) gene segments lie upstream of two tandem diversity, joining and constant (DJC) clusters, resembling the mouse T cell receptor alpha (Tcra) and delta (Tcrd) locus. V genes rearrange to (DJC)(zeta) or to (DJC)(mu) without evidence of switch rearrangement. The zebrafish immunoglobulin zeta gene (ighz) and mouse Tcrd, which are proximal to the V gene array, are expressed earlier in development. In adults, ighz was expressed only in kidney and thymus, which are primary lymphoid organs in teleosts. This additional class adds complexity to the immunoglobulin repertoire and raises questions concerning the evolution of immunoglobulins and the regulation of the differential expression of ighz and ighm.

    Funded by: NIAID NIH HHS: R01 AI08054

    Nature immunology 2005;6;3;295-302

  • Somatic mutations of the protein kinase gene family in human lung cancer.

    Davies H, Hunter C, Smith R, Stephens P, Greenman C, Bignell G, Teague J, Butler A, Edkins S, Stevens C, Parker A, O'Meara S, Avis T, Barthorpe S, Brackenbury L, Buck G, Clements J, Cole J, Dicks E, Edwards K, Forbes S, Gorton M, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jones D, Kosmidou V, Laman R, Lugg R, Menzies A, Perry J, Petty R, Raine K, Shepherd R, Small A, Solomon H, Stephens Y, Tofts C, Varian J, Webb A, West S, Widaa S, Yates A, Brasseur F, Cooper CS, Flanagan AM, Green A, Knowles M, Leung SY, Looijenga LH, Malkowicz B, Pierotti MA, Teh BT, Yuen ST, Lakhani SR, Easton DF, Weber BL, Goldstraw P, Nicholson AG, Wooster R, Stratton MR and Futreal PA

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, United Kingdom.

    Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. We screened the coding sequences of 518 protein kinases (approximately 1.3 Mb of DNA per sample) for somatic mutations in 26 primary lung neoplasms and seven lung cancer cell lines. One hundred eighty-eight somatic mutations were detected in 141 genes. Of these, 35 were synonymous (silent) changes. This result indicates that most of the 188 mutations were "passenger" mutations that are not causally implicated in oncogenesis. However, an excess of approximately 40 nonsynonymous substitutions compared with that expected by chance (P = 0.07) suggests that some nonsynonymous mutations have been selected and are contributing to oncogenesis. There was considerable variation between individual lung cancers in the number of mutations observed and no mutations were found in lung carcinoids. The mutational spectra of most lung cancers were characterized by a high proportion of C:G > A:T transversions, compatible with the mutagenic effects of tobacco carcinogens. However, one neuroendocrine cancer cell line had a distinctive mutational spectrum reminiscent of UV-induced DNA damage. The results suggest that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent.

    Funded by: Wellcome Trust

    Cancer research 2005;65;17;7591-5

  • Deletion at chromosome band 20p12.1 in colorectal cancer revealed by high resolution array comparative genomic hybridization.

    Davison EJ, Tarpey PS, Fiegler H, Tomlinson IP and Carter NP

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. e.j.davison@leeds.ac.uk

    Array comparative genomic hybridization (Array CGH) with tiling path resolution for an approximately 4.61 Mb region of chromosome band 20p12.1 has been used to investigate copy number loss in 48 colorectal cancer cell lines and 37 primary colorectal cancers. A recurrent deletion was detected in 55% of cell lines and 23% of primary cancers and the consensus minimum region of loss was identified as an approximately 190 kb section from 14.85 Mb to 15.04 Mb of chromosome 20. Two noncoding RNA genes located in the region, BA318C17.1 and DJ974N19.1, were investigated by mutation analysis and real-time PCR in colorectal cancer cell lines. Sequence changes in BA318C17.1 and reduced expression of both genes were detected, suggesting that the abrogation of these genes may play a role in colorectal tumorigenesis.

    Genes, chromosomes & cancer 2005;44;4;384-91

  • Positional and functional mapping of a neuroblastoma differentiation gene on chromosome 11.

    De Preter K, Vandesompele J, Menten B, Carr P, Fiegler H, Edsjö A, Carter NP, Yigit N, Waelput W, Van Roy N, Bader S, Påhlman S and Speleman F

    Center for Medical Genetics, Ghent University Hospital, MRB 2nd floor, De Pintelaan 185, B-9000 Ghent, Belgium. katleen.depreter@ugent.be

    Background: Loss of chromosome 11q defines a subset of high-stage aggressive neuroblastomas. Deletions are typically large and mapping efforts have thus far not led to a well defined consensus region, which hampers the identification of positional candidate tumour suppressor genes. In a previous study, functional evidence for a neuroblastoma suppressor gene on chromosome 11 was obtained through microcell mediated chromosome transfer, indicated by differentiation of neuroblastoma cells with loss of distal 11q upon introduction of chromosome 11. Interestingly, some of these microcell hybrid clones were shown to harbour deletions in the transferred chromosome 11. We decided to further exploit this model system as a means to identify candidate tumour suppressor or differentiation genes located on chromosome 11.

    Results: In a first step, we performed high-resolution array CGH DNA copy-number analysis in order to evaluate the chromosome 11 status in the hybrids. Several deletions in both parental and transferred chromosomes in the investigated microcell hybrids were observed. Subsequent correlation of these deletion events with the observed morphological changes led to the delineation of three putative regions on chromosome 11: 11q25, 11p13→11p15.1 and 11p15.3, that may harbour the responsible differentiation gene.

    Conclusion: Using an available model system, we were able to put forward some candidate regions that may be involved in neuroblastoma. Additional studies will be required to clarify the putative role of the genes located in these chromosomal segments in the observed differentiation phenotype specifically or in neuroblastoma pathogenesis in general.

    BMC genomics 2005;6;97

  • Genomic sequence of the class II region of the canine MHC: comparison with the MHC of other mammalian species.

    Debenham SL, Hart EA, Ashurst JL, Howe KL, Quail MA, Ollier WE and Binns MM

    Genetics Section, Animal Health Trust, Lanwades Park, Kentford, Newmarket, Suffolk CB8 7UU, UK. sally.debenham@aht.org.uk

    The domestic dog, Canis familiaris, is an excellent model species in which to study complex inherited diseases, having over 200 recognized breeds, each of which represents a closed gene pool. Overlapping canine genomic BAC clones were sequenced to obtain 711,521 bp of the canine classical and extended MHC class II regions. Analysis and annotation of this sequence reveals that it contains 45 loci, of which 29 are predicted to be functionally expressed. Comparison of the DLA class II sequence with those of the cat, human, and mouse highlights regions of syntenic conservation and species-specific gene rearrangement and duplication and gives an insight into the evolution of the DR region in the order Carnivora. Elucidation of functionally important dog class II genes and the identification of 23 microsatellite markers spanning this region will contribute significantly to the study of canine diseases that have an immune component.

    Genomics 2005;85;1;48-59

  • Conserved non-genic sequences - an unexpected feature of mammalian genomes.

    Dermitzakis ET, Reymond A and Antonarakis SE

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. md4@sanger.ac.uk

    Mammalian genomes contain highly conserved sequences that are not functionally transcribed. These sequences are single copy and comprise approximately 1-2% of the human genome. Evolutionary analysis strongly supports their functional conservation, although their potentially diverse, functional attributes remain unknown. It is likely that genomic variation in conserved non-genic sequences is associated with phenotypic variability and human disorders. So how might their function and contribution to human disorders be examined?

    Nature reviews. Genetics 2005;6;2;151-7

  • Gene expression variation and expression quantitative trait mapping of human chromosome 21 genes.

    Deutsch S, Lyle R, Dermitzakis ET, Attar H, Subrahmanyan L, Gehrig C, Parand L, Gagnebin M, Rougemont J, Jongeneel CV and Antonarakis SE

    Department of Genetic Medicine and Development, Geneva University Medical School, Geneva, Switzerland.

    Inter-individual differences in gene expression are likely to account for an important fraction of phenotypic differences, including susceptibility to common disorders. Recent studies have shown extensive variation in gene expression levels in humans and other organisms, and that a fraction of this variation is under genetic control. We investigated the patterns of gene expression variation in a 25 Mb region of human chromosome 21, which has been associated with many Down syndrome (DS) phenotypes. Taqman real-time PCR was used to measure expression variation of 41 genes in lymphoblastoid cells of 40 unrelated individuals. For 25 genes found to be differentially expressed, additional analysis was performed in 10 CEPH families to determine heritabilities and map loci harboring regulatory variation. Seventy-six percent of the differentially expressed genes had significant heritabilities, and genomewide linkage analysis led to the identification of significant eQTLs for nine genes. Most eQTLs were in trans, with the best result (P=7.46 x 10(-8)) obtained for TMEM1 on chromosome 12q24.33. A cis-eQTL identified for CCT8 was validated by performing an association study in 60 individuals from the HapMap project. SNP rs965951 located within CCT8 was found to be significantly associated with its expression levels (P=2.5 x 10(-5)) confirming cis-regulatory variation. The results of our study provide a representative view of expression variation of chromosome 21 genes, identify loci involved in their regulation and suggest that genes, for which expression differences are significantly larger than 1.5-fold in control samples, are unlikely to be involved in DS-phenotypes present in all affected individuals.

    Human molecular genetics 2005;14;23;3741-9

  • Exon array CGH: detection of copy-number changes at the resolution of individual exons in the human genome.

    Dhami P, Coffey AJ, Abbs S, Vermeesch JR, Dumanski JP, Woodward KJ, Andrews RM, Langford C and Vetrie D

    Human Genetics, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    The development of high-throughput screening methods such as array-based comparative genome hybridization (array CGH) allows screening of the human genome for copy-number changes. Current array CGH strategies have limits of resolution that make small (less than a few tens of kilobases) gains or losses of genomic DNA difficult to identify. We report here a significant improvement in the resolution of array CGH, with the development of an array platform that utilizes single-stranded DNA array elements to accurately measure copy-number changes of individual exons in the human genome. Using this technology, we screened 31 patient samples across an array containing a total of 162 exons for five disease genes and detected copy-number changes, ranging from whole-gene deletions and duplications to single-exon deletions and duplications, in 100% of the cases. Our data demonstrate that it is possible to screen the human genome for copy-number changes with array CGH at a resolution that is 2 orders of magnitude higher than that previously reported.

    American journal of human genetics 2005;76;5;750-62

  • The SET-domain protein superfamily: protein lysine methyltransferases.

    Dillon SC, Zhang X, Trievel RC and Cheng X

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. scd@sanger.ac.uk

    The SET-domain protein methyltransferase superfamily includes all but one of the proteins known to methylate histones on lysine. Histone methylation is important in the regulation of chromatin and gene expression.

    Funded by: NIGMS NIH HHS: GM49245, GM61355; Wellcome Trust

    Genome biology 2005;6;8;227

  • NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence.

    Down TA and Hubbard TJ

    Wellcome Trust Sanger Institute, Hinxton Cambridge, CB10 1SA, UK. td2@sanger.ac.uk

    NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs which were good predictors for all five abundant classes of annotated binding sites.

    Nucleic acids research 2005;33;5;1445-53

  • The use of edge-betweenness clustering to investigate biological function in protein interaction networks.

    Dunn R, Dudbridge F and Sanderson CM

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. rd3@sanger.ac.uk

    Background: This paper describes an automated method for finding clusters of interconnected proteins in protein interaction networks and retrieving protein annotations associated with these clusters.

    Results: Protein interaction graphs were separated into subgraphs of interconnected proteins, using the JUNG implementation of Girvan and Newman's Edge-Betweenness algorithm. Functions were sought for these subgraphs by detecting significant correlations with the distribution of Gene Ontology terms which had been used to annotate the proteins within each cluster. The method was implemented using freely available software (JUNG and the R statistical package). Protein clusters with significant correlations to functional annotations could be identified and included groups of proteins known to cooperate in cell metabolism. The method appears to be resilient against the presence of false positive interactions.

    Conclusion: This method provides a useful tool for rapid screening of small to medium size protein interaction datasets.

    BMC bioinformatics 2005;6;39

  • Host susceptibility and clinical outcomes in toll-like receptor 5-deficient patients with typhoid fever in Vietnam.

    Dunstan SJ, Hawn TR, Hue NT, Parry CP, Ho VA, Vinh H, Diep TS, House D, Wain J, Aderem A, Hien TT and Farrar JJ

    Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam. sdunstan@hcm.vnn.vn

    Toll-like receptor 5 (TLR5) mediates innate immune responses to bacterial pathogens by binding to flagellin. A polymorphism in the TLR5 gene introduces a premature stop codon (TLR5(392STOP)) that is associated with susceptibility to Legionnaires' disease. Here we investigated whether TLR5(392STOP) was associated with typhoid fever. The frequency of TLR5(392STOP) was not significantly different in 565 patients with typhoid fever and 281 ethnically matched control subjects. Furthermore, TLR5 deficiency had no measurable effect on a number of clinical parameters associated with typhoid fever, including fever clearance time, pathogen burden, disease severity, or age at acquisition of disease. TLR5 may not play an important role in TLR-stimulated innate immune responses to human infection with Salmonella enterica serovar Typhi. Initiation of these responses may rely on other TLRs that recognize different bacterial ligands.

    The Journal of infectious diseases 2005;191;7;1068-71

  • The genome of the social amoeba Dictyostelium discoideum.

    Eichinger L, Pachebat JA, Glöckner G, Rajandream MA, Sucgang R, Berriman M, Song J, Olsen R, Szafranski K, Xu Q, Tunggal B, Kummerfeld S, Madera M, Konfortov BA, Rivero F, Bankier AT, Lehmann R, Hamlin N, Davies R, Gaudet P, Fey P, Pilcher K, Chen G, Saunders D, Sodergren E, Davis P, Kerhornou A, Nie X, Hall N, Anjard C, Hemphill L, Bason N, Farbrother P, Desany B, Just E, Morio T, Rost R, Churcher C, Cooper J, Haydock S, van Driessche N, Cronin A, Goodhead I, Muzny D, Mourier T, Pain A, Lu M, Harper D, Lindsay R, Hauser H, James K, Quiles M, Madan Babu M, Saito T, Buchrieser C, Wardroper A, Felder M, Thangavelu M, Johnson D, Knights A, Loulseged H, Mungall K, Oliver K, Price C, Quail MA, Urushihara H, Hernandez J, Rabbinowitsch E, Steffen D, Sanders M, Ma J, Kohara Y, Sharp S, Simmonds M, Spiegler S, Tivey A, Sugano S, White B, Walker D, Woodward J, Winckler T, Tanaka Y, Shaulsky G, Schleicher M, Weinstock G, Rosenthal A, Cox EC, Chisholm RL, Gibbs R, Loomis WF, Platzer M, Kay RR, Williams J, Dear PH, Noegel AA, Barrell B and Kuspa A

    Center for Biochemistry and Center for Molecular Medicine Cologne, University of Cologne, Joseph-Stelzmann-Str. 52, 50931 Cologne, Germany.

    The social amoebae are exceptional in their ability to alternate between unicellular and multicellular forms. Here we describe the genome of the best-studied member of this group, Dictyostelium discoideum. The gene-dense chromosomes of this organism encode approximately 12,500 predicted proteins, a high proportion of which have long, repetitive amino acid tracts. There are many genes for polyketide synthases and ABC transporters, suggesting an extensive secondary metabolism for producing and exporting small molecules. The genome is rich in complex repeats, one class of which is clustered and may serve as centromeres. Partial copies of the extrachromosomal ribosomal DNA (rDNA) element are found at the ends of each chromosome, suggesting a novel telomere structure and the use of a common mechanism to maintain both the rDNA and chromosomal termini. A proteome-based phylogeny shows that the amoebozoa diverged from the animal-fungal lineage after the plant-animal split, but Dictyostelium seems to have retained more of the diversity of the ancestral genome than have plants, animals or fungi.

    Funded by: NICHD NIH HHS: R01 HD035925-06

    Nature 2005;435;7038;43-57

  • The Sequence Ontology: a tool for the unification of genome annotations.

    Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R and Ashburner M

    Department of Molecular and Cellular Biology, Life Sciences Addition, University of California, Berkeley, CA 94729-3200, USA. keilbeck@fruitfly.org

    The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data. Because SO treats part-whole relationships rigorously, data described with it can become substrates for automated reasoning, and instances of sequence features described by the SO can be subjected to a group of logical operations termed extensional mereology operators.

    Genome biology 2005;6;5;R44

  • Comparative genomics of trypanosomatid parasitic protozoa.

    El-Sayed NM, Myler PJ, Blandin G, Berriman M, Crabtree J, Aggarwal G, Caler E, Renauld H, Worthey EA, Hertz-Fowler C, Ghedin E, Peacock C, Bartholomeu DC, Haas BJ, Tran AN, Wortman JR, Alsmark UC, Angiuoli S, Anupama A, Badger J, Bringaud F, Cadag E, Carlton JM, Cerqueira GC, Creasy T, Delcher AL, Djikeng A, Embley TM, Hauser C, Ivens AC, Kummerfeld SK, Pereira-Leal JB, Nilsson D, Peterson J, Salzberg SL, Shallom J, Silva JC, Sundaram J, Westenberger S, White O, Melville SE, Donelson JE, Andersson B, Stuart KD and Hall N

    Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA. nelsayed@tigr.org

    A comparison of gene content and genome architecture of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major, three related pathogens with different life cycles and disease pathology, revealed a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters. Many species-specific genes, especially large surface antigen families, occur at nonsyntenic chromosome-internal and subtelomeric regions. Retroelements, structural RNAs, and gene family expansion are often associated with syntenic discontinuities that, along with gene divergence, acquisition and loss, and rearrangement within the syntenic regions, have shaped the genomes of each parasite. Contrary to recent reports, our analyses reveal no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.

    Funded by: NIAID NIH HHS: AI045039, AI45038, AI45061, U01 AI040599, U01 AI045039; Wellcome Trust

    Science (New York, N.Y.) 2005;309;5733;404-9

  • Mutations of C-RAF are rare in human cancer because C-RAF has a low basal kinase activity compared with B-RAF.

    Emuss V, Garnett M, Mason C and Marais R

    The Institute of Cancer Research, Signal Transduction Team, Cancer Research UK Centre of Cell and Molecular Biology, London, United Kingdom.

    The protein kinase B-RAF is mutated in approximately 8% of human cancers. Here we show that presumptive mutants of the closely related kinase, C-RAF, were detected in only 4 of 545 (0.7%) cancer cell lines. The activity of two of the mutated proteins is not significantly different from that of wild-type C-RAF and these variants may represent rare human polymorphisms. The basal and B-RAF-stimulated kinase activities of a third variant are unaltered but its activation by RAS is significantly reduced, suggesting that it may act in a dominant-negative manner to modulate pathway signaling. The fourth variant has elevated basal kinase activity and is hypersensitive to activation by RAS but does not transform mammalian cells. Furthermore, when we introduce the equivalent of the most common cancer mutation in B-RAF (V600E) into C-RAF, it only has a weak effect on kinase activity and does not convert C-RAF into an oncogene. This lack of activation occurs because C-RAF lacks a constitutive charge within a motif in the kinase domain called the N-region. This fundamental difference in RAF isoform regulation explains why B-RAF is frequently mutated in cancer whereas C-RAF mutations are rare.

    Funded by: Wellcome Trust

    Cancer research 2005;65;21;9719-26

  • Gene finding in the chicken genome.

    Eyras E, Reymond A, Castelo R, Bye JM, Camara F, Flicek P, Huckle EJ, Parra G, Shteynberg DD, Wyss C, Rogers J, Antonarakis SE, Birney E, Guigo R and Brent MR

    Institut Municipal d'Investigacio Medica, Universitat Pompeu Fabra, Centre de Regulacio Genomica, E08003 Barcelona, Catalonia, Spain. eeyras@imim.es

    Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method.

    Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end.

    Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.

    Funded by: NHGRI NIH HHS: HG003150-01, R01 HG02278; Wellcome Trust

    BMC bioinformatics 2005;6;131

  • Uncover genetic interactions in Caenorhabditis elegans by RNA interference.

    Fortunato A and Fraser AG

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Cambridge, Hinxton, UK. af4@sanger.ac.uk

    RNA-mediated interference (RNAi) has emerged recently as one of the most powerful functional genomics tools. RNAi has been particularly effective in the nematode worm C. elegans where RNAi has been used to analyse the loss-of-function phenotypes of almost all predicted genes. In this review, we illustrate how RNAi has been used to analyse gene function in C. elegans as well as pointing to some future directions for using RNAi to examine genetic interactions in a systematic manner.

    Bioscience reports 2005;25;5-6;299-307

  • Variation in the eNOS gene modifies the association between total energy expenditure and glucose intolerance.

    Franks PW, Luan J, Barroso I, Brage S, Gonzalez Sanchez JL, Ekelund U, Ríos MS, Schafer AJ, O'Rahilly S and Wareham NJ

    National Institute of Diabetes and Digestive and Kidney Diseases, 1550 E. Indian School Rd., Phoenix, AZ 85014, USA. pfranks@niddk.nih.gov

    Endothelium-derived nitric oxide (NO) facilitates skeletal muscle glucose uptake. Energy expenditure induces the endothelial NO synthase (eNOS) gene, providing a mechanism for insulin-independent glucose disposal. The object was to test 1) the association of genetic variation in eNOS, as assessed by haplotype-tagging single nucleotide polymorphisms (htSNPs) with type 2 diabetes, and 2) the interaction between eNOS haplotypes and total energy expenditure on glucose intolerance. Using multivariate models, we tested associations between eNOS htSNPs and diabetes (n = 461 and 474 case and control subjects, respectively) and glucose intolerance (two cohorts of n = 706 and 738 U.K. and Spanish Caucasians, respectively), and we tested eNOS x total energy expenditure interactions on glucose intolerance. An overall association between eNOS haplotype and diabetes was observed (P = 0.004). Relative to the most common haplotype (111), two haplotypes (121 and 212) tended to increase diabetes risk (OR 1.22, 95% CI 0.96-1.55), and one (122) was associated with decreased risk (0.58, 0.39-0.86). In the cohort studies, no association was observed between haplotypes and 2-h glucose (P > 0.10). However, we observed a significant total energy expenditure-haplotype interaction (P = 0.007). Genetic variation at the eNOS locus is associated with diabetes, which may be attributable to an enhanced effect of total energy expenditure on glucose disposal in individuals with specific eNOS haplotypes. Gene-environment interactions such as this may help explain why replication of genetic association frequently fails.

    Funded by: Wellcome Trust

    Diabetes 2005;54;9;2795-801

  • The pCoo plasmid of enterotoxigenic Escherichia coli is a mosaic cointegrate.

    Froehlich B, Parkhill J, Sanders M, Quail MA and Scott JR

    Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia 30322, USA.

    CS1 is the prototype of a class of pili of enterotoxigenic Escherichia coli (ETEC) associated with diarrheal disease in humans. The genes encoding this pilus are carried on a large plasmid, pCoo. We report the sequence of the complete 98,396-bp plasmid. Like many other virulence plasmids, pCoo is a mosaic consisting of regions derived from multiple sources. Complete and fragmented insertion sequences (IS) make up 24% of the total DNA and are scattered throughout the plasmid. The pCoo DNA between these IS elements has a wide range of G+C content (35 to 57%), suggesting that these regions have different ancestries. We find that the pCoo plasmid is a cointegrate of two functional replicons, related to R64 and R100, which are joined at a 1,953-bp direct repeat of IS100. Recombination between these repeats in the cointegrate generates the two smaller replicons which coexist with the cointegrate in the culture. Both of the smaller replicons have plasmid stability genes as well as genes that may be important in pathogenesis. Examination by PCR of 17 other unrelated CS1 ETEC strains with a variety of serotypes demonstrated that all contained at least parts of both replicons of pCoo and that strains of the O6 genotype appear to contain a cointegrate very similar to pCoo. The results suggest that this family of CS1-encoding plasmids is evolving rapidly.

    Funded by: NIAID NIH HHS: AI24870; Wellcome Trust

    Journal of bacteriology 2005;187;18;6509-16

  • The molecular clock mediates leptin-regulated bone formation.

    Fu L, Patel MS, Bradley A, Wagner EF and Karsenty G

    Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030; Bone Disease Program of Texas, Baylor College of Medicine, Houston, Texas 77030, USA.

    The hormone leptin is a regulator of bone remodeling, a homeostatic function maintaining bone mass constant. Mice lacking molecular-clock components (Per and Cry), or lacking Per genes in osteoblasts, display high bone mass, suggesting that bone remodeling may also be subject to circadian regulation. Moreover, Per-deficient mice experience a paradoxical increase in bone mass following leptin intracerebroventricular infusion. Thus, clock genes may mediate the leptin-dependent sympathetic regulation of bone formation. We show that expression of clock genes in osteoblasts is regulated by the sympathetic nervous system and leptin. Clock genes mediate the antiproliferative function of sympathetic signaling by inhibiting G1 cyclin expression. Partially antagonizing this inhibitory loop, leptin also upregulates AP-1 gene expression, which promotes cyclin D1 expression, osteoblast proliferation, and bone formation. Thus, leptin determines the extent of bone formation by modulating, via sympathetic signaling, osteoblast proliferation through two antagonistic pathways, one of which involves the molecular clock.

    Cell 2005;122;5;803-15

  • Somatic mutations in human cancer: insights from resequencing the protein kinase gene family.

    Futreal PA, Wooster R and Stratton MR

    Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, UK.

    All cancers arise due to the accumulation of mutations in critical target genes that, when altered, give rise to selective advantage in the cell and its progeny that harbor them. Knowledge of these mutations is key in understanding the biology of cancer initiation and progression, as well as the development of more targeted therapeutic strategies. We have undertaken a systematic screen of all annotated protein kinases in the human genome for mutations in a series of cancers including breast, non-small-cell lung, and testicular cancer. Our results show a wide diversity in mutation prevalence within and between tumor types. We have identified a previously undescribed mutator phenotype in human breast cancer. The results presented from sequencing the same 1.3 million base pairs through several tumor types suggest that most of the observed mutations are likely to be passenger events rather than causally implicated in oncogenesis. However, this work does provide evidence for the likely existence of multiple, infrequently mutated kinases.

    Funded by: Wellcome Trust

    Cold Spring Harbor symposia on quantitative biology 2005;70;43-9

  • Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae.

    Galagan JE, Calvo SE, Cuomo C, Ma LJ, Wortman JR, Batzoglou S, Lee SI, Baştürkmen M, Spevak CC, Clutterbuck J, Kapitonov V, Jurka J, Scazzocchio C, Farman M, Butler J, Purcell S, Harris S, Braus GH, Draht O, Busch S, D'Enfert C, Bouchier C, Goldman GH, Bell-Pedersen D, Griffiths-Jones S, Doonan JH, Yu J, Vienken K, Pain A, Freitag M, Selker EU, Archer DB, Peñalva MA, Oakley BR, Momany M, Tanaka T, Kumagai T, Asai K, Machida M, Nierman WC, Denning DW, Caddick M, Hynes M, Paoletti M, Fischer R, Miller B, Dyer P, Sachs MS, Osmani SA and Birren BW

    The Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, Massachusetts 02142, USA.

    The aspergilli comprise a diverse group of filamentous fungi spanning over 200 million years of evolution. Here we report the genome sequence of the model organism Aspergillus nidulans, and a comparative study with Aspergillus fumigatus, a serious human pathogen, and Aspergillus oryzae, used in the production of sake, miso and soy sauce. Our analysis of genome structure provided a quantitative evaluation of forces driving long-term eukaryotic genome evolution. It also led to an experimentally validated model of mating-type locus evolution, suggesting the potential for sexual reproduction in A. fumigatus and A. oryzae. Our analysis of sequence conservation revealed over 5,000 non-coding regions actively conserved across all three species. Within these regions, we identified potential functional elements including a previously uncharacterized TPP riboswitch and motifs suggesting regulation in filamentous fungi by Puf family genes. We further obtained comparative and experimental evidence indicating widespread translational regulation by upstream open reading frames. These results enhance our understanding of these widely studied fungi as well as provide new insight into eukaryotic genome evolution and gene regulation.

    Funded by: NIGMS NIH HHS: R01 GM058529

    Nature 2005;438;7071;1105-15

  • Genome sequence of Theileria parva, a bovine pathogen that transforms lymphocytes.

    Gardner MJ, Bishop R, Shah T, de Villiers EP, Carlton JM, Hall N, Ren Q, Paulsen IT, Pain A, Berriman M, Wilson RJ, Sato S, Ralph SA, Mann DJ, Xiong Z, Shallom SJ, Weidman J, Jiang L, Lynn J, Weaver B, Shoaibi A, Domingo AR, Wasawo D, Crabtree J, Wortman JR, Haas B, Angiuoli SV, Creasy TH, Lu C, Suh B, Silva JC, Utterback TR, Feldblyum TV, Pertea M, Allen J, Nierman WC, Taracha EL, Salzberg SL, White OR, Fitzhugh HA, Morzaria S, Venter JC, Fraser CM and Nene V

    Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, MD 20850, USA. gardner@tigr.org

    We report the genome sequence of Theileria parva, an apicomplexan pathogen causing economic losses to smallholder farmers in Africa. The parasite chromosomes exhibit limited conservation of gene synteny with Plasmodium falciparum, and its plastid-like genome represents the first example where all apicoplast genes are encoded on one DNA strand. We tentatively identify proteins that facilitate parasite segregation during host cell cytokinesis and contribute to persistent infection of transformed host cells. Several biosynthetic pathways are incomplete or absent, suggesting substantial metabolic dependence on the host cell. One protein family that may generate parasite antigenic diversity is not telomere-associated.

    Science (New York, N.Y.) 2005;309;5731;134-7

  • MicroRNAs regulate brain morphogenesis in zebrafish.

    Giraldez AJ, Cinalli RM, Glasner ME, Enright AJ, Thomson JM, Baskerville S, Hammond SM, Bartel DP and Schier AF

    Developmental Genetics Program, Skirball Institute of Biomolecular Medicine and Department of Cell Biology, New York University School of Medicine, New York, NY 10016, USA. giraldez@saturn.med.nyu.edu

    MicroRNAs (miRNAs) are small RNAs that regulate gene expression posttranscriptionally. To block all miRNA formation in zebrafish, we generated maternal-zygotic dicer (MZdicer) mutants that disrupt the Dicer ribonuclease III and double-stranded RNA-binding domains. Mutant embryos do not process precursor miRNAs into mature miRNAs, but injection of preprocessed miRNAs restores gene silencing, indicating that the disrupted domains are dispensable for later steps in silencing. MZdicer mutants undergo axis formation and differentiate multiple cell types but display abnormal morphogenesis during gastrulation, brain formation, somitogenesis, and heart development. Injection of miR-430 miRNAs rescues the brain defects in MZdicer mutants, revealing essential roles for miRNAs during morphogenesis.

    Science (New York, N.Y.) 2005;308;5723;833-8

  • CoGenT++: an extensive and extensible data environment for computational genomics.

    Goldovsky L, Janssen P, Ahrén D, Audit B, Cases I, Darzentas N, Enright AJ, López-Bigas N, Peregrin-Alvarez JM, Smith M, Tsoka S, Kunin V and Ouzounis CA

    Computational Genomics Group, The European Bioinformatics Institute EMBL, Cambridge Outstation, Cambridge CB10 1SD, UK.

    Motivation: CoGenT++ is a data environment for computational research in comparative and functional genomics, designed to address issues of consistency, reproducibility, scalability and accessibility.

    Description: CoGenT++ facilitates the re-distribution of all fully sequenced and published genomes, storing information about species, gene names and protein sequences. We describe our scalable implementation of ProXSim, a continually updated all-against-all similarity database, which stores pairwise relationships between all genome sequences. Based on these similarities, derived databases are generated for gene fusions--AllFuse, putative orthologs--OFAM, protein families--TRIBES, phylogenetic profiles--ProfUse and phylogenetic trees. Extensions based on the CoGenT++ environment include disease gene prediction, pattern discovery, automated domain detection, genome annotation and ancestral reconstruction.

    Conclusion: CoGenT++ provides a comprehensive environment for computational genomics, accessible primarily for large-scale analyses as well as manual browsing.

    Bioinformatics (Oxford, England) 2005;21;19;3806-10

  • Insights into the P. y. yoelii hepatic stage transcriptome reveal complex transcriptional patterns.

    Grüner AC, Hez-Deroubaix S, Snounou G, Hall N, Bouchier C, Letourneur F, Landau I and Druilhe P

    Unité de Parasitologie Bio-Médicale, Institut Pasteur, 25 Rue du Dr Roux, 75731 Paris Cedex 15, France.

    During their complex life cycle, malaria parasites adopt morphologically, biochemically and immunologically distinct forms. The intra-hepatic form is the least known, yet of established value in the induction of sterile immunity and as a target for chemoprophylaxis. Using Plasmodium yoelii as a model we present here a novel approach to the elucidation of the transcriptome of this poorly studied stage. Sequences from Plasmodium were obtained in 388 of the 3533 inserts (11%) isolated from liver-stage cDNA obtained from optimized cultures with high yields. These corresponded to a total of 88 putative P. yoelii genes. The majority of the transcribed genes identified code for predicted proteins of as yet unknown function. The RT-PCR analysis carried out for 29 of these genes confirmed expression at the hepatic stage and provided evidence for complex patterns of gene transcription in the distinct stages found in the mosquito and vertebrate host. The results demonstrate the efficacy of the approach, which can now be applied to further detailed analysis of the hepatic stage transcriptome of Plasmodium.

    Funded by: Wellcome Trust

    Molecular and biochemical parasitology 2005;142;2;184-92

  • AMPA receptor trafficking and GluR1.

    Grant SG

    Science (New York, N.Y.) 2005;310;5746;234-5; author reply 234-5

  • Synapse proteomics of multiprotein complexes: en route from genes to nervous system diseases.

    Grant SG, Marshall MC, Page KL, Cumiskey MA and Armstrong JD

    Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK. sg3@sanger.ac.uk

    Proteomic experiments have produced a draft profile of the overall molecular composition of the mammalian neuronal synapse. It appears that synapses have over 1000 protein components and the mapping of their interactions, organization and functions will lead to a global view of the role of synapses in physiology and disease. A major functional subcomponent of the synaptic machinery is a multiprotein complex of glutamate receptors and adhesion proteins with associated adaptor and signalling enzymes totalling 185 proteins known as the N-methyl-d-aspartate receptor complex/MAGUK associated signalling complex (NRC/MASC). Here, we review the proteomic studies and functions of NRC/MASC and specifically report on the role of its component genes in human diseases. Using a systematic literature search protocol, we identified reports of mutations or polymorphisms in 47 genes associated with 183 disorders, of which 54 were nervous system disorders. A similar number of genes are important in mouse synaptic plasticity and behaviour, where the NRC/MASC acts as a signalling complex with multiple functions provided by its individual protein components and their interactions. The individual gene mutations suggest not only an important role for the NRC/MASC in human diseases but that these diseases may be functionally connected by their common link to the NRC/MASC. The NRC/MASC is a rich source of genetic variation and provides a platform for understanding relationships of disease phenotype amenable to systematic studies such as the Genes to Cognition research consortium (www.genes2cognition.org) that links human and mouse genetics with proteomic studies.

    Funded by: Wellcome Trust

    Human molecular genetics 2005;14 Spec No. 2;R225-34

  • The complex nature of constitutional de novo apparently balanced translocations in patients presenting with abnormal phenotypes.

    Gribble SM, Prigmore E, Burford DC, Porter KM, Ng BL, Douglas EJ, Fiegler H, Carr P, Kalaitzopoulos D, Clegg S, Sandstrom R, Temple IK, Youings SA, Thomas NS, Dennis NR, Jacobs PA, Crolla JA and Carter NP

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Objective: To describe the systematic analysis of constitutional de novo apparently balanced translocations in patients presenting with abnormal phenotypes, characterise the structural chromosome rearrangements, map the translocation breakpoints, and report detectable genomic imbalances.

    Methods: DNA microarrays were used with a resolution of 1 Mb for the detailed genome-wide analysis of the patients. Array CGH was used to screen for genomic imbalance and array painting to map chromosome breakpoints rapidly. These two methods facilitate rapid analysis of translocation breakpoints and screening for cryptic chromosome imbalance. Breakpoints of rearrangements were further refined (to the level of spanning clones) using fluorescence in situ hybridisation where appropriate.

    Results: Unexpected additional complexity or genome imbalance was found in six of 10 patients studied. The patients could be grouped according to the general nature of the karyotype rearrangement as follows: (A) three cases with complex multiple rearrangements including deletions, inversions, and insertions at or near one or both breakpoints; (B) three cases in which, while the translocations appeared to be balanced, microarray analysis identified previously unrecognised imbalance on chromosomes unrelated to the translocation; (C) four cases in which the translocation breakpoints appeared simple and balanced at the resolution used.

    Conclusions: This high level of unexpected rearrangement complexity, if generally confirmed in the study of further patients, will have an impact on current diagnostic investigations of this type and provides an argument for the more widespread adoption of microarray analysis or other high resolution genome-wide screens for chromosome imbalance and rearrangement.

    Journal of medical genetics 2005;42;1;8-16

  • Annotating non-coding RNAs with Rfam.

    Griffiths-Jones S

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, United Kingdom.

    Non-coding RNA (ncRNA) genes produce a functional RNA product, rather than a translated protein. The range and importance of such genes is only recently apparent, with known ncRNAs participating in a wide range of structural, regulatory, and catalytic roles within the cell. Like protein-coding genes, multiple sequence alignments of families of ncRNAs tell us much about their structure and function, and enable the formulation of statistical models for the detection of related sequences. Rfam is a database of families of ncRNAs, represented by structure-annotated multiple sequence alignments and covariance models.

    Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.] 2005;Chapter 12;Unit 12.5

  • RALEE--RNA ALignment editor in Emacs.

    Griffiths-Jones S

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CAMBS, CB10 1SA, UK. sgj@sanger.ac.uk

    Production of high quality multiple sequence alignments of structured RNAs relies on an iterative combination of manual editing and structure prediction. An essential feature of an RNA alignment editor is the facility to mark-up the alignment based on how it matches a given secondary structure prediction, but few available alignment editors offer such a feature. The RALEE (RNA ALignment Editor in Emacs) tool provides a simple environment for RNA multiple sequence alignment editing, including structure-specific colour schemes, utilizing helper applications for structure prediction and many more conventional editing functions. This is accomplished by extending the commonly used text editor, Emacs, which is available for Linux, most UNIX systems, Windows and Mac OS. AVAILABILITY: The ELISP source code for RALEE is freely available from http://www.sanger.ac.uk/Users/sgj/ralee/ along with documentation and examples. CONTACT: sgj@sanger.ac.uk

    Bioinformatics (Oxford, England) 2005;21;2;257-9

  • Rfam: annotating non-coding RNAs in complete genomes.

    Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR and Bateman A

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. sgj@sanger.ac.uk

    Rfam is a comprehensive collection of non-coding RNA (ncRNA) families, represented by multiple sequence alignments and profile stochastic context-free grammars. Rfam aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences. The data provide the first glimpses of conservation of multiple ncRNA families across a wide taxonomic range. A small number of large families are essential in all three kingdoms of life, with large numbers of smaller families specific to certain taxa. Recent improvements in the database are discussed, together with challenges for the future. Rfam is available on the Web at http://www.sanger.ac.uk/Software/Rfam/ and http://rfam.wustl.edu/.

    Nucleic acids research 2005;33;Database issue;D121-4

  • Posterior polymorphous corneal dystrophy in Czech families maps to chromosome 20 and excludes the VSX1 gene.

    Gwilliam R, Liskova P, Filipec M, Kmoch S, Jirsova K, Huckle EJ, Stables CL, Bhattacharya SS, Hardcastle AJ, Deloukas P and Ebenezer ND

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Purpose: Posterior polymorphous corneal dystrophy (PPCD) is an autosomal dominant disorder, affecting both the corneal endothelium and Descemet's membrane. In the Czech Republic, PPCD is one of the most prevalent corneal dystrophies. The purpose of this study was to determine the chromosomal locus of PPCD in two large Czech families, by using linkage analysis.

    Methods: Linkage analysis was performed on 52 members of two Czech families with PPCD and polymorphic microsatellite markers and lod scores were calculated. The candidate gene VSX1 was also screened for mutations.

    Results: Significant lod scores were obtained with microsatellite markers on chromosome 20. Linkage analysis delineated the Czech PPCD locus to a 2.7-cM locus on chromosome 20, region p11.2, between flanking markers D20S48 and D20S139, which excluded VSX1 as the disease-causing gene in both families. In addition, the exclusion of VSX1 was confirmed by sequence analysis.

    Conclusions: This study reports the localization of PPCD in patients of Czech origin to chromosome 20 at p11.2. Linkage data and sequence analysis exclude VSX1 as causative of PPCD in two Czech families. This refined locus for PPCD overlaps the congenital hereditary endothelial dystrophy (CHED1) disease interval, and it is possible that these corneal dystrophies are allelic.

    Funded by: Wellcome Trust

    Investigative ophthalmology & visual science 2005;46;12;4480-4

  • A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses.

    Hall N, Karras M, Raine JD, Carlton JM, Kooij TW, Berriman M, Florens L, Janssen CS, Pain A, Christophides GK, James K, Rutherford K, Harris B, Harris D, Churcher C, Quail MA, Ormond D, Doggett J, Trueman HE, Mendoza J, Bidwell SL, Rajandream MA, Carucci DJ, Yates JR, Kafatos FC, Janse CJ, Barrell B, Turner CM, Waters AP and Sinden RE

    Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK. nhall@tigr.org

    Plasmodium berghei and Plasmodium chabaudi are widely used model malaria species. Comparison of their genomes, integrated with proteomic and microarray data, with the genomes of Plasmodium falciparum and Plasmodium yoelii revealed a conserved core of 4500 Plasmodium genes in the central regions of the 14 chromosomes and highlighted genes evolving rapidly because of stage-specific selective pressures. Four strategies for gene expression are apparent during the parasites' life cycle: (i) housekeeping; (ii) host-related; (iii) strategy-specific related to invasion, asexual replication, and sexual development; and (iv) stage-specific. We observed posttranscriptional gene silencing through translational repression of messenger RNA during sexual development, and a 47-base 3' untranslated region motif is implicated in this process.

    Science (New York, N.Y.) 2005;307;5706;82-6

  • Global effects on gene expression in fission yeast by silencing and RNA interference machineries.

    Hansen KR, Burns G, Mata J, Volpe TA, Martienssen RA, Bähler J and Thon G

    Department of Genetics, Institute of Molecular Biology, University of Copenhagen, Øster Farimagsgade 2A, Copenhagen 1353 K, Denmark.

    Histone modifications influence gene expression in complex ways. The RNA interference (RNAi) machinery can repress transcription by recruiting histone-modifying enzymes to chromatin, although it is not clear whether this is a general mechanism for gene silencing or whether it requires repeated sequences such as long terminal repeats (LTRs). We analyzed the global effects of the Clr3 and Clr6 histone deacetylases, the Clr4 methyltransferase, the zinc finger protein Clr1, and the RNAi proteins Dicer, RdRP, and Argonaute on the transcriptome of Schizosaccharomyces pombe (fission yeast). The clr mutants derepressed similar subsets of genes, many of which also became transcriptionally activated in cells that were exposed to environmental stresses such as nitrogen starvation. Many genes that were repressed by the Clr proteins clustered in extended regions close to the telomeres. Surprisingly few genes were repressed by both the silencing and RNAi machineries, with transcripts from centromeric repeats and Tf2 retrotransposons being notable exceptions. We found no correlation between repression by RNAi and proximity to LTRs, and the wtf family of repeated sequences seems to be repressed by histone deacetylation independent of RNAi. Our data indicate that the RNAi and Clr proteins show only a limited functional overlap and that the Clr proteins play more global roles in gene silencing.

    Funded by: Cancer Research UK: A6517; NIGMS NIH HHS: GM067014; Wellcome Trust: 077118

    Molecular and cellular biology 2005;25;2;590-601

  • SCF(Pof1)-ubiquitin and its target Zip1 transcription factor mediate cadmium response in fission yeast.

    Harrison C, Katayama S, Dhut S, Chen D, Jones N, Bähler J and Toda T

    Laboratory of Cell Regulation, Lincoln's Inn Fields Laboratories, Cancer Research UK, London Research Institute, London, UK.

    Ubiquitin-dependent proteolysis regulates gene expression in many eukaryotic systems. Pof1 is an essential fission yeast F-box protein that is homologous to budding yeast Met30. Temperature-sensitive pof1 mutants display acute growth arrest with small cell size. Extragenic suppressor analysis identified Zip1, a bZIP (basic leucine zipper) transcription factor, as a target for Pof1. We show Zip1 is stabilized in pof1 mutants, Pof1 binds only phosphorylated forms of Zip1, and Zip1 is ubiquitylated in vivo, indicating that Zip1 is a substrate of SCF(Pof1). Genome-wide DNA microarray assay shows that many cadmium-induced genes are under the control of Zip1, suggesting Zip1 plays a role in cadmium response. Consistently, zip1 mutants are hypersensitive to cadmium and, unlike wild type, lose cell viability under this stress. Intriguingly, cadmium exposure results in upregulation of Zip1 levels and leads wild-type cells to growth arrest with reduced cell size, reminiscent of pof1 phenotypes. Our results indicate that Zip1 mediates growth arrest in cadmium response, which is essential to maintain viability. Normally growing cells prevent this response through constitutive ubiquitylation and degradation of Zip1 via SCF(Pof1).

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    The EMBO journal 2005;24;3;599-610

  • Typhoid and paratyphoid fever.

    Hasan R, Cooke FJ, Nair S, Harish BN and Wain J

    Lancet 2005;366;9497;1603-4

  • Two new mouse mutants with vestibular defects that map to the highly mutable locus on chromosome 4.

    Hawker K, Fuchs H, Angelis MH and Steel KP

    MRC Institute of Hearing Research, Nottingham, UK.

    The purpose of this study was to characterise two new mouse mutants, carousel and whirligig. Both were derived from a large-scale mutagenesis programme which screened for dominantly inherited mutations that cause hearing impairments and balance defects. Genetic mapping placed both mutations on the proximal region of chromosome 4. Paint-filling and clearing techniques revealed abnormalities of the lateral semicircular canal. Scanning electron microscopy showed increased numbers of outer and inner hair cells in the apical region of the organ of Corti. The behavioural, genetic, and morphological characteristics lead us to the conclusion that both mutants are probably alleles of seven previously identified mutants which all map to proximal chromosome 4 and share similar defects of the lateral semicircular canal. We suggest that this region may be particularly susceptible to ENU mutagenesis independent of genetic background.

    International journal of audiology 2005;44;3;171-7

  • Beta1,3-N-acetylglucosaminyltransferase 1 glycosylation is required for axon pathfinding by olfactory sensory neurons.

    Henion TR, Raitcheva D, Grosholz R, Biellmann F, Skarnes WC, Hennet T and Schwarting GA

    Shriver Center, Waltham, Massachusetts 02452, USA.

    During embryonic development, axons from sensory neurons in the olfactory epithelium (OE) extend into the olfactory bulb (OB) where they synapse with projection neurons and form glomerular structures. To determine whether glycans play a role in these processes, we analyzed mice deficient for the glycosyltransferase beta1,3-N-acetylglucosaminyltransferase 1 (beta3GnT1), a key enzyme in lactosamine glycan synthesis. Terminal lactosamine expression, as shown by immunoreactivity with the monoclonal antibody 1B2, is dramatically reduced in the neonatal null OE. Postnatal beta3GnT1-/- mice exhibit severely disorganized OB innervation and defective glomerular formation. Beginning in embryonic development, specific subsets of odorant receptor-expressing neurons are progressively lost from the OE of null mice, which exhibit a postnatal smell perception deficit. Axon guidance errors and increased neuronal cell death result in an absence of P2, I7, and M72 glomeruli, indicating a reduction in the repertoire of odorant receptor-specific glomeruli. By approximately 2 weeks of age, lactosamine is unexpectedly reexpressed in sensory neurons of null mice through a secondary pathway, which is accompanied by the regrowth of axons into the OB glomerular layer and the return of smell perception. Thus, both neonatal OE degeneration and the postnatal regeneration are lactosamine dependent. Lactosamine expression in beta3GnT1-/- mice is also reduced in pheromone-receptive vomeronasal neurons and dorsal root ganglion cells, suggesting that beta3GnT1 may perform a conserved function in multiple sensory systems. These results reveal an essential role for lactosamine in sensory axon pathfinding and in the formation of OB synaptic connections.

    Funded by: NIDCD NIH HHS: DC00953, DC06496

    The Journal of neuroscience : the official journal of the Society for Neuroscience 2005;25;8;1894-903

  • A feast of protozoan genomes.

    Hertz-Fowler C, Berriman M and Pain A

    Nature reviews. Microbiology 2005;3;9;670-1

  • Genomics in C. elegans: so many genes, such a little worm.

    Hillier LW, Coulson A, Murray JI, Bao Z, Sulston JE and Waterston RH

    Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri 63108, USA.

    The Caenorhabditis elegans genome sequence is now complete, fully contiguous telomere to telomere and totaling 100,291,840 bp. The sequence has catalyzed the collection of systematic data sets and analyses, including a curated set of 19,735 protein-coding genes--with >90% directly supported by experimental evidence--and >1300 noncoding RNA genes. High-throughput efforts are under way to complete the gene sets, along with studies to characterize gene expression, function, and regulation on a genome-wide scale. The success of the worm project has had a profound effect on genome sequencing and on genomics more broadly. We now have a solid platform on which to build toward the lofty goal of a true molecular understanding of worm biology with all its implications including those for human health.

    Funded by: Wellcome Trust

    Genome research 2005;15;12;1651-60

  • Facilitating genome navigation: survey sequencing and dense radiation-hybrid gene mapping.

    Hitte C, Madeoy J, Kirkness EF, Priat C, Lorentzen TD, Senger F, Thomas D, Derrien T, Ramirez C, Scott C, Evanno G, Pullar B, Cadieu E, Oza V, Lourgant K, Jaffe DB, Tacher S, Dréano S, Berkova N, André C, Deloukas P, Fraser C, Lindblad-Toh K, Ostrander EA and Galibert F

    CNRS, UMR 6061, Génétique et développement, Faculté de Médecine, Rennes, France.

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences lack long-range continuity and provide only a fragmented view of a genome. Here we show the usefulness of combining survey sequencing with dense radiation-hybrid (RH) maps for extracting maximum comparative genome information from model organisms. Based on results from the canine system, we propose that from now on all low-pass sequencing projects should be accompanied by a dense, gene-based RH map-construction effort to extract maximum information from the genome with a marginal extra cost.

    Nature reviews. Genetics 2005;6;8;643-8

  • Food for thought.

    Holden M, Rajandream MA and Bentley S

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    Nature reviews. Microbiology 2005;3;12;912-3

  • Microbial mariners.

    Holden M, Thomson N and Bentley S

    Nature reviews. Microbiology 2005;3;10;748-9

  • Use of paired serum samples for serodiagnosis of typhoid fever.

    House D, Chinh NT, Diep TS, Parry CM, Wain J, Dougan G, White NJ, Hien TT and Farrar JJ

    Centre for Molecular Microbiology and Infection, Department of Biological Sciences, Imperial College London, South Kensington, UK. d.house@ic.ac.uk

    Using an enzyme-linked immunosorbent assay we demonstrate that, in adult patients with typhoid fever, the sensitivity of a serological test based on the detection of anti-lipopolysaccharide immunoglobulin G is increased when used with paired serum samples taken 1 week apart.

    Funded by: Wellcome Trust

    Journal of clinical microbiology 2005;43;9;4889-90

  • Resin tissue microarrays: a universal format for immunohistochemistry.

    Howat WJ, Warford A, Mitchell JN, Clarke KF, Conquer JS and McCafferty J

    Atlas of Protein Expression Project, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, UK. wjh@sanger.ac.uk

    Tissue microarray (TMA) technology allows the miniaturization and characterization of multiple tissue samples on a single slide and commonly uses formalin-fixed paraffin-embedded (FFPE) tissue or acetone-fixed frozen tissue. The former provides good morphology but can compromise antigenicity, whereas the latter provides compromised morphology with good antigenicity. Here, we report the development of TMAs in glycol methacrylate resin, which combine the advantages of both methods in one embedding format. Freshly collected tissues fixed in -20°C acetone or 10% neutral buffered formaldehyde were cored and arrayed into an intermediary medium of 2% agarose before infiltration of the agarose array with glycol methacrylate resin. Acetone-fixed resin TMA demonstrated improved morphology over acetone-fixed frozen TMA, with no loss of antigenicity. Staining for extracellular, cell surface, and nuclear antigens could be realized with monoclonal and polyclonal antibodies as well as with monomeric single-chain Fv preparations. In addition, when compared with FFPE TMA, formalin-fixed tissue in a resin TMA gave enhanced morphology and subcellular detail. Therefore, resin provides a universal format for the construction of TMAs: it improves tissue morphology while retaining antigenicity, allows thin-section preparation, and could be used to replace preparation of frozen and FFPE TMAs for freshly collected tissue.

    The journal of histochemistry and cytochemistry : official journal of the Histochemistry Society 2005;53;10;1189-97

  • Phylogenomic study of the subfamily Caprinae by cross-species chromosome painting with Chinese muntjac paints.

    Huang L, Nie W, Wang J, Su W and Yang F

    Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, The Chinese Academy of Sciences, Kunming, Yunnan, 650223, PR China.

    Chromosomal homologies have been established between the Chinese muntjac (Muntiacus reevesi, MRE, 2n = 46) and five ovine species: wild goat (Capra aegagrus, CAE, 2n = 60), argali (Ovis ammon, OAM, 2n = 56), snow sheep (Ovis nivicola, ONI, 2n = 52), red goral (Naemorhedus cranbrooki, NCR, 2n = 56) and Sumatra serow (Capricornis sumatraensis, CSU, 2n = 48) by chromosome painting with a set of chromosome-specific probes of the Chinese muntjac. In total, twenty-two Chinese muntjac autosomal painting probes detected thirty-five homologous segments in the genome of each species. The chromosome X probe hybridized to the whole X chromosomes of all ovine species while the chromosome Y probe gave no signal. Our results demonstrate that almost all homologous segments defined by comparative painting show a high degree of conservation in G-banding patterns and that each speciation event is accompanied by specific chromosomal rearrangements. The combined analysis of our results and previous cytogenetic and molecular systematic results enables us to map the chromosomal rearrangements onto a phylogenetic tree, thus providing new insights into the karyotypic evolution of these species.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2005;13;4;389-99

  • Transcriptome analysis for the chicken based on 19,626 finished cDNA sequences and 485,337 expressed sequence tags.

    Hubbard SJ, Grafham DV, Beattie KJ, Overton IM, McLaren SR, Croning MD, Boardman PE, Bonfield JK, Burnside J, Davies RM, Farrell ER, Francis MD, Griffiths-Jones S, Humphray SJ, Hyland C, Scott CE, Tang H, Taylor RG, Tickle C, Brown WR, Birney E, Rogers J and Wilson SA

    Faculty of Life Sciences, The University of Manchester, Manchester, M60 1QD, United Kingdom.

    We present an analysis of the chicken (Gallus gallus) transcriptome based on the full insert sequences for 19,626 cDNAs, combined with 485,337 EST sequences. The cDNA data set has been functionally annotated and describes a minimum of 11,929 chicken coding genes, including the sequence for 2260 full-length cDNAs together with a collection of noncoding (nc) cDNAs that have been stringently filtered to remove untranslated regions of coding mRNAs. The combined collection of cDNAs and ESTs describe 62,546 clustered transcripts and provide transcriptional evidence for a total of 18,989 chicken genes, including 88% of the annotated Ensembl gene set. Analysis of the ncRNAs reveals a set that is highly conserved in chickens and mammals, including sequences for 14 pri-miRNAs encoding 23 different miRNAs. The data sets described here provide a transcriptome toolkit linked to physical clones for bioinformaticians and experimental biologists who wish to use chicken systems as a low-cost, accessible alternative to mammals for the analysis of vertebrate development, immunology, and cell biology.

    Genome research 2005;15;1;174-83

  • Ensembl 2005.

    Hubbard T, Andrews D, Caccamo M, Cameron G, Chen Y, Clamp M, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Gilbert J, Hammond M, Herrero J, Hotz H, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Kokocinsci F, London D, Longden I, McVicker G, Melsopp C, Meidl P, Potter S, Proctor G, Rae M, Rios D, Schuster M, Searle S, Severin J, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C and Birney E

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

    The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 7 to 16, with the addition of the six vertebrate genomes of chimpanzee, dog, cow, chicken, tetraodon and frog and the insect genome of honeybee. The majority have been annotated automatically using the Ensembl gene build system, showing its flexibility to reliably annotate a wide variety of genomes. With the increased number of vertebrate genomes, the comparative analysis provided to users has been greatly improved, with new website interfaces allowing annotation of different genomes to be directly compared. The Ensembl software system is being increasingly widely reused in different projects showing the benefits of a completely open approach to software development and distribution.

    Nucleic acids research 2005;33;Database issue;D447-53

  • How homologous recombination generates a mutable genome.

    Hurles M

    Wellcome Trust Sanger Institute, Genome Campus, Cambridge, CB10 1SA, UK. meh@sanger.ac.uk

    Recombination and mutation have traditionally been regarded as independent evolutionary processes: the latter generates variation, which the former reshuffles. Recent studies, however, have suggested that allelic recombination influences the underlying mutation rate, as high mutation rates are inferred in regions of high recombination. Furthermore, recombination between duplicated sequences introduces structural variation into the human genome and facilitates the formation of clustered gene families. Comparisons of whole-genome sequences reveal the expansion of gene family clusters to be an important mode of genome evolution. The negative aspect of this genomic dynamism is the contribution of these rearrangements to genetic diseases.

    Funded by: Wellcome Trust

    Human genomics 2005;2;3;179-86

  • The dual origin of the Malagasy in Island Southeast Asia and East Africa: evidence from maternal and paternal lineages.

    Hurles ME, Sykes BC, Jobling MA and Forster P

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom. meh@sanger.ac.uk

    Linguistic and archaeological evidence about the origins of the Malagasy, the indigenous peoples of Madagascar, points to mixed African and Indonesian ancestry. By contrast, genetic evidence about the origins of the Malagasy has hitherto remained partial and imprecise. We defined 26 Y-chromosomal lineages by typing 44 Y-chromosomal polymorphisms in 362 males from four different ethnic groups from Madagascar and 10 potential ancestral populations in Island Southeast Asia and the Pacific. We also compared mitochondrial sequence diversity in the Malagasy with a manually curated database of 19,371 hypervariable segment I sequences, incorporating both published and unpublished data. We could attribute every maternal and paternal lineage found in the Malagasy to a likely geographic origin. Here, we demonstrate approximately equal African and Indonesian contributions to both paternal and maternal Malagasy lineages. The most likely origin of the Asia-derived paternal lineages found in the Malagasy is Borneo. This agrees strikingly with the linguistic evidence that the languages spoken around the Barito River in southern Borneo are the closest extant relatives of Malagasy languages. As a result of their equally balanced admixed ancestry, the Malagasy may represent an ideal population in which to identify loci underlying complex traits of both anthropological and medical interest.

    Funded by: Wellcome Trust: 057559

    American journal of human genetics 2005;76;5;894-901

  • Isolation and characterization of murine Cds (CDP-diacylglycerol synthase) 1 and 2.

    Inglis-Broadgate SL, Ocaka L, Banerjee R, Gaasenbeek M, Chapple JP, Cheetham ME, Clark BJ, Hunt DM and Halford S

    Institute of Ophthalmology, University College London, 11-43 Bath Street, London, EC1V 9EL, UK.

    Phototransduction in Drosophila is a phosphoinositide-mediated signalling pathway. Phosphatidylinositol 4,5-bisphosphate (PIP2) plays a central role in this process, and its levels are tightly regulated. A photoreceptor-specific form of the enzyme CDP-diacylglycerol synthase (CDS), which catalyzes the formation of CDP-diacylglycerol from phosphatidic acid, is a key regulator of the amount of PIP2 available for signalling. cds mutants develop light-induced retinal degeneration. We report here the isolation and characterization of two murine genes encoding this enzyme, Cds1 and Cds2. The genes encode proteins that are 73% identical and 92% similar but exhibit very different expression patterns. Cds1 shows a very restricted expression pattern but is expressed in the inner segments of the photoreceptors whilst Cds2 shows a ubiquitous pattern of expression. Using fluorescent in situ hybridization we have mapped Cds1 and Cds2 to chromosomes 5E3 and 2G1 respectively. These are regions of synteny with the corresponding human gene localization (4q21 and 20p13). Transient transfection experiments with epitope tagged proteins have also demonstrated that both are associated with the endoplasmic reticulum.

    Funded by: Wellcome Trust

    Gene 2005;356;19-31

  • A haplotype map of the human genome.

    International HapMap Consortium

    Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.

    Funded by: NHGRI NIH HHS: R01 HG001720, R01 HG001720-06

    Nature 2005;437;7063;1299-320

  • The genome of the kinetoplastid parasite, Leishmania major.

    Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M, Sisk E, Rajandream MA, Adlem E, Aert R, Anupama A, Apostolou Z, Attipoe P, Bason N, Bauser C, Beck A, Beverley SM, Bianchettin G, Borzym K, Bothe G, Bruschi CV, Collins M, Cadag E, Ciarloni L, Clayton C, Coulson RM, Cronin A, Cruz AK, Davies RM, De Gaudenzi J, Dobson DE, Duesterhoeft A, Fazelina G, Fosker N, Frasch AC, Fraser A, Fuchs M, Gabel C, Goble A, Goffeau A, Harris D, Hertz-Fowler C, Hilbert H, Horn D, Huang Y, Klages S, Knights A, Kube M, Larke N, Litvin L, Lord A, Louie T, Marra M, Masuy D, Matthews K, Michaeli S, Mottram JC, Müller-Auer S, Munden H, Nelson S, Norbertczak H, Oliver K, O'neil S, Pentony M, Pohl TM, Price C, Purnelle B, Quail MA, Rabbinowitsch E, Reinhardt R, Rieger M, Rinta J, Robben J, Robertson L, Ruiz JC, Rutter S, Saunders D, Schäfer M, Schein J, Schwartz DC, Seeger K, Seyler A, Sharp S, Shin H, Sivam D, Squares R, Squares S, Tosato V, Vogt C, Volckaert G, Wambutt R, Warren T, Wedler H, Woodward J, Zhou S, Zimmermann W, Smith DF, Blackwell JM, Stuart KD, Barrell B and Myler PJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK. alicat@sanger.ac.uk

    Leishmania species cause a spectrum of human diseases in tropical and subtropical regions of the world. We have sequenced the 36 chromosomes of the 32.8-megabase haploid genome of Leishmania major (Friedlin strain) and predict 911 RNA genes, 39 pseudogenes, and 8272 protein-coding genes, of which 36% can be ascribed a putative function. These include genes involved in host-pathogen interactions, such as proteolytic enzymes, and extensive machinery for synthesis of complex surface glycoconjugates. The organization of protein-coding genes into long, strand-specific, polycistronic clusters and lack of general transcription factors in the L. major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryp) genomes suggest that the mechanisms regulating RNA polymerase II-directed transcription are distinct from those operating in other eukaryotes, although the trypanosomatids appear capable of chromatin remodeling. Abundant RNA-binding proteins are encoded in the Tritryp genomes, consistent with active posttranscriptional regulation of gene expression.

    Funded by: NIAID NIH HHS: R01 AI040599, R01 AI053667-04, U01 AI040599-05S1; Wellcome Trust

    Science (New York, N.Y.) 2005;309;5733;436-42

  • Evidence for widespread reticulate evolution within human duplicons.

    Jackson MS, Oliver K, Loveland J, Humphray S, Dunham I, Rocchi M, Viggiano L, Park JP, Hurles ME and Santibanez-Koref M

    Institute of Human Genetics, University of Newcastle upon Tyne, International Centre for Life, Newcastle upon Tyne, United Kingdom. m.s.jackson@ncl.ac.uk

    Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution.

    Funded by: Wellcome Trust

    American journal of human genetics 2005;77;5;824-40

  • Activation of AP-1-dependent transcription by a truncated translation initiation factor.

    Jenkins CC, Mata J, Crane RF, Thomas B, Akoulitchev A, Bähler J and Norbury CJ

    Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, United Kingdom.

    Int6/eIF3e is a highly conserved subunit of eukaryotic translation initiation factor 3 (eIF3) that has also been reported to interact with subunits of the proteasome and the COP9 signalosome. Overexpression of full-length Int6 or a 13-kDa C-terminal fragment, Int6CT, in the fission yeast Schizosaccharomyces pombe causes multidrug resistance that requires the otherwise inessential AP-1 transcription factor Pap1. Here we show for the first time that Int6CT acts to increase the transcriptional activity of Pap1. Microarray hybridization data indicate that Int6CT overexpression resulted in the up-regulation of 67 genes; this expression profile closely matched that of cells overexpressing Pap1. Analysis of the upstream regulatory sequences of these genes showed that the majority contained AP-1 consensus binding sites. Partial defects in ubiquitin-dependent proteolysis have been suggested to confer Pap1-dependent multidrug resistance, but no such defect was seen on Int6CT overexpression. Indeed, none of the previously identified interactions of endogenous Int6 was required for the activation of Pap1 transcription described here. Moreover, Int6CT-induced activation of Pap1-responsive gene expression was independent of the ability of Pap1 to undergo a redox-regulated conformational change which mediates its relocalization to the nucleus and expression of oxidative stress response genes. Int6CT therefore activates Pap1-dependent transcription by a novel mechanism.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Eukaryotic cell 2005;4;11;1840-50

  • Structural relatedness of plant food allergens with specific reference to cross-reactive allergens: an in silico analysis.

    Jenkins JA, Griffiths-Jones S, Shewry PR, Breiteneder H and Mills EN

    Institute of Food Research, Norwich, Cambridge, United Kingdom.

    Background: The body of sequence and structural information on allergens and the sequence analysis of whole plant genomes are facilitating the application of bioinformatic approaches to identifying and defining plant allergens.

    Objective: An in silico approach was used to quantify the distribution of plant food allergen sequences across protein families and to develop and apply a novel means of assessing conserved surface features important for IgE cross-reactivity.

    Methods: Plant food allergen sequences were classified into Pfam families on the basis of sequence homology. Contact surface areas of selected proteins were calculated with MOLMOL by using a 1.4-Å probe, corrected by removing contributions from IgE-inaccessible main chains and side chains forming the ligand binding sites.

    Results: A set of 129 food allergen sequences were classified into only 20 of 3849 possible Pfam families, with 4 families accounting for more than 65% of food allergens. Structural bioinformatic analysis of conserved exterior main chains and amino acid side chains in cross-reactive homologues of Bet v 1 and nonspecific lipid transfer proteins showed higher levels of similarity than shown by simple sequence comparisons. Thus, 75% of the Mal d 1 surface is likely to bind anti-Bet v 1 antibodies, compared with a sequence identity of approximately 56%.

    Conclusion: Most plant food allergens belong to only 4 structural families, indicating that conserved structures and biological activities may play a role in determining or promoting allergenic properties. Structural bioinformatic analysis shows that conservation of 3-dimensional structure should be included in any assessment of potential IgE cross-reactivity in, for example, novel proteins.

    The Journal of allergy and clinical immunology 2005;115;1;163-70

  • Does structural and chemical divergence play a role in precluding undesirable protein interactions?

    Jiménez JL

    Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, United Kingdom. jlj@sanger.ac.uk

    To understand the evolutionary forces establishing, maintaining, breaking, or precluding protein-protein interactions, a comprehensive data set of protein complexes has been analyzed to examine the overlap between protein interfaces and the most conserved or divergent protein surface areas. The most divergent areas tend to be found predominantly away from protein interfaces, although when found at interfaces, they are associated with specific lack of cross-reactivity between close homologues, like in antibody-antigen complexes. Moreover, the amino acid composition of highly variable regions is significantly different from any other protein surfaces. The variable regions present higher structural plasticity as a result of insertions and deletions, and favor charged over hydrophobic residues, a known strategy to minimize aggregation. This suggests that (1) a rapid rate of mutations at these regions might be continuously altering their properties, making difficult the coadaptation, in shape and chemical complementarity, to potential interacting partners; and (2) the existence of some form of selective pressure for variable areas away from interfaces to accumulate charged residues, perhaps as an evolutionary mechanism to increase solubility and minimize undesirable interactions within the crowded cellular environment. Finally, these results are placed into the context of the aberrant oligomerization of sickle-cell anemia hemoglobin and prion proteins.

    Proteins 2005;59;4;757-64

  • Antibodies to the iron uptake ABC transporter lipoproteins PiaA and PiuA promote opsonophagocytosis of Streptococcus pneumoniae.

    Jomaa M, Yuste J, Paton JC, Jones C, Dougan G and Brown JS

    Centre for Biological Sciences, Imperial College London, UK.

    PiaA and PiuA are the lipoprotein components of the Pia and Piu Streptococcus pneumoniae iron uptake ABC transporters and are required for full virulence in mouse models of infection. Active or passive vaccination with recombinant PiuA and PiaA protects mice against invasive S. pneumoniae disease. In this study we have analyzed the antibody responses and mechanism of protection induced by PiuA and PiaA in more detail. For both proteins, two booster vaccinations induced stronger antibody responses in mice than a single or no booster vaccinations, and 5 μg of protein induced similar levels of antibody responses as 20 μg. Immunoglobulin G (IgG) subclass-specific enzyme-linked immunosorbent assays demonstrated that the antibody response to PiuA and PiaA was predominantly IgG1, with induction of only low levels of IgG2a. Anti-PiaA and anti-PiuA polyclonal rabbit antibodies bound to the surface of live S. pneumoniae when assessed by flow cytometry but did not inhibit growth of S. pneumoniae in cation-depleted medium or bacterial susceptibility to the iron-dependent antibiotic streptonigrin. However, anti-PiaA and anti-PiuA did increase complement-independent and -dependent opsonophagocytosis of different serotypes of S. pneumoniae by the human neutrophil cell line HL60. Hence, vaccination with PiaA and PiuA protects against S. pneumoniae infection by inducing antibodies that promote bacterial opsonophagocytosis rather than inhibiting iron transport.

    Infection and immunity 2005;73;10;6852-9

  • Array-CGH analysis of microsatellite-stable, near-diploid bowel cancers and comparison with other types of colorectal carcinoma.

    Jones AM, Douglas EJ, Halford SE, Fiegler H, Gorman PA, Roylance RR, Carter NP and Tomlinson IP

    Molecular and Population Genetics Laboratory, Cancer Research UK, 44 Lincoln's Inn Fields, London WC2A 3PX, UK.

    Microsatellite-stable, near-diploid (MSI-CIN-) colorectal carcinomas have been reported, but it is not clear as to whether these tumours form a discrete group or represent one end of the distribution of MSI-CIN+ cancers. In order to address this question, we screened 23 MSI-CIN- colorectal cancers for gains and losses using array-based comparative genomic hybridization (aCGH) based on large-insert clones at about 1 Mb density. We compared our findings with those from a small set of MSI+CIN+ cancers, and with our reported data from MSI-CIN+ and MSI+CIN- cancers. We found no evidence of any form of genomic instability in MSI-CIN- cancers. At the level of the chromosome arm, the MSI-CIN- cancers had significantly fewer gains and losses than MSI-CIN+ tumours, but more than the MSI+CIN- and MSI+CIN+ lesions. The chromosomal-scale changes found in MSI-CIN- cancers generally involved the same sites as those in MSI-CIN+ tumours, and in both cancer groups, the best predictor of a specific change was the total number of such changes in that tumour. A few chromosomal-scale changes did, however, differ between the MSI-CIN- and MSI-CIN+ pathways. MSI-CIN- cancers showed: low frequencies of gain of 9p and 19p; infrequent loss of 5q and a high frequency of 20p gain. Overall, our data suggested that the MSI-CIN- group is heterogeneous, one type of MSI-CIN- cancer having few (≤6) chromosomal-scale changes and the other with more (≥10) changes resembling MSI-CIN+ cancers. At the level of individual clones, frequent and/or discrete gains or losses were generally located within regions of chromosomal-scale changes in both MSI-CIN- and MSI-CIN+ cancers, and fewer losses and gains were present in MSI-CIN- than MSI-CIN+ tumours. No changes by clone, which were specific to the MSI-CIN- cancers, were found. In addition to indicating differences among the cancer groups, our results also detected over 50 sites (amplifications, potential homozygous deletion and gains or losses which extended over only a few megabases) which might harbour uncharacterized oncogenes or tumour suppressor loci. In conclusion, our data support the suggestion that some MSI-CIN- carcinomas form a qualitatively different group from the other cancer types, and also suggest that the MSI-CIN- group is itself heterogeneous.

    Oncogene 2005;24;1;118-29

  • Nodulation signaling in legumes requires NSP2, a member of the GRAS family of transcriptional regulators.

    Kaló P, Gleason C, Edwards A, Marsh J, Mitra RM, Hirsch S, Jakab J, Sims S, Long SR, Rogers J, Kiss GB, Downie JA and Oldroyd GE

    Departments of Disease and Stress Biology and Molecular Microbiology, John Innes Centre, Norwich NR4 7UH, UK.

    Rhizobial bacteria enter a symbiotic interaction with legumes, activating diverse responses in roots through the lipochito oligosaccharide signaling molecule Nod factor. Here, we show that NSP2 from Medicago truncatula encodes a GRAS protein essential for Nod-factor signaling. NSP2 functions downstream of Nod-factor-induced calcium spiking and a calcium/calmodulin-dependent protein kinase. We show that NSP2-GFP expressed from a constitutive promoter is localized to the endoplasmic reticulum/nuclear envelope and relocalizes to the nucleus after Nod-factor elicitation. This work provides evidence that a GRAS protein transduces calcium signals in plants and provides a possible regulator of Nod-factor-inducible gene expression.

    Science (New York, N.Y.) 2005;308;5729;1786-9

  • Analysis of Campylobacter jejuni capsular loci reveals multiple mechanisms for the generation of structural diversity and the ability to form complex heptoses.

    Karlyshev AV, Champion OL, Churcher C, Brisson JR, Jarrell HC, Gilbert M, Brochu D, St Michael F, Li J, Wakarchuk WW, Goodhead I, Sanders M, Stevens K, White B, Parkhill J, Wren BW and Szymanski CM

    Department of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK.

    We recently demonstrated that Campylobacter jejuni produces a capsular polysaccharide (CPS) that is the major antigenic component of the classical Penner serotyping system distinguishing Campylobacter into >60 groups. Although the wide variety of C. jejuni serotypes are suggestive of structural differences in CPS, the genetic mechanisms of such differences are unknown. In this study we sequenced biosynthetic cps regions, ranging in size from 15 to 34 kb, from selected C. jejuni strains of HS:1, HS:19, HS:23, HS:36, HS:23/36 and HS:41 serotypes. Comparison of the determined cps sequences of the HS:1, HS:19 and HS:41 strains with the sequenced strain, NCTC11168 (HS:2), provides evidence for multiple mechanisms of structural variation including exchange of capsular genes and entire clusters by horizontal transfer, gene duplication, deletion, fusion and contingency gene variation. In contrast, the HS:23, HS:36 and HS:23/36 cps sequences were highly conserved. We report the first detailed structural analysis of 81-176 (HS:23/36) and G1 (HS:1) and refine the previous structural interpretations of the HS:19, HS:23, HS:36 and HS:41 serostrains. For the first time, we demonstrate the commonality and function of a second heptose biosynthetic pathway for Campylobacter CPS independent of the pathway for lipooligosaccharide (LOS) biosynthesis and identify a novel heptosyltransferase utilized by this alternate pathway. Furthermore, we show the retention of two functional heptose isomerases in Campylobacter and the sharing of a phosphatase for both LOS and CPS heptose biosynthesis.

    Molecular microbiology 2005;55;1;90-103

  • A comparison of tagging methods and their tagging space.

    Ke X, Miretti MM, Broxholme J, Hunt S, Beck S, Bentley DR, Deloukas P and Cardon LR

    Wellcome Trust Centre for Human Genetics, University of Oxford, UK. xiayi@well.ox.ac.uk

    Single-nucleotide polymorphism (SNP) tagging is widely used as a way of saving genotyping costs in association studies. A number of different tagging methods have been developed to reduce the number of markers to be genotyped while maintaining power for detecting effects on non-assayed SNPs. How the different methods perform in different settings, the degree to which they overlap and share common tags and how they differ are important questions. We investigated these questions by comparing three widely used tagging methods/algorithms--one haplotype r2-based method, one pair-wise r2-based method and one method which was based on haplotype diversity but focused on major haplotypes. Tagging efficiency was defined as the number of genotyped markers divided by the number of tagging SNPs. Tagging effectiveness was defined as the proportion of un-genotyped or 'hidden' SNPs being detected (having a pair-wise or haplotype r2 with a set of tagging SNPs over a threshold, e.g. haplotype r2 ≥ 0.80). The ENCODE regions genotyped on the HapMap CEPH individuals were examined in this study. Tagging effectiveness was generally poorer for rare SNPs than for common SNPs, for all three tagging methods. Inclusion of rare SNPs into the initial HapMap scheme could enhance the performance of tags on rare hidden SNPs at the expense of increased genotyping cost. At a moderate tagging efficiency, more than 90% of hidden SNPs detected by tagging SNPs selected by one method were also detected by tagging SNPs selected by another method, and this figure could be increased to 100% if tagging efficiency was allowed to drop. These results indicate that the tagging space is highly concordant between different tagging methods, despite the fact that they often involve different sets of tagging SNPs.

    Funded by: NEI NIH HHS: EY-15652

    Human molecular genetics 2005;14;18;2757-67
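
    The two measures defined in the abstract above (tagging efficiency and tagging effectiveness) can be illustrated with a minimal sketch. It covers only the pair-wise r2 case and assumes phased, biallelic haplotypes coded as 0/1; the function names, the input layout and the 0.8 default threshold are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence

# Assumption: each SNP is a sequence of 0/1 alleles, one per phased chromosome.
Alleles = Sequence[int]


def r_squared(a: Alleles, b: Alleles) -> float:
    """Pair-wise r2 between two biallelic SNPs from phased haplotypes."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("SNPs must cover the same non-empty set of chromosomes")
    p_a = sum(a) / n                                  # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                  # frequency of allele 1 at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:                                    # monomorphic SNP: r2 undefined, report 0
        return 0.0
    d = p_ab - p_a * p_b                              # linkage disequilibrium coefficient D
    return d * d / denom


def tagging_efficiency(n_genotyped: int, n_tags: int) -> float:
    """Number of genotyped markers divided by the number of tagging SNPs."""
    return n_genotyped / n_tags


def tagging_effectiveness(hidden: Sequence[Alleles],
                          tags: Sequence[Alleles],
                          threshold: float = 0.8) -> float:
    """Proportion of hidden SNPs with r2 >= threshold to at least one tag."""
    if not hidden:
        raise ValueError("need at least one hidden SNP")
    detected = sum(
        1 for h in hidden
        if any(r_squared(h, t) >= threshold for t in tags)
    )
    return detected / len(hidden)


if __name__ == "__main__":
    # Hypothetical toy data: 8 chromosomes, 1 tag SNP, 3 hidden SNPs.
    tag = [0, 0, 0, 0, 1, 1, 1, 1]
    hidden = [
        [0, 0, 0, 0, 1, 1, 1, 1],   # identical to the tag
        [0, 1, 0, 1, 0, 1, 0, 1],   # statistically independent of the tag
        [1, 1, 1, 1, 0, 0, 0, 0],   # perfectly anti-correlated with the tag
    ]
    print(tagging_efficiency(n_genotyped=10, n_tags=4))
    print(tagging_effectiveness(hidden, [tag]))
```

    Because r2 is a squared correlation, a hidden SNP that is perfectly anti-correlated with a tag counts as detected. The haplotype r2 variant compared in the paper would need a multi-marker calculation that this sketch does not attempt.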

  • Gene arrays at Pneumocystis carinii telomeres.

    Keely SP, Renauld H, Wakefield AE, Cushion MT, Smulian AG, Fosker N, Fraser A, Harris D, Murphy L, Price C, Quail MA, Seeger K, Sharp S, Tindal CJ, Warren T, Zuiderwijk E, Barrell BG, Stringer JR and Hall N

    Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati, Cincinnati, Ohio 45267, USA.

    In the fungus Pneumocystis carinii, at least three gene families (PRT1, MSR, and MSG) have the potential to generate high-frequency antigenic variation, which is likely to be a strategy by which this parasitic fungus is able to prolong its survival in the rat lung. Members of these gene families are clustered at chromosome termini, a location that fosters recombination, which has been implicated in selective expression of MSG genes. To gain insight into the architecture, evolution, and regulation of these gene clusters, six telomeric segments of the genome were sequenced. Each of the segments began with one or more unique genes, after which were members of different gene families, arranged in a head-to-tail array. The three-gene repeat PRT1-MSR-MSG was common, suggesting that duplications of these repeats have contributed to expansion of all three families. However, members of a gene family in an array were no more similar to one another than to members in other arrays, indicating rapid divergence after duplication. The intergenic spacers were more conserved than the genes and contained sequence motifs also present in subtelomeres, which in other species have been implicated in gene expression and recombination. Long mononucleotide tracts were present in some MSR genes. These unstable sequences can be expected to suffer frequent frameshift mutations, providing P. carinii with another mechanism to generate antigen variation.

    Funded by: FIC NIH HHS: TW01200-02; NIAID NIH HHS: R01AI36701, R01AI44651; Wellcome Trust

    Genetics 2005;170;4;1589-600

  • Sox2 is required for sensory organ development in the mammalian inner ear.

    Kiernan AE, Pelling AL, Leung KK, Tang AS, Bell DM, Tease C, Lovell-Badge R, Steel KP and Cheah KS

    MRC Institute of Hearing Research, University of Nottingham, Nottingham NG7 2RD, UK.

    Sensory hair cells and their associated non-sensory supporting cells in the inner ear are fundamental for hearing and balance. They arise from a common progenitor, but little is known about the molecular events specifying this cell lineage. We recently identified two allelic mouse mutants, light coat and circling (Lcc) and yellow submarine (Ysb), that show hearing and balance impairment. Lcc/Lcc mice are completely deaf, whereas Ysb/Ysb mice are severely hearing impaired. We report here that inner ears of Lcc/Lcc mice fail to establish a prosensory domain and neither hair cells nor supporting cells differentiate, resulting in a severe inner ear malformation, whereas the sensory epithelium of Ysb/Ysb mice shows abnormal development with disorganized and fewer hair cells. These phenotypes are due to the absence (in Lcc mutants) or reduced expression (in Ysb mutants) of the transcription factor SOX2, specifically within the developing inner ear. SOX2 continues to be expressed in the inner ears of mice lacking Math1 (also known as Atoh1 and HATH1), a gene essential for hair cell differentiation, whereas Math1 expression is absent in Lcc mutants, suggesting that Sox2 acts upstream of Math1.

    Funded by: Medical Research Council: MC_U117562207

    Nature 2005;434;7036;1031-5

  • Genome-wide screening of genomic alterations and their clinicopathologic implications in non-small cell lung cancers.

    Kim TM, Yim SH, Lee JS, Kwon MS, Ryu JW, Kang HM, Fiegler H, Carter NP and Chung YJ

    Department of Microbiology, College of Medicine, Catholic University of Korea, Socho-gu, Seoul, Korea.

    Purpose: Although many genomic alterations have been observed in lung cancer, their clinicopathologic significance has not been thoroughly investigated. This study screened the genomic aberrations across the whole genome of non-small cell lung cancer cells at high resolution and investigated their clinicopathologic implications.

    Experimental design: One-megabase resolution array comparative genomic hybridization was applied to 29 squamous cell carcinomas and 21 adenocarcinomas of the lung. Tumor and normal tissues were microdissected and the extracted DNA was used directly for hybridization without genomic amplification. The recurrent genomic alterations were analyzed for their association with the clinicopathologic features of lung cancer.

    Results: Overall, 36 amplicons, 3 homozygous deletions, and 17 minimally altered regions common to many lung cancers were identified. Among them, genomic changes on 13q21, 1p32, Xq, and Yp were found to be significantly associated with clinical features such as age, stage, and disease recurrence. Kaplan-Meier survival analysis revealed that genomic changes on 10p, 16q, 9p, 13q, 6p21, and 19q13 were associated with poor survival. Multivariate analysis showed that alterations on 6p21, 7p, 9q, and 9p remained as independent predictors of poor outcome. In addition, significant correlations were observed for three pairs of minimally altered regions (19q13 and 6p21, 19p13 and 19q13, and 8p12 and 8q11), which indicated their possible collaborative roles.

    Conclusions: These results show that our approach is robust for high-resolution mapping of genomic alterations. The novel genomic alterations identified in this study, along with their clinicopathologic implications, would be useful to elucidate the molecular mechanisms of lung cancer and to identify reliable biomarkers for clinical application.

    Clinical cancer research : an official journal of the American Association for Cancer Research 2005;11;23;8235-42

  • Intercalated cell H+/OH- transporter expression is reduced in Slc26a4 null mice.

    Kim YH, Verlander JW, Matthews SW, Kurtz I, Shin W, Weiner ID, Everett LA, Green ED, Nielsen S and Wall SM

    Department of Medicine, Emory University, Atlanta, Georgia, USA.

    Slc26a4 (Pds) encodes pendrin, a Cl-/HCO3- exchanger expressed in the apical region of type B and non-A, non-B cells, which mediates secretion of OH- equivalents. Thus genetic disruption of Slc26a4 leads to systemic alkalosis in some treatment models. However, humans and mice with genetic disruption of Slc26a4 have normal acid-base balance under basal conditions. Thus we asked: 1) Is net acid excretion altered in Slc26a4-/- mice under basal conditions? 2) In the absence of pendrin-mediated OH- secretion, are increases in intracellular and systemic pH minimized through changes in intercalated cell subtype abundance or intercalated cell H+/OH- transporter expression? To answer these questions, net acid excretion and H+/OH- transporter expression were examined in Slc26a4-/- and Slc26a4+/+ mice using balance studies, immunolocalization, and immunoblotting. Excretion of ammonium, titratable acid, and citrate were the same in Slc26a4 null and wild-type mice. However, urinary pH and Pco2 were much lower in Slc26a4 null relative to wild-type mice due to reduced urinary buffering of secreted H+ by HCO3-. Abundance of non-A, but not type A intercalated cells, was reduced within the cortical collecting ducts of Slc26a4 null mice. Moreover, kidneys from Slc26a4 null mice had reduced H+-ATPase, NBC3 and RhBG total protein expression, particularly within type B and non-A, non-B intercalated cells, although RhCG protein expression was unchanged. Reduced intercalated cell H+/OH- transporter expression is observed in Slc26a4 null mice, which likely attenuates the rise in intracellular and systemic pH expected with genetic disruption of Slc26a4.

    Funded by: NIDDK NIH HHS: DK-52935

    American journal of physiology. Renal physiology 2005;289;6;F1262-72

  • Myosin VI is required for normal retinal function.

    Kitamoto J, Libby RT, Gibbs D, Steel KP and Williams DS

    Department of Pharmacology, UCSD School of Medicine, La Jolla, CA 92093-0912, USA.

    Different unconventional myosins have been shown to play important roles in sensory function, including vision. We investigated the role of myosin VI by examining the retinas of mice carrying a null mutation in the myosin VI gene. Myosin VI was found to be present in the photoreceptor and RPE cells of normal retinas. In the absence of myosin VI, the amplitudes of the a- and b-waves of the electroretinogram were reduced, although there was no photoreceptor cell loss and retinal anatomy appeared normal. Our results indicate that myosin VI is required in photoreceptor cells for normal retinal electrophysiology.

    Funded by: NEI NIH HHS: EY12598, R01 EY007042-25

    Experimental eye research 2005;81;1;116-20

  • Isolation, X location and activity of the marsupial homologue of SLC16A2, an XIST-flanking gene in eutherian mammals.

    Koina E, Wakefield MJ, Walcher C, Disteche CM, Whitehead S, Ross M and Marshall Graves JA

    ARC Centre for Kangaroo Genomics, Research School of Biological Sciences, The Australian National University, Canberra, ACT 0200, Australia. edda.koina@anu.edu.au

    X chromosome inactivation (XCI) achieves dosage compensation between males and females for most X-linked genes in eutherian mammals. It is a whole-chromosome effect under the control of the XIST locus, although some genes escape inactivation. Marsupial XCI differs from the eutherian process, implying fundamental changes in the XCI mechanism during the evolution of the two lineages. There is no direct evidence for the existence of a marsupial XIST homologue. XCI has been studied for only a handful of genes in any marsupial, and none in the model kangaroo Macropus eugenii (the tammar wallaby). We have therefore studied the sequence, location and activity of a gene SLC16A2 (solute carrier, family 16, class A, member 2) that flanks XIST on the human and mouse X chromosomes. A BAC clone containing the marsupial SLC16A2 was mapped to the end of the long arm of the tammar X chromosome and used in RNA FISH experiments to determine whether one or both loci are transcribed in female cells. In male and female cells, only a single signal was found, indicating that the marsupial SLC16A2 gene is silenced on the inactivated X.

    Funded by: NIGMS NIH HHS: GM 61948, R01 GM061948-01

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2005;13;7;687-98

  • FACT--a framework for the functional interpretation of high-throughput experiments.

    Kokocinski F, Delhomme N, Wrobel G, Hummerich L, Toedt G and Lichter P

    Molecular Genetics, Deutsches Krebsforschungszentrum, 69115 Heidelberg, Germany. F.Kokocinski@factweb.de

    Background: Interpreting the results of high-throughput experiments, such as those obtained from DNA-microarrays, is often a time-consuming task due to the high number of data-points that need to be analyzed in parallel. Which of the possible approaches for the functional analysis will be the most informative is usually unknown beforehand and has to be established through extensive testing.

    Results: To address this problem, we have developed the Flexible Annotation and Correlation Tool (FACT). FACT allows for detection of important patterns in large data sets by simplifying the integration of heterogeneous data sources and the subsequent application of different algorithms for statistical evaluation or visualization of the annotated data. The system is constantly extended to include additional annotation data and comparison methods.

    Conclusion: FACT serves as a highly flexible framework for the explorative analysis of large genomic and proteomic result sets. The program can be used online; open source code and supplementary information are available at http://www.factweb.de.

    BMC bioinformatics 2005;6;161

  • Y-chromosomal STR haplotypes and their applications to forensic and population studies in east Asia.

    Kwak KD, Jin HJ, Shin DJ, Kim JM, Roewer L, Krawczak M, Tyler-Smith C and Kim W

    Department of Biological Sciences, Dankook University, Cheonan, 330-714, South Korea.

    We have analyzed 11 Y-STR loci (DYS19, the two DYS385 loci, DYS388, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DXYS156Y) in 700 males from ten ethnic groups in east Asia in order to evaluate their usefulness for forensic and population genetic studies. A total of 644 different haplotypes were identified, among which 603 (86.14%) were individual-specific. The haplotype diversity averaged over all populations was 0.9997; using only the nine Y-STRs comprising the "minimal haplotype" (excluding DYS388 and DXYS156Y) it was 0.9996, a value similar to that found in 1924 samples from other Asian populations (0.9996; Lessig et al. Legal Medicine 5(2003) 160-163), and slightly higher than in European populations (0.9976; n=11,610; Roewer et al. For Sci International (2001) 118:103-111). All of the individual east Asian populations examined here had high haplotype diversity (≥0.997), except for the Mongolians (0.992) and Manchurians (0.960). The most frequent haplotype identified by the nine markers was present at only 1% (7/700). Population comparisons based on Phi(ST) or rho genetic distance measures revealed clustering according to the traditional northeast-southeast distinction, but with exceptions. For example, the Yunnan population from southern China lay among the northern populations, possibly reflecting recent migration, while the Korean population, traditionally considered northern, lay at the boundary between northern and southern populations. An admixture estimate suggested 55(51-59)% northern, 45(41-49)% southern contribution to the Koreans, illustrating the complexity of the genetic history of this region.

    International journal of legal medicine 2005;119;4;195-201
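
    The haplotype diversity figures quoted in the entry above can be illustrated with a short sketch. The abstract does not say which estimator was used; the code assumes Nei's unbiased estimator, H = n/(n-1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n individuals. The function names and the toy haplotype labels are hypothetical, and no attempt is made to reproduce the published values.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity (assumed estimator, not confirmed by the abstract)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def singleton_fraction(haplotypes):
    """Fraction of individuals whose haplotype is seen exactly once in the sample."""
    counts = Counter(haplotypes)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(haplotypes)


# Toy data (hypothetical): each string stands for one male's multi-locus haplotype.
sample = ["h1", "h1", "h2", "h3"]
diversity = haplotype_diversity(sample)
unique_share = singleton_fraction(sample)
```

    The second helper mirrors the way the abstract reports individual-specific haplotypes as a share of all males typed (603 of 700).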

  • Analysis of ovarian cancer cell lines using array-based comparative genomic hybridization.

    Lambros MB, Fiegler H, Jones A, Gorman P, Roylance RR, Carter NP and Tomlinson IP

    Molecular and Population Genetics Laboratory, Cancer Research UK, London, UK.

    In this study, 23 ovarian cancer cell lines were screened using array-comparative genomic hybridization (aCGH) based on large-insert clones at about 1 Mb density from throughout the genome. The most frequent recurrent changes at the level of the chromosome arm were loss of chromosome 4 or 4q, loss of 18q and gain of 20 or 20q; other recurrent changes included losses of 6q, 8p, 9p, 11p, 15q, 16q, 17p, and 22q, and gain of 7q. Losses of 4q and 18q occurred together more often than expected. Evidence was found for two types of ovarian cancer, one typically near-triploid and characterized by a generally higher frequency of chromosomal changes (especially losses of 4p, 4q, 13q, 15q, 16p, 16q, 18p and 18q), and the other typically near-diploid/tetraploid and with fewer changes overall, but with relatively high frequencies of 9p loss, 9q gain, and 20p gain. Multiple novel changes (amplifications, homozygous deletions, discrete regions of gain or loss, small overlapping regions of change and frequently changed clones) were also detected, each of which might indicate the locations of oncogenes or tumour suppressor loci. For example, at least two regions of amplification on chromosome 11q13, one including cyclin D1 and the other the candidate oncogene PAK1, were found. Amplification on 11q22 near the progesterone receptor gene and a cluster of matrix metalloproteinase loci was also detected. Other potential oncogenes, which mapped to regions found by this study, included cyclin E and PIK3C2G. Candidate tumour suppressor genes in regions of loss included CDKN2C, SMAD4-interacting protein and RASSF2.

    The Journal of pathology 2005;205;1;29-40

  • The complete genome sequence of Francisella tularensis, the causative agent of tularemia.

    Larsson P, Oyston PC, Chain P, Chu MC, Duffield M, Fuxelius HH, Garcia E, Hälltorp G, Johansson D, Isherwood KE, Karp PD, Larsson E, Liu Y, Michell S, Prior J, Prior R, Malfatti S, Sjöstedt A, Svensson K, Thompson N, Vergez L, Wagg JK, Wren BW, Lindler LE, Andersson SG, Forsman M and Titball RW

    Swedish Defence Research Agency, SE-901 82 Umeå, Sweden.

    Francisella tularensis is one of the most infectious human pathogens known. In the past, both the former Soviet Union and the US had programs to develop weapons containing the bacterium. We report the complete genome sequence of a highly virulent isolate of F. tularensis (1,892,819 bp). The sequence uncovers previously uncharacterized genes encoding type IV pili, a surface polysaccharide and iron-acquisition systems. Several virulence-associated genes were located in a putative pathogenicity island, which was duplicated in the genome. More than 10% of the putative coding sequences contained insertion-deletion or substitution mutations and seemed to be deteriorating. The genome is rich in IS elements, including IS630 Tc-1 mariner family transposons, which are not expected in a prokaryote. We used a computational method for predicting metabolic pathways and found an unexpectedly high proportion of disrupted pathways, explaining the fastidious nutritional requirements of the bacterium. The loss of biosynthetic pathways indicates that F. tularensis is an obligate host-dependent bacterium in its natural life cycle. Our results have implications for our understanding of how highly virulent human pathogens evolve and will expedite strategies to combat them.

    Nature genetics 2005;37;2;153-9

  • Genetically indistinguishable SNPs and their influence on inferring the location of disease-associated variants.

    Lawrence R, Evans DM, Morris AP, Ke X, Hunt S, Paolucci M, Ragoussis J, Deloukas P, Bentley D and Cardon LR

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.

    As part of a recent high-density linkage disequilibrium (LD) study of chromosome 20, we obtained genotypes for approximately 30,000 SNPs at a density of 1 SNP/2 kb on four different population samples (47 CEPH founders; 91 UK unrelateds [unrelated white individuals of western European ancestry]; 97 African Americans; 42 East Asians). We observed that approximately 50% of SNPs had at least one genetically indistinguishable partner; i.e., for every individual considered, their genotype at the first locus was identical to their genotype at the second locus, or in LD terms, the SNPs were in "perfect" LD (r² = 1.0). These "genetically indistinguishable SNPs" (giSNPs) formed into clusters of varying size. The larger the cluster, the greater the tendency to be located within genes and to overlap with giSNP clusters in other population samples. As might be expected for this map density, many giSNPs were located close to one another, thus reflecting local regions of undetected recombination or haplotype blocks. However, approximately 1/3 of giSNP clusters had intermingled, non-indistinguishable SNPs with incomplete LD (D' and r² < 1), sometimes spanning hundreds of kilobases, comprising up to 70 indistinguishable markers and overlapping multiple haplotype blocks. These long-range, nonconsecutive giSNPs have implications for disease gene localization by allelic association as evidence for association at one locus will be indistinguishable from that at another locus, even though both loci may be situated far apart. We describe the distribution of giSNPs on this map of chromosome 20 and illustrate the potential impact they can have on association mapping.

    Funded by: Wellcome Trust

    Genome research 2005;15;11;1503-10
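
    The definition of genetically indistinguishable SNPs in the entry above (identical genotypes in every individual) lends itself to a small illustration. The sketch below is not the authors' pipeline: it assumes genotypes are already coded consistently as integers (for example 0, 1, 2 copies of one allele), ignores missing calls, and does not treat SNPs that match only after swapping allele labels as identical. The function name and SNP identifiers are hypothetical.

```python
from collections import defaultdict


def gisnp_clusters(genotypes):
    """Group SNPs whose genotype vectors are identical across all individuals.

    genotypes: dict mapping SNP id -> sequence of genotype codes, one per
    individual, with individuals in the same order for every SNP.
    Returns a sorted list of clusters (sorted lists of SNP ids) that contain
    at least two SNPs.
    """
    groups = defaultdict(list)
    for snp, calls in genotypes.items():
        # Identical genotype vectors hash to the same key
        groups[tuple(calls)].append(snp)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy data (hypothetical): four SNPs typed in four individuals.
example = {
    "snp1": (0, 1, 2, 1),
    "snp2": (0, 1, 2, 1),
    "snp3": (2, 1, 0, 0),
    "snp4": (0, 1, 2, 1),
}
clusters = gisnp_clusters(example)
```

    In this toy input, snp3 lies between members of the cluster without belonging to it, which is the kind of non-consecutive arrangement the abstract describes.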

  • CIC, a gene involved in cerebellar development and ErbB signaling, is significantly expressed in medulloblastomas.

    Lee CJ, Chan WI and Scotting PJ

    EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, UK.

    In children, the majority of brain tumors arise in the cerebellum. Medulloblastomas, the most common of these, are believed to originate from the granule cell lineage. We have recently identified a mammalian gene, capicua (Cic), the ortholog of a Drosophila gene implicated in c-erbB (Egfr) signaling, which is predominantly expressed during mouse granule cell development. Its expression in medulloblastoma is therefore of particular interest. In the present study the expression of human CIC in medulloblastoma was analyzed. In silico SAGE analysis demonstrated that medulloblastomas exhibited the highest level of CIC expression and expression was most common in tumors of the CNS in general. RT-PCR and in situ hybridization verified the expression of CIC in tumor cells, although the level of expression varied between different medulloblastoma subtypes. The expression of CIC did not correlate with other markers, such as neurofilament, GFAP and Mib-1. In postnatally developing cerebellum, in silico analysis and in situ hybridization both indicated a strong correlation between Cic expression and the maturation profile of cerebellar granule cell precursors. Expression of CIC is therefore a feature shared between immature granule cells and the tumors derived from them. Cic has been implicated as a mediator of ErbB signaling and this pathway has been associated with a poor prognosis for medulloblastomas. Therefore, further analysis of the role of Cic is likely to provide valuable insight into the biology of these tumors. Additionally, study of genes such as CIC should provide objective criteria by which, in combination with other markers and clinical data, to categorize these tumors into subgroups that might allow better allocation into specific treatment regimes.

    Journal of neuro-oncology 2005;73;2;101-8

  • Impairment of the TFIIH-associated CDK-activating kinase selectively affects cell cycle-regulated gene expression in fission yeast.

    Lee KM, Miklos I, Du H, Watt S, Szilagyi Z, Saiz JE, Madabhushi R, Penkett CJ, Sipiczki M, Bähler J and Fisher RP

    Molecular Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA.

    The fission yeast Mcs6-Mcs2-Pmh1 complex, homologous to metazoan Cdk7-cyclin H-Mat1, has dual functions in cell division and transcription: as a partially redundant cyclin-dependent kinase (CDK)-activating kinase (CAK) that phosphorylates the major cell cycle CDK, Cdc2, on Thr-167; and as the RNA polymerase (Pol) II carboxyl-terminal domain (CTD) kinase associated with transcription factor (TF) IIH. We analyzed conditional mutants of mcs6 and pmh1, which activate Cdc2 normally but cannot complete cell division at restrictive temperature and arrest with decreased CTD phosphorylation. Transcriptional profiling by microarray hybridization revealed only modest effects on global gene expression: a one-third reduction in a severe mcs6 mutant after prolonged incubation at 36 °C. In contrast, a small subset of transcripts (approximately 5%) decreased by more than twofold after Mcs6 complex function was compromised. The signature of repressed genes overlapped significantly with those of cell separation mutants sep10 and sep15. Sep10, a component of the Pol II Mediator complex, becomes essential in mcs6 or pmh1 mutant backgrounds. Moreover, transcripts dependent on the forkhead transcription factor Sep1, which are expressed coordinately during mitosis, were repressed in Mcs6 complex mutants, and Mcs6 also interacts genetically with Sep1. Thus, the Mcs6 complex, a direct activator of Cdc2, also influences the cell cycle transcriptional program, possibly through its TFIIH-associated kinase function.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Molecular biology of the cell 2005;16;6;2734-45

  • Systems biology: where it's at in 2005.

    Lehner B, Tischler J and Fraser AG

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    A report on the joint Keystone Symposia on Systems and Biology and Proteomics and Bioinformatics, Keystone, USA, 8-13 April 2005.

    Genome biology 2005;6;8;338

  • Regulation of neural progenitor proliferation and survival by beta1 integrins.

    Leone DP, Relvas JB, Campos LS, Hemmi S, Brakebusch C, Fässler R, Ffrench-Constant C and Suter U

    Institute of Cell Biology, Department of Biology, Swiss Federal Institute of Technology, ETH Hönggerberg, Zürich.

    Neural stem cells give rise to undifferentiated nestin-positive progenitors that undergo extensive cell division before differentiating into neuronal and glial cells. This process is likely to be controlled, at least in part, by instructive cues originating from the extracellular environment. Some of these cues are interpreted by the integrin family of extracellular matrix receptors. Using neurosphere cell cultures as a model system, we show that beta1-integrin signalling plays a crucial role in the regulation of progenitor cell proliferation, survival and migration. Following conditional genetic ablation of the beta1-integrin allele, and consequent loss of beta1-integrin cell surface protein, mutant nestin-positive progenitor cells proliferate less and die in higher numbers than their wild-type counterparts. Mutant progenitor cell migration on different ECM substrates is also impaired. These effects can be partially compensated by the addition of exogenous growth factors. Thus, beta1-integrin signalling and growth factor signalling tightly interact to control the number and migratory capacity of nestin-positive progenitor cells.

    Journal of cell science 2005;118;Pt 12;2589-99

  • Identification of DNA markers for a transmissible Pseudomonas aeruginosa cystic fibrosis strain.

    Lewis DA, Jones A, Parkhill J, Speert DP, Govan JR, Lipuma JJ, Lory S, Webb AK and Mahenthiralingam E

    Cardiff School of Biosciences, Cardiff University, UK.

    A number of transmissible Pseudomonas aeruginosa strains have been identified which potentially constitute an emerging threat to patients with cystic fibrosis (CF). We sought to identify DNA markers that were specific to a transmissible P. aeruginosa CF clone and evaluate these probes on a large collection of genotypically distinct P. aeruginosa strains. Using subtractive DNA hybridization, in combination with analysis using the P. aeruginosa PAO1 genome chip, DNA markers specific for or absent from the Manchester transmissible CF strain (MA) were identified. Five subtractive DNA hybridization markers (MA15, MA18, MA21, MA22, and MA30) were found to be specific to strain MA and were located within a novel 13,318-bp genomic island, designated the MA island. The MA island encoded 18 genes and consisted of two bacteriophage-like regions; one region encoded the MA-specific subtractive hybridization markers, while the other bacteriophage-like region contained a Vibrio cholerae-like toxin gene. Probes MA15, MA18, MA21, MA22, and MA30 were all found to be specific to strain MA when a collection of 141 P. aeruginosa strains was examined by hybridization with each DNA marker. In contrast, a previously isolated DNA marker for the Liverpool transmissible CF strain, PS21, was not found to be specific, detecting two additional strain types in the collection screened. Both the Manchester and Liverpool strain types were not encountered in CF populations outside the United Kingdom. The MA genomic island and Vibrio cholerae-like toxin gene within it constitute novel genetic factors associated with a transmissible P. aeruginosa strain and their role in pathogenesis remains to be determined.

    American journal of respiratory cell and molecular biology 2005;33;1;56-64

  • Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas EJ, Zody MC, Mauceli E, Xie X, Breen M, Wayne RK, Ostrander EA, Ponting CP, Galibert F, Smith DR, DeJong PJ, Kirkness E, Alvarez P, Biagi T, Brockman W, Butler J, Chin CW, Cook A, Cuff J, Daly MJ, DeCaprio D, Gnerre S, Grabherr M, Kellis M, Kleber M, Bardeleben C, Goodstadt L, Heger A, Hitte C, Kim L, Koepfli KP, Parker HG, Pollinger JP, Searle SM, Sutter NB, Thomas R, Webber C, Baldwin J, Abebe A, Abouelleil A, Aftuck L, Ait-Zahra M, Aldredge T, Allen N, An P, Anderson S, Antoine C, Arachchi H, Aslam A, Ayotte L, Bachantsang P, Barry A, Bayul T, Benamara M, Berlin A, Bessette D, Blitshteyn B, Bloom T, Blye J, Boguslavskiy L, Bonnet C, Boukhgalter B, Brown A, Cahill P, Calixte N, Camarata J, Cheshatsang Y, Chu J, Citroen M, Collymore A, Cooke P, Dawoe T, Daza R, Decktor K, DeGray S, Dhargay N, Dooley K, Dooley K, Dorje P, Dorjee K, Dorris L, Duffey N, Dupes A, Egbiremolen O, Elong R, Falk J, Farina A, Faro S, Ferguson D, Ferreira P, Fisher S, FitzGerald M, Foley K, Foley C, Franke A, Friedrich D, Gage D, Garber M, Gearin G, Giannoukos G, Goode T, Goyette A, Graham J, Grandbois E, Gyaltsen K, Hafez N, Hagopian D, Hagos B, Hall J, Healy C, Hegarty R, Honan T, Horn A, Houde N, Hughes L, Hunnicutt L, Husby M, Jester B, Jones C, Kamat A, Kanga B, Kells C, Khazanovich D, Kieu AC, Kisner P, Kumar M, Lance K, Landers T, Lara M, Lee W, Leger JP, Lennon N, Leuper L, LeVine S, Liu J, Liu X, Lokyitsang Y, Lokyitsang T, Lui A, Macdonald J, Major J, Marabella R, Maru K, Matthews C, McDonough S, Mehta T, Meldrim J, Melnikov A, Meneus L, Mihalev A, Mihova T, Miller K, Mittelman R, Mlenga V, Mulrain L, Munson G, Navidi A, Naylor J, Nguyen T, Nguyen N, Nguyen C, Nguyen T, Nicol R, Norbu N, Norbu C, Novod N, Nyima T, Olandt P, O'Neill B, O'Neill K, Osman S, Oyono L, Patti C, Perrin D, Phunkhang P, Pierre F, Priest M, Rachupka A, Raghuraman S, Rameau R, Ray V, Raymond C, Rege F, Rise C, Rogers J, Rogov P, Sahalie J, Settipalli S, Sharpe T, Shea T, Sheehan M, Sherpa N, Shi J, Shih D, Sloan J, Smith C, Sparrow T, Stalker J, Stange-Thomann N, Stavropoulos S, Stone C, Stone S, Sykes S, Tchuinga P, Tenzing P, Tesfaye S, Thoulutsang D, Thoulutsang Y, Topham K, Topping I, Tsamla T, Vassiliev H, Venkataraman V, Vo A, Wangchuk T, Wangdi T, Weiand M, Wilkinson J, Wilson A, Yadav S, Yang S, Yang X, Young G, Yu Q, Zainoun J, Zembek L, Zimmer A and Lander ES

    Broad Institute of Harvard and MIT, 320 Charles Street, Cambridge, Massachusetts 02141, USA. kersli@broad.mit.edu

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

    Nature 2005;438;7069;803-19

  • Shotgun haplotyping: a novel method for surveying allelic sequence variation.

    Lindsay SJ, Bonfield JK and Hurles ME

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Haplotypic sequences contain significantly more information than genotypes of genetic markers and are critical for studying disease association and genome evolution. Current methods for obtaining haplotypic sequences require the physical separation of alleles before sequencing, are time consuming and are not scaleable for large surveys of genetic variation. We have developed a novel method for acquiring haplotypic sequences from long PCR products using simple, high-throughput techniques. This method applies modified shotgun sequencing protocols to sequence both alleles concurrently, with read-pair information allowing the two alleles to be separated during sequence assembly. Although the haplotypic sequences can be assembled manually from the resultant data using pre-existing sequence assembly software, we have devised a novel heuristic algorithm to automate assembly and remove human error. We validated the approach on two long PCR products amplified from the human genome and confirmed the accuracy of our sequences against full-length clones of the same alleles. This method presents a simple high-throughput means to obtain full haplotypic sequences potentially up to 20 kb in length and is suitable for surveying genetic variation even in poorly-characterized genomes as it requires no prior information on sequence variation.

    Funded by: Wellcome Trust

    Nucleic acids research 2005;33;18;e152
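
    The idea in the entry above, that read-pair information lets two alleles be separated, can be sketched as follows. This is a simple majority-vote phasing illustration and not the heuristic assembly algorithm the authors devised, which the abstract does not describe. It assumes heterozygous sites have already been identified, that alleles at each site are coded 0 or 1, and that each fragment (a read or read pair) derives from a single allele. All names and positions are hypothetical.

```python
from collections import defaultdict, deque


def phase_sites(fragments):
    """Phase heterozygous sites from linked observations by majority vote.

    fragments: list of dicts mapping site position -> observed allele (0 or 1).
    Returns a dict site -> allele assigned to haplotype A (haplotype B carries
    the complementary allele), for sites linked to the lowest-numbered site.
    """
    # votes[(s, t)] > 0: s and t mostly show the same allele code on one molecule
    votes = defaultdict(int)
    for frag in fragments:
        sites = sorted(frag)
        for i, s in enumerate(sites):
            for t in sites[i + 1:]:
                votes[(s, t)] += 1 if frag[s] == frag[t] else -1

    neighbours = defaultdict(list)
    for (s, t), v in votes.items():
        if v != 0:  # tied evidence carries no phase information
            neighbours[s].append((t, v > 0))
            neighbours[t].append((s, v > 0))

    if not neighbours:
        return {}
    start = min(neighbours)
    phase = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first propagation of phase along linked sites
        s = queue.popleft()
        for t, same in neighbours[s]:
            if t not in phase:
                phase[t] = phase[s] if same else 1 - phase[s]
                queue.append(t)
    return phase


# Toy data (hypothetical): three heterozygous sites linked by four read pairs.
reads = [{100: 0, 250: 1}, {250: 1, 900: 0}, {100: 1, 250: 0}, {250: 0, 900: 1}]
haplotype_a = phase_sites(reads)
```

    A real implementation would also need to handle sequencing errors, conflicting links and sites that no fragment connects, none of which this sketch attempts.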

  • The genome of the protist parasite Entamoeba histolytica.

    Loftus B, Anderson I, Davies R, Alsmark UC, Samuelson J, Amedeo P, Roncaglia P, Berriman M, Hirt RP, Mann BJ, Nozaki T, Suh B, Pop M, Duchene M, Ackers J, Tannich E, Leippe M, Hofer M, Bruchhaus I, Willhoeft U, Bhattacharya A, Chillingworth T, Churcher C, Hance Z, Harris B, Harris D, Jagels K, Moule S, Mungall K, Ormond D, Squares R, Whitehead S, Quail MA, Rabbinowitsch E, Norbertczak H, Price C, Wang Z, Guillén N, Gilchrist C, Stroup SE, Bhattacharya S, Lohia A, Foster PG, Sicheritz-Ponten T, Weber C, Singh U, Mukherjee C, El-Sayed NM, Petri WA, Clark CG, Embley TM, Barrell B, Fraser CM and Hall N

    TIGR, 9712 Medical Center Drive, Rockville, Maryland 20850, USA. bjloftus@tigr.org

    Entamoeba histolytica is an intestinal parasite and the causative agent of amoebiasis, which is a significant source of morbidity and mortality in developing countries. Here we present the genome of E. histolytica, which reveals a variety of metabolic adaptations shared with two other amitochondrial protist pathogens: Giardia lamblia and Trichomonas vaginalis. These adaptations include reduction or elimination of most mitochondrial metabolic pathways and the use of oxidative stress enzymes generally associated with anaerobic prokaryotes. Phylogenomic analysis identifies evidence for lateral gene transfer of bacterial genes into the E. histolytica genome, the effects of which centre on expanding aspects of E. histolytica's metabolic repertoire. The presence of these genes and the potential for novel metabolic pathways in E. histolytica may allow for the development of new chemotherapeutic agents. The genome encodes a large number of novel receptor kinases and contains expansions of a variety of gene families, including those associated with virulence. Additional genome features include an abundance of tandemly repeated transfer-RNA-containing arrays, which may have a structural function in the genome. Analysis of the genome provides new insights into the workings and genome evolution of a major human pathogen.

    Nature 2005;433;7028;865-8

  • VEGA, the genome browser with a difference.

    Loveland J

    HAVANA Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK. jel@sanger.ac.uk

    The Vertebrate Genome Annotation (Vega) database is a community resource for browsing manual annotation from a variety of vertebrate genomes of finished sequence (http://vega.sanger.ac.uk). Vega is different from other genome browsers as it has a standardised classification of genes which encompasses pseudogenes and non-coding transcripts. The data are manually curated, an approach that is more accurate at identifying splice variants, pseudogenes, poly(A) features, non-coding and complex gene structures and arrangements than current automated methods. The database also contains annotation from regions, not just whole genomes, and displays multiple species annotation (human, mouse, dog and zebrafish) for comparative analysis. Vega encourages community feedback that results in annotation updates and manual annotation of finished vertebrate sequence.

    Briefings in bioinformatics 2005;6;2;189-93

  • Transcriptional adaptation of Shigella flexneri during infection of macrophages and epithelial cells: insights into the strategies of a cytosolic bacterial pathogen.

    Lucchini S, Liu H, Jin Q, Hinton JC and Yu J

    The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

    Shigella flexneri, the etiologic agent of bacillary dysentery, invades epithelial cells as well as macrophages and dendritic cells and escapes into the cytosol soon after invasion. Dissection of the global gene expression profile of the bacterium in its intracellular niche is essential to fully understand the biology of Shigella infection. We have determined the complete gene expression profiles for S. flexneri infecting human epithelial HeLa cells and human macrophage-like U937 cells. Approximately one quarter of the S. flexneri genes showed significant transcriptional adaptation during infection; 929 and 1,060 genes were up- or down-regulated within HeLa cells and U937 cells, respectively. The key S. flexneri virulence genes, ipa-mxi-spa and icsA, were drastically down-regulated during intracellular growth. This theme seems to be common in bacterial infection, because the Ipa-Mxi-Spa-like type III secretion systems were also down-regulated during mammalian cell infection by Salmonella enterica serovar Typhimurium and Escherichia coli O157. The bacteria experienced restricted levels of iron, magnesium, and phosphate in both host cell types, as shown by up-regulation of the sitABCD system, the mgtA gene, and genes of the phoBR regulon. Interestingly, ydeO and other acid-induced genes were up-regulated only in U937 cells and not in HeLa cells, suggesting that the cytosol of U937 cells is acidic. Comparison with the gene expression of intracellular Salmonella serovar Typhimurium, which resides within the Salmonella-containing vacuole, indicated that S. flexneri is exposed to oxidative stress in U937 cells. This work will facilitate functional studies of hundreds of novel intracellularly regulated genes that may be important for the survival and growth strategies of Shigella in the human host.

    Infection and immunity 2005;73;1;88-102

  • PPARGC1A genotype (Gly482Ser) predicts exceptional endurance capacity in European men.

    Lucia A, Gómez-Gallego F, Barroso I, Rabadán M, Bandrés F, San Juan AF, Chicharro JL, Ekelund U, Brage S, Earnest CP, Wareham NJ and Franks PW

    European University of Madrid, Spain.

    Animal and human data indicate a role for the peroxisome proliferator-activated receptor-gamma coactivator 1alpha (PPARGC1A) gene product in the development of maximal oxygen uptake (VO2max), a determinant of endurance capacity, diabetes, and early death. We tested the hypothesis that the frequency of the minor Ser482 allele at the PPARGC1A locus is lower in world-class Spanish male endurance athletes (cases) [n = 104; mean (SD) age: 26.8 (3.8) yr] than in unfit United Kingdom (UK) Caucasian male controls [n = 100; mean (SD) age: 49.3 (8.1) yr]. In cases and controls, the Gly482Ser genotype met Hardy-Weinberg expectations (P > 0.05 in both groups tested separately). Cases had significantly higher VO2max [73.4 (5.7) vs. 29.4 (3.8) ml·kg⁻¹·min⁻¹; P < 0.0001] and were leaner [body mass index: 20.6 (1.5) vs. 27.6 (3.9) kg/m²; P < 0.0001] than controls. In unadjusted χ² analyses, the frequency of the minor Ser482 allele was significantly lower in cases than in controls (29.1 vs. 40.0%; P = 0.01). To assess the possibility that genetic stratification could confound these observations, we also compared Gly482Ser genotype frequencies in Spanish (n = 164) and UK Caucasian men (n = 381) who were unselected for their level of fitness. In these analyses, Ser482 allele frequencies were very similar (36.9% in Spanish vs. 37.5% in UK Caucasians, P = 0.83), suggesting that confounding by genetic stratification is unlikely to explain the association between Gly482Ser genotype and endurance capacity. In summary, our data indicate a role for the Gly482Ser genotype in determining aerobic fitness. This finding has relevance from the perspective of physical performance, but it may also be informative for the targeted prevention of diseases associated with low fitness such as Type 2 diabetes.

    Funded by: Wellcome Trust

    Journal of applied physiology (Bethesda, Md. : 1985) 2005;99;1;344-8
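
    The unadjusted chi-square comparison of allele frequencies mentioned in the entry above can be illustrated with a minimal sketch. It assumes a Pearson chi-square on a 2x2 table of allele counts with one degree of freedom and no continuity correction; the abstract does not specify these details. The function name and the counts are hypothetical and are not the study's data.

```python
import math


def allele_chi_square(minor_cases, total_cases, minor_controls, total_controls):
    """Pearson chi-square (1 d.f., no continuity correction) on allele counts.

    Arguments are allele counts, i.e. two alleles per genotyped individual.
    Returns (chi-square statistic, p-value).
    """
    a, b = minor_cases, total_cases - minor_cases
    c, d = minor_controls, total_controls - minor_controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 minor alleles out of 100 in cases, 50 out of 100 in controls.
chi2, p = allele_chi_square(30, 100, 50, 100)
```

    Only the standard library is used so that the calculation is visible; in practice a statistics package would normally be used instead.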

  • The genetic map and comparative analysis with the physical map of Trypanosoma brucei.

    MacLeod A, Tweedie A, McLellan S, Taylor S, Hall N, Berriman M, El-Sayed NM, Hope M, Turner CM and Tait A

    Wellcome Centre for Molecular Parasitology, Anderson College Complex, University of Glasgow, 56 Dumbarton Road, Glasgow G11 6NU, UK. gvwa08@udcf.gla.ac.uk

    Trypanosoma brucei is the causative agent of African sleeping sickness in humans and contributes to the debilitating disease 'Nagana' in cattle. To date we know little about the genes that determine drug resistance, host specificity, pathogenesis and virulence in these parasites. The availability of the complete genome sequence and the ability of the parasite to undergo genetic exchange have allowed genetic investigations into this parasite and here we report the first genetic map of T. brucei for the genome reference stock TREU 927, comprising 182 markers and 11 major linkage groups that correspond to the 11 previously identified chromosomes. The genetic map provides 90% probability of a marker being 11 cM from any given locus. Its comparison to the available physical map has revealed the average physical size of a recombination unit to be 15.6 Kb/cM. The genetic map coupled with the genome sequence and the ability to undertake crosses presents a new approach to identifying genes relevant to the disease and its prevention in this important pathogen through forward genetic analysis and positional cloning.

    Funded by: Wellcome Trust

    Nucleic acids research 2005;33;21;6688-93

  • Global expression changes resulting from loss of telomeric DNA in fission yeast.

    Mandell JG, Bähler J, Volpe TA, Martienssen RA and Cech TR

    Department of Chemistry and Biochemistry and Howard Hughes Medical Institute, University of Colorado, Boulder, CO 80309-0215, USA. jmandell@colorado.edu

    Background: Schizosaccharomyces pombe cells lacking the catalytic subunit of telomerase (encoded by trt1+) lose telomeric DNA and enter crisis, but rare survivors arise with either circular or linear chromosomes. Survivors with linear chromosomes have normal growth rates and morphology, but those with circular chromosomes have growth defects and are enlarged. We report the global gene-expression response of S. pombe to loss of trt1+.

    Results: Survivors with linear chromosomes had expression profiles similar to cells with native telomeres, whereas survivors with circular chromosomes showed continued upregulation of core environmental stress response (CESR) genes. In addition, survivors with circular chromosomes had altered expression of 51 genes compared to survivors with linear chromosomes, providing an expression signature. S. pombe progressing through crisis displayed two waves of altered gene expression. One coincided with crisis and consisted of around 110 genes, 44% of which overlapped with the CESR. The second was synchronized with the emergence of survivors and consisted of a single class of open reading frames (ORFs) with homology both to RecQ helicases and to dh repeats at centromeres targeted for heterochromatin formation via an RNA interference (RNAi) mechanism. Accumulation of transcript from the ORF was found not only in trt1- cells, but also in dcr1- and ago1- RNAi mutants, suggesting that RNAi may control its expression.

    Conclusions: These results demonstrate a correlation between a state of cellular stress, short telomeres and growth defects in cells with circular chromosomes. A putative new RecQ helicase was expressed as survivors emerged and appears to be transcriptionally regulated by RNAi, suggesting that this mechanism operates at telomeres.

    Funded by: Cancer Research UK: A6517; NIGMS NIH HHS: R01GM067014; Wellcome Trust: 077118

    Genome biology 2005;6;1;R1

  • Expression of a RecQ helicase homolog affects progression through crisis in fission yeast lacking telomerase.

    Mandell JG, Goodrich KJ, Bähler J and Cech TR

    Department of Chemistry and Biochemistry and Howard Hughes Medical Institute, University of Colorado, Boulder, CO 80309-0215, USA.

    RecQ helicases play roles in telomere maintenance in cancerous human cells using the alternative lengthening of telomeres mechanism and in budding yeast lacking telomerase. Fission yeast lacking the catalytic subunit of telomerase (trt1(+)) up-regulate the expression of a previously uncharacterized sub-telomeric open reading frame as survivors emerge from crisis. Here we show that this open reading frame encodes a protein with homology to RecQ helicases such as the human Bloom's and Werner's syndrome proteins and that copies of the helicase gene are present on multiple chromosome ends. Characterization of the helicase transcript revealed a 7.6-kilobase RNA that was associated with polyribosomes, suggesting it is translated. A 3.6-kilobase domain of the helicase gene predicted to encode the region with catalytic activity was cloned, and both native and mutant forms of this domain were overexpressed in trt1(-) cells as they progressed through crisis. Overexpression of the native form caused cells to recover from crisis earlier than cells with a vector-only control, whereas overexpression of the mutant form caused delayed recovery from crisis. Taken together, the sequence homology, functional analysis, and site-directed mutagenesis indicate that the protein is likely a second fission yeast RecQ helicase (in addition to Rqh1) that participates in telomere metabolism during crisis. These results strengthen the notion that in multiple organisms RecQ helicases contribute to survival after telomere damage.

    Funded by: Cancer Research UK: A6517; NIGMS NIH HHS: GM28039; Wellcome Trust: 077118

    The Journal of biological chemistry 2005;280;7;5249-57

  • CGHAnalyzer: a stand-alone software package for cancer genome analysis using array-based DNA copy number data.

    Margolin AA, Greshock J, Naylor TL, Mosse Y, Maris JM, Bignell G, Saeed AI, Quackenbush J and Weber BL

    Abramson Family Cancer Research Institute, University of Pennsylvania, Philadelphia, PA 19104, USA.

    SUMMARY: This synopsis provides an overview of array-based comparative genomic hybridization data display, abstraction and analysis using CGHAnalyzer, a software suite, designed specifically for this purpose. CGHAnalyzer can be used to simultaneously load copy number data from multiple platforms, query and describe large, heterogeneous datasets and export results. Additionally, CGHAnalyzer employs a host of algorithms for microarray analysis that include hierarchical clustering and class differentiation. AVAILABILITY: CGHAnalyzer, the accompanying manual, documentation and sample data are available for download at http://acgh.afcri.upenn.edu. This is a Java-based application built in the framework of the TIGR MeV that can run on Microsoft Windows, Macintosh OSX and a variety of Unix-based platforms. It requires the installation of the free Java Runtime Environment 1.4.1 (or more recent) (http://www.java.sun.com).

    Bioinformatics (Oxford, England) 2005;21;15;3308-11

  • A large-scale screen in S. pombe identifies seven novel genes required for critical meiotic events.

    Martín-Castellanos C, Blanco M, Rozalén AE, Pérez-Hidalgo L, García AI, Conde F, Mata J, Ellermeier C, Davis L, San-Segundo P, Smith GR and Moreno S

    Instituto de Biología Molecular y Celular del Cáncer, CSIC/Universidad de Salamanca, Campus Miguel de Unamuno, 37007 Salamanca, Spain.

    Meiosis is a specialized form of cell division by which sexually reproducing diploid organisms generate haploid gametes. During a long prophase, telomeres cluster into the bouquet configuration to aid chromosome pairing, and DNA replication is followed by high levels of recombination between homologous chromosomes (homologs). This recombination is important for the reductional segregation of homologs at the first meiotic division; without further replication, a second meiotic division yields haploid nuclei. In the fission yeast Schizosaccharomyces pombe, we have deleted 175 meiotically upregulated genes and found seven genes not previously reported to be critical for meiotic events. Three mutants (rec24, rec25, and rec27) had strongly reduced meiosis-specific DNA double-strand breakage and recombination. One mutant (tht2) was deficient in karyogamy, and two (bqt1 and bqt2) were deficient in telomere clustering, explaining their defects in recombination and segregation. The moa1 mutant was delayed in premeiotic S phase progression and nuclear divisions. Further analysis of these mutants will help elucidate the complex machinery governing the special behavior of meiotic chromosomes.

    Funded by: NIGMS NIH HHS: GM32194, R01 GM032194-21, R01 GM032194-22, R01 GM032194-23

    Current biology : CB 2005;15;22;2056-62

  • Post-transcriptional control of gene expression: a genome-wide perspective.

    Mata J, Marguerat S and Bähler J

    Cancer Research UK Fission Yeast Functional Genomics Group, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

    Gene expression is regulated at multiple levels, and cells need to integrate and coordinate different layers of control to implement the information in the genome. Post-transcriptional levels of regulation such as transcript turnover and translational control are an integral part of gene expression and might rival the sophistication and importance of transcriptional control. Microarray-based methods are increasingly used to study not only transcription but also global patterns of transcript decay and translation rates, and to comprehensively identify targets of RNA-binding proteins. Such large-scale analyses have recently provided supplementary and unique insights into gene expression programs. Integration of several different datasets will ultimately lead to a system-wide understanding of the varied and complex mechanisms for gene expression control.

    Funded by: Cancer Research UK: A6517; Wellcome Trust: 077118

    Trends in biochemical sciences 2005;30;9;506-14

  • Genomic and protein expression profiling identifies CDK6 as novel independent prognostic marker in medulloblastoma.

    Mendrzyk F, Radlwimmer B, Joos S, Kokocinski F, Benner A, Stange DE, Neben K, Fiegler H, Carter NP, Reifenberger G, Korshunov A and Lichter P

    Division of Molecular Genetics (B060), German Cancer Research Center, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany.

    Purpose: Medulloblastoma is the most common malignant brain tumor in children. Despite multimodal aggressive treatment, nearly half of the patients die as a result of this tumor. Identification of molecular markers for prognosis and development of novel pathogenesis-based therapies depends crucially on a better understanding of medulloblastoma pathomechanisms.

    Patients and Methods: We performed genome-wide analysis of DNA copy number imbalances in 47 medulloblastomas using comparative genomic hybridization to large insert DNA microarrays (matrix-CGH). The expression of selected candidate genes identified by matrix-CGH was analyzed immunohistochemically on tissue microarrays representing medulloblastomas from 189 clinically well-documented patients. To identify novel prognostic markers, genomic findings and protein expression data were correlated to patient survival.

    Results: Matrix-CGH analysis revealed frequent DNA copy number alterations of several novel candidate regions. Among these, gains at 17q23.2-qter (P < .01) and losses at 17p13.1 to 17p13.3 (P = .04) were significantly correlated to poor prognosis. Within 17q23.2-qter and 7q21.2, two of the most frequently gained chromosomal regions, confined amplicons were identified that contained the PPM1D and CDK6 genes, respectively. Immunohistochemistry revealed strong expression of PPM1D in 148 (88%) of 168 and CDK6 in 50 (30%) of 169 medulloblastomas. Overexpression of CDK6 correlated significantly with poor prognosis (P < .01) and represented an independent prognostic marker of overall survival on multivariate analysis (P = .02).

    Conclusion: We identified CDK6 as a novel molecular marker that can be determined by immunohistochemistry on routinely processed tissue specimens and may facilitate the prognostic assessment of medulloblastoma patients. Furthermore, increased protein-levels of PPM1D and CDK6 may link the TP53 and RB1 tumor suppressor pathways to medulloblastoma pathomechanisms.

    Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2005;23;34;8853-62

  • A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms.

    Miretti MM, Walsh EC, Ke X, Delgado M, Griffiths M, Hunt S, Morrison J, Whittaker P, Lander ES, Cardon LR, Bentley DR, Rioux JD, Beck S and Deloukas P

    Wellcome Trust Sanger Institute, Hinxton, United Kingdom.

    Autoimmune, inflammatory, and infectious diseases present a major burden to human health and are frequently associated with loci in the human major histocompatibility complex (MHC). Here, we report a high-resolution (1.9 kb) linkage-disequilibrium (LD) map of a 4.46-Mb fragment containing the MHC in U.S. pedigrees with northern and western European ancestry collected by the Centre d'Etude du Polymorphisme Humain (CEPH) and the first generation of haplotype tag single-nucleotide polymorphisms (tagSNPs) that provide up to a fivefold increase in genotyping efficiency for all future MHC-linked disease-association studies. The data confirm previously identified recombination hotspots in the class II region and allow the prediction of numerous novel hotspots in the class I and class III regions. The region of longest LD maps outside the classic MHC to the extended class I region spanning the MHC-linked olfactory-receptor gene cluster. The extended haplotype homozygosity analysis for recent positive selection shows that all 14 outlying haplotype variants map to a single extended haplotype, which most commonly bears HLA-DRB1*1501. The SNP data, haplotype blocks, and tagSNPs analysis reported here have been entered into a multidimensional Web-based database (GLOVAR), where they can be accessed and viewed in the context of relevant genome annotation. This LD map allowed us to give coordinates for the extremely variable LD structure underlying the MHC.

    Funded by: NIDDK NIH HHS: DK64869

    American journal of human genetics 2005;76;4;634-46

  • Critical assessment of methods of protein structure prediction (CASP)--round 6.

    Moult J, Fidelis K, Rost B, Hubbard T and Tramontano A

    Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, Maryland 20850, USA. moult@umbi.umd.edu

    This article is an introduction to the special issue of the journal Proteins, dedicated to the sixth CASP experiment to assess the state of the art in protein structure prediction. The article describes the conduct of the experiment and the categories of prediction included, and outlines the evaluation and assessment procedures. A brief summary of progress over the decade of CASP experiments is also provided.

    Funded by: NIGMS NIH HHS: GM072354; NLM NIH HHS: LM07085

    Proteins 2005;61 Suppl 7;3-7

  • A selenocysteine tRNA and SECIS element in Plasmodium falciparum.

    Mourier T, Pain A, Barrell B and Griffiths-Jones S

    The molecular machinery for incorporating selenocysteine into proteins is present in both prokaryotes and eukaryotes. Although selenocysteine insertion has been reported in animals, plants, and protozoans, known eukaryotic selenocysteine tRNA sequences and selenocysteine insertion sequences are limited to animals and plants. Here we present clear indications of the presence of selenocysteine-tRNA and a selenocysteine insertion sequence in Plasmodium falciparum. To our knowledge, this is the first report of an identification of protozoan selenocysteine insertion machinery at the sequence level.

    RNA (New York, N.Y.) 2005;11;2;119-22

  • InterPro, progress and status in 2005.

    Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bradley P, Bork P, Bucher P, Cerutti L, Copley R, Courcelle E, Das U, Durbin R, Fleischmann W, Gough J, Haft D, Harte N, Hulo N, Kahn D, Kanapin A, Krestyaninova M, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McDowall J, Mitchell A, Nikolskaya AN, Orchard S, Pagni M, Ponting CP, Quevillon E, Selengut J, Sigrist CJ, Silventoinen V, Studholme DJ, Vaughan R and Wu CH

    EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. mulder@ebi.ac.uk

    InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).

    Nucleic acids research 2005;33;Database issue;D201-5

  • Citrobacter rodentium of mice and man.

    Mundy R, MacDonald TT, Dougan G, Frankel G and Wiles S

    Centre for Molecular Microbiology and Infection, Division of Cell and Molecular Biology, Imperial College London SW7 2AZ, UK.

    The major classes of enteric bacteria harbour a conserved core genomic structure, common to both commensal and pathogenic strains, that is most likely optimized to a life style involving colonization of the host intestine and transmission via the environment. In pathogenic bacteria this core genome framework is decorated with novel genetic islands that are often associated with adaptive phenotypes such as virulence. This classical genome organization is well illustrated by a group of extracellular enteric pathogens, which includes enteropathogenic Escherichia coli (EPEC), enterohaemorrhagic E. coli (EHEC) and Citrobacter rodentium, all of which use attaching and effacing (A/E) lesion formation as a major mechanism of tissue targeting and infection. Both EHEC and EPEC are poorly pathogenic in mice but infect humans and domestic animals. In contrast, C. rodentium is a natural mouse pathogen that is related to E. coli, hence providing an excellent in vivo model for A/E lesion forming pathogens. C. rodentium also provides a model of infections that are mainly restricted to the lumen of the intestine. The mechanisms by which the immune system deals with such infections have become a topic of great interest in recent years. Here we review the literature of C. rodentium from its emergence in the mid-1960s to the most contemporary reports of colonization, pathogenesis, transmission and immunity.

    Funded by: Wellcome Trust

    Cellular microbiology 2005;7;12;1697-706

  • From genome to epigenome.

    Murrell A, Rakyan VK and Beck S

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. amm@sanger.ac.uk

    The success of the human genome sequencing project has created wide-spread interest in exploring the human epigenome in order to elucidate how the genome executes the information it holds. Although all (nucleated) human cells effectively contain the same genome, they contain very different epigenomes depending upon cell type, developmental stage, sex, age and various other parameters. This complexity makes it intrinsically difficult to precisely define 'an' epigenome, let alone 'the' epigenome. What is clear, however, is that in order to unravel any epigenome, existing and novel high-throughput approaches on the DNA, RNA and protein levels need to be harnessed and integrated. Here, we review the current thinking and progress on how to get from the genome to the epigenome(s) and discuss some potential applications.

    Human molecular genetics 2005;14 Spec No 1;R3-R10

  • Evolution of noncoding and silent coding sites in the Plasmodium falciparum and Plasmodium reichenowi genomes.

    Neafsey DE, Hartl DL and Berriman M

    Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA. neafsey@broad.mit.edu

    We compared levels of sequence divergence between fourfold synonymous coding sites and noncoding sites from the intergenic and intronic regions of the Plasmodium falciparum and Plasmodium reichenowi genomes. We observed significant differences in the level of divergence between these classes of silent sites. Fourfold synonymous coding sites exhibited the highest level of sequence divergence, followed by introns, and then intergenic sequences. This pattern of relative divergence rates has been observed in primate genomes but was unexpected in Plasmodium due to a paucity of variation at silent sites in P. falciparum and the corollary hypothesis that silent sites in this genome may be subject to atypical selective constraints. Exclusion of hypermutable CpG dinucleotides reduces the divergence level of synonymous coding sites to that of intergenic sites but does not diminish the significantly higher divergence level of introns relative to intergenic sites. A greater than expected incidence of CpG dinucleotides in intergenic regions less than 500 bp from genes may indicate selective maintenance of regulatory motifs containing CpGs. Divergence rates of different classes of silent sites in these Plasmodium genomes are determined by a combination of mutational and selective pressures.

    Funded by: NIGMS NIH HHS: GM61351

    Molecular biology and evolution 2005;22;7;1621-6
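
    The effect of excluding hypermutable CpG dinucleotides, described in the abstract above, can be illustrated with a toy divergence calculation on a pairwise alignment. This is a minimal sketch and not the authors' method: the function names and the two short sequences are hypothetical, divergence is taken as the simple proportion of mismatched sites with no multiple-hit correction, and a site is dropped if it lies in a CG dinucleotide in either sequence.

```python
def cpg_positions(seq):
    """Return the indices of all bases that belong to a CG dinucleotide in seq."""
    sites = set()
    for i in range(len(seq) - 1):
        if seq[i] == "C" and seq[i + 1] == "G":
            sites.update((i, i + 1))
    return sites


def divergence(seq_a, seq_b, exclude_cpg=False):
    """Proportion of mismatched sites between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = cpg_positions(seq_a) | cpg_positions(seq_b) if exclude_cpg else set()
    compared = [i for i in range(len(seq_a)) if i not in skip]
    if not compared:
        raise ValueError("no sites left to compare")
    mismatches = sum(seq_a[i] != seq_b[i] for i in compared)
    return mismatches / len(compared)


# Hypothetical toy alignment: both differences fall inside CpG dinucleotides.
seq_a = "ACGTTACGAT"
seq_b = "ATGTTACAAT"
print(divergence(seq_a, seq_b))
print(divergence(seq_a, seq_b, exclude_cpg=True))
```

    In this toy case every difference sits at a CpG site, so the measured divergence falls once those sites are masked, which is the direction of the effect the abstract reports for synonymous coding sites.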

  • Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus.

    Nierman WC, Pain A, Anderson MJ, Wortman JR, Kim HS, Arroyo J, Berriman M, Abe K, Archer DB, Bermejo C, Bennett J, Bowyer P, Chen D, Collins M, Coulsen R, Davies R, Dyer PS, Farman M, Fedorova N, Fedorova N, Feldblyum TV, Fischer R, Fosker N, Fraser A, García JL, García MJ, Goble A, Goldman GH, Gomi K, Griffith-Jones S, Gwilliam R, Haas B, Haas H, Harris D, Horiuchi H, Huang J, Humphray S, Jiménez J, Keller N, Khouri H, Kitamoto K, Kobayashi T, Konzack S, Kulkarni R, Kumagai T, Lafon A, Lafton A, Latgé JP, Li W, Lord A, Lu C, Majoros WH, May GS, Miller BL, Mohamoud Y, Molina M, Monod M, Mouyna I, Mulligan S, Murphy L, O'Neil S, Paulsen I, Peñalva MA, Pertea M, Price C, Pritchard BL, Quail MA, Rabbinowitsch E, Rawlins N, Rajandream MA, Reichard U, Renauld H, Robson GD, Rodriguez de Córdoba S, Rodríguez-Peña JM, Ronning CM, Rutter S, Salzberg SL, Sanchez M, Sánchez-Ferrero JC, Saunders D, Seeger K, Squares R, Squares S, Takeuchi M, Tekaia F, Turner G, Vazquez de Aldana CR, Weidman J, White O, Woodward J, Yu JH, Fraser C, Galagan JE, Asai K, Machida M, Hall N, Barrell B and Denning DW

    The Institute for Genomic Research, Rockville, Maryland 20850, USA. wnierman@tigr.org

    Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.

    Funded by: Wellcome Trust

    Nature 2005;438;7071;1151-6

  • Genetic factors in type 2 diabetes: the end of the beginning?

    O'Rahilly S, Barroso I and Wareham NJ

    University of Cambridge, Department of Clinical Biochemistry, Addenbrooke's Hospital, Cambridge CB2 2QQ, UK. so104@medschl.cam.ac.uk

    The intensive search for genetic variants that predispose to type 2 diabetes was launched with optimism, but progress has been slower than was hoped. Even so, major advances have been made in the understanding of monogenic forms of the disease which together represent a substantial health burden, and a few common gene variants that influence susceptibility have now been unequivocally identified. Armed with a better understanding of the tools needed to detect such genes, it seems inevitable that the rate of progress will increase and the relevance of genetic information to the diagnosis, treatment, and prevention of diabetes will become increasingly tangible.

    Science (New York, N.Y.) 2005;307;5708;370-3

  • Salmonella paratyphi A rates, Asia.

    Ochiai RL, Wang X, von Seidlein L, Yang J, Bhutta ZA, Bhattacharya SK, Agtini M, Deen JL, Wain J, Kim DR, Ali M, Acosta CJ, Jodar L and Clemens JD

    International Vaccine Institute, Kwanak PO Box 14, Seoul, South Korea 151-600. rlochiai@ivi.int

    Little is known about the causes of enteric fever in Asia. Most cases are believed to be caused by Salmonella enterica serovar Typhi and the remainder by S. Paratyphi A. We compared their incidences by using standardized methods from population-based studies in China, Indonesia, India, and Pakistan.

    Emerging infectious diseases 2005;11;11;1764-6

  • Fungi behaving badly.

    Pain A

    Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK. microbes@sanger.ac.uk

    Nature reviews. Microbiology 2005;3;11;832-3

  • Comparative apicomplexan genomics.

    Pain A, Crossman L and Parkhill J

    Nature reviews. Microbiology 2005;3;6;454-5

  • Genome of the host-cell transforming parasite Theileria annulata compared with T. parva.

    Pain A, Renauld H, Berriman M, Murphy L, Yeats CA, Weir W, Kerhornou A, Aslett M, Bishop R, Bouchier C, Cochet M, Coulson RM, Cronin A, de Villiers EP, Fraser A, Fosker N, Gardner M, Goble A, Griffiths-Jones S, Harris DE, Katzer F, Larke N, Lord A, Maser P, McKellar S, Mooney P, Morton F, Nene V, O'Neil S, Price C, Quail MA, Rabbinowitsch E, Rawlings ND, Rutter S, Saunders D, Seeger K, Shah T, Squares R, Squares S, Tivey A, Walker AR, Woodward J, Dobbelaere DA, Langsley G, Rajandream MA, McKeever D, Shiels B, Tait A, Barrell B and Hall N

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. ap2@sanger.ac.uk

    Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.

    Funded by: Wellcome Trust

    Science (New York, N.Y.) 2005;309;5731;131-3

  • HbO-Arab mutation originated in the Pomak population of Greek Thrace.

    Papadopoulos V, Dermitzakis E, Konstantinidou D, Petridis D, Xanthopoulidis G and Loukopoulos D

    HbO-Arab emerged about 2,000 years ago on a rare haplotype, characteristic of the Greek Pomaks. Its frequency increased as a consequence of high genetic drift within this population, and it was dispersed throughout the Mediterranean basin and Middle East with minor variations of its haplotypic pattern.

    Haematologica 2005;90;2;255-7

  • The nuclear rim protein Amo1 is required for proper microtubule cytoskeleton organisation in fission yeast.

    Pardo M and Nurse P

    Cell Cycle Laboratory, Cancer Research UK, 44 Lincoln's Inn Fields, London, WC2A 3PX, UK. mp3@sanger.ac.uk

    Microtubules have a central role in cell division and cell polarity in eukaryotic cells. The fission yeast is a useful organism for studying microtubule regulation owing to the highly organised nature of its microtubular arrays. To better understand microtubule dynamics and organisation we carried out a screen that identified over 30 genes whose overexpression resulted in microtubule cytoskeleton abnormalities. Here we describe a novel nucleoporin-like protein, Amo1, identified in this screen. Amo1 localises to the nuclear rim in a punctate pattern that does not overlap with nuclear pore complex components. Amo1Delta cells are bent, and they have fewer microtubule bundles that curl around the cell ends. The microtubules in amo1Delta cells have longer dwelling times at the cell tips, and grow in an uncoordinated fashion. Lack of Amo1 also causes a polarity defect. Amo1 is not required for the microtubule loading of several factors affecting microtubule dynamics, and does not seem to be required for nuclear pore function.

    Journal of cell science 2005;118;Pt 8;1705-14

  • JAM-A expression during embryonic development.

    Parris JJ, Cooke VG, Skarnes WC, Duncan MK and Naik UP

    Department of Biological Sciences, University of Delaware, Newark, Delaware 19716, USA.

    Cell adhesion molecules of the immunoglobulin superfamily play an important role in embryonic development. We have shown recently that JAM-A, a member of this family expressed at endothelial and epithelial tight junctions, is involved in platelet activation, leukocyte transmigration, and angiogenesis. Here, we determine the expression pattern of the JAM-A gene during embryogenesis using transgenic mice expressing lacZ under the control of the endogenous JAM-A promoter. Histochemical staining for beta-galactosidase in heterozygous mouse embryos was first seen in the inner cell mass and trophectoderm of the blastocyst. By 8.5 days post coitum (dpc), JAM-A gene activity was detected in the endoderm and part of the surface ectoderm. At 9.5 dpc, JAM-A expression began to localize to certain organ systems, most notably the developing inner ear and early vasculature. Localization of JAM-A to embryonic vasculature was confirmed by double-staining with antibodies against JAM-A and platelet endothelial cell adhesion molecule-1, a known endothelial cell marker. As organogenesis progressed, high levels of JAM-A expression continued in the epithelial component of the inner ear as well as the epithelium of the developing skin, olfactory system, lungs, and kidneys. In addition, JAM-A gene activity was found in the developing liver, choroid plexuses, and gut tubes. Immunofluorescent staining with a JAM-A antibody was performed to confirm that expression of the JAM-A-beta-galactosidase fusion protein accurately represented endogenous JAM-A protein. Thus, JAM-A is prominently expressed in embryonic vasculature and the epithelial components of several organ systems and may have an important role in their development.

    Funded by: NCRR NIH HHS: P20 RR16472; NEI NIH HHS: EY012221, EY015279; NHLBI NIH HHS: HL63960

    Developmental dynamics : an official publication of the American Association of Anatomists 2005;233;4;1517-24

  • Hush puppy: a new mouse mutant with pinna, ossicle, and inner ear defects.

    Pau H, Fuchs H, de Angelis MH and Steel KP

    MRC Institute of Hearing Research, University Park, Nottingham, UK.

    Deafness can be associated with abnormalities of the pinna, ossicles, and cochlea. The authors studied a newly generated mouse mutant with pinna defects and asked whether these defects are associated with peripheral auditory or facial skeletal abnormalities, or both. Furthermore, the authors investigated where the mutation responsible for these defects was located in the mouse genome.

    Methods: The hearing of hush puppy mutants was assessed by Preyer reflex and electrophysiological measurement. The morphological features of their middle and inner ears were investigated by microdissection, paint-filling of the labyrinth, and scanning electron microscopy. Skeletal staining of skulls was performed to assess the craniofacial dimensions. Genome scanning was performed using microsatellite markers to localize the mutation to a chromosomal region.

    Results: Some hush puppy mutants showed early onset of hearing impairment. They had small, bat-like pinnae and normal malleus but abnormal incus and stapes. Some mutants had asymmetrical defects and showed reduced penetrance of the ear abnormalities. Paint-filling of newborns' inner ears revealed no morphological abnormality, although half of the mice studied were expected to carry the mutation. Scanning electron microscopy demonstrated reduced numbers of outer hair cells in the mutants' cochleae. Skeletal staining showed that the mutants have significantly shorter snouts and mandibles. The genome scan revealed that the mutation lies on chromosome 8 between markers D8Mit58 and D8Mit289.

    Conclusion: The study results indicate developmental problems of the first and second branchial arches and otocyst as a result of a single gene mutation. Similar defects are found in humans, and hush puppy provides a mouse model for investigation of such defects.

    The Laryngoscope 2005;115;1;116-24

  • Evidence for a dispersed Hox gene cluster in the platyhelminth parasite Schistosoma mansoni.

    Pierce RJ, Wu W, Hirai H, Ivens A, Murphy LD, Noël C, Johnston DA, Artiguenave F, Adams M, Cornette J, Viscogliosi E, Capron M and Balavoine G

    Inserm U 547, Institut Pasteur de Lille, France. raymond.pierce@pasteur-lille.fr

    In most bilaterian organisms so far studied, Hox genes are organized in genomic clusters and determine development along the anteroposterior axis. It has been suggested that this clustering, together with spatial and temporal colinearity of gene expression, represents the ancestral condition. However, in organisms with derived modes of embryogenesis and lineage-dependent mechanisms for the determination of cell fate, temporal colinearity of expression can be lost and Hox cluster organization disrupted, as is the case for the ecdysozoans Drosophila melanogaster and Caenorhabditis elegans and the urochordates Ciona intestinalis and Oikopleura dioica. We sought to determine whether a lophotrochozoan, the platyhelminth parasite Schistosoma mansoni, possesses a conserved or disrupted Hox cluster. Using a polymerase chain reaction (PCR)-based strategy, we have cloned and characterized three novel S. mansoni genes encoding orthologues of Drosophila labial (SmHox1), deformed (SmHox4), and abdominal A (SmHox8), as well as the full-length coding sequence of the previously described Smox1, which we identify as an orthologue of fushi tarazu. Quantitative reverse transcriptase-PCR showed that the four genes were expressed at all life-cycle stages but that levels of expression were differentially regulated. Phylogenetic analysis and the conservation of "parapeptide" sequences C-terminal to the homeodomains of SmHox8 and Smox1 support the grouping of platyhelminths within the lophotrochozoan clade. However, Bacterial Artificial Chromosome (BAC) library screening followed by genome walking failed to reconstitute a cluster. The BAC clones containing Hox genes were sequenced, and in no case were other Hox genes found on the same clone. Moreover, the SmHox4 and SmHox8 genes contained single very large introns (>40 kbp) further indicating that the schistosome Hox cluster is highly extended. 
    Localization of the Hox genes to chromosomes using fluorescence in situ hybridization showed that SmHox4 and SmHox8 are on the long arm of chromosome 4, whereas SmHox1 and Smox1 are on chromosome 3. In silico screening of the available genome sequences corroborated results of Southern blotting and BAC library screening that indicate that there are no paralogues of SmHox1, SmHox4, or SmHox8. The schistosome Hox cluster is therefore not duplicated, but is both dispersed and disintegrated in the genome.

    Molecular biology and evolution 2005;22;12;2491-503

  • Differential expression of two NMDA receptor interacting proteins, PSD-95 and SynGAP during mouse development.

    Porter K, Komiyama NH, Vitalis T, Kind PC and Grant SG

    Centre for Neuroscience Research, University of Edinburgh, Edinburgh, UK.

    Patterns of neural activity mediated by N-methyl-D-aspartate (NMDA) receptors are known to play important roles in development of the central nervous system. However, the signalling pathways downstream from NMDA receptors that are critical for normal neuronal development are not yet clearly understood. NMDA receptors interact with various signalling proteins via scaffolding proteins, which are important in adult neuronal and behavioural plasticity. For example, the NR2B subunits of the NMDA receptor interact with postsynaptic density 95 (PSD-95), which in turn binds to synaptic ras GTPase-activating protein (SynGAP). Interestingly, the developmental phenotypes of mice carrying null mutations in these genes differ. NR2B and SynGAP homozygote mice die within the first week of birth whereas PSD-95 homozygote mice survive to adulthood. We therefore examined the expression patterns of PSD-95 and SynGAP genes from embryonic stages to adult using lacZ (beta-galactosidase) marker gene knock-in mice. Dramatic changes of expression were observed throughout development in brain and other tissues. Although SynGAP binds PSD-95, the two genes had distinct as well as overlapping expression patterns. SynGAP expression peaked at times of synaptogenesis and developmental plasticity in contrast to PSD-95, which was expressed throughout the brain from early embryonic stages. Furthermore, SynGAP showed a more spatially restricted pattern as illustrated by its restriction to forebrain in contrast to PSD-95, which was also found in mid- and hindbrain. These data support the model that synaptic signalling complexes are heterogeneous and individual components show temporal and spatial specificity during development.

    The European journal of neuroscience 2005;21;2;351-62

  • Chromatin regulation and sumoylation in the inhibition of Ras-induced vulval development in Caenorhabditis elegans.

    Poulin G, Dong Y, Fraser AG, Hopper NA and Ahringer J

    Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK.

    In Caenorhabditis elegans, numerous 'synMuv' (synthetic multivulval) genes encode chromatin-associated proteins involved in transcriptional repression, including an orthologue of Rb and components of the NuRD histone deacetylase complex. These genes antagonize Ras signalling to prevent erroneous adoption of vulval fate. To identify new components of this mechanism, we performed a genome-wide RNA interference (RNAi) screen. After RNAi of 16 757 genes, we found nine new synMuv genes. Based on predicted functions and genetic epistasis experiments, we propose that at least four post-translational modifications converge to inhibit Ras-stimulated vulval development: sumoylation, histone tail deacetylation, methylation, and acetylation. In addition, we demonstrate a novel role for sumoylation in inhibiting LIN-12/Notch signalling in the vulva. We further show that many of the synMuv genes are involved in gene regulation outside the vulva, negatively regulating the expression of the Delta homologue lag-2. As most of the genes identified in this screen are conserved in humans, we suggest that similar interactions may be relevant in mammals for control of Ras and Notch signalling, crosstalk between these pathways, and cell proliferation.

    Funded by: Wellcome Trust: 054523

    The EMBO journal 2005;24;14;2613-23

  • A novel 5q11.2 deletion detected by microarray comparative genomic hybridisation in a child referred as a case of suspected 22q11 deletion syndrome.

    Prescott K, Woodfine K, Stubbs P, Super M, Kerr B, Palmer R, Carter NP and Scambler P

    Molecular Medicine Unit, Institute of Child Health, University College London, 30, Guilford Street, London WC1N 1EH, UK.

    The 22q11 deletion syndrome (22q11DS) is a developmental syndrome comprising heart, palate, thymus and parathyroid gland defects. Individuals with 22q11DS usually carry a 1.5- to 3-Mb heterozygous deletion on chromosome 22q11.2. However, there are many patients with features of 22q11DS without a known cause from conventional karyotype and fluorescence in situ hybridisation (FISH) analysis. Six patients with features of 22q11DS and normal chromosomal and FISH 22q11 analyses were selected for investigation by microarray comparative genomic hybridisation (array CGH). Array CGH is a powerful technology enabling detection of submicroscopic chromosome duplications and deletions by comparing a differentially labelled test sample to a control. The samples are co-hybridised to a microarray containing genomic clones and the resulting ratio of fluorescence intensities on each array element is proportional to the DNA copy number difference. No chromosomal changes were detected by hybridisation to a high resolution array representing chromosome 22q. However, one patient was found to have a 6-Mb deletion on 5q11.2 detected by a whole genome 1-Mb array. This deletion was confirmed with FISH and microsatellite marker analysis. It is the first deletion described in this region. The patient had tetralogy of Fallot, a bifid uvula and velopharyngeal insufficiency, short stature, and learning and behavioural difficulties. This case shows the increased sensitivity of array CGH over detailed karyotype analysis for detection of chromosomal changes. It is anticipated that array CGH will improve the clinician's capacity to diagnose congenital syndromes with an unknown aetiology.

    Human genetics 2005;116;1-2;83-90

  • Adding some SPICE to DAS.

    Prlić A, Down TA and Hubbard TJ

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK. ap3@sanger.ac.uk

    The distributed annotation system (DAS) defines a communication protocol used to exchange biological annotations. It is motivated by the idea that annotations should not be provided by single centralized databases but instead be spread over multiple sites. Data distribution, performed by DAS servers, is separated from visualization, which is carried out by DAS clients. The original DAS protocol was designed to serve annotation of genomic sequences. We have extended the protocol to be applicable to macromolecular structures. Here we present SPICE, a new DAS client that can be used to visualize protein sequence and structure annotations.

    Availability: http://www.efamily.org.uk/software/dasclients/spice/

    Bioinformatics (Oxford, England) 2005;21 Suppl 2;ii40-1

  • Binding sites for metabolic disease related transcription factors inferred at base pair resolution by chromatin immunoprecipitation and genomic microarrays.

    Rada-Iglesias A, Wallerman O, Koch C, Ameur A, Enroth S, Clelland G, Wester K, Wilcox S, Dovey OM, Ellis PD, Wraight VL, James K, Andrews R, Langford C, Dhami P, Carter N, Vetrie D, Pontén F, Komorowski J, Dunham I and Wadelius C

    Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Sweden.

    We present a detailed in vivo characterization of hepatocyte transcriptional regulation in HepG2 cells, using chromatin immunoprecipitation and detection on PCR fragment-based genomic tiling path arrays covering the Encyclopedia of DNA Elements (ENCODE) regions. Our data suggest that HNF-4alpha and HNF-3beta, which were commonly bound to distal regulatory elements, may cooperate in the regulation of a large fraction of the liver transcriptome and that both HNF-4alpha and USF1 may promote H3 acetylation at many of their targets. Importantly, bioinformatic analysis of the sequences bound by each transcription factor (TF) shows an over-representation of motifs highly similar to the in vitro established consensus sequences. On the basis of these data, we have inferred tentative binding sites at base pair resolution. Some of these sites have been previously found by in vitro analysis and some were verified in vitro in this study. Our data suggest that a similar approach could be used for the in vivo characterization of all predicted/uncharacterized TFs and that the analysis could be scaled to the whole genome.

    Funded by: NHGRI NIH HHS: 5 U01 HG003168

    Human molecular genetics 2005;14;22;3435-47

  • Conformational changes of Escherichia coli sigma54-RNA-polymerase upon closed-promoter complex formation.

    Ray P, Hall RJ, Finn RD, Chen S, Patwardhan A, Buck M and van Heel M

    Department of Biological Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, UK.

    RNA polymerase from the mesophile Escherichia coli exists in two forms, the core enzyme and the holoenzyme. Using cryo-electron microscopy and single-particle analysis, we have obtained the structure of the complete RNA polymerase from E. coli containing the sigma54 factor within the closed-promoter complex. Comparisons with earlier reconstructions of the core enzyme and the sigma54 holoenzyme reveal the behaviour of this major variant RNA polymerase in defined functional states. The binding of DNA leads to significant conformational changes in the enzyme's catalytic subunits, apparently a necessity for the initiation of enhancer-dependent promoter-specific transcription.

    Journal of molecular biology 2005;354;2;201-5

  • Tiling path resolution mapping of constitutional 1p36 deletions by array-CGH: contiguous gene deletion or "deletion with positional effect" syndrome?

    Redon R, Rio M, Gregory SG, Cooper RA, Fiegler H, Sanlaville D, Banerjee R, Scott C, Carr P, Langford C, Cormier-Daire V, Munnich A, Carter NP and Colleaux L

    Journal of medical genetics 2005;42;2;166-71

  • SmartCapture and the frontiers of FISH technology: report of the Digital Scientific UK SmartCapture User's Meeting, Peterhouse College Cambridge, UK, 2nd September 2005.

    Reid AG and Gribble SM

    Digital Scientific UK, Sheraton House, Cambridge. alistair.reid@digitalscientific.co.uk

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2005;13;8;835-8

  • The Flag-2 locus, an ancestral gene cluster, is potentially associated with a novel flagellar system from Escherichia coli.

    Ren CP, Beatson SA, Parkhill J and Pallen MJ

    Bacterial Pathogenesis and Genomics Unit, Division of Immunity and Infection, Medical School, University of Birmingham, Birmingham, England B15 2TT.

    Escherichia coli K-12 possesses two adjacent, divergent, promoterless flagellar genes, fhiA-mbhA, that are absent from Salmonella enterica. Through bioinformatics analysis, we found that these genes are remnants of an ancestral 44-gene cluster and are capable of encoding a novel flagellar system, Flag-2. In enteroaggregative E. coli strain 042, there is a frameshift in lfgC that is likely to have inactivated the system in this strain. Tiling path PCR studies showed that the Flag-2 cluster is present in 15 of 72 of the well-characterized ECOR strains. The Flag-2 system resembles the lateral flagellar systems of Aeromonas and Vibrio, particularly in its apparent dependence on RpoN. Unlike the conventional Flag-1 flagellin, the Flag-2 flagellin shows a remarkable lack of sequence polymorphism. The Flag-2 gene cluster encodes a flagellar type III secretion system (including a dedicated flagellar sigma-antisigma combination), thus raising the number of distinct type III secretion systems in Escherichia/Shigella to five. The presence of the Flag-2 cluster at identical sites in E. coli and its close relative Citrobacter rodentium, combined with its absence from S. enterica, suggests that it was acquired by horizontal gene transfer after the former two species diverged from Salmonella. The presence of Flag-2-like gene clusters in Yersinia pestis, Yersinia pseudotuberculosis, and Chromobacterium violaceum suggests that coexistence of two flagellar systems within the same species is more common than previously suspected. The fact that the Flag-2 gene cluster was not discovered in the first 10 Escherichia/Shigella genome sequences studied emphasizes the importance of maintaining an energetic program of genome sequencing for this important taxonomic group.

    Journal of bacteriology 2005;187;4;1430-40

  • The Birc6 (Bruce) gene regulates p53 and the mitochondrial pathway of apoptosis and is essential for mouse embryonic development.

    Ren J, Shi M, Liu R, Yang QH, Johnson T, Skarnes WC and Du C

    Stowers Institute for Medical Research, Kansas City, MO 64110, USA.

    Baculoviral inhibitor of apoptosis repeat-containing (Birc)6 gene/BIRC6 (Bruce/APOLLON) encodes an inhibitor of apoptosis and a chimeric E2/E3 ubiquitin ligase in mammals. The physiological role of Bruce in antiapoptosis is unknown. Here, we show that deletion of the C-terminal half of Bruce, including the UBC domain, causes activation of caspases and apoptosis in the placenta and yolk sac, leading to embryonic lethality. This apoptosis is associated with up-regulation and nuclear localization of the tumor suppressor p53 and activation of mitochondrial apoptosis, which includes up-regulation of Bax, Bak, and Pidd, translocation of Bax and caspase-2 onto mitochondria, release of cytochrome c and apoptosis-inducing factor, and activation of caspase-9 and caspase-3. Mutant mouse embryonic fibroblasts are sensitive to multiple mitochondrial death stimuli but resistant to TNF. In addition, eliminating p53 by RNA interference rescues the loss of cell viability induced by Bruce ablation in human cell line H460. This viability preservation results from reduced expression of proapoptotic factors Bax, Bak, and Pidd and from prevention of activation of caspase-2, -9, and -3. The amount of second mitochondrial-derived activator of caspase and Omi does not change. We conclude that p53 is a downstream effector of Bruce, and, in response to loss of Bruce function, p53 activates Pidd/caspase-2 and Bax/Bak, leading to mitochondrial apoptosis.

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;3;565-70

  • Prenatal diagnosis by array-CGH.

    Rickman L, Fiegler H, Carter NP and Bobrow M

    University of Cambridge Department of Medical Genetics, Addenbrooke's Hospital, Hills Road, Cambridge, CB2 2QQ, United Kingdom. lr2@sanger.ac.uk

    Microscopic karyotype analysis of cultured cells has been regarded as the gold standard for prenatal diagnosis for over 30 years. Since the first application of this technique to prenatal testing in the early 1970s, this procedure has proved to be highly reliable for identifying chromosome copy number abnormalities (aneuploidy) and large structural rearrangements in foetal cells obtained invasively by either amniocentesis or chorionic villus sampling (CVS). Recognising the need for more rapid testing methods which do not require cell culture, fluorescence in situ hybridisation (FISH) and quantitative fluorescence PCR (QF-PCR) have been introduced to this field in order to answer specific diagnostic questions. However, both FISH and QF-PCR suffer the disadvantage that they are difficult to scale to a comprehensive, genome-wide screen. Array-comparative genomic hybridisation (array-CGH), in contrast, is a comprehensive, genome-wide screening strategy for detecting DNA copy number imbalances which can be rapid and less labour-intensive than karyotype banding analysis, and is highly amenable to automation. Array-CGH has the potential to be used for prenatal diagnosis and may address many of the limitations of both conventional microscopic cytogenetic analyses and the more recently employed rapid-screening strategies.

    European journal of medical genetics 2005;48;3;232-40

  • A member of the cAMP receptor protein family of transcription regulators in Mycobacterium tuberculosis is required for virulence in mice and controls transcription of the rpfA gene coding for a resuscitation promoting factor.

    Rickman L, Scott C, Hunt DM, Hutchinson T, Menéndez MC, Whalan R, Hinds J, Colston MJ, Green J and Buxton RS

    Division of Mycobacterial Research, National Institute for Medical Research, Mill Hill, London NW7 1AA, UK.

    Deletion of gene Rv3676 in Mycobacterium tuberculosis coding for a transcription factor belonging to the cAMP receptor protein (CRP) family caused growth defects in laboratory medium, in bone marrow-derived macrophages and in a mouse model of tuberculosis. Transcript profiling of M. tuberculosis grown in vitro identified 16 genes with significantly altered expression in the mutant compared with the wild type. Analysis of the DNA sequences upstream of the corresponding open reading frames revealed that 12 possessed sequences related to a consensus CRP binding site that could represent the sites of action of Rv3676. These included rpfA, lprQ, whiB1 and ahpC among genes with enhanced expression in the wild type, and Rv3616c-Rv3613c, Rv0188 and lipQ among genes exhibiting enhanced expression in the mutant. The activity of an rpfA::lacZ promoter fusion was lowered in the Rv3676 mutant and by mutation of the predicted Rv3676 binding site. Moreover, the product of Rv3676 (isolated as a TrxA fusion protein) interacted specifically with the rpfA promoter, and binding was inhibited by mutation of the Rv3676 site. Although Rv3676 retains four of the six amino acid residues that bind cAMP in Escherichia coli CRP, addition of cAMP did not enhance Rv3676 binding at the rpfA promoter in vitro. In summary, it has been shown that Rv3676 is a direct regulator of rpfA expression, and because rpfA codes for a resuscitation promoting factor this may implicate Rv3676 in reactivation of dormant M. tuberculosis infections.

    Funded by: Medical Research Council: U.1175.02.002.00013 (85867)

    Molecular microbiology 2005;56;5;1274-86

  • Signature of recent historical events in the European Y-chromosomal STR haplotype distribution.

    Roewer L, Croucher PJ, Willuweit S, Lu TT, Kayser M, Lessig R, de Knijff P, Jobling MA, Tyler-Smith C and Krawczak M

    Institute of Legal Medicine, Humboldt-University, Berlin, Germany.

    Previous studies of human Y-chromosomal single-nucleotide polymorphisms (Y-SNPs) established a link between the extant Y-SNP haplogroup distribution and the prehistoric demography of Europe. By contrast, our analysis of seven rapidly evolving Y-chromosomal short tandem repeat loci (Y-STRs) in over 12,700 samples from 91 different locations in Europe reveals a signature of more recent historic events, not previously detected by other genetic markers. Cluster analysis based upon molecular variance yields two clearly identifiable sub-clusters of Western and Eastern European Y-STR haplotypes, and a diverse transition zone in central Europe, where haplotype spectra change more rapidly with longitude than with latitude. This and other observed patterns of Y-STR similarity may plausibly be related to particular historical incidents, including, for example, the expansion of the Franconian and Ottoman Empires. We conclude that Y-STRs may be capable of resolving male genealogies to an unparalleled degree and could therefore provide a useful means to study local population structure and recent demographic history.

    Funded by: Wellcome Trust: 057559

    Human genetics 2005;116;4;279-91

  • Characterization of the chicken C-type lectin-like receptors B-NK and B-lec suggests that the NK complex and the MHC share a common ancestral region.

    Rogers SL, Göbel TW, Viertlboeck BC, Milne S, Beck S and Kaufman J

    Institute for Animal Health, Compton, Berkshire, United Kingdom.

    The sequencing of the chicken MHC led to the identification of two open reading frames, designated B-NK and B-lec, that were predicted to encode C-type lectin domains. C-type lectin domains are not encoded in the MHC of any animal described to date; therefore, this observation was completely unexpected, particularly given that the chicken has a "minimal essential MHC." In this study, we describe the initial characterization of the B-NK and B-lec genes, and show that they share greatest homology with C-type lectin-like receptors encoded in the human NK complex (NKC), in particular NKR-P1 and lectin-like transcript 1 (LLT1), respectively. In common with NKR-P1 and LLT1, B-NK and B-lec are located next to each other and transcribed in opposite orientation. Like human NKR-P1, B-NK has a functional inhibitory signaling motif in the cytoplasmic tail and is expressed in NK cells. In contrast, B-lec contains an endocytosis motif in the cytoplasmic tail, and like LLT1, is an early activation Ag. Further analysis leads us to propose that there are four subgroups of C-type lectin-like receptors in the NKC, which arose as a result of duplication events. Moreover, this analysis suggests that the NKC may be considered a fifth paralogous region, and therefore shares an ancient common origin with the MHC. This provides evidence that C-type lectin-like receptors were present in the preduplication, primordial MHC region, and suggests that an original function of MHC molecules was for recognition by NK cell receptors encoded nearby.

    Journal of immunology (Baltimore, Md. : 1950) 2005;174;6;3475-83

  • More on: polymorphism and hemophilia A causing inversions in distal Xq28: a complex picture.

    Ross MT and Bentley DR

    Journal of thrombosis and haemostasis : JTH 2005;3;11;2600-1

  • The DNA sequence of the human X chromosome.

    Ross MT, Grafham DV, Coffey AJ, Scherer S, McLay K, Muzny D, Platzer M, Howell GR, Burrows C, Bird CP, Frankish A, Lovell FL, Howe KL, Ashurst JL, Fulton RS, Sudbrak R, Wen G, Jones MC, Hurles ME, Andrews TD, Scott CE, Searle S, Ramser J, Whittaker A, Deadman R, Carter NP, Hunt SE, Chen R, Cree A, Gunaratne P, Havlak P, Hodgson A, Metzker ML, Richards S, Scott G, Steffen D, Sodergren E, Wheeler DA, Worley KC, Ainscough R, Ambrose KD, Ansari-Lari MA, Aradhya S, Ashwell RI, Babbage AK, Bagguley CL, Ballabio A, Banerjee R, Barker GE, Barlow KF, Barrett IP, Bates KN, Beare DM, Beasley H, Beasley O, Beck A, Bethel G, Blechschmidt K, Brady N, Bray-Allen S, Bridgeman AM, Brown AJ, Brown MJ, Bonnin D, Bruford EA, Buhay C, Burch P, Burford D, Burgess J, Burrill W, Burton J, Bye JM, Carder C, Carrel L, Chako J, Chapman JC, Chavez D, Chen E, Chen G, Chen Y, Chen Z, Chinault C, Ciccodicola A, Clark SY, Clarke G, Clee CM, Clegg S, Clerc-Blankenburg K, Clifford K, Cobley V, Cole CG, Conquer JS, Corby N, Connor RE, David R, Davies J, Davis C, Davis J, Delgado O, Deshazo D, Dhami P, Ding Y, Dinh H, Dodsworth S, Draper H, Dugan-Rocha S, Dunham A, Dunn M, Durbin KJ, Dutta I, Eades T, Ellwood M, Emery-Cohen A, Errington H, Evans KL, Faulkner L, Francis F, Frankland J, Fraser AE, Galgoczy P, Gilbert J, Gill R, Glöckner G, Gregory SG, Gribble S, Griffiths C, Grocock R, Gu Y, Gwilliam R, Hamilton C, Hart EA, Hawes A, Heath PD, Heitmann K, Hennig S, Hernandez J, Hinzmann B, Ho S, Hoffs M, Howden PJ, Huckle EJ, Hume J, Hunt PJ, Hunt AR, Isherwood J, Jacob L, Johnson D, Jones S, de Jong PJ, Joseph SS, Keenan S, Kelly S, Kershaw JK, Khan Z, Kioschis P, Klages S, Knights AJ, Kosiura A, Kovar-Smith C, Laird GK, Langford C, Lawlor S, Leversha M, Lewis L, Liu W, Lloyd C, Lloyd DM, Loulseged H, Loveland JE, Lovell JD, Lozado R, Lu J, Lyne R, Ma J, Maheshwari M, Matthews LH, McDowall J, McLaren S, McMurray A, Meidl P, Meitinger T, Milne S, Miner G, Mistry SL, Morgan M, Morris S, Müller I, 
    Mullikin JC, Nguyen N, Nordsiek G, Nyakatura G, O'Dell CN, Okwuonu G, Palmer S, Pandian R, Parker D, Parrish J, Pasternak S, Patel D, Pearce AV, Pearson DM, Pelan SE, Perez L, Porter KM, Ramsey Y, Reichwald K, Rhodes S, Ridler KA, Schlessinger D, Schueler MG, Sehra HK, Shaw-Smith C, Shen H, Sheridan EM, Shownkeen R, Skuce CD, Smith ML, Sotheran EC, Steingruber HE, Steward CA, Storey R, Swann RM, Swarbreck D, Tabor PE, Taudien S, Taylor T, Teague B, Thomas K, Thorpe A, Timms K, Tracey A, Trevanion S, Tromans AC, d'Urso M, Verduzco D, Villasana D, Waldron L, Wall M, Wang Q, Warren J, Warry GL, Wei X, West A, Whitehead SL, Whiteley MN, Wilkinson JE, Willey DL, Williams G, Williams L, Williamson A, Williamson H, Wilming L, Woodmansey RL, Wray PW, Yen J, Zhang J, Zhou J, Zoghbi H, Zorilla S, Buck D, Reinhardt R, Poustka A, Rosenthal A, Lehrach H, Meindl A, Minx PJ, Hillier LW, Willard HF, Wilson RK, Waterston RH, Rice CM, Vaudin M, Coulson A, Nelson DL, Weinstock G, Sulston JE, Durbin R, Hubbard T, Gibbs RA, Beck S, Rogers J and Bentley DR

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. mtr@sanger.ac.uk

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

    Nature 2005;434;7031;325-37

  • Refining molecular analysis in the pathways of colorectal carcinogenesis.

    Rowan A, Halford S, Gaasenbeek M, Kemp Z, Sieber O, Volikos E, Douglas E, Fiegler H, Carter N, Talbot I, Silver A and Tomlinson I

    Molecular and Population Genetics Laboratory, London Research Institute, Cancer Research UK, 44 Lincoln's Inn Fields, London WC2A 3PX, UK.

    In the stepwise model, specific genetic and epigenetic changes accumulate as colorectal adenomas progress to carcinomas (CRCs). CRCs also acquire global phenotypes, particularly microsatellite instability (MSI) and aneuploidy/polyploidy (chromosomal instability, CIN). Few changes specific to MSI-low or CIN+ cancers have been established.

    Methods: We investigated 100 CRCs for: mutations and, where appropriate, loss of heterozygosity (LOH) of APC, K-ras, BRAF, SMAD4, and p53; deletion on 5q around APC and 18q around SMAD4; total chromosomal-scale losses and gains; MSI; and CIN.

    Results: As expected, CIN- cancers had fewer chromosomal changes overall than CIN+ lesions, but after correcting for this, 5q deletions alone predicted CIN+ status. 5q deletions were not, however, significantly associated with APC mutations, which were equally frequent in CIN+ and CIN- tumors. We therefore found no evidence to show that mutant APC promotes CIN. p53 mutations/LOH were more common in CIN+ than CIN- lesions, and all chromosomal amplifications were in CIN+ tumors. CIN- cancers could be subdivided according to the total number of chromosomal-scale changes into CIN-low and CIN-stable groups; 18q deletion was the best predictor, being present in nearly all CIN-low lesions and almost no CIN-stable tumors. MSI-low was not associated with CIN, any specific mutation, a mutational signature, or clinicopathologic characteristic.

    Conclusions: Overall, the components of the stepwise model (APC, K-ras, and p53 mutations, plus 18q LOH) tended to co-occur randomly. We propose an updated version of this model comprising 4 pathways of CRC pathogenesis, on the basis of 5q/18q deletions, MSI (high/low), and CIN (high/low/stable).

    Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association 2005;3;11;1115-23

  • Sustained cadherin 23 expression in young and adult cochlea of normal and hearing-impaired mice.

    Rzadzinska AK, Derr A, Kachar B and Noben-Trauth K

    Section on Structural Cell Biology, Laboratory of Cellular Biology, National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Bethesda, MD 20892, USA.

    Cadherin 23 encodes a single-pass transmembrane protein with 27 extracellular cadherin-domains and localizes to stereocilia where it functions as an inter-stereocilia link. Cadherin 23-deficient mice show congenital deafness in combination with circling behavior as a result of organizational defects in the stereocilia hair bundle; common inbred mouse strains carrying the hypomorphic Cdh23(753A) allele are highly susceptible to sensorineural hearing loss. Here, we show that an antibody (N1086) directed against the intracellular carboxyterminus reacts specifically with cadherin 23 and detects with high sensitivity the isoform devoid of the peptide encoded by exon 68 (CDH23Delta68). Cochlea, vestibule, eye, brain and testis produce the CDH23Delta68 isoform in abundance and form moieties with different molecular weight due to variations in glycosylation content. In the cochlea, CDH23Delta68 expression is highest at postnatal day 1 (P1) and P7; expression is downregulated through P14 and P21 and persists at a low steady-state level throughout adulthood (P160). Furthermore, CDH23Delta68 expression levels in young and adult cochlea are similar among normal and hearing-deficient strains (C3HeB/FeJ, C57BL/6J and BUB/BnJ). Finally, by immunofluorescence using an antibody (Pb240) specific for ectodomain 14, we show that cadherin 23 localizes to stereocilia during hair bundle development in late gestation and early postnatal days. Cadherin 23-specific labeling becomes weaker as the hair bundle matures but faint labeling concentrated near the top of stereocilia is still detectable at P35. No labeling of cochlea stereocilia was observed with N1086. In conclusion, our data describe a cadherin 23-specific antibody with high affinity to the CDH23Delta68 isoform, reveal a dynamic cochlea expression and localization profile and show sustained cadherin 23 levels in adult cochlea of normal and hearing-impaired mice.

    Hearing research 2005;208;1-2;114-21

  • Full-genome RNAi profiling of early embryogenesis in Caenorhabditis elegans.

    Sönnichsen B, Koski LB, Walsh A, Marschall P, Neumann B, Brehm M, Alleaume AM, Artelt J, Bettencourt P, Cassin E, Hewitson M, Holz C, Khan M, Lazik S, Martin C, Nitzsche B, Ruer M, Stamford J, Winzi M, Heinkel R, Röder M, Finell J, Häntsch H, Jones SJ, Jones M, Piano F, Gunsalus KC, Oegema K, Gönczy P, Coulson A, Hyman AA and Echeverri CJ

    Cenix BioScience GmbH, Tatzberg 47-51, D-01307 Dresden, Germany. soennichsen@cenix-bioscience.com

    A key challenge of functional genomics today is to generate well-annotated data sets that can be interpreted across different platforms and technologies. Large-scale functional genomics data often fail to connect to standard experimental approaches of gene characterization in individual laboratories. Furthermore, a lack of universal annotation standards for phenotypic data sets makes it difficult to compare different screening approaches. Here we address this problem in a screen designed to identify all genes required for the first two rounds of cell division in the Caenorhabditis elegans embryo. We used RNA-mediated interference to target 98% of all genes predicted in the C. elegans genome in combination with differential interference contrast time-lapse microscopy. Through systematic annotation of the resulting movies, we developed a phenotypic profiling system, which shows high correlation with cellular processes and biochemical pathways, thus enabling us to predict new functions for previously uncharacterized genes.

    Nature 2005;434;7032;462-9

  • Single haplotype analysis demonstrates rapid evolution of the killer immunoglobulin-like receptor (KIR) loci in primates.

    Sambrook JG, Bashirova A, Palmer S, Sims S, Trowsdale J, Abi-Rached L, Parham P, Carrington M and Beck S

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.

    The human killer immunoglobulin-like receptors (KIR) are encoded within the Leukocyte Receptor Complex (LRC) on chromosome 19q13.4. Here we report the comparative genomic analysis of single KIR haplotypes in two other primates. In the common chimpanzee (Pan troglodytes), seven KIR genes (ptKIRnewI, ptKIRnewII, ptKIR2DL5, ptKIRnewIII, ptKIR3DP1, ptKIR2DL4, ptKIR3DL1/2) have been identified, and five KIR genes (mmKIRnewI, mmKIR1D, mmKIR2DL4, mmKIR3DL10, mmKIR3DL1) are present in the haplotype sequenced for the rhesus macaque (Macaca mulatta). Additional cDNA analysis confirms the genes predicted from the genomic sequence and reveals the presence of a fifth novel KIR gene (mmKIRnewII) in the second haplotype of the rhesus macaque. While all known human haplotypes contain both activating and inhibitory KIR genes, only inhibitory KIR genes (characterized by long cytoplasmic tails) were found by in silico and cDNA analyses in the two primate haplotypes studied here. Comparison of the two human and the two non-human primate haplotypes demonstrates rapid diversification of the KIR gene family members, many of which have diverged in a species-specific manner. An analysis of the intronic regions of the two non-human primates reveals the presence of ancient repeat elements, which are indicative of the duplication events that have taken place since the last common ancestor.

    Funded by: NCI NIH HHS: N01-CO-12400

    Genome research 2005;15;1;25-35

  • A genome-wide survey of Major Histocompatibility Complex (MHC) genes and their paralogues in zebrafish.

    Sambrook JG, Figueroa F and Beck S

    Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK. js8@sanger.ac.uk

    Background: The genomic organisation of the Major Histocompatibility Complex (MHC) varies greatly between different vertebrates. In mammals, the classical MHC consists of a large number of linked genes (e.g. more than 200 in humans) with predominantly immune function. In some birds, it consists of only a small number of linked MHC core genes (e.g. fewer than 20 in chickens) forming a minimal essential MHC and, in fish, the MHC consists of a so far unknown number of genes including non-linked MHC core genes. Here we report a survey of MHC genes and their paralogues in the zebrafish genome.

    Results: Using sequence similarity searches against the zebrafish draft genome assembly (Zv4, September 2004), 149 putative MHC gene loci and their paralogues have been identified. Of these, 41 map to chromosome 19 while the remaining loci are spread across essentially all chromosomes. Despite the fragmentation, a set of MHC core genes involved in peptide transport, loading and presentation are still found in a single linkage group.

    Conclusion: The results extend the linkage information of MHC core genes on zebrafish chromosome 19 and show the distribution of the remaining MHC genes and their paralogues to be genome-wide. Although based on a draft genome assembly, this survey demonstrates an essentially fragmented MHC in zebrafish.

    Funded by: Wellcome Trust

    BMC genomics 2005;6;152

  • GeneMCL in microarray analysis.

    Samuel Lattimore B, van Dongen S and Crabbe MJ

    School of Animal and Microbial Sciences, University of Reading, Whiteknights, Reading RG6 6AJ, UK.

    Accurately and reliably identifying the actual number of clusters present within a dataset of gene expression profiles, when no additional information on cluster structure is available, is a problem addressed by few algorithms. GeneMCL transforms microarray analysis data into a graph consisting of nodes connected by edges, where the nodes represent genes, and the edges represent the similarity in expression of those genes, as given by a proximity measurement. This measurement is taken to be the Pearson correlation coefficient combined with a local non-linear rescaling step. The resulting graph is input to the Markov Cluster (MCL) algorithm, an elegant, deterministic, non-specific and scalable method that models stochastic flow through the graph. The algorithm is inherently affected by any cluster structure present, and rapidly decomposes a graph into cohesive clusters. The potential of the GeneMCL algorithm is demonstrated with a 5,730 gene subset (IGS) of the Van't Veer breast cancer database, for which the clusterings are shown to reflect underlying biological mechanisms.

    Computational biology and chemistry 2005;29;5;354-9
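
    The pipeline described in the GeneMCL abstract above (Pearson correlation graph, then Markov clustering) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the function names, the fixed correlation threshold (used here in place of the unspecified "local non-linear rescaling step"), the inflation value of 2.0 and the four invented expression profiles are all assumptions of this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_graph(profiles, threshold=0.8):
    """Weighted adjacency matrix; edges below the threshold are dropped.
    The hard threshold is an assumption standing in for the rescaling step."""
    n = len(profiles)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(profiles[i], profiles[j])
            if r >= threshold:
                graph[i][j] = graph[j][i] = r
    return graph


def normalise_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix), in place."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total:
            for i in range(n):
                m[i][j] /= total
    return m


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Alternate expansion (matrix squaring) and inflation (element-wise
    power plus column renormalisation) until the matrix stops changing."""
    n = len(adjacency)
    # Add self-loops before normalising, as is customary for MCL.
    m = normalise_columns([[adjacency[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iterations):
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        inflated = normalise_columns([[v ** inflation for v in row]
                                      for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row of the converged matrix lists one cluster.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6) for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Invented toy profiles: genes 0 and 1 rise together, genes 2 and 3 fall together.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [8, 6.5, 4, 2]]
clusters = mcl(correlation_graph(profiles))
print(clusters)
```

    On this deliberately easy input the two co-expressed pairs are separated. Real expression data would need a sparse matrix representation and a properly chosen similarity rescaling.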

  • Inactivation of Drosophila Apaf-1 related killer suppresses formation of polyglutamine aggregates and blocks polyglutamine pathogenesis.

    Sang TK, Li C, Liu W, Rodriguez A, Abrams JM, Zipursky SL and Jackson GR

    Neurogenetics Program, Department of Neurology, Neuropsychiatric Institute, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA.

    Huntington's disease (HD) is caused by expansion of a polyglutamine tract near the N-terminus of huntingtin. Mutant huntingtin forms aggregates in striatum and cortex, where extensive cell death occurs. We used a Drosophila polyglutamine peptide model to assess the role of specific cell death regulators in polyglutamine-induced cell death. Here, we report that polyglutamine-induced cell death was dramatically suppressed in flies lacking Dark, the fly homolog of human Apaf-1, a key regulator of apoptosis. Dark appeared to play a role in the accumulation of polyglutamine-containing aggregates. Suppression of cell death, caspase activation and aggregate formation were also observed when mutant huntingtin exon 1 was expressed in homozygous dark mutant animals. Expanded polyglutamine induced a marked increase in expression of Dark, and Dark was observed to colocalize with ubiquitinated protein aggregates. Apaf-1 also was found to colocalize with huntingtin-containing aggregates in a murine model and HD brain, suggesting a common role for Dark/Apaf-1 in polyglutamine pathogenesis in invertebrates, mice and man. These findings suggest that limiting Apaf-1 activity may alleviate both pathological protein aggregation and neuronal cell death in HD.

    Funded by: NIA NIH HHS: AG012466; NIEHS NIH HHS: V45 ES012078; NIGMS NIH HHS: R01 GM072124-14A1; NINDS NIH HHS: NS002116

    Human molecular genetics 2005;14;3;357-72

  • Mutation in the transcriptional coactivator EYA4 causes dilated cardiomyopathy and sensorineural hearing loss.

    Schönberger J, Wang L, Shin JT, Kim SD, Depreux FF, Zhu H, Zon L, Pizard A, Kim JB, Macrae CA, Mungall AJ, Seidman JG and Seidman CE

    Harvard Medical School, Department of Genetics, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115, USA.

    We identified a human mutation that causes dilated cardiomyopathy and heart failure preceded by sensorineural hearing loss (SNHL). Unlike previously described mutations causing dilated cardiomyopathy that affect structural proteins, this mutation deletes 4,846 bp of the human transcriptional coactivator gene EYA4. To elucidate the roles of eya4 in heart function, we studied zebrafish embryos injected with antisense morpholino oligonucleotides. Attenuated eya4 transcript levels produced morphologic and hemodynamic features of heart failure. To determine why previously described mutated EYA4 alleles cause SNHL without heart disease, we examined biochemical interactions of mutant Eya4 peptides. Eya4 peptides associated with SNHL, but not the shortened 193-amino acid peptide associated with dilated cardiomyopathy and SNHL, bound wild-type Eya4 and associated with Six proteins. These data define unrecognized and crucial roles for Eya4-Six-mediated transcriptional regulation in normal heart function.

    Nature genetics 2005;37;4;418-22

  • Progressive proximal expansion of the primate X chromosome centromere.

    Schueler MG, Dunn JM, Bird CP, Ross MT, Viggiano L, NISC Comparative Sequencing Program, Rocchi M, Willard HF and Green ED

    Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.

    Previous studies of the pericentromeric region of the human X chromosome short arm (Xp) revealed an age gradient from ancient DNA that contains expressed genes to recent human-specific DNA at the functional centromere. We analyzed the finished sequence of this human genomic region to investigate its evolutionary history. Phylogenetic analysis of >1,500 alpha-satellite monomers from the region revealed the presence of five physical domains, each containing monomers from a distinct phylogenetic clade. The most distal domain contains long interspersed nucleotide element repeats that were active >35 million years ago, whereas the four proximal domains contain more recently active long interspersed nucleotide element repeats. An out-of-register, unequal recombination (i.e., crossover) detected at the edge of the X chromosome-specific alpha-satellite array (DXZ1) may reflect the most recent of a series of punctuating events during evolution that resulted in a proximal physical expansion of the X centromere. The first 18 kb of this array has 97-99% pairwise identity among all 2-kb repeat units. To perform more detailed evolutionary comparisons, we sequenced the junction between the ancient DNA of Xp and the primate-specific alpha satellite in chimpanzee, gorilla, orangutan, vervet, macaque, and baboon. The striking conservation found in all cases supports the ancestral nature of the alpha satellite at this location. These studies demonstrate that the primate X centromere appears to have evolved through repeated expansion events occurring within the central, active region of centromeric DNA, with the newly added sequences then conferring centromere function.

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;30;10563-8

  • Visualizing profile-profile alignment: pairwise HMM logos.

    Schuster-Böckler B and Bateman A

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. bsb@sanger.ac.uk

    The availability of advanced profile-profile comparison tools, such as PRC or HHsearch, demands sophisticated visualization tools not presently available. We introduce an approach built upon the concept of HMM logos. The method illustrates the similarities of pairs of protein family profiles in an intuitive way. Two HMM logos, one for each profile, are drawn one upon the other. The aligned states are then highlighted and connected.

    Availability: A web interface offering online creation of pairwise HMM logos is available at http://www.sanger.ac.uk/Software/analysis/logomat-p. Furthermore, software developers may download a Perl package that includes methods for creation of pairwise HMM logos locally.

    Contact: bsb@sanger.ac.uk.

    Funded by: Wellcome Trust

    Bioinformatics (Oxford, England) 2005;21;12;2912-3

  • Zebrafish notochordal basement membrane: signaling and structure.

    Scott A and Stemple DL

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    Current topics in developmental biology 2005;65;229-53

  • The V617F JAK2 mutation is uncommon in cancers and in myeloid malignancies other than the classic myeloproliferative disorders.

    Scott LM, Campbell PJ, Baxter EJ, Todd T, Stephens P, Edkins S, Wooster R, Stratton MR, Futreal PA and Green AR

    Funded by: Wellcome Trust

    Blood 2005;106;8;2920-1

  • Livelihood hazards.

    Sebaihia M, Thomson NR, Crossman L and Parkhill J

    Nature reviews. Microbiology 2005;3;4;278-9

  • Wolbachia variability and host effects on crossing type in Culex mosquitoes.

    Sinkins SP, Walker T, Lynd AR, Steven AR, Makepeace BL, Godfray HC and Parkhill J

    Department of Zoology, University of Oxford, Peter Medawar Building, South Parks Road, Oxford OX1 3SY, UK. steven.sinkins@zoo.ox.ac.uk

    Wolbachia is a common maternally inherited bacterial symbiont able to induce crossing sterilities known as cytoplasmic incompatibility (CI) in insects. Wolbachia-modified sperm are unable to complete fertilization of uninfected ova, but a rescue function allows infected eggs to develop normally. By providing a reproductive advantage to infected females, Wolbachia can rapidly invade uninfected populations, and this could provide a mechanism for driving transgenes through pest populations. CI can also occur between Wolbachia-infected populations and is usually associated with the presence of different Wolbachia strains. In the Culex pipiens mosquito group (including the filariasis vector C. quinquefasciatus) a very unusual degree of complexity of Wolbachia-induced crossing-types has been reported, with partial or complete CI that can be unidirectional or bidirectional, yet no Wolbachia strain variation was found. Here we show variation between incompatible Culex strains in two Wolbachia ankyrin repeat-encoding genes associated with a prophage region, one of which is sex-specifically expressed in some strains, and also a direct effect of the host nuclear genome on CI rescue.

    Funded by: Wellcome Trust

    Nature 2005;436;7048;257-60

  • Two ways to trap a gene in mice.

    Skarnes WC

    Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom. skarnes@sanger.ac.uk

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;37;13001-2

  • MagicMatch--cross-referencing sequence identifiers across databases.

    Smith M, Kunin V, Goldovsky L, Enright AJ and Ouzounis CA

    Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge CB10 1SD, UK.

    MOTIVATION: At present, mapping of sequence identifiers across databases is a daunting, time-consuming and computationally expensive process, usually achieved by sequence similarity searches with strict threshold values. SUMMARY: We present a rapid and efficient method to map sequence identifiers across databases. The method uses the MD5 checksum algorithm for message integrity to generate sequence fingerprints and uses these fingerprints as hash strings to map sequences across databases. The program, called MagicMatch, is able to cross-link any of the major sequence databases within a few seconds on a modest desktop computer.

    Bioinformatics (Oxford, England) 2005;21;16;3429-30
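
    The fingerprinting idea in the MagicMatch abstract above can be illustrated with a short sketch: hash each sequence with MD5 and use the digest as a lookup key so that identical sequences in two databases are linked without a similarity search. The function names, the normalisation step (whitespace removal and upper-casing) and the toy identifiers and sequences are assumptions made for illustration; they are not taken from the MagicMatch program itself.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def fingerprint(sequence: str) -> str:
    """Return the MD5 hex digest of a sequence.
    Stripping whitespace and upper-casing first is an assumption of this sketch."""
    normalised = "".join(sequence.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()


def index_database(records: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each fingerprint to every identifier carrying that exact sequence."""
    index = defaultdict(list)
    for identifier, sequence in records.items():
        index[fingerprint(sequence)].append(identifier)
    return dict(index)


def cross_reference(db_a: Dict[str, str], db_b: Dict[str, str]) -> Dict[str, List[str]]:
    """For each identifier in db_a, list identifiers in db_b with an identical sequence."""
    index_b = index_database(db_b)
    return {id_a: index_b.get(fingerprint(seq), []) for id_a, seq in db_a.items()}


# Invented identifiers and sequences, for demonstration only.
db_a = {"A001": "MKTAYIAKQR", "A002": "MSDNELLQ"}
db_b = {"B900": "mktayiakqr", "B901": "MTTTTT"}
links = cross_reference(db_a, db_b)
print(links)
```

    Because the lookup is an exact hash match, a single residue difference yields a different fingerprint, so this approach links only identical sequences.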

  • The new cytogenetics: blurring the boundaries with molecular biology.

    Speicher MR and Carter NP

    Institut für Humangenetik, Technische Universität München, Germany. speicher@humangenetik.med.tu-muenchen.de

    Exciting advances in fluorescence in situ hybridization and array-based techniques are changing the nature of cytogenetics, in both basic research and molecular diagnostics. Cytogenetic analysis now extends beyond the simple description of the chromosomal status of a genome and allows the study of fundamental biological questions, such as the nature of inherited syndromes, the genomic changes that are involved in tumorigenesis and the three-dimensional organization of the human genome. The high resolution that is achieved by these techniques, particularly by microarray technologies such as array comparative genomic hybridization, is blurring the traditional distinction between cytogenetics and molecular biology.

    Funded by: Wellcome Trust

    Nature reviews. Genetics 2005;6;10;782-92

  • Identification of pathogen-specific genes through microarray analysis of pathogenic and commensal Neisseria species.

    Stabler RA, Marsden GL, Witney AA, Li Y, Bentley SD, Tang CM and Hinds J

    Bacterial Microarray Group, St George's Hospital Medical School, London SW17 0RE, UK. Richard.Stabler@lshtm.ac.uk

    The release of the complete genome sequences of Neisseria meningitidis MC58 and Z2491 along with access to the sequences of N. meningitidis FAM18 and Neisseria gonorrhoeae FA1090 allowed the construction of a pan-Neisseria microarray, with every gene in all four genomes represented. The microarray was used to analyse a selection of strains including all N. meningitidis serogroups and commensal Neisseria species. For each strain, genes were defined as present, divergent or absent using gack analysis software. Comparison of the strains identified genes that were conserved within N. meningitidis serogroup B strains but absent from all commensal strains tested, consisting of mainly virulence-associated genes and transmissible elements. The microarray was able to distinguish between pilin genes, pilC orthologues and serogroup-specific capsule biosynthetic genes, and to identify dam and drg genotypes. Previously described N. meningitidis genes involved in iron response, adherence to epithelial cells, and pathogenicity were compared to the microarray analysis. The microarray data correlated with other genetic typing methods and were able to predict genotypes for uncharacterized strains and thus offer the potential for a rapid typing method. The subset of pathogen-specific genes identified represents potential drug or vaccine targets that would not eliminate commensal neisseriae and the associated naturally acquired immunity.

    Funded by: Wellcome Trust

    Microbiology (Reading, England) 2005;151;Pt 9;2907-22

  • Structure and function of the notochord: an essential organ for chordate development.

    Stemple DL

    Vertebrate Development and Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK. ds4@sanger.ac.uk

    The notochord is the defining structure of the chordates, and has essential roles in vertebrate development. It serves as a source of midline signals that pattern surrounding tissues and as a major skeletal element of the developing embryo. Genetic and embryological studies over the past decade have informed us about the development and function of the notochord. In this review, I discuss the embryonic origin, signalling roles and ultimate fate of the notochord, with an emphasis on structural aspects of notochord biology.

    Funded by: Wellcome Trust

    Development (Cambridge, England) 2005;132;11;2503-12

  • A screen of the complete protein kinase gene family identifies diverse patterns of somatic mutations in human breast cancer.

    Stephens P, Edkins S, Davies H, Greenman C, Cox C, Hunter C, Bignell G, Teague J, Smith R, Stevens C, O'Meara S, Parker A, Tarpey P, Avis T, Barthorpe A, Brackenbury L, Buck G, Butler A, Clements J, Cole J, Dicks E, Edwards K, Forbes S, Gorton M, Gray K, Halliday K, Harrison R, Hills K, Hinton J, Jones D, Kosmidou V, Laman R, Lugg R, Menzies A, Perry J, Petty R, Raine K, Shepherd R, Small A, Solomon H, Stephens Y, Tofts C, Varian J, Webb A, West S, Widaa S, Yates A, Brasseur F, Cooper CS, Flanagan AM, Green A, Knowles M, Leung SY, Looijenga LH, Malkowicz B, Pierotti MA, Teh B, Yuen ST, Nicholson AG, Lakhani S, Easton DF, Weber BL, Stratton MR, Futreal PA and Wooster R

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

    We examined the coding sequence of 518 protein kinases, approximately 1.3 Mb of DNA per sample, in 25 breast cancers. In many tumors, we detected no somatic mutations. But a few had numerous somatic mutations with distinctive patterns indicative of either a mutator phenotype or a past exposure.

    Nature genetics 2005;37;6;590-2

  • Effect of deletion or overexpression of the 19-kilodalton lipoprotein Rv3763 on the innate response to Mycobacterium tuberculosis.

    Stewart GR, Wilkinson KA, Newton SM, Sullivan SM, Neyrolles O, Wain JR, Patel J, Pool KL, Young DB and Wilkinson RJ

    Center for Molecular Microbiology and Infection and Wellcome Trust Center for Research in Clinical Tropical Medicine, Imperial College London, UK.

    The 19-kDa lipoprotein of Mycobacterium tuberculosis is an important target of the innate immune response. To investigate the immune biology of this antigen in the context of the whole bacillus, we derived a recombinant M. tuberculosis H37Rv that lacked the 19-kDa-lipoprotein gene (Delta19) and complemented this strain by reintroduction of the 19-kDa-lipoprotein gene on a multicopy vector to produce Delta19::pSMT181. The Delta19 strain multiplied less well than Delta19::pSMT181 in human monocyte-derived macrophages (MDM) (P = 0.039). Surface expression of major histocompatibility complex class II molecules was reduced in phagocytes infected with M. tuberculosis; this effect was not seen in cells infected with Delta19. Delta19 induced lower interleukin 1beta (IL-1beta) secretion from monocytes and MDM. Overexpression of the 19-kDa protein increased IL-1beta, IL-12p40, and tumor necrosis factor alpha secretion irrespective of phagocyte maturity. These data support reports that the 19-kDa lipoprotein has pleiotropic effects on the interaction of M. tuberculosis with phagocytes. However, this analysis indicates that in the context of the whole bacillus, the 19-kDa lipoprotein is only one of a number of molecules that mediate the innate response to M. tuberculosis.

    Funded by: Wellcome Trust: 072070

    Infection and immunity 2005;73;10;6831-7

  • The genetics of regulatory variation in the human genome.

    Stranger BE and Dermitzakis ET

    The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK. bes@sanger.ac.uk

    The regulation of gene expression plays an important role in complex phenotypes, including disease in humans. For some genes, the genetic mechanisms influencing gene expression are well elucidated; however, it is unclear how applicable these results are to gene expression on a genome-wide level. Studies in model organisms and humans have clearly documented gene expression variation among individuals and shown that a significant proportion of this variation has a genetic basis. Recent studies combine microarray surveys of gene expression for thousands of genes with dense marker maps, and are beginning to identify regions in the human genome that have functional effects on gene expression. This paper reviews recent developments and methodologies in this field, and discusses implications and future directions of this research in the context of understanding the influence of human genomic variation on the regulation of gene expression.

    Human genomics 2005;2;2;126-31

  • Nucleotide variation at the myrosinase-encoding locus, TGG1, and quantitative myrosinase enzyme activity variation in Arabidopsis thaliana.

    Stranger BE and Mitchell-Olds T

    Department of Genetics and Evolution, Max Planck Institute of Chemical Ecology, Jena, Germany. bes@sanger.ac.uk

    The Arabidopsis thaliana TGG1 gene encodes thioglucoside glucohydrolase (myrosinase), an enzyme catalysing the hydrolysis of glucosinolate compounds. The enzyme is involved in plant defence against some insect herbivores, and is present in species of the order Capparales (Brassicales). Nucleotide variation was surveyed by sequencing c. 2.4 kb of the TGG1 locus in a sample of 28 worldwide A. thaliana accessions, and one Arabidopsis lyrata ssp. lyrata individual. Myrosinase activity was quantified for 27 of these same A. thaliana accessions, plus five additional others. Overall, estimated nucleotide diversity in A. thaliana was low compared to other published A. thaliana surveys, and the frequency distribution was skewed toward an excess of low-frequency variants. Furthermore, comparison to the outgroup species A. lyrata demonstrated that A. thaliana exhibited an excess of high-frequency derived variants relative to a neutral equilibrium model, suggesting a selective sweep. A. thaliana accessions differed significantly in total myrosinase activity, but analysis of variance detected no statistical evidence for an association between quantitative enzyme activity and alleles at the TGG1 myrosinase-encoding locus. We thus conclude that other, unsurveyed factors primarily affect the observed myrosinase activity levels in this species. The pattern of nucleotide variation was consistent with a model of positive selection but might also be compatible with a completely neutral model that takes into account the metapopulation behaviour of this highly inbreeding species which experienced a relatively recent worldwide expansion.

    Molecular ecology 2005;14;1;295-309

  • Genome-wide associations of gene expression variation in humans.

    Stranger BE, Forrest MS, Clark AG, Minichiello MJ, Deutsch S, Lyle R, Hunt S, Kahl B, Antonarakis SE, Tavaré S, Deloukas P and Dermitzakis ET

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.

    Funded by: NHGRI NIH HHS: HG02790, HG03229; NIGMS NIH HHS: GM065509; Wellcome Trust

    PLoS genetics 2005;1;6;e78
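
    Two of the multiple-test corrections named in the abstract above, Bonferroni and false discovery rate, can be sketched as follows. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step-up method here is an assumption, as are the function names and the invented p-values. The permutation-based correction is omitted because it needs the underlying genotype and expression data.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure).
    Walk from the largest p-value down, keeping a running minimum of p * m / rank."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for five hypothetical SNP-expression tests.
pvalues = [0.001, 0.008, 0.012, 0.041, 0.6]
alpha = 0.05
bonf_hits = sum(p < alpha for p in bonferroni(pvalues))
bh_hits = sum(p < alpha for p in benjamini_hochberg(pvalues))
print(bonf_hits, bh_hits)
```

    On these toy values the FDR procedure retains one more test than Bonferroni, reflecting that Bonferroni controls the stricter family-wise error rate.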

  • Role of morphogens in brain growth.

    Tannahill D, Harris LW and Keynes R

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    During the last century, experiments on the chick embryo established that the ballooning expansion of the early forebrain and midbrain vesicles is dependent on the underlying axial (notochordal) mesoderm. Transient separation of the early midbrain primordium from the notochord causes subsequent collapse of both midbrain and forebrain (telencephalic) vesicles, accompanied by pronounced folding of the neural epithelium. More recent experiments have shown that vesicle collapse is caused by defective Sonic Hedgehog (Shh) signaling from the notochord and floor plate. Separation of the notochord from the brain causes loss of ventral Shh expression, resulting in reduced cell proliferation and increased cell death in the expanding neural epithelium, and culminating in vesicle collapse. These experiments are reviewed here, and set in the context of other studies illustrating the wide range of molecular and cellular processes that cause abnormal brain morphogenesis when perturbed. We also speculate that variation in the regulation of signaling pathways such as Hedgehog may have played a significant part in generating rapid morphogenetic changes during the evolution of the vertebrate brain.

    Journal of neurobiology 2005;64;4;367-75

  • Construction of a 2-Mb resolution BAC microarray for CGH analysis of canine tumors.

    Thomas R, Scott A, Langford CF, Fosmire SP, Jubala CM, Lorentzen TD, Hitte C, Karlsson EK, Kirkness E, Ostrander EA, Galibert F, Lindblad-Toh K, Modiano JF and Breen M

    Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, North Carolina 27606, USA.

    Recognition of the domestic dog as a model for the comparative study of human genetic traits has led to major advances in canine genomics. The pathophysiological similarities shared between many human and dog diseases extend to a range of cancers. Human tumors frequently display recurrent chromosome aberrations, many of which are hallmarks of particular tumor subtypes. Using a range of molecular cytogenetic techniques we have generated evidence indicating that this is also true of canine tumors. Detailed knowledge of these genomic abnormalities has the potential to aid diagnosis, prognosis, and the selection of appropriate therapy in both species. We recently improved the efficiency and resolution of canine cancer cytogenetics studies by developing a small-scale genomic microarray comprising a panel of canine BAC clones representing subgenomic regions of particular interest. We have now extended these studies to generate a comprehensive canine comparative genomic hybridization (CGH) array that comprises 1158 canine BAC clones ordered throughout the genome with an average interval of 2 Mb. Most of the clones (84.3%) have been assigned to a precise cytogenetic location by fluorescence in situ hybridization (FISH), and 98.5% are also directly anchored within the current canine genome assembly, permitting direct translation from cytogenetic aberration to DNA sequence. We are now using this resource routinely for high-throughput array CGH and single-locus probe analysis of a range of canine cancers. Here we provide examples of the varied applications of this resource to tumor cytogenetics, in combination with other molecular cytogenetic techniques.

    Funded by: Wellcome Trust

    Genome research 2005;15;12;1831-7

  • Massive attack.

    Thomson N, Cerdeño-Tárraga A and Bentley S

    Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Nature reviews. Microbiology 2005;3;8;586-7

  • Brothers in arms.

    Thomson N, Holden M and Parkhill J

    Nature reviews. Microbiology 2005;3;2;100-1

  • The Chlamydophila abortus genome sequence reveals an array of variable proteins that contribute to interspecies variation.

    Thomson NR, Yeats C, Bell K, Holden MT, Bentley SD, Livingstone M, Cerdeño-Tárraga AM, Harris B, Doggett J, Ormond D, Mungall K, Clarke K, Feltwell T, Hance Z, Sanders M, Quail MA, Price C, Barrell BG, Parkhill J and Longbottom D

    The Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom. nrt@sanger.ac.uk

    The obligate intracellular bacterial pathogen Chlamydophila abortus strain S26/3 (formerly the abortion subtype of Chlamydia psittaci) is an important cause of late gestation abortions in ruminants and pigs. Furthermore, although relatively rare, zoonotic infection can result in acute illness and miscarriage in pregnant women. The complete genome sequence was determined and shows a high level of conservation in both sequence and overall gene content in comparison to other Chlamydiaceae. The 1,144,377-bp genome contains 961 predicted coding sequences, 842 of which are conserved with those of Chlamydophila caviae and Chlamydophila pneumoniae. Within this conserved Cp. abortus core genome we have identified the major regions of variation and have focused our analysis on these loci, several of which were found to encode highly variable protein families, such as TMH/Inc and Pmp families, which are strong candidates for the source of diversity in host tropism and disease causation in this group of organisms. Significantly, Cp. abortus lacks any toxin genes, and also lacks genes involved in tryptophan metabolism and nucleotide salvaging (guaB is present as a pseudogene), suggesting that the genetic basis of niche adaptation of this species is distinct from those previously proposed for other chlamydial species.

    Genome research 2005;15;5;629-40

  • Dynamic distribution of endoplasmic reticulum in hippocampal neuron dendritic spines.

    Toresson H and Grant SG

    Division of Neuroscience, College of Medicine and Veterinary Medicine, University of Edinburgh, 1 George Square, Edinburgh, UK. hakan.toresson@med.lu.se

    The role of the endoplasmic reticulum (ER) localized in dendritic spines has become a subject of intense interest because of its potential functions in local protein synthesis and signal transduction. Although it is recognized from electron microscopic studies that not all spines contain ER, little is known of its dynamic regulation or turnover. Here, we report a surprising degree of turnover of ER within spines. Using confocal microscopy imaging we observed continuity of spine-ER with dendritic ER in hippocampal primary neurons. Over 24 h, less than 50% of spine ER was stable. Despite this high degree of turnover, we identified a significant subset of spines that maintained ER for at least 4 days. These results indicate that within a single neuron, the organelle composition of a spine is unexpectedly dynamic and may explain aspects of the spine-to-spine variation in calcium spike magnitude and localized protein synthesis and trafficking.

    Funded by: Wellcome Trust

    The European journal of neuroscience 2005;22;7;1793-8

  • Mechanisms of Hedgehog gradient formation and interpretation.

    Torroja C, Gorfinkiel N and Guerrero I

    Centro de Biología Molecular Severo Ochoa, CSIC, Universidad Autónoma de Madrid, Cantoblanco, E-28049 Madrid, Spain.

    Morphogens are molecules that spread from localized sites of production, specifying distinct cell outcomes at different concentrations. Members of the Hedgehog (Hh) family of signaling molecules act as morphogens in different developmental systems. If we are to understand how Hh elicits multiple responses in a temporally and spatially specific manner, the molecular mechanism of Hh gradient formation needs to be established. Moreover, understanding the mechanisms of Hh signaling is a central issue in biology, not only because of the role of Hh in morphogenesis, but also because of its involvement in a wide range of human diseases. Here, we review the mechanisms affecting the dynamics of Hh gradient formation, mostly in the context of Drosophila wing development, although parallel findings in vertebrate systems are also discussed.

    Journal of neurobiology 2005;64;4;334-56

  • Abridged 5S rDNA units in sea beet (Beta vulgaris subsp. maritima).

    Turner DJ and Brown TA

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Amplification by polymerase chain reaction of the 5S rDNA repeat units of Beta vulgaris subsp. maritima resulted in a 350-bp product corresponding to the full-length 5S unit, but also revealed 4 abridged unit classes, each with a deletion that removed most of the spacer and 12-76 bp of the coding sequence. Each abridged type lacks at least 1 of the conserved elements involved in transcription of the 5S gene, and so appears to be nonfunctional. Network analysis revealed that the abridged units are evolving in the same manner as the full-length versions.

    Genome / National Research Council Canada = Génome / Conseil national de recherches Canada 2005;48;2;352-4

  • Don't mix radiocarbon and calendar years.

    Tyler-Smith C, Hurles ME and Jobling MA

    Funded by: Wellcome Trust: 057559

    Nature 2005;434;7034;697

  • Segmental trisomy of chromosome 17: a mouse model of human aneuploidy syndromes.

    Vacík T, Ort M, Gregorová S, Strnad P, Blatny R, Conte N, Bradley A, Bures J and Forejt J

    Institutes of Molecular Genetics and Physiology, Academy of Sciences of the Czech Republic, 14220 Prague, Czech Republic.

    Triplication of whole autosomes or large autosomal segments is detrimental to the development of a mammalian embryo. The trisomy of human chromosome (Chr) 21, known as Down's syndrome, is regularly associated with mental retardation and a variable set of other developmental anomalies. Several mouse models of Down's syndrome, triplicating 33-104 genes of Chr16, were designed in an attempt to analyze the contribution of specific orthologous genes to particular developmental features. However, a recent study challenged the concept of dosage-sensitive genes as a primary cause of an abnormal phenotype. To distinguish between the specific effects of dosage-sensitive genes and nonspecific effects of a large number of arbitrary genes, we revisited the mouse Ts43H/Ph segmental trisomy. It encompasses >310 known genes triplicated within the proximal 30 megabases (Mb) of Chr17. We refined the distal border of the trisomic segment to the interval bounded by bacterial artificial chromosomes RP23-277B13 (location 29.0 Mb) and Cbs gene (location 30.2 Mb). The Ts43H mice, viable on a mixed genetic background, exhibited spatial learning deficits analogous to those observed in Ts65Dn mice with unrelated trisomy. Quantitative analysis of the brain expression of 20 genes inside the trisomic interval and 12 genes lying outside on Chr17 revealed 1.2-fold average increase of mRNA steady-state levels of triplicated genes and 0.9-fold average down-regulation of genes beyond the border of trisomy. We propose that systemic comparisons of unrelated segmental trisomies, such as Ts65Dn and Ts43H, will elucidate the pathways leading from the triplicated sequences to the complex developmental traits.

    Proceedings of the National Academy of Sciences of the United States of America 2005;102;12;4500-5

  • Null and conditional semaphorin 3B alleles using a flexible puroDeltatk loxP/FRT vector.

    van der Weyden L, Adams DJ, Harris LW, Tannahill D, Arends MJ and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom.

    In neural development, Semaphorin 3B (SEMA3B) is thought to play a role in guiding axons by repulsion. In nonneuronal tissue, SEMA3B has been postulated to be a tumor suppressor gene of lung and breast cancer. Much of the understanding of the function of members of the SEMA3 family has come from targeted deletion of these genes in mice (Sema3A, Sema3C, and Sema3F). Thus, targeted deletion of Sema3B in mice would prove invaluable in dissecting out its functions. To allow for maximum gene-targeting flexibility, we developed a generic targeting vector, pFlexible, containing the positive/negative selectable marker puroDeltatk and loxP and FRT recombination sites, and used it to target Sema3B in ES cells. Flpe- and Cre-mediated recombination in vitro generated ES cell lines that contained a conditional or null Sema3B allele, respectively, which were established as homozygous alleles in mice. Analysis of Sema3B null mice showed they were viable, fertile, and displayed no overt pathological abnormalities, suggesting an inherent correction mechanism or level of redundancy between the class 3 semaphorins. This targeting vector system has broad applicability in any knockout experiment and provides a flexible approach for the generation of modified alleles in mice.

    Genesis (New York, N.Y. : 2000) 2005;41;4;171-8

  • The RASSF1A isoform of RASSF1 promotes microtubule stability and suppresses tumorigenesis.

    van der Weyden L, Tachibana KK, Gonzalez MA, Adams DJ, Ng BL, Petty R, Venkitaraman AR, Arends MJ and Bradley A

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.

    The RASSF1A isoform of RASSF1 is frequently inactivated by epigenetic alterations in human cancers, but it remains unclear if and how it acts as a tumor suppressor. RASSF1A overexpression reduces in vitro colony formation and the tumorigenicity of cancer cell lines in vivo. Conversely, RASSF1A knockdown causes multiple mitotic defects that may promote genomic instability. Here, we have used a genetic approach to address the function of RASSF1A as a tumor suppressor in vivo by targeted deletion of Rassf1A in the mouse. Rassf1A null mice were viable and fertile and displayed no pathological abnormalities. Rassf1A null embryonic fibroblasts displayed an increased sensitivity to microtubule depolymerizing agents. No overtly altered cell cycle parameters or aberrations in centrosome number were detected in Rassf1A null fibroblasts. Rassf1A null fibroblasts did not show increased sensitivity to microtubule poisons or DNA-damaging agents and showed no evidence of gross genomic instability, suggesting that cellular responses to genotoxins were unaffected. Rassf1A null mice showed an increased incidence of spontaneous tumorigenesis and decreased survival rate compared with wild-type mice. Irradiated Rassf1A null mice also showed increased tumor susceptibility, particularly to tumors associated with the gastrointestinal tract, compared with wild-type mice. Thus, our results demonstrate that Rassf1A acts as a tumor suppressor gene.

    Funded by: Wellcome Trust

    Molecular and cellular biology 2005;25;18;8356-67

  • Common VKORC1 and GGCX polymorphisms associated with warfarin dose.

    Wadelius M, Chen LY, Downes K, Ghori J, Hunt S, Eriksson N, Wallerman O, Melhus H, Wadelius C, Bentley D and Deloukas P

    Department of Medical Sciences, Clinical Pharmacology, University Hospital, Uppsala, Sweden. mia.wadelius@medsci.uu.se

    We report a novel combination of factors that explains almost 60% of variable response to warfarin. Warfarin is a widely used anticoagulant, which acts through interference with vitamin K epoxide reductase that is encoded by VKORC1. In the next step of the vitamin K cycle, gamma-glutamyl carboxylase encoded by GGCX uses reduced vitamin K to activate clotting factors. We genotyped 201 warfarin-treated patients for common polymorphisms in VKORC1 and GGCX. All five VKORC1 single-nucleotide polymorphisms covary significantly with warfarin dose, and explain 29-30% of variance in dose. Thus, VKORC1 has a larger impact than cytochrome P450 2C9, which explains 12% of variance in dose. In addition, one GGCX SNP showed a small but significant effect on warfarin dose. Incorrect dosage, especially during the initial phase of treatment, carries a high risk of either severe bleeding or failure to prevent thromboembolism. Genotype-based dose predictions may in future enable personalised drug treatment from the start of warfarin therapy.

    The pharmacogenomics journal 2005;5;4;262-70

  • Vi antigen expression in Salmonella enterica serovar Typhi clinical isolates from Pakistan.

    Wain J, House D, Zafar A, Baker S, Nair S, Kidgell C, Bhutta Z, Dougan G and Hasan R

    Investigative Sciences, Centre for Molecular Microbiology and Infection, Imperial College, London, United Kingdom. jw5@sanger.ac.uk

    The accurate identification of Salmonella enterica subsp. enterica serovar Typhi variants that fail to express the capsular polysaccharide, Vi, is an important and much discussed issue for medical microbiology. We have tested a multiplex PCR method which shows the presence or absence of the genetic locus required for Vi expression. Of 2,222 Salmonella serovar Typhi clinical isolates collected from patients' blood over a 4-year period in a region of Pakistan where typhoid is endemic, 12 tested negative for Vi expression by serological agglutination. However, only 1 of these 12 was Vi negative by the multiplex PCR method. This result was confirmed by immunofluorescence, the most sensitive method for Vi characterization in Salmonella serovar Typhi. The multiplex PCR described therefore represents a simple and accurate method for surveillance for Vi-negative variants of Salmonella serovar Typhi in Pakistan. Testing of clinical isolates of Salmonella serovar Typhi, before subculture, from other regions where Vi-negative Salmonella serovar Typhi has been described should be carried out so that the impact of vaccination with purified Vi antigen on the levels of Vi-negative Salmonella serovar Typhi in bacterial populations can be assessed.

    Journal of clinical microbiology 2005;43;3;1158-65

  • Chicken TAP genes differ from their human orthologues in locus organisation, size, sequence features and polymorphism.

    Walker BA, van Hateren A, Milne S, Beck S and Kaufman J

    Institute For Animal Health, Compton, Nr Newbury, Berkshire, RG20 7NN, UK.

    We have previously shown that in the chicken major histocompatibility complex, the two transporters associated with antigen processing genes (TAP1 and TAP2) are located head to head between two classical class I genes. Here we show that the region between these two TAP genes has transcription factor-binding sites in common with class I gene promoters. The TAP genes are also up-regulated by interferon-gamma in a similar way to mammalian TAP genes and in a way that suggests they are both transcribed from a bi-directional promoter. The gene structures of TAP1 and TAP2 differ from that of human TAPs in that TAP1 has a truncated exon 1 and TAP2 has fused exons, resulting in a much smaller gene size. The truncation of TAP1 results in the loss of approximately 150 amino acids, which are thought to be involved in endoplasmic reticulum retention, heterodimer formation and tapasin binding, compared to human TAP1. Most of the protein sequence features involved in binding ATP are conserved, with two exceptions: chicken TAP1 has a glycine in the switch region where other TAPs have glutamine or histidine, and both chicken TAP genes have serines in the C motif where mammalian TAP2 has an alanine. Lastly, the chicken TAP genes are highly polymorphic, with at least as many TAP alleles as there are class I alleles, as seen by investigating nine inbred lines of chicken. The close proximity of the TAP genes to the class I genes and the high level of polymorphism may allow co-evolution of the genes, allowing TAP molecules to transport peptides specifically for the class I molecules of that haplotype.

    Immunogenetics 2005;57;3-4;232-47

  • Hirudo medicinalis: a platform for investigating genes in neural repair.

    Wang WZ, Emes RD, Christoffers K, Verrall J and Blackshaw SE

    Department of Human Anatomy & Genetics, South Parks Road, University of Oxford OX1 3QX, UK.

    We have used the nervous system of the medicinal leech as a preparation to study the molecular basis of neural repair. The leech central nervous system, unlike mammalian CNS, can regenerate to restore function, and contains identified nerve cells of known function and connectivity. We have constructed subtractive cDNA probes from whole and regenerating ganglia of the ventral nerve cord and have used these to screen a serotonergic Retzius neuron library. This identifies genes that are regulated as a result of axotomy, and are expressed by the Retzius cell. This approach identifies many genes, both novel and known. Many of the known genes identified have homologues in vertebrates, including man. For example, genes encoding thioredoxin (TRX), Rough Endoplasmic Reticulum Protein 1 (RER-1) and ATP synthase are upregulated at 24 h postinjury in leech nerve cord. To investigate the functional role of regulated genes in neuron regrowth we are using microinjection of antisense oligonucleotides in combination with horseradish peroxidase to knock down expression of a chosen gene and to assess regeneration in single neurons in 3-D ganglion culture. As an example of this approach we describe experiments to microinject antisense oligonucleotide to a leech isoform of the structural protein, Protein 4.1. Our approach thus identifies genes regulated at different times after injury that may underpin the intrinsic ability of leech neurons to survive damage, to initiate regrowth programs and to remake functional connections. It enables us to determine the time course of gene expression in the regenerating nerve cord, and to study the effects of gene knockdown in identified neurons regenerating in defined conditions in culture.

    Cellular and molecular neurobiology 2005;25;2;427-40

  • Autosomal location of genes from the conserved mammalian X in the platypus (Ornithorhynchus anatinus): implications for mammalian sex chromosome evolution.

    Waters PD, Delbridge ML, Deakin JE, El-Mogharbel N, Kirby PJ, Carvalho-Silva DR and Graves JA

    Comparative Genomics Group, Research Group of Biological Science, The Australian National University, Canberra, ACT, 2601, Australia. pwaters@sun.ac.za

    Mammalian sex chromosomes evolved from an ancient autosomal pair. Mapping of human X- and Y-borne genes in distantly related mammals and non-mammalian vertebrates has proved valuable to help deduce the evolution of this unique part of the genome. The platypus, a monotreme mammal distantly related to eutherians and marsupials, has an extraordinary sex chromosome system comprising five X and five Y chromosomes that form a translocation chain at male meiosis. The largest X chromosome (X1), which lies at one end of the chain, has considerable homology to the human X. Using comparative mapping and the emerging chicken database, we demonstrate that part of the therian X chromosome, previously thought to be conserved across all mammals, was lost from the platypus X1 to an autosome. This region included genes flanking the XIST locus, and also genes with Y-linked homologues that are important to male reproduction in therians. Since these genes lie on the X in marsupials and eutherians, and also on the homologous region of chicken chromosome 4, this represents a loss from the monotreme X rather than an additional evolutionary stratum of the human X.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2005;13;4;401-10

  • Differentiating campomelic dysplasia from Cumming syndrome.

    Watiker V, Lachman RS, Wilcox WR, Barroso I, Schafer AJ and Scherer G

    Funded by: NICHD NIH HHS: 5P01 HD 22657

    American journal of medical genetics. Part A 2005;135;1;110-2

  • Validation of mRNA/EST-based gene predictions in human Xp11.4 revealed differences to the organization of the orthologous mouse locus.

    Wen G, Ramser J, Taudien S, Gausmann U, Blechschmidt K, Frankish A, Ashurst J, Meindl A and Platzer M

    Genome Analysis, Institute of Molecular Biotechnology, Beutenbergstr. 11, 07745, Jena, Germany.

    Careful manual annotation of the human reference sequence provides a solid basis for the identification of disease-associated genes. Toward this end, we focused on a medically relevant 2.6-Mb region of the human chromosome Xp11.4 between markers DXS9851 and DXS9751 and identified 16 transcription units according to the Vertebrate Genome Annotation (Vega) rules. In order to validate these annotations, we performed a comprehensive RT-PCR expression analysis and a human-mouse comparison. This revealed, despite the high overall genomic conservation of the region, remarkable differences in gene content between human and mouse. Whereas 12 of 16 annotations were confirmed by RT-PCR in human tissues, mouse orthologs could be identified and shown to be expressed for only seven genes. This indicates that a comprehensive and experimentally supported annotation effort of the human genome simultaneously highlights regions with striking differences in gene organization relative to other species and may indicate evolutionary events specific to the human lineage that demand further functional analyses.

    Mammalian genome : official journal of the International Mammalian Genome Society 2005;16;12;934-41

  • Slc11a1-mediated resistance to Salmonella enterica serovar Typhimurium and Leishmania donovani infections does not require functional inducible nitric oxide synthase or phagocyte oxidase activity.

    White JK, Mastroeni P, Popoff JF, Evans CA and Blackwell JM

    Wellcome Trust/MRC Building, University of Cambridge School of Clinical Medicine, Addenbrookes Hospital, Hills Road, Cambridge CB2 2XY, UK.

    Solute carrier family 11a member 1 (Slc11a1; formerly natural resistance-associated macrophage protein 1) encodes a late endosomal/lysosomal protein/divalent cation transporter, which regulates iron homeostasis in macrophages. During macrophage activation, Slc11a1 exerts pleiotropic effects on gene regulation and function, including generation of nitric oxide (NO) via inducible NO synthase (iNOS; encoded by Nos2A) and of reactive oxygen intermediates (ROI) via the phagocyte oxidase complex. As NO and ROI have potent antimicrobial activity in macrophages, it was assumed that their activities would contribute to Slc11a1-regulated innate resistance to Salmonella enterica serovar Typhimurium and Leishmania donovani. By intercrossing mice with gene disruptions at Nos2A and Cybb (encoding gp91phox, the heavy chain subunit of cytochrome b-245 and an essential component of phagocyte NADPH oxidase) onto equivalent Slc11a1 wild-type and mutant genetic backgrounds, we demonstrate that neither iNOS nor gp91phox activity is required for Slc11a1-mediated innate resistance to either infection. Functional gp91phox and iNOS are required to control S. enterica serovar Typhimurium in non-Slc11a1-regulated phases of infection. For L. donovani, an organ-specific requirement for iNOS to clear parasites from the spleen was observed at 50 days post-infection, but neither iNOS nor gp91phox influenced late-phase infection in the liver. This contrasted with Leishmania major infection, which caused rapid lesion growth and death in iNOS knockout mice and some exacerbation of disease with gp91phox deficiency. This highlights the adaptive differences in tissue and cellular tropisms between L. donovani and L. major and the different genes and mechanisms that regulate visceral versus cutaneous forms of the disease.

    Funded by: Wellcome Trust: 065642

    Journal of leukocyte biology 2005;77;3;311-20

  • Seasonally hibernating phenotype assessed through transcript screening.

    Williams DR, Epperson LE, Li W, Hughes MA, Taylor R, Rogers J, Martin SL, Cossins AR and Gracey AY

    School of Biological Sciences, University of Liverpool, United Kingdom.

    Hibernation is a seasonally entrained and profound phenotypic transition to conserve energy in winter. It involves significant biochemical reprogramming, although our understanding of the underpinning molecular events is fragmentary and selective. We have conducted a large-scale gene expression screen of the golden-mantled ground squirrel, Spermophilus lateralis, to identify transcriptional responses associated specifically with the summer-winter transition and the torpid-arousal transition in winter. We used 112 cDNA microarrays comprising 12,288 probes that cover at least 5,109 genes. In liver, the profiles of torpid and active states in the winter were almost identical, although we identified 102 cDNAs that were differentially expressed between winter and summer, 90% of which were downregulated in the winter states. By contrast, in cardiac tissue, 59 and 115 cDNAs were elevated in interbout arousal and torpor, respectively, relative to the summer active condition, but only 7 were common to both winter states, and during arousal none was downregulated. In brain, 78 cDNAs were found to change in winter, 44 of which were upregulated. Thus transcriptional changes associated with hibernation are qualitatively modest and, since these changes are generally less than twofold, also quantitatively modest. Unbiased Gene Ontology profiling of the transcripts suggests a winter switch to beta-oxidation of lipids in liver and heart, a reduction in metabolism of toxic compounds and the urea cycle in liver, and downregulated electron transport in the brain. We identified just one strongly winter-induced transcript common to all tissues, namely an RNA-binding protein, RBM3. This analysis clearly differentiates responses of the principal tissues, identifies a large number of new genes undergoing regulation, and broadens our understanding of affected cellular processes that, in part, account for the winter-adaptive hibernating phenotype.

    Physiological genomics 2005;24;1;13-22

  • Complete genome sequence and lytic phase transcription profile of a Coccolithovirus.

    Wilson WH, Schroeder DC, Allen MJ, Holden MT, Parkhill J, Barrell BG, Churcher C, Hamlin N, Mungall K, Norbertczak H, Quail MA, Price C, Rabbinowitsch E, Walker D, Craigon M, Roy D and Ghazal P

    Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth, PL1 3DH, UK. whw@pml.ac.uk

    The genus Coccolithovirus is a recently discovered group of viruses that infect the globally important marine calcifying microalga Emiliania huxleyi. Among the 472 predicted genes of the 407,339-base pair genome are a variety of unexpected genes, most notably those involved in biosynthesis of ceramide, a sphingolipid known to induce apoptosis. Uniquely for algal viruses, it also contains six RNA polymerase subunits and a novel promoter, suggesting this virus encodes its own transcription machinery. Microarray transcriptomic analysis reveals that 65% of the predicted virus-encoded genes are expressed during lytic infection of E. huxleyi.

    Science (New York, N.Y.) 2005;309;5737;1090-2

  • Design, validation, and application of a seven-strain Staphylococcus aureus PCR product microarray for comparative genomics.

    Witney AA, Marsden GL, Holden MT, Stabler RA, Husain SE, Vass JK, Butcher PD, Hinds J and Lindsay JA

    Department of Cellular and Molecular Medicine, St. George's Hospital Medical School, London SW17 0RE, United Kingdom.

    Bacterial comparative genomics has been revolutionized by microarrays, but the power of any microarray is dependent on the number and diversity of gene reporters it contains. Staphylococcus aureus is an important human pathogen causing a wide range of invasive and toxin-mediated diseases, and more than 20% of the genome of any isolate consists of variable genes. Seven whole-genome sequences of S. aureus are available, and we exploited this rare opportunity to design, build, and validate a comprehensive, nonredundant PCR product microarray carrying reporters that represent every predicted open reading frame (3,623 probes). Such a comprehensive microarray necessitated a novel design strategy. Validation with the seven sequenced strains showed correct identification of 93.9% of genes present or absent/divergent but was dependent on the method of analysis chosen. Microarray data were highly reproducible, reducing the need for many replicate slides. Interpretation of microarray data was enhanced by focusing on the major areas of variation--the presence or absence of mobile genetic elements (MGEs). We compiled "composite genomes" of every individual MGE and visualized their distribution. This allowed the sensitive discrimination of related isolates, including the first clear description of how isolates of the same clone of epidemic methicillin-resistant S. aureus differ substantially in their carriage of MGEs. These MGEs carry virulence and resistance genes, suggesting differences in pathogenic potential. The novel methods of design and interpretation of data generated from this microarray will enable further studies of S. aureus evolution, epidemiology, and pathogenesis.

    Funded by: Wellcome Trust: 062511

    Applied and environmental microbiology 2005;71;11;7504-14

  • BRAF and NRAS mutations are uncommon in melanomas arising in diverse internal organs.

    Wong CW, Fan YS, Chan TL, Chan AS, Ho LC, Ma TK, Yuen ST, Leung SY and Cancer Genome Project

    Department of Pathology, University of Hong Kong, Queen Mary Hospital, Hong Kong.

    Background: Malignant melanoma arising from different body compartments may be associated with differing aetiological factors and clinical behaviour, and may manifest diverse molecular genetic profiles. Although many studies have focused on cutaneous melanoma, little is known of mucosal and other types of melanoma. In particular, malignant melanoma of soft parts is different from other melanomas in many respects, yet manifests a common melanocytic differentiation. Mutation of BRAF is now known to be common in cutaneous melanomas, and raises possible new therapeutic options of anti-RAF treatment for these patients. Few data are available for non-cutaneous melanomas.

    Aims: To study the incidence of BRAF and NRAS mutations in melanomas arising in diverse internal organs.

    Methods: Fifty one melanomas from various internal organs were investigated for BRAF and NRAS mutation by direct DNA sequencing.

    Results: BRAF and NRAS mutations were found in two and five, respectively, of the mucosal melanomas arising from the aerodigestive and female genital tracts (n = 36). Their occurrence is mutually exclusive, giving a combined mutation incidence rate of 19.4% in mucosal melanomas. Both BRAF and NRAS mutations were absent in malignant melanoma of soft parts (n = 7). BRAF mutation was also absent in uveal melanoma (n = 6), but was seen in two of five cutaneous melanomas. The incidence of BRAF or combined BRAF/NRAS mutations in all non-cutaneous groups was significantly lower than published rates for cutaneous melanomas.

    Conclusion: Each melanoma subtype may have a unique oncogenetic pathway of tumour development, and only a small fraction of non-cutaneous melanomas may benefit from anti-RAF treatment.

    Funded by: Wellcome Trust

    Journal of clinical pathology 2005;58;6;640-4

  • Replication timing of human chromosome 6.

    Woodfine K, Beare DM, Ichimura K, Debernardi S, Mungall AJ, Fiegler H, Collins VP, Carter NP and Dunham I

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Genomic microarrays have been used to assess DNA replication timing in a variety of eukaryotic organisms. A replication timing map of the human genome has already been published at a 1 Mb resolution. Here we describe how the same method can be used to assess the replication timing of chromosome 6 with a greater resolution using an array of overlapping tile path clones. We report the replication timing map of the whole of chromosome 6 in general, and the MHC region in particular. Positive correlations are observed between replication timing and a number of genomic features including GC content, repeat content and transcriptional activity.

    Cell cycle (Georgetown, Tex.) 2005;4;1;172-6

  • Investigating chromosome organization with genomic microarrays.

    Woodfine K, Carter NP, Dunham I and Fiegler H

    Department of Medical and Molecular Genetics, GKT School of Medicine, King's College London, London, SE1 9RT, UK.

    DNA microarrays are increasingly being used to investigate the functional role of chromatin. These studies are enhanced by the development of high-resolution arrays covering either the whole genome or specific regions of selected chromosomes with large insert clones, PCR products or oligonucleotides of around 100 bp or less. In combination with chromatin immunoprecipitation, this approach allows identification of protein binding for transcription factors, proteins involved in DNA replication and repair as well as sites of chromatin modification. Furthermore, by application of S phase fractions to genomic microarrays, replication timing can be estimated. Thus, microarrays can provide new information about chromosome structure and gene regulation.

    Chromosome research : an international journal on the molecular, supramolecular and evolutionary aspects of chromosome biology 2005;13;3;249-57

  • Heterogeneous duplications in patients with Pelizaeus-Merzbacher disease suggest a mechanism of coupled homologous and nonhomologous recombination.

    Woodward KJ, Cundall M, Sperle K, Sistermans EA, Ross M, Howell G, Gribble SM, Burford DC, Carter NP, Hobson DL, Garbern JY, Kamholz J, Heng H, Hodes ME, Malcolm S and Hobson GM

    Clinical and Molecular Genetics, Institute of Child Health, London.

    We describe genomic structures of 59 X-chromosome segmental duplications that include the proteolipid protein 1 gene (PLP1) in patients with Pelizaeus-Merzbacher disease. We provide the first report of 13 junction sequences, which gives insight into underlying mechanisms. Although proximal breakpoints were highly variable, distal breakpoints tended to cluster around low-copy repeats (LCRs) (50% of distal breakpoints), and each duplication event appeared to be unique (100 kb to 4.6 Mb in size). Sequence analysis of the junctions revealed no large homologous regions between proximal and distal breakpoints. Most junctions had microhomology of 1-6 bases, and one had a 2-base insertion. Boundaries between single-copy and duplicated DNA were identical to the reference genomic sequence in all patients investigated. Taken together, these data suggest that the tandem duplications are formed by a coupled homologous and nonhomologous recombination mechanism. We suggest repair of a double-stranded break (DSB) by one-sided homologous strand invasion of a sister chromatid, followed by DNA synthesis and nonhomologous end joining with the other end of the break. This is in contrast to other genomic disorders that have recurrent rearrangements formed by nonallelic homologous recombination between LCRs. Interspersed repetitive elements (Alu elements, long interspersed nuclear elements, and long terminal repeats) were found at 18 of the 26 breakpoint sequences studied. No specific motif that may predispose to DSBs was revealed, but single or alternating tracts of purines and pyrimidines that may cause secondary structures were common. Analysis of the 2-Mb region susceptible to duplications identified proximal-specific repeats and distal LCRs in addition to the previously reported ones, suggesting that the unique genomic architecture may have a role in nonrecurrent rearrangements by promoting instability.

    Funded by: NCRR NIH HHS: P20 RR-020173-01; NINDS NIH HHS: NS043783; Wellcome Trust

    American journal of human genetics 2005;77;6;966-87

  • Recent spread of a Y-chromosomal lineage in northern China and Mongolia.

    Xue Y, Zerjal T, Bao W, Zhu S, Lim SK, Shu Q, Xu J, Du R, Fu S, Li P, Yang H and Tyler-Smith C

    The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom.

    We have identified a Y-chromosomal lineage that is unusually frequent in northeastern China and Mongolia, in which a haplotype cluster defined by 15 Y short tandem repeats was carried by approximately 3.3% of the males sampled from East Asia. The most recent common ancestor of this lineage lived 590 +/- 340 years ago (mean +/- SD), and it was detected in Mongolians and six Chinese minority populations. We suggest that the lineage was spread by Qing Dynasty (1644-1912) nobility, who were a privileged elite sharing patrilineal descent from Giocangga (died 1582), the grandfather of Manchu leader Nurhaci, and whose documented members formed approximately 0.4% of the minority population by the end of the dynasty.

    Funded by: Wellcome Trust

    American journal of human genetics 2005;77;6;1112-6

  • Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery.

    Yang F, Yang J, Zhang X, Chen L, Jiang Y, Yan Y, Tang X, Wang J, Xiong Z, Dong J, Xue Y, Zhu Y, Xu X, Sun L, Chen S, Nie H, Peng J, Xu J, Wang Y, Yuan Z, Wen Y, Yao Z, Shen Y, Qiang B, Hou Y, Yu J and Jin Q

    State Key Laboratory for Molecular Virology and Genetic Engineering, Chinese Ministry of Public Health, Beijing 100052, China.

    The Shigella bacteria cause bacillary dysentery, which remains a significant threat to public health. The genus status and species classification appear no longer valid, as compelling evidence indicates that Shigella, as well as enteroinvasive Escherichia coli, are derived from multiple origins of E. coli and form a single pathovar. Nevertheless, Shigella dysenteriae serotype 1 causes deadly epidemics but Shigella boydii is restricted to the Indian subcontinent, while Shigella flexneri and Shigella sonnei are prevalent in developing and developed countries respectively. To begin to explain these distinctive epidemiological and pathological features at the genome level, we have carried out comparative genomics on four representative strains. Each of the Shigella genomes includes a virulence plasmid that encodes conserved primary virulence determinants. The Shigella chromosomes share most of their genes with that of E. coli K12 strain MG1655, but each has over 200 pseudogenes, 300-700 copies of insertion sequence (IS) elements, and numerous deletions, insertions, translocations and inversions. There is extensive diversity of putative virulence genes, mostly acquired via bacteriophage-mediated lateral gene transfer. Hence, via convergent evolution involving gain and loss of functions, through bacteriophage-mediated gene acquisition, IS-mediated DNA rearrangements and formation of pseudogenes, the Shigella spp. became highly specific human pathogens with variable epidemiological and pathological features.

    Nucleic acids research 2005;33;19;6445-58

  • An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets.

    Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA, Deloukas P, Cardon LR and McCarthy MI

    Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK. elez@well.ox.ac.uk

    A substantial investment has been made in the generation of large public resources designed to enable the identification of tag SNP sets, but data establishing the adequacy of the sample sizes used are limited. Using large-scale empirical and simulated data sets, we found that the sample sizes used in the HapMap project are sufficient to capture common variation, but that performance declines substantially for variants with minor allele frequencies of <5%.

    Nature genetics 2005;37;12;1320-2

  • The function and expansion of the Patched- and Hedgehog-related homologs in C. elegans.

    Zugasti O, Rajan J and Kuwabara PE

    University of Bristol, Department of Biochemistry, School of Medical Sciences, Bristol BS8 1TD, United Kingdom.

    The Hedgehog (Hh) signaling pathway promotes pattern formation and cell proliferation in Drosophila and vertebrates. Hh is a ligand that binds and represses the Patched (Ptc) receptor and thereby releases the latent activity of the multipass membrane protein Smoothened (Smo), which is essential for transducing the Hh signal. In Caenorhabditis elegans, the Hh signaling pathway has undergone considerable divergence. Surprisingly, obvious Smo and Hh homologs are absent whereas PTC, PTC-related (PTR), and a large family of nematode Hh-related (Hh-r) proteins are present. We find that the number of PTC-related and Hh-r proteins has expanded in C. elegans, and that this expansion occurred early in Nematoda. Moreover, the function of these proteins appears to be conserved in Caenorhabditis briggsae. Given our present understanding of the Hh signaling pathway, the absence of Hh and Smo raises many questions about the evolution and the function of the PTC, PTR, and Hh-r proteins in C. elegans. To gain insights into their roles, we performed a global survey of the phenotypes produced by RNA-mediated interference (RNAi). Our study reveals that these genes do not require Smo for activity and that they function in multiple aspects of C. elegans development, including molting, cytokinesis, growth, and pattern formation. Moreover, a subset of the PTC, PTR, and Hh-r proteins have the same RNAi phenotypes, indicating that they have the potential to participate in the same processes.

    Genome research 2005;15;10;1402-10

  • Global hypomethylation of the genome in XX embryonic stem cells.

    Zvetkova I, Apedaile A, Ramsahoye B, Mermoud JE, Crompton LA, John R, Feil R and Brockdorff N

    MRC Clinical Sciences Centre, ICFM, Hammersmith Hospital, Du Cane Road, London, W12 0NN, UK.

    Embryonic stem (ES) cells are important tools in the study of gene function and may also become important in cell therapy applications. Establishment of stable XX ES cell lines from mouse blastocysts is relatively problematic owing to frequent loss of one of the two X chromosomes. Here we show that DNA methylation is globally reduced in XX ES cell lines and that this is attributable to the presence of two active X chromosomes. Hypomethylation affects both repetitive and unique sequences, the latter including differentially methylated regions that regulate expression of parentally imprinted genes. Methylation of differentially methylated regions can be restored coincident with elimination of an X chromosome in early-passage parthenogenetic ES cells, suggesting that selection against loss of methylation may provide the basis for X-chromosome instability. Finally, we show that hypomethylation is associated with reduced levels of the de novo DNA methyltransferases Dnmt3a and Dnmt3b and that ectopic expression of these factors restores global methylation levels.

    Nature genetics 2005;37;11;1274-9
