Sanger Institute - Publications 1997

Number of papers published in 1997: 47

  • A 12-cistron Escherichia coli operon (hyf) encoding a putative proton-translocating formate hydrogenlyase system.

    Andrews SC, Berks BC, McClay J, Ambler A, Quail MA, Golby P and Guest JR

    Krebs Institute, Department of Molecular Biology & Biotechnology, University of Sheffield, UK. s.andrews@sheffield.ac.uk

    The nucleotide sequence has been determined for a twelve-gene operon of Escherichia coli designated the hyf operon (hyfABCDEFGHIR-focB). The hyf operon is located at 55.8-56.0 min and encodes a putative nine-subunit hydrogenase complex (hydrogenase four or Hyf), a potential formate- and sigma 54-dependent transcriptional activator, HyfR (related to FhlA), and a possible formate transporter, FocB (related to FocA). Five of the nine Hyf-complex subunits are related to subunits of both the E. coli hydrogenase-3 complex (Hyc) and the proton-translocating NADH:quinone oxidoreductases (complex I and Nuo), whereas two Hyf subunits are related solely to NADH:quinone oxidoreductase subunits. The Hyf components include a predicted 523 residue [Ni-Fe] hydrogenase (large subunit) with an N-terminus (residues 1-170) homologous to the 30 kDa or NuoC subunit of complex I. It is proposed that Hyf, in conjunction with formate dehydrogenase H (Fdh-H), forms a hitherto unrecognized respiration-linked proton-translocating formate hydrogenlyase (FHL-2). It is likely that HyfR acts as a formate-dependent regulator of the hyf operon and that FocB provides the Hyf complex with external formate as substrate.

    Funded by: Wellcome Trust

    Microbiology (Reading, England) 1997;143 (Pt 11);3633-47

  • The chromosome 6 sequencing project at the Sanger Centre.

    Avis T, Clark EK, Flack TL, Mohammadi M, Milne S, Niblett D, Palmer S, Phillips S, Smalley C, Tagney M, Thorpe KL, Tubby B, Westhorp J and Beck S

    The Sanger Centre, Hinxton, England.

    Chromosome 6 is probably best known for encoding the major histocompatibility complex (MHC), which is essential to the human immune response. In addition, it has been shown to be associated with many diseases, such as schizophrenia, diabetes, arthritis, haemochromatosis, narcolepsy, epilepsy, retinitis pigmentosa, deafness, ovarian cancer and many more. Chromosome 6 is about 180 Mb in size and is estimated to encode around 3500 genes, of which only about 10% are currently known. It is our aim to map, sequence and annotate the entire chromosome in close collaboration with the chromosome 6 community.

    DNA sequence : the journal of DNA sequencing and mapping 1997;8;3;131-5

  • Third single chromosome 6 workshop: meeting report.

    Beck S, Cann HM, Campbell RD, Dunham I, Inoko H, Jazwinska EC, Ragoussis J, Trowsdale J and Ziegler A

    The Sanger Centre, Cambridge, UK. beck@sanger.ac.uk

    DNA sequence : the journal of DNA sequencing and mapping 1997;8;3;113-29

  • The BRC repeats are conserved in mammalian BRCA2 proteins.

    Bignell G, Micklem G, Stratton MR, Ashworth A and Wooster R

    Section of Molecular Carcinogenesis, Institute of Cancer Research, Sutton, UK.

    The breast cancer susceptibility gene BRCA2 encodes a protein of 3418 amino acids which does not exhibit substantial sequence similarity to any other protein in the public databases. A dot matrix comparison of BRCA2 with itself revealed an eight times repeated motif in the segment of the protein encoded by exon 11. As a preliminary test of the hypothesis that these motifs are functionally significant, we have sequenced exon 11 of BRCA2 in six mammals. An alignment of the predicted protein sequences shows that, overall, the motifs have been conserved while much of the intervening sequences has diverged. These data support the notion that the BRC motifs are important in BRCA2 function. There is, however, considerable interspecies variation within certain motif units, raising the possibility of redundancy and that not all of the repeats are required for the normal function of BRCA2.

    Funded by: Wellcome Trust

    Human molecular genetics 1997;6;1;53-8
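
    The dot matrix self-comparison mentioned in the abstract above can be illustrated with a minimal sketch. The window size, match threshold, function names and toy sequence are assumptions chosen for illustration; they are not the parameters, software or data used in the paper.

```python
def dot_matrix(seq_a, seq_b, window=4, min_matches=4):
    """Return (i, j) pairs where the windows starting at i in seq_a and
    j in seq_b share at least min_matches identical positions."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_matches:
                hits.append((i, j))
    return hits


def repeat_offsets(seq, window=4, min_matches=4):
    """Compare a sequence with itself; hits above the main diagonal (j > i)
    indicate repeated segments. Returns the sorted set of offsets j - i."""
    hits = dot_matrix(seq, seq, window, min_matches)
    return sorted({j - i for i, j in hits if j > i})


# Hypothetical toy sequence: the segment "ACDEFG" occurs at positions 0 and 8.
toy = "ACDEFGWWACDEFGYY"
print(repeat_offsets(toy))
```

    In a plot of the hits, a repeated motif shows up as short diagonal runs parallel to the main diagonal, one run per pair of repeat copies.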

  • Dynamite: a flexible code generating language for dynamic programming methods used in sequence comparison.

    Birney E and Durbin R

    Sanger Centre, Cambridge, UK. birney@sanger.ac.uk

    We have developed a code generating language, called Dynamite, specialised for the production and subsequent manipulation of complex dynamic programming methods for biological sequence comparison. From a relatively simple text definition file Dynamite will produce a variety of implementations of a dynamic programming method, including database searches and linear space alignments. The speed of the generated code is comparable to hand written code, and the additional flexibility has proved invaluable in designing and testing new algorithms. An innovation is a flexible labelling system, which can be used to annotate the original sequences with biological information. We illustrate the Dynamite syntax and flexibility by showing definitions for dynamic programming routines (i) to align two protein sequences under the assumption that they are both poly-topic transmembrane proteins, with the simultaneous assignment of transmembrane helices and (ii) to align protein information to genomic DNA, allowing for introns and sequencing error.

    Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology 1997;5;56-64
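
    For readers unfamiliar with the kind of routine the abstract above refers to, the following hand-written sketch shows a basic dynamic programming recurrence for a global alignment score, kept in linear space by retaining only two rows of the matrix. It is not Dynamite syntax or Dynamite output; the function name and the scoring parameters are assumptions for illustration, and unlike a full linear space alignment it returns only the score, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b with a linear gap
    penalty, using O(len(b)) memory (only two matrix rows are kept)."""
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

    Each additional state or transition in a more realistic model (affine gaps, introns, frameshifts) adds further terms to the recurrence, which is the bookkeeping a code generator is meant to take over.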

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome XIII.

    Bowman S, Churcher C, Badcock K, Brown D, Chillingworth T, Connor R, Dedman K, Devlin K, Gentles S, Hamlin N, Hunt S, Jagels K, Lye G, Moule S, Odell C, Pearson D, Rajandream M, Rice P, Skelton J, Walsh S, Whitehead S and Barrell B

    The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Systematic sequencing of the genome of Saccharomyces cerevisiae has revealed thousands of new predicted genes and allowed analysis of long-range features of chromosomal organization. Generally, genes and predicted genes seem to be distributed evenly throughout the genome, having no overall preference for DNA strand. Apart from the smaller chromosomes, which can have substantially lower gene density in their telomeric regions, there is a consistent average of one open reading frame (ORF) approximately every two kilobases. However, one of the most surprising findings for a eukaryote with approximately 6,000 genes was the amount of apparent redundancy in its genome. This redundancy occurs both between individual ORFs and over more extensive chromosome regions, which have been duplicated preserving gene order and orientation. Here we report the entire nucleotide sequence of chromosome XIII, the sixth-largest S. cerevisiae chromosome, and demonstrate that its features and organization are consistent with those observed for other S. cerevisiae chromosomes. Analysis revealed 459 ORFs, of which 284 have not been identified previously. Both intra- and interchromosomal duplications of regions of this chromosome have occurred.

    Funded by: Wellcome Trust

    Nature 1997;387;6632 Suppl;90-3

  • Detection of equine X chromosome abnormalities in equids using a horse X whole chromosome paint probe (WCPP).

    Breen M, Langford CF, Carter NP, Fischer PE, Marti E, Gerstenberg C, Allen WR, Lear TL and Binns MM

    Centre for Preventive Medicine, Animal Health Trust, Newmarket, Suffolk, U.K.

    Funded by: Wellcome Trust

    Veterinary journal (London, England : 1997) 1997;153;3;235-8

  • Population statistics of protein structures: lessons from structural classifications.

    Brenner SE, Chothia C and Hubbard TJ

    Structural Biology Centre, National Institute for Bioscience and Human-Technology, Ibaraki, Japan. brenner@hyper.stanford.edu

    Structural classifications aid the interpretation of proteins by describing degrees of structural and evolutionary relatedness. They have also recently revealed strikingly skewed distributions at all levels; for example, a small number of folds are far more common than others, and just a few superfamilies are known to have diverged widely. The classifications also provide an indication of the total number of superfamilies in nature.

    Funded by: Wellcome Trust

    Current opinion in structural biology 1997;7;3;369-76

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome XVI.

    Bussey H, Storms RK, Ahmed A, Albermann K, Allen E, Ansorge W, Araujo R, Aparicio A, Barrell B, Badcock K, Benes V, Botstein D, Bowman S, Brückner M, Carpenter J, Cherry JM, Chung E, Churcher C, Coster F, Davis K, Davis RW, Dietrich FS, Delius H, DiPaolo T, Hani J et al.

    Department of Biology, McGill University, Montreal, Canada. hbussey@monod.biol.mcgill.ca

    The nucleotide sequence of the 948,061 base pairs of chromosome XVI has been determined, completing the sequence of the yeast genome. Chromosome XVI was the last yeast chromosome identified, and some of the genes mapped early to it, such as GAL4, PEP4 and RAD1 (ref. 2) have played important roles in the development of yeast biology. The architecture of this final chromosome seems to be typical of the large yeast chromosomes, and shows large duplications with other yeast chromosomes. Chromosome XVI contains 487 potential protein-encoding genes, 17 tRNA genes and two small nuclear RNA genes; 27% of the genes have significant similarities to human gene products, and 48% are new and of unknown biological function. Systematic efforts to explore gene function have begun.

    Funded by: Wellcome Trust

    Nature 1997;387;6632 Suppl;103-5

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome IX.

    Churcher C, Bowman S, Badcock K, Bankier A, Brown D, Chillingworth T, Connor R, Devlin K, Gentles S, Hamlin N, Harris D, Horsnell T, Hunt S, Jagels K, Jones M, Lye G, Moule S, Odell C, Pearson D, Rajandream M, Rice P, Rowley N, Skelton J, Smith V, Barrell B et al.

    The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    Large-scale systematic sequencing has generally depended on the availability of an ordered library of large-insert bacterial or viral genomic clones for the organism under study. The generation of these large insert libraries, and the location of each clone on a genome map, is a laborious and time-consuming process. In an effort to overcome these problems, several groups have successfully demonstrated the viability of the whole-genome random 'shotgun' method in large-scale sequencing of both viruses and prokaryotes. Here we report the sequence of Saccharomyces cerevisiae chromosome IX, determined in part by a whole-chromosome 'shotgun', and describe the particular difficulties encountered in the random 'shotgun' sequencing of an entire eukaryotic chromosome. Analysis of this sequence shows that chromosome IX contains 221 open reading frames (ORFs), of which approximately 30% have been sequenced previously. This chromosome shows features typical of a small Saccharomyces cerevisiae chromosome.

    Funded by: Wellcome Trust

    Nature 1997;387;6632 Suppl;84-7

  • The organization of the gamma-glutamyl transferase genes and other low copy repeats in human chromosome 22q11.

    Collins JE, Mungall AJ, Badcock KL, Fay JM and Dunham I

    Sanger Centre, Hinxton, Cambs, UK. jec@sanger.ac.uk

    A clone map consisting of YACs, cosmids, and fosmids has been constructed covering low copy repeat regions of human chromosome 22q11. A combination of clone restriction digest analysis, single-copy landmark content analysis, HindIII-Sau3AI fingerprinting, and sequencing of PCR products derived from clones was required to resolve the map in this region. Seven repeat-containing contigs were placed in 22q11, five containing gamma-glutamyl transferase (GGT) sequences described previously. In one case, a single interval at the resolution of the YAC map was shown to contain at least three GGT sequences after higher resolution mapping. The sequence information was used to design a rapid PCR/restriction digest technique that distinguishes the GGT loci placed in the YAC map. This approach has allowed us to resolve the previous cDNA and mapping information relating to GGT and link it to the physical map of 22q11.

    Genome research 1997;7;5;522-31

  • Cloning, chromosomal mapping and expression pattern of the mouse Brca2 gene.

    Connor F, Smith A, Wooster R, Stratton M, Dixon A, Campbell E, Tait TM, Freeman T and Ashworth A

    CRC Centre for Cell and Molecular Biology, Chester Beatty Laboratories, Institute of Cancer Research, London, UK.

    A proportion of human breast cancers result from an inherited predisposition to the disease. Mutations in the BRCA2 gene confer a high risk of breast cancer and are responsible for almost half of these cases. The recent cloning of the human BRCA2 gene has revealed that it encodes a large protein having little significant homology to known proteins. Here we describe the mouse Brca2 gene. The gene maps to mouse chromosome 5, consistent with its location on human chromosome 13q12. We have sequenced cDNA for the entire 3329 amino acid Brca2 protein and this has revealed that, like Brca1, Brca2 is relatively poorly conserved between humans and mice. Brca2 is transcribed in a diverse range of mouse tissues, and the pattern of expression is strikingly similar to that of Brca1. Taken together, our data highlight some intriguing similarities between two genes involved in inherited breast cancer susceptibility.

    Funded by: Wellcome Trust

    Human molecular genetics 1997;6;2;291-300

  • Sequence of the human immunoglobulin diversity (D) segment locus: a systematic analysis provides no evidence for the use of DIR segments, inverted D segments, "minor" D segments or D-D recombination.

    Corbett SJ, Tomlinson IM, Sonnhammer EL, Buck D and Winter G

    MRC Centre For Protein Engineering, Hills Road, Cambridge, CB2 2QH, U.K.

    We have determined the complete nucleotide sequence of the human immunoglobulin D segment locus on chromosome 14q32.3 and identified a total of 27 D segments, of which nine are new. Comparison with a database of rearranged heavy chain sequences indicates that the human antibody repertoire is created by VDJ recombination involving 25 of these 27 D segments, extensive processing at the V-D and D-J junctions and use of multiple reading frames. We could find no evidence for the proposed use of DIR segments, inverted D segments, "minor" D segments or D-D recombination. Conventional VDJ recombination, which obeys the 12/23 rule, is therefore sufficient to explain the wealth of lengths and sequences for the third hypervariable loop of human heavy chains.

    Funded by: Wellcome Trust

    Journal of molecular biology 1997;270;4;587-97

  • Characterization of recombination in the HLA class II region.

    Cullen M, Noble J, Erlich H, Thorpe K, Beck S, Klitz W, Trowsdale J and Carrington M

    Intramural Research Support Program, SAIC Frederick, National Cancer Institute--Frederick Cancer Research and Development Center, MD 21702, USA.

    Studies of linkage disequilibrium across the HLA class II region have been useful in predicting where recombination is most likely to occur. The strong associations between genes within the 85-kb region from DQB1 to DRB1 are consistent with low frequency of recombination in this segment of DNA. Conversely, a lack of association between alleles of TAP1 and TAP2 (approximately 15 kb) has been observed, suggesting that recombination occurs here with relatively high frequency. Much of the HLA class II region has now been sequenced, providing the tools to undertake detailed analysis of recombination. Twenty-seven families containing one or two recombinant chromosomes within the 500-kb interval between the DPB1 and DRB1 genes were used to determine patterns of recombination across this region. SSCP analysis and microsatellite typing yielded identification of 127 novel polymorphic markers distributed throughout the class II region, allowing refinement of the site of crossover in 30 class II recombinant chromosomes. The three regions where recombination was observed most frequently are as follows: the 45-kb interval between HLA-DNA and RING3 (11 cases), the 50-kb interval between DQB3 and DQB1 (6 cases), and an 8.8-kb segment of the TAP2 gene (3 cases). Six of the 10 remaining recombinants await further characterization, pending identification of additional informative markers, while four recombinants were localized to other intervals (outliers). Analysis of association between markers flanking HLA-DNA to RING3 (45 kb), as well as TAP1 to TAP2 (15 kb), by use of independent CEPH haplotypes indicated little or no linkage disequilibrium, supporting the familial recombination data. A notable sequence motif located within a region associated with increased rates of recombination consisted of a (TGGA)12 tandem repeat within the TAP2 gene.

    Funded by: NIGMS NIH HHS: GM35326; Wellcome Trust

    American journal of human genetics 1997;60;2;397-407

  • Identification and characterization of a G protein-coupled receptor homolog encoded by murine cytomegalovirus.

    Davis-Poynter NJ, Lynch DM, Vally H, Shellam GR, Rawlinson WD, Barrell BG and Farrell HE

    Department of Microbiology, University of Western Australia, Queen Elizabeth II Medical Centre, Nedlands, Australia. njdp@uniwa.uwa.edu.au

    This report describes the identification of a murine cytomegalovirus (MCMV) G protein-coupled receptor (GCR) homolog. This open reading frame (M33) is most closely related to, and collinear with, human cytomegalovirus UL33, and homologs are also present in human herpesvirus 6 and 7 (U12 for both viruses). Conserved counterparts in the sequenced alpha- or gammaherpesviruses have not been identified to date, suggesting that these genes encode proteins which are important for the biological characteristics of betaherpesviruses. We have detected transcripts for both UL33 and M33 as early as 3 or 4 h postinfection, and these reappear at late times. In addition, we have identified N-terminal splicing for both the UL33 and M33 RNA transcripts. For both open reading frames, splicing results in the introduction of amino acids which are highly conserved among known GCRs. To characterise the function of the M33 in the natural host, two independent MCMV recombinant viruses were prepared, each of which possesses an M33 open reading frame which has been disrupted with the beta-galactosidase gene. While the recombinant M33 null viruses showed no phenotypic differences in replication from wild-type MCMV in primary mouse embryo fibroblasts in vitro, they showed severely restricted growth in the salivary glands of infected mice. These data suggest that M33 plays an important role in vivo, in particular in the dissemination to or replication in the salivary gland, and provide the first evidence for the function of a viral GCR homolog in vivo.

    Funded by: Wellcome Trust

    Journal of virology 1997;71;2;1521-9

  • Expression of the dystrophin-related protein 2 (Drp2) transcript in the mouse.

    Dixon AK, Tait TM, Campbell EA, Bobrow M, Roberts RG and Freeman TC

    Human Genetics Group, The Sanger Centre, Wellcome Genome Campus, Hinxton, CB10 1SA, U.K.

    We have recently characterised a new member of the dystrophin gene family, DRP2, and its murine counterpart, Drp2, which encode dystrophin-related protein 2 (DRP2). DRP2 is predicted to resemble certain short C-terminal isoforms of dystrophin and dystrophin-related protein 1 (DRP1 or utrophin). We describe here a comprehensive survey of Drp2 expression in the mouse by RT-PCR, and compare the expression profile of Drp2 with that of the related genes Dmd, Drp1 and Dag1 that encode all the known isoforms of dystrophin, DRP1/utrophin and a component of the dystrophin-associated protein complex, dystroglycan, respectively. Drp2 was shown to be expressed throughout the central nervous system (CNS) and in several peripheral tissues including the eye, kidney, teeth, oesophagus, colon, epididymis and ovary. The expression of Drp2 in the CNS was then further defined by in situ hybridization. Overall, the pattern of Drp2 expression corresponds to a subset of the brain regions known to express Dag1, and shows substantial overlap with regions that express various isoforms of dystrophin (particularly in the cerebral cortex, hippocampus and cerebellum). These data define the distribution of Drp2 expression in the mouse, and raise the possibility that in the CNS it may be an important component in neuronal dystrophin-associated complexes.

    Funded by: Wellcome Trust

    Journal of molecular biology 1997;270;4;551-8

  • Sequence comparison of human and yeast telomeres identifies structurally distinct subtelomeric domains.

    Flint J, Bates GP, Clark K, Dorman A, Willingham D, Roe BA, Micklem G, Higgs DR and Louis EJ

    Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford, UK. jf@worf.molbiol.ox.ac.uk

    We have sequenced and compared DNA from the ends of three human chromosomes: 4p, 16p and 22q. In all cases the pro-terminal regions are subdivided by degenerate (TTAGGG)n repeats into distal and proximal sub-domains with entirely different patterns of homology to other chromosome ends. The distal regions contain numerous, short (<2 kb) segments of interrupted homology to many other human telomeric regions. The proximal regions show much longer (approximately 10-40 kb) uninterrupted homology to a few chromosome ends. A comparison of all yeast subtelomeric regions indicates that they too are subdivided by degenerate TTAGGG repeats into distal and proximal sub-domains with similarly different patterns of identity to other non-homologous chromosome ends. Sequence comparisons indicate that the distal and proximal sub-domains do not interact with each other and that they interact quite differently with the corresponding regions on other, non-homologous, chromosomes. These findings suggest that the degenerate TTAGGG repeats identify a previously unrecognized, evolutionarily conserved boundary between remarkably different subtelomeric domains.

    Funded by: Wellcome Trust

    Human molecular genetics 1997;6;8;1305-13

  • The relationship between chromosome structure and function at a human telomeric region.

    Flint J, Thomas K, Micklem G, Raynham H, Clark K, Doggett NA, King A and Higgs DR

    MRC Molecular Haematology Unit, John Radcliffe Hospital, Headington, Oxford, UK.

    We have sequenced a contiguous 284,495-bp segment of DNA extending from the terminal (TTAGGG)n repeats of the short arm of chromosome 16, providing a full description of the transition from telomeric through subtelomeric DNA to sequences that are unique to the chromosome. To complement and extend analysis of the primary sequence, we have characterized mRNA transcripts, patterns of DNA methylation and DNase I sensitivity. Together with previous data these studies describe in detail the structural and functional organization of a human telomeric region.

    Funded by: Wellcome Trust

    Nature genetics 1997;15;3;252-7

  • Immunoglobulin lambda light chain orphons on human chromosome 8q11.2.

    Frippiat JP, Dard P, Marsh S, Winter G and Lefranc MP

    Centre for Protein Engineering, MRC, Cambridge, GB. frippiat@scbiol.u-nancy.fr

    We have identified two V lambda genes outside the major lambda locus on chromosome 22q11.2, and shown that they reside on chromosome 8q11.2. One gene (Orphée1), hybridizing strongly to the V lambda probes, was sequenced and found to belong to the V lambda 8 family; the other gene (Orphée2) only hybridized weakly. Orphée1 was present in all individuals tested (140) from three different populations, and was also found in gorillas. We envisage that these genes were generated by duplication and translocation of the V lambda 8a gene (and a V lambda pseudogene) from the major locus, and that this event occurred before the evolutionary divergence of humans and gorillas. As there is no other evidence for V lambda genes outside the major locus, it appears that the human lambda locus has undergone considerably less evolutionary shuffling than either the human light chain kappa locus or the heavy chain locus.

    Funded by: Wellcome Trust

    European journal of immunology 1997;27;5;1260-5

  • Role of DNA mismatch repair in the cytotoxicity of ionizing radiation.

    Fritzell JA, Narayanan L, Baker SM, Bronner CE, Andrew SE, Prolla TA, Bradley A, Jirik FR, Liskay RM and Glazer PM

    Department of Therapeutic Radiology, Yale University School of Medicine, New Haven, Connecticut 06520-8040, USA.

    The DNA mismatch repair (MMR) system in mammalian cells not only serves to correct base mispairs and other replication errors, but it also influences the cellular response to certain forms of DNA damage. Cells that are deficient in MMR are relatively resistant to alkylation damage because, in wild-type cells, the MMR system is thought to promote toxicity via futile repair of alkylated mispairs. Conversely, MMR-deficient cells are sensitive to UV light, possibly due to the requirement for MMR factors in transcription-coupled repair of active genes. MMR deficiency has been associated with familial and sporadic carcinomas of the colon and other sites, and so, we sought to determine the influence of MMR status on cellular response to ionizing radiation, an agent commonly used for cancer therapy. Fibroblast cell lines were established from transgenic mice carrying targeted disruptions of one of three MMR genes in mammalian cells: Pms2, Mlh1, or Msh2. In comparison to wild-type cell lines from related mice, the Pms2-, Mlh1-, or Msh2-nullizygous cell lines were found to exhibit higher levels of clonogenic survival following exposure to ionizing radiation. Because ionizing radiation generates a variety of lesions in DNA, the differences in survival may reflect a role for MMR in processing a subset of these lesions, such as damaged bases. These results both identify a new class of DNA-damaging agents whose effects are modulated by the MMR system and may help to elucidate pathways of radiation response in cancer cells.

    Funded by: NIEHS NIH HHS: ES05775; NIGMS NIH HHS: GM32741, GM45413

    Cancer research 1997;57;22;5143-7

  • Assignment of the human stress-activated protein kinase-3 gene (SAPK3) to chromosome 22q13.3 by fluorescence in situ hybridization.

    Goedert M, Hasegawa J, Craxton M, Leversha MA and Clegg S

    MRC Laboratory of Molecular Biology, Cambridge, United Kingdom.

    Funded by: Wellcome Trust

    Genomics 1997;41;3;501-2

  • Genome mapping by fluorescent fingerprinting.

    Gregory SG, Howell GR and Bentley DR

    The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK. sgg@sanger.ac.uk

    The construction of sequence-ready maps of overlapping genomic clones is central to large-scale genome sequencing. We have implemented a method for fluorescent fingerprinting of bacterial clones to assemble contig maps. The method utilizes three spectrally distinct fluorescently tagged dideoxy ATPs to specifically label the HindIII termini in HindIII and Sau3AI restriction digests of clones that are multiplexed prior to electrophoresis and data collection. There is excellent reproducibility of raw data, improved resolution of large fragments, and concordance between the results obtained using this and the equivalent radioactive protocol. This method also allows detection of smaller overlaps between clones when compared to the analysis of restriction digests on nondenaturing agarose gels.

    Genome research 1997;7;12;1162-8

  • Characterization of the mouse beta-prime adaptin gene; cDNA sequence, genomic structure, and chromosomal localization.

    Guilbaud C, Peyrard M, Fransson I, Clifton SW, Roe BA, Carter NP and Dumanski JP

    Department of Molecular Medicine, Karolinska Hospital, CMM-building L8:00, S-171 76 Stockholm, Sweden.

    Adaptins are important subunits of heterotetrameric complexes called adaptors, which participate in clathrin-coated vesicle-mediated endocytosis and intracellular receptor transport. The gene family of adaptins is divided into three classes, alpha, beta, and gamma, with further subdivision into beta- and beta-prime components. Two beta-prime adaptins, the rat AP105a and the human BAM22, have previously been characterized. The BAM22 gene is located on human Chromosome (Chr) 22q12 and can be considered a candidate meningioma tumor suppressor gene. We report here the characterization of the mouse ortholog of the BAM22 gene, and we suggest the name adtb1 for the mouse gene. Like the BAM22 gene, the adtb1 transcript is highly and ubiquitously expressed. We provide a 3885-bp cDNA sequence, which entirely covers the open reading frame of adtb1, capable of encoding a protein of 943 amino acids. The adtb1 protein is highly conserved (>96% identity) when compared with AP105a and BAM22 proteins. We also report the genomic organization of adtb1, which is similar to that of the BAM22 gene. The adtb1 gene has been assigned to mouse Chr 11, band 11A2, which confirms the synteny between human Chr 22q12 and mouse Chr 11.

    Mammalian genome : official journal of the International Mammalian Genome Society 1997;8;9;651-6

  • New horizons in sequence analysis.

    Hubbard TJ

    Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK. th@sanger.ac.uk

    An ever increasing number of protein sequences are being compared, partly because of the availability of full sets of protein sequences from several completed genome-sequencing projects. The resulting problem of scale has shifted the emphasis of sequence analysis method development from sensitivity and flexibility, which relies on manual intervention and interpretation, to the automatic generation of results of known reliability.

    Funded by: Wellcome Trust

    Current opinion in structural biology 1997;7;2;190-3

  • The nucleotide sequence of Saccharomyces cerevisiae chromosome IV.

    Jacq C, Alt-Mörbe J, Andre B, Arnold W, Bahr A, Ballesta JP, Bargues M, Baron L, Becker A, Biteau N, Blöcker H, Blugeon C, Boskovic J, Brandt P, Brückner M, Buitrago MJ, Coster F, Delaveau T, del Rey F, Dujon B, Eide LG, Garcia-Cantalejo JM, Goffeau A, Gomez-Peris A, Zaccaria P et al.

    Laboratoire de Génétique Moléculaire, URA 1302 du CNRS, Ecole Normale Supérieure, Paris, France. jacq@biologie.ens.fr

    The complete DNA sequence of the yeast Saccharomyces cerevisiae chromosome IV has been determined. Apart from chromosome XII, which contains the 1-2 Mb rDNA cluster, chromosome IV is the longest S. cerevisiae chromosome. It was split into three parts, which were sequenced by a consortium from the European Community, the Sanger Centre, and groups from St Louis and Stanford in the United States. The sequence of 1,531,974 base pairs contains 796 predicted or known genes, 318 (39.9%) of which have been previously identified. Of the 478 new genes, 225 (28.3%) are homologous to previously identified genes and 253 (32%) have unknown functions or correspond to spurious open reading frames (ORFs). On average there is one gene approximately every two kilobases. Superimposed on alternating regional variations in G+C composition, there is a large central domain with a lower G+C content that contains all the yeast transposon (Ty) elements and most of the tRNA genes. Chromosome IV shares with chromosomes II, V, XII, XIII and XV some long clustered duplications which partly explain its origin.

    Nature 1997;387;6632 Suppl;75-8

  • Two methods for improving performance of an HMM and their application for gene finding.

    Krogh A

    Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark. krogh@cbs.dtu.dk

    A hidden Markov model for gene finding consists of submodels for coding regions, splice sites, introns, intergenic regions and possibly more. It is described how to estimate the model as a whole from labeled sequences instead of estimating the individual parts independently from subsequences. It is argued that the standard maximum likelihood estimation criterion is not optimal for training such a model. Instead of maximizing the probability of the DNA sequence, one should maximize the probability of the correct prediction. Such a criterion, called conditional maximum likelihood, is used for the gene finder 'HMMgene'. A new (approximative) algorithm is described, which finds the most probable prediction summed over all paths yielding the same prediction. We show that these methods contribute significantly to the high performance of HMMgene.

    Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology 1997;5;179-86
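
    The distinction drawn in the abstract above, between the single most probable path and the most probable prediction summed over all paths that yield it, can be illustrated on a toy model. The sketch below is not the approximative algorithm of the paper: it enumerates every path by brute force, which is only feasible for very short sequences. The three states, their labels and all probabilities are invented for illustration.

```python
from collections import defaultdict
from itertools import product

# Toy HMM (all numbers are illustrative assumptions, not from the paper).
# Two distinct non-coding states share the label "n".
STATES = ("C", "N1", "N2")
LABEL = {"C": "c", "N1": "n", "N2": "n"}
START = {"C": 0.4, "N1": 0.3, "N2": 0.3}
EMIT = {
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "N1": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "N2": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}


def transition(prev, cur):
    """Stay in the same state with probability 0.8, else 0.1 to each other state."""
    return 0.8 if prev == cur else 0.1


def path_probability(path, obs):
    """Joint probability of one state path and the observed sequence."""
    prob = START[path[0]] * EMIT[path[0]][obs[0]]
    for prev, cur, symbol in zip(path, path[1:], obs[1:]):
        prob *= transition(prev, cur) * EMIT[cur][symbol]
    return prob


def labeling(path):
    """Collapse a state path into the prediction (label string) it yields."""
    return "".join(LABEL[state] for state in path)


def best_path_labeling(obs):
    """Labels of the single most probable path (what Viterbi decoding returns)."""
    paths = product(STATES, repeat=len(obs))
    return labeling(max(paths, key=lambda p: path_probability(p, obs)))


def best_labeling(obs):
    """Most probable labeling, summing over all paths with the same labels."""
    totals = defaultdict(float)
    for path in product(STATES, repeat=len(obs)):
        totals[labeling(path)] += path_probability(path, obs)
    return max(totals, key=totals.get)
```

    For the observation "GC" in this toy model, the single best path stays in the coding state, whereas the two non-coding states together carry more probability, so the two criteria give different predictions.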

  • Antidiabetic sulphonylureas stimulate acetylcholine release from striatal cholinergic interneurones through inhibition of K(ATP) channel activity.

    Lee K, Brownhill V and Richardson PJ

    Parke Davis Neuroscience Research Centre, Cambridge University Forvie Site, England.

    The sulphonylureas tolbutamide and glibenclamide were shown to stimulate acetylcholine release from rat striatal slices. To determine the mechanism of this effect, whole-cell patch-clamp recordings were made from large neurones within the striatum that displayed morphological, electrophysiological, and pharmacological characteristics typical of cholinergic interneurones. Dialysis of these neurones with a pipette solution containing low concentrations of ATP produced a gradual hyperpolarisation that could be reversed by bath application of the sulphonylureas. In voltage-clamp studies, these compounds were shown to act through the inhibition of a potassium conductance. It is concluded that cholinergic interneurones within the rat striatum express sulphonylurea-sensitive ATP-sensitive potassium channel activity. These channels are probably cytoprotective and may prove to be novel sites of therapeutic modulation.

    Funded by: Wellcome Trust

    Journal of neurochemistry 1997;69;4;1774-6

  • Instability of highly expanded CAG repeats in mice transgenic for the Huntington's disease mutation.

    Mangiarini L, Sathasivam K, Mahal A, Mott R, Seller M and Bates GP

    Division of Medical and Molecular Genetics, Guy's Hospital, London, UK.

    Six inherited neurodegenerative diseases are caused by a CAG/polyglutamine expansion, including spinal and bulbar muscular atrophy (SBMA), Huntington's disease (HD), spinocerebellar ataxia type 1 (SCA1), dentatorubral pallidoluysian atrophy (DRPLA), Machado-Joseph disease (MJD or SCA3) and SCA2. Normal and expanded HD allele sizes of 6-39 and 35-121 repeats have been reported, and the allele distributions for the other diseases are comparable. Intergenerational instability has been described in all cases, and repeats tend to be more unstable on paternal transmission. This may present as larger increases on paternal inheritance as in HD, or as a tendency to increase on male and decrease on female transmission as in SCA1 (ref. 15). Somatic repeat instability is also apparent and appears most pronounced in the CNS. The major exception is the cerebellum, which in HD, DRPLA, SCA1 and MJD has a smaller repeat relative to the other brain regions tested. Of non-CNS tissues, instability was observed in blood, liver, kidney and colon. A mouse model of CAG repeat instability would be helpful in unravelling its molecular basis although an absence of CAG repeat instability in transgenic mice has so far been reported. These studies include (CAG) in the androgen receptor cDNA, (CAG) in the HD cDNA, (CAG) in the SCA1 cDNA, (CAG) in the SCA3 cDNA and as an isolated (CAG) tract.

    Nature genetics 1997;15;2;197-200

  • Gene expression and development databases for C. elegans

    Martinelli SD, Brown CG and Durbin R

    Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK

    The nematode worm C. elegans, with its transparent body, is an excellent vehicle for studying developmental gene expression during embryogenesis and throughout its short life. Expression data from in-situ hybridization, immunolocalization and reporter constructs have been put into the ACeDB database, which is used to store and disseminate most types of C. elegans data, and is also widely used for genome-sequencing projects. In the database, the gene-expression patterns are linked to genes, sequences, cells, organs and the developmental stage in which expression occurs. An accessory program 'Angler' can be used to browse sectional Nomarski images of the worm embryo during early development, and to relate these images to overlaid cell lineage data and 3-D schematic views of cell positions. Copyright 1997 Academic Press Limited

    Seminars in cell & developmental biology 1997;8;5;459-67

  • Cytogenetic analysis of three breast carcinoma cell lines using reverse chromosome painting.

    Morris JS, Carter NP, Ferguson-Smith MA and Edwards PA

    Department of Pathology, University of Cambridge, England.

    Chromosome painting was used to determine the copy number and identity of virtually all the chromosomes in three breast cancer cell lines, T-47D, MDA-MB-361, and ZR-75-1. The karyotypes of all three cell lines were very complex, and were consistent with the monosomic pattern of evolution suggested by Dutrillaux, in which nonreciprocal translocations cause an initial reduction in chromosome number, followed by duplication of the entire genome and further chromosome loss. Twenty distinct abnormal chromosomes were identified in T-47D, seven of which were present as two copies. MDA-MB-361 had 27 abnormal chromosomes each as a single copy. Thirteen abnormal chromosomes in ZR-75-1 occurred singly, two were paired, and one was present as three copies. Most of the aberrant chromosomes were nonreciprocal translocations, although deletions, duplications, isochromosomes, and amplifications (HSR of 1q) were also found. Chromosome arms present in abnormal chromosomes in all three lines were 1q, 6p, 7p, 8p, 8q, 10q, 11p, 11q, 12p, 13q, 14q, 15q, 16p, 16q, 17q, and 20q. The only chromosome arms present in four or more copies in all three lines were 8q and proximal 12p, while 1p, 17p, and bands 11q12--13 were the only chromosome regions consistently reduced to two copies. The most striking feature common to all three lines was a translocation breakpoint on the short arm of chromosome 8 at 8p12.

    Funded by: Wellcome Trust

    Genes, chromosomes & cancer 1997;20;2;120-39

  • EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA.

    Mott R

    Informatics Group, Sanger Centre, Cambridge, UK.

    Funded by: Wellcome Trust

    Computer applications in the biosciences : CABIOS 1997;13;4;477-8

  • Critical assessment of methods of protein structure prediction (CASP): round II.

    Moult J, Hubbard T, Bryant SH, Fidelis K and Pedersen JT

    Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville 20850, USA. jmoult@carb.nist.gov

    Proteins 1997;Suppl 1;2-6

  • From long range mapping to sequence-ready contigs on human chromosome 6.

    Mungall AJ, Humphray SJ, Ranby SA, Edwards CA, Heathcott RW, Clee CM, Holloway E, Peck AI, Harrison P, Green LD, Butler AP, Langford CF, William RG, Huckle EJ, Baron L, Smith A, Leversha MA, Ramsey YH, Clegg SM, Rice CM, Maslen GL, Hunt SE, Scott CE, Soderlund CA, Dunham I et al.

    The Sanger Centre, Hinxton, Cambridge, UK.

    Our aim is to construct physical clone maps covering those regions of chromosome 6 that are not currently extensively mapped, and use these to determine the DNA sequence of the whole chromosome. The strategy we are following involves establishing a high density framework map of the order of 15 markers per Megabase using radiation hybrid (RH) mapping. The markers are then used to identify large-insert genomic bacterial clones covering the chromosome, which are assembled into sequence-ready contigs by restriction enzyme fingerprinting and sequence tagged site (STS) content analysis. Contig gap closure is performed by walking experiments using STSs developed from the end sequences of the clone inserts.

    DNA sequence : the journal of DNA sequencing and mapping 1997;8;3;151-4

  • Trisomy 5q12-->q13.3 in a patient with add(13q): characterization of an interchromosomal insertion by forward and reverse chromosome painting.

    Nordgren A, Arver S, Kvist U, Carter N and Blennow E

    Department of Molecular Medicine, Karolinska Institutet, Stockholm, Sweden.

    We report on a patient with azoospermia, mild mental retardation, and minor physical anomalies. Chromosome analysis demonstrated the presence of additional material on the long arm of one chromosome 13. Forward chromosome painting using chromosome-specific libraries showed an insertion of material from chromosome 5. Further characterization with flow sorting of the aberrant chromosome and amplification by DOP-PCR followed by reverse chromosome painting showed specific trisomy of 5q12-->q13.3.

    Funded by: Wellcome Trust

    American journal of medical genetics 1997;73;3;351-5

  • Intermediate sequences increase the detection of homology between sequences.

    Park J, Teichmann SA, Hubbard T and Chothia C

    Cambridge Centre for Protein Engineering, UK.

    Two homologous sequences, which have diverged beyond the point where their homology can be recognised by a simple direct comparison, can be related through a third sequence that is suitably intermediate between the two. High scores, for a sequence match between the first and third sequences and between the second and the third sequences, imply that the first and second sequences are related even though their own match score is low. We have tested the usefulness of this idea using a database that contains the sequences of 971 protein domains whose structures are known and whose residue identities with each other are some 40% or less (PDB40D). On the basis of sequence and structural information, 2143 pairs of these sequences are known to have an evolutionary relationship. FASTA, in an all-against-all comparison of the sequences in the database, detected 320 (15%) of these relationships as well as three false positives (i.e. 1% error rate). Using intermediate sequences found by FASTA matches of PDB40D sequences to those in the large non-redundant OWL database we could detect 550 evolutionary relationships with an error rate of 1%. This means the intermediate sequence procedure increases the ability to recognise the evolutionary relationships amongst the PDB40D sequences by 70%.

    Funded by: Wellcome Trust

    Journal of molecular biology 1997;273;1;349-54
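
    The intermediate-sequence idea described in the abstract above can be sketched as a simple lookup: two sequences are linked if some third sequence scores highly against both. The code below is a minimal illustration under our own assumptions; the sequence names, scores and threshold are hypothetical and do not correspond to FASTA output or to the PDB40D and OWL data.

```python
def pair_score(scores, x, y):
    """Look up a symmetric match score; pairs never compared score 0."""
    return scores.get(frozenset((x, y)), 0.0)


def linking_intermediates(a, b, candidates, scores, threshold):
    """Return the candidates that match both a and b at or above threshold."""
    return [
        c
        for c in candidates
        if c not in (a, b)
        and pair_score(scores, a, c) >= threshold
        and pair_score(scores, b, c) >= threshold
    ]


# Hypothetical match scores: domA and domB score poorly against each other,
# but both score well against seqX. seqY matches only domA.
TOY_SCORES = {
    frozenset(("domA", "domB")): 40.0,
    frozenset(("domA", "seqX")): 180.0,
    frozenset(("domB", "seqX")): 150.0,
    frozenset(("domA", "seqY")): 170.0,
    frozenset(("domB", "seqY")): 30.0,
}

if __name__ == "__main__":
    links = linking_intermediates(
        "domA", "domB", ["seqX", "seqY"], TOY_SCORES, threshold=100.0
    )
    print("direct score:", pair_score(TOY_SCORES, "domA", "domB"))
    print("linked through:", links)
```

    In this toy data set the direct comparison falls below the threshold, yet the pair is still connected through one intermediate. A real application would also need a way to control the false positives that chaining can introduce, which the abstract reports as a 1% error rate.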

  • Tc7, a Tc1-hitch hiking transposon in Caenorhabditis elegans.

    Rezsohazy R, van Luenen HG, Durbin RM and Plasterk RH

    Division of Molecular Biology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands.

    We have found a novel transposon in the genome of Caenorhabditis elegans. Tc7 is a 921 bp element, made up of two 345 bp inverted repeats separated by a unique, internal sequence. Tc7 does not contain an open reading frame. The outer 38 bp of the inverted repeat show 36 matches with the outer 38 bp of Tc1. This region of Tc1 contains the Tc1-transposase binding site. Furthermore, Tc7 is flanked by TA dinucleotides, just like Tc1, which presumably correspond to the target duplication generated upon integration. Since Tc7 does not encode its own transposase but contains the Tc1-transposase binding site at its extremities, we tested the ability of Tc7 to jump upon forced expression of Tc1 transposase in somatic cells. Under these conditions Tc7 jumps at a frequency similar to Tc1. The target site choice of Tc7 is identical to that of Tc1. These data suggest that Tc7 shares with Tc1 all the sequences minimally required to parasitize upon the Tc1 transposition machinery. The genomic distribution of Tc7 shows a striking clustering on the X chromosome where two thirds of the elements (20 out of 33) are located. Related transposons in C. elegans do not show this asymmetric distribution.

    Funded by: NCRR NIH HHS: 5 R01 RR10082-02; Wellcome Trust

    Nucleic acids research 1997;25;20;4048-54

  • Comparative analysis of the polycystic kidney disease 1 (PKD1) gene reveals an integral membrane glycoprotein with multiple evolutionary conserved domains.

    Sandford R, Sgotto B, Aparicio S, Brenner S, Vaudin M, Wilson RK, Chissoe S, Pepin K, Bateman A, Chothia C, Hughes J and Harris P

    Department of Medicine, Addenbrooke's Hospital, Cambridge, UK. rsandfor@med.cam.ac.uk

    PKD1 is the major locus of the common genetic disorder autosomal dominant polycystic kidney disease (ADPKD). Analysis of the predicted protein sequence of the human PKD1 gene, polycystin, shows a large molecule with a unique arrangement of extracellular domains and multiple putative transmembrane regions. The precise function of polycystin remains unclear with a paucity of mutations to define key structural and functional domains. To refine the structure of this protein we have cloned the genomic region encoding the Fugu PKD1 gene. Fugu PKD1 spans 36 kb of genomic DNA and has greater complexity with 54 exons compared with 46 in man. Comparative analysis of the predicted protein sequences shows a lower level of homology than in similar studies, with 40% identity and 59% similarity. However, key structural motifs including leucine rich repeats (LRR), a C-type lectin and LDL-A like domains and 16 PKD repeats are maintained. A region of homology with the sea urchin REJ protein was also confirmed in Fugu but found to extend over 1000 amino acids. Several highly conserved intra- and extra-cellular regions, with no known sequence homologies, that are likely to be of functional importance were detected. The likely structure of the membrane associated region has been refined with similarity to the PKD2 protein and voltage gated Ca2+ and Na+ channels highlighted over part of this area. The overall protein structure has therefore been clarified and this comparative analysis derived structure will form the basis for the functional study of polycystin and its individual domains.

    Human molecular genetics 1997;6;9;1483-9

  • Regions of human chromosome 2 (2q32-q35) and mouse chromosome 1 show synteny with the pufferfish genome (Fugu rubripes).

    Schofield JP, Elgar G, Greystrong J, Lye G, Deadman R, Micklem G, King A, Brenner S and Vaudin M

    Department of Medicine, University of Cambridge, Addenbrooke's Hospital, United Kingdom. jps1@mole.bio.cam.ac.uk

    We have isolated and sequenced a cosmid clone from the compact genome of the Japanese pufferfish (Fugu rubripes) containing portions of three genes that have the same order as in human. The gene order is microtubule-associated protein (MAP-2), myosin light chain (MYL-1), and carbamoyl phosphate synthetase (CPS III). The intron-exon organization of Fugu CPS III is identical with that of rat CPS I, although the equivalent genomic fragments of rat and Fugu CPS span 87.9 and 21 kb, respectively. This is the first report of a piscine CPS III genomic structure and predicts a close evolutionary link between CPS III and CPS I. The 8-kb intergenic region between MYL-1 and CPS gave no clear areas of transcription factor-binding sites by pairwise comparison with shark or rat CPS promoter regions. However, there was a match with the rat myosin light chain 2 (MLC-2) gene promoter and a MyoD transcription factor-binding site 874 bp upstream of the MYL-1 gene.

    Funded by: Wellcome Trust

    Genomics 1997;45;1;158-67

  • FPC: a system for building contigs from restriction fingerprinted clones.

    Soderlund C, Longden I and Mott R

    Sanger Centre, Hinxton, Cambridge, UK. cari@sanger.ac.uk

    MOTIVATION: To meet the demands of large-scale sequencing, thousands of clones must be fingerprinted and assembled into contigs. To determine the order of clones, a typical experiment is to digest the clones with one or more restriction enzymes and measure the resulting fragments. The probability of two clones overlapping is based on the similarity of their fragments. A contig contains two or more overlapping clones and a minimal tiling path of clones is selected to be sequenced. Interactive software with algorithmic support is necessary to assemble the clones into contigs quickly. RESULTS: FPC (fingerprinted contigs) is an interactive program for building contigs from restriction fingerprinted clones. FPC uses an algorithm to cluster clones into contigs based on their probability of coincidence score. For each contig, it builds a consensus band (CB) map which is similar to a restriction map; but it does not try to resolve all the errors. The CB map is used to assign coordinates to the clones based on their alignment to the map and to provide a detailed visualization of the clone overlap. FPC has editing facilities for the user to refine the coordinates and to remove poorly fingerprinted clones. Functions are available for updating an FPC database with new clones. Contigs can easily be merged, split or deleted. Markers can be added to clones and are displayed with the appropriate contig. Sequence-ready clones can be selected and their sequencing status displayed. As such, FPC is an integrated program for the assembly of sequence-ready clones for large-scale sequencing projects.

    Funded by: Wellcome Trust

    Computer applications in the biosciences : CABIOS 1997;13;5;523-35
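
    The abstract above states that the probability of two clones overlapping is based on the similarity of their fragments. A minimal sketch of that idea follows. It is a simplified binomial model of our own choosing, not FPC's actual scoring formula: the matching tolerance, the per-band chance match probability `p_match` and the band sizes are all hypothetical.

```python
from math import comb


def count_shared_bands(bands_a, bands_b, tolerance):
    """Count bands that pair up one-to-one within +/- tolerance.

    Both fingerprints are sorted and walked in step, so each band is
    used in at most one match.
    """
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def coincidence_probability(n_bands, shared, p_match):
    """Chance of seeing at least `shared` matches among n_bands by accident.

    Binomial tail: each band is assumed to match independently with
    probability p_match. A small value suggests a genuine overlap.
    """
    return sum(
        comb(n_bands, k) * p_match**k * (1 - p_match) ** (n_bands - k)
        for k in range(shared, n_bands + 1)
    )


if __name__ == "__main__":
    clone_1 = [100, 250, 400, 900]  # hypothetical band sizes
    clone_2 = [102, 248, 650, 905]
    shared = count_shared_bands(clone_1, clone_2, tolerance=5)
    print("shared bands:", shared)
    print("chance probability:", coincidence_probability(4, shared, p_match=0.1))
```

    Clone pairs whose chance probability falls below a chosen cutoff would then be candidates for clustering into the same contig; the choice of cutoff is left open here.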

  • Analysis of protein domain families in Caenorhabditis elegans.

    Sonnhammer EL and Durbin R

    Sanger Centre, Cambridge, United Kingdom.

    The Caenorhabditis elegans genome sequencing project has completed over half of this nematode's 100-Mb genome. Proteins predicted in the finished sequence have been compiled and released in the database Wormpep. Presented here is a comprehensive analysis of protein domain families in Wormpep 11, which comprises 7299 proteins. The relative abundance of common protein domain families was counted by comparing all Wormpep proteins to the Pfam collection of protein families, which is based on recognition by hidden Markov models. This analysis also identified a number of previously unannotated domains. To investigate new apparently nematode-specific protein families, Wormpep was clustered into domain families on the basis of sequence similarity using the Domainer program. The largest clusters that lacked clear homology to proteins outside Nematoda were analyzed in further detail, after which some could be assigned a putative function. We compared all proteins in Wormpep 11 to proteins in the human, Saccharomyces cerevisiae, and Haemophilus influenzae genomes. Among the results are the estimation that over two-thirds of the currently known human proteins are likely to have a homologue in the whole C. elegans genome and that a significant number of proteins are well conserved between C. elegans and H. influenzae, that are not found in S. cerevisiae.

    Funded by: Wellcome Trust

    Genomics 1997;46;2;200-16

  • Pfam: a comprehensive database of protein domain families based on seed alignments.

    Sonnhammer EL, Eddy SR and Durbin R

    Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

    Databases of multiple sequence alignments are a valuable aid to protein sequence classification and analysis. One of the main challenges when constructing such a database is to simultaneously satisfy the conflicting demands of completeness on the one hand and quality of alignment and domain definitions on the other. The latter properties are best dealt with by manual approaches, whereas completeness in practice is only amenable to automatic methods. Herein we present a database based on hidden Markov model profiles (HMMs), which combines high quality and completeness. Our database, Pfam, consists of parts A and B. Pfam-A is curated and contains well-characterized protein domain families with high quality alignments, which are maintained by using manually checked seed alignments and HMMs to find and align all members. Pfam-B contains sequence families that were generated automatically by applying the Domainer algorithm to cluster and align the remaining protein sequences after removal of Pfam-A domains. By using Pfam, a large number of previously unannotated proteins from the Caenorhabditis elegans genome project were classified. We have also identified many novel family memberships in known proteins, including new kazal, Fibronectin type III, and response regulator receiver domains. Pfam-A families have permanent accession numbers and form a library of HMMs available for searching and automatic annotation of new protein sequences.

    Funded by: NHGRI NIH HHS: HG01363; Wellcome Trust

    Proteins 1997;28;3;405-20

  • The Chromosome 6 database at the Sanger Centre.

    Theaker AJ, Maslen GL, Scott CE, Rice CM, Hunt SE, King A, Mungall AJ, Dunham I and Beck S

    The Sanger Centre, Hinxton, Cambridge, UK.

    The Sanger Centre Chromosome 6 Database (6ace) has been developed as the primary means of release of annotated sequencing and mapping information for human chromosome 6 from the Sanger Centre. It is also being used to curate global data from published and unpublished external sources. The rationale behind the development of 6ace is described, together with information as to how to access the database.

    DNA sequence : the journal of DNA sequencing and mapping 1997;8;3;167-71

  • Chromosomal localization, gene structure and transcription pattern of the ORFX gene, a homologue of the MHC-linked RING3 gene.

    Thorpe KL, Gorman P, Thomas C, Sheer D, Trowsdale J and Beck S

    The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

    We have mapped the human ORFX gene to chromosome 9q34 and determined its complete gene structure. Comparison with RING3, the human MHC-linked homologue on 6p21.3, shows the two gene structures to be highly conserved but with an approximate threefold expansion in the ORFX introns. RING3 and ORFX are found to be ubiquitously expressed in human adult and foetal tissues. Evidence suggests that the two genes may have arisen from an ancient duplication in a common ancestral chromosome.

    Funded by: Cancer Research UK: A3585

    Gene 1997;200;1-2;177-83

  • Co-ordinate regulation of the cystic fibrosis and multidrug resistance genes in cystic fibrosis knockout mice.

    Trezise AE, Ratcliff R, Hawkins TE, Evans MJ, Freeman TC, Romano PR, Higgins CF and Colledge WH

    Nuffield Department of Clinical Biochemistry, University of Oxford, John Radcliffe Hospital, UK.

    The cystic fibrosis (Cftr) and multidrug resistance (Mdr1) genes encode structurally similar proteins which are members of the ABC transporter superfamily. These genes exhibit complementary patterns of expression in vivo, suggesting that the regulation of their expression may be co-ordinated. We have tested this hypothesis in vivo by examining Cftr and Mdr1 expression in cystic fibrosis knockout transgenic mice (Cftr(tm1CAM)). Cftr mRNA expression in Cftr(tm1CAM)/Cftr(tm1CAM) mice was 4-fold reduced in the intestine, as compared with littermate wild-type mice. All other Cftr(tm1CAM)/Cftr(tm1CAM) mouse tissues examined showed similar reductions in Cftr expression. In contrast, we observed a 4-fold increase in Mdr1 mRNA expression in the intestines of neonatal and 3- to 4-week-old Cftr(tm1CAM)/Cftr(tm1CAM) mice, as compared with age-matched +/+ mice, and an intermediate level of Mdr1 mRNA in heterozygous Cftr(tm1CAM) mice. In 10-week-old Cftr(tm1CAM)/Cftr(tm1CAM) mice, in contrast to the younger mice, Mdr1 mRNA expression was reduced 3-fold. The expression of two control genes, Pgk-1 and Mdr2, was similar in all genotypes, suggesting that the changes in Mdr1 mRNA levels observed in the Cftr(tm1CAM)/Cftr(tm1CAM) mice are specific to the loss of Cftr expression and/or function. These data provide further evidence supporting the hypothesis that the regulation of Cftr and Mdr1 expression is co-ordinated in vivo, and that this co-ordinate regulation is influenced by temporal factors.

    Funded by: Wellcome Trust

    Human molecular genetics 1997;6;4;527-37

  • Report and abstracts of the third international workshop on human chromosome 1 mapping 1997.

    Vance JM, Matise TC, Wooster R, Schutte BC, Bruns GA, van Roy N, Brodeur GM, Tao YX, Gregory S, Weith A, Vaudin M and White P

    Department of Medicine, Duke University Medical Center, Durham NC 27710, USA. jeff@dnadoc.mc.duke.edu

    Funded by: NHGRI NIH HHS: 1-R13-HGO1350-01; PHS HHS: P01-N5-26630

    Cytogenetics and cell genetics 1997;78;3-4;154-82

  • High-resolution physical map of the X-linked retinoschisis interval in Xp22.

    Walpole SM, Nicolaou A, Howell GR, Whittaker A, Bentley DR, Ross MT, Yates JR and Trump D

    Department of Pathology, University of Cambridge, Addenbrooke's Hospital, United Kingdom.

    X-linked retinoschisis (RS) is the leading cause of macular degeneration in young males and has been mapped to Xp22 between DXS418 and DXS999. To facilitate identification of the RS gene, we have constructed a yeast artificial chromosome (YAC) contig across this region comprising 28 YACs and 32 sequence-tagged sites including seven novel end clone markers. To establish the definitive marker order, a PAC contig containing 50 clones was also constructed, and all clones were fingerprinted. The marker order is: Xpter-DXS1317-(AFM205yd12-DXS7175-DXS7992)-60N8-T7-DXS1195-DXS7993-DXS7174-60N8-SP6-DXS418-DXS7994-DXS7995-DXS7996-HYAT2-25HA10R-HYAT1-DXS7997-DXS7998-DXS257-434E8R-3542R-DXS6762-DXS7999-DXS6763-434E8L-DXS8000-DXS6760-DXS7176-DXS8001-DXS999-3176R-PHKA2-Xcen. A long-range restriction map was constructed, and the RS region is estimated to be 1300 kb, containing three putative CpG islands. An unstable region was identified between DXS6763 and 434E8L. These data will facilitate positional cloning of RS and other disease genes in Xp22.

    Funded by: Wellcome Trust

    Genomics 1997;44;3;300-8

  • Numerical criteria for the evaluation of ab initio predictions of protein structure.

    Zemla A, Venclovas C, Reinhardt A, Fidelis K and Hubbard TJ

    Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, Livermore, CA 94551, USA.

    As part of the CASP2 protein structure prediction experiment, a set of numerical criteria were defined for the evaluation of "ab initio" predictions. The evaluation package comprises a series of electronic submission formats, a submission validator, evaluation software, and a series of scripts to summarize the results for the CASP2 meeting and for presentation via the World Wide Web (WWW). The evaluation package is accessible for use on new predictions via WWW so that results can be compared to those submitted to CASP2. With further input from the community, the evaluation criteria are expected to evolve into a comprehensive set of measures capturing the overall quality of a prediction as well as critical detail essential for further development of prediction methods. We discuss present measures, limitations of the current criteria, and possible improvements.

    Funded by: Wellcome Trust

    Proteins 1997;Suppl 1;140-50
