Dr Jose M Gonzalez | Former Senior Bioinformatician at the Sanger Institute

This person is a member of Sanger Institute Alumni.

Gonzalez, Jose M

Jose provides bioinformatics support to the HAVANA genome annotation team and coordinates the GENCODE data releases.

As a bioinformatician in the HAVANA team, I perform bespoke analyses to support the manual annotation of various vertebrate genomes. This work is mainly done in the context of the GENCODE project, which aims to produce a highly accurate reference gene annotation of the human and mouse genomes. One of my tasks involves importing and analysing transcriptomics data and computational predictions, either publicly available or generated by GENCODE collaborators, in order to find genomic regions where novel gene annotation can be introduced. GENCODE has recently been generating long-read RNA-seq data, which have proved useful both for improving the annotation of existing genes and for finding novel genes, especially long non-coding RNA genes.

As GENCODE data coordinator, I am responsible for maintaining and improving the GENCODE release pipeline, which has been adapted to the increasing complexity of release files over the last five years. I am also involved in updating the gencodegenes.org website and coordinating the GENCODE helpdesk, and I contribute to the QC of the GENCODE gene set.

My background is in biology. During my PhD in molecular biology, as a wet lab scientist, I carried out work that led to the development of the first infectious cDNA of the largest RNA virus genome at the time. Subsequently, I moved into bioinformatics, where I collaborated with experimental groups using protein sequence and structure analysis. Before joining the Sanger Institute, I was involved in the functional genomics analysis of the effect of virus infection on host cells.

Publications

  • Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq).

    Lagarde J, Uszczynska-Ratajczak B, Santoyo-Lopez J, Gonzalez JM, Tapanari E et al.

    Nature communications 2016;7;12339

  • Improving GENCODE reference gene annotation using a high-stringency proteogenomics workflow.

    Wright JC, Mudge J, Weisser H, Barzine MP, Gonzalez JM et al.

    Nature communications 2016;7;11778

  • Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction.

    Frankish A, Uszczynska B, Ritchie GR, Gonzalez JM, Pervouchine D et al.

    BMC genomics 2015;16 Suppl 8;S2

  • The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes.

    Steward CA, Gonzalez JM, Trevanion S, Sheppard D, Kerry G et al.

    Database : the journal of biological databases and curation 2013;2013;bat032

  • GENCODE: the reference human genome annotation for The ENCODE Project.

    Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M et al.

    Genome research 2012;22;9;1760-74

  • An integrated encyclopedia of DNA elements in the human genome.

    ENCODE Project Consortium

    Nature 2012;489;7414;57-74

  • The Vertebrate Genome Annotation browser 10 years on.

    Harrow JL, Steward CA, Frankish A, Gilbert JG, Gonzalez JM et al.

    Nucleic acids research 2014;42;Database issue;D771-9

  • Current status and new features of the Consensus Coding Sequence database.

    Farrell CM, O'Leary NA, Harte RA, Loveland JE, Wilming LG et al.

    Nucleic acids research 2014;42;Database issue;D865-72

  • The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression.

    Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S et al.

    Genome research 2012;22;9;1775-89

  • Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome.

    Howald C, Tanzer A, Chrast J, Kokocinski F, Derrien T et al.

    Genome research 2012;22;9;1698-710

Jose's Timeline
2010

Joined the HAVANA team at the Wellcome Trust Sanger Institute

2005

Poxvirus and Vaccines Lab, CNB-CSIC, Madrid, Spain

2002

Protein Design Group, CNB-CSIC, Madrid, Spain

2000

PhD in Molecular Biology - National Centre for Biotechnology (CNB-CSIC), Madrid, Spain