Swine Genome Sequencing Project
The genome of the pig (Sus scrofa) comprises 18 autosomes plus the X and Y sex chromosomes. At around 2.7 Gb, its size is similar to that of the human genome.
The pig is an artiodactyl (cloven-hoofed mammal), a member of an evolutionary clade distinct from the primates and rodents. Nevertheless, there is extensive conserved homology between the pig genome and the human genome. The pig is therefore an important model for human health, particularly for understanding complex traits such as obesity and cardiovascular disease. Funding for the clone-based sequencing project at the Wellcome Sanger Institute ran from January 2006 to December 2009.
Clone mapping and sequencing
A physical map of the swine genome was generated by an international collaboration of four laboratories. Both high-throughput fingerprinting and BAC end sequencing were used to provide the template for an integrated physical map of the whole pig genome.
A bacterial clone physical map of the genome was constructed using restriction enzyme fingerprinting (Marra et al., 1997). Fingerprints were generated by digesting clones with HindIII. Following electrophoresis on agarose gels and data collection using a fluorimager, the raw images were processed with the IMAGE software, which outputs normalised band values and gel traces. The data were then analysed in WebFPC. Clones were assembled into contigs on the basis of shared bands. The fingerprinted clones provide 15.3x coverage of the 2.7 Gb genome.
In total, 267884 fingerprints were assembled into 524 contigs. These data represent the output of a stringent automated assembly, which produced contigs of highly overlapping sets of clones, followed by initial manual editing.
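The idea of joining clones on the basis of shared bands can be illustrated with a minimal sketch. This is not the FPC algorithm, which scores overlaps probabilistically; here a plain shared-band count is used instead. The band values, the tolerance, the `min_shared` cutoff and the function names are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def shared_bands(a, b, tolerance=7):
    """Count bands of a that pair with a band of b within the tolerance."""
    a, b = sorted(a), sorted(b)
    i = j = count = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def build_contigs(fingerprints, min_shared=3, tolerance=7):
    """Group clones into contigs by single linkage on shared-band counts."""
    parent = {name: name for name in fingerprints}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if shared_bands(fingerprints[a], fingerprints[b], tolerance) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in fingerprints:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical normalised band values for four clones.
fingerprints = {
    "cloneA": [100, 250, 400, 620, 900],
    "cloneB": [252, 398, 623, 1100, 1300],
    "cloneC": [625, 1102, 1297, 1500],
    "cloneD": [50, 3000, 4200],
}
contigs = build_contigs(fingerprints)
print(contigs)
```

In this toy data set cloneA and cloneC share too few bands to be joined directly, but both overlap cloneB, so all three end up in one contig while cloneD remains a singleton. Lowering `min_shared` mimics a less stringent assembly.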
Fingerprinted clones breakdown by library
Library | Fingerprinted clones | Average insert size (kb) |
---|---|---|
CHORI-242 | 103758 | 173 |
RPCI-44 | 61281 | 165 |
PigE | 73866 | 150 |
INRA | 28467 | 135 |
KNP | 361 | - |
Unknown | 151 | - |
Total | 267884 | - |
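As a rough check on the clone coverage, the table can be turned into a back-of-the-envelope estimate: total cloned DNA divided by genome size. The sketch below assumes that every fingerprinted clone carries an insert of the library's average size, and it omits the KNP and Unknown clones because no insert size is given for them. The result will not necessarily reproduce the quoted 15.3x exactly, since the inputs behind that figure are not stated here.

```python
GENOME_SIZE_KB = 2.7e6  # 2.7 Gb, as quoted in the text

# library: (fingerprinted clones, average insert size in kb), from the table
libraries = {
    "CHORI-242": (103758, 173),
    "RPCI-44": (61281, 165),
    "PigE": (73866, 150),
    "INRA": (28467, 135),
}

# Total cloned DNA, assuming each clone has its library's average insert size.
total_insert_kb = sum(clones * insert_kb for clones, insert_kb in libraries.values())
coverage = total_insert_kb / GENOME_SIZE_KB

print(f"Total cloned DNA: {total_insert_kb / 1e6:.1f} Gb")
print(f"Estimated clone coverage: {coverage:.1f}x")
```

The estimate comes out at just under 16x, in the same range as the quoted figure.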
The contigs were then merged by relaxing the stringency required for overlap and by interrogating all other available data, such as contig localisation to the human genome via BAC end sequences, and radiation hybrid (RH) and genetic maps. The map, created by Sean Humphray and his team, provided a template for clone tile path selection. Further information about the fingerprint map is available at:
http://genomebiology.com/2007/8/7/R139
End sequencing
Over 600,000 BAC end sequences (BES) were generated from four libraries in three laboratories. The BES are available from the NCBI trace archive.
Using WuBLASTn set for cross-species comparison, the non-repetitive BAC ends were searched against the human reference sequence. This anchoring provided a framework for constructing the porcine map and, together with reduced-stringency fingerprint matches, accelerated contig merging across the whole genome.
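How BES matches can anchor fingerprint contigs to the human genome is sketched below. This is an illustration only, not the project's pipeline: the tuple layout, the contig and clone names, the `min_hits` cutoff and the `anchor_contigs` function are all hypothetical. Each contig is assigned to the human chromosome that receives most of its BES matches, together with the interval those matches span.

```python
from collections import Counter, defaultdict

# Hypothetical BES matches: (contig, clone, human chromosome, position in bp)
bes_hits = [
    ("ctg1", "cloneA", "1", 1_000_000),
    ("ctg1", "cloneB", "1", 1_150_000),
    ("ctg1", "cloneC", "1", 1_300_000),
    ("ctg1", "cloneC", "7", 52_000_000),  # stray match, outvoted by chromosome 1
    ("ctg2", "cloneD", "5", 9_000_000),   # a single match is too weak to anchor
]


def anchor_contigs(hits, min_hits=2):
    """Map each contig to (chromosome, start, end) of its best-supported placement."""
    by_contig = defaultdict(list)
    for contig, _clone, chrom, pos in hits:
        by_contig[contig].append((chrom, pos))

    anchors = {}
    for contig, placements in by_contig.items():
        chrom, n_hits = Counter(c for c, _ in placements).most_common(1)[0]
        if n_hits < min_hits:
            continue  # not enough evidence to place this contig
        positions = [p for c, p in placements if c == chrom]
        anchors[contig] = (chrom, min(positions), max(positions))
    return anchors


anchors = anchor_contigs(bes_hits)
print(anchors)
```

Contigs anchored to neighbouring human intervals are natural candidates for testing a merge with reduced-stringency fingerprint matches.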
The BAC end sequencing was funded by BBSRC/DEFRA/Roslin Institute, INRA/Genoscope and the United States Department of Agriculture.
Data Availability
- NCBI Trace Archive - raw sequence data in the form of traces
- EMBL - access clone sequences via public nucleotide database
- FTP - download clone sequences via FTP
- Blast - search for clone sequences
Collaborators and Funding
Funding
The mapping and BES projects were funded by:
- USDA Cooperative State Research, Education and Extension Service (CSREES), which administered the grant through the National Research Initiative
- National Pork Board
- Iowa Pork Board
- Iowa State University
- North Carolina Pork Council
- North Carolina State University
- Biotechnology and Biological Sciences Research Council (BBSRC)
- Department for Environment, Food and Rural Affairs (DEFRA)
- EU
Swine Genome Sequencing Consortium
The Sanger Institute is a member of the Swine Genome Sequencing Consortium (SGSC), a partnership of institutes involved in sequencing and genomics. The aim of the SGSC was to accelerate, facilitate and coordinate global swine genomic sequencing efforts.
Participating Centres
- Institute for Genomic Biology, University of Illinois, Urbana, IL, USA
- Department of Animal Sciences, University of Illinois, Urbana, IL, USA
- The Wellcome Sanger Institute, Hinxton, UK
- Roslin Institute, Edinburgh, UK
- INRA-CEA, Jouy-en-Josas, France
- INRA-Toulouse, France
- Agricultural Research Service, Clay Center, NE, USA
- The Alliance for Animal Genomics, Bethesda, MD, USA
- University of Nevada, Reno, NV, USA
- Iowa State University, USA
Swine libraries
CHORI-242 BAC library
- Strain: Duroc
- Source: Female white blood cells
- Vector: pTARBAC1.3
- Host strain: E. coli DH10B
- No. of clones: 196457
- Average insert size: 173 kb
- Redundancy: 11.4x
- Originators: P. de Jong, BACPAC Resources
- Distribution: P. de Jong, BACPAC Resources (Library no: CHORI-242)
RPCI-44 BAC Library
- Strain: 37.5% Yorkshire, 37.5% Landrace, 25% Meishan
- Source: Male white blood cells
- Vector: pTARBAC2
- Host strain: E. coli DH10B
- No. of clones: 185389
- Average insert size: 165 kb
- Redundancy: 10.2x
- Originators: P. de Jong, BACPAC Resources
- Distribution: P. de Jong, BACPAC Resources (Library no: RPCI-44)
PigE BAC Library
- Strain: Large White x Meishan F1
- Source: Male blood cells
- Vector: pBeloBAC11
- Host strain: E. coli DH10B
- No. of clones: 97000
- Average insert size: 150 kb
- Redundancy: 4.7x
- Originators: Susan Anderson and Alan Archibald
- Distribution: ARK-Genomics (Library no: PigE BAC)
- References: A large-fragment porcine genomic library resource in a BAC vector. Mammalian Genome 2000;11(9):811-4. PUBMED: 10967147; DOI: 10.1007/s003350010155
INRA Porcine BAC library
- Strain: Homozygous for the SLA H01 haplotype
- Source: Male skin fibroblasts
- Vector: pBeloBAC11
- Host strain: E. coli DH10B
- No. of clones: 107520
- Average insert size: 135 kb
- Redundancy: 5x
- Originators: Laboratoire de Radiobiologie et d'Etude du Génome (LREG)
- Distribution: Francois Piumi, INRA BAC-YAC Resource Center (Library no: Porcine BAC)
KNP BAC library
No details
All software & data resources
- IMAGE
- WebFPC
- NCBI Trace Archive
- Blast (unfinished clone sequences)
- PreEnsembl (preliminary chromosome assemblies)
- FTP (Unfinished clone sequence)
- FTP (Assembled sequence)
Contact and feedback
Please use the email address below for any reports, comments or questions, whether biological or technical in nature. We will either answer ourselves or forward your email to the relevant person.
If you want to report a problem, please make sure you provide as much information as possible.
Ideally we require:
- what you were looking at (URLs, database names, ...)
- names, accession numbers, coordinates, ...
- what exactly you were trying to do
- any error message you got
Please send your enquiries/reports/comments to pig-help@sanger.ac.uk.