WEB P.A.G.E

World-Wide Collaboration in 2D Gel Electrophoresis: On-line Databases


Introduction

Electrophoresis, particularly 2D Gel electrophoresis, remains the most accessible and powerful analytical tool for characterisation of complex protein mixtures such as those arising from whole cell extracts. Only mass spectrometry currently offers superior resolution and molecular mass accuracy, but this method requires far less complex mixtures than whole cell extracts for successful measurements to be made. Although all cells may have the genetic information for making any protein, those of multi-cellular organisms are differentiated, expressing subsets of proteins and giving rise to the term PROTEOME, which reflects the protein rather than the genetic content of a cell.

Always an important analytical biochemistry tool, 1D and 2D gel electrophoresis have increased enormously in power and significance as a result of world-wide genome sequencing activities. Complete genome sequences provide a definitive picture of the potential protein products of an organism. However, not all genes are expressed in any given cell type, and the initial expression products are subject to a bewildering variety of posttranscriptional and posttranslational processing steps (alternative splicing, proteolysis, disulphide linkage and prosthetic group attachment, to name but a few). Gel electrophoresis, and 2D PAGE in particular, is widely used to determine which proteins are expressed and to detect and attempt to identify any post-expression changes.

With the development of matrix-assisted laser desorption/ionisation (MALDI) and electrospray ionisation (ESI) mass spectrometry, even the tiny amounts of protein present in spots on 2D gels can be identified. The protein is treated with proteases in situ or after Western blotting. The resulting peptide mixture is mass analysed and the measured masses are compared with those expected theoretically from entries in on-line sequence databases (e.g. SWISS-PROT and PIR). The most exciting development in this respect is nanospray ionisation mass spectrometry. So little sample is consumed in this technique that many different analyses can be carried out on material eluted from a single gel spot. If an extracted protein's amino acid sequence is represented (even in part) in world sequence databases, then it can now be routinely detected and identified. Moreover, this technique can detect and frequently identify posttranslational modifications in whole proteins as well as in peptide fragments.
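The mass comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any of the tools listed on this page: it assumes a tryptic digest (cleavage after K or R, except before P), unmodified peptides, monoisotopic masses and a simple count of matching masses as the score. The function names and the 0.5 Da tolerance are hypothetical choices.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.5):
    """Count observed masses lying within `tolerance` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical) for m in observed)


def rank_candidates(observed, database, tolerance=0.5):
    """Rank database entries (name -> sequence) by number of matching masses."""
    scores = {name: count_matches(observed, seq, tolerance)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling rank_candidates with a list of measured masses and a dictionary of candidate sequences returns the candidates ordered by number of matches. Real search tools also allow for missed cleavages, chemical modifications and instrument calibration, none of which is handled here.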

Many 2D gel databases are federated as described below and include graphical links to sequence and literature information direct from on-line (Web) 2D-gel image representations. Such databases are proliferating on the Internet as crucial resources both for understanding cellular function and the identification of disease lesions at a molecular level. This short feature describes some of the resources available to the biochemist to exploit this emerging capability.

Techniques

To take advantage of the growing number of tissue and cell-specific databases available on-line, samples need to be of the highest possible quality. The consequences of unplanned proteolytic cleavage are self-evident. 2D gels have been notoriously difficult to standardise in the past, but new media and immobilised ampholytes for the iso-electric focusing step have improved this. For critical work where both identification and quantitation of resolved 2D spots are important, good image capture systems and gel analysis software are also essential. Kendrick Labs and Proteome Inc offer to solve any potential problems by running the gels for you. Kendrick offer a standard 2D gel running service and Proteome are involved in a number of 2D gel project services. However, most sites will be carrying out their own work, and there is much support on-line for the preparation of samples and 2D gels as well as for image capture and gel image analysis (this site is just one example). Some literature references are given here, but these are general and certainly not comprehensive; approaches which have been developed for a specific tissue, cell or type of analysis are likely to give better results.

Electrophoresis & Analysis Links

Tricine PAGE: High resolution separations of small proteins
Phoretix: Phoretix 2D - High performance analytical and database software designed specially for 2D gel applications
SWISS-PROT Technical info. (2D gels): Detailed protocols for 2D gels from the University of Geneva
Guide to 2D gels: Using the ISO-DALT apparatus designed by Large Scale Biology Corp. From Hoefer Pharmacia Biotech. A spiral bound copy is available
Comprehensive biomedical suppliers: List of suppliers (reagents, apparatus etc)
The World of Electrophoresis: Maintained page of links to resources on the Internet
GuessProt: Identify a protein in the SWISS-PROT database from measured gel Mr & pI
Yeast 2D gels: Theoretical 2D gel plots (Mr, pI) for yeast proteins from the Proteome company

Mass spectrometry links

Mass Spec on the Net: Matthias Mann (Protein and Peptide Group, EMBL) has compiled an excellent collection of Internet-based MS resources and links relating to protein microanalysis and 2D gel mass fingerprinting
MassSearch: Search the SWISS-PROT or EMBL protein databases using peptide masses from specific enzyme digestion of proteins separated by PAGE. You can unambiguously identify proteins from 5-7 measured peptide masses alone.
Prot-ID: Protein identity from enzymatic fragments (an analogous on-line facility at the Rockefeller University)
MS-Fit: Another peptide mass fingerprinting tool, at the University of California, San Francisco.
NYC-MASS: Protein mass spectrometry database for searching peptide masses (Rockefeller University)

A Federated system of 2D Electrophoresis Databases

The continual addition of biological data and information to on-line databases world-wide makes life difficult for providers and consumers alike. The providers of given types of data cannot offer a comprehensive local database without constant replication between all the sites, but unless this happens, consumers have to check all databases independently to be sure of not missing new information. However, the construction of a unified global database, free of redundancies and edited for conflicting information, makes it difficult to audit the original data (differences in data on the same samples from different sites may themselves carry crucial information). Database federation provides a partial solution to this dilemma by allowing the development of locally held databases, but linking them in such a way as to give the operational effect of a single resource.
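The idea of locally held databases behaving as a single resource can be sketched as follows. This is a toy illustration only, assuming each site exposes its entries as a list of records; the site names, the record fields, the second accession and all pI and Mr values are made up for the example. Hits are grouped by source rather than merged, so that differences between sites remain visible.

```python
# Hypothetical local databases. Site names, the accession X00001 and all
# pI / Mr values are invented for illustration only.
LOCAL_DATABASES = {
    "site_a": [
        {"accession": "P01009", "description": "Alpha-1 antitrypsin",
         "pi": 5.0, "mr": 54000},
        {"accession": "X00001", "description": "Hypothetical protein",
         "pi": 6.2, "mr": 30000},
    ],
    "site_b": [
        {"accession": "P01009", "description": "Alpha-1 antitrypsin",
         "pi": 5.1, "mr": 55000},
    ],
    "site_c": [],
}


def federated_query(databases, keyword):
    """Run a keyword search against every local database.

    Returns a dict mapping each source name to its own hits, so that
    conflicting values from different sites are kept, not reconciled.
    """
    needle = keyword.lower()
    results = {}
    for source, entries in databases.items():
        hits = [e for e in entries if needle in e["description"].lower()]
        if hits:
            results[source] = hits
    return results
```

A call such as federated_query(LOCAL_DATABASES, "antitrypsin") returns the matching records from site_a and site_b separately, each with its own pI and Mr values.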

Guidelines have been produced for the federation of 2D electrophoresis (2-DE) databases (Appel et al., Electrophoresis 17, 540-546, 1996):

  1. Individual entries in the database must be remotely accessible by keyword search. Other query methods, such as full text search, are possible but not mandatory.
  2. The database must be linked to other databases through active hypertext cross-references; that is, through a simple mouse click on a cross-reference, the user is automatically connected to the corresponding WWW site, and the cross-referenced document is then retrieved and displayed. This simple mechanism links together all related databases and combines them into one large virtual database. Database entries must have such a cross-reference to at least the main index (see Rule 3).
  3. In addition to individually searchable databases, a main index has to be supplied that provides a means of querying all databases through one unique entry point. Bidirectional cross-references must exist between the main index and the other databases. Currently, the main index is the SWISS-PROT database.
  4. Individual protein entries must be accessible through clickable images. That is, 2-DE images must be provided on the WWW server and, as a response to a mouse click on any identified spot on the image, the user must obtain the database entry for the corresponding protein. This method allows a user to easily identify proteins on a 2-DE image.
  5. 2-DE analysis software that has been designed for use with federated databases must be able to directly access individual entries in any federated 2-DE database. For example, when displaying a 2-DE reference map with a 2-DE computer program, the user must be able to select a spot and remotely obtain the corresponding entry from the given database. (BUT SEE NOTES BELOW)

2-DE computer analysis software may comply with Rule 5 by remote-controlling a WWW browser and requesting the following document for any given protein:


     http://host/cgi-bin/get-2d-entry/database?ID

where host is the name of the server on which the remote database is located, database is the selected database on that server, and ID is the entry's unique identification. For example, in order to retrieve the Alpha-1 antitrypsin entry from the SWISS-2DPAGE database on the ExPASy server, the following document has to be requested by the WWW browser:


     http://expasy.hcuge.ch/cgi-bin/get-2d-entry/SWISS-2DPAGE?P01009

The following URL gets the Alpha-1 antitrypsin entry in the SWISS-PROT protein sequence database on the ExPASy server:


     http://expasy.hcuge.ch/cgi-bin/get-2d-entry/SWISS-PROT?P01009
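The URL convention above is simple enough to build programmatically. The following Python sketch shows one way an analysis program might do so; the function name entry_url is hypothetical, and the percent-encoding step is a precaution added here rather than part of the published guideline.

```python
from urllib.parse import quote


def entry_url(host, database, entry_id):
    """Build a get-2d-entry URL of the form
    http://host/cgi-bin/get-2d-entry/database?ID
    """
    return "http://{}/cgi-bin/get-2d-entry/{}?{}".format(
        host,
        quote(database, safe=""),   # encode any unsafe characters (assumption)
        quote(entry_id, safe=""),
    )


# The two examples given in the text:
swiss_2dpage = entry_url("expasy.hcuge.ch", "SWISS-2DPAGE", "P01009")
swiss_prot = entry_url("expasy.hcuge.ch", "SWISS-PROT", "P01009")

# To hand the URL to the user's browser, the standard library call
# webbrowser.open(swiss_2dpage) could be used (not executed here).
```

Whether a given server still answers at such an address is a separate matter; the sketch only reproduces the naming scheme described above.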

Quite a few of the on-line 2D databases described further below and linked at this site are federated or partly federated; HSC-2DPAGE (Heart Science Centre, Harefield Hospital) is one example. An updated on-line list of all these also appears on the Swiss ExPASy server at WORLD-2DPAGE. Form-based searching (2D Hunt) for 2D databases on the Internet is available, and new sites can be included.


NOTES on Guideline 5 (Appel)
1. Unless everyone standardises on the Swiss method of running gels it will be difficult and very time consuming to incorporate one of the Swiss gels and its associated links into other analysis experiments.
2. It would also be difficult to create suitable links from a reference gel within an experiment, or one that has been constructed for use across several experiments, and ensure that the links for each spot correspond with the links from the gel(s) on the web databases.
3. It is actually very easy to print images and graphics of any experimental gels and use these to help identify spots on Web gels and get spot information accordingly - using virtually any modern internet browsing software.
4. Phoretix therefore contends that, whilst Guideline 5 may well be quite helpful for particular projects where there is confidence in the conformity and reproducibility of gel running, it is not helpful for the majority of scientists running 2D gels.

Round-up of 2D-E web sites & databases

Major 2D-E Databases

SWISS-2DPAGE at Geneva University Hospital, Switzerland contains data on proteins from reference maps. It is possible to locate the proteins on graphical representations of 2D gels or to display areas on the 2D maps where a protein selected from SWISS-PROT would be expected to run (the software works out Mr and pI). Access to database entries can be made by description line, ID number, accession number or author, or by clicking on a spot on one of the maps. Maps include human liver, human plasma and red blood cells.
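How a program might work out Mr and pI from a sequence can be sketched as below. This is not the SWISS-2DPAGE implementation: it is a minimal estimate assuming an unmodified protein, average residue masses, and one illustrative set of side-chain and terminal pKa values (different tools use different sets, so computed pI values vary between programs). All function and constant names are hypothetical.

```python
# Average residue masses in Da (standard values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVERAGE = 18.01528

# Illustrative pKa values (an assumption; published sets differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def molecular_mass(sequence):
    """Average molecular mass (Mr) of an unmodified polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_AVERAGE


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch relation."""
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    negative = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in sequence:
        if aa in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, precision=0.001):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > precision:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0
```

Because posttranslational modifications shift both mass and charge, a spot on a real gel may run some distance from the position predicted in this way.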

Argonne Protein Mapping Group is at the Center for Mechanistic Biology and Biotechnology at the Argonne National Laboratory, USA. The Protein Mapping Group is using 2-DE to study normal protein expression and changes in protein expression induced by pathological events or exposure to environmental stimuli. This federated WWW database of 2-DE patterns currently includes protein maps for mouse liver, human breast cells and the hyperthermophilic organism Pyrococcus furiosus. The 2-DE maps include hyperlinks to textual descriptions of individual protein spots.

Danish Centre for Human Genome Research 2D PAGE Databases (Aarhus). The Danish Centre for Human Genome Research's 2D PAGE Databases at the University of Aarhus contain data on proteins identified on various reference maps. Protein names and information on specific protein spots are obtained by clicking on the image in which you are interested. You can also search by protein name, keywords, organelle or cellular component. The databases are being constructed for the study of global cell regulation, skin biology and bladder cancer. Procedures are illustrated with still images and videos. There are links to Human 2-D PAGE Databases, Mouse 2-D PAGE Database, and a 2-D Gel Image Gallery.

HEART-2DPAGE at the German Heart Institute, Berlin is a federated 2-DE database of myocardial proteins of the human ventricle and atrium, identified on 2-D PAGE maps in collaboration with the Wittmann Institute of Technology and Analysis of Biomolecules (see HEART-2DPAGE publication: Electrophoresis 1994, 15, 685-707). Silver-stained images of the gels are on-line and proteins can be selected from the separate atrial or ventricular databases by means of clickable gel images. Database federation means links to the Harefield Human Heart database (below) and to the protein sequences at SWISS-2DPAGE. Technical information on gel preparation and running is also available, as are details of changes in protein expression in dilated cardiomyopathy (DCM) obtained by computer-assisted analysis of the gels.

HSC-2DPAGE at the Heart Science Centre, Harefield Hospital, UK also provides federated 2-DE databases along similar lines. This is now a major world site for 2D gel electrophoresis and is also home to the British Society for Electrophoresis' Web pages.

NIMH-NCI Protein Disease Database at the National Institute of Mental Health, USA is part of the NIMH-NCI Protein Disease Database (PDD) project for correlating diseases with proteins observable in serum, CSF, urine and other common human body fluids, with respect to disease conditions discussed in the literature. Data is being collected and entered into the PDD relational database by LBG/NIMH and others. It may be searched in a variety of ways, including looking for normalised concentration changes of proteins (for disease states with respect to normal states) as well as other more complex queries. The PDD may be accessed to: 1) find proteins by clicking on spots in a reference 2D gel map image, or 2) query the relational database using forms to specify the search query. Query results may then be shown as tables, graphs, 2D gel map images, or hypertext references to other WWW databases.

NCI/FCRDC LMMB Image Processing is the Image Processing Section of the Laboratory of Mathematical Biology, NCI/FCRDC, Frederick, MD, 21702, USA. This web site offers Protein-Disease Database (PDD) - a federated 2-DE database correlating proteins with disease and GELLAB-II software for 2D gel analysis as well as Xconf image conferencing over the Internet.

The Proteome Company provides library access to YPD™, the Yeast Proteome Handbook, which gives the researcher convenient access to the complete proteome of Saccharomyces cerevisiae. Scientists need access to YPD™ in their academic and corporate libraries.

MitoDat is the Mitochondrion database. This partially federated database specialises in those nuclear genes specifying the enzymes, structural proteins and other proteins, many still not identified, involved in mitochondrial biogenesis and function. MitoDat highlights predominantly human nuclear-encoded mitochondrial proteins, but proteins from other animals are included, in addition to those currently known only from yeast and other fungal mitochondria, as well as from plant mitochondria. The database consolidates information from various biological databases, e.g. GenBank, SWISS-PROT, Genome Data Base (GDB) and Online Mendelian Inheritance in Man (OMIM). The mitochondrion is implicated in many human diseases and so the database should be a valuable tool. MitoDat can be searched for key phrases using Boolean logic operators (AND, OR etc). Searches can be restricted to organelle compartments (outer membrane, intermembrane space, inner membrane and matrix). The database can be downloaded to a local hard disc.

Large Scale Biology Home Page at the Large Scale Biology (LSB) company has 2D gel maps on-line for: standard rat, mouse and human liver (Coomassie Blue-stained 2-D patterns), human plasma, experimental and computer reconstructed (from the gels shown), corn and wheat proteins, and non-equilibrium pH gradient electrophoresis (NEPHGE) 2-D of rabbit psoas muscle.

UCSF 2DPAGE (A375 cell line) at the University of California, San Francisco presents information on proteins identified following isolation by 2D PAGE. The proteins have been rapidly identified through extensive use of mass spectrometry-based peptide sequencing. Collaboration with Dr. Lois B. Epstein's Tumor Immunology and Interferon Research Laboratory in the Cancer Research Institute and the Department of Pediatrics at UCSF has led to identification of proteins present in malignant human cell lines. Of particular interest are proteins which are either induced or suppressed by Interferon-gamma and/or Tumor Necrosis Factor, as the identified proteins may mediate the anti-tumor or immunomodulatory functions of the two cytokines. The A375 Melanoma Database contains details about the A375 cell line, a table of melanoma proteins identified (with links to SWISS-PROT and GenBank entries) and 2D gel images of IFN-gamma/TNF treated cells and controls.

Other sites

Cambridge 2D PAGE offers a variety of links to other sites as well as its own offering of a developing rat neuronal protein database.

ECO2DBASE is an ftp server for a partially federated database relating E. coli genes to expression products.

Embryonal Stem Cells database at the University of Edinburgh immunobiology department is aimed at obtaining molecular insights into cell differentiation, commitment and stem cell self-renewal. One approach is to study the molecular control of hematopoiesis via a high resolution map of protein expression over different time points of differentiation and inducing agent treatment (e.g. retinoids) using protein 2D PAGE maps.

Human Colon Carcinoma Protein Database is at the joint Protein Structure Laboratory in the Ludwig Institute at the University of Melbourne. Their server carries updated synthetic images of Human colon carcinoma cell line 2D-Gel protein distribution patterns. Usefully, the site also includes current methods used by the lab for protein extraction from polyacrylamide gel matrices and their structural characterisation techniques.

Special Interest Groups and Societies

These provide another way to get support or contribute to the pool of knowledge on electrophoresis. Major societies are represented by the American Electrophoresis Society (The Electrophoresis Society, PO Box 1897, Lawrence, KS 66044-8897, USA, E-mail Richard Walker) and the British Electrophoresis Society. These exist to further the science of electrophoresis, offering meetings, newsletters, contact points, and database information.

Bibliography

1. Protein Analysis using high resolution two-dimensional polyacrylamide gel electrophoresis. Dunbar, B.S., Kimura, H. and Timmons, T.M. (1990) in Methods in Enzymology (Deutscher, M.P., ed) 182, 441-459 Academic Press Inc NY. ISBN 0-12-213585-7

2. A Practical Guide to Protein and Peptide Purification for Microsequencing. Matsudaira, P.T., ed., (1989) Academic Press, NY ISBN 0-12-480280-X

3. Very useful specific techniques and applications are described in the series of Protein Society papers "Techniques in Protein Chemistry" published by Academic Press. These cover protein and peptide purification, mass spectral analysis, posttranslational processing, protein folding and NMR, analysis of protein interactions and protein design and engineering.

Michael Geisow, BIODIGM 1996

 


© Phoretix International 1997