Progress on the implementation of Eric Martz's proposal, described at http://molvis.sdsc.edu/fgij/seqspecs.htm
in the prototype at http://biomodel.uah.es/Jmol/sequence_info/.
Specifications for Interactive Sequence Listings
for FirstGlance in Jmol (FG)
Created March 26, 2006. Revised March 30, 2006 thanks to feedback from Frieda Reichsman and Jaime Prilusky. This is a pre-implementation draft proposal subject to change. Please share your ideas, suggestions, and additional juicy examples with the FGiJ Development Team. |
|||||
A. Functions of Sequence Listing
The purpose of the sequence listing in FG is to make it easy to relate sequence to 3D structure within the FG application. Therefore the listing should be kept very simple. Sequence annotations such as secondary structure or sequence motifs can be viewed in other databases, and need not be shown in the FG sequence listing. |
|||||
no problem |
1. The sequence listing will be displayed in the lower left division of FG. | ||||
done |
2. Residues will be listed in one-letter code, with x for non-standard residues.
I've decided to omit water & solvent from the listing, and leave the other hetero groups in the sequence |
||||
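The one-letter rule above can be sketched in a few lines of Python. This is only an illustration, not the prototype's code: the lookup table, the SOLVENT set and the function names are assumptions made for the example.

```python
# Hypothetical lookup table: standard amino acids and nucleotides
# (including the PDB v3 deoxy names) mapped to one-letter codes.
ONE_LETTER = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
    "A": "A", "C": "C", "G": "G", "U": "U", "T": "T",
    "DA": "A", "DC": "C", "DG": "G", "DT": "T",
}

# Assumption: which residue names count as water/solvent and are left out.
SOLVENT = {"HOH", "DOD", "WAT"}


def one_letter(res_name):
    """Return the one-letter code, 'x' for non-standard groups, None for solvent."""
    name = res_name.strip().upper()
    if name in SOLVENT:
        return None
    return ONE_LETTER.get(name, "x")


def listing(res_names):
    """Build the listing string, skipping solvent and keeping other hetero groups."""
    return "".join(c for c in map(one_letter, res_names) if c is not None)
```

For example, listing(["MET", "MSE", "HOH", "DG", "HEM"]) keeps the two non-standard groups as x and drops the water.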
done |
3. Touching the one-letter code for an amino acid will display, in a form slot, its chain name, ATOM sequence number (and insertion code when present) and three-letter code.
OK. The insertion code is shown in lowercase, which has better visibility in my opinion; amino acids use the standard Uppercase-lowercase-lowercase format; nucleotides are converted to a single letter even if deoxy (PDB v3 uses DG etc.). |
||||
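A rough Python sketch of how the text for the form slot might be assembled under the conventions noted above (lowercase insertion code, Uppercase-lowercase-lowercase amino acids, single-letter nucleotides). The exact wording of the slot ("Chain ...: ...") and the names slot_report and display_name are made up for the example; the prototype may format it differently.

```python
STANDARD_AA = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
# PDB v3 deoxynucleotide names reduced to a single letter.
DEOXY = {"DA": "A", "DC": "C", "DG": "G", "DT": "T"}


def display_name(res_name):
    """Amino acids as 'Ser'; deoxynucleotides as one letter; anything else unchanged."""
    name = res_name.strip().upper()
    if name in STANDARD_AA:
        return name.capitalize()
    return DEOXY.get(name, name)


def slot_report(chain, seq_num, ins_code, res_name):
    """Hypothetical slot text: chain, residue name, number plus lowercase insertion code."""
    ins = ins_code.strip().lower()
    return f"Chain {chain}: {display_name(res_name)} {seq_num}{ins}"
```

With these assumptions, residue 27, insertion code A, of chain L would be reported as "Chain L: Ser 27a" if it is a serine.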
1. done 2. how? |
4. Seq->3D: Clicking the one-letter code for a residue will highlight it in the 3D view in Jmol. Optionally, the residue could automatically be brought to the front of the molecule (by rotating the molecule) (don't know how to do this), slid smoothly to the center, and zoomed. | ||||
done |
5. 3D->Seq: Clicking a residue in the 3D view will highlight it in the sequence listing. | ||||
The following are optional and need not be in the initial release. | |||||
done |
6. Entering a sequence fragment will highlight the locations of any matches in the sequence listing and also in the model. May fail if gaps or microheterogeneity are involved. |
||||
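A minimal sketch of the fragment search, assuming the listing is available as a plain string of one-letter codes. The function name find_fragment is hypothetical. As the item warns, a plain substring search like this one knows nothing about gap markers or bracketed alternates, so it can miss matches that span them.

```python
def find_fragment(sequence, fragment):
    """Return (start, end) index pairs of every match, overlapping ones included.

    Matching ignores case, so lowercase residues (those without coordinates)
    are matched as well; that choice is an assumption of this sketch.
    """
    seq = sequence.upper()
    frag = fragment.upper()
    if not frag:
        return []
    hits = []
    start = seq.find(frag)
    while start != -1:
        hits.append((start, start + len(frag)))
        # Step forward by one so that overlapping matches are also found.
        start = seq.find(frag, start + 1)
    return hits
```

Each (start, end) pair can then be used to highlight the corresponding characters in the listing and to select the same residues in the model.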
done |
7. Entering a residue name (e.g. PRO or CYS or A or U) will highlight the locations of that residue in the sequence listing.
The matches are also highlighted in the model. However, the one-letter code must be used (or a different slot implemented). |
||||
--how / when to trigger? |
8. Coloring the sequence listing automatically, according to the color scheme in the 3D view. This would be appropriate for all views:
| ||||
B. Contents & Format of the Sequence Listing | |||||
done |
1. There will be a single sequence list taken from the ATOM records (with residues that lack coordinates taken from SEQRES). The SEQRES residues will not be listed separately, but discrepancies between aligned SEQRES and ATOM records will be indicated in the single list as detailed below.
| ||||
done |
2. Non-standard residues will be listed as x (lowercase). Touching the x's will display their 1-3 letter ATOM record abbreviation codes in a form slot.
| ||||
done |
3. Sequence numbers will be those in the ATOM records. Thus, some listings will start with a negative, zero or 2 or higher sequence number.
| ||||
done |
4. Inserted residues will be listed in-line with other residues, but distinguished by having their one-letter codes displayed as superscript letters.
| ||||
done |
5. When there is a numbering gap (due to numbering according to a reference sequence) but no residues are missing in the 3D structure (SEQRES and ATOM records match), the position of the gap will be indicated by two hyphens surrounding a number indicating the size of the gap, e.g. -1-, -2-, -3-, -23-, and so forth.
The prototype uses ~ in place of the hyphens.
|
||||
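The gap marker can be sketched as follows, assuming the residues arrive as (sequence number, one-letter code) pairs in chain order and that the caller has already established that no residues are physically missing. The function name with_gap_markers is hypothetical, and taking the gap size as the count of skipped numbers is an assumption of this sketch.

```python
def with_gap_markers(residues, mark="~"):
    """Insert a marker such as ~3~ wherever the numbering jumps.

    residues: list of (seq_num, one_letter) tuples in chain order.
    Inserted residues (same number, different insertion code) produce a
    difference of 0 and therefore no marker.
    """
    out = []
    prev = None
    for num, code in residues:
        if prev is not None and num - prev > 1:
            out.append(f"{mark}{num - prev - 1}{mark}")
        out.append(code)
        prev = num
    return "".join(out)
```

Passing mark="-" gives the hyphen form described in the original item.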
mostly done |
6. When there is a physical gap in the 3D model (residues in SEQRES that are absent in ATOM records, typically due to crystallographic disorder), the residues with no coordinates will be listed in lower case.
Touching such a residue will report an interpolated sequence number. Clicking on such a residue will produce a message* explaining that the residue lacks coordinates.
Interpolation still needs to be implemented (?); right now, the number is simply incremented by 1. (*) The message is shown as a tooltip (onMouseOver) and an alert box (onClick).
|
||||
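A sketch of the interim numbering described in the note (increment by 1 from the last residue that has coordinates), not of true interpolation. It assumes the merged SEQRES/ATOM list marks residues without coordinates with a sequence number of None; the function name number_missing and the tuple layout are hypothetical.

```python
def number_missing(residues):
    """Assign provisional numbers to residues that lack coordinates.

    residues: list of (seq_num_or_None, one_letter) in chain order.
    Returns (seq_num, code, has_coords) tuples; residues without coordinates
    get a lowercase code and the previous number plus 1. Missing residues
    before the first numbered residue keep None in this sketch.
    """
    out = []
    last = None
    for num, code in residues:
        if num is None:
            last = None if last is None else last + 1
            out.append((last, code.lower(), False))
        else:
            last = num
            out.append((num, code, True))
    return out
```

The has_coords flag is what a click handler would check before showing the "lacks coordinates" message instead of highlighting the residue in the 3D view.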
done |
7. When there is sequence microheterogeneity (residues in ATOM records that are absent in SEQRES), the alternate residues at the same sequence position will be enclosed in square brackets.
| ||||
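The bracketing of alternate residues could look roughly like this, assuming residues are given as (sequence number, insertion code, one-letter code) tuples in chain order. Alternates are taken to be consecutive entries that share both number and insertion code. The function name and the bracket contents with no separator are assumptions.

```python
from itertools import groupby


def bracket_alternates(residues):
    """Enclose alternate residues at the same position in square brackets.

    residues: list of (seq_num, ins_code, one_letter) tuples in chain order.
    """
    parts = []
    # Consecutive entries with identical (number, insertion code) form one position.
    for _, group in groupby(residues, key=lambda r: (r[0], r[1])):
        codes = [r[2] for r in group]
        parts.append(codes[0] if len(codes) == 1 else "[" + "".join(codes) + "]")
    return "".join(parts)
```

Inserted residues are not bracketed, because their insertion codes differ even though the sequence number is the same.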
The following are optional and need not be in the initial release. | |||||
--how to check? |
8. A checkbox to highlight residues with missing atoms. For example, some crystallographic results have the alpha and beta carbons of certain amino acids, but lack the remainder of the sidechains.
| ||||
done (w/ link) --sure a checkbox? |
9. A checkbox to highlight residues with alternate sidechain conformations (rotamers; multiple sets of coordinates for sidechain atoms). Alternate sidechain conformations are quite common in the PDB.
| ||||
C. Listing Examples
Experience with Protein Explorer has shown that it is quite easy to find any sequence number by moving the mouse over the listing, and watching the number reports in the form slot. Here is the slot as it appears in Protein Explorer when residue 27, insertion code A, is touched in the sequence listing for the example below, 1QKZ chain L:
This reporting slot will be immediately above the sequence listing. Thus, it is not important that the sequence number of a residue be apparent from inspection of the sequence listing table itself. Indeed, maintaining a correspondence between column and row and sequence number is not always feasible because of the anomalous sequence numberings used by some authors (see examples listed above).
In the listing below, residues are divided into three groups of ten per line. However, the listing may start at a number <1 or >1, so line 1 need not be numbered 1-30. In the example below, it includes residues numbered 1-27C (1-27 plus 27A, 27B, 27C).
The green color reflects the color assigned to chain L by Jmol. Other color schemes could also be applied (see below).
Protein Explorer Seq3D Snapshots Illustrating Uses of Color |