PrankWeb is a web-based application for predicting and visualizing protein-ligand binding sites. It also makes it possible to compare the location of predicted pockets with highly conserved areas and with actual ligand binding sites. All one needs to use PrankWeb is a device with a web browser that supports WebGL.
There are three options to obtain a structure:
Besides selecting which protein to analyze, one can also specify whether evolutionary conservation should be included in the prediction model by checking the "Use conservation" checkbox. Note that calculating conservation scores can increase the analysis time.
The new conservation pipeline operates as follows. First, polypeptide chain sequences are extracted from the input file using P2Rank. The phmmer tool from the HMMER software package is then used to identify and align similar sequences for each query sequence; UniRef50 Release 2021_03 is used as the single target sequence database. Up to 1,000 sequences are then randomly selected from each multiple sequence alignment (MSA) to form the corresponding sample MSA; weights are assigned to the individual sequences of each sample MSA using the Gerstein/Sonnhammer/Chothia algorithm implemented in the esl-weight miniapp included with the HMMER software. Finally, per-column information content (i.e., the conservation score) and gap character frequency values are calculated using the esl-alistat miniapp, taking the individual sequence weights into account; positions containing the gap character in more than 50% of sequences are masked so that they appear to have no conservation at all.
The range of the conservation score corresponds to the range of the per-residue information content, which lies between 0 and log2(20) ≈ 4.32, with higher values corresponding to higher conservation.
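To make the score range and the gap masking more concrete, the following is a minimal sketch of a per-column information content calculation. It is not the esl-alistat implementation: the function name `column_conservation`, the use of Shannon entropy over the 20 standard amino acids, and the decision to measure the gap fraction with sequence weights are all assumptions made for illustration.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_INFORMATION = math.log2(len(AMINO_ACIDS))  # log2(20), about 4.32 bits


def column_conservation(column, weights, max_gap_fraction=0.5):
    """Return the information content (in bits) of one alignment column.

    column  -- one character per sequence ('-' marks a gap)
    weights -- one non-negative weight per sequence
    """
    total_weight = sum(weights)
    gap_weight = sum(w for c, w in zip(column, weights) if c == "-")
    # Mask columns that are mostly gaps (assumption: weighted gap fraction).
    if total_weight == 0 or gap_weight / total_weight > max_gap_fraction:
        return 0.0

    # Accumulate the weighted count of each standard amino acid.
    counts = defaultdict(float)
    for c, w in zip(column, weights):
        if c in AMINO_ACIDS:
            counts[c] += w
    residue_weight = sum(counts.values())
    if residue_weight == 0:
        return 0.0

    # Shannon entropy of the weighted residue distribution.
    entropy = -sum(
        (n / residue_weight) * math.log2(n / residue_weight)
        for n in counts.values()
        if n > 0
    )
    return MAX_INFORMATION - entropy
```

Under these assumptions, a column in which every sequence carries the same residue scores log2(20), a column in which all 20 amino acids are equally frequent scores 0, and a column with more than half of its weight in gaps is masked to 0.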
Once the protein visualization is loaded, three main panels appear: sequence visualization, structural visualization and the pocket panel.
The largest panel contains the three-dimensional visualization of the protein.
The molecule can be rotated by moving the mouse while holding the left mouse button. On a touch device, just slide your finger. To zoom in or out, move the mouse while holding the right mouse button, or use the pinch gesture on a touch display. To move the protein, do the same while holding the wheel button. Lastly, to slab the protein, scroll the mouse wheel or use the three-finger gesture.
Using the buttons in the top-right corner, one can:
By toggling the advanced control panel, one has full control over the content of the visualization using LiteMol. For more help with LiteMol, please visit its wiki page.
The panel above the 3D visualization of the protein displays the protein sequence.
As one hovers over the sequence with the mouse, the corresponding residues are highlighted in the 3D visualization. This feature makes it possible to analyze the protein from both the structural and the sequence point of view. By default, the sequence view is zoomed out so that the whole protein is displayed. You can use the trackbar control to zoom in, or select an area with the mouse and zoom to the selection.
The right panel contains several control buttons and a list of predicted pockets. Use the control buttons to download the PrankWeb report or to switch between the structural views. In the pocket list, the pocket name, rank, probability score, size, and average conservation score (if available) are displayed for each pocket. Each pocket also has several buttons that control its visibility in the visualizations.
If the visualized structure is a predicted one, this panel also contains a warning and a toggle button. Using the button, the user can switch between visualizing the whole structure and visualizing only the confidently predicted regions (confidence score > 70).
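The confidence filter can be pictured as a simple threshold on per-residue scores. The sketch below is illustrative only: the function name `confident_residues`, the residue labels, and the idea that one confidence value is available per residue are assumptions, not a description of PrankWeb's internals.

```python
def confident_residues(residue_scores, threshold=70.0):
    """Keep residues whose confidence score is strictly above the threshold.

    residue_scores -- iterable of (residue_label, confidence_score) pairs
    """
    return [label for label, score in residue_scores if score > threshold]


# Hypothetical per-residue confidence scores, made up for illustration.
scores = [("A1", 95.2), ("A2", 70.0), ("A3", 42.0)]
print(confident_residues(scores))
```

Note that the comparison is strict, so a residue with a score of exactly 70 is not treated as confidently predicted in this sketch.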
The transformer that calculates probability scores from raw scores is calibrated separately for each model (Default, Default+Conservation, Alphafold, Alphafold+Conservation) on the calibration dataset (HOLO4K). The probability score P(x) for a given raw score x is calculated as P(x) = Tx / (Tx + Fx), where Tx is the number of true pockets with a raw score ≤x and Fx is the number of false pockets with a raw score ≥x (both refer to the predictions of the particular model on the calibration dataset).
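The formula can be followed step by step with a small sketch. This is a direct transcription of the definition above, not the P2Rank source code; the function name `probability_score`, the fallback value for an empty denominator, and the example raw scores are assumptions made for illustration.

```python
def probability_score(x, true_scores, false_scores):
    """Compute P(x) = Tx / (Tx + Fx) from calibration raw scores.

    true_scores  -- raw scores of true pockets in the calibration set
    false_scores -- raw scores of false pockets in the calibration set
    """
    t_x = sum(1 for s in true_scores if s <= x)   # true pockets with score <= x
    f_x = sum(1 for s in false_scores if s >= x)  # false pockets with score >= x
    if t_x + f_x == 0:
        return 0.0  # assumption: the source does not define this case
    return t_x / (t_x + f_x)


# Made-up calibration scores, not taken from HOLO4K.
true_scores = [2.0, 5.0, 8.0]
false_scores = [1.0, 3.0]
print(probability_score(2.0, true_scores, false_scores))  # Tx = 1, Fx = 1
```

With these made-up scores, a raw score of 2.0 gives Tx = 1 and Fx = 1, so P(2.0) = 0.5, while a raw score of 5.0 gives Tx = 2 and Fx = 0, so P(5.0) = 1.0.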
Is something not working, or are you missing a certain feature? Please let us know by creating a GitHub issue. Alternatively, if you prefer email, feel free to reach us at david.hoksza (at) matfyz.cuni.cz.
PrankWeb, Charles University 2017-