ENORA and multi-state structure calculations
In this tutorial we will provide you with guided examples for calculating eNOEs and multi-state structure calculations.
To this end we will first run the modules of eNORA and then use the obtained eNOEs to calculate a single state and a two-states structure model using automated sorting to group the states. Along the way you will see some additional CYANA skills useful for other purposes as well.
The eNORA module offers in principle two methods to calculate spin diffusion, FRM and TSS. FRM is the recommended way to do these calculations and we will set a main focus on that method, we will however in one section explain the principles at play for TSS and how to set it up.
CYANA setup
Obtaining and installing the CYANA demo version and data
Please follow the following steps carefully (exact Linux commands are given below; you may copy them to a terminal):
- Go to your home directory (or data directory).
- Get the demo data from the server.
- Unpack the demo data for the practical.
- Get the demo version of CYANA.
- Unpack CYANA.
- Setup the CYANA environment variables.
- Change into the newly created directory 'eNORA'.
- Copy the demo_data directory to 'enoe'.
- Change into the subdirectory 'enoe'.
- Test whether CYANA can be started by typing its name, 'cyana'.
- Exit from CYANA by typing 'q' or 'quit'.
To unpack the demo data:
tar zxf demo_data.tar.gz
To unpack CYANA demo version:
tar zxf Cyana-3.98.9_Demo.tgz cd cyana-3.98.9/ ./setup
Change into enoe1pt demo directory:
cd enoe1pt
Try to run CYANA by entering 'cyana' at the command prompt of your terminal (q to quit cyana):
cyana ___________________________________________________________________ CYANA 3.98 (mac-intel) Copyright (c) 2002-17 Peter Guentert. All rights reserved. ___________________________________________________________________ Demo license valid for specific sequences until 2018-12-31 cyana> q
If all worked, you are ready to go in terms of running the CYANA routine!
Execution scripts or "macros" in CYANA
For more complex task within CYANA, rather than to enter the execution commands line by line at the CYANA prompt, the necessary commands are collected in a file named '*.cya'. Collecting the commands in macros has the added advantage, that the macros serve as a record allowing to reconstruct previous calculations.
Hint: For comprehensive information on the CYANA commands etc. consult the CYANA 3.0 Reference Manual.
Preparing input data
Structure input for spin-diffusion calculations
Preparing an xray structure to use within CYANA
Deposited structures many times lack specific features, i.e. Xray structures often lack proton coordinates or contain sequence mutations and ligands. Using the regularize command one can get a structure recalculated within CYANA that has these issues fixed but is still very close to the input structure.
In the data directory you find the 'regulabb' directory and the 'CALC_reg.cya' macro and an 'init.cya' macro.
The initialization macro has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program by typing 'init'. It contains normally at least two commands, one to read the library and one to read the sequence. However, for now there is only one command, the one to read the library.
cyanalib
The command 'cyanalib' reads the standard CYANA library.
After reading the library file, one normally reads a sequence file before reading pdb file or a peak list.
Inspection of the 'CALC_reg.cya':
read 1PIN.pdb unknown=warn hetatm new write 1PIN.seq write 1PIN_2.pdb
Where the option 'hetatm' allows for reading of coordinate labeled HETATM, rather than ATOM in the pdb. The parameter 'new' directs CYANA to read the sequence from the pdb.
We require two mutations to the sequence (S18N and W34F), furthermore we do not have a ligand and the expressed protein for NMR is truncated. Therefore we use cyana to truncate the Xray structure and build in the mutations. We read the structure again, but this time provide a sequence, containing the two mutations and use the options unknown=warn or unknown=skip to skip the parts of the Xray structure not specified in the sequence file and later reconstruct those during regularization:
read demo.seq read 1PIN_2.pdb rigid unknown=warn write 1PIN_ed.pdb ./init read pdb 1PIN_ed.pdb regularize steps=20000 link=LL keep
Download the Xray structure used for this exercise:
wget 'https://files.rcsb.org/view/1PIN.pdb'
or for mac os x:
curl https://files.rcsb.org/view/1PIN.pdb -o 1PIN.pdb
Inspect the pdb using chimera: There are several issues with the Xray structure, besides HETATM, that we are handling within CYANA before using the structure.
Execute the 'CALC_reg.cya' macro:
cyana CALC_reg.cya
The resulting structure is 'regula.pdb'.
Calculating a structure from an experimental peak list
In the data directory you find the 'noecalib' directory.
The 'CALC_noecalib.cya' contains the following commands:
peaks:=5.peaks calibration peaks=$peaks peaks calibrate simple write upl noesimple.upl
The following block of commands, reads the restraints 'noesimple.upl', an angle file called 'demo.aco', calculates a structure:
read upl noesimple.upl read aco talos.aco calc_all 100 steps=50000 overview demo.ovw structures=20 pdb
The statistics are in demo.ovw file and the 20 conformers with the lowest target function in the 'demo.pdb'.
Estimating spin-diffusion
Before recording NOESY spectra, it makes sense to estimate the ideal mixing time (or for a NOESY series, the mixing times), where the buildup is still in the linear regime and spin diffusion not prohibitively strong. CYANA has features built in that accomplish this task conveniently.
In the directory 'spinDiff' you find the macro 'CALC_spinDiff.cya':
echo:= on mixingtimes = '0.02,0.03,0.04,0.05,0.06' b0field = 700 mode = 3 # ---------------------------------- structure input ---------------------------------- # specify the conformers for calculations read pdb demo rigid structure select 1 # ---------------------------------- spin-diffusion ---------------------------------- enoe spindiff b0field=b0field time=$mixingtimes mode=mode enoe twospin b0field=b0field time=$mixingtimes mode=mode enoe spdcorr opt=1 time=$mixingtimes
It does the following:
- reading the structure and use the first conformer for spin diffusion calculations.
- 'enoe spindiff' the full-buildup (including spin-diffusion) is calculated for the five mixing times supplied.
- 'enoe twospin' direct transfer (two spin, excluding spin-diffusion)) is calculated for the five mixing times supplied.
- 'enoe spdcorr' calculates the spin diffusion correction for each mixing time specified from the ratio of two-spin versus full-buildup.
Recording NOESY experiments with 13C/15N simultaneous evolution
Bi-directional eNOEs are the most accurate eNOEs with in general no tolerance applied. To obtain these and fulfill the normalization requirements, it is prudent to record combined 13C/14N spectra where the diagonals of 13C bound and 15N bound protons are available in the same spectra.
Vögeli B, Günter P, Riek R (2013) Multiple-state Ensemble Structure Determination from eNOE Spectroscopy. Mol Phys 111:437-454
The pulse sequence are found here: http://n.ethz.ch/~bvoegeli/PulseSequences/CNnoesyhsqc
Preparing experimental peak lists
Peak lists in XEASY format are prepared by automatic peak picking with a visualization program such as CcpNmr Analysis, NMRdraw or NMRview and saved as XXX.peaks, where XXX denotes the name of the xeasy peak list file. Since NMRdraw peak lists are of different file type, cyana provides the command read tab to convert the files to XEASY format.
# Number of dimensions 3 #FORMAT xeasy3D #INAME 1 H #INAME 2 HN #INAME 3 N #SPECTRUM N15NOESY H HN N ??? change to 13C, processed as 13C!!!! 17086 4.098 4.099 57.441 1 U 6.990943E+08 0.000000E+00 e 0 HA.5 HA.5 CA.5 89532 4.355 1.829 33.507 1 U 1.720779E+06 0.000000E+00 e 0 HA.6 HB2.6 CB.6 89544 4.353 1.757 33.513 1 U 2.939628E+06 0.000000E+00 e 0 HA.6 HB3.6 CB.6
The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' (no space between characters) lines gives the experiment type (N15NOESY, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (H HN N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. The experiment type and identifiers must correspond to an experiment definition in the general CYANA library (see below) in most uses of the definition, here however we cheat slightly because for eNOE calculations we record our NOESY spectra with simultanous evolution of 13C and 15N dimensions, since we require 15N and 13C bound spins within the same spectrum for purposes of normalization (see...).
After the '#SPECTRUM' line follows one line for every peak. For example, the first peak in the 'HNCA.peaks' list has
- Peak number 17086
- H chemical shift 4.098 ppm
- ("HN") chemical shift 4.099 ppm (in this case 13C bound)
- Heavy atom chemical shift 57.441 ppm (in this case 13C labeled)
The other data are relevant entry for the eNOE mudules is the peak volume or intensity (6.990943E+08).
SPECTRUM definitions in the CYANA library
When you start CYANA, the program reads the library and displays the full path name of the library file. You can open the standard library file to inspect, for example, the NMR experiment definitions . For instance, the definition for the N15NOESY spectrum (search for 'N15NOESY' in the library file 'cyana.lib') is
SPECTRUM N15NOESY H HN N 0.900 N:N_AM* HN:H_AMI ~4.0 H:H_* 0.800 N:N_AM* HN:H_AMI ~4.5 H:H_* 0.700 N:N_AM* HN:H_AMI ~5.0 H:H_* 0.600 N:N_AM* HN:H_AMI ~5.5 H:H_* 0.500 N:N_AM* HN:H_AMI ~6.0 H:H_*
The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and identifies the atoms that are detected in each dimension of the spectrum. The number of identifiers defines the dimensionality of the experiment (3 in case of N15NOESY).
Each line below defines a (formal) magnetization transfer pathway that gives rise to an expected peak. in the case of N15NOESY there are five lines, corresponding to the through space magnetization transfer by dipol-dipol mechanism. The peak definition starts with the probability to observe the peak (0.900), followed by a series of atom types, e.g. H_AMI for amide proton etc. The atoms whose chemical shifts appear in the spectrum are identified by their labels followed by ':', e.g. for N15NOESY 'H:', 'HN:', and 'N:'. If you were to use the CYANA functions to simulate peaks, expected peaks are generated for each molecular fragment in which these atom types occur.
You may have realized that our peak list contains peaks that are 13C bound, therefore the spectrum definition is wrong, since we are only reading the peak lists and not generating any, this is not a problem.
From nmrDraw to XEASY
In the 'nmrDrawX' directory you find the 'nmrDrawX.cya' macro:
# read nmrDraw tab file read tab demo.tab # sort the peaks peaks sort # write out peak lists do i 1 npkl peaks select "** list=${i}" write peaks $i.peaks names end do
It does the following:
- Conversion of the nmrDraw peak file to XEASY format with the atom assignments in the file (see * read tab).
- Sorting of the peaks in each peaks list per mixing time.
- Writing out the peaks in XEASY format with atom names contained in the file, one peak list for each mixing time.
Using Talos to generate torsion angle restraints
Torsion angle restraints from the backbone chemical shifts help restrict angular conformation space. We wish to use only "strong assignments" to generate these restraints.
If you do not have TALOS installed get it from here. It is part of the nmrpipe software package.
In the 'acoPREP' directory, inspect the 'CALC_talos.cya' file with the commands to calculate the talos angle restraints:
read 5.peaks shifts initialize shifts adapt atoms set "* shift=990.0.." shift=none write demo.prot read prot demo.prot unknown=skip talos talos=talos+ talosaco pred.tab write aco talos.aco
This will call the program TALOS+ and store the resulting torsion angle restraints in the file 'talos.aco'.
Since this is not a calculation suited for the MPI scheduler, start CYANA first, then call the 'CALC_talos.cya' macro from the prompt.
Hint: change to a cshell before running cyana (since talos needs a cshell to run):
csh
eNOE calculations
All eNOE related calculations within cyana are carried out using the eNORA modules.
For best results, NOESY experiments are measured at different mixing times (keeping the mixing times as much as possible within the linear regime of NOE buildup). However one can obtain very good results from a single mixing time. The advantage of a buildup series lies manly in the ability to see NOEs that do not behave as expected, in order to exclude them from use in structure calculation.
eNORA (single mixing time)
In the 'enoe1pt' directory you find the relevant 'init.cya' and 'CALC_enoe1pt.cya' macro's.
The init macro
The initialization macro file has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program by typing 'init'. It contains normally at least two commands that read the CYANA library and the protein sequence:
rmsdrange:=8-33 cyanalib read seq demo.seq
The first line sets the appropriate rmsdrange, and the command 'cyanalib' reads the standard CYANA library. The next command reads the protein sequence.
The protein sequence is stored in three-letter code in the file 'demo.seq'.
Stereo-specificity of dia-stereocenters
atoms stereo "HA? 10" atoms stereo "HB? 7 8 11 13 14 21 23 24 25 26 27 33 34 35 37 38" atoms stereo "HG? 8 12 14 17 36 37" atoms stereo "HD2? 18 26 30" atoms stereo "QG? 22" atoms stereo "QD? 7" atoms stereo "HE2? 33" atoms stereo "HD? 8 14 21 37" atoms stereo "HG1? 28"
However, one may do the following to supply all atoms as stereo specific:
atoms select atoms stereo
or to supply all atoms as non stereo specific, use:
atoms select atoms stereo delete
To get a feedback of the supplied stereo specific assignment add to your 'init.cya' the command:
atoms stereo list
D20 exchange
With 3% D2O in the nmr buffer for exchange of backbone amide atoms:
atoms set "H" protlev=0.97
TauC setting
Individual correlation times may be set per atom, the overall tumbling time is set as:
atoms set "*" tauc=4.25
The eNORA CALC macro
Below you find the 'CALC.cya' script. You will find comments for the commands options, where we found it appropriate.
The 'CALC_enoe1pt.cya':
# ---------------------------------- eNORA routine ---------------------------------- # ---------------------------------- basic parameter definitions ---------------------------------- echo:= on mixingtimes = 0.06 b0field = 700 maxdistance = 6.5 normed = 0 normspin = 2 mode = 3
- The parameter definitions, their function is explained later with respect to the functions that use them.
# ---------------------------------- structure input ---------------------------------- # specify the conformers for calculations read pdb demo rigid structure select 1
- The structure input and which conformers to use for spin-diffusion calculations. If multiple conformers are used to average the spin-diffusion values, the command 'structure select' may be used to select multiple conformers of the pdb.
# ---------------------------------- peak file reading ---------------------------------- # read in the peak lists read peaks 5.peaks # ---------------------------------- initializing the routine ---------------------------------- # initialize the routine, fit experimental decays and buildups enoe init normalize=normspin normed=normed time=$mixingtimes
- 'enoe init' orders the cross peaks to the specified diagonal and either returns all cross peaks or only the normalizable ones.
# fit experimental decays enoe diag opt=1 # fit experimental buildups enoe cross enoe sigma opt=1 # ---------------------------------- spin-diffusion ---------------------------------- # calculate the spin-diffusion correction enoe spindiff b0field=b0field time=$mixingtimes mode=$mode enoe twospin b0field=b0field time=$mixingtimes mode=$mode enoe spdcorr opt=0 time=$mixingtimes
- 'enoe spindiff' calculates the spin diffusion correction factors per mixing time with the FRM approach (mode=3).
# ---------------------------------- apply spin-diffusion ---------------------------------- # apply spin-diffusion correction to experimental buildups (if calculated) and calculate sigma enoe sigma opt=2 # calculate reff enoe reff b0field=b0field # prepare the cyana restraints (scaling, error margins) enoe restraint errStereoFlag=1 errStereo=-1 chiN=-1 errBi=0
- 'enoe restraint' applies various tolerances to the effective distance (bi- versus uni-directional, known or unknown-stereo specificity, degenerate methelyne and methyl groups, etc.).
# ---------------------------------- write restraints ---------------------------------- # delete fixed distances from upl/lol output distance delete fixed # write the distance restraints to file write upl enoe.upl write lol enoe.lol
- finally write the upl and lol files after deleting distances that do not contribute structural information and are redundant ('distance delete fixed').
# ---------------------------------- overview ---------------------------------- enoe overview
- 'enoe overview' writes the overview file 'enoe.ovw'.
Run the eNOE calculation such as:
cyana CALC_enoe1pt.cya
eNORA output files
The eNORA algorithm will produce the following output files:
- enoe.upl and enoe.lol: Upper limit and lower limit restraint files with tolerances applied.
- enoe.ovw: Collated results file.
The enoe.upl/lol distance restrains
Suggested (default) correction factors for lower and upper distance bounds:
Condition Factor for lower bound Factor for upper bound Methyl group 3–1/6 x 0.915 = 0.762 3–1/6 x 1.085 = 0.903 Degenerate isopropyl group (Val, Leu) 6–1/6 x 0.915 x 0.95 = 0.645 6–1/6 x 1.085 x 1.05 = 0.713 Degenerate methylene group 2–1/6 x 0.95 = 0.846 2–1/6 x 1.05 = 0.935 2-fold degenerate aromatic protons 2–1/6 = 0.891 2–1/6 = 0.891 4-fold degenerate aromatic protons 4–1/6 = 0.794 4–1/6 = 0.794 bi-directional NOE 1 1 uni-directional NOE 0.8 (default) 1.2 (default) Aromatic to non-aromatic NOE normalized to non-aromatic diagonal peak 0.89 (default) 1.11 (default) no stereospecific assignment (default calculated) (default calculated)
The enoe.ovw file
Collated file that may be generated at any time during the routine and will be populated with the values available at the momentary progress of calculations.
- ASSIGNMENT(i->j): Assignment arranged with the flow of magnetization.
- REFFixEXP: The experimental sigma of spin x, where x=i,j
- REFFxCORR: The reff of spin x, where x=i,j
- SIGxEXP: The experimental sigma of spin x, where x=i,j
- SIGxCORR: The spin-diffusion corrected experimental sigma of spin x, where x=i,j
- SDCx: The spin-diffusion correction of spin x, where x=i,j. This value is obtained from the ratio of the spin-diffusion corrected sigma over raw experimental sigma. The spin-diffusion corrected sigma, for which the spin diffusion corrections are applied to the intensity at each mixing time and then fitted, is calculated with the 'enoe reff' command.
- SDPx: The number of other partner spins involved in spin-diffusion for spin x, where x=i,j
- IZEROx: The back calculated I(0) value of spin x, where x=i,j
- exp_chiNx: The goodness of fit of the experimental data, where x=i,j
- corr_chiN x: The goodness of fit of the spin-diffusion corrected experimental data, where x=i,j
eNORA (using buildup data)
All eNOE related calculations within cyana are carried out using the eNORA modules.
NOESY experiment measured at different mixing times (keeping the mixing times as much as possible within the linear regime of NOE buildup) supply very precise distance restraints used for a structure calculation.
You will find the relevant macro's in the directory 'enoebup'.
The init macro
The initialization macro is the same as for a single mixing time, except for the setting of the setting of a general rho value:
# -------------- avg rho -------------- atoms set "*" rho=5.3
The eNORA CALC macro
The 'CALC_enoe.cya' starts with the following:
# ---------------------------------- eNORA routine ---------------------------------- # ---------------------------------- basic parameter definitions ---------------------------------- echo:= on mixingtimes = '0.02,0.03,0.04,0.05,0.06' b0field = 700 rhofile = 'rhoInApo.rho' normed = 0 normspin = 2 mode = 3 bname ='bupplots' dname ='decplots' izname ='izPlot' # ---------------------------------- structure input ---------------------------------- # specify the conformers for calculations read pdb demo rigid structure select 1 # ---------------------------------- peak file reading ---------------------------------- # read in the peak lists do i 1 5 read peaks $i.peaks $if(i.eq.1,' ','append') end do
- reading the XEASY peak lists in the order of increasing mixing time.
# ---------------------------------- initializing the routine ---------------------------------- # initialize the routine, fit experimental decays and buildups enoe init normalize=normspin normed=normed time=$mixingtimes # print average experimental auto-relaxation and I(0) values to screen enoe diag opt=2 plot=$izname graf $izname.pdf # supply averaged rho or izero values if (existfile('$rhofile')) then read rho $rhofile end if
- Autorelaxation values and/or Izero values are calculated depending on the option set by the user ('enoe diag') and saved in memory or can be overwritten by reading a file ('read rho').
# plot the diagonal decay's enoe plotdec plot=$dname graf $dname.pdf
Additional commands only usefull with buildup data:
- 'enoe plotdec plot=$dname' and 'graf $dname.pdf' generate a pdf file for the diagonal decays of the experimental data. This is visually inspected to remove diagonal decays that may behave not as expected (experimental artifacts etc.)
# write the auto-relaxation values to file write rhoOut.rho # enoe cross peaks select "NONORM list=1" write peaks 1_NONORM.peaks names # fit experimental buildups enoe sigma opt=1 # ---------------------------------- spin-diffusion ---------------------------------- enoe spindiff b0field=b0field time=$mixingtimes mode=$mode enoe twospin b0field=b0field time=$mixingtimes mode=$mode enoe spdcorr opt=0 time=$mixingtimes # apply spin-diffusion correction to experimental buildups (if calculated) and calculate sigma enoe sigma opt=2 # calculate reff enoe reff b0field=b0field # prepare the cyana restraints (scaling, error margins) enoe restraint errStereoFlag=1 errStereo=-1 chiN=-1 errBi=0 # ---------------------------------- write restraints ---------------------------------- # delete fixed distances from upl/lol output distance delete fixed # write the distance restraints to file write upl enoe.upl write lol enoe.lol # --------------------------------- plotting buildups -------------------------------- enoe plotbup plot=$bname opt=2 graf $bname.pdf
Additional commands only usefull with buildup data:
- 'enoe plotbup plot=$bname opt=2' and 'graf $bname.pdf' print the buildups to file for inspection.
# ---------------------------------- overview ---------------------------------- enoe overview
Start CYANA and execute the 'CALC_enoe.cya' macro from the CYANA prompt as such:
cyana CALC_enoe.cya
The program will run and then stop as soon as it finished plotting the diagonal decay's and has written the 'rhoOut.rho' file. Before commenting out (#) the exit command and running the routine to its completion, do the following exercise.
Exercise: Compiling the autorelaxation file
Before using the calculated averages for auto-relaxation and Izero, check the diagonal decays visually for their quality. Edit out any diagonal peaks from the peak files that give bad decays, then run the routine again.
If your are compiling the rhoIn.rho file manually see * auto-relaxation and I(0) values.
eNORA (using generic normalized eNOEs)
For larger proteins with lots of overlap on the diagonal, it may be necessary to resort to a more pragmatic way to undertake normalization. If on separates the diagonal into spin types in accordance to their relaxation properties, it is possible to calculate an upper limit intensity for the diagonal per spin type. Excluding out-layers, the upper distance limit will be longer in comparison to the true distance. The resulting distance therefore helps to define the structure but will not lead to erroneous results.
In comparison to the regular eNOE calculation there are a few things to change in order to accomplish this feat.
normalized = 0
Change the normalized variable used in the enoe init command from 1 to 0. Doing this will ensure that cross peaks without a corresponding diagonal peaks remain in the pool for calculations.
#fit experimental decays enoe diag opt=3 plot=izplot graf izplot.pdf
Change the opt variable used in the enoe diag command from 2 to 3. This will then calculate the statistics of diagonal intensities per spin type and assign an artificial value to missing spins.
distance select UNIDIR distance select "-GENNORM" distance sort write upl enoe_UNIDIR.upl write lol enoe_UNIDIR.lol distance select BIDIR distance select "-GENNORM" distance sort write upl enoe_BIDIR.upl write lol enoe_BIDIR.lol distance select GENNORM distance sort write upl enoe_GENNORM.upl
Using the distance select command allows to user to separate distances resulting from eNOEs into UNIDIR (uni directional) and BIDIR (bi directional) and separating generic normalized eNOEs (GENNORM) from the other distances. For UNIDIR and BIDIR distances one writes the upper and lower limits, for GENNORM only the upper distances are kept, since the nature of using upper limits for Izero values results in distance that are too long in comparison to the true distances.
The TSS approach to spin-diffusion calculations
The TSS approach to spin-diffusion calculations is especially useful for calculations involving deuterated samples or samples with special labeling (i.e. methyl labeling, see below):
The TSS mode is accessed by setting the parameter mode=2 for the 'enoe spindiff' command (see * enoe spindiff).
Labeling Schemes
Deuterated labeling schemes often involve methyl labeling with 3% D2O in the nmr buffer, i.e.
atoms set "H" protlev=0.97 # labeling scheme: VAL_G1 0% LEU_D1 0% ILE_D1 0% atoms set "QG1 @VAL + QD1 @LEU + QD1 @ILE" protlev=0.0
Considerations regarding the obtained eNOE restraints
Mapping calculated eNOE restraints onto a known structure
One can map the calculated restraints, such as distance restraints (upl/lol) onto a known structure (in the example here the modified xray structure). This is an approach to analyze restraints and their influence on the results.
Below you find the commands to accomplish this. You see by studying the commands, which files are needed to execute the macro. Therefore, create a new directory ('mkdir') or copy a directory containing the respective files. Delete what you do not need. Use the regularized xray structure from the exercise above.
You need an init file:
rmsdrange:=8-33 cyanalib read seq demo.seq
And the main macro (name it 'CALC_xraymap.cya'):
read upl enoe.upl read lol enoe.lol read regula.pdb unknown=warn weight_vdw=0 overview enoe_xray.ovw
- If the restraints do not match with the xray structure, does it mean they are wrong?
- If you tried the two options, what is (are) the difference(s)?
Multi-state structure calculation with ensemble-averaged restraints
To facilitate the discussion of multi-state structure calculations and ensembles we use the following definitions:
- The CYANA target function is a measure for the quality of the computed structural ensembles given in terms of the squared violation of the experimental restraints.
- A structure is defined by a bundle (or an ensemble) of conformers fulfilling the experimental data.
- A conformer is the result of one individual structure calculation that fulfills the experimental data and may be composed of one or more states.
- A state is one set of coordinates for all atoms of a molecule. If there are multiple states they fulfill the experimental data on average and not individually.
- Sub-bundles are formed by sorting the states according to structural similarity in the region of interest. There are as many sub-bundles as there are states in a conformer, and each sub-bundle comprises as many conformers as the original structure bundle. This requires for each state to belong to exactly one sub-bundle. The sub-bundle for each structural state is a measure of the precision of the individual structural states similar to the conventional bundle representation.
Calculating a single state structure
We will perform calculations based on eNOEs by using torsion angle dynamics in order to compute the three-dimensional structure of the protein.
The 'enoe.upl' and 'enoe.lol' files will be used together with the aco based on chemical shifts of the backbone and scalar couplings from backbone, Ha-HB and aromatic residues determined by experiment.
The single-state structure calculation is in principle a regular structure calculation, using your upl/lol, aco and cco files as input.
This would look something like this (we will do it differently):
syntax inputseed=@i=3771 # ------ Structure calculation ------ read upl XXX.upl read lol XXX.lol read aco XXX.aco read cco XXX.cco anneal_weight_aco := 1.0, 1.0, 0.0, 0.0 anneal_weight_cco := 0.0, 0.5 seed=inputseed calc_all 100 steps=50000 if (master) then cut_cco=1.0 cut_rdc=3.0 weight_aco = 0.0 rmsdrange:=8-33,108-133 overview sstate structures=20 pdb end if
However, since we end up calculating also multi-state structures later on, it makes sense to setup the single-state calculation exactly the same way as the multi-state calculations, and only edit as few parameters as possible. As soon as you understand the 'PREP.cya' macro below, you will realize why this makes sense.
Single state calculation
In the 'sstate' you will find the 'init.cya', 'PREP.cya' and the 'CALC_sstate.cya' macro's.
The init macro
In addition to what was described above, the 'init.cya' macro contains additional lines to read the multi-state sequence in order to prepare the restraints and run the structure calculation:
cyanalib if (existfile('bundle.seq')) then read seq bundle.seq molecules define * atoms set * vdwgroup=bundle else read seq demo.seq end if rmsdrange:=8-33 swap=0 expand=1
- the distances derived from NOEs involving magnetically or chemically equivalent spins are interpreted as effective distances corresponding to the sum of all cross-relaxation rates between the individual spin pairs (expand=1).
atoms stereo "HA? 10" atoms stereo "HB? 7 8 11 13 14 21 23 24 25 26 27 33 34 35 37 38" atoms stereo "HG? 8 12 14 17 36 37" atoms stereo "HD2? 18 26 30" atoms stereo "QG? 22" atoms stereo "QD? 7" atoms stereo "HE2? 33" atoms stereo "HD? 8 14 21 37" atoms stereo "HG1? 28" atoms set "H" protlev=0.97
The PREP macro
The 'PREP.cya' macro prepares the calculated eNOEs for use in multi-state structure calculation. For a single state structure calculation this would really not be necessary, we only run the script to keep the calculations consistent and not accidentally introduce changes other than the multi-state changes. We nevertheless explain the detailed workings of the script here.
syntax nbundle=@i=1 togetherweight=@r=0.1 multitensor
- 'nbundle=@1' sets the number of states to 1, for a two state structure calculation this need be set to two.
#multitensor=.true. together=.true. moloffset=100
- 'moloffset' sets the offset in residue numbering between the sets of coordinates that compose one conformer.
# ------ Sequence file ------ read seq demo.seq print "\# Bundle of $nbundle conformers" >bundle.seq do j 1 nbundle do i 1 nr if (j.lt.nbundle .and. rnam(i).eq.'PL') break print "$rnam(i) ${$rnum(i)+moloffset*(j-1)}" >> end do if (j.lt.nbundle) print "PL LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LL2 LP" >> end do print >>.
- The offset is incorporated in the sequence by a linker between the sets of coordinates that compose one conformer. For a two-state conformer there is one linker, for a three-state conformer there are two linkers. This is necessary because we operate in angular space.
# ------ Make bundle angle restraints ------ read aco demo.aco write aco bundle.aco do j 2 nbundle atom set * residue=residue+moloffset write aco bundle.aco append end do
- this prepares the talos angels with the offset in residue numbering.
# ------ Make bundle coupling constant restraints ------ read demo.seq read cco demo_backbone.cco read cco demo_aro.cco append read cco demo_JHaHb.cco append print "\# Coupling constant restraint file" >bundle.cco do i 1 ncco i1=iccoa(1,i); i2=iccoa(2,i) do j 1 nbundle m=moloffset*(j-1) print "${$rnum(iar(i1))+m} $rnam(iar(i1)) $anam(i1) ${$rnum(iar(i2))+m} $rnam(iar(i2)) $anam(i2) $cco(i) $tolcco(i) 1.0 $karplus(1,i) $karplus(2,i) $karplus(3,i) $if(j.ne.nbundle,'&',' ')" >>bundle.cco end do end do print >>. read seq bundle.seq read cco bundle.cco write cco bundle.cco karplus
- this prepares the experimental scalar coupling restraints with the offset in residue numbering.
# ------ Make bundle RDC restraints ------ #read bundle.seq #read rdc demo.rdc #print "\# RDC restraint file" >bundle.rdc #do i 1 orientations # if (multitensor) then # do j 1 nbundle # print "${i+orientations*(j-1)} $magnitude(i) $rhombicity(i) ${$rnum(iar(irtena(4,i)))+moloffset*(j-1)}" >> # end do # else # print "$i $magnitude(i) $rhombicity(i) $rnum(iar(irtena(4,i)))" >> # end if #end do #do i 1 nrdc # i1=irdca(1,i); i2=irdca(2,i) # do j 1 nbundle # m=moloffset*(j-1) # iten=irdct(i); if (multitensor) iten=iten+orientations*(j-1) # print "${$rnum(iar(i1))+m} $rnam(iar(i1)) $anam(i1) ${$rnum(iar(i2))+m} $rnam(iar(i2)) $anam(i2) $rdc(i) $tolrdc(i) $weirdc(i) $iten $rdcsca(i) $if(j.lt.nbundle,'&',' ')" >>bundle.rdc # end do #end do #print >>. #read seq bundle.seq #read rdc bundle.rdc #write rdc bundle.rdc
- if experimental RDC's are available include this section of the code.
# ------ Make ambiguous bundle distance restraints ------ subroutine PURGE #distance delete "HA 9, HB2 9" end
- if there are distance restraints you decide to delete the assignment can be included in the PURGE command, to remove them below from the generated upl and lol bundle restraints.
init read upl enoe.upl PURGE distance modify info=full molecules symmetrize if (nbundle.gt.1) distances set "$moloffset.., $moloffset.." bound=0.0 distances set "*, *" bound=bound*(1.0*nbundle)**(-1.0/6.0) write upl bundle.upl
- this prepares the experimental upper limit distance restraints with the offset in residue numbering, makes them ambiguous and imposes the multi-state averaging condition.
init read lol enoe.lol PURGE distance modify info=full molecules symmetrize if (nbundle.gt.1) distances set "$moloffset.., $moloffset.." bound=0.0001 distances set "*, *" bound=bound*(1.0*nbundle)**(-1.0/6.0) write lol bundle.lol
- this prepares the experimental lower limit distance restraints with the offset in residue numbering, makes them ambiguous and imposes the multi-state averaging condition.
# ------ Make restraints to keep corresponding atoms together ------ if (together .and. nbundle.gt.1 .and. togetherweight.gt.0.0) then read seq bundle.seq molecules define * atom set * vdwgroup=bundle atom select "N C*" do i 1 na if (iamol(i).ne.1) break if (asel(i)) then distance make "$atom(i)" "$anam(i) ${$rnum(iar(i))+moloffset}" upl=1.2 weight=$togetherweight info=none end if end do distances set "* - N CA C CB, * - N CA C CB" weight=weight*0.1 molecules symmetrize
- this prepares artificial distance restraints to keep the keep the multi-state coordinates close together for the averaging condition. Otherwise molecules very far away would also fulfill the condition, since they have no contribution to the eNOE, they also do not violate it (not a desired outcome).
- 'molecules symmetrize' disables van der Waals forces between the copies of the same molecule within the same calculation
distances unique write upl together.upl end if
The single-state CALC macro
The 'CALC_sstate.cya' file for structure calculation is outlined below:
syntax inputseed=@i=3771 if (master) then PREP end if # ------ Structure calculation ------ read upl bundle.upl read lol bundle.lol read aco bundle.aco read cco bundle.cco #read rdc bundle.rdc if (existfile('together.upl')) read upl together.upl append anneal_weight_rdc := 0.0, 0.5 anneal_weight_aco := 1.0, 1.0, 0.0, 0.0 anneal_weight_cco := 0.0, 0.5 seed=inputseed calc_all 100 steps=50000
- these commands call to start the structure calculation with 100 conformers and to analyze the best 20 of them to keep. 50000 torsion angle dynamics steps are applied per conformer.
if (master) then cut_cco=1.0 cut_rdc=3.0 weight_aco = 0.0 rmsdrange:=8-33,108-133 overview bundle structures=20 pdb #read pdb bundle.pdb #rmsdrange:=23-26, 31-34,123-126, 131-134 #overview bundleSec structures=20 pdb # reference=xxx.pdb #molecules sort "BACKBONE 23-26, 31-34" base=1 #write sortStates.pdb all #SPLIT end if
The structure calculation is executed by running the 'CALC_sState.cya' macro:
cyana -n 33 CALC_sState.cya
Doing this, basically means each processor will calculate 100/33=3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC_sState.cya'.
Carefully analyze the WARNING and ERROR messages if any.
Statistics on the the structure calculation will be displayed to screen. The final structure is named 'bundle.pdb'.
Calculating Multi-state structures
The multi-state structure calculation is analogous to what was shown for the single-state calculation, therefore we only explain additional commands and changes.
Grouping the coordinates of multi-state calculations
In the 'CALC_multistate.cya' script, there are the following additional commands:
read pdb bundle.pdb rmsdrange:=23-26,31-34,123-126,131-134 overview bundleSec structures=20 pdb # reference=xxx.pdb molecules sort "BACKBONE 23-26,31-34" base=1 write sortStates.pdb all
It is very important here that the 'rmsdrange' is set the same as for the 'molecules sort' command. Otherwise the 'pdb write' command inside the 'SPLIT' macro, has the potential to create confusing results.
The SPLIT macro
We introduced a molecular offset to create a multi-state coordinate set for each conformer, now we want to set this offset back and delete the linker between the states.
moloffset=100 read pdb sortStates.pdb n=nstruct write_all split nbundle=$rnum(nr)/moloffset+1 show nbundle do i 1 nstruct read seq bundle.seq read pdb split$i(I3.3).pdb do j 1 nbundle atoms select 1-80 write pdb split$i(I3.3)-$j.pdb selected atoms set * residue=residue-moloffset end do read seq demo.seq read_all split$i(I3.3)-*.pdb unknown=skip write split$i(I3.3).pdb all do j 1 nbundle remove split$i(I3.3)-$j.pdb end do end do read seq demo.seq do i 1 n read pdb split$i(I3.3).pdb append end do write pdb splitall.pdb all rmsd
- The final output is the 'splitall.pdb' file.
Exercise: Setting up a two-state calculation
Copy the 'sstate' directory and give it the name 'twostate', then delete all the previous, unnecessary output files to reduce clutter and have better oversight. Copy the 'CALC_sstate.cya' and rename it 'CALC_multistate.cya'.
cp -r sstate twostate cd twostate mv CALC_sstate.cya CALC_multistate.cya rm *.out *.job final* rama*
With a text editor, edit the 'CALC_multistate.cya' macro to activate the inactive commands (by deleting the preceeding hashtag #) necessary to perform the grouping of states and splitting of the conformers.
With a text editor, change the number of states (nbundle=@i=) from one to two in the 'PREP.cya' macro:
syntax nbundle=@i=1 togetherweight=@r=0.1 multitensor
When you are done preparing the macros as outlined perform the calculation by running the 'CALC_multistate.cya' macro:
cyana -n 33 CALC_multistate.cya.cya
Carefully analyze the WARNING and ERROR messages if any.
Results: analysis
Download and install the molecular viewer Chimera
- Download Chimera (to your personal laptop) from: Chimera
Exercise: Single state structure analysis
The final structure will be 'final.pdb'. You can visualize it, with chimera:
chimera bundle.pdb
Analyze the result, the bundle seems unnaturally tight for an NMR structure bundle. Why?
Exercise: Two-state structure analysis
The final structure will be 'splitall.pdb'. You can visualize it, with the chimera
chimera splitall.pdb
By then loading 'chimera.com' script in the directory, you can individually color the states to cyan and blue.
On improving the final structure
General questions to answer regarding this task:
- How can you get more eNOEs out of the existing data? Hint: think about normalization.
- Name additional experimental restraints (or inputs) you could use for structure calculation.
- Name additional NMR experiments you could measure, to acquire experimental data that are not supplied with the demo_data.
eNORA extensions and options
There are a variety of commands to modify eNORA runs.
Averaging of spin-diffusion over multiple conformers
After reading the pdb set the 'structure select' command to:
structure select 1-20
Generating XEASY peak list with expected FRM or two-spin intensities
Remember to set up a init file:
cyanalib read seq demo.seq # -------------- stereo specific assignment -------------- # to supply all atoms as stereo specific, use: atoms select atoms stereo # see the supplied stereo specific assignment atoms stereo list # -------------- tauC -------------- atoms set "*" tauc=4.25
The main CALC.cya file:
# -------------------------- get the shifts from a XEASY peaks list --------------------------- ./init # convert read 5.peaks shifts adapt contribution=0.0 shifts renumber atoms set "* shift=900.0.." shift=none write demo.prot # -------------------------------- basic parameter definitions -------------------------------- ./init echo:= on mixingtimes:= 0.02,0.03,0.04,0.05,0.06 b0field = 700 # ---------------------------------- structure input ----------------------------------- # specify the conformers for calculations read pdb demo rigid structure select 1 # ---------------------------------- peak input ---------------------------------- loadspectra structure=demo.pdb peaks=N15NOESY,C13NOESY prot=demo.prot simulate # ---------------------------------- run eNORA elements and write peaks ---------------------------------- do n 1 length('mixingtimes') enoe spindiff b0field=b0field time=$mixingtimes(n) mode=3 labilatom='NONE' enoe twospin b0field=b0field time=$mixingtimes(n) mode=3 labilatom='NONE' read peaks N15NOESY_exp.peaks # FM atoms values mode=1 write peaks N15NOESY_FM_$n.peaks names read peaks C13NOESY_exp.peaks enoe values mode=1 write peaks C13NOESY_FM_$n.peaks names # 2 spin enoe values mode=2 write peaks N15NOESY_2spin_$n.peaks names read peaks C13NOESY_exp.peaks enoe values mode=2 write peaks C13NOESY_2spin_$n.peaks names end do read peaks C13NOESY_FM_1.peaks peaks2dplot dimensions=12 read peaks C13NOESY_FM_1.peaks read peaks N15NOESY_FM_1.peaks append shifts initialize shifts adapt atoms set "* shift=990.0.." shift=none write prot NOESY_1.prot write peaks NOESY_1.peaks
Depositing multi-states structures to a PDB data base
PDB data bases require a specific format to deposit structures for publication. Below you find a CYANA script that will allow you to transform a multi-state structure into a publishable format. The format distinguishes the states by using a chain letter, such as A and B for a two-states structure. Populations are specified in this format as occupancy (corresponding to the Xray structure format).
read seq demo.seq read pdb demoState1.pdb read pdb demoState2.pdb append atoms select 11-16,21-26,31-34 write pdb append.pdb all read pdb append.pdb rigid structure select 1-20 atoms set * chain=A write_all splitA structure select 21-40 atoms set * chain=B write_all splitB remove splitAB.pdb do i 1 nstruct j=i+20 system "cat splitA$i(I3.3).pdb splitB$j(I3.3).pdb >> splitAB.pdb; rm -f split?$i(I3.3).pdb ; rm -f split?$j(I3.3).pdb" end do read seq demoAB.seq read pdb splitAB.pdb write pdb splitAB.pdb all ter read pdb splitAB.pdb deposit pdb=demoAB.pdb read bundle.seq read bundle.lol read bundle.upl read bundle.aco read bundle.cco atoms set "* 101-199" chain=B #residue=residue-100 atoms set "* 1-99" chain=A atoms set "* :B101-B199" residue=residue-100 write bundleAB.lol write bundleAB.upl write bundleAB.aco write bundleAB.cco karplus read seq demoAB.seq molecules define A6-A39 B6-B39 atoms set * vdwgroup=bundle rmsdrange:=A11-A16,A21-A26,A31-A34,B11-B16,B21-B26,B31-B34 read pdb demoAB.pdb read bundleAB.lol read bundleAB.upl read bundleAB.aco read bundleAB.cco overview read seq demoAB.seq read pdb demoAB.pdb molecules define A5-A39 B5-B39 atoms set "* :A*" occupancy=0.5 atoms set "* :B*" occupancy=0.5 write demoOcc.pdb multistate all details bfactor=0.00