ACMI v1.5.1 manual (c) 2007 Frank DiMaio (c) 2008 Ameet Soni 1 INTRODUCTION -------------- ACMI is a suite of programs designed to automate construction of a macromolecular model from an electron density map. It is specifically developed to handle low-resolution (3-4 Angstrom resolution) and poorly phased electron density maps. Details of ACMI's algorithms are available from the publications listed at the end of this manual. The manual is meant as an introduction to using ACMI to construct protein models from electron density, and a reference to each program's command-line parameters. Generally, a user wanting to construct an all-atom protein model from an electron density map will want to use the following program pipeline: a) SHFEAR -- given a protein amino-acid sequence and a density map -- searches the map for a set of templates corresponding to each amino-acid residue in the protein. It computes the prior probability distribution that of each amino-acid's presence at every location in the map. b) BP uses structural knowledge of the protein to refine the probabilities computed by SHFEAR. Its probabilistic inference algorithm computes the posterior distribution of each amino acid's presence at every location in the map; that is, the probability taking into account the location of every other amino acid. c) SIDECHAIN_PF uses a Monte-Carlo sampling algorithm over BP's posterior distributions to generate a _set_ of physically feasible protein structures, that best explains the computed posterior distribution. These three programs are connected using the idea of a working directory. Each program writes its output to the working directory; the next program in the pipeline reads in from this working directory, processes this input, then writes its output to this same directory. 2 TUTORIAL ---------- A small, well-phased 2.5-Angstrom-resolution density map -- with corresponding amino-acid sequence and crystal parameters -- is provided in the folder 'test/data/'. The density map is contained in the file 'test.fs4.map', the amino-acid sequence in 'test.seq', and the crystal parameters in 'test.cryst'. The script 'do_test.sh' shows a simple pipeline for the provided data. Below, we show each command in the pipeline, annotated with the output produced. ./shfear -w test/ -m test/data/test.fs4.map \ -c test/data/test.cryst \ -s test/data/test.seq \ -r 2.5 -o 0.9 -N 20 -p 1 1 NOTE: Before running this command, make sure the the environment variable "ACMI_NMERDB" is set to specify the folder containing the nmer database. This command runs SHFEAR's template searching over the density map. For each amino-acid position in test.seq, SHFEAR considers the 5-amino-acid sequence _centered_ at this position, and searches ACMI's fragment database for structures with a 5-AA sequence with high local similarity. It performs an exhaustive 6D search over the map for each matching structure, computing a correlation coefficient at every location and rotation. The density map, crystal parameters (identical to the 'CRYST1' line in a PDB file) and sequence information (one-letter codes, each chain on a line, with multiple asymmetric copies indicated with multiple lines) are specified with the '-m', '-c', and '-s' flags, respectively. The '-r' flag optionally provides the density map resolution. The '-o' flag says to ignore 90% of the map in this search -- that is, only look for fragments in the best 10% of the map. We set this value higher than the default 0.8 (ignoring only 80% of the map) because this is a well-phased map; setting it to 0.9 should find as good a model in about half the time. The '-N' flag specifies the minimum number of templates for which to search per residue. The default value, 50, typically works well; we set it to 20 here for performance reasons. Finally, the -p flag is used for parallelizing the algorithm. The default, '-p 1 1' (read 1-of-1) runs the entire job on a single processor. If there are two machines available, running two jobs, one specifying '-p 1 2', the other '-p 2 2' will finish about twice as fast. The working directory is specified with the '-w' flag (here it is the folder 'test/'). SHFEAR creates a subdirectory in the working directory named 'priors'. In this folder, for each 5-amino-acid sequence XXXXX, SHFEAR creates 5 files: match_XXXXX.cc.mat - a map of correlation coefficients between the best matching templates (C-alpha-centered) and the density map. match_XXXXX.betaF.mat, match_XXXXX.gammaF.mat - a map of the "preferred" location of the adjacent N-terminal C-alpha, specified in spherical coordinates. match_XXXXX.betaB.mat, match_XXXXX.gammaB.mat - a map of the "preferred" location of the adjacent C-terminal C-alpha, specified in spherical coordinates. Each of these maps is in MATLAB v6 format, and can be loaded and visualized in MATLAB. Data is extended to cover the entire unit cell (not just the asymmetric unit). ./BP -w test/ -c test/data/test.cryst \ -s test/data/test.seq This command runs ACMI's inference algorithm on the previously computed priors. The arguments -c and -s are identical The working directory -- again indicated with '-w' -- must be the same working directory in the previous step. BP reads in the contents of the 'priors' folder, and creates an output directory, 'posteriors'. In this folder, for each residue X of chain Y, BP creates the output file 'test_chnY_resX_marginal.mat'. This file contains the posterior distribution (that is, the distribution taking all other amino acid locations into account) of residue X's C-alpha location. Each of these maps are also in MATLAB v6 format, and can be loaded and visualized in MATLAB. BP also creates a subfolder to the working directory, 'backbone_models'. At each iteration of the algorithm, BP computes a backbone (C-alpha-only) model of the protein, and places it in this folder. ./sidechain_pf -w test/ -m test/data/test.fs4.map \ -c test/data/test.cryst \ -s test/data/test.seq \ -r 2.5 -b 1-1 This command uses the posterior distributions to guide a Monte Carlo sampling algorithm, producing an ensemble of all-atom protein models. The arguments '-m', '-s', and '-c' indicate the map, crystal, and sequence files, as in the other core programs. Resolution is indicated with -r, while an additional argument, '-b' indicates a range of structures to generate. In this case, we generate just 1 structure. Allowing a range of indices makes parallelizing straightforward, e.g. one can compute structures 1-5 on one machine, and 6-10 on another (repeated runs with a the same index yield the same structure. The working directory, specified with '-w', is the same as before. SIDECHAIN_PF reads the contents of the 'posteriors' directory from this folder, and creates a new output folder, 'Nparticle_models' (where N is the number of particles per run, by default 100). It outputs a set of all-atom models, named prot_allatom_pf.K.pdb, where K is the structure's index (corresponding to the range specified with '-b'). SIDECHAIN_PF also creates a directory 'fragment_cache' This set of structures comprises the final model produced by ACMI. 3 CORE PROGRAMS --------------- This section describes individual command line arguments to the core programs SHFEAR, BP, and SIDECHAIN_PF. 3.1 SHFEAR Given the amino acid sequence and a map, this program looks up each 5-mer in the database, finds a number of instances, and searches the map for these instances. For each residue in the protein, it outputs a map of correlation coefficients between the best-matching instance and the electron density map. Usage: ./shfear [options [args]] --help, -h Displays this information --working, -w [REQUIRED] Working Directory This is the top-level output directory, shared by all programs in ACMI SHFEAR writes its output maps to a subdirectory, 'priors'. --map, -m [REQUIRED] Input fs4 or CCP4 map file Full path to the fs4-format or CCP4-format density map. --cryst, -c [REQUIRED] Input crystal file A one-line text file that has the exact same syntax as the 'CRYST1' line in a PDB file. --seq, -s [REQUIRED] Input sequence file The protein's sequence as a one-letter code ('_'s indicate gaps). Each chain is specified on a separate line. NOTE! If there are multiple copies in the asymmetric unit, you will need to enter a line for each. --reso, -r Density map resolution (default 2.0) --bandwidth, -B Spherical harmonic bandwidth (default 12) --radius, -R Fragment radius (default 5) --nfrags, -N Number of fragments considered per amino acid (default 50) Search parameters. These are optional, and defaults give reasonable results in most cases. Higher values for 'bandwidth' and 'nfrags' may yield higher-quality results at the cost of additional runtime. --grid, -x Approximate map grid spacing (default 1A) Grid spacing is now computed automatically, given an approximate grid spacing (works the same as 'fft' in CCP4). The default is reasonable in most cases. --omit, -o Using density values, omit fraction of map (default 0.8) A first-pass filter, based on density values throws away this fraction of points. If the map is especially well-phased, you may consider increasing this to 0.9 (reducing the runtime by half). --parallel, -p Parallelize of This flag allows parallelism of shfear jobs. Individual amino-acids are run on separate machines. Parallelization of jobs is done as 'k of N'; e.g., -p 1 12 indicates this job is the first of 12 parallel runs. --analogues, -a List of structural analogues for fragment generation Allows specification of structural analogues. The search procedure will use templates from this structure, and only go to the database if there are no instances in the analogue. Specifically a list of PDB files, delimited by spaces. --force, -f Force priors to be overwritten Normally, if an ouput file exists in 'priors' from a previous run of shfear, it will not be overwritten. This flag forces overwriting in such a case. --partial, -P Use partial solution PDB structure If a user has a partial solution already determined (e.g. from a molecular replacement solution), shfear can skip the template search for any solved residue by using the coordinates in the partial solution. A normal search is done for all non-specified residues in the PDB file. *NOTE see below for partial solution format requirements --seleniums, -e Substitute sulfurs with seleniums in Methionine residues When generating templates densities from PDB pentapeptides, sulfurs in methionines are replaced with the density of seleniums when methionine is the central residue. --augment, -A Repeat shfear, but expand previous results based on given directory" Expand the search space (i.e., reduce the -o omit paramater value) of a previous run. Input the location of the previous match calculation results 3.2 BP Given the output of shfear, BP runs the inference algorithm to compute marginal distributions of each residue's position. Usage: ./BP [options [args]] --help, -h Displays this information --working, -w [REQUIRED] Working Directory These are the same as in shfear. BP reads from the priors directory, and creates two directories for output, 'posteriors', and 'backbone_models'. --cryst, -c [REQUIRED] Input crystal file --seq, -s [REQUIRED] Input sequence file These are the same as in shfear. --sigma, -o Average squared rotation error (default 1.0) --m-est, -m Probability of incorrect rotational alignment (default 0.1) --iters, -i Iterations of BP (default 25) Search parameters. These are optional, and defaults give reasonable results in most cases. --prioratoms, -e PDB file containing locations of selenium and/or CA atoms --se-prior, -p Weight for provided prior atom list (E.g. -e SeWgt CaWgt) default 0.9 0.1 These two parameters allow you to specify the location of all the selenium atoms (from phasing experiments) and/or CA atoms from a heuristic search, as well as a weight to apply to these locations. The default weight should work fine in most cases. --memory, -x Maximum memory usage before paging (in MB) --scratch, -d Scratch directory for paging (will use a unique subdir) With especially large maps, BP often need significant amounts of memory, which can lead to swapping and significant performance penalty. These flags allows specification of a maximum amount of memory to allocate, and a scratch directory, which will be used for temporary storage, only if required memory is >= the memory limit. Default is 4000 MB and '/scratch', respectively. --partial, -P Use partial solution PDB structure If a partial solution was used for shfear, the same file needs to supplied to BP, otherwise it will not be able to interpret the shfear results properly *NOTE see below for partial solution format requirements --fixed, -F Fix residues in partial solution By default, the partial solution will be treated as a normal probability distribution, meaning BP may modify the solution. This flag will cause BP to lock the solution for the residues in the partial solution. --secondary, -s Input secondary structure sequence file Input secondary structure prediction sequence. Should be in same format as sequence file with same lengths. Currently supports three classes: helix (H), loops/other (L), beta (E) --guide, -g Input priority function for guiding message passing Input file containing priority value of each character in the amino acid sequence. The format is similar to a sequence file, and each amino acid should have a value from 0 to 9 with 9 being the highest priority. An example is to use protein disorder prediction values as the priority function. --start, -S Central residue for beginning BP --continue, -C Start from previous BP results (i.e., use previous posteriors as starting point --force, -f Ignore previous checkpoint --backwards -B Process equally weighted nodes from right to left --checkpoint -t Toggle checkpointing off 3.3 SIDECHAIN_PF Given the posteriors from BP, use particle filtering to obtain a set of all-atom models. Usage: ./sidechain_pf [options [args]] --help, -h Displays this information --working, -w [REQUIRED] Working Directory Same as in shfear. sidechain_pf reads from 'posteriors', and writes to a folder Nparticle_models, where 'N' is the # of particles. --map, -m [REQUIRED] Input fs4 or CCP4 map file --cryst, -c [REQUIRED] Input crystal file --seq, -s [REQUIRED] Input sequence file The same as shfear/BP. --reso, -r Density map resolution (default 2.0) --bandwidth, -B Spherical harmonic bandwidth (default 16) --rad-align, -g Fragment radius for alignment (default 5) --rad-mask, -k Fragment radius for computing CC (default 8) --mask-dist, -d Mask dist for computing CC if rad-mask > rad-align (default 3.0) --nfrags, -N Number of fragments considered per amino acid (default 50) Search parameters. These are optional, and defaults give reasonable results in most cases. Many are the same as in shfear. --npart, -p Number of particles (default 100) The number of point estimates used to model the posterior. The default is probably reasonable. --buckets, -b Bucket number (or range) of this particle set (default 1-10) Range of buckets, or "populations" of particles over which to run (e.g -b 3-6 runs buckets 3,4,5, and 6). No resampling occurs between different buckets, so each bucket produces an independent interpretation. Using '-b' it is possible to parallelize PF runs: for example, running buckets 1-5 on 1 machine, and 6-10 simultaneously on another. --b-min, -y Min belief to continue a trace (default 0.001) --M-max, -z Max entropy to start a trace (default 2.8) Parameters controlling when to abort chain growth and to terminate interpretation early. Defaults give reasonable results. --analogues, -a A list of structural analogues to use for fragment generation --partial, -P Use partial solution PDB structure If a partial solution is available, sidechain_pf can skip the sampling procedure for any pre-determined residues. This will seed all particles with the partial solution first. *NOTE see below for partial solution format requirements --guide, -G Input priority function for guiding message passing 4 EXTRA UTILITIES ----------------- Several additional programs are provided with the ACMI distribution. The program GET_RMSD is used to evaluate an ACMI model against a crystallographer-built model. The program SIDECHAIN_NAIVE takes the backbone model produced by BP and stitches sidechains on this predicted backbone. Finally, the program BUILD_NMER_DB is provided as a reference to users wishing to construct their own 5-mer fragment library. 4.1 GET_RMSD A short program to analyze computed results. Unless the 'verbose' option is specified, arguments are output in a single comma-separated line, in the following order: 1) Number of chains in the model 2) Percentage of "true" C-alphas found 3) Percentage of "true" sidechains correctly identified 4) Percentage of predicted structure corresponding to "true" structure 5) Percentage of predicted structure corresponding to correctly-identified "true" structure 6-8) C-alpha, all-atom, and sidechain-only RMS error of WHAT WAS CORRECTLY PREDICTED Values 2 and 3 correspond to the "recall" of the model, while values 4 and 5 correspond to the "precision". If 'nobreak' was not specified, 8 additional values provide the same metrics for the model in which the predicted chain was optimally broken. Usage: ./get_RMSd [options [args]] --help, -h Displays this information --true, -t [REQUIRED] True PDB --pred, -p [REQUIRED] Predicted PDB Paths to the true and the predicted PDB files. --correct, -c Consider AA correct if CA is within this distance of the true CA location (default=2A) Self-explanitory. --outpdb, -o Write PDB in correct asymm unit to STDOUT Outputs to STDOUT the predicted PDB, but rotated to lie in the same asymmetric unit as the true PDB. Useful for displaying the results. --stats, -s Write matching statistics to STDOUT Prints a comma-seperated list comparing prediction confidence (specified in the B-factor column of the predicted PDB), and the distance from the true CA location. --nobreak, -n Don't try to break chains By default, get_RMSd will try to break false connections in the model. Use this flag to disable this behavior. --verbose, -v Use verbose output By default, get_RMSd will output on a single comma-separated line. This argument forces human-readable output. 4.2 SIDECHAIN_NAIVE Given a backbone model produced by BP, this program naively creates an all-atom model by stitching sidechains on this backbone-only model. The arguments are similar to SIDECHAIN_PF. SIDECHAIN_NAIVE writes to STDOUT. Usage: ./sidechain_pf [options [args]] --help, -h Displays this information --working, -w [REQUIRED] Working Directory Same as in shfear. sidechain_pf reads from 'posteriors', and writes to a folder Nparticle_models, where 'N' is the # of particles. --map, -m [REQUIRED] Input fs4 or CCP4 map file The same as shfear/sidechain_pf. --cryst, -p [REQUIRED] Input backbone model A backbone model (C-alpha-only) of the protein of interest. --reso, -r Density map resolution (default 2.0) --bandwidth, -B Spherical harmonic bandwidth (default 16) --rad-align, -g Fragment radius for alignment (default 5) --rad-mask, -k Fragment radius for computing CC (default 8) --mask-dist, -d Mask dist for computing CC if rad-mask > rad-align (default 3.0) --nfrags, -N Number of fragments considered per amino acid (default 50) Search parameters. These are optional, and defaults give reasonable results in most cases. --analogues, -a A list of structural analogues to use for fragment generation 4.3 BUILD_NMER_DB This program allows one to construct a custom fragment library. The program takes two command line arguments: a) A text file where the first four charaters of each line give a PDB accession code, and the fifth gives a chain identifier (with a single header line). b) The location of the fragment database. Before building the database, the PDB files should be downloaded and placed in 'nmerdb/fraglib', and named 'pdbXXXX.pdb', where XXXX is the accession code. A sample such text file is given in 'nmerdb/cullpdb_pc30_res3.0_R1.0_d050604_chains4257.txt' The intended use of the program is to use the Dunbrack lab's culled PDB lists at 'http://dunbrack.fccc.edu/Guoli/pisces_download.php'. The "live" database is always at 'nmerdb/nmers.db' (and ACMI always looks for PDB files in 'nmerdb/fraglib'). 5 PARALLELIZATION ----------------- The utility SHFEAR is the component of ACMI that requires the most significant computation time. However, it is trivially parallelized using the '-p' flag. Inputs are specified in a k-of-n format, e.g. '-p 12 35' indicates that this run of SHFEAR is to be job #12 after breaking the task into 35 jobs. Jobs are split at the amino-acid level; specifying more parallel jobs than amino acids in the protein will not provide additional parallelism (extra jobs will immediately halt). SHFEAR's parallelism works extremely well with the CONDOR distributed computing environment. Details of CONDOR are available on the project's home page, at http://www.cs.wisc.edu/condor/. Included with this distribution is a sample SHFEAR condor-submission file, 'run.condor'. In this file, the testset protein in the folder '_02b_70653' is split into 323 jobs (one for each amino-acid). These jobs are submitted to the job queuing system with the command 'condor_submit run.condor'. 6 ACMI PUBLICATIONS ------------------- SHFEAR is described in: F. DiMaio, A. Soni, G. Phillips and J. Shavlik (2007). Improved methods for template-matching in electron density maps using spherical harmonics. Proc. BIBM. F. DiMaio, A. Soni, G. Phillips & J. Shavlik (2009). Spherical-Harmonic Decomposition for Molecular Recognition in Electron-Density Maps. International Journal of Data Mining and Bioinformatics, 3, pp. 205-227. BP is described in: F. DiMaio, J. Shavlik, and G. Phillips (2006). A probabilistic approach to protein backbone tracing in electron density maps. Bioinformatics 22. e81-e89. A. Soni, C. Bingman & J. Shavlik (2010). Guiding Belief Propagation using Domain Knowledge for Protein-Structure Determination. Proceedings of the ACM International Conference on Bioinformatics and Computational Biology 2010 (ACM-BCB 2010), Niagara Falls, New York. SIDECHAIN_PF is described in: F. DiMaio, D. Kondrashov, E. Bitto, C. Bingman, A. Soni, G. Phillips, and J. Shavlik (under review). Creating All-Atom Protein Models from Electron-Density Maps using Particle-Filtering Methods. 7 INPUTTING PARTIAL SOLUTION ---------------------------- The take home summary is: add -P flag to shfear, BP, and/or sidechain_pf to input a partial structure. Add -f to BP to treat this partial solution as known truth, no need to modify the partial solution. The pdb file needs to meet strict requirements in terms of how residues are numbered as well as format: *The file needs to be in PDB format. *The only atom that is read in is the CA of a residue. Each residue in the partial solution must have one and only one CA specified. *You do not need to remove other atoms - they will be ignored. *Residues are numbered based on their position in the sequence file, where the first position in the sequence is 1 (not 0). If the first residue defined in the PDB is sequence position 12, it must have a residue ID of 12 (NOT 1!). shfear will place a smoothed peak of probability around the position specified in the PDB as well as all symmetric copies, with the -w flag defining the width of the Gaussian. BP uses the sequence file for identification issues. If the --fixed flag is not used, BP treats all shfear files(i.e. from 5mer search and partial solution) the same, meaning the partial solution may be modified. If --fixed is used, those residues pre-defined in the partial solution are fixed in location, providing a speed up in performance since calculations are not done for these residue. Even if a partial solution is used for shfear and BP, sidechain_pf can optionally take in the structure. If no structure is given, it uses the normal sampling procedure using the information from BP (which processed the partial solution information). If the structure is given, sidechain_pf skips sampling of residues in the structure and uses the partial solution as a seed for the start of the sampling of the remaining residues. 8 GUIDED BELIEF PROPAGATION --------------------------- As detailed in our ACM-BCB '10 paper, BP obtains improved results when a priority function based on disorder prediction is provided as input. To do so, choose a disorder prediction software of your choice and follow the instructions below. I have chosen DisEMBL as an example so the details of conversion may be different. 1. Go to http://dis.embl.de/ 2. Copy and paste the *entire* amino acid sequence into the form 3. Click DisEMBL protein (change parameters if you see fit) 4. At the bottom, right click the scores link and save the file to your disk *We recommend the smoothed scores, but the numbers don't seem very different with raw 5. The next part is formatting and is not simple. Use the script in the scripts/ folder perl disEmbl.pl $INPUTFILE > $OUTPUTFILE Briefly, this converts the first column of scores (i.e., probability of coil) to its complement (1-coil) and than to an integer value from 0 to 9 and formats the file as a sequence file in ACMI (i.e., each aminoacid gets an integer). For example, if your sequence is: SAILM and the disorder prediction is, respectively: 0.1, 0.2, 0.5, 0.3, 0.8 then the priority function file should read: 98572 8 ENSEMBLES IN ACMI --------------------------- ACMI can now use multiple runs of BP to improve accuracy. To do so, run BP several times with variations in two parameters: -forward inference or backward inference. The default is forward, -B will run BP backwards -The start location of inference. The default is at the first amino acid in the sequence (0). Use -S where num is some integer between 0 and the length of the sequence. These two variations can be combined (e.g., use 3 different seeds in both forward and backward directions to yield 6 runs of BP) Let's say we want to consider a forward and backward run starting at amino acid 100: FORWARD: -S 100 .... BACKWARD: -S 100 -B .... Each run needs to be saved in a different working directory. Let's say you ran SHFEAR in a directory test. Then you need to create the new directories and copy the priors into them (or link): mkdir test/bp100forward mkdir test/bp100backward ln -s test/priors/ test/bp100forward/priors ln -s test/priors/ test/bp100backward/priors Then run BP: BP -S 100 -w test/bp100forward -c -s BP -S 100 -B -w test/bp100backward -c -s To run sidechain_pf on this ensemble, you need to create a text file "ensemble.txt" in the test directory. The first line should be an integer from 0 to 4 to indicate the combination method: 0: use maximum score across posteriors 1: use average (recommended)*** 2: sample from one posterior at random 3: use weighted average 4: sample from one posterior using weights After this line, list all the location of all posteriors directories. Example file: 1 test/bp100forward/posteriors test/bp100backward/posteriors Run sidechain_pf as normal (working directory is test) 9 CHANGES LOG --------------------------- From 1.5.0 to 1.4.1 -Added versioning output to know what version of ACMI you are running. Use the -V flag on any ACMI utility e.g. BP -V. -Added ensemble capabilities to ACMI. -Added instructions and scripts for creating NMERDB, including removing homologous structures (see READMN.nmerdb) -BP flag to toggle off checkpointing (-t) -BP flag for processing amino acids backwards (-B) -Added utility mem_usage to report estimate of memory requirements fo BP ***************** From 1.4.1 to 1.4.0a Major: -BP checkpoints manually (in posterior/checkpoint subdirectory) at the end of each iteration. A crashed run automatically starts from this checkpoint unless -f is specified -BP reads in CA guess-locations (e.g., using phenix's find strand and helices algorithm) in same file as seleniums -BP can start from results of a previous run of BP using -C flag -sidechain_pf can use a guided function to choose next residue to sample, similar usage as BP Minor: -Fixed bug with reading CCP4 maps -ACMI warns user of potentially corrupt map if more than 20% of values are 0 -Error checks shfear's parallel feature (skipByRes > 0) -Removed unneccesary verbose output in shfear -***************** From 1.4.0a to 1.3.1 Major: -Shfear can augment already calculated match files to expand search region (i.e. decrease the omit paramater). Thus, files calculated with a high omit value (e.g., 0.9) can be rerun with a lower omit-value (e.g. 0.8) without redoing the original work. -Added ability to guide BP message passing using user-provided priority function. -Added ability to start BP message passing from user-defined location in the sequence. -Added new feature generating functions in soft_wrapper.h, SOFT_Kazhdan_rep, SOFT_Frome_rep -Added sample condor scripts for running test set in DiMaio, et al. 2007, Bioinformatics. Minor: -Fixed bug in randomLocation function -Added mean, variance, and entropy functions to map_3d class -Added getRandomFrag function to nmer_lookup to grab random fragments from fragment library -Fixed bug in getMer function in Nmer class -Added ability to read orientation match files from disk (readMapsFromDisk in spharm.h) -Added cross product function for X,Y,Z vectors (cross in vec3.h), as well as unit function to return unit vector and set function to modify contents of vec3 object -BP caches adjacency functions ***************** From 1.3.1 to 1.3.0 Major: -ACMI can handle CCP4 density maps (in addition to fsfour maps). -BP can incorporate provided secondary structure (in development phase) -Ability to substitute seleniums into methionines for shfear Minor: -Seleniums only boost probabilities for methionines now -Fixed bug causing BP to crash when shfear files are not complete -Fixed bug to handle improper space group names. ***************** From 1.2.1 to 1.3.0 Major: -sidechain_pf is able to handle partial solutions. The given partial solution is used as the seed for each particle -shfear is able to define confidence in the partial solution via a Gaussian width parameter Minor: shfear: -place Gaussian convolution on given partial solution atom locations -added parameters to define kernel width for Gaussian -removed mixture parameter sidechain_pf: -output solution contains chainID and resID based on original sequence file -output raw file before moving molecule to one assymetric unit (some chains get broken up by shift) ***************** From 1.2 to 1.2.1: Major changes: src/shfear.cpp - added capability to handle partial solutions to the structure Minor fixes: BP - removed bounding box option, fixed paging bug, pre-compute forward messages ****************** From 1.1 to 1.2 src/BP.cpp, src/BP.h added ability to include partial structure (untested)