Tuesday, May 10, 2011

PICS-Ord Tricks

Anyone who has used the R statistical package knows that it can be a bit tricky, especially when first starting out.

The R-based PICS-Ord program (developed to recode ambiguously-aligned regions for phylogenetic analyses; see here and here and here for more information) was written to be as simple as possible while remaining flexible and adjustable. A fairly comprehensive set of instructions was produced to facilitate analyses and provide recommendations for basic use (see 'manual.pdf' available here), and the manual is the first place to look for help with PICS-Ord. However, those who are not experts in R and/or the command line may benefit from some additional guidance on running PICS-Ord without needing much background knowledge of R. The goal of this post is therefore to detail one relatively simple way of running the PICS-Ord program on a Windows PC.

The instructions below describe one way of running PICS-Ord on Microsoft Windows. They work only with the Windows version of R (try 2.12.0; the most recent version of R did not work at the time this post was last updated) [ http://cran.r-project.org/bin/windows/base/old/2.12.0/ ] and the Ngila Windows executable [ http://scit.us/projects/ngila/ ] (when installing Ngila, choose the option to put it in your PATH).
1) Place picsord.R (found in the picsord.zip archive) in the same folder as ngila.exe (probably C:\Program Files\Ngila\bin\).
[Note: If Ngila is not in your PATH, you can open the picsord.R file and change "ngila" (the one in quotes) to "ngila.exe" in the first line that is not preceded by a hash mark (#), or you can type out the full path to ngila.exe there instead (neither change is necessary if you are following the rest of this procedure). Some users have manually edited picsord.R by deleting the first line (the line specifying the location of Rscript); this seems unnecessary but may be worthwhile on certain machines.]
2) Save/copy the input fasta file to a directory (e.g., your home directory, Desktop, or C:\).
3) Open a Command Prompt window and use the 'cd' or 'chdir' command to navigate to the directory in which the input fasta file is stored (or simply put the input fasta file in the home directory so that navigation is not necessary). Then type the following, adjusting it for the version of R being used and for the locations of Rscript and picsord.R on the drive:
"C:\Program Files\R\R-2.12.0\bin\Rscript.exe" "C:\Program Files\Ngila\bin\picsord.R" input.fas > output.phy
After this, the output phylip file should appear in the working directory.
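
For those who would rather not type the step 3 line by hand, the same call can be sketched in Python. This is only an illustration and is not part of the PICS-Ord package: it assumes Python 3 is installed, that R and picsord.R sit in the locations used in the steps above, and the function names (build_command, preflight, run_picsord) are made up for this example. It checks that the files exist and that Ngila is in the PATH, then runs Rscript and saves its output the way '> output.phy' does.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed install locations, copied from the steps above; adjust to your machine.
RSCRIPT = Path(r"C:\Program Files\R\R-2.12.0\bin\Rscript.exe")
PICSORD = Path(r"C:\Program Files\Ngila\bin\picsord.R")


def build_command(fasta, rscript=RSCRIPT, picsord=PICSORD):
    """Return the step 3 command as an argument list (no manual quoting needed)."""
    return [str(rscript), str(picsord), str(fasta)]


def preflight(rscript=RSCRIPT, picsord=PICSORD):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if not Path(rscript).is_file():
        problems.append(f"Rscript not found at {rscript}")
    if not Path(picsord).is_file():
        problems.append(f"picsord.R not found at {picsord}")
    if shutil.which("ngila") is None:
        problems.append("ngila is not in the PATH")
    return problems


def run_picsord(fasta, output, rscript=RSCRIPT, picsord=PICSORD):
    """Run PICS-Ord and write its standard output to `output`, like '> output.phy'."""
    with open(output, "w") as handle:
        subprocess.run(build_command(fasta, rscript, picsord),
                       stdout=handle, check=True)


if __name__ == "__main__":
    issues = preflight()
    if issues:
        print("\n".join(issues))
    else:
        run_picsord("input.fas", "output.phy")
```

Run the script from the directory that holds input.fas. If anything is missing, it lists the problems instead of starting R.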

Multiple regions can be processed by running the command in step 3 separately for each region, or one can use the .bat file that comes as part of the PICS-Ord package (use of the .bat file is outlined in the manual).  After this, the phylip-formatted PICS-Ord alignment portions can be pasted at the end of the original nucleotide alignment alongside the unambiguously-aligned sites.  Please see the manual for further recommendations regarding implementation.
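
As a rough illustration of the one-command-per-region approach, the Python sketch below prints a step 3 style line for every region file in a folder. It is not the .bat file from the PICS-Ord package. It assumes each region is saved as its own fasta file ending in .fas, that R and picsord.R are in the locations used above, and the function names (plan_runs, command_line) are hypothetical.

```python
from pathlib import Path


def plan_runs(folder, pattern="*.fas"):
    """Pair each region file in `folder` with a .phy output name next to it."""
    inputs = sorted(Path(folder).glob(pattern))
    return [(fasta, fasta.with_suffix(".phy")) for fasta in inputs]


def command_line(fasta, output,
                 rscript=r"C:\Program Files\R\R-2.12.0\bin\Rscript.exe",
                 picsord=r"C:\Program Files\Ngila\bin\picsord.R"):
    """Format one Command Prompt line in the same shape as the step 3 command."""
    return f'"{rscript}" "{picsord}" "{fasta}" > "{output}"'


if __name__ == "__main__":
    # Print one line per region file found in the current directory.
    for fasta, output in plan_runs("."):
        print(command_line(fasta, output))
```

The printed lines can be pasted into the Command Prompt window one at a time. Each region then gets its own phylip file, named after its input file.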

-Brendan



References:

Lücking, R., B. P. Hodkinson, A. Stamatakis, and R. A. Cartwright. 2011. PICS-Ord: Unlimited Coding of Ambiguous Regions by Pairwise Identity and Cost Scores Ordination. BMC Bioinformatics 12: 10.
Download publication (PDF file)
Download R-based PICS-Ord program (zipped program package)
View program wiki (website)
