ClustalParser

Library for parsing Clustal tools output

LTS Haskell 23.1: 1.3.0
Stackage Nightly 2024-12-21: 1.3.0
Latest on Hackage: 1.3.0

GPL-3.0-only licensed by Florian Eggenhofer
Maintained by [email protected]
This version can be pinned in stack with: ClustalParser-1.3.0@sha256:b7ae3df1328650257d4e882fc03bae2d924dc94d2067be0561e5e8b823d390ce,2614

Currently contains parsers and datatypes for: clustalw2, clustalo, mlocarna, cmalign

Clustal tools are multiple sequence alignment tools for biological sequences such as DNA, RNA and protein. For more information on the Clustal tools, refer to http://www.clustal.org/.

Mlocarna is a multiple sequence alignment tool for RNA sequences with secondary structure output. For more information on mlocarna refer to http://www.bioinf.uni-freiburg.de/Software/LocARNA/.

cmalign is a multiple sequence alignment program based on RNA family models that produces, among other formats, clustal output. It is part of Infernal (http://infernal.janelia.org/).

Four types of output are parsed:

  • Alignment file (.aln):
      • Parsing with readClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseClustalAlignment from String (Bio.ClustalParser)
  • Alignment file with secondary structure (.aln):
      • Parsing with readStructuralClustalAlignment from filepath (Bio.ClustalParser)
      • Parsing with parseStructuralClustalAlignment from String (Bio.ClustalParser)
  • Summary (printed to STDOUT):
      • Parsing with readClustalSummary from filepath (Bio.ClustalParser)
      • Parsing with parseClustalSummary from String (Bio.ClustalParser)
  • Phylogenetic Tree (.dnd):
      • Parsing with readGraphNewick from filepath (Bio.Phylogeny)
      • Parsing with readGraphNewick from String (Bio.Phylogeny)
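
For orientation, the layout of a plain alignment file (.aln) can be illustrated with a small sketch. A Clustal alignment starts with a header line beginning with CLUSTAL, followed by blocks in which each line holds a sequence identifier and one segment of that sequence. Each block may end with a conservation line that starts with whitespace. The code below is not this library's parser and does not use its datatypes. It is a minimal sketch using only base, and the names parseAlnSketch and sample are hypothetical and exist only for this illustration.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | Hypothetical helper, not part of ClustalParser: collects the full
-- aligned sequence of every identifier from the text of a Clustal .aln
-- file. Entries are returned in order of first appearance.
parseAlnSketch :: String -> [(String, String)]
parseAlnSketch = foldl addSegment [] . concatMap entryLine . lines
  where
    -- Keep only "identifier segment" lines; skip the CLUSTAL header,
    -- blank lines, and conservation lines (which start with whitespace).
    entryLine l
      | "CLUSTAL" `isPrefixOf` l = []
      | otherwise = case l of
          (c : _) | not (isSpace c)
                  , (ident : segment : _) <- words l -> [(ident, segment)]
          _ -> []
    -- Append a segment to a known identifier, or start a new entry.
    addSegment acc (ident, segment) =
      case lookup ident acc of
        Nothing -> acc ++ [(ident, segment)]
        Just _  -> [ (i, if i == ident then s ++ segment else s)
                   | (i, s) <- acc ]

-- | Made-up two-block alignment used only to exercise the sketch.
sample :: String
sample = unlines
  [ "CLUSTAL W (1.83) multiple sequence alignment"
  , ""
  , "seq1    ACGU-ACG"
  , "seq2    ACGUUACG"
  , "        **** ***"
  , ""
  , "seq1    GGA"
  , "seq2    GCA"
  , "        * *"
  ]
```

Evaluating parseAlnSketch sample yields [("seq1","ACGU-ACGGGA"),("seq2","ACGUUACGGCA")]: the segments of each identifier are joined across blocks and the conservation lines are dropped. The sketch ignores secondary structure annotation and does not validate sequence characters.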

Changes

1.3.0 Florian Eggenhofer 14. November 2019

  • Fixed requested tick number for compilation with GHC 8.6.*
  • Changed to Biobase style

1.2.3 Florian Eggenhofer 12. March 2018

  • Fixed parsing of additional newline in Biopython's AlignIO output without conservation track

1.2.2 Florian Eggenhofer 07. March 2018

  • Clustal parser can now parse alignments with missing consensus annotation

1.2.1 Florian Eggenhofer 06. February 2017

  • Structural alignment parser now works with multiline consensus structures

1.2.0 Florian Eggenhofer 07. January 2017

  • Changed data structures for sequence identifiers and sequences to Data.Text

1.1.4 Florian Eggenhofer 30. May 2016

  • Fixed a bug in output of clustal alignments with sequence length of 60

1.1.3 Florian Eggenhofer 4. July 2015

  • Nucleotide sequences are now parsed by a unified function in line with IUPAC nucleotide code

1.1.2 Florian Eggenhofer 3. July 2015

  • Included parsing of optional field in mlocarna clustal output

1.1.1 Florian Eggenhofer 2. July 2015

  • Added support for cmalign clustal output

1.1.0 Florian Eggenhofer 1. July 2015

  • Added Hspec test-suite for parsing functions
  • Added Show instances for ClustalAlignment and StructuralClustalAlignment

1.0.3 Florian Eggenhofer 19. April 2015

  • Added Y (pyrimidine) and R (purine) to sequence characters

1.0.2 Florian Eggenhofer 19. March 2015

  • Linebreaks are now filtered from structural alignment sequence identifiers

1.0.1 Florian Eggenhofer 27. October 2014

  • Fixed compiler warnings and updated documentation to mention structural clustal format
  • Added -Wall and -O2 compiler options
  • Added support for clustal alignments with secondary structure annotation