clustertools

Tools for manipulating sequence clusters

http://malde.org/~ketil/

Latest on Hackage: 0.1.5


LicenseRef-GPL licensed by Ketil Malde
Maintained by Ketil Malde

This is a bunch of stuff I needed at some point for manipulating sequence clusters. See the README for details. The tools included are:

  • filter - remove unwanted sequences from a clustering.

  • hist - produce a histogram of cluster sizes from a "label"-formatted clustering.

  • clusc - compare clusterings, calculating numerous pair-based and entropy-based indices.

  • add_single - add singletons to a clustering.

  • ace2contigs - parse an ACE assembly file, and output the contigs in a FASTA file.

  • ace2fasta - parse an ACE assembly, and output each assembly in a separate FASTA file.

  • ace2clusters - parse an ACE assembly, and output clusters in TGICL format.

  • clusterlibs - given a table of regular expressions and library names, along with a clustering (TGICL-format), output a table of cluster sizes per library.

  • xcerpt - extract the sequences named in a list of sequence labels.
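
To make a few of the operations listed above concrete, here is a minimal Python sketch of the ideas behind hist, add_single and clusc. It is not the package's code (the tools themselves are Haskell programs) and it does not read the "label" or TGICL file formats. It assumes a clustering is simply a list of sets of sequence labels held in memory, and all function names are hypothetical. The pair-counting Jaccard index stands in as one example of a pair-based index; whether clusc reports exactly this one is not stated here.

```python
from collections import Counter
from itertools import combinations


def size_histogram(clusters):
    """Map each cluster size to the number of clusters of that size."""
    return Counter(len(c) for c in clusters)


def add_singletons(clusters, all_labels):
    """Append a one-member cluster for every label not yet in any cluster."""
    seen = set().union(*clusters)
    return list(clusters) + [{x} for x in sorted(all_labels) if x not in seen]


def co_clustered_pairs(clusters):
    """All unordered pairs of labels that share a cluster."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}


def jaccard_index(a, b):
    """Pairs co-clustered in both clusterings / pairs co-clustered in either."""
    pa, pb = co_clustered_pairs(a), co_clustered_pairs(b)
    either = pa | pb
    return len(pa & pb) / len(either) if either else 1.0


# Illustrative data only: two clusterings of the same five labels.
a = [{"s1", "s2", "s3"}, {"s4", "s5"}]
b = [{"s1", "s2"}, {"s3", "s4", "s5"}]

print(size_histogram(a))
print(add_singletons(a, {"s1", "s2", "s3", "s4", "s5", "s6"}))
print(jaccard_index(a, b))
```

Representing each co-clustered pair as a frozenset keeps the comparison independent of label order, at the cost of enumerating all pairs, which is quadratic in cluster size.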

The Darcs repository is at: http://malde.org/~ketil/biohaskell/cluster_tools.