Bio::Tools::Alignment::Trim - A kludge to do specialized trimming of sequence based on quality.


use Bio::Tools::Alignment::Trim;
$o_trim = Bio::Tools::Alignment::Trim->new();


This is a specialized module designed by Chad for Chad to trim sequences based on a highly specialized list of requirements. In other words, the goal was to write something that will trim sequences 'just like the people in the lab would do manually'.

I settled on a sliding-window-average style of search which is ugly and slow but does _exactly_ what I want it to do.

Mental note: rewrite this.

It is very important to keep in mind the context in which this module was written: strictly to support the projects for which it was designed.


AUTHOR - Chad Matsalla



The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).


 Title   : new()
 Usage   : $o_trim = Bio::Tools::Alignment::Trim->new();
 Function: Construct the Bio::Tools::Alignment::Trim object. No parameters
	   are required to create this object. It is strictly a bundle of
	   functions, as far as I am concerned.
 Returns : A reference to a Bio::Tools::Alignment::Trim object.
 Args    : (optional)
           -windowsize (default 10)
           -phreds (default 20)


 Title   : set_designators(<forward>,<reverse>)
 Usage   : $o_trim->set_designators("F","R")
 Function: Set the string by which the system determines whether a given
	sequence represents a forward or a reverse read.
 Returns : Nothing.
 Args    : two scalars: one representing the forward designator and one
	representing the reverse designator


 Title   : set_forward_designator($designator)
 Usage   : $o_trim->set_forward_designator("F")
 Function: Set the string by which the system determines if a given
	sequence is a forward read.
 Returns : Nothing.
 Args    : A string representing the forward designator of this project.


 Title   : set_reverse_designator($reverse_designator)
 Usage   : $o_trim->set_reverse_designator("R")
 Function: Set the string by which the system determines if a given
	sequence is a reverse read.
 Returns : Nothing.
 Args    : A string representing the reverse designator of this project.


 Title   : get_designators()
 Usage   : $o_trim->get_designators()
 Returns : A string describing the current designators.
 Args    : None
 Notes   : Really for informational purposes only. Duh.


 Title   : trim_leading_polys()
 Usage   : $o_trim->trim_leading_polys()
 Function: Not implemented. Does nothing.
 Returns : Nothing.
 Args    : None.
 Notes   : This function is not implemented. Part of something I wanted to
	do but never got around to doing.


 Title   : dump_hash()
 Usage   : $o_trim->dump_hash()
 Function: Unimplemented.
 Returns : Nothing.
 Args    : None.
 Notes   : Does nothing.


 Title   : trim_singlet($sequence,$quality,$name,$class)
 Usage   : ($r_trim_points,$trimmed_sequence) =
	   @{$o_trim->trim_singlet($sequence,$quality,$name,$class)};
 Function: Trim a singlet based on its quality.
 Returns : a reference to an array containing the forward and reverse
	trim points and the trimmed sequence.
 Args    : $sequence : A sequence (SCALAR, please)
	   $quality : A _scalar_ of space-delimited quality values.
	   $name : the name of the sequence
	   $class : The class of the sequence. One of qw(singlet
		singleton doublet pair multiplet)
 Notes   : At the time this was written the bioperl objects SeqWithQuality
	and PrimaryQual did not exist, which explains the clumsy
	passing of references and so on. I will rewrite this next time I
	have to work with it. I also wasn't sure whether this function
	should return just the trim points or the points and the sequence.
	I decided that I always wanted both, so that's how I implemented it.
	Note that the size of the sliding window is set during construction
	of the Bio::Tools::Alignment::Trim object.
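
The $quality argument is a single scalar rather than an array, so a caller may want to check it before handing it over. The following is a minimal sketch, not part of this module: parse_quality_scalar is a hypothetical helper, the data are made up, and the one-quality-value-per-base check is an assumption about how the inputs relate.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a space-delimited quality scalar into an array
# reference. split ' ' skips leading whitespace and collapses runs of spaces.
sub parse_quality_scalar {
    my ($quality) = @_;
    return [ split ' ', $quality ];
}

my $sequence = 'ACGTAC';                 # made-up example data
my $quality  = '10 12 30 35 28 9';
my $r_quals  = parse_quality_scalar($quality);

# Assumption: one quality value per base.
die "sequence and quality lengths differ\n"
    unless @$r_quals == length $sequence;
print scalar(@$r_quals), " quality values\n";    # prints "6 quality values"
```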


 Title   : trim_doublet($sequence,$quality,$name,$class)
 Usage   : ($r_trim_points,$trimmed_sequence) =
	   @{$o_trim->trim_doublet($sequence,$quality,$name,$class)};
 Function: Trim a doublet based on its quality.
 Returns : a reference to an array containing the forward and reverse
	trim points and the trimmed sequence.
 Args    : $sequence : A sequence
	   $quality : A _scalar_ of space-delimited quality values.
	   $name : the name of the sequence
	   $class : The class of the sequence. One of qw(singlet
		singleton doublet pair multiplet)
 Notes   : At the time this was written the bioperl objects SeqWithQuality
	and PrimaryQual did not exist, which explains the clumsy
	passing of references and so on. I will rewrite this next time I
	have to work with it. I also wasn't sure whether this function
	should return just the trim points or the points and the sequence.
	I decided that I always wanted both, so that's how I implemented it.


 Title   : chop_sequence($name,$class,$sequence,@points)
 Usage   : ($start_point,$end_point,$chopped_sequence) =
	   $o_trim->chop_sequence($name,$class,$sequence,@points);
 Function: Chop a sequence based on its name, class, and sequence.
 Returns : an array containing three scalars:
	1- the start trim point
	2- the end trim point
	3- the chopped sequence
 Args    :
	   $name : the name of the sequence
	   $class : The class of the sequence. One of qw(singlet
		singleton doublet pair multiplet)
	   $sequence : A sequence
	   @points : An array containing two elements: the first contains
		the start trim point and the second contains the end trim
		point.

 Title   : _get_start($r_quals,$windowsize,$phreds,$offset)
 Usage   : $start_base = $self->_get_start($r_windows,5,20);
 Function: Provide the start trim point for this sequence.
 Returns : a scalar representing the start of the sequence
 Args    : 
	$r_quals : A reference to an array containing quality values. In
		context, this array of values has been smoothed by the
		sliding-window look-ahead algorithm.
	$windowsize : The size of the window used when the sliding window
		look-ahead average was calculated.
	$phreds : <fill in what this does here>
	$offset : <fill in what this does here>


 Title   : _get_end($r_qual,$windowsize,$phreds,$count)
 Usage   : my $end_base = &_get_end($r_windows,20,20,$start_base);
 Function: Get the end trim point for this sequence.
 Returns : A scalar representing the end trim point for this sequence.
 Args    : 
	$r_qual : A reference to an array containing quality values. In
		context, this array of values has been smoothed by the
		sliding-window look-ahead algorithm.
	$windowsize : The size of the window used when the sliding window
		look-ahead average was calculated.
	$phreds : <fill in what this does here>
	$count : Start looking for the end of the sequence here.
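
To give a feel for how start and end trim points can be read off a smoothed array, here is a minimal sketch. It does not reproduce _get_start or _get_end: find_trim_window and $min_quality are hypothetical names, and since $phreds and $offset are not documented above, the sketch makes no claim about what they do. It assumes the start is the first window at or above a cutoff and the end is the first later window below it.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's _get_start/_get_end.
# Returns (start, end): start is the first window at or above $min_quality,
# end is the first later window that falls below it (or the window count if
# quality never drops). Returns an empty list when no window qualifies.
sub find_trim_window {
    my ($r_windows, $min_quality) = @_;
    my $start;
    for my $i (0 .. $#$r_windows) {
        if ($r_windows->[$i] >= $min_quality) { $start = $i; last; }
    }
    return unless defined $start;

    my $end = scalar @$r_windows;
    for my $i ($start .. $#$r_windows) {
        if ($r_windows->[$i] < $min_quality) { $end = $i; last; }
    }
    return ($start, $end);
}

my @windows = (6, 9, 17, 24, 30, 22, 12, 5);        # made-up smoothed values
my ($start, $end) = find_trim_window(\@windows, 15);
print "good region: windows $start to ", $end - 1, "\n";
```

With these values the sketch reports windows 2 to 5 as the region to keep.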


 Title   : count_doublet_trailing_zeros($r_qual)
 Usage   : my $start_of_trailing_zeros = &count_doublet_trailing_zeros(\@qual);
 Function: Find out when the trailing zero qualities start.
 Returns : A scalar representing where the zeros start.
 Args    : A reference to an array of quality values.
 Notes   : Again, this should be rewritten to use PrimaryQual objects.
	A more detailed explanation of why phrap puts these zeros here should
	be written and placed here. Please email and hassle the author.
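
Locating a run of trailing zeros can be sketched in a few lines. This is an illustration only, not the module's code: trailing_zero_start is a hypothetical helper, and returning a 0-based index (or the array length when there are no trailing zeros) is an assumption about the convention.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first quality value in the run of zeros
# at the end of the array; equals the array length if there is no such run.
sub trailing_zero_start {
    my ($r_qual) = @_;
    my $i = @$r_qual;
    $i-- while $i > 0 && $r_qual->[$i - 1] == 0;
    return $i;
}

my @qual = (0, 12, 30, 28, 0, 0, 0);                # made-up example data
print trailing_zero_start(\@qual), "\n";            # prints 4
```

Scanning from the end means a zero at the front of the array, as in the example, is left alone.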


 Title   : _sliding_window($r_quals,$windowsize)
 Usage   : my $r_windows = &_sliding_window(\@qual,$windowsize);
 Function: Compute a sliding-window, look-forward average over an array
	of quality values. Used to smooth out differences in qualities.
 Returns : A reference to an array containing the smoothed values.
 Args    : $r_quals: A reference to an array containing quality values.
	   $windowsize : The size of the sliding window.
 Notes   : This was written before PrimaryQual objects existed. It
	   should use that object but I haven't rewritten this yet.
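
For readers unfamiliar with the technique, here is a minimal sketch of a look-forward sliding-window average. It is not the module's _sliding_window: sliding_window_sketch is a hypothetical helper, and stopping once a full window no longer fits is an assumption about how the end of the array is handled.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own _sliding_window: each output
# element is the mean of the quality at that position and the values that
# follow it, $windowsize values in total.
sub sliding_window_sketch {
    my ($r_quals, $windowsize) = @_;
    return [] if $windowsize < 1;
    my @smoothed;
    # Assumption: stop once a full window no longer fits.
    for my $start (0 .. @$r_quals - $windowsize) {
        my $sum = 0;
        $sum += $r_quals->[$_] for $start .. $start + $windowsize - 1;
        push @smoothed, $sum / $windowsize;
    }
    return \@smoothed;
}

my @qual      = (3, 6, 9, 12, 30, 30);              # made-up example data
my $r_windows = sliding_window_sketch(\@qual, 3);
print join(' ', @$r_windows), "\n";                 # prints 6 9 17 24
```

The smoothed array is shorter than the input by $windowsize - 1 elements, so trim points found on it refer to window start positions.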


 Title   : _print_formatted_qualities(\@quals)
 Usage   : &_print_formatted_qualities(\@quals);
 Returns : Nothing. Prints.
 Args    : A reference to an array containing quality values.
 Notes   : An internal procedure used in debugging. Prints out an array nicely.


 Title   : _get_end_old($r_qual,$windowsize,$phreds,$count)
 Usage   : Deprecated. Don't use this!
 Returns : Deprecated. Don't use this!
 Args    : Deprecated. Don't use this!