Tools¶
pyGeno provides a set of tools that can be used independently. Here you’ll find goodies for translation, indexing, and more.
Progress Bar¶
pyGeno’s awesome progress bar, with logging capabilities and remaining time estimation.
- class tools.ProgressBar.ProgressBar(nbEpochs=-1, width=25, label='progress', minRefeshTime=1)[source]¶
A very simple unthreaded progress bar. This progress bar also logs stats in .logs. Usage example:
    p = ProgressBar(nbEpochs = -1)
    for i in range(200000) :
        p.update(label = 'value of i %d' % i)
    p.close()
If you don’t know the maximum number of epochs, pass nbEpochs < 1.
Useful functions¶
This module is a collection of handy bioinformatics functions.
- tools.UsefulFunctions.complement(seq)[source]¶
Returns the complementary sequence without reversing it.
- tools.UsefulFunctions.complementTab(seq=[])[source]¶
Returns a list of complementary sequences, without reversing them.
- tools.UsefulFunctions.decodePolymorphicNucleotide(nuc)[source]¶
the opposite of encodePolymorphicNucleotide, from ‘R’ to [‘A’, ‘G’]
- tools.UsefulFunctions.decodePolymorphicNucleotide_str(nuc)[source]¶
Same as decodePolymorphicNucleotide but returns a string instead of a list, from ‘R’ to ‘A/G’
- tools.UsefulFunctions.encodePolymorphicNucleotide(polySeq)[source]¶
Returns a single character encoding all nucleotides of polySeq. polySeq must have one of the following forms: [‘A’, ‘T’, ‘G’], ‘ATG’, ‘A/T/G’
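The encode/decode pair above can be illustrated with a minimal sketch. It assumes the single-character encoding is the standard IUPAC ambiguity code, which is consistent with the ‘R’ to [‘A’, ‘G’] example; the function names and the lookup table are hypothetical and are not pyGeno's own code.

```python
# Standard IUPAC ambiguity codes, keyed by the sorted set of bases they stand for.
# Assumption: pyGeno's encoding follows this table.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
IUPAC_BASES = {code: bases for bases, code in IUPAC_CODES.items()}


def encode_polymorphic(poly_seq):
    """Accept ['A', 'T', 'G'], 'ATG' or 'A/T/G' and return one IUPAC letter."""
    if isinstance(poly_seq, str):
        poly_seq = poly_seq.replace("/", "")
    key = "".join(sorted({base.upper() for base in poly_seq}))
    return IUPAC_CODES[key]


def decode_polymorphic(code):
    """Return the list of bases an IUPAC letter stands for."""
    return list(IUPAC_BASES[code.upper()])


def decode_polymorphic_str(code):
    """Same as decode_polymorphic, but joined with '/'."""
    return "/".join(decode_polymorphic(code))


print(encode_polymorphic("A/T/G"))
print(decode_polymorphic("R"))
print(decode_polymorphic_str("R"))
```

All three accepted input forms collapse to the same sorted set of bases before the lookup, so they encode to the same letter.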
- tools.UsefulFunctions.findAll(haystack, needle)[source]¶
Returns a list of all occurrences of needle in haystack
- tools.UsefulFunctions.getNucleotideCodon(sequence, x1)[source]¶
Returns a tuple containing the entire codon that holds the nucleotide at position x1 in sequence, and the position of that nucleotide within the codon
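A minimal sketch of the codon lookup described above. It assumes that x1 is 0-based and that the reading frame starts at index 0 of the sequence; the function name is hypothetical and the real implementation may handle edge cases differently.

```python
def nucleotide_codon(sequence, x1):
    """Return (codon, position of the nucleotide within the codon).

    Assumptions: x1 is 0-based and the reading frame starts at index 0.
    """
    if not 0 <= x1 < len(sequence):
        raise IndexError("x1 is outside the sequence")
    offset = x1 % 3       # 0, 1 or 2: position inside the codon
    start = x1 - offset   # index of the first nucleotide of the codon
    return sequence[start:start + 3], offset


print(nucleotide_codon("ATGCCGTAA", 4))
```

If the sequence length is not a multiple of 3, the last codon returned by this sketch is simply shorter than three nucleotides.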
- tools.UsefulFunctions.getSequenceCombinaisons(polymorphipolymorphicDnaSeqSeq, pos=0)[source]¶
Takes a DNA sequence with polymorphisms and returns all the possible sequences that it can yield
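The expansion described above amounts to a Cartesian product over the possible bases at each position. The sketch below assumes polymorphic positions are written as IUPAC ambiguity letters; the input format pyGeno actually expects may differ, and the function name is hypothetical.

```python
from itertools import product

# IUPAC ambiguity letters and the bases they stand for (assumed input alphabet).
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def sequence_combinations(poly_seq):
    """Return every concrete sequence a polymorphic sequence can yield."""
    choices = [IUPAC_BASES[nuc] for nuc in poly_seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(sequence_combinations("ARY"))
```

The number of results is the product of the number of alternatives at each position, so it grows quickly with the number of polymorphic sites.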
- tools.UsefulFunctions.highlightSubsequence(sequence, x1, x2, start=' [', stop='] ')[source]¶
Returns a sequence where the subsequence in [x1, x2[ is placed between ‘start’ and ‘stop’
- tools.UsefulFunctions.polymorphicCodonCombinaisons(codon)[source]¶
Returns all the possible amino acids encoded by codon
- tools.UsefulFunctions.reverseComplement(seq)[source]¶
Complements a DNA sequence, returning the reverse complement.
- tools.UsefulFunctions.reverseComplementTab(seq)[source]¶
Complements a DNA sequence, returning the reverse complement in a list to manage INDEL.
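The difference between complementing and reverse-complementing can be shown with a short sketch. It only handles the unambiguous bases A, C, G and T, which is an assumption; the function names are hypothetical and this is not pyGeno's implementation.

```python
# Translation table for unambiguous DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Complement each base, keeping the original order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return complement(seq)[::-1]


print(complement("ATGC"))
print(reverse_complement("ATGC"))
```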
- tools.UsefulFunctions.showDifferences(seq1, seq2)[source]¶
Returns a string highlighting differences between seq1 and seq2:
Matches : ‘-’
Differences : ‘A|T’
Exceeded length : ‘#’
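A sketch of how such a difference string could be built, following the three markers listed above. The exact layout produced by pyGeno (for instance any separators between positions) is not stated here, so the concatenation used below is an assumption and the function name is hypothetical.

```python
from itertools import zip_longest


def show_differences(seq1, seq2):
    """Build a difference string: '-' for a match, 'A|T' for a mismatch,
    '#' for every position where one sequence has already ended."""
    parts = []
    for a, b in zip_longest(seq1, seq2):
        if a is None or b is None:
            parts.append("#")
        elif a == b:
            parts.append("-")
        else:
            parts.append(f"{a}|{b}")
    return "".join(parts)


print(show_differences("ATGC", "ATCCA"))
```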
Binary sequences¶
Tools for encoding sequences in binary formats.
- class tools.BinarySequence.BinarySequence(sequence, arrayForma, charToBinDict)[source]¶
A class for representing sequences in a binary format
- encode(sequence)[source]¶
Returns a tuple (binary representation, default sequence, polymorphisms list)
- find(strSeq)[source]¶
Returns the first occurrence of strSeq in self. Takes polymorphisms into account
- findByBiSearch(strSeq)[source]¶
Returns the first occurrence of strSeq in self. Takes polymorphisms into account
- findPolymorphisms(strSeq, strict=False)[source]¶
Compares strSeq with self.sequence. If not ‘strict’, this function ignores the cases of matching heterozygosity (ex: for a given position i, strSeq[i] = ‘A’ and self.sequence[i] = ‘A/G’). If ‘strict’, it returns all positions where strSeq differs from self.sequence
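To illustrate why a binary encoding makes polymorphism-aware comparison cheap, here is a minimal sketch that gives each base one bit and represents a polymorphic position as the OR of its bases. The bit assignment, the function names and the -1 return value for "not found" are all assumptions made for the example; they are not taken from BinarySequence, whose actual encoding is set by arrayForma and charToBinDict.

```python
# One bit per base (assumed layout); 'A/G' becomes 1 | 4 = 5.
BASE_BITS = {"A": 1, "C": 2, "G": 4, "T": 8}


def encode(positions):
    """positions: list of strings such as 'A' or 'A/G'. Returns one bitmask per position."""
    masks = []
    for alleles in positions:
        mask = 0
        for base in alleles.split("/"):
            mask |= BASE_BITS[base]
        masks.append(mask)
    return masks


def find(masks, query):
    """Index of the first window compatible with query, or -1 (assumed convention)."""
    bits = [BASE_BITS[base] for base in query]
    for start in range(len(masks) - len(bits) + 1):
        if all(masks[start + i] & bit for i, bit in enumerate(bits)):
            return start
    return -1


def find_polymorphisms(masks, query, strict=False):
    """Positions where query differs from the encoded sequence."""
    positions = []
    for i, (mask, base) in enumerate(zip(masks, query)):
        bit = BASE_BITS[base]
        # strict: any difference counts; otherwise a shared allele is a match.
        differs = (mask != bit) if strict else not (mask & bit)
        if differs:
            positions.append(i)
    return positions


masks = encode(["A", "A/G", "C", "T"])
print(find(masks, "GC"))
print(find_polymorphisms(masks, "AATT"))
print(find_polymorphisms(masks, "AATT", strict=True))
```

With this layout, a non-strict match at a position is a single bitwise AND, while a strict match is an equality test.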
Segment tree¶
Segment trees are an optimised way to index a genome.
- class tools.SegmentTree.SegmentTree(x1=None, x2=None, name='', referedObject=[], father=None, level=0)[source]¶
Optimised genome annotations. A segment tree is a tree (arborescence) of segments. The first position is inclusive and the second exclusive; they are referred to as x1 and x2 respectively. A segment tree has the following properties:
The root has no x1 or x2 (both set to None).
Segments are arranged in ascending order.
For two segments S1 and S2: [S2.x1, S2.x2[ ⊂ [S1.x1, S1.x2[ <=> S2 is a child of S1
Here’s an example of a tree:
Root : 0-15
---->Segment : 0-12
------->Segment : 1-6
---------->Segment : 2-3
---------->Segment : 4-5
------->Segment : 7-8
------->Segment : 9-10
---->Segment : 11-14
------->Segment : 12-14
---->Segment : 13-15
Each segment can have a ‘name’ and a ‘referedObject’. Refered objects are stored within the tree for future use. They are always stored in lists; if referedObject is already a list, it is stored as is.
- flatten()[source]¶
Flattens the tree. The tree becomes a tree of depth 1 where overlapping regions have been merged together
- insert(x1, x2, name='', referedObject=[])[source]¶
Inserts the segment in its right place and returns it. If there is already a segment S such that S.x1 == x1 and S.x2 == x2, S.name is changed to ‘S.name U name’ and the referedObject is appended to the already existing list
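The insertion rules above (containment decides parenthood, siblings stay in ascending order, identical bounds are merged) can be sketched with a small stand-alone class. This is a simplified illustration under those stated rules, not the SegmentTree implementation; the class name Segment, the refered_objects attribute and the outline helper are hypothetical.

```python
class Segment:
    """Simplified segment node; the root is created with x1 = x2 = None."""

    def __init__(self, x1=None, x2=None, name="", refered_objects=None):
        self.x1, self.x2, self.name = x1, x2, name
        self.refered_objects = list(refered_objects or [])
        self.children = []

    def contains(self, x1, x2):
        # The root has no bounds and therefore contains everything.
        return self.x1 is None or (self.x1 <= x1 and x2 <= self.x2)

    def insert(self, x1, x2, name="", refered_objects=None):
        for child in self.children:
            if child.x1 == x1 and child.x2 == x2:
                # Same bounds: merge names and refered objects.
                child.name = f"{child.name} U {name}"
                child.refered_objects.extend(refered_objects or [])
                return child
            if child.contains(x1, x2):
                return child.insert(x1, x2, name, refered_objects)
        new = Segment(x1, x2, name, refered_objects)
        kept = []
        for child in self.children:
            # Existing children that fit inside the new segment move under it.
            (new.children if new.contains(child.x1, child.x2) else kept).append(child)
        kept.append(new)
        kept.sort(key=lambda seg: (seg.x1, seg.x2))  # keep siblings in ascending order
        self.children = kept
        return new


def outline(segment, depth=0):
    """Return the tree as indented 'x1-x2' lines."""
    lines = []
    for child in segment.children:
        lines.append("  " * depth + f"{child.x1}-{child.x2}")
        lines.extend(outline(child, depth + 1))
    return lines


root = Segment()
for bounds in [(2, 3), (0, 12), (1, 6), (4, 5), (7, 8), (9, 10), (11, 14), (12, 14), (13, 15)]:
    root.insert(*bounds)
print("\n".join(outline(root)))
```

Inserting the segments in a scrambled order still yields the nesting shown in the example tree, because a new segment adopts any existing siblings it contains.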