Objects¶
With pyGeno you can manipulate familiar objects in an intuitive way. All of the following classes except SNP inherit from pyGenoRabaObjectWrapper and therefore have access to functions such as get(), count(), ensureIndex()…
Base classes¶
Base classes are abstract and are not meant to be instantiated. They nonetheless implement most of the functions that classes such as Genome possess.
- class pyGeno.pyGenoObjectBases.RLWrapper(rabaObj, listObjectType, rl)[source]¶
A wrapper for rabalists that replaces raba objects with pyGeno objects.
- class pyGeno.pyGenoObjectBases.pyGenoRabaObject(*args, **fieldsDct)[source]¶
pyGeno uses rabaDB to persistently store data. Most persistent objects have classes that inherit from this one (Genome_Raba, Chromosome_Raba, Gene_Raba, Protein_Raba, Exon_Raba). These classes are not meant to be accessed directly. Users manipulate wrappers such as Genome, Chromosome, etc. pyGenoRabaObject extends the Raba class by adding a function _curate that is called just before saving. This class is to be considered abstract, and is not meant to be instantiated.
- class pyGeno.pyGenoObjectBases.pyGenoRabaObjectWrapper(wrapped_object_and_bag=(), *args, **kwargs)[source]¶
All the wrapper classes such as Genome and Chromosome inherit from this class. It implements most of the functions that make pyGeno useful, such as get(), count(), ensureIndex(). This class is to be considered abstract, and is not meant to be instantiated.
- classmethod ensureGlobalIndex(fields)[source]¶
Adds a GLOBAL index to the database to speed up lookups. fields can be a list of field names for a multi-column index, or simply the name of a single field. A global index is an index on the entire type: a global index on ‘Transcript’ on field ‘name’ will index the names of all the transcripts in the database.
- classmethod flushIndexes()[source]¶
Drops all the indexes attached to the object’s class. Example: Transcript.flushIndexes()
- get(objectType, *args, gen=False, **coolArgs)[source]¶
Raba Magic inside. This is the function that you use for querying pyGeno’s DB.
Usage examples:
myGenome.get("Gene", name = 'TPST2')
myGene.get(Protein, id = 'ENSID…')
myGenome.get(Transcript, {'start >' : x, 'end <' : y})
- classmethod getIndexes()[source]¶
Returns a list of the indexes attached to the object’s class. Example: Transcript.getIndexes()
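The filter dictionary in the last get() usage example pairs a field name with a comparison operator in each key. The following is a minimal pure-Python sketch of how such keys can be read. It is not pyGeno's or rabaDB's implementation: the helper `matches`, the operator table and the sample records are all hypothetical, and a key without an operator is assumed to mean equality.

```python
import operator

# Hypothetical operator table; the set of operators accepted by pyGeno
# itself is not listed in this section.
_OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
}


def matches(record, filters):
    """Return True if record satisfies every 'field operator' filter."""
    for key, expected in filters.items():
        # 'start >' splits into field 'start' and operator '>';
        # a bare key such as 'name' is treated as an equality test.
        field, _, op = key.partition(" ")
        compare = _OPS[op.strip() or "="]
        if not compare(record[field], expected):
            return False
    return True


# Illustrative records standing in for transcripts.
transcripts = [
    {"name": "T1", "start": 100, "end": 500},
    {"name": "T2", "start": 50, "end": 900},
    {"name": "T3", "start": 300, "end": 400},
]

selected = [t["name"] for t in transcripts
            if matches(t, {"start >": 90, "end <": 600})]
print(selected)  # ['T1', 'T3']
```

All conditions in the dictionary must hold at once, so the example selects the records that lie strictly inside the interval.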
Genome¶
- class pyGeno.Genome.Genome(SNPs=None, SNPFilter=None, *args, **kwargs)[source]¶
This is the entry point to pyGeno:
myGeno = Genome(name = 'GRCh37.75', SNPs = ['RNA_S1', 'DNA_S1'], SNPFilter = MyFilter)
for prot in myGeno.get(Protein) :
    print(prot.sequence)
Chromosome¶
- class pyGeno.Chromosome.Chromosome(*args, **kwargs)[source]¶
The wrapper for playing with Chromosomes
Gene¶
Transcript¶
- class pyGeno.Transcript.Transcript(*args, **kwargs)[source]¶
The wrapper for playing with Transcripts
- findAllInUTR3(sequence)[source]¶
Returns a list of all positions where sequence was found in the 3’UTR
- findAllInUTR5(sequence)[source]¶
Returns a list of all positions where sequence was found in the 5’UTR
- findAllIncDNA(sequence)[source]¶
Returns a list of all positions where sequence was found in the cDNA
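As an illustration of what "a list of all positions" can look like, here is a small stand-alone sketch that scans a plain string for every occurrence of a subsequence. The function `find_all` is hypothetical and does not call pyGeno. It assumes 0-based positions and reports overlapping matches; whether the findAllIn… methods treat overlaps the same way is not stated in this section.

```python
def find_all(haystack, needle):
    """Return the 0-based start positions of every occurrence of needle."""
    if not needle:
        raise ValueError("needle must not be empty")
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are kept.
        start = haystack.find(needle, start + 1)
    return positions


print(find_all("ATGATGATG", "ATGATG"))  # [0, 3]
print(find_all("ATGCCC", "TTT"))        # []
```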
Exon¶
- class pyGeno.Exon.Exon(*args, **kwargs)[source]¶
The wrapper for playing with Exons
Protein¶
- class pyGeno.Protein.Protein(*args, **kwargs)[source]¶
The wrapper for playing with Proteins
- find(sequence)[source]¶
Returns the position of the first occurrence of sequence, taking polymorphisms into account
- findAll(sequence)[source]¶
Returns the positions of all occurrences of sequence, taking polymorphisms into account
- findString(sequence)[source]¶
Returns the first occurrence of sequence using a simple string search that ignores polymorphisms
- findStringAll(sequence)[source]¶
Returns all occurrences of sequence using a simple string search that ignores polymorphisms
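The difference between a polymorphism-aware search and a simple string search can be sketched without pyGeno. The example below assumes, purely for illustration, that alternative residues at one position are written separated by '/' (so 'A/T' means "A or T"). Both helper functions are hypothetical and are not the library's implementation.

```python
def parse_polymorphic(sequence):
    """Turn 'MA/TK' into [{'M'}, {'A', 'T'}, {'K'}] (one set per position)."""
    positions = []
    i = 0
    while i < len(sequence):
        choices = {sequence[i]}
        i += 1
        # Collect every '/X' alternative attached to this position.
        while i + 1 < len(sequence) and sequence[i] == "/":
            choices.add(sequence[i + 1])
            i += 2
        positions.append(choices)
    return positions


def find_all_polymorphic(sequence, query):
    """Return 0-based positions where query matches any allowed variant."""
    positions = parse_polymorphic(sequence)
    hits = []
    for start in range(len(positions) - len(query) + 1):
        if all(query[j] in positions[start + j] for j in range(len(query))):
            hits.append(start)
    return hits


protein = "MA/TKMTK"  # position 1 is either A or T
print(find_all_polymorphic(protein, "MTK"))  # [0, 3]
print(find_all_polymorphic(protein, "MAK"))  # [0]
print(protein.find("MAK"))                   # -1: plain search misses it
```

The plain string search fails on "MAK" because it compares against the literal characters, including the separator, rather than against the set of residues allowed at each position.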
SNP¶
- class pyGeno.SNP.AgnosticSNP(*args, **fieldsDct)[source]¶
This is a generic SNPs/Indels format that you can easily make from the result of any SNP caller. AgnosticSNP files are tab delimited files such as:
chromosomeNumber  uniqueId  start    end      ref  alleles  quality  caller
Y                 1         2655643  2655644  T    AG       30       TopHat
Y                 2         2655645  2655647  -    AG       28       TopHat
Y                 3         2655648  2655650  TT   -        10       TopHat
All positions must be 0-based. The ‘-’ indicates a deletion or an insertion. Column order does not matter.
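Because the columns are identified by their header names, a file in this format can be read by name rather than by position. The sketch below uses only the Python standard library to read the sample rows shown above. It is not pyGeno's importer: the function `read_agnostic_snps` is hypothetical, and converting `start`, `end` and `quality` to numbers is an assumption made for the example.

```python
import csv
import io

# The sample rows from the description above, tab delimited.
SAMPLE = (
    "chromosomeNumber\tuniqueId\tstart\tend\tref\talleles\tquality\tcaller\n"
    "Y\t1\t2655643\t2655644\tT\tAG\t30\tTopHat\n"
    "Y\t2\t2655645\t2655647\t-\tAG\t28\tTopHat\n"
    "Y\t3\t2655648\t2655650\tTT\t-\t10\tTopHat\n"
)


def read_agnostic_snps(handle):
    """Read tab-delimited rows into dictionaries keyed by column name."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["start"] = int(row["start"])  # 0-based positions
        row["end"] = int(row["end"])
        row["quality"] = float(row["quality"])
        rows.append(row)
    return rows


snps = read_agnostic_snps(io.StringIO(SAMPLE))
for snp in snps:
    print(snp["uniqueId"], snp["start"], snp["end"], snp["ref"], snp["alleles"])
```

Reading by header name is what makes the column order irrelevant: reordering the columns in the file yields the same dictionaries.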
- class pyGeno.SNP.SNPMaster(*args, **fieldsDct)[source]¶
This object keeps track of SNP sets and their types
- class pyGeno.SNP.SNP_INDEL(*args, **fieldsDct)[source]¶
All SNPs should inherit from me. The name of the class must end with SNP