The EncodedSeq object is similar to the BioPython Seq object in that it primarily contains the sequence letters and an associated alphabet. The main difference is that the alphabet is a GCGC alphabet rather than a BioPython one.
Creating an EncodedSeq
To create an EncodedSeq, pass a sequence and an alphabet.
from gcgc.encoded_seq import EncodedSeq
from gcgc.alphabet import IUPACUnambiguousDNAEncoding

es = EncodedSeq("ATCG", IUPACUnambiguousDNAEncoding())
# EncodedSeq('ATCG', IUPACUnambiguousDNAEncoding())
Modifying an EncodedSeq
Once the object has been created, there are various ways to modify the underlying sequence.
es.pad(pad_to=10)
# EncodedSeq('ATCG||||||', IUPACUnambiguousDNAEncoding())

es.encapsulate()
# EncodedSeq('>ATCG<', IUPACUnambiguousDNAEncoding())
The EncodedSeq object also supports chaining, since each modifying method returns an EncodedSeq.
es.encapsulate().conform(7) # EncodedSeq('>ATCG<|', IUPACUnambiguousDNAEncoding())
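To make the chaining behaviour concrete without depending on GCGC itself, here is a minimal sketch of the pattern. The class `SketchSeq` and the constants `PAD`, `START`, and `END` are hypothetical stand-ins, not the library's implementation. The sketch also assumes that `conform` truncates sequences that are too long and pads those that are too short; the examples above only show the padding case.

```python
from dataclasses import dataclass

# Special characters taken from the outputs shown above.
PAD = "|"
START = ">"
END = "<"


@dataclass(frozen=True)
class SketchSeq:
    """Hypothetical stand-in for EncodedSeq; not the GCGC implementation."""

    letters: str

    def pad(self, pad_to: int) -> "SketchSeq":
        # Right-pad with the padding character up to pad_to letters.
        return SketchSeq(self.letters.ljust(pad_to, PAD))

    def encapsulate(self) -> "SketchSeq":
        # Wrap the sequence in start and end markers.
        return SketchSeq(START + self.letters + END)

    def conform(self, conform_to: int) -> "SketchSeq":
        # Assumed behaviour: truncate if too long, then pad if too short.
        return SketchSeq(self.letters[:conform_to]).pad(conform_to)


es = SketchSeq("ATCG")
chained = es.encapsulate().conform(7)
print(chained.letters)  # >ATCG<|
print(es.letters)       # ATCG (the original object is unchanged)
```

Because every method returns a new object rather than mutating in place, calls compose left to right and the starting sequence stays available for reuse.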
After the sequence has been modified, integer encodings are available as properties.
[1, 2, 3, 0]
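As an illustration of what an integer encoding involves, the sketch below maps each letter to its position in an ordered alphabet string. The ordering in `LETTERS_AND_TOKENS` and the helper `integer_encode` are assumptions made for this example; the indices GCGC actually produces depend on the alphabet object in use and may differ.

```python
# Hypothetical alphabet ordering: DNA letters first, then the special tokens.
LETTERS_AND_TOKENS = "ATCG" + "|><"

# Lookup table from each character to its integer index.
ENCODING_INDEX = {letter: idx for idx, letter in enumerate(LETTERS_AND_TOKENS)}


def integer_encode(sequence: str) -> list[int]:
    """Return the integer index of every character in the sequence."""
    return [ENCODING_INDEX[letter] for letter in sequence]


encoded = integer_encode(">ATCG<|")
print(encoded)  # [5, 0, 1, 2, 3, 6, 4]
```

A character that is not in the alphabet raises a `KeyError` in this sketch, which surfaces invalid input early instead of silently encoding it.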