Encoding Seq
The EncodedSeq object is similar to the Biopython Seq object in that it primarily contains the
sequence letters and an associated Alphabet. The main difference is that the Alphabet is a GCGC
alphabet rather than a Biopython one.
Creating an EncodedSeq
To create an EncodedSeq, pass a sequence and an alphabet.
```python
from gcgc.encoded_seq import EncodedSeq
from gcgc.alphabet import IUPACUnambiguousDNAEncoding

es = EncodedSeq("ATCG", IUPACUnambiguousDNAEncoding())
# EncodedSeq('ATCG', IUPACUnambiguousDNAEncoding())
```
Modifying an EncodedSeq
Once the object has been created, there are various ways to modify the underlying sequence.
```python
es.pad(pad_to=10)
# EncodedSeq('ATCG||||||', IUPACUnambiguousDNAEncoding())

es.encapsulate()
# EncodedSeq('>ATCG<', IUPACUnambiguousDNAEncoding())
```
Because each of these methods returns an EncodedSeq, calls can also be chained.
```python
es.encapsulate().conform(7)
# EncodedSeq('>ATCG<|', IUPACUnambiguousDNAEncoding())
```
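The modification and chaining behaviour shown above can be approximated with a small stand-in class. This is only a sketch and not the GCGC implementation: `SketchSeq` is a hypothetical name, the `|`, `>` and `<` symbols are copied from the outputs above, and the truncating behaviour of `conform` for sequences longer than the target is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchSeq:
    """Hypothetical stand-in for EncodedSeq; not the GCGC implementation."""

    letters: str
    padding: str = "|"  # symbols copied from the outputs shown above
    start: str = ">"
    end: str = "<"

    def pad(self, pad_to: int) -> "SketchSeq":
        # Right-pad with the padding symbol; longer sequences are left unchanged.
        return replace(self, letters=self.letters.ljust(pad_to, self.padding))

    def encapsulate(self) -> "SketchSeq":
        # Wrap the sequence in the start and end symbols.
        return replace(self, letters=f"{self.start}{self.letters}{self.end}")

    def conform(self, length: int) -> "SketchSeq":
        # Assumption: conform truncates long sequences and pads short ones.
        return replace(self, letters=self.letters[:length]).pad(length)


es = SketchSeq("ATCG")
padded = es.pad(pad_to=10)
chained = es.encapsulate().conform(7)

print(padded.letters)   # ATCG||||||
print(chained.letters)  # >ATCG<|
print(es.letters)       # ATCG
```

Each method returns a new object instead of changing the existing one, which is what lets the calls be chained while `es` keeps its original letters.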
Integer Encodings
After the sequence has been modified, integer encodings are available as properties.
```python
es.integer_encoded
# [1, 2, 3, 0]
```
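As a rough illustration of what an integer encoding involves, the sketch below maps each letter to its position in an alphabet string. The letter order `"ACGT"` and the helper names are hypothetical. GCGC's alphabets define their own ordering, so the indices here will not match the output shown above.

```python
from typing import Dict, List

# Hypothetical letter order, chosen only for illustration.
LETTERS = "ACGT"

# Each letter's integer code is its position in LETTERS.
letter_to_index: Dict[str, int] = {
    letter: index for index, letter in enumerate(LETTERS)
}
index_to_letter: Dict[int, str] = {
    index: letter for letter, index in letter_to_index.items()
}


def integer_encode(sequence: str) -> List[int]:
    """Translate a sequence into a list of integer codes."""
    return [letter_to_index[letter] for letter in sequence]


def integer_decode(indices: List[int]) -> str:
    """Translate integer codes back into letters."""
    return "".join(index_to_letter[index] for index in indices)


encoded = integer_encode("ATCG")
print(encoded)                  # [0, 3, 1, 2]
print(integer_decode(encoded))  # ATCG
```

The codes depend entirely on the order of letters in the alphabet, so the same alphabet has to be used for encoding and decoding.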