Encoding Seq
The EncodedSeq object is similar to the Biopython Seq object in that it primarily contains the sequence letters and an associated Alphabet. The main difference is that the Alphabet is a GCGC alphabet rather than a Biopython one.
Creating an EncodedSeq
To create an EncodedSeq, pass a sequence and an alphabet.
from gcgc.encoded_seq import EncodedSeq
from gcgc.alphabet import IUPACUnambiguousDNAEncoding

es = EncodedSeq("ATCG", IUPACUnambiguousDNAEncoding())
# EncodedSeq('ATCG', IUPACUnambiguousDNAEncoding())
Modifying an EncodedSeq
Once the object has been created, methods such as pad and encapsulate modify the underlying sequence.
es.pad(pad_to=10)
# EncodedSeq('ATCG||||||', IUPACUnambiguousDNAEncoding())

es.encapsulate()
# EncodedSeq('>ATCG<', IUPACUnambiguousDNAEncoding())
The EncodedSeq object also supports method chaining.
es.encapsulate().conform(7)
# EncodedSeq('>ATCG<|', IUPACUnambiguousDNAEncoding())
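The behavior shown above can be mimicked with a small stand-in class. This is only a sketch, not GCGC's implementation: `SketchSeq` is a hypothetical name, the `>`, `<`, and `|` tokens are copied from the outputs above, and the assumption that `conform` trims a sequence longer than the target length is not confirmed by this page. Each method returns a new object instead of mutating the original, which is what makes chaining possible.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchSeq:
    """Hypothetical stand-in for EncodedSeq; not part of GCGC."""

    letters: str
    start: str = ">"    # tokens as they appear in the outputs above
    end: str = "<"
    padding: str = "|"

    def pad(self, pad_to: int) -> "SketchSeq":
        # Append padding tokens until the sequence is pad_to characters long.
        return replace(self, letters=self.letters.ljust(pad_to, self.padding))

    def encapsulate(self) -> "SketchSeq":
        # Wrap the sequence in the start and end tokens.
        return replace(self, letters=f"{self.start}{self.letters}{self.end}")

    def conform(self, conform_to: int) -> "SketchSeq":
        # Assumption: trim when too long, then pad when too short.
        return replace(self, letters=self.letters[:conform_to]).pad(conform_to)


es = SketchSeq("ATCG")
print(es.pad(pad_to=10).letters)            # ATCG||||||
print(es.encapsulate().conform(7).letters)  # >ATCG<|
print(es.letters)                           # ATCG (the original is unchanged)
```

Because the dataclass is frozen, every call yields a fresh value, so `es` can be reused after any chain of calls.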
Integer Encodings
After the sequence has been modified, integer encodings are available as properties.
es.integer_encoded
# [1, 2, 3, 0]
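One way to read the result above: if each letter maps to its position in the alphabet, and the letters are ordered `GATC`, then `ATCG` becomes `[1, 2, 3, 0]`. The sketch below illustrates that mapping. The `GATC` ordering, the `integer_encode` helper, and the placement of the special tokens after the letters are all assumptions made for illustration, not GCGC's actual code.

```python
from typing import List

# Assumed letter order, with the special tokens placed after the letters.
LETTERS = "GATC"
TOKENS = "><|"


def integer_encode(sequence: str, alphabet: str = LETTERS + TOKENS) -> List[int]:
    """Map each character to its index in the alphabet (hypothetical helper)."""
    index = {char: i for i, char in enumerate(alphabet)}
    return [index[char] for char in sequence]


print(integer_encode("ATCG"))     # [1, 2, 3, 0]
print(integer_encode(">ATCG<|"))  # [4, 1, 2, 3, 0, 5, 6]
```

A character outside the alphabet raises a `KeyError` in this sketch; how GCGC handles that case is not covered here.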