MEME - Alphabet

MEME accepts DNA or protein sequences, but all of the sequences must be of the same type. DNA sequences are not translated into protein.

Protein sequences should use the standard IUPAC alphabet: ACDEFGHIKLMNPQRSTVWY.
In addition, MEME automatically converts the following ambiguous letter codes in protein sequences to unambiguous ones:
- B --> D (Asp, Asn to Asp)
- U --> C (selenocysteine to cysteine)
- X --> C (unknown to cysteine)
- Z --> E (Glu, Gln to Glu)
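
The protein conversions listed above can be sketched as a lookup table. This is an illustration only, not MEME's own code: the names `PROTEIN_CONVERSIONS` and `convert_protein` are hypothetical, and both the upper-casing of input and the rejection of any other letter are assumptions made for the sketch.

```python
# Illustrative sketch only; this is not MEME's implementation.
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")

# Conversions copied from the list above.
PROTEIN_CONVERSIONS = {"B": "D", "U": "C", "X": "C", "Z": "E"}


def convert_protein(seq: str) -> str:
    """Map the listed ambiguous protein codes to unambiguous ones."""
    out = []
    for ch in seq.upper():  # assumption: input is treated case-insensitively
        ch = PROTEIN_CONVERSIONS.get(ch, ch)
        if ch not in PROTEIN_ALPHABET:
            # assumption: anything else is rejected rather than converted
            raise ValueError(f"unsupported protein letter: {ch!r}")
        out.append(ch)
    return "".join(out)


print(convert_protein("MBXZU"))  # MDCEC
```

How MEME itself treats letters outside this table is not stated here; the `ValueError` is only a choice made for the sketch.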

DNA sequences should use the standard DNA alphabet: ACGT.
In addition, MEME automatically converts the following ambiguous letter codes in DNA sequences to unambiguous ones:
- B --> C (G, T or C to C)
- D --> G (G, A or T to G)
- H --> A (A, C or T to A)
- K --> T (G or T to T)
- M --> A (A or C to A)
- N --> C (any to C)
- R --> A (G or A to A)
- S --> G (G or C to G)
- U --> T (uridine to T)
- V --> G (G, C or A to G)
- W --> T (A or T to T)
- Y --> C (T or C to C)
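
The DNA conversions listed above can likewise be sketched with a translation table. Again, this is only an illustration and not MEME's own code: `DNA_CONVERSIONS` and `convert_dna` are hypothetical names, and the upper-casing of input and the rejection of unlisted letters are assumptions.

```python
# Illustrative sketch only; this is not MEME's implementation.
DNA_ALPHABET = "ACGT"

# Conversions copied from the list above.
DNA_CONVERSIONS = {
    "B": "C", "D": "G", "H": "A", "K": "T", "M": "A", "N": "C",
    "R": "A", "S": "G", "U": "T", "V": "G", "W": "T", "Y": "C",
}

# str.maketrans accepts a dict whose keys are single characters.
_DNA_TABLE = str.maketrans(DNA_CONVERSIONS)


def convert_dna(seq: str) -> str:
    """Map the listed ambiguous DNA codes to A, C, G or T."""
    converted = seq.upper().translate(_DNA_TABLE)  # assumption: case-insensitive
    unexpected = set(converted) - set(DNA_ALPHABET)
    if unexpected:
        # assumption: letters outside the table are rejected
        raise ValueError(f"unsupported DNA letters: {sorted(unexpected)}")
    return converted


print(convert_dna("ACGTNRYU"))  # ACGTCACT
```

Each ambiguous code maps to a single fixed letter, so the converted sequence has the same length as the input.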