MAST - Input
MAST -- Motif Alignment and Search Tool
The input to MAST contains the following fields.
- e-mail address
Enter the e-mail address where you want MAST to send
the confirmation message and your search results.
Be sure that this is a valid e-mail address.
- description
This text is included in the subject lines of the messages MAST
sends you. This field is optional, so you may leave it blank.
- motifs
This is the name of a file on your computer that contains
one or more motifs that characterize a group of related sequences.
The file can contain the results of a MEME analysis, one or
more profiles in GCG format, or motifs from any other source,
as long as they are in the correct format.
You can create a motif file by saving the e-mail message MEME
sends you to a file.
How you save an e-mail message to a file depends on
the e-mail program you use--consult your system manager if you need
help. A "browse" button is provided by MAST to help you locate
the motif file you wish to use. If you use profiles, the gap
opening and extension penalties will be ignored.
- sequence database
This is the name of the sequence database
you wish to search for sequences containing your motifs.
You select the database from a pull-down menu of available databases.
Make sure that you select the right kind of database for your
motifs. MAST can only search protein databases with protein motifs
and DNA databases with DNA motifs. MAST will search the database
of the same type as the motifs in the motif file.
- treatment of reverse complement strands
MAST can automatically generate the reverse complement strand for
each nucleotide sequence in the database and treat it in
one of three ways. ("Given strand" refers to the sequence as it
appears in the database MAST is searching.)
- combine with given strand
MAST searches for motif occurrences on the
given strand and its reverse complement together,
not allowing occurrences on the two strands to overlap
each other, and displays them together as a single sequence.
This allows motifs to occur on either strand and still count
toward the overall E-value of the match.
- treat as separate sequence
MAST searches for motifs in both
the given strand and its reverse complement, treating
them as two independent sequences. The results are
displayed separately for the two strands, as though both
had occurred in the database.
- none
MAST searches only the given strand of each sequence in
the database.
Note: this field has no effect when the database contains protein
sequences.
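The three strand options above can be sketched roughly as follows. This is an illustration only, not MAST's own code: the function names and the mode strings "combine", "separate" and "none" are hypothetical labels for the three menu choices.

```python
# Illustrative sketch only; names and mode strings are assumptions,
# not part of MAST.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def search_units(name, seq, mode):
    """Return (label, strands) pairs that would each get one result entry.

    "combine"  -> one entry scored over both strands together
    "separate" -> two entries, one per strand
    "none"     -> one entry for the given strand only
    """
    rc = reverse_complement(seq)
    if mode == "combine":
        return [(name, [seq, rc])]
    if mode == "separate":
        return [(name + " (given)", [seq]), (name + " (reverse)", [rc])]
    if mode == "none":
        return [(name, [seq])]
    raise ValueError("unknown mode: " + mode)
```

With "combine" a single entry covers both strands, whereas "separate" yields two entries that are scored and reported independently.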
- use individual sequence composition in E- and p-value calculation
This option can improve search selectivity
when erroneous matches are due to biased sequence composition.
MAST normally computes E-values and p-values using a
random sequence model based on the overall letter composition
of the database being searched. Selecting this option will
cause MAST to use a different random model for
each target sequence.
The random model for each target sequence will be based on its
letter composition, not that of the entire database.
Using this option will tend to give more accurate E-values and
increase the E-values of compositionally biased sequences.
This option may increase search times substantially if used in
conjunction with E-value display thresholds
over 10, since MAST must compute a new set of motif score
distributions for each high-scoring sequence.
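To see why a per-sequence random model raises the E-values of compositionally biased sequences, consider the minimal sketch below. It assumes a simple independent-letter model. The helper names and the toy database are hypothetical, and MAST's actual score distributions are more involved than this.

```python
from collections import Counter


def letter_frequencies(sequences):
    """Estimate letter frequencies from one or more sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def chance_probability(word, model):
    """Probability of seeing `word` at one position under an
    independent-letter random model (a simplifying assumption)."""
    p = 1.0
    for letter in word:
        p *= model.get(letter, 0.0)
    return p


# Toy database: the second sequence is heavily A-rich.
database = ["ACGTACGT", "AAAAAAAT"]
db_model = letter_frequencies(database)        # whole-database composition
seq_model = letter_frequencies([database[1]])  # composition of one sequence

p_db = chance_probability("AAAA", db_model)
p_seq = chance_probability("AAAA", seq_model)
```

Here `p_seq` is larger than `p_db`: under the sequence's own composition, an A-rich match is more likely to arise by chance, so it is judged less significant.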
- ignore motifs with high E-values
MAST can ignore motifs in the query with E-values above a
threshold you select.
This is desirable because motifs with high E-values are unlikely
to be biologically significant. The default threshold
will cause MAST to use all motifs in the query,
regardless of their E-values.
Note: This option is only available for motifs generated by
MEME 3.0 and above.
- search nucleotide database with protein motifs
Choosing this option will cause
MAST to search the nucleotide version of the selected sequence database,
converting the nucleotide sequences to protein sequences in all six
reading frames. By default, MAST searches the protein version of the
selected database when you give it a file of protein motifs.
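Six-frame translation can be sketched as below, assuming the standard genetic code. This is not MAST's implementation. The function names are hypothetical, and the use of "X" for codons containing ambiguous bases is an assumption of this sketch.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = dna.upper()
    rc = dna.translate(_COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, rc)
            for offset in range(3)]
```

For example, `six_frames("ATGGCC")` gives three translations of the given strand followed by three of its reverse complement, each of which could then be scanned with protein motifs.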
- scale motif display threshold by sequence length
MAST displays motifs that score above a threshold for all high-scoring
sequences. By default, this threshold is based on the probability
of the motifs without regard to the length of the sequence. The
threshold was chosen with protein sequences of average length in
mind. Consequently, many positions in very long sequences may match
motifs with scores above this threshold by chance, making the results
difficult to interpret. Selecting this option causes the motif
display threshold to take sequence length into account.
This will reduce the number of weak motifs displayed in long sequences
and minimize the size of the output file.
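The effect of sequence length can be illustrated with a rough calculation. The sketch below assumes each position is an independent trial and uses a simple divide-by-positions correction. Both are assumptions made for illustration, and MAST's actual scaling rule may differ.

```python
def expected_chance_hits(p_threshold, seq_length, motif_width):
    """Expected number of positions passing a per-position p-value
    threshold by chance, treating positions as independent."""
    positions = max(seq_length - motif_width + 1, 0)
    return p_threshold * positions


def length_scaled_threshold(target_hits, seq_length, motif_width):
    """Hypothetical per-position threshold that keeps the expected
    number of chance hits per sequence at `target_hits`."""
    positions = max(seq_length - motif_width + 1, 1)
    return target_hits / positions


short = expected_chance_hits(1e-4, 300, 10)     # typical protein length
long_ = expected_chance_hits(1e-4, 100000, 10)  # long nucleotide sequence
```

With a fixed threshold of 1e-4, a 300-residue sequence is expected to show far fewer than one chance match, while a 100,000-base sequence is expected to show about ten.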
- E-value display threshold
MAST only displays sequences matching your query
with E-values below the threshold you specify here.
By default, sequences in the database whose matches have E-values
less than 10 are displayed. If your motifs are very short or have
low information content (are not very specific), it may be impossible
for any sequence to achieve a low E-value. If your MAST search
returns no hits, you may wish to increase the
E-value display threshold and repeat the search.
- rank of first match returned
In order to prevent excessively large results files that cannot
be emailed, a maximum of 500 matching sequences is returned.
By default, results for the 500 best-matching sequences are returned.
If you wish to see further results, you can resubmit your query
to MAST and specify a rank larger than 1. For example,
specifying 501 will cause the 500 best-matching sequences to be
omitted and the results will start with the 501st best-matching
sequence.
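The rank setting behaves like a starting offset into the ranked list of matches. A minimal sketch, with a hypothetical function name and a plain Python list standing in for the ranked results:

```python
MAX_RETURNED = 500  # maximum number of matching sequences per result


def results_from_rank(ranked_matches, first_rank=1):
    """Return up to MAX_RETURNED matches starting at 1-based `first_rank`."""
    if first_rank < 1:
        raise ValueError("rank must be 1 or greater")
    start = first_rank - 1
    return ranked_matches[start:start + MAX_RETURNED]


# Stand-in for 1200 matches, already sorted from best to worst.
ranked = list(range(1, 1201))
first_page = results_from_rank(ranked)         # ranks 1 to 500
second_page = results_from_rank(ranked, 501)   # ranks 501 to 1000
third_page = results_from_rank(ranked, 1001)   # ranks 1001 to 1200
```

Resubmitting with ranks 1, 501, 1001 and so on walks through the full list in blocks of 500.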
- text output format
Choosing this option will cause MAST to produce plain text (ASCII)
output. By default, MAST output is in hypertext (HTML) format.
Clicking on the Start search button
causes your motifs to be sent to
the San Diego Supercomputer Center (SDSC)
and used to search the database you selected.
The results of the MAST search are sent to you by e-mail.
No copies of your motifs or search results are saved at SDSC after
the results have been sent to you.