Conserved Amino Acids
Purpose of this program
This program was written to help a lab mate find out whether an amino acid is conserved across a
number of species that have no protein sequence data but do have genomic DNA sequences on the
UCSC Genome Browser site.
This program can be downloaded here.
To use the program, first collect the data by following the instructions in Section A, then use the
program as shown in Section B:
Section A
- The program works with coding sequences, whereas the genome browser is geared to genomic DNA. It is
therefore easier to search the web page with the DNA sequence of the exon that contains the amino acid (AA)
you are interested in. (See bottom of page for an example.)
- Go to the UCSC Genome Browser 'Blat' page, paste your exon sequence into the text box and press 'Submit'.
- Go through the pages until you reach the browser page. Scroll down and, in the
'Comparative Genomics' section, select the Chain option for the species you want to check, then
press the 'refresh' button.
- Find the track that represents the data and select it, then on the next page select the 'View details
of parts of chain within browser window.' link.
- Finally, scroll down to the alignment between your sequence and the species sequence
and copy the whole alignment. (See bottom of page for an example.)
Section B
- Press the 'Sequence' button and import the file containing the cDNA sequence. The
sequence will then appear in the panel.
- Select the start of the open reading frame (ORF) from the list box at the bottom left
of the form, or enter it into the text box next to the 'Start codon' label. The ORF's amino
acid sequence should now be visible. (Use the scroll bar to navigate along the sequence.)
- Press the 'Alignment' button and paste the alignment from Section A into
the text box.
- Press the 'format' button and check the formatted alignment; any changes made to this alignment
will be ignored. If it is OK, press the 'Done' button.
- Scroll along the sequence to find the AA you are interested in and see if it has been conserved.
- To reset the mutant sequence press the 'Reset' button.
- To identify a position, click on the sequence and the information will appear below the scroll bar.
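The comparison that Section B carries out by eye can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's own code: the function names `translate`, `substitute` and `residue_at` are hypothetical, positions are 0-based indexes into the cDNA, and the idea of writing the aligned species bases over the cDNA before translating is an assumption about how a "mutant" sequence could be built. The standard genetic code is used.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (NCBI translation table 1).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cdna, start):
    """Translate cdna from 0-based index `start` until a stop codon or the end."""
    seq = cdna.lower()
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks codons with non-ACGT bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitute(cdna, offset, species_bases):
    """Return cdna with the aligned species bases written over it from `offset`."""
    return cdna[:offset] + species_bases + cdna[offset + len(species_bases):]


def residue_at(cdna, start, residue_number):
    """Return the amino acid encoded by the 1-based codon number of the ORF."""
    i = start + 3 * (residue_number - 1)
    return CODON_TABLE.get(cdna[i:i + 3].lower(), "X")


reference = "ccatggctgaataa"              # toy cDNA; the ORF starts at index 2
mutant = substitute(reference, 5, "gcc")  # species bases differ at one position
print(residue_at(reference, 2, 2), residue_at(mutant, 2, 2))  # A A
```

In this toy case the base change is synonymous, so residue 2 is conserved. Substituting only works this simply when the aligned stretch has no gaps; an alignment with insertions or deletions would need extra handling.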
Example data
Sequence of exon 3 of ABCC8 to use in the Blat search of the UCSC browser
ggtgaccgaatcccaccatctgcacctgtacatgccagccgggatggcgttcatggctgctgtcacctccgtggtctactatcacaacatcgagactt
ccaacttccccaagctgctaattg
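To relate an exon like this one to the cDNA used in Section B, it can help to know which codons of the ORF the exon covers. The following is a minimal sketch, not part of the program: `codon_range` is a hypothetical helper, and it assumes that the exon occurs once as a contiguous stretch of the cDNA, that it lies at or after the start codon, and that `start` is the 0-based index of the first base of the start codon.

```python
def codon_range(cdna, exon, start):
    """Return (offset, first_codon, last_codon) for an exon found in the cDNA.

    offset is the 0-based index of the exon in the cDNA; the codon numbers are
    1-based and counted from the start codon at index `start`.
    """
    offset = cdna.lower().find(exon.lower())
    if offset < 0:
        raise ValueError("exon not found in cDNA")
    if offset < start:
        raise ValueError("exon begins before the start codon")
    first_codon = (offset - start) // 3 + 1
    last_codon = (offset + len(exon) - 1 - start) // 3 + 1
    return offset, first_codon, last_codon


# Toy data: ORF atg gct gaa taa starts at index 2; the "exon" covers codons 2 and 3.
print(codon_range("ccatggctgaataa", "gctgaa", 2))  # (5, 2, 3)
```

An exon that starts or ends part-way through a codon is reported as touching that codon, since integer division assigns every base to the codon that contains it.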
Alignment of the ABCC8 sequence with Medaka (Japanese killifish)
21648630 ggtgatgaacaccgaccaccttcatctcttcatgccagcattcatgggtttcatagcagc 21648689
<<<<<<<< |||||   |  || |||| || || || | |||||||||    ||||  ||||| || || <<<<<<<<
17448345 ggtgaccgaatcccaccatctgcacctgtacatgccagccgggatggcgttcatggctgc 17448286
21648690 aacaacgtctgtggtttattaccacaacatagagacatccaacttccctaaactgctgct 21648749
<<<<<<<<     || || ||||| || || |||||||| ||||| ||||||||||| || |||||  | <<<<<<<<
17448285 tgtcacctccgtggtctactatcacaacatcgagacttccaacttccccaagctgctaat 17448226
21648750 tg 21648751
<<<<<<<< || <<<<<<<<
17448225 tg 17448224
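For readers who want to work with a copied alignment outside the program, the sketch below shows one way to pull the two sequences out of text laid out like the example above. It is an illustration rather than the program's own parser: the names `parse_alignment` and `match_line` are hypothetical, and the code assumes that every sequence row has the form `start sequence end` and that rows alternate between the two aligned sequences, as they do here. In this example the lower row of each pair is the one that matches the ABCC8 exon.

```python
def parse_alignment(text):
    """Join the sequence rows of a pasted alignment into (upper, lower) strings."""
    upper, lower = [], []
    rows_seen = 0
    for line in text.splitlines():
        parts = line.split()
        # A sequence row looks like: <start coordinate> <bases> <end coordinate>.
        # Match rows begin with '<<<<<<<<' or '|' and are skipped by this test.
        if len(parts) == 3 and parts[0].isdigit() and parts[2].isdigit():
            target = upper if rows_seen % 2 == 0 else lower
            target.append(parts[1])
            rows_seen += 1
    return "".join(upper), "".join(lower)


def match_line(a, b):
    """Rebuild the row of '|' marks: a bar where the bases agree, else a space."""
    return "".join("|" if x == y else " " for x, y in zip(a, b))


block = """21648750 tg 21648751
<<<<<<<< || <<<<<<<<
17448225 tg 17448224"""
upper, lower = parse_alignment(block)
print(upper, lower, match_line(upper, lower))  # tg tg ||
```

Because the match row is recomputed from the two sequences, the approach still works when the spacing of the copied `|` marks has been lost.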
Full sequence of ABCC8 cDNA
cggggcccggggggcgggggcctgacggccgggccgggcggcggagctgcaagggacaga
ggcgcggcaggcgcgcggagccagcggagccagctgagcccgagcccagcccgcgcccgc
gccgccatgcccctggccttctgcggcagcgagaaccactcggccgcctaccgggtggac
cagggggtcctcaacaacggctgctttgtggacgcgctcaacgtggtgccgcacgtcttc
ctactcttcatcaccttccccatcctcttcattggatggggaagtcagagctccaaggtg
cacatccaccacagcacatggcttcatttccctgggcacaacctgcggtggatcctgacc
ttcatgctgctcttcgtcctggtgtgtgagattgcagagggcatcctgtctgatggggtg
accgaatcccaccatctgcacctgtacatgccagccgggatggcgttcatggctgctgtc
acctccgtggtctactatcacaacatcgagacttccaacttccccaagctgctaattgcc
ctgctggtgtattggaccctggccttcatcaccaagaccatcaagtttgtcaagttcttg
gaccacgccatcggcttctcgcagctacgcttctgcctcacagggctgctggtgatcctc
tatgggatgctgctcctcgtggaggtcaatgtcatcagggtgaggagatacatcttcttc
aagacaccgagggaggtgaagcctcccgaggacctgcaagacctgggggtacgcttcctg
cagcccttcgtgaatctgctgtccaaaggcacctactggtggatgaacgccttcatcaag
actgcccacaagaagcccatcgacttgcgagccatcgggaagctgcccatcgccatgagg
gccctcaccaactaccaacggctctgcgaggcctttgacgcccaggtgcggaaggacatt
cagggcactcaaggtgcccgggccatctggcaggcactcagccatgccttcgggaggcgc
ctggtcctcagcagcactttccgcatcttggccgacctgctgggcttcgccgggccactg
tgcatctttgggatcgtggaccaccttgggaaggagaacgacgtcttccagcccaagaca
caatttctcggggtttactttgtctcatcccaagagttccttgccaatgcctacgtctta
gctgtgcttctgttccttgccctcctactgcaaaggacatttctgcaagcatcctactat
gtggccattgaaactggaattaacttgagaggagcaatacagaccaagatttacaataaa
attatgcacctgtccacctccaacctgtccatgggagaaatgactgctggacagatctgt
aatctggttgccatcgacaccaatcagctcatgtggtttttcttcttgtgcccaaacctc
tgggctatgccagtacagatcattgtgggtgtgattctcctctactacatactcggagtc
agtgccttaattggagcagctgtcatcattctactggctcctgtccagtacttcgtggcc
accaagctgtctcaggcccagcggagcacactggagtattccaatgagcggctgaagcag
accaacgagatgctccgcggcatcaagctgctgaagctgtacgcctgggagaacatcttc
cgcacgcgggtggagacgacccgcaggaaggagatgaccagcctcagggcctttgccatc
tatacctccatctccattttcatgaacacggccatccccattgcagctgtcctcataact
ttcgtgggccacgtcagcttcttcaaagaggccgacttctcgccctccgtggcctttgcc
tccctctccctcttccatatcttggtcacaccgctgttcctgctgtccagtgtggtccga
tctaccgtcaaagctctagtgagcgtgcaaaagctaagcgagttcctgtccagtgcagag
atccgtgaggagcagtgtgccccccatgagcccacacctcagggcccagccagcaagtac
caggcggtgcccctcagggttgtgaaccgcaagcgtccagcccgggaggattgtcggggc
ctcaccggcccactgcagagcctggtccccagtgcagatggcgatgctgacaactgctgt
gtccagatcatgggaggctacttcacgtggaccccagatggaatccccacactgtccaac
atcaccattcgtatcccccgaggccagctgactatgatcgtggggcaggtgggctgcggc
aagtcctcgctccttctagccgcactgggggagatgcagaaggtctcaggggctgtcttc
tggagcagccttcctgacagcgagataggagaggaccccagcccagagcgggagacagcg
accgacttggatatcaggaagagaggccccgtggcctatgcttcgcagaaaccatggctg
ctaaatgccactgtggaggagaacatcatctttgagagtcccttcaacaaacaacggtac
aagatggtcattgaagcctgctctctgcagccagacatcgacatcctgccccatggagac
cagacccagattggggaacggggcatcaacctgtctggtggtcaacgccagcgaatcagt
gtggcccgagccctctaccagcacgccaacgttgtcttcttggatgaccccttctcagct
ctggatatccatctgagtgaccacttaatgcaggccggcatccttgagctgctccgggac
gacaagaggacagtggtcttagtgacccacaagctacagtacctgccccatgcagactgg
atcattgccatgaaggatggcaccatccagagggagggtaccctcaaggacttccagagg
tctgaatgccagctctttgagcactggaagaccctcatgaaccgacaggaccaagagctg
gagaaggagactgtcacagagagaaaagccacagagccaccccagggcctatctcgtgcc
atgtcctcgagggatggccttctgcaggatgaggaagaggaggaagaggaggcagctgag
agcgaggaggatgacaacctgtcgtccatgctgcaccagcgtgctgagatcccatggcga
gcctgcgccaagtacctgtcctccgccggcatcctgctcctgtcgttgctggtcttctca
cagctgctcaagcacatggtcctggtggccatcgactactggctggccaagtggaccgac
agcgccctgaccctgacccctgcagccaggaactgctccctcagccaggagtgcaccctc
gaccagactgtctatgccatggtgttcacggtgctctgcagcctgggcattgtgctgtgc
ctcgtcacgtctgtcactgtggagtggacagggctgaaggtggccaagagactgcaccgc
agcctgctaaaccggatcatcctagcccccatgaggttttttgagaccacgccccttggg
agcatcctgaacagattttcatctgactgtaacaccatcgaccagcacatcccatccacg
ctggagtgcctgagccgctccaccctgctctgtgtctcagccctggccgtcatctcctat
gtcacacctgtgttcctcgtggccctcttgcccctggccatcgtgtgctacttcatccag
aagtacttccgggtggcgtccagggacctgcagcagctggatgacaccacccagcttcca
cttctctcacactttgccgaaaccgtagaaggactcaccaccatccgggccttcaggtat
gaggcccggttccagcagaagcttctcgaatacacagactccaacaacattgcttccctc
ttcctcacagctgccaacagatggctggaagtccgaatggagtacatcggtgcatgtgtg
gtgctcatcgcagcggtgacctccatctccaactccctgcacagggagctctctgctggc
ctggtgggcctgggccttacctacgccctaatggtctccaactacctcaactggatggtg
aggaacctggcagacatggagctccagctgggggctgtgaagcgcatccatgggctcctg
aaaaccgaggcagagagctacgaggggctcctggcaccatcgctgatcccaaagaactgg
ccagaccaagggaagatccagatccagaacctgagcgtgcgctacgacagctccctgaag
ccggtgctgaagcacgtcaatgccctcatcgcccctggacagaagatcgggatctgcggc
cgcaccggcagtgggaagtcctccttctctcttgccttcttccgcatggtggacacgttc
gaagggcacatcatcattgatggcattgacatcgccaaactgccgctgcacaccctgcgc
tcacgcctctccatcatcctgcaggaccccgtcctcttcagcggcaccatccgatttaac
ctggaccctgagaggaagtgctcagatagcacactgtgggaggccctggaaatcgcccag
ctgaagctggtggtgaaggcactgccaggaggcctcgatgccatcatcacagaaggcggg
gagaatttcagccagggacagaggcagctgttctgcctggcccgggccttcgtgaggaag
accagcatcttcatcatggacgaggccacggcttccattgacatggccacggaaaacatc
ctccaaaaggtggtgatgacagccttcgcagaccgcactgtggtcaccatcgcgcatcga
gtgcacaccatcctgagtgcagacctggtgatcgtcctgaagcggggtgccatccttgag
ttcgataagccagagaagctgctcagccggaaggacagcgtcttcgcctccttcgtccgt
gcagacaagtgacctgccagagcccaagtgccatcccacattcggaccctgcccataccc
ctgcctgggttttctaactgtaaatcacttgtaaataaatagatttgattatttcctaaa