10-08-2011, 02:46 PM
Objective:
To predict the genes in the given human nucleotide sequence (NM_000230.2) using the GenScan tool.
Theory:
• GenScan allows prediction of complete gene structures in genomic sequences, including exons, introns, promoters and poly-adenylation signals.
• GenScan differs from the majority of gene-finding algorithms in that it can identify complete, partial and multiple genes on both DNA strands. The program is based on a probabilistic model of gene structure and compositional properties and does not make use of protein sequence homology information. The program is suitable for vertebrate, maize and Arabidopsis sequences. The vertebrate version also works fairly well for Drosophila sequences.
• GenScan was developed by Chris Burge in the research group of Samuel Karlin, Department of Mathematics at Stanford University.
• The score of a predicted feature (e.g., exon or splice site) is a log-odds measure of the quality of the feature based on local sequence properties. Thus, for example, a predicted donor splice site with score > 100 is excellent, 50-100 is acceptable, 0-50 is weak and below zero is poor (probably not a real donor site).
• The probability of a predicted exon is the estimated probability under GenScan's model of genomic sequence structure that the exon is correct. This probability depends in general on global as well as local sequence properties. This information can be used to assess the reliability of the predicted exon, e.g., it would be better to design PCR primers based on a predicted exon with probability > 0.95 than one with lower probability.
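The score bands and the probability cut-off described above can be sketched in Python. This is only an illustration: the `PredictedExon` record and its field names are hypothetical and do not reflect GenScan's actual output format, the exon values are made up, and the handling of the exact boundary values (0, 50, 100) is an assumption, since the text does not say which band they belong to.

```python
from typing import List, NamedTuple


class PredictedExon(NamedTuple):
    # Hypothetical record for illustration; not GenScan's output format.
    start: int
    end: int
    probability: float


def classify_donor_score(score: float) -> str:
    """Map a donor splice site score to the quality bands given in the text.

    Assumption: boundary values fall into the lower-quality band above them
    (100 -> acceptable, 50 -> acceptable, 0 -> weak).
    """
    if score > 100:
        return "excellent"
    if score >= 50:
        return "acceptable"
    if score >= 0:
        return "weak"
    return "poor"


def reliable_exons(exons: List[PredictedExon],
                   min_probability: float = 0.95) -> List[PredictedExon]:
    """Keep only exons whose probability exceeds the threshold."""
    return [exon for exon in exons if exon.probability > min_probability]


# Made-up example values, not real GenScan predictions.
example_exons = [
    PredictedExon(start=120, end=310, probability=0.99),
    PredictedExon(start=900, end=1010, probability=0.62),
]

print(classify_donor_score(72))
print(reliable_exons(example_exons))
```

With these made-up values, only the first exon passes the 0.95 threshold, so it would be the preferred candidate for PCR primer design.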
Procedure:
• Retrieve the query DNA sequence in FASTA format from NCBI (ncbi.nlm.nih.gov).
• Open the GenScan web server at http://genome.dkfz-heidelberg.de/cgi-bin...enscan.cgi
• Paste the sequence into the input box.
• Click run.
• Once the run button is clicked, the program starts predicting the coding regions.
• The predicted genes/exons are displayed after the program finishes executing.
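The retrieval step can also be done from a script instead of the NCBI web page. The sketch below is not part of the original procedure: it assumes the NCBI E-utilities `efetch` endpoint with the `nuccore` database, and the helper names (`efetch_fasta_url`, `fetch_fasta`, `looks_like_fasta`) are made up for illustration. Submitting the sequence to GenScan is still done by pasting it into the web form as described above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumed here as an alternative
# to downloading the FASTA file through the NCBI web page).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str) -> str:
    """Build the efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_fasta(accession: str, timeout: float = 30.0) -> str:
    """Download the FASTA record for an accession (needs network access)."""
    with urlopen(efetch_fasta_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


def looks_like_fasta(text: str) -> bool:
    """Basic sanity check: a '>' header line followed by sequence lines."""
    lines = text.strip().splitlines()
    return len(lines) >= 2 and lines[0].startswith(">")
```

Calling `fetch_fasta("NM_000230.2")` should return the record used in this exercise, and `looks_like_fasta` can confirm the download before the text is pasted into the GenScan input box.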