GENEMARK

Introduction:

GeneMark was the first gene-finding method recognized as an efficient and accurate tool for genome projects. It was used to annotate the first completely sequenced bacterium, Haemophilus influenzae, and the first completely sequenced archaeon, Methanococcus jannaschii. The GeneMark algorithm uses species-specific inhomogeneous Markov chain models of protein-coding DNA sequence as well as homogeneous Markov chain models of non-coding DNA. The parameters of the models are estimated from training sets of sequences of known type. The major step of the algorithm computes the a posteriori probability that a sequence fragment carries genetic code in one of six possible frames (including the three frames on the complementary DNA strand) or is “non-coding”. (Source: http://en.wikipedia.org/wiki/GeneMark)
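The posterior computation described above can be illustrated with a small sketch. This is not GeneMark's actual implementation: the first-order chains, the add-one smoothing, the equal priors, the toy training sequences, and all function names are assumptions chosen only to show how Bayes' rule compares seven hypotheses (six coding frames plus “non-coding”).

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def train(seqs, period):
    """Estimate P(base | previous base, phase) with add-one smoothing.

    period=3 gives an inhomogeneous (codon-position dependent) model,
    period=1 a homogeneous one. Training sequences are assumed to start
    at codon position 1.
    """
    counts = [{p: {b: 1.0 for b in BASES} for p in BASES} for _ in range(period)]
    for s in seqs:
        for i in range(1, len(s)):
            counts[i % period][s[i - 1]][s[i]] += 1
    return [
        {p: {b: c / sum(row.values()) for b, c in row.items()}
         for p, row in phase.items()}
        for phase in counts
    ]

def log_likelihood(seq, model, frame=0):
    """Log-probability of seq under model, with its first base at phase `frame`."""
    period = len(model)
    ll = math.log(0.25)  # assumed uniform distribution for the first base
    for i in range(1, len(seq)):
        ll += math.log(model[(i + frame) % period][seq[i - 1]][seq[i]])
    return ll

def posteriors(fragment, coding, noncoding, prior_coding=0.5):
    """Posterior probability of each of the seven hypotheses (Bayes' rule)."""
    rc = reverse_complement(fragment)
    scores = {}
    for f in range(3):
        prior = math.log(prior_coding / 6)  # assumed equal prior per frame
        scores[f"+{f + 1}"] = log_likelihood(fragment, coding, f) + prior
        scores[f"-{f + 1}"] = log_likelihood(rc, coding, f) + prior
    scores["noncoding"] = (log_likelihood(fragment, noncoding)
                           + math.log(1 - prior_coding))
    top = max(scores.values())  # subtract the maximum for numerical stability
    total = sum(math.exp(v - top) for v in scores.values())
    return {h: math.exp(v - top) / total for h, v in scores.items()}

# Toy training data (hypothetical, far too small for real use).
coding_model = train(["GCA" * 30], period=3)
noncoding_model = train(["ATTATAATTTAATATA" * 5], period=1)
result = posteriors("GCAGCAGCAGCA", coding_model, noncoding_model)
best = max(result, key=result.get)
print(best, round(result[best], 3))
```

In a real gene finder the models are of higher order and are trained on large, curated sets of sequences; the sketch only shows the shape of the calculation.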

How to use GeneMark?

  • Point your browser to opal.biology.gatech.edu/GeneMark. This is the main GeneMark home page. It offers different specialized versions of the program (each corresponding to a different gene model) for working on prokaryotic (microbes), eukaryotic (e.g., animals), cDNA (the DNA version of mRNA), or virus sequences. In this example, we analyze a sequence from a little-known microbe.

  • Now point your browser to www.uniprot.org.
  • Enter the accession number AE008569 in the query box and click “Search”.
  • The results page shows information about the protein.
  • Now select GenBank, click on the accession number, and retrieve the nucleotide sequence from there.
  • Save the nucleotide sequence in a Notepad file.
  • Now go back to opal.biology.gatech.edu/GeneMark and, in the prokaryotic section, click the Heuristic models link. This selects the program version intended for analyzing sequences from a new organism. It brings you to a simple form with the usual sequence input box.
  • Copy the sequence from the Notepad file.
  • Paste the sequence in the input box.
  • Click the Start GeneMark.hmm button to run the search. After a while, the program outputs a list of predicted genes with their locations in a very simple format.
  • Go back to the main page and select the option to output the translated protein for each predicted gene. The protein sequences then appear in the output.
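To make the last two steps concrete, the following sketch shows how a predicted gene location can be turned into a protein sequence offline. It is only an illustration under stated assumptions: the coordinates are taken as 1-based and inclusive with a "+" or "-" strand, the toy FASTA record and the function names are hypothetical, and the standard codon table is used without the alternative start codons (such as GTG or TTG) that bacterial genes may use. It does not reproduce GeneMark's own output format.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def read_fasta(text):
    """Return the sequence of a single-record FASTA text as one uppercase string."""
    lines = (ln.strip() for ln in text.splitlines())
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def translate_gene(genome, start, end, strand="+"):
    """Translate the gene at 1-based inclusive [start, end] on the given strand.

    Translation stops at the first stop codon; codons containing
    characters other than A, C, G, T are reported as 'X'.
    """
    fragment = genome[start - 1:end]
    if strand == "-":
        fragment = reverse_complement(fragment)
    protein = []
    for i in range(0, len(fragment) - 2, 3):
        amino = CODON_TABLE.get(fragment[i:i + 3], "X")
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

# Hypothetical toy record standing in for the saved Notepad file.
saved_text = """>toy_sequence
CCATGAAATTT
GGGTAACC
"""
genome = read_fasta(saved_text)
protein = translate_gene(genome, 3, 17, "+")
print(protein)
```

For a gene predicted on the complementary strand, pass "-" as the strand; the function then reverse-complements the fragment before translating it.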

Contributed by: Omama Saud