Iowa State University

Department of Computer Science

Artificial Intelligence Laboratory

Frequency Based Learning (Frebal) for Naïve Bayes, NB k-gram, and NB(k)

GM066387

Frequency Based Learning (Frebal): Frebal is a stand-alone algorithmic framework for learning on sequence data. The general idea is to use the class-conditional probabilities of short local k-grams (for proteins, k consecutive amino acids) to build classifiers that predict the class of a sequence. These probabilities are estimated from the counts of the k-grams in a training dataset.

This demo version integrates two algorithms into the framework. The first, NB k-gram, builds a Naïve Bayes classifier over the k-grams. It assumes that the k-grams at different positions are independent of one another and ignores the dependencies caused by overlapping k-grams. The second, NB(k), also builds a Naïve Bayes classifier over the k-grams, but it takes the dependencies caused by overlapping k-grams into account. Note that with k=1 either algorithm is equivalent to a standard Naïve Bayes classifier.

The framework can also be extended to other learning algorithms such as Support Vector Machines, Nearest Neighbor, Decision Trees, and Artificial Neural Networks. The downloadable version comes with five built-in datasets (three based on Gene Ontology labels and two based on subcellular localization data).
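The NB k-gram idea described above can be illustrated with a minimal Python sketch. It is not the Frebal implementation: the class name NBkGram, the helper kgrams, the Laplace pseudo-count alpha, and the default alphabet size of 20 (the standard amino acids) are all assumptions made for illustration. The sketch counts overlapping k-grams per class and scores a new sequence by summing log-probabilities as if the k-grams were independent. It does not attempt NB(k), whose correction for overlapping k-grams is not specified on this page.

```python
import math
from collections import Counter, defaultdict


def kgrams(seq, k):
    """Return every overlapping window of k consecutive symbols in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class NBkGram:
    """Hypothetical Naive Bayes over k-gram counts (k-grams treated as independent)."""

    def __init__(self, k, alphabet_size=20, alpha=1.0):
        self.k = k
        self.alpha = alpha                    # Laplace pseudo-count (an assumption)
        self.n_possible = alphabet_size ** k  # number of distinct k-grams
        self.counts = defaultdict(Counter)    # class -> k-gram -> count
        self.totals = Counter()               # class -> total k-grams observed
        self.class_counts = Counter()         # class -> number of training sequences

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = kgrams(seq, self.k)
            self.counts[label].update(grams)
            self.totals[label] += len(grams)
            self.class_counts[label] += 1
        return self

    def log_scores(self, seq):
        """Return log P(class) + sum of log P(k-gram | class) for each class."""
        n = sum(self.class_counts.values())
        grams = kgrams(seq, self.k)
        scores = {}
        for c in self.class_counts:
            score = math.log(self.class_counts[c] / n)
            denom = self.totals[c] + self.alpha * self.n_possible
            for g in grams:
                score += math.log((self.counts[c][g] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy usage with made-up sequences and labels (not one of the Frebal datasets).
model = NBkGram(k=2).fit(["AAAA", "AAAC", "CCCC", "CCCA"], ["x", "x", "y", "y"])
prediction = model.predict("AAAA")
```

Setting k=1 in this sketch makes each "k-gram" a single symbol, which matches the note that k=1 corresponds to an ordinary Naïve Bayes classifier.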

Files:
Download Demo Software (Zipped)
Download user manual (WORD)
Download user manual (PDF)

Data:
Dataset1 (kinase)
Dataset2 (kinase/ligase)
Dataset3 (kinase/ligase/helicase/isomerase)

References:
Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. Learning Classifiers for Assigning Protein Sequences to Gene Ontology Functional Families. In: Proceedings of the Fifth International Conference on Knowledge Based Computer Systems (KBCS 2004), India.

Please direct questions to:

Carson Andorf
Artificial Intelligence Research Group
215 Atanasoff Hall
Iowa State University
Ames, IA
(andorfc@cs.iastate.edu)

Artificial Intelligence Laboratory
214 Atanasoff Hall
Ames, IA 50011-1041

Phone: (515) 294-9074
Fax: (515) 294-0258