# Support Vector Machines

CSC 4810-Artificial Intelligence

ASSG# 4

Support Vector Machine

An SVM program is an implementation of the Support Vector Machine (SVM), a learning method developed by Vapnik. The main features of such a program are support for pattern recognition, for regression, and for learning a ranking function. Underlying the success of SVMs are the mathematical foundations of statistical learning theory. Rather than minimizing the training error, SVMs minimize the structural risk, which expresses an upper bound on the generalization error. SVMs are popular because they usually achieve good error rates and can handle unusual types of data such as text, graphs, and images.

The leading idea of an SVM is to classify the input data by separating the two classes with a decision boundary that lies far from both classes and makes a low number of errors. A data set is used to “train” a particular machine. The machine can learn more by being retrained on the old data plus the new data. The trained machine is as unique as the data that was used to train it and the algorithm that was used to process the data. Once a machine is trained, it can be used to predict how closely new data matches the patterns the machine was trained on. In other words, Support Vector Machines are used for pattern recognition. A trained SVM classifies an input x with the following decision function:

H(x) = sign(w · x + b)

where

x = input vector

w = weight vector

b = threshold
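The decision function above can be sketched in a few lines of Python. This is only an illustration: the weight vector and threshold below are made-up values rather than the result of any training, and assigning a score of exactly zero to the positive class is an assumed convention.

```python
def decision(w, b, x):
    """Return +1 or -1 for input x using H(x) = sign(w . x + b)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumed convention: a score of exactly 0 is assigned to class +1.
    return 1 if score >= 0 else -1


# Hypothetical weight vector and threshold, chosen only for illustration.
w = [2.0, -1.0]
b = -0.5

print(decision(w, b, [1.0, 0.5]))  # score = 2.0 - 0.5 - 0.5 = 1.0, so +1
print(decision(w, b, [0.0, 1.0]))  # score = 0.0 - 1.0 - 0.5 = -1.5, so -1
```

The sign tells which side of the decision boundary the input falls on, and the size of the score indicates how far from the boundary it lies.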

The generalization abilities of SVMs and other classifiers differ significantly, especially when the number of training data is small. This suggests that if some mechanism to maximize the margins of decision boundaries is introduced into non-SVM classifiers, their performance degradation can be prevented when the class overlap is scarce or non-existent. In the original SVM, the n-class classification problem is converted into n two-class problems, and in the ith two-class problem we determine the optimal decision function that separates class i from the remaining classes. In classification, if exactly one of the n decision functions classifies an unknown datum into a definite class, the datum is classified into that class. In this formulation, if more than one decision function classifies a datum into a definite class, or no decision function does, the datum is unclassifiable.
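The one-against-all rule described above, including its unclassifiable cases, might look like the following sketch. It assumes the n decision functions have already been trained and only receives their output values for one datum; the function names are hypothetical. The second function shows one common variant that avoids unclassifiable data by picking the largest decision value.

```python
def one_against_all(decision_values):
    """decision_values[i] is the output of the i-th decision function:
    positive means 'class i', negative means 'one of the remaining classes'.
    Returns the class index, or None when the datum is unclassifiable."""
    positive = [i for i, d in enumerate(decision_values) if d > 0]
    if len(positive) == 1:
        return positive[0]
    return None  # no function, or more than one, claimed the datum


def largest_value(decision_values):
    """Variant: always choose the class with the largest decision value."""
    return max(range(len(decision_values)), key=lambda i: decision_values[i])


print(one_against_all([0.8, -0.3, -1.2]))   # only function 0 is positive
print(one_against_all([0.8, 0.4, -1.2]))    # two positives: unclassifiable
print(one_against_all([-0.1, -0.3, -1.2]))  # no positives: unclassifiable
print(largest_value([0.8, 0.4, -1.2]))      # the variant still picks a class
```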

To resolve unclassifiable regions for SVMs, we discuss four types of SVMs and their variants: one-against-all SVMs, pairwise SVMs, ECOC (Error-Correcting Output Code) SVMs, and all-at-once SVMs. Another problem of SVMs is slow training. Because an SVM is trained by solving a quadratic programming problem whose number of variables equals the number of training data, training is slow when the number of training data is large. We discuss training of SVMs by decomposition techniques combined with a steepest ascent method.
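To make the cost of training concrete, here is a minimal sketch of a steepest ascent method on the dual quadratic programming problem, with one variable (alpha) per training datum. It is a simplification and not the decomposition technique itself: it updates all variables at once, uses a linear kernel, and assumes a constant 1 is appended to each point so that the threshold b becomes part of w. The data, step size, and iteration count are made-up values.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def train_dual_ascent(points, labels, C=10.0, step=0.01, iterations=2000):
    """Projected steepest ascent on the SVM dual problem.

    Assumption: a constant 1 is appended to every point, which folds the
    threshold b into w and removes the equality constraint of the usual dual.
    """
    xs = [list(p) + [1.0] for p in points]
    n = len(xs)
    # Q[i][j] = y_i * y_j * (x_i . x_j); its size grows with the square of n.
    Q = [[labels[i] * labels[j] * dot(xs[i], xs[j]) for j in range(n)]
         for i in range(n)]
    alpha = [0.0] * n
    for _ in range(iterations):
        # Gradient of the dual objective sum(alpha) - 0.5 * alpha^T Q alpha.
        grad = [1.0 - dot(Q[i], alpha) for i in range(n)]
        # Ascent step, then clip each alpha back into the box [0, C].
        alpha = [min(C, max(0.0, a + step * g)) for a, g in zip(alpha, grad)]
    dim = len(xs[0])
    w_aug = [sum(alpha[i] * labels[i] * xs[i][k] for i in range(n))
             for k in range(dim)]
    return alpha, w_aug[:-1], w_aug[-1]


# Hypothetical, linearly separable toy data.
points = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
labels = [1, 1, -1, -1]

alpha, w, b = train_dual_ascent(points, labels)
predictions = [1 if dot(w, p) + b >= 0 else -1 for p in points]
print(alpha, w, b, predictions)
```

Each iteration touches the full n-by-n matrix Q, which is why training slows down as the number of training data grows. Decomposition techniques address this by optimizing only a small subset of the variables at a time.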

The Support Vector Machine algorithm also plays a big role in the Internet industry. The Internet is huge, made of billions of documents, and their number grows exponentially every year. This makes it hard to find a particular piece of information among them. Current search engines scan documents for the keywords that the user provides in a search query. Some search engines, such as Google, go as far as to offer page rankings by users who have previously visited the page, which relies on other people ranking the page according to their own needs. Even though these techniques help millions of users a day retrieve information, they are not even close to being an exact science. The problem lies in finding web pages that actually contain the information you are looking for, based on your search query.

[Figure: the SVM algorithm]

It is important to understand the mechanism behind the SVM. The SVM implements the Bayes rule in an interesting way. Instead of estimating P(x), the probability that a datum x belongs to the positive class, it estimates sign(P(x) - 1/2). This is an advantage when the goal is binary classification with a minimal expected misclassification rate. However, it also means that in some other situations the SVM needs to be modified and should not be used as is.
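The following small sketch illustrates why the sign of P(x) - 1/2 is enough for binary classification. It assumes P(x) denotes the probability that x belongs to the positive class, and the probabilities used are made-up values. For each one, the label given by the sign rule has an expected error no larger than that of the opposite label.

```python
def bayes_label(p_positive):
    """Bayes rule for two classes: sign(P - 1/2), where P is the assumed
    probability that the datum belongs to the positive class."""
    return 1 if p_positive - 0.5 >= 0 else -1


def expected_error(p_positive, label):
    """Probability that predicting `label` turns out to be wrong."""
    return 1.0 - p_positive if label == 1 else p_positive


# Hypothetical class probabilities for four different inputs.
for p in [0.1, 0.4, 0.6, 0.9]:
    chosen = bayes_label(p)
    print(p, chosen, expected_error(p, chosen), expected_error(p, -chosen))
```

The sign rule throws away how confident the prediction is, which is one reason the SVM needs modification when an actual probability estimate is required.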

In conclusion, Support Vector Machines support many real-world applications such as text categorization, handwritten character recognition, image classification, and bioinformatics. Their introduction in the early 1990s led to an explosion of applications and a deepening theoretical analysis that has now established Support Vector Machines, along with neural networks, as one of the standard tools for machine learning and data mining. Support Vector Machines are also widely used in the medical field.

Reference:

Boser, B., Guyon, I., and Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers.

http://www.csie.ntu.edu.tw/~cjlin/papers/tanh.pdf