# A training algorithm for optimal margin classifiers

@inproceedings{Boser1992ATA, title={A training algorithm for optimal margin classifiers}, author={Bernhard E. Boser and Isabelle Guyon and Vladimir Naumovich Vapnik}, booktitle={COLT '92}, year={1992} }

A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to…
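The maximal-margin idea can be illustrated with a small self-contained sketch. The subgradient solver below (a Pegasos-style method for the soft-margin hinge objective, with a small regularizer so it approximates the hard-margin solution) is a modern stand-in for the paper's quadratic-programming procedure, not the original algorithm; the toy data and the tolerance used to pick out the supporting patterns are illustrative assumptions.

```python
import numpy as np

# Toy linearly separable data; labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Pegasos-style subgradient descent on the soft-margin SVM objective:
#   min_w  (lam/2)||w||^2 + mean_i max(0, 1 - y_i <w, x_i>).
# For small lam this approximates the maximal-margin solution.
lam = 0.01
w = np.zeros(2)
rng = np.random.default_rng(0)
for t in range(1, 20001):
    i = rng.integers(len(X))
    eta = 1.0 / (lam * t)                  # standard decaying step size
    if y[i] * (w @ X[i]) < 1:              # margin violated: hinge subgradient
        w = (1 - eta * lam) * w + eta * y[i] * X[i]
    else:                                  # only the regularizer contributes
        w = (1 - eta * lam) * w

# "Supporting patterns": the training points whose margin is smallest,
# i.e. closest to the decision boundary.
margins = y * (X @ w)
support = X[np.isclose(margins, margins.min(), atol=0.1)]
print(np.sign(X @ w))   # → [ 1.  1. -1. -1.]
```

With this data the two points ±(2, 2) end up as the supporting patterns, and the learned decision boundary depends on them alone, mirroring the paper's observation that the solution is a linear combination of the patterns nearest the boundary.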

#### 10,288 Citations

Adaptive training methods for optimal margin classification

- Computer Science
- IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339)
- 1999

This study considers adaptive training schemes for optimal margin classification with neural networks, describes some novel schemes, and compares them with conventional schemes.

Automatic Capacity Tuning of Very Large VC-Dimension Classifiers

- Mathematics, Computer Science
- NIPS
- 1992

It is shown that even high-order polynomial classifiers in high dimensional spaces can be trained with a small amount of training data and yet generalize better than classifiers with a smaller VC-dimension.

Pattern Selection for Support Vector Classifiers

- Computer Science
- IDEAL
- 2002

A k-nearest neighbors (k-NN) based pattern selection method that tries to select the patterns that are near the decision boundary and that are correctly labeled, in order to reduce training time and avoid redundant SVs.
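The selection heuristic described in this snippet can be sketched in a few lines: a point whose nearest neighbors carry mixed labels is likely near the decision boundary, and a point agreeing with its neighborhood majority is likely correctly labeled. The function name, toy data, and majority-vote tie-breaking below are illustrative assumptions, not the method from the cited paper.

```python
import numpy as np

def select_boundary_patterns(X, y, k=2):
    """Hypothetical sketch: keep points whose k nearest neighbors carry
    mixed labels (the point lies near the decision boundary) and whose own
    label agrees with the neighborhood majority (probably not noise)."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]                # k nearest, skipping the point itself
        labels = y[nn]
        mixed = len(np.unique(labels)) > 1         # neighborhood straddles both classes
        majority = np.sign(labels.sum()) or y[i]   # tie broken in the point's favor
        if mixed and y[i] == majority:
            keep.append(i)
    return np.array(keep, dtype=int)

# Two 1-D clusters on a line: only the two points flanking the gap survive.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])
selected = select_boundary_patterns(X, y, k=2)
print(selected)   # → [2 3]
```

Training an SVM on only the selected subset can be much faster, since the points far from the boundary would not have become support vectors anyway.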

Pattern recognition with novel support vector machine learning method

- Computer Science, Mathematics
- 2000 10th European Signal Processing Conference
- 2000

This study investigates the basic SVM method and points out some problems that may arise especially in large scale problems with abundant data, and proposes a novel SVM type method that aims to avoid the problems found in the basic method.

New support vector algorithms with parametric insensitive/margin model

- Mathematics, Medicine
- Neural Networks
- 2010

In this paper, a modification of v-support vector machines (v-SVM) for regression and classification is described, and the use of a parametric insensitive/margin model with an arbitrary shape is…

Training Data Selection for Support Vector Machines

- Computer Science
- ICNC
- 2005

This paper proposes two new methods that select a subset of data for SVM training and shows that a significant amount of training data can be removed by the proposed methods without degrading the performance of the resulting SVM classifiers.

Fast Pattern Selection for Support Vector Classifiers

- Computer Science
- PAKDD
- 2003

A k-nearest neighbors (k-NN) based pattern selection method that tries to select the patterns that are near the decision boundary and that are correctly labeled, in order to reduce training time and avoid redundant SVs.

Perceptron-like large margin classifiers

- Computer Science, Mathematics
- 2005

As the data are embedded in the augmented space at a larger distance from the origin the maximum margin in that space approaches the maximum geometric one in the original space, and the algorithmic procedure could be regarded as an approximate maximal margin classifier.

Selecting Data for Fast Support Vector Machines Training

- Computer Science
- Trends in Neural Computation
- 2007

This paper proposes two new methods that select a subset of data for SVM training and shows that a significant amount of training data can be removed by the proposed methods without degrading the performance of the resulting SVM classifiers.

On the proliferation of support vectors in high dimensions

- Computer Science, Mathematics
- AISTATS
- 2021

This paper identifies new deterministic equivalences for this phenomenon of support vector proliferation, uses them to substantially broaden the conditions under which the phenomenon occurs in high-dimensional settings, and proves a nearly matching converse result.

#### References

Showing 1–10 of 41 references

Structural Risk Minimization for Character Recognition

- Mathematics, Computer Science
- NIPS
- 1991

The method of Structural Risk Minimization is used to control the capacity of linear classifiers and improve generalization on the problem of handwritten digit recognition.

Computer aided cleaning of large databases for character recognition

- Computer Science
- Proceedings., 11th IAPR International Conference on Pattern Recognition. Vol.II. Conference B: Pattern Recognition Methodology and Systems
- 1992

By using the method of pattern cleaning, combined with an emphasizing scheme applied on the patterns that are hard to learn, the error rate on the test set has been reduced by half, in the case of the database of handwritten lowercase characters entered on a touch terminal.

Comparing different neural network architectures for classifying handwritten digits

- Computer Science
- International 1989 Joint Conference on Neural Networks
- 1989

The authors propose a novel way of organizing the network architectures by training several small networks so as to deal separately with subsets of the problem, and then combining the results.

Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks

- Computer Science, Medicine
- Science
- 1990

A theory is reported that shows the equivalence between regularization and a class of three-layer networks called regularization networks or hyper basis functions.

Consistent inference of probabilities in layered networks: predictions and generalizations

- Mathematics
- International 1989 Joint Conference on Neural Networks
- 1989

The problem of learning a general input-output relation using a layered neural network is discussed in a statistical framework. By imposing the consistency condition that the error minimization be…

Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network

- Mathematics, Computer Science
- NIPS
- 1991

A scheme is implemented that allows a network to learn the derivative of its outputs with respect to distortion operators of their choosing, which not only reduces the learning time and the amount of training data, but also provides a powerful language for specifying what generalizations the authors wish the network to perform.

What Size Net Gives Valid Generalization?

- Mathematics, Computer Science
- Neural Computation
- 1989

It is shown that if m ≥ O((W/ε) log(N/ε)) random examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction 1 − ε/2 of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction 1 − ε of future test examples drawn from the same distribution.

Predicting {0,1}-functions on randomly drawn points

- Computer Science, Mathematics
- COLT '88
- 1988

This model is related to Valiant's PAC learning model, but does not require the hypotheses used for prediction to be represented in any specified form, and shows how to construct prediction strategies that are optimal to within a constant factor for any reasonable class F of target functions.

Neural Networks and the Bias/Variance Dilemma

- Computer Science
- Neural Computation
- 1992

It is suggested that current-generation feedforward neural networks are largely inadequate for difficult problems in machine perception and machine learning, regardless of parallel-versus-serial hardware or other implementation issues.

Fast Learning in Networks of Locally-Tuned Processing Units

- Computer Science
- Neural Computation
- 1989

We propose a network architecture which uses a single internal layer of locally-tuned processing units to learn both classification tasks and real-valued function approximations (Moody and Darken…