
Paper: ImageNet Classification with Deep Convolutional Neural Networks

We trained a large, deep convolutional neural network to classify the 1.3 million high-resolution images in the LSVRC-2010 ImageNet training set into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 39.7% and 18.9%, respectively, which is considerably better than the previous state-of-the-art results. The neural network, which has 60 million parameters and 500,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and two globally connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of convolutional nets. To reduce overfitting in the globally connected layers, we employed a new regularization method that proved to be very effective.
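The top-1 and top-5 error rates quoted above count an example as wrong when its true class is not among the 1 or 5 highest-scoring predictions. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `top_k_error` and the toy scores are assumptions made for illustration, and the toy example uses 5 classes and k = 2 so the numbers stay readable.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 4 examples over 5 classes (made-up numbers).
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.10],  # true class 1: ranked first
    [0.30, 0.10, 0.40, 0.10, 0.10],  # true class 0: ranked second
    [0.05, 0.05, 0.10, 0.20, 0.60],  # true class 2: ranked third
    [0.60, 0.10, 0.10, 0.10, 0.10],  # true class 0: ranked first
]
labels = [1, 0, 2, 0]

top1 = top_k_error(scores, labels, k=1)  # 2 of 4 examples missed
top2 = top_k_error(scores, labels, k=2)  # 1 of 4 examples missed
print(top1, top2)
```

Increasing k can only lower the error, which is why a top-5 error rate is always at most the top-1 error rate on the same predictions.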


Paper: Learning Semantic Relationships for Better Action Retrieval in Images

Human actions capture a wide variety of interactions between people and objects. As a result, the set of possible actions is extremely large and it is difficult to obtain sufficient training examples for all actions. However, we could compensate for this sparsity in supervision by leveraging the rich semantic relationship between different actions. A single action is often composed of other smaller actions and is exclusive of certain others. We need a method which can reason about such relationships and extrapolate unobserved actions from known actions. Hence, we propose a novel neural network framework which jointly extracts the relationship between actions and uses them for training better action retrieval models. Our model incorporates linguistic, visual and logical consistency based cues to effectively identify these relationships. We train and test our model on a large-scale image dataset of human actions. We show a significant improvement in mean AP compared to different baseline methods including the HEX-graph approach from Deng et al.
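The abstract reports results as mean AP (mean average precision), a standard ranking metric for retrieval. As a rough illustration of how that number is typically computed, and not the authors' evaluation protocol, the sketch below averages precision at each rank where a relevant image appears. The function names and the toy rankings are assumptions, and this variant assumes every relevant item appears somewhere in the ranked list.

```python
def average_precision(ranked_relevance):
    """AP for one query; ranked_relevance[i] is True if the item at rank i + 1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query AP values."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical rankings for two action queries (made-up relevance flags).
queries = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2
    [False, True, False, False],  # AP = 1/2
]
mean_ap = mean_average_precision(queries)
print(round(mean_ap, 4))
```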

Paper: What is the Best Multi-Stage Architecture for Object Recognition?

In many recent object recognition systems, feature extraction stages are generally composed of a filter bank, a non-linear transformation, and some sort of feature pooling layer. Most systems use only one stage of feature extraction in which the filters are hard-wired, or two stages where the filters in one or both stages are learned in supervised or unsupervised mode. This paper addresses three questions: 1. How do the non-linearities that follow the filter banks influence the recognition accuracy? 2. Does learning the filter banks in an unsupervised or supervised manner improve the performance over random filters or hard-wired filters? 3. Is there any advantage to using an architecture with two stages of feature extraction, rather than one? We show that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks. We show that two stages of feature extraction yield better accuracy than one. Most surprisingly, we show that a two-stage system with random filters can yield almost 63% recognition rate on Caltech-101, provided that the proper non-linearities and pooling layers are used.
Finally, we show that with supervised refinement, the system achieves state-of-the-art performance on the NORB dataset (5.6%), and that unsupervised pre-training followed by supervised refinement produces good accuracy on Caltech-101 (>65%) and the lowest known error rate on the undistorted, unprocessed MNIST dataset (0.53%).
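The stage structure described in this abstract (filter bank, then non-linearity, then pooling) can be sketched in a few lines. The following is a simplified illustration in plain Python, not the paper's implementation: it uses random filters, absolute-value rectification, and non-overlapping max pooling, and it omits local contrast normalization entirely. All function names, the 8x8 input, and the filter sizes are assumptions chosen for the example.

```python
import random


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def rectify(fmap):
    """Absolute-value rectification of a feature map."""
    return [[abs(v) for v in row] for row in fmap]


def max_pool(fmap, size):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def feature_stage(image, filters, pool_size):
    """One stage: filter bank -> rectification -> pooling, one output map per filter."""
    return [max_pool(rectify(correlate2d_valid(image, f)), pool_size)
            for f in filters]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[rng.random() for _ in range(8)] for _ in range(8)]
# Four random 3x3 filters (no learning), standing in for a filter bank.
filters = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]

features = feature_stage(image, filters, pool_size=2)
print(len(features), len(features[0]), len(features[0][0]))
```

A two-stage system in this sketch would feed the output maps of one `feature_stage` call into a second bank of filters; how maps are combined across stages is a design choice this example does not cover.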

CVPR 2015: Large Scale Visual Recognition Challenge Tutorial

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a competition in object classification across hundreds of object categories and millions of images. The challenge has been held annually since 2010, with more than 50 institutions taking part.

A tutorial intended to train those who wish to take part in the challenge was held on 7 June 2015. The presentations given as part of the tutorial are listed below.
