
Article: Pylearn2: a machine learning research library

Pylearn2 is a machine learning research library. This does not just mean that it is a collection of machine learning algorithms that share a common API; it means that it has been designed for flexibility and extensibility in order to facilitate research projects that involve new or unusual use cases. In this paper we give a brief history of the library, an overview of its basic philosophy, a summary of the library’s architecture, and a description of how the Pylearn2 community functions socially.

Article: Object Detectors Emerge in Deep Scene CNNs

With the success of new computational architectures for visual processing, such as convolutional neural networks (CNN), and access to image databases with millions of labeled examples (e.g., ImageNet, Places), the state of the art in computer vision is advancing rapidly. One important factor for continued progress is to understand the representations that are learned by the inner layers of these deep architectures. Here we show that object detectors emerge from training CNNs to perform scene classification. As scenes are composed of objects, the CNN for scene classification automatically discovers meaningful object detectors, representative of the learned scene categories. With object detectors emerging as a result of learning to recognize scenes, our work demonstrates that the same network can perform both scene recognition and object localization in a single forward pass, without ever having been explicitly taught the notion of objects.

Questions and Answers on NVIDIA DIGITS and Deep Learning

Below are the answers given to participants' written questions during the online course NVIDIA held on DIGITS (August 12, 2015). For more information about the course, click here.

Q: I own a Titan X. I read somewhere that its single-precision performance (FP32) is 7 TFLOPS and double-precision performance (FP64) is only 1.3 TFLOPS. Do the frameworks discussed here all use single-precision by default? If not, how can they be configured for best performance?
A: By default, all the frameworks use single precision floating point.
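The FP32/FP64 performance gap mentioned in the question can be illustrated outside any deep learning framework. The following is a hypothetical NumPy sketch (not DIGITS or framework code) that times a matrix multiply at both precisions, assuming only NumPy is available:

```python
import time

import numpy as np


def matmul_time(dtype, n=512, reps=5):
    """Time an n-by-n matrix multiply at the given precision.

    Returns the average seconds per multiply and the result dtype.
    """
    a = np.random.rand(n, n).astype(dtype)
    b = np.random.rand(n, n).astype(dtype)
    start = time.perf_counter()
    for _ in range(reps):
        c = a @ b  # result stays in the input precision
    return (time.perf_counter() - start) / reps, c.dtype


if __name__ == "__main__":
    t32, d32 = matmul_time(np.float32)
    t64, d64 = matmul_time(np.float64)
    print(f"float32: {t32:.4f}s per matmul ({d32})")
    print(f"float64: {t64:.4f}s per matmul ({d64})")
```

On a GPU such as the Titan X the FP32/FP64 ratio is far larger than on a CPU, which is why training frameworks default to single precision.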

Q: How is the number of GPUs set in DIGITS?
A: The number of GPUs to use is set on the Train Model page.
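DIGITS exposes this choice in its web UI. Outside the UI, one common way to control which GPUs the underlying CUDA-based frameworks can see is the standard `CUDA_VISIBLE_DEVICES` environment variable, set before the framework initializes. A minimal sketch (the GPU indices `0,1` are placeholders for your own device IDs):

```python
import os

# Restrict which GPUs CUDA applications can see. This must be set
# before the deep learning framework initializes CUDA; the IDs
# "0,1" are placeholder device indices.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

print(os.environ["CUDA_VISIBLE_DEVICES"])
```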

Q: Will the model we make on DIGITS work on NVIDIA's fork of Caffe, or will it work with vanilla Caffe too?
A: It will work in the main branch of Caffe. NVIDIA's fork uses the same formats and layer types.

Q: Can DIGITS work on a cluster? I have two GPUs on different machines. If I create a cluster out of them, can DIGITS utilize the two GPUs?
A: Yes, DIGITS can utilize two GPUs. Recall that DIGITS is built on top of third-party frameworks, so provided those frameworks can use two GPUs, DIGITS can also.

Q: Is it possible to train on voice data using NVIDIA DIGITS?
A: Currently DIGITS is designed for training on images, but we would like to add support for speech/voice in the future.


The U.S. Air Force Has Issued a Solicitation for the Use of Deep Learning and Natural Language Processing in Image Analysis

After seeing the revolutionary successes in the field of deep learning, the U.S. Air Force issued a solicitation for the use of deep learning and natural language processing in image analysis. The details of the solicitation are available at the address below.

https://www.fbo.gov/?s=opportunity&mode=form&id=fd107a8fbda12f4fc4ec55a713232436&tab=core&_cview=0

In summary: “Approaches should consider applying and extending recent advances in deep learning, such as convolutional neural networks with localized object detection and classification, scene understanding/image captioning, text analytics using symbolic statistical inference, natural language processing for intermediate metadata tagging of text, context generation and semantics, and recommendation systems.”

In fact, the capabilities requested in the solicitation are detailed in the paper “Deep Visual-Semantic Alignments for Generating Image Descriptions” by Andrej Karpathy and Li Fei-Fei of Stanford University.