Dissertations / Theses on the topic 'Visual and performative learning'

Consult the top 50 dissertations / theses for your research on the topic 'Visual and performative learning.'

You can also download the full text of each academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Zhu, Fan. "Visual feature learning." Thesis, University of Sheffield, 2015. http://etheses.whiterose.ac.uk/8218/.

Full text
Abstract:
Categorization is a fundamental problem in many computer vision applications, e.g., image classification, pedestrian detection and face recognition. The robustness of a categorization system relies heavily on the quality of the features by which the data are represented. Prior art in feature extraction can be grouped into different levels which, in bottom-up order, are low-level features (e.g., pixels and gradients) and middle/high-level features (e.g., the BoW model and sparse coding). Low-level features can be extracted directly from images or videos, while middle/high-level features are constructed upon low-level features and are designed to enhance the capability of categorization systems based on different considerations (e.g., guaranteeing domain invariance and improving discriminative power). This thesis focuses on the study of visual feature learning. Challenges that remain in designing visual features lie in intra-class variation, occlusions, illumination and viewpoint changes, and insufficient prior knowledge. To address these challenges, I present several visual feature learning methods covering the following sub-topics: (i) I start by introducing a segmentation-based object recognition system. (ii) When training data are insufficient, I seek data from other resources, which include images or videos in a different domain, actions captured from a different viewpoint and information in a different media form. In order to appropriately transfer such resources into the target categorization system, four transfer learning-based feature learning methods are presented, where cross-view, cross-domain and cross-modality scenarios are addressed accordingly. (iii) Finally, I present a random-forest-based feature fusion method for multi-view action recognition.
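
For readers unfamiliar with the middle-level features mentioned in this abstract, the sketch below illustrates a minimal bag-of-visual-words (BoW) encoding in Python. It is a generic illustration rather than code from the thesis, and for brevity the codebook is formed from sampled descriptors instead of k-means centroids.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Bag-of-visual-words encoding of one image.

    descriptors : (n_local, d) local features (e.g., SIFT) from the image.
    codebook    : (n_words, d) visual vocabulary (here just sampled
                  descriptors; in practice it comes from k-means).
    Returns an L1-normalised histogram of word assignments, one example of
    the "middle-level" features the abstract contrasts with raw pixels.
    """
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                       # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
descs = rng.normal(size=(500, 128))                 # 500 local descriptors
codebook = descs[rng.choice(500, size=64, replace=False)]
print(bow_histogram(descs, codebook).shape)         # (64,)
```
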
APA, Harvard, Vancouver, ISO, and other styles
2

Huang, Wang. "Visual Sensation and Performative Cultural Politics: Chinese Literary Text Messages and the Colors of Texts." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1275499580.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Goh, Hanlin. "Learning deep visual representations." Paris 6, 2013. http://www.theses.fr/2013PA066356.

Full text
Abstract:
Recent advancements in the areas of deep learning and visual information processing have presented an opportunity to unite both fields. These complementary fields combine to tackle the problem of classifying images into their semantic categories. Deep learning brings learning and representational capabilities to a visual processing model that is adapted for image classification. This thesis addresses problems that lead to the proposal of learning deep visual representations for image classification. The problem of deep learning is tackled on two fronts. The first aspect is the problem of unsupervised learning of latent representations from input data. The main focus is the integration of prior knowledge into the learning of restricted Boltzmann machines (RBM) through regularization. Regularizers are proposed to induce sparsity, selectivity and topographic organization in the coding to improve discrimination and invariance. The second direction introduces the notion of gradually transiting from unsupervised layer-wise learning to supervised deep learning. This is done through the integration of bottom-up information with top-down signals. Two novel implementations supporting this notion are explored. The first method uses top-down regularization to train a deep network of RBMs. The second method combines predictive and reconstructive loss functions to optimize a stack of encoder-decoder networks. The proposed deep learning techniques are applied to tackle the image classification problem. The bag-of-words model is adopted due to its strengths in image modeling through the use of local image descriptors and spatial pooling schemes. Deep learning with spatial aggregation is used to learn a hierarchical visual dictionary for encoding the image descriptors into mid-level representations. This method achieves leading image classification performances for object and scene images. The learned dictionaries are diverse and non-redundant. The speed of inference is also high. From this, a further optimization is performed for the subsequent pooling step. This is done by introducing a differentiable pooling parameterization and applying the error backpropagation algorithm. This thesis represents one of the first attempts to synthesize deep learning and the bag-of-words model. This union results in many challenging research problems, leaving much room for further study in this area
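
The abstract's final contribution, a differentiable pooling parameterization, can be illustrated with a generic smooth pooling function whose sharpness parameter interpolates between average and max pooling. This is only a sketch of the idea in Python; the exact parameterization used in the thesis may differ.

```python
import numpy as np

def smooth_pool(codes, beta):
    """Differentiable pooling over a set of local codes.

    codes : (n_regions, n_words) array of mid-level activations.
    beta  : sharpness parameter; beta -> 0 approaches average pooling,
            large beta approaches max pooling.
    Returns a (n_words,) pooled vector.
    """
    # The log-sum-exp form keeps the operation smooth, so gradients with
    # respect to both `codes` and `beta` exist and backpropagation applies.
    scaled = beta * codes
    m = scaled.max(axis=0)                      # for numerical stability
    lse = m + np.log(np.exp(scaled - m).mean(axis=0))
    return lse / beta

rng = np.random.default_rng(0)
codes = rng.random((100, 16))                   # 100 local codes, 16 dictionary words
print(smooth_pool(codes, 0.01))                 # close to mean pooling
print(smooth_pool(codes, 100.0))                # close to max pooling
```
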
APA, Harvard, Vancouver, ISO, and other styles
4

Walker, Catherine Livesay. "Visual learning through Hypermedia." CSUSB ScholarWorks, 1996. https://scholarworks.lib.csusb.edu/etd-project/1148.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Owens, Andrew (Andrew Hale). "Learning visual models from paired audio-visual examples." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/107352.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 93-104).
From the clink of a mug placed onto a saucer to the bustle of a busy café, our days are filled with visual experiences that are accompanied by distinctive sounds. In this thesis, we show that these sounds can provide a rich training signal for learning visual models. First, we propose the task of predicting the sound that an object makes when struck as a way of studying physical interactions within a visual scene. We demonstrate this idea by training an algorithm to produce plausible soundtracks for videos in which people hit and scratch objects with a drumstick. Then, with human studies and automated evaluations on recognition tasks, we verify that the sounds produced by the algorithm convey information about actions and material properties. Second, we show that ambient audio - e.g., crashing waves, people speaking in a crowd - can also be used to learn visual models. We train a convolutional neural network to predict a statistical summary of the sounds that occur within a scene, and we demonstrate that the visual representation learned by the model conveys information about objects and scenes.
by Andrew Owens.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
6

Gontovnik, Monica. "Another Way of Being: The Performative Practices of Contemporary Female Colombian Artists." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1420473106.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Peyre, Julia. "Learning to detect visual relations." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE016.

Full text
Abstract:
In this thesis, we study the problem of detection of visual relations of the form (subject, predicate, object) in images, which are intermediate level semantic units between objects and complex scenes. Our work addresses two main challenges in visual relation detection: (1) the difficulty of obtaining box-level annotations to train fully-supervised models, (2) the variability of appearance of visual relations. We first propose a weakly-supervised approach which, given pre-trained object detectors, enables us to learn relation detectors using image-level labels only, maintaining a performance close to fully-supervised models. Second, we propose a model that combines different granularities of embeddings (for subject, object, predicate and triplet) to better model appearance variation and introduce an analogical reasoning module to generalize to unseen triplets. Experimental results demonstrate the improvement of our hybrid model over a purely compositional model and validate the benefits of our transfer by analogy to retrieve unseen triplets
APA, Harvard, Vancouver, ISO, and other styles
8

Tang-Wright, Kimmy. "Visual topography and perceptual learning in the primate visual system." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:388b9658-dceb-443a-a19b-c960af162819.

Full text
Abstract:
The primate visual system is organised and wired in a topological manner. From the eye well into extrastriate visual cortex, a preserved spatial representation of the visual world is maintained across many levels of processing. Diffusion-weighted imaging (DWI), together with probabilistic tractography, is a non-invasive technique for mapping connectivity within the brain. In this thesis I probed the sensitivity and accuracy of DWI and probabilistic tractography by quantifying its capacity to detect topological connectivity in the post mortem macaque brain, between the lateral geniculate nucleus (LGN) and primary visual cortex (V1). The results were validated against electrophysiological and histological data from previous studies. Using the methodology developed in this thesis, it was possible to segment the LGN reliably into distinct subregions based on its structural connectivity to different parts of the visual field represented in V1. Quantitative differences in connectivity from magno- and parvocellular subcomponents of the LGN to different parts of V1 could be replicated with this method in post mortem brains. The topological corticocortical connectivity between extrastriate visual area V5/MT and V1 could also be mapped in the post mortem macaque. In vivo DWI scans previously obtained from the same brains have lower resolution and signal-to-noise because of the shorter scan times. Nevertheless, in many cases, these yielded topological maps similar to the post mortem maps. These results indicate that the preserved topology of connection between LGN and V1, and between V5/MT and V1, can be revealed using non-invasive measures of diffusion-weighted imaging and tractography in vivo. In a preliminary investigation using Human Connectome data obtained in vivo, I was not able to segment the retinotopic map in LGN based on connections to V1. This may be because information about the topological connectivity is not carried in the much lower resolution human diffusion data, or because of other methodological limitations. I also investigated the mechanisms of perceptual learning by developing a novel task-irrelevant perceptual learning paradigm designed to adapt neuronal elements early on in visual processing in a certain region of the visual field. There is evidence, although not clear-cut, to suggest that the paradigm elicits task-irrelevant perceptual learning, but that these effects only emerge when practice-related effects are accounted for. When orientation- and location-specific effects on perceptual performance are examined, the largest improvement occurs at the trained location; however, there is also significant improvement at one other 'untrained' location, and there is also a significant improvement in performance for a control group that did not receive any training at any location. The work highlights inherent difficulties in investigating perceptual learning, which relate to the fact that learning likely takes place at both lower and higher levels of processing; however, the paradigm provides a good starting point for comprehensively investigating the complex mechanisms underlying perceptual learning.
APA, Harvard, Vancouver, ISO, and other styles
9

Shi, Xiaojin. "Visual learning from small training datasets." Diss., Digital Dissertations Database. Restricted to UC campuses, 2005. http://uclibs.org/PID/11984.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Liu, Jingen. "Learning Semantic Features for Visual Recognition." Doctoral diss., University of Central Florida, 2009. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/3358.

Full text
Abstract:
Visual recognition (e.g., object, scene and action recognition) is an active area of research in computer vision due to its increasing number of real-world applications, such as video (image) indexing and search, intelligent surveillance, human-machine interaction, robot navigation, etc. Effective modeling of objects, scenes and actions is critical for visual recognition. Recently, the bag of visual words (BoVW) representation, in which image patches or video cuboids are quantized into visual words (i.e., mid-level features) based on their appearance similarity using clustering, has been widely and successfully explored. The advantages of this representation are: no explicit detection of objects or object parts and their tracking are required; the representation is somewhat tolerant to within-class deformations; and it is efficient for matching. However, the performance of the BoVW is sensitive to the size of the visual vocabulary. Therefore, computationally expensive cross-validation is needed to find the appropriate quantization granularity. This limitation is partially due to the fact that the visual words are not semantically meaningful, which limits the effectiveness and compactness of the representation. To overcome these shortcomings, in this thesis we present a principled approach to learn a semantic vocabulary (i.e., high-level features) from a large amount of visual words (mid-level features). In this context, the thesis makes two major contributions. First, we have developed an algorithm to discover a compact yet discriminative semantic vocabulary. This vocabulary is obtained by grouping the visual words, based on their distribution in videos (images), into visual-word clusters. The mutual information (MI) between the clusters and the videos (images) depicts the discriminative power of the semantic vocabulary, while the MI between visual words and visual-word clusters measures the compactness of the vocabulary. We apply the information bottleneck (IB) algorithm to find the optimal number of visual-word clusters by finding a good tradeoff between compactness and discriminative power. We tested our proposed approach on the state-of-the-art KTH dataset, and obtained an average accuracy of 94.2%. However, this approach performs one-sided clustering, because only visual words are clustered regardless of which video they appear in. In order to leverage the co-occurrence of visual words and images, we have developed a co-clustering algorithm to simultaneously group the visual words and images. We tested our approach on the publicly available fifteen-scene dataset and obtained about a 4% increase in average accuracy compared to one-sided clustering approaches. Second, instead of grouping the mid-level features, we first embed the features into a low-dimensional semantic space by manifold learning, and then perform the clustering. We apply Diffusion Maps (DM) to capture the local geometric structure of the mid-level feature space. The DM embedding is able to preserve the explicitly defined diffusion distance, which reflects the semantic similarity between any two features. Furthermore, DM provides multi-scale analysis capability by adjusting the time steps in the Markov transition matrix. Experiments on the KTH dataset show that DM performs much better (about 3% to 6% improvement in average accuracy) than other manifold learning approaches and the IB method. The above methods use only a single type of feature. In order to combine multiple heterogeneous features for visual recognition, we further propose the Fiedler embedding to capture the complicated semantic relationships between all entities (i.e., videos, images, heterogeneous features). The discovered relationships are then employed to further increase the recognition rate. We tested our approach on the Weizmann dataset, and achieved about 17% to 21% improvement in average accuracy.
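
A minimal Python sketch of the Diffusion Maps embedding mentioned above is given below. It follows the standard construction (Gaussian affinities, Markov normalization, leading eigenvectors scaled by eigenvalues raised to the time step) and is not the thesis implementation.

```python
import numpy as np

def diffusion_map(X, sigma=1.0, n_dims=2, t=1):
    """Embed mid-level features X (n_samples, n_features) with Diffusion Maps.

    Gaussian affinities are row-normalised into a Markov matrix whose leading
    non-trivial eigenvectors, scaled by eigenvalues**t, give coordinates in
    which Euclidean distance approximates the diffusion distance.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.exp(-d2 / (2 * sigma ** 2))                    # affinity matrix
    P = W / W.sum(axis=1, keepdims=True)                  # Markov transition matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)                        # sort by eigenvalue
    vals, vecs = vals.real[order], vecs.real[:, order]
    # skip the trivial constant eigenvector (eigenvalue 1)
    return vecs[:, 1:n_dims + 1] * (vals[1:n_dims + 1] ** t)

X = np.random.default_rng(0).normal(size=(50, 8))
emb = diffusion_map(X, sigma=2.0, n_dims=3, t=2)
print(emb.shape)   # (50, 3)
```
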
Ph.D.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Science PhD
APA, Harvard, Vancouver, ISO, and other styles
11

Beale, Dan. "Autonomous visual learning for robotic systems." Thesis, University of Bath, 2012. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.558886.

Full text
Abstract:
This thesis investigates the problem of visual learning using a robotic platform. Given a set of objects, the robot's task is to autonomously manipulate, observe, and learn. This allows the robot to recognise objects in a novel scene and pose, or to separate them into distinct visual categories. The main focus of the work is autonomously acquiring object models using robotic manipulation. Autonomous learning is important for robotic systems. In the context of vision, it allows a robot to adapt to new and uncertain environments, updating its internal model of the world. It also reduces the amount of human supervision needed for building visual models. This leads to machines which can operate in environments with rich and complicated visual information, such as the home or industrial workspace, as well as in environments which are potentially hazardous for humans. The hypothesis claims that inducing robot motion on objects aids the learning process. It is shown that extra information from the robot sensors provides enough information to localise an object and distinguish it from the background, and that decisive planning allows the object to be separated and observed from a variety of different poses, giving a good foundation for building a robust classification model. Contributions include a new segmentation algorithm, a new classification model for object learning, and a method for allowing a robot to supervise its own learning in cluttered and dynamic environments.
APA, Harvard, Vancouver, ISO, and other styles
12

Lakshmi, Ratan Aparna. "Learning visual concepts for image classification." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80092.

Full text
Abstract:
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.
Includes bibliographical references (leaves 166-174).
by Aparna Lakshmi Ratan.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
13

Moghaddam, Baback 1963. "Probabilistic visual learning for object detection." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/10242.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.
Includes bibliographical references (leaves 78-82).
by Baback Moghaddam.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
14

Wilson, Andrew David. "Learning visual behavior for gesture analysis." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/62924.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Zhou, Bolei. "Interpretable representation learning for visual intelligence." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress of deep neural networks in computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite the superior performance of the deep neural networks, it remains challenging to understand their inner workings and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network dissection approach is further introduced to automatically identify the internal units as the emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image. It sheds light on the decision-making process of the networks and why the predictions succeed or fail. Finally, I show some ongoing efforts toward learning efficient and interpretable deep representations for video event understanding and some future directions.
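
As a rough illustration of how a prediction can be explained by a network's internal activations, the sketch below computes a class-activation-style heat map from the last convolutional layer of a global-average-pooling network. It is a simplified, hypothetical example, not the specific method developed in the thesis.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Class-activation-style explanation of a prediction.

    feature_maps  : (C, H, W) activations of the last convolutional layer.
    class_weights : (C,) weights of the linear classifier for the predicted
                    class (a global-average-pooling architecture is assumed).
    The weighted sum highlights which image regions drove the prediction.
    """
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))  # (H, W)
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)        # normalise to [0, 1] for visualisation

feats = np.random.default_rng(0).random((512, 7, 7))
w = np.random.default_rng(1).random(512)
print(class_activation_map(feats, w).shape)   # (7, 7)
```
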
by Bolei Zhou.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
16

Pillai, Sudeep. "Learning articulated motions from visual demonstration." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/89861.

Full text
Abstract:
Thesis: S.M. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 94-98).
Robots operating autonomously in household environments must be capable of interacting with articulated objects on a daily basis. They should be able to infer each object's underlying kinematic linkages purely by observing its motion during manipulation. This work proposes a framework that enables robots to learn the articulation in objects from user-provided demonstrations, using RGB-D sensors. We introduce algorithms that combine concepts in sparse feature tracking, motion segmentation, object pose estimation, and articulation learning, to develop our proposed framework. Additionally, our methods can predict the motion of previously seen articulated objects in future encounters. We present experiments that demonstrate the ability of our method, given RGB-D data, to identify, analyze and predict the articulation of a number of everyday objects within a human-occupied environment.
by Sudeep Pillai.
S.M. in Computer Science and Engineering
APA, Harvard, Vancouver, ISO, and other styles
17

Williams, Oliver Michael Christian. "Bayesian learning for efficient visual inference." Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613979.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

North, Ben. "Learning dynamical models for visual tracking." Thesis, University of Oxford, 1998. http://ora.ox.ac.uk/objects/uuid:6ed12552-4c30-4d80-88ef-7245be2d8fb8.

Full text
Abstract:
Using some form of dynamical model in a visual tracking system is a well-known method for increasing robustness and indeed performance in general. Often, quite simple models are used and can be effective, but prior knowledge of the likely motion of the tracking target can often be exploited by using a specially-tailored model. Specifying such a model by hand, while possible, is a time-consuming and error-prone process. Much more desirable is for an automated system to learn a model from training data. A dynamical model learnt in this manner can also be a source of useful information in its own right, and a set of dynamical models can provide discriminatory power for use in classification problems. Methods exist to perform such learning, but are limited in that they assume the availability of 'ground truth' data. In a visual tracking system, this is rarely the case. A learning system must work from visual data alone, and this thesis develops methods for learning dynamical models while explicitly taking account of the nature of the training data --- they are noisy measurements. The algorithms are developed within two tracking frameworks. The Kalman filter is a simple and fast approach, applicable where the visual clutter is limited. The recently-developed Condensation algorithm is capable of tracking in more demanding situations, and can also employ a wider range of dynamical models than the Kalman filter, for instance multi-mode models. The success of the learning algorithms is demonstrated experimentally. When using a Kalman filter, the dynamical models learnt using the algorithms presented here produce better tracking when compared with those learnt using current methods. Learning directly from training data gathered using Condensation is an entirely new technique, and experiments show that many aspects of a multi-mode system can be successfully identified using very little prior information. Significant computational effort is required by the implementation of the methods, and there is scope for improvement in this regard. Other possibilities for future work include investigation of the strong links this work has with learning problems in other areas. Most notable is the study of the 'graphical models' commonly used in expert systems, where the ideas presented here promise to give insight and perhaps lead to new techniques.
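
For context, the sketch below shows the simplest dynamical-model tracker discussed above, a constant-velocity Kalman filter with its parameters fixed by hand. The thesis's contribution is to learn such parameters (the transition and noise models) from noisy visual measurements, which this toy example does not attempt.

```python
import numpy as np

# Minimal constant-velocity Kalman filter for a 1-D tracked coordinate.
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])     # state transition (position, velocity)
H = np.array([[1.0, 0.0]])                # we only measure position
Q = 0.01 * np.eye(2)                      # process noise covariance
R = np.array([[0.25]])                    # measurement noise covariance

x = np.zeros((2, 1))                      # state estimate
P = np.eye(2)                             # state covariance

for z in [0.9, 2.1, 2.8, 4.2, 5.0]:       # noisy position measurements
    # predict
    x = A @ x
    P = A @ P @ A.T + Q
    # update
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + K @ (np.array([[z]]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    print(f"position={x[0, 0]:.2f}  velocity={x[1, 0]:.2f}")
```
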
APA, Harvard, Vancouver, ISO, and other styles
19

Florence, Peter R. (Peter Raymond). "Dense visual learning for robot manipulation." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/128398.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2020
Cataloged from student-submitted PDF of thesis.
Includes bibliographical references (pages 115-127).
We would like to have highly useful robots which can richly perceive their world, semantically distinguish its fine details, and physically interact with it sufficiently for useful robotic manipulation. This is hard to achieve with previous methods: prior work has not equipped robots with the scalable ability to understand the dense visual state of their varied environments. The limitations have both been in the state representations used, and how to acquire them without significant human labeling effort. In this thesis we present work that leverages self-supervision, particularly via a mix of geometrical computer vision, deep visual learning, and robotic systems, to scalably produce dense visual inferences of the world state. These methods either enable robots to teach themselves dense visual models without human supervision, or they act as a large multiplying factor on the value of information provided by humans. Specifically, we develop a pipeline for providing ground truth labels of visual data in cluttered and multi-object scenes, we introduce the novel application of dense visual object descriptors to robotic manipulation and provide a fully robot-supervised pipeline to acquire them, and we leverage this dense visual understanding to efficiently learn new manipulation skills through imitation. With real robot hardware we demonstrate contact-rich tasks manipulating household objects, including generalizing across a class of objects, manipulating deformable objects, and manipulating a textureless symmetrical object, all with closed-loop, real-time vision-based manipulation policies.
by Peter R. Florence.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO, and other styles
20

Dey, Priya. "Visual speech in technology-enhanced learning." Thesis, University of Sheffield, 2012. http://etheses.whiterose.ac.uk/3329/.

Full text
Abstract:
This thesis investigates the use of synthetic talking heads, with lip, tongue and face movements synchronized with synthesized or natural speech, in technology-enhanced learning. This work applies talking heads in a speech tutoring application for teaching English as a second language. Previous studies have shown that speech perception is aided by visual information, but more research is needed to determine the effectiveness of visualization of articulators in pronunciation training. This thesis explores whether or not visual speech technology can give an improvement in learning pronunciation. This thesis investigates techniques for audiovisual speech synthesis, using both viseme-based and data-driven approaches to implement multiple talking heads. Intelligibility studies found the audiovisual heads to be more intelligible than audio alone, and the data-driven head was found to be more intelligible than the viseme-driven implementation. The talking heads are applied in a pronunciation-training application, which is evaluated by second-language learners to investigate the benefit of visual speech in technology-enhanced learning. User trials explored the efficacy of the software in demonstrating the /b/–/p/ contrast in English. The results indicate that learners showed an improvement in listening and pronunciation after using the software, while the benefit of visualization compared to auditory training alone varied between individuals. User evaluations found that the talking heads were perceived to be helpful in learning pronunciation, and the positive feedback on the tutoring system suggests that the use of talking heads in technology-enhanced learning could be useful in addition to traditional methods.
APA, Harvard, Vancouver, ISO, and other styles
21

Durand, Thibaut. "Weakly supervised learning for visual recognition." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066142/document.

Full text
Abstract:
This thesis studies the problem of classifying images, where the goal is to predict whether a semantic category is present in the image, based on its visual content. To analyze complex scenes, it is important to learn localized representations. To limit the cost of annotation during training, we have focused on weakly supervised learning approaches. In this thesis, we propose several models that simultaneously classify and localize objects, using only global labels during training. The weak supervision significantly reduces the cost of full annotation, but it makes learning more challenging. The key issue is how to aggregate local scores (e.g., over regions) into a global score (e.g., for the image). The main contribution of this thesis is the design of new pooling functions for weakly supervised learning. In particular, we propose a "max + min" pooling function, which unifies many pooling functions. We describe how to use this pooling in the Latent Structured SVM framework as well as in convolutional networks. To solve the optimization problems, we present several solvers, some of which allow optimizing a ranking metric such as Average Precision. We experimentally demonstrate the advantage of our models over state-of-the-art methods on ten standard image classification datasets, including the large-scale ImageNet dataset.
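
The "max + min" pooling idea can be sketched as follows: per-region class scores are aggregated by combining the highest-scoring regions (positive evidence) with the lowest-scoring ones (negative evidence). This is an illustrative approximation in Python; the unified pooling function proposed in the thesis is more general.

```python
import numpy as np

def max_plus_min_pool(region_scores, k=3, m=3, alpha=1.0):
    """Aggregate per-region class scores into an image-level score.

    Combines the top-k highest-scoring regions (evidence for the class) with
    the m lowest-scoring ones (negative evidence), weighted by alpha.  The
    exact formulation in the thesis may differ.
    """
    s = np.sort(region_scores)[::-1]               # descending region scores
    return s[:k].mean() + alpha * s[-m:].mean()

scores = np.array([0.9, 0.7, 0.1, -0.3, -0.8, 0.2, 0.05, -0.6])
print(max_plus_min_pool(scores, k=2, m=2, alpha=0.7))
```
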
APA, Harvard, Vancouver, ISO, and other styles
22

Nguyen, Duc Minh Chau. "Affordance learning for visual-semantic perception." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2021. https://ro.ecu.edu.au/theses/2443.

Full text
Abstract:
Affordance Learning is linked to the study of interactions between robots and objects, including how robots perceive objects through scene understanding. The concept has long been popular in Psychology and has recently come to influence Computer Vision. In this way, Computer Vision has borrowed the concept of affordance from Psychology in order to develop Visual-Semantic recognition systems and, in particular, to develop the capabilities of robots to interact with objects. However, existing systems of Affordance Learning are still limited to detecting and segmenting object affordances, which is called Affordance Segmentation. Further, these systems are not designed to develop specific abilities to reason about affordances. For example, a Visual-Semantic system for captioning a scene can extract information from an image, such as "a person holds a chocolate bar and eats it", but does not highlight the affordances: "hold" and "eat". Indeed, these affordances and others commonly appear within all aspects of life, since affordances usually connect to actions (from a linguistic view, affordances are generally known as verbs in sentences). Due to the above-mentioned limitations, this thesis aims to develop systems of Affordance Learning for Visual-Semantic Perception. These systems can be built using Deep Learning, which has been empirically shown to be efficient for performing Computer Vision tasks. There are two goals of the thesis: (1) to study the key factors that contribute to the performance of Affordance Segmentation, and (2) to reason about affordances (Affordance Reasoning) based on parts of objects for Visual-Semantic Perception. In terms of the first goal, the thesis mainly investigates the feature extraction module, as this is one of the earliest steps in learning to segment affordances. The thesis finds that the quality of feature extraction from images plays a vital role in improving the performance of Affordance Segmentation. With regard to the second goal, the thesis infers affordances from object parts to reason about part-affordance relationships. Based on this approach, the thesis devises an Object Affordance Reasoning Network that can learn to construct relationships between affordances and object parts. As a result, reasoning about affordance becomes achievable in the generation of scene graphs of affordances and object parts. Empirical results, obtained from extensive experiments, show the potential of the developed system for Affordance Reasoning from Scene Graph Generation.
APA, Harvard, Vancouver, ISO, and other styles
23

Safavi, Seyed Mehdi. "A performative view of knowledge exploitation and exploration : a case study of a higher education merger." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/17957.

Full text
Abstract:
Organizational transformations, such as mergers and acquisitions, disrupt the steady state of organizational daily life. Under some conditions, these kinds of disruptions may actually alter the organizational and occupational structure of everyday work. However, current theories of organizational learning and knowledge governance, such as the so-called ‘knowledge- or capability-based view of the firm’, are inadequate when it comes to the potential number of structural variations inherent in an organizational transformation taking place in non-commercial organizational settings such as higher education institutions. In an exploratory case study of a university merger, this dissertation inductively examines how governance structures in universities impact the creation and exploitation of knowledge, both in core academic activities (research and teaching) and in related and supporting administrative tasks. This setting provides an institutional configuration that differs considerably from that which has informed most previous research on the creation, sharing and exploitation of knowledge, but in which there are prominent institutional locales for the governance of knowledge processes. Taking a practice lens, this study proposes a finer-grained picture of those structural variations by depicting the recursive relationship between changes in knowledge content (ostensive aspects) and knowledge-use practices (performative aspects) in the academic merger. Similarities and differences in relation to knowledge governance in firms are also identified. The findings suggest a classification of the micro-processes by which organizational and competence-based capabilities are recreated, improving our understanding of knowledge-based capabilities (re)creation at different levels of organization and through different stages of merger implementation.
APA, Harvard, Vancouver, ISO, and other styles
24

Doyon, Julien. "Right temporal-lobe contribution to global visual processing and visual-cue learning." Thesis, McGill University, 1988. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=75696.

Full text
Abstract:
This thesis explores the visual functions of the right anterior temporal cortex of the human brain. In Part 1, 92 patients with unilateral temporal- or frontal-lobe excisions and 35 normal control subjects were tested under two experimental conditions (global, local) of a reaction-time task, employing hierarchically structured letters or designs as stimuli. In both versions, the right temporal-lobe group was less affected than other groups by interference from the global aspect of the stimulus. These findings support the hypothesis that the right temporal lobe contributes to global visual processing. In Part 2, the ability to learn a cue-system for discriminating between two targets against a background of visually similar items was examined in 107 patients with unilateral temporal- or frontal-lobe excisions and 37 control subjects, using three versions of a visual-cue learning task. With letters and nonsense syllables, all groups took longer to complete the task when the background information was changed after three learning trials. With abstract designs, only patients with right temporal-lobe lesions failed to show this interference effect after three learning trials, but did so after six. Hence, it is argued that the right temporal lobe plays a role in visual pattern-discrimination learning.
APA, Harvard, Vancouver, ISO, and other styles
25

Gepperth, Alexander Rainer Tassilo. "Neural learning methods for visual object detection." [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=981053998.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Qin, Lei. "Online machine learning methods for visual tracking." Thesis, Troyes, 2014. http://www.theses.fr/2014TROY0017/document.

Full text
Abstract:
We study the challenging problem of tracking an arbitrary object in video sequences with no prior knowledge other than a template annotated in the first frame. To tackle this problem, we build a robust tracking system consisting of the following components. First, for image region representation, we propose improvements to the region covariance descriptor: characteristics of the specific object are taken into consideration before constructing the covariance descriptor. Second, for building the object appearance model, we propose to combine the merits of both generative models and discriminative models by organizing them in a detection cascade. Specifically, generative models are deployed in the early layers to eliminate most easy candidates, whereas discriminative models are placed in the later layers to distinguish the object from a few similar "distracters". Partial Least Squares Discriminant Analysis (PLS-DA) is employed for building the discriminative object appearance models. Third, for updating the generative models, we propose a weakly-supervised model updating method based on cluster analysis using the mean-shift gradient density estimation procedure. Fourth, a novel online PLS-DA learning algorithm is developed for incrementally updating the discriminative models. The final tracking system that integrates all these building blocks exhibits good robustness for most challenges in visual tracking. Comparative results on challenging video sequences show that the proposed tracking system performs favorably with respect to a number of state-of-the-art methods.
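
The region covariance descriptor underlying the tracker can be sketched as follows: each pixel in a patch is described by a small feature vector, and the descriptor is the covariance of those vectors over the region. The channels below (coordinates, intensity, gradient magnitudes) follow the classic Tuzel-style formulation; the thesis proposes target-specific improvements on top of it.

```python
import numpy as np

def region_covariance(patch):
    """Region covariance descriptor for a grey-level image patch.

    Each pixel is described by (x, y, intensity, |Ix|, |Iy|); the descriptor
    is the covariance of these vectors over the region.
    """
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    Iy, Ix = np.gradient(patch.astype(float))
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(Ix).ravel(), np.abs(Iy).ravel()], axis=0)
    return np.cov(feats)                      # 5 x 5 symmetric positive semi-definite

patch = np.random.default_rng(0).integers(0, 256, size=(32, 32)).astype(float)
print(region_covariance(patch).shape)         # (5, 5)
```
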
APA, Harvard, Vancouver, ISO, and other styles
27

Pralle, Mandi Jo. "Visual design in the online learning environment." [Ames, Iowa : Iowa State University], 2007.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
28

Hussain, Sibt Ul. "Machine Learning Methods for Visual Object Detection." Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00680048.

Full text
Abstract:
The goal of this thesis is to develop better practical methods for detecting common object classes in real world images. We present a family of object detectors that combine Histogram of Oriented Gradient (HOG), Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) features with efficient Latent SVM classifiers and effective dimensionality reduction and sparsification schemes to give state-of-the-art performance on several important datasets including PASCAL VOC2006 and VOC2007, INRIA Person and ETHZ. The three main contributions are as follows. Firstly, we pioneer the use of Local Ternary Pattern features for object detection, showing that LTP gives better overall performance than HOG and LBP, because it captures both rich local texture and object shape information while being resistant to variations in lighting conditions. It thus works well both for classes that are recognized mainly by their structure and ones that are recognized mainly by their textures. We also show that HOG, LBP and LTP complement one another, so that an extended feature set that incorporates all three of them gives further improvements in performance. Secondly, in order to tackle the speed and memory usage problems associated with high-dimensional modern feature sets, we propose two effective dimensionality reduction techniques. The first, feature projection using Partial Least Squares, allows detectors to be trained more rapidly with negligible loss of accuracy and no loss of run time speed for linear detectors. The second, feature selection using SVM weight truncation, allows active feature sets to be reduced in size by almost an order of magnitude with little or no loss, and often a small gain, in detector accuracy. Despite its simplicity, this feature selection scheme outperforms all of the other sparsity enforcing methods that we have tested. Lastly, we describe work in progress on Local Quantized Patterns (LQP), a generalized form of local pattern features that uses lookup table based vector quantization to provide local pattern style pixel neighbourhood codings that have the speed of LBP/LTP and some of the flexibility and power of traditional visual word representations. Our experiments show that LQP outperforms all of the other feature sets tested including HOG, LBP and LTP.
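
A simplified sketch of the Local Ternary Pattern features highlighted above: each neighbour of a pixel is coded +1, 0 or -1 relative to the centre with a tolerance t, and the ternary pattern is split into upper and lower binary codes. The detector pipeline in the thesis additionally builds block histograms over these codes, which is omitted here.

```python
import numpy as np

def ltp_codes(img, t=5):
    """Local Ternary Pattern codes for a grey-level image (3x3 neighbourhood).

    Returns two 8-bit code maps: "upper" marks neighbours >= centre + t,
    "lower" marks neighbours <= centre - t.  Uniform-pattern mapping and
    histogramming are left out of this sketch.
    """
    img = img.astype(int)
    c = img[1:-1, 1:-1]                           # centre pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # 8 neighbours, clockwise
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        upper |= ((n >= c + t).astype(int) << bit)
        lower |= ((n <= c - t).astype(int) << bit)
    return upper, lower

img = np.random.default_rng(0).integers(0, 256, size=(64, 64))
u, l = ltp_codes(img, t=5)
print(u.shape, l.shape)                            # (62, 62) (62, 62)
```
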
APA, Harvard, Vancouver, ISO, and other styles
29

Cabral, Ricardo da Silveira. "Unifying Low-Rank Models for Visual Learning." Research Showcase @ CMU, 2015. http://repository.cmu.edu/dissertations/506.

Full text
Abstract:
Many problems in signal processing, machine learning and computer vision can be solved by learning low-rank models from data. In computer vision, problems such as rigid structure from motion have been formulated as an optimization over subspaces with fixed rank. These hard-rank constraints have traditionally been imposed by a factorization that parameterizes subspaces as a product of two matrices of fixed rank. Whilst factorization approaches lead to efficient and kernelizable optimization algorithms, they have been shown to be NP-Hard in the presence of missing data. Inspired by recent work in compressed sensing, hard-rank constraints have been replaced by soft-rank constraints, such as the nuclear norm regularizer. Vis-a-vis hard-rank approaches, soft-rank models are convex even in the presence of missing data: but how can convex optimization solve an NP-Hard problem? This thesis addresses this question by analyzing the relationship between hard and soft rank constraints in the unsupervised factorization with missing data problem. Moreover, we extend soft-rank models to weakly supervised and fully supervised learning problems in computer vision. There are four main contributions of our work: (1) The analysis of a new unified low-rank model for matrix factorization with missing data. Our model subsumes soft and hard-rank approaches and merges advantages from previous formulations, such as efficient algorithms and kernelization. It also provides justifications on the choice of algorithms and regions that guarantee convergence to global minima. (2) A deterministic "rank continuation" strategy for the NP-hard unsupervised factorization with missing data problem, which is highly competitive with the state-of-the-art and often achieves globally optimal solutions. In preliminary work, we show that this optimization strategy is applicable to other NP-hard problems which are typically relaxed to convex semidefinite programs (e.g., MAX-CUT, the quadratic assignment problem). (3) A new soft-rank fully supervised robust regression model. This convex model is able to deal with noise, outliers and missing data in the input variables. (4) A new soft-rank model for weakly supervised image classification and localization. Unlike existing multiple-instance approaches for this problem, our model is convex.
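
The soft-rank idea discussed in the abstract can be illustrated with a small matrix-completion routine based on singular value thresholding, the proximal operator of the nuclear norm. This is a generic sketch, not the unified model or the rank-continuation strategy contributed by the thesis.

```python
import numpy as np

def complete_low_rank(M, mask, lam=1.0, n_iters=200):
    """Soft-rank matrix completion by iterative singular value thresholding.

    Alternates a data-filling step with soft-thresholding of the singular
    values, approximately minimising a squared data-fit term plus
    lam * nuclear norm.
    """
    X = np.where(mask, M, 0.0)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U * np.maximum(s - lam, 0.0)) @ Vt       # shrink singular values
        X = np.where(mask, M, X)                      # keep observed entries
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 3)) @ rng.normal(size=(3, 15))   # rank-3 ground truth
mask = rng.random(M.shape) < 0.6                           # ~60% observed
X_hat = complete_low_rank(M, mask, lam=0.5)
print(np.abs(X_hat - M)[~mask].mean())                     # error on missing entries
```
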
APA, Harvard, Vancouver, ISO, and other styles
30

Xu, Yang. "Cortical spatiotemporal plasticity in visual category learning." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/272.

Full text
Abstract:
Central to human intelligence, visual categorization is a skill that is both remarkably fast and accurate. Although there have been numerous studies in primates regarding how information flows in inferotemporal (ITC) and prefrontal (PFC) cortices during online discrimination of visual categories, there has been little comparable research on the human cortex. To bridge this gap, this thesis explores how visual categories emerge in prefrontal cortex and the ventral stream, which is the human homologue of ITC. In particular, cortical spatiotemporal plasticity in visual category learning was investigated using behavioral experiments, magnetoencephalographic (MEG) imaging, and statistical machine learning methods. From a theoretical perspective, scientists working with non-human primates have posited that PFC plays a primary role in the encoding of visual categories. Much of the extant research in the cognitive neuroscience literature, however, emphasizes the role of the ventral stream. Despite their apparent incompatibility, no study has evaluated these theories in the human cortex by examining the roles of the ventral stream and PFC in online discrimination and acquisition of visual categories. To address this question, I conducted two learning experiments using visually-similar categories as stimuli and recorded cortical responses using MEG, a neuroimaging technique that offers millisecond temporal resolution. Across both experiments, categorical information was found to be available during the period of cortical activity. Moreover, late in the learning process, this information is supplied increasingly in the ventral stream but less so in prefrontal cortex. These findings extend previous theories by suggesting that the ventral stream is crucial to long-term encoding of visual categories when categorical perception is proficient, but that PFC jointly encodes visual categories early on during learning. From a methodological perspective, MEG is limited as a technique because it can lead to false discoveries in a large number of spatiotemporal regions of interest (ROIs) and, typically, can only coarsely reconstruct the spatial locations of cortical responses. To address the first problem, I developed an excursion algorithm that identifies ROIs contiguous in time and space. I then used a permutation test to measure the global statistical significance of the ROIs. To address the second problem, I developed a method that incorporates domain-specific and experimental knowledge in the modeling process. Utilizing faces as a model category, I used a predefined "face" network to constrain the estimation of cortical activities by applying differential shrinkage to regions within and outside this network. I proposed and implemented a trial-partitioning approach which uses trials in the midst of learning for model estimation. Importantly, this makes the localization of trials more precise in both the initial and final phases of learning. In summary, this thesis makes two significant contributions. First, it methodologically improves the way we can characterize the spatiotemporal properties of the human cortex using MEG. Second, it provides a combined theory of visual category learning by incorporating the large time scales that encompass the course of learning.
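
The global significance assessment mentioned above relies on permutation testing. The sketch below shows a generic two-sample permutation test on a mean difference; the actual excursion statistic in the thesis is computed over spatiotemporal clusters rather than a simple mean.

```python
import numpy as np

def permutation_test(cond_a, cond_b, n_perm=10000, rng=None):
    """Two-sample permutation test on the difference of means.

    Shuffles the pooled samples to build a null distribution of the test
    statistic and returns a two-sided p-value.
    """
    rng = rng or np.random.default_rng(0)
    observed = cond_a.mean() - cond_b.mean()
    pooled = np.concatenate([cond_a, cond_b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = perm[:len(cond_a)].mean() - perm[len(cond_a):].mean()
        count += abs(stat) >= abs(observed)
    return count / n_perm

a = np.random.default_rng(1).normal(0.5, 1.0, size=40)   # e.g., one condition's responses
b = np.random.default_rng(2).normal(0.0, 1.0, size=40)   # e.g., the other condition
print(permutation_test(a, b))                              # two-sided p-value
```
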
APA, Harvard, Vancouver, ISO, and other styles
31

Ramachandran, Suchitra. "Visual Statistical Learning in Monkey Inferotemporal Cortex." Research Showcase @ CMU, 2014. http://repository.cmu.edu/dissertations/463.

Full text
Abstract:
Despite living in noisy sensory environments, humans and non-human primates have the ability to learn regularities and patterns in the environment solely on the basis of passive exposure. This ability to learn what is statistically likely and predictable in the environment is called statistical learning. Visual statistical learning of image sequences has been demonstrated at the level of single neurons in the rhesus macaque (monkey) inferotemporal cortex (IT). Upon subjecting monkeys to extensive exposure to pairs of images presented sequentially such that the display of one image always predicted the subsequent display of another image, IT neurons showed suppressed responses to images that occurred in a predicted context, but not when the same images occurred in an unpredicted context (Meyer & Olson, 2011). Upon investigating this effect, called prediction suppression, more thoroughly, we discovered that this effect depends on the conditional probability between the images presented sequentially. Further, the effect generalizes across time and space, it is domain specific, and it can be induced by training monkeys on longer sequences. These effects are long-lasting and robust: they persist at least for 20 months after initial training with no exposure to the stimuli in the interim. We have preliminary evidence for the existence of neurophysiological markers of statistical learning in areas upstream of IT in the ventral visual stream, suggesting that learning statistical regularities may be a fundamental function of sensory cortex.
APA, Harvard, Vancouver, ISO, and other styles
32

Frier, Helen Jane. "Compass orientation during visual learning by honeybees." Thesis, University of Sussex, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.321446.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Kodirov, Elyor. "Cross-class transfer learning for visual data." Thesis, Queen Mary, University of London, 2017. http://qmro.qmul.ac.uk/xmlui/handle/123456789/31852.

Full text
Abstract:
Automatic analysis of visual data is a key objective of computer vision research, and performing visual recognition of objects from images is one of the most important steps towards understanding and gaining insights into visual data. Most existing approaches in the literature for visual recognition are based on a supervised learning paradigm. Unfortunately, they require a large amount of labelled training data, which severely limits their scalability. On the other hand, recognition is instantaneous and effortless for humans: they can recognise a new object without seeing any visual samples, just by knowing a description of it and leveraging similarities between that description and previously learned concepts. Motivated by this human recognition ability, this thesis proposes novel approaches to the cross-class transfer learning (cross-class recognition) problem, whose goal is to learn a model from seen classes (those with labelled training samples) that can generalise to unseen classes (those with labelled testing samples only) without any training data, i.e., seen and unseen classes are disjoint. Specifically, the thesis studies and develops new methods for addressing three variants of cross-class transfer learning. Chapter 3: The first variant is transductive cross-class transfer learning, meaning that a labelled training set and an unlabelled test set are available for model learning. Considering the training set as the source domain and the test set as the target domain, a typical cross-class transfer learning approach assumes that the source and target domains share a common semantic space, into which a visual feature vector extracted from an image can be embedded using an embedding function. Existing approaches learn this function from the source domain and apply it without adaptation to the target one. They are therefore prone to the domain shift problem: the embedding function is concerned only with predicting the seen-class semantic representations during learning, so when applied to the test data it may underperform. In this thesis, a novel cross-class transfer learning (CCTL) method is proposed based on unsupervised domain adaptation. Specifically, a novel regularised dictionary learning framework is formulated in which the target class labels are used to regularise the learned target domain embeddings, thus effectively overcoming the projection domain shift problem. Chapter 4: The second variant is inductive cross-class transfer learning, in which only the training set is assumed to be available during model learning, a harder challenge than the previous one. Nevertheless, this setting reflects the real-world situation in which test data become available only after model learning. The main problem remains the same as in the previous variant: the domain shift problem occurs when the model learned only from the training set is applied to the test set without adaptation. In this thesis, a semantic autoencoder (SAE) is proposed, building on the encoder-decoder paradigm. Specifically, a semantic space is first defined so that knowledge transfer is possible from the seen classes to the unseen classes. Then, an encoder embeds/projects a visual feature vector into the semantic space, while the decoder imposes a generative task: the projection must be able to reconstruct the original visual features. The generative task forces the encoder to preserve richer information, so the encoder learned from the seen classes generalises better to the new unseen classes. Chapter 5: The third variant is unsupervised cross-class transfer learning. In this variant, no supervision is available for model learning, i.e., only unlabelled training data are available, making it the hardest setting of the three. The goal, however, is the same: to learn from the training data knowledge that can be transferred to test data whose labels are completely different from those of the training data. The thesis proposes a novel approach which requires no labelled training data yet is able to capture discriminative information. The proposed model is based on a new graph-regularised dictionary learning algorithm. By introducing an l1-norm graph regularisation term, instead of the conventional squared l2-norm, the model is made robust against the outliers and noise typical of visual data. Importantly, the graph and the representation are learned jointly, further alleviating the effects of data outliers. As an application, person re-identification is considered for this variant in this thesis.
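A minimal sketch of the semantic autoencoder idea summarised above, under the common linear formulation: an encoder W maps visual features X to semantic vectors S while the tied decoder W^T must reconstruct X, giving the objective min_W ||X - W^T S||^2 + lam * ||W X - S||^2, which reduces to a Sylvester equation. The dimensions and lam below are assumptions for illustration, not the thesis's exact implementation.

```python
# Minimal sketch of a linear semantic autoencoder (SAE): the tied
# encoder/decoder constraint turns the objective into a Sylvester equation
# A W + W B = C with A = S S^T, B = lam * X X^T, C = (1 + lam) * S X^T.
import numpy as np
from scipy.linalg import solve_sylvester

def sae_encoder(X, S, lam=0.5):
    """X: d x N visual features, S: k x N semantic vectors. Returns k x d W."""
    A = S @ S.T                     # k x k
    B = lam * (X @ X.T)             # d x d
    C = (1.0 + lam) * (S @ X.T)     # k x d
    return solve_sylvester(A, B, C)

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 200))     # visual features of 200 seen-class images
S = rng.normal(size=(85, 200))      # e.g. 85-d attribute vector per image
W = sae_encoder(X, S)
# Zero-shot use: project an unseen test feature into semantic space with W,
# then match it to unseen-class prototypes by nearest neighbour.
print(W.shape)                      # (85, 512)
```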
APA, Harvard, Vancouver, ISO, and other styles
34

Crowley, Elliott Joseph. "Visual recognition in art using machine learning." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:d917f38e-64cb-4b09-9ccf-b081fe68b187.

Full text
Abstract:
This thesis is concerned with the problem of visual recognition in art - such as finding the objects (e.g. cars, cows and cathedrals) present in a painting, or identifying the subject of an oil portrait. Solving this problem is extremely beneficial to art historians, who are often interested in determining when an object first appeared in a painting or how the portrayal of an object has evolved over time; it allows them to avoid the unenviable task of finding paintings for study manually. However, visual recognition of art is a challenging problem, in part due to the lack of annotation in art. A solution is to train recognition models on natural, photographic images; these models then have to overcome a domain shift when applied to art. Firstly, a thorough evaluation of the domain shift problem is conducted for the task of image classification in paintings: the performance of natural image-trained and painting-trained classifiers on a fixed set of paintings is compared for both shallow (Fisher Vectors) and deep image representations (Convolutional Neural Networks - CNNs) to examine the performance gap across domains. Then, we show that this performance gap can be ameliorated by classifying regions using detectors. We next consider the problem of annotating gods and animals on classical Greek vases, starting from a large dataset of images of vases with associated brief text descriptions. To solve this, we develop a weakly supervised learning approach to solve the correspondence problem between the descriptions and unknown image regions. Then, we study the problem of matching photos of a person to paintings of that person, in order to retrieve similar paintings given a query photo. We show that performance at this task can be improved substantially by learning with a combination of photos and paintings - either by learning a linear projection matrix common across facial identities, or by fine-tuning a CNN. Finally, we present several applications of this research. These include a system that learns object classifiers on-the-fly from images crawled off the web, and uses these to find a variety of objects in very large datasets of art. We show that this research has resulted in the discovery of over 250,000 new object annotations across 93,000 paintings on the public Art UK website.
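The cross-domain comparison described above can be sketched as follows: train a classifier on one domain (photos) and measure how much accuracy drops on another (paintings) relative to a classifier trained on paintings directly. The synthetic features and the 'shift' parameter below merely stand in for real Fisher Vector or CNN features.

```python
# Illustrative sketch of a photo-to-painting domain-shift evaluation using
# toy two-class features; a constant feature shift mimics the domain gap.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_domain(shift, n=300, d=64):
    """Two-class toy features; 'shift' stands in for the domain gap."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d)) + y[:, None] * 1.5 + shift
    return X, y

X_photo, y_photo = make_domain(shift=0.0)
X_paint, y_paint = make_domain(shift=0.75)

clf_photo = LinearSVC(dual=False).fit(X_photo, y_photo)
clf_paint = LinearSVC(dual=False).fit(X_paint[:150], y_paint[:150])

print("photo-trained, tested on paintings:",
      accuracy_score(y_paint[150:], clf_photo.predict(X_paint[150:])))
print("painting-trained, tested on paintings:",
      accuracy_score(y_paint[150:], clf_paint.predict(X_paint[150:])))
```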
APA, Harvard, Vancouver, ISO, and other styles
35

Kashyap, Karan. "Learning digits via joint audio-visual representations." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/113143.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 59-60).
Our goal is to explore models for language learning in the manner that humans learn languages as children. Namely, children do not have intermediary text transcriptions in correlating visual and audio inputs from the environment; rather, they directly make connections between what they see and what they hear, sometimes even across languages! In this thesis, we present weakly-supervised models for learning representations of numerical digits between two modalities: speech and images. We experiment with architectures of convolutional neural networks taking in spoken utterances of numerical digits and images of handwritten digits as inputs. In nearly all cases we randomly initialize network weights (without pre-training) and evaluate the model's ability to return a matching image for a spoken input or to identify the number of overlapping digits between an utterance and an image. We also provide some visuals as evidence that our models are truly learning correspondences between the two modalities.
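One of the evaluations described above, returning a matching image for a spoken input, can be sketched as a simple nearest-neighbour search in the joint embedding space; the random vectors below stand in for the outputs of the two CNN branches.

```python
# Minimal sketch of cross-modal retrieval in a shared audio-visual space:
# rank digit-image embeddings by cosine similarity to an utterance embedding.
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def match_images(utterance_emb, image_embs):
    """Return indices of images ranked by similarity to one utterance."""
    sims = normalize(image_embs) @ normalize(utterance_emb)
    return np.argsort(-sims)

rng = np.random.default_rng(0)
image_embs = rng.normal(size=(10, 128))                   # one embedding per digit image
utterance = image_embs[3] + 0.1 * rng.normal(size=128)    # utterance close to digit 3
print(match_images(utterance, image_embs)[:3])            # digit 3 should rank first
```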
by Karan Kashyap.
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
36

Gilja, Vikash. "Learning and applying model-based visual context." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/33139.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.
Includes bibliographical references (p. 53).
I believe that context's ability to reduce the ambiguity of an input signal makes it a vital constraint for understanding the real world. I specifically examine the role of context in vision and how a model-based approach can aid visual search and recognition. Through the implementation of a system capable of learning visual context models from an image database, I demonstrate the utility of the model-based approach. The system is capable of learning models for "water-horizon scenes" and "suburban street scenes" from a database of 745 images.
by Vikash Gilja.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
37

Woodley, Thomas Edward. "Visual tracking using offline and online learning." Thesis, University of Cambridge, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.608814.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Naha, Shujon. "Zero-shot Learning for Visual Recognition Problems." IEEE, 2015. http://hdl.handle.net/1993/31806.

Full text
Abstract:
In this thesis we discuss different aspects of zero-shot learning and propose solutions for three challenging visual recognition problems: 1) unknown object recognition from images, 2) novel action recognition from videos, and 3) unseen object segmentation. In all three problems, we have two different sets of classes: the “known classes”, which are used in the training phase, and the “unknown classes”, for which there are no training instances. Our proposed approach exploits the available semantic relationships between known and unknown object classes and uses them to transfer appearance models from known object classes to unknown object classes in order to recognize unknown objects. We also propose an approach to recognize novel actions from videos by learning a joint model that links videos and text. Finally, we present a ranking-based approach for zero-shot object segmentation. We represent each unknown object class as a semantic ranking of all the known classes and use this semantic relationship to extend the segmentation model of known classes to segment unknown-class objects.
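The transfer idea described above, extending models from known to unknown classes via semantic relationships, can be sketched as building an unseen-class classifier from a semantic-similarity-weighted combination of known-class classifiers; all names, dimensions and the top-k choice below are illustrative assumptions rather than the thesis's exact formulation.

```python
# Minimal sketch: synthesise a linear classifier for an unseen class as a
# weighted sum of known-class classifiers, weighted by semantic similarity.
import numpy as np

def unseen_classifier(seen_weights, seen_semantics, unseen_semantic, k=3):
    """seen_weights: C x d classifier weights; seen_semantics: C x m semantic
    vectors; unseen_semantic: m-dim vector for the novel class."""
    sims = seen_semantics @ unseen_semantic
    sims /= (np.linalg.norm(seen_semantics, axis=1)
             * np.linalg.norm(unseen_semantic) + 1e-12)
    top = np.argsort(-sims)[:k]                  # keep the k most related known classes
    w = sims[top] / sims[top].sum()
    return (w[:, None] * seen_weights[top]).sum(axis=0)

rng = np.random.default_rng(0)
seen_weights = rng.normal(size=(20, 256))        # 20 known-class linear classifiers
seen_semantics = rng.normal(size=(20, 50))       # 50-d semantic vector per known class
novel = seen_semantics[4] * 0.9 + 0.1 * rng.normal(size=50)  # novel class near class 4
w_unseen = unseen_classifier(seen_weights, seen_semantics, novel)
print(w_unseen.shape)                            # (256,): score an image feature x via x @ w_unseen
```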
October 2016
APA, Harvard, Vancouver, ISO, and other styles
39

Rao, Anantha N. "Learning-based Visual Odometry - A Transformer Approach." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627658636420617.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Horn, Robert R. "Visual attention and information in observational learning." Thesis, Liverpool John Moores University, 2003. http://researchonline.ljmu.ac.uk/5624/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

White, Alan Daniel. "Visual-motor learning in minimally invasive surgery." Thesis, University of Leeds, 2016. http://etheses.whiterose.ac.uk/17321/.

Full text
Abstract:
The purpose of this thesis was to develop an in-depth understanding of motor control in surgery. This was achieved by applying current theories of sensorimotor learning and developing a novel experimental approach. A survey of expert opinion and a review of the existing literature identified several issues related to human performance and minimally invasive surgery (MIS). The approach of this thesis combined existing surgical training tools with state-of-the-art technology and adapted rigorous experimental psychology techniques (grounded in the principles of sensorimotor learning) within a controlled laboratory environment. Existing technology was incorporated into surgical scenarios via the Kinematic Assessment Tool (KAT) - an experimentally validated, powerful and portable system capable of providing accurate and repeatable measures of visual-motor performance. The KAT was first established as an appropriate means of assessing visual-motor performance, and subsequently validated for assessing MIS performance. Following this, the system was used to investigate whether the principles of 'structural learning' could be applied to MIS. The final experiment investigated whether there is any benefit of a standardised, repeatable laparoscopic warm-up to MIS performance. These experiments demonstrated that the KAT system, combined with other existing technologies, can be used to investigate visual-motor performance. The results suggested that learning the control dynamics of the surgical instruments and variability in training are beneficial when one is presented with novel but similar tasks. These findings are consistent with structural learning theory. This thesis should inform current thinking on MIS training and performance and the future development of simulators, with more emphasis on introducing variability within tasks during training. Further investigation of the role of structural learning in MIS is required.
APA, Harvard, Vancouver, ISO, and other styles
42

Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.

Full text
Abstract:
La quantité d'images présentes sur internet augmente considérablement, et il est nécessaire de développer des techniques permettant le traitement automatique de ces contenus. Alors que les méthodes de reconnaissance visuelle sont de plus en plus évoluées, la communauté scientifique s'intéresse désormais à des systèmes aux capacités de raisonnement plus poussées. Dans cette thèse, nous nous intéressons au Visual Question Answering (VQA), qui consiste en la conception de systèmes capables de répondre à une question portant sur une image. Classiquement, ces architectures sont conçues comme des systèmes d'apprentissage automatique auxquels on fournit des images, des questions et leur réponse. Ce problème difficile est habituellement abordé par des techniques d'apprentissage profond. Dans la première partie de cette thèse, nous développons des stratégies de fusion multimodales permettant de modéliser des interactions entre les représentations d'image et de question. Nous explorons des techniques de fusion bilinéaire, et assurons l'expressivité et la simplicité des modèles en utilisant des techniques de factorisation tensorielle. Dans la seconde partie, on s'intéresse au raisonnement visuel qui encapsule ces fusions. Après avoir présenté les schémas classiques d'attention visuelle, nous proposons une architecture plus avancée qui considère les objets ainsi que leurs relations mutuelles. Tous les modèles sont expérimentalement évalués sur des jeux de données standards et obtiennent des résultats compétitifs avec ceux de la littérature
The quantity of images that populate the Internet is increasing dramatically, and it is becoming critically important to develop the technology for a precise and automatic understanding of visual content. As image recognition systems become more and more capable, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them, and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture in which we consider objects and their spatial and semantic relations. All models are thoroughly evaluated through experiments on standard datasets and the results are competitive with the literature.
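A minimal sketch of the kind of factorised bilinear fusion discussed above: rather than a full bilinear tensor over question and image vectors, both are projected to a low-rank space, merged by an elementwise product, and mapped to answer scores. The shapes, rank and tanh non-linearity are illustrative assumptions, not the thesis's specific models.

```python
# Minimal sketch of low-rank bilinear fusion for VQA-style answer scoring.
import numpy as np

rng = np.random.default_rng(0)
dq, dv, R, n_answers = 300, 512, 64, 10

Wq = rng.normal(scale=0.05, size=(R, dq))          # question projection
Wv = rng.normal(scale=0.05, size=(R, dv))          # image projection
Wo = rng.normal(scale=0.05, size=(n_answers, R))   # output projection

def fuse(q, v):
    """Low-rank bilinear fusion: (Wq q) * (Wv v) followed by a linear map."""
    z = np.tanh(Wq @ q) * np.tanh(Wv @ v)           # elementwise merge in rank-R space
    return Wo @ z                                   # unnormalised answer scores

q = rng.normal(size=dq)                             # e.g. a question embedding
v = rng.normal(size=dv)                             # e.g. a pooled image feature
print(fuse(q, v).shape)                             # (10,)
```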
APA, Harvard, Vancouver, ISO, and other styles
43

Hanwell, David. "Weakly supervised learning of visual semantic attributes." Thesis, University of Bristol, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.687063.

Full text
Abstract:
There are at present many billions of images on the internet, only a fraction of which are labelled according to their semantic content. To automatically provide labels for the rest, models of visual semantic concepts must be created. Such models are traditionally trained using images which have been manually acquired, segmented, and labelled. In this thesis, we submit that such models can be learned automatically using those few images which have already been labelled, either directly by their creators, or indirectly by their associated text. Such imagery can be acquired easily, cheaply, and in large quantities, using web image searches. Though there has been some work towards learning from such weakly labelled data, all methods yet proposed require more than a minimum of human effort. In this thesis we put forth a number of methods for reliably learning models of visual semantic attributes using only the raw, unadulterated results of web image searches. The proposed methods do not require any human input beyond specifying the names of the attributes to be learned. We also present means of identifying and localising learned attributes in challenging, real-world images. Our methods are probabilistic in nature, and make extensive use of multivariate Gaussian mixture models to represent both data and learned models. The contributions of this thesis also include several tools for acquiring and comparing these distributions, including a novel clustering algorithm. We apply our weakly supervised learning methods to the training of models of a variety of visual semantic attributes, including colour and pattern terms. Detection and localisation of the learned attributes in unseen real-world images is demonstrated, and both quantitative and qualitative results are presented. We compare against other work, including both general methods of weakly supervised learning and more attribute-specific methods. We apply our learning methods to the training sets of previous works and assess their performance on the test sets used by other authors. Our results show that our methods give better results than the current state of the art.
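The Gaussian-mixture modelling mentioned above can be illustrated with a toy sketch: fit a mixture to noisy colour features pooled from web search results for an attribute word such as 'red', then score new pixels by their likelihood under the model. The synthetic data and component count are assumptions for illustration only.

```python
# Illustrative sketch: a Gaussian mixture fitted to noisy web-search colour
# features for an attribute word, used to score pixels of a new image.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Pretend web results for "red": mostly red pixels plus unrelated clutter.
red_pixels = rng.normal(loc=[0.8, 0.1, 0.1], scale=0.08, size=(800, 3))
clutter = rng.uniform(size=(200, 3))
train = np.vstack([red_pixels, clutter])

gmm = GaussianMixture(n_components=3, covariance_type='full',
                      random_state=0).fit(train)

# Score new pixels: the red-ish pixel should receive a higher log-likelihood.
test = np.array([[0.82, 0.12, 0.08],    # red-ish
                 [0.10, 0.70, 0.20]])   # green-ish
print(gmm.score_samples(test))
```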
APA, Harvard, Vancouver, ISO, and other styles
44

Hussain, Sabit ul. "Machine Learning Methods for Visual Object Detection." Thesis, Grenoble, 2011. http://www.theses.fr/2011GRENM070/document.

Full text
Abstract:
Le but de cette thèse est de développer des méthodes pratiques plus performantes pour la détection d'instances de classes d'objets de la vie quotidienne dans les images. Nous présentons une famille de détecteurs qui incorporent trois types d'indices visuelles performantes – histogrammes de gradients orientés (Histograms of Oriented Gradients, HOG), motifs locaux binaires (Local Binary Patterns, LBP) et motifs locaux ternaires (Local Ternary Patterns, LTP) – dans des méthodes de discrimination efficaces de type machine à vecteur de support latent (Latent SVM), sous deux régimes de réduction de dimension – moindres carrées partielles (Partial Least Squares, PLS) et sélection de variables par élagage de poids SVM (SVM Weight Truncation). Sur plusieurs jeux de données importantes, notamment ceux du PASCAL VOC2006 et VOC2007, INRIA Person et ETH Zurich, nous démontrons que nos méthodes améliorent l'état de l'art du domaine. Nos contributions principales sont : – Nous étudions l'indice visuelle LTP pour la détection d'objets. Nous démontrons que sa performance est globalement mieux que celle des indices bien établies HOG et LBP parce qu'elle permet d'encoder à la fois la texture locale de l'objet et sa forme globale, tout en étant résistante aux variations d'éclairage. Grâce à ces atouts, LTP fonctionne aussi bien pour les classes qui sont caractérisées principalement par leurs structures que pour celles qui sont caractérisées par leurs textures. En plus, nous démontrons que les indices HOG, LBP et LTP sont bien complémentaires, de sorte qu'un jeux d'indices étendu qui intègre tous les trois améliore encore la performance. – Les jeux d'indices visuelles performantes étant de dimension assez élevée, nous proposons deux méthodes de réduction de dimension afin d'améliorer leur vitesse et réduire leur utilisation de mémoire. La première, basée sur la projection moindres carrés partielles, diminue significativement le temps de formation des détecteurs linéaires, sans réduction de précision ni perte de vitesse d'exécution. La seconde, fondée sur la sélection de variables par l'élagage des poids du SVM, nous permet de réduire le nombre d'indices actives par un ordre de grandeur avec une réduction minime, voire même une petite augmentation, de la précision du détecteur. Malgré sa simplicité, cette méthode de sélection de variables surpasse toutes les autres approches que nous avons mis à l'essai. – Enfin, nous décrivons notre travail en cours sur une nouvelle variété d'indice visuelle – les « motifs locaux quantifiées » (Local Quantized Patterns, LQP). LQP généralise les indices existantes LBP / LTP en introduisant une étape de quantification vectorielle – ce qui permet une souplesse et une puissance analogue aux celles des approches de reconnaissance visuelle « sac de mots », qui sont basées sur la quantification des régions locales d'image considérablement plus grandes – sans perdre la simplicité et la rapidité qui caractérisent les approches motifs locales actuelles parce que les résultats de la quantification puissent être pré-compilés et stockés dans un tableau. LQP permet une augmentation considérable de la taille du support local de l'indice, et donc de sa puissance discriminatoire. Nos expériences indiquent qu'elle a la meilleure performance de toutes les indices visuelles testés, y compris HOG, LBP et LTP
The goal of this thesis is to develop better practical methods for detecting common object classes in real world images. We present a family of object detectors that combine Histogram of Oriented Gradient (HOG), Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) features with efficient Latent SVM classifiers and effective dimensionality reduction and sparsification schemes to give state-of-the-art performance on several important datasets including PASCAL VOC2006 and VOC2007, INRIA Person and ETHZ. The three main contributions are as follows. Firstly, we pioneer the use of Local Ternary Pattern features for object detection, showing that LTP gives better overall performance than HOG and LBP, because it captures both rich local texture and object shape information while being resistant to variations in lighting conditions. It thus works well both for classes that are recognized mainly by their structure and ones that are recognized mainly by their textures. We also show that HOG, LBP and LTP complement one another, so that an extended feature set that incorporates all three of them gives further improvements in performance. Secondly, in order to tackle the speed and memory usage problems associated with high-dimensional modern feature sets, we propose two effective dimensionality reduction techniques. The first, feature projection using Partial Least Squares, allows detectors to be trained more rapidly with negligible loss of accuracy and no loss of run time speed for linear detectors. The second, feature selection using SVM weight truncation, allows active feature sets to be reduced in size by almost an order of magnitude with little or no loss, and often a small gain, in detector accuracy. Despite its simplicity, this feature selection scheme outperforms all of the other sparsity enforcing methods that we have tested. Lastly, we describe work in progress on Local Quantized Patterns (LQP), a generalized form of local pattern features that uses lookup table based vector quantization to provide local pattern style pixel neighbourhood codings that have the speed of LBP/LTP and some of the flexibility and power of traditional visual word representations. Our experiments show that LQP outperforms all of the other feature sets tested including HOG, LBP and LTP
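The Local Ternary Pattern feature highlighted above can be sketched directly: each 3x3 neighbour is coded +1/0/-1 according to whether it lies above, within, or below a tolerance band around the centre pixel, and the ternary code is conventionally split into 'upper' and 'lower' binary patterns. The tolerance value and neighbour ordering below are illustrative choices.

```python
# Minimal sketch of Local Ternary Pattern (LTP) codes for a grayscale image,
# split into the conventional "upper" and "lower" 8-bit binary patterns.
import numpy as np

OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def ltp_codes(img, t=5):
    """Return (upper, lower) LTP code images for a 2-D uint8 array."""
    h, w = img.shape
    centre = img[1:-1, 1:-1].astype(int)
    upper = np.zeros((h - 2, w - 2), dtype=int)
    lower = np.zeros((h - 2, w - 2), dtype=int)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].astype(int)
        upper += (neigh >= centre + t).astype(int) << bit   # neighbour above the band
        lower += (neigh <= centre - t).astype(int) << bit   # neighbour below the band
    return upper.astype(np.uint8), lower.astype(np.uint8)

img = (np.arange(25).reshape(5, 5) * 10).astype(np.uint8)
up, lo = ltp_codes(img)
print(up, lo, sep="\n")   # per-pixel 8-bit upper/lower LTP codes
```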
APA, Harvard, Vancouver, ISO, and other styles
45

Pyon, Wonn Sang. "Encoding Temporal Order and Visual Statistical Learning." Thesis, The University of Arizona, 2015. http://hdl.handle.net/10150/579050.

Full text
Abstract:
The literature suggests that visual statistical learning occurs from a very early age, with evidence suggesting that newborns are able to discern between familiar and novel sequences at just 2 days old. However, based on recent findings on the role of the medial temporal lobe in visual statistical learning in combination with our current understanding of this region's developmental timeline, we believe children younger than 40-months are unable to discern between the temporal regularities found between shapes in a sequence. In this particular study, we piloted two learning paradigms on adult subjects expecting to see a clear ability for the adult subjects to discriminate between our three categories of temporal order. Performance for our first paradigm, Fade-to-Reveal, revealed a significant improvement in reaction times through training, indicative of learning. For our second learning task Search-and-Find, the results of training suggested initial improvement with a regression in performance due to fatigue. Interestingly, subjects for both paradigms showed no real ability to explicitly recall the different shape-pairs at test. We interpret these opposing results to indicate that learning in these paradigms is implicit and thus the explicit recall test is not an appropriate measure of knowledge on shape-pairs.
APA, Harvard, Vancouver, ISO, and other styles
46

Liu, Li. "Learning discriminative feature representations for visual categorization." Thesis, University of Sheffield, 2015. http://etheses.whiterose.ac.uk/8239/.

Full text
Abstract:
Learning discriminative feature representations has attracted a great deal of attention due to its potential value and wide usage in a variety of areas, such as image/video recognition and retrieval, human activity analysis, intelligent surveillance and human-computer interaction. In this thesis we first introduce a new boosted key-frame selection scheme for action recognition. Specifically, we propose to select a subset of key poses for the representation of each action via AdaBoost, and a new classifier, namely WLNBNN, is then developed for final classification. The experimental results of the proposed method are 0.6% - 13.2% better than previous work. After that, a domain-adaptive learning approach based on multiobjective genetic programming (MOGP) is developed for image classification. In this method, a set of primitive 2-D operators are randomly combined to construct feature descriptors through MOGP evolution and then evaluated by two objective fitness criteria, i.e., the classification error and the tree complexity; a (near-)optimal feature descriptor can then be obtained. The proposed approach achieves 0.9% - 25.9% better performance compared with state-of-the-art methods. Moreover, effective dimensionality reduction algorithms have also been widely used for obtaining better representations. In this thesis, we propose a novel linear unsupervised algorithm, termed Discriminative Partition Sparsity Analysis (DPSA), which explicitly considers the different probabilistic distributions that exist over the data points while simultaneously preserving the natural locality relationship among the data. All of the above methods have been systematically evaluated on several public datasets, showing accurate and robust performance (0.44% - 6.69% better than previous methods) for action and image categorization. Targeting efficient image classification, we also introduce a novel unsupervised framework termed evolutionary compact embedding (ECE), which can automatically learn task-specific binary hash codes. It is formulated as an optimization algorithm which combines genetic programming (GP) with a boosting trick. The experimental results show that ECE significantly outperforms alternatives by 1.58% - 2.19% on classification tasks. In addition, a supervised framework, bilinear local feature hashing (BLFH), is also proposed to learn highly discriminative binary codes on local descriptors for large-scale image similarity search. We address it as a nonconvex optimization problem, seeking orthogonal projection matrices for hashing that preserve the pairwise similarity between different local features while simultaneously taking image-to-class (I2C) distances into consideration. BLFH produces outstanding results (0.017% - 0.149% better) compared to state-of-the-art hashing techniques.
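The binary-code hashing setting targeted by ECE and BLFH above can be illustrated with a generic sketch: project features with an orthogonal matrix, take signs to obtain compact codes, and retrieve neighbours by Hamming distance. The random orthogonal projection here merely stands in for the learned, supervision-aware projections the abstract describes.

```python
# Illustrative sketch of binary hashing for similarity search: sign of an
# orthogonal projection gives compact codes; retrieval uses Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
d, n_bits, n_items = 128, 32, 1000

proj, _ = np.linalg.qr(rng.normal(size=(d, n_bits)))     # orthogonal columns

def encode(X):
    return (X @ proj > 0).astype(np.uint8)               # n x n_bits binary codes

def hamming_search(query_code, db_codes, k=5):
    dists = (db_codes != query_code).sum(axis=1)
    return np.argsort(dists)[:k]

X = rng.normal(size=(n_items, d))
codes = encode(X)
query = X[42] + 0.05 * rng.normal(size=d)                 # a near-duplicate of item 42
print(hamming_search(encode(query[None])[0], codes))      # item 42 should rank first
```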
APA, Harvard, Vancouver, ISO, and other styles
47

Jiang, Ping, and R. Unbehauen. "Robot visual servoing with iterative learning control." IEEE, 2002. http://hdl.handle.net/10454/3495.

Full text
Abstract:
This paper presents an iterative learning scheme for vision-guided robot trajectory tracking. First, a stability criterion for designing iterative learning controllers is proposed. It can be used for a system with initial resetting error. By using the criterion, the design problem can be converted into finding a positive definite discrete matrix kernel, and a more general form of learning control can be obtained. Then, a three-dimensional (3-D) trajectory tracking system with a single static camera to realize robot movement imitation is presented based on this criterion.
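A minimal sketch of the iterative learning control idea summarised above, in its simplest P-type form: over repeated executions of the same trajectory, the input is corrected with the previous trial's tracking error. The first-order plant, gain and trial length are illustrative assumptions, not the paper's visual servoing system.

```python
# Minimal sketch of P-type iterative learning control on a toy plant:
# repeat the same finite-length trajectory and update u with the last error.
import numpy as np

a, b, gamma = 0.8, 1.0, 0.5            # stable plant x(t+1) = a x(t) + b u(t)
T = 50
y_ref = np.sin(np.linspace(0, 2 * np.pi, T + 1))[1:]   # desired output over one trial

def run_trial(u):
    x, y = 0.0, np.zeros(T)
    for t in range(T):
        x = a * x + b * u[t]
        y[t] = x
    return y

u = np.zeros(T)
for k in range(25):                    # repeat the same trajectory 25 times
    y = run_trial(u)
    e = y_ref - y
    u = u + gamma * e                  # learning update from the previous trial
    if k % 8 == 0:
        print(f"trial {k:2d}  max |error| = {np.abs(e).max():.4f}")
```

With this gain the convergence factor |1 - gamma * b| = 0.5 is below one, so the tracking error shrinks from trial to trial, which is the behaviour the stability criterion above is meant to guarantee in the general case.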
APA, Harvard, Vancouver, ISO, and other styles
48

Al-Abood, Saleh Ahmed. "Effects of visual demonstrations on motor skill acquisition : a visual perception perspective." Thesis, Manchester Metropolitan University, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.340691.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Piñol, Naranjo Mónica. "Reinforcement learning of visual descriptors for object recognition." Doctoral thesis, Universitat Autònoma de Barcelona, 2014. http://hdl.handle.net/10803/283927.

Full text
Abstract:
El sistema visual humà és capaç de reconéixe l'objecte que hi ha en una imatge encara que l'objecte estigui parcialment oclòs, des de diferents punts de vista, en diferents colors i amb independència de la distància a la que es troba l'objecte de la càmera. Per poder realitzar això, l'ull obté l'imatge i extreu unes caracterítiques que són enviades al cervell i és allà on es classifica l'objecte per poder identificar-lo. En el reconeixement d'objectes, la visió per computador intenta imitar el sistema humà. Així, s'utilitza un algoritme per detectar característiques representatives de l'escena (detector), un altre algoritme per descriure les característiques extretes (descriptor) i finalment la informació es enviada a un tercer algoritme per fer la classificació (aprenentatge). Escollir aquests algoritmes és molt complicat i tant mateix una àrea d'investigació molt activa. En aquesta tesis ens hem enfocat en la selecció/aprenentatge del millor descriptor per a cada imatge. A l'actualitat hi ha molts descriptors a l'estat de l'art però no sabem quin es el millor, ja que no depèn sols d'ell mateix sinó també depen de les característiques de les imatges (base de dades) i dels algoritmes de classificació. Nosaltres proposem un marc de treball basat en l'aprenentatge per reforç i la bossa de característiques per poder escollir el millor descriptor per a cada imatge. El sistema permet analitzar el comportament de diferents classiicadors i conjunts de descriptors. A més el sistema que proposem per a la millora del reconeixement/classificació pot ser utilizat en altres àmbits de la visió per computador, com per exemple el video retrieval
The human visual system is able to recognize the object in an image even if the object is partially occluded, seen from various points of view, in different colors, or independently of the distance to the object. To do this, the eye obtains an image and extracts features that are sent to the brain, where the object is recognized. In computer vision, the object recognition field tries to learn from the behaviour of the human visual system to achieve its goal. Hence, an algorithm is used to identify representative features of the scene (detection), another algorithm is used to describe these points (descriptor), and finally the extracted information is used for classifying the object in the scene. The selection of this set of algorithms is a very complicated task and thus a very active research field. In this thesis we focus on the selection/learning of the best descriptor for a given image. There are several descriptors in the state of the art, but we do not know how to choose the best one, because the choice depends on the scenes used (the dataset) and on the algorithm chosen for classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor for a given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore, the proposed framework for improving the classification/recognition ratio can be used, with minor changes, in other computer vision fields, such as video retrieval.
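The descriptor-selection framework described above can be sketched as a reinforcement-learning (bandit-style) loop: the choice of descriptor is an action, a coarse image state conditions that choice, and a reward of 1 is given when the resulting classification is correct. The states, descriptor names and success rates below are toy stand-ins for the thesis's bag-of-features setup.

```python
# Illustrative sketch: learn which descriptor to pick per image state from
# classification rewards, using an epsilon-greedy, bandit-style Q update.
import numpy as np

rng = np.random.default_rng(0)
n_states, descriptors = 4, ['SIFT', 'SURF', 'BRIEF']
# Hidden probability that each descriptor yields a correct label per state.
p_correct = rng.uniform(0.2, 0.9, size=(n_states, len(descriptors)))

Q = np.zeros((n_states, len(descriptors)))
alpha, epsilon = 0.1, 0.1

for episode in range(5000):
    s = rng.integers(n_states)                           # coarse state of the image
    if rng.random() < epsilon:
        a = rng.integers(len(descriptors))               # explore
    else:
        a = int(np.argmax(Q[s]))                         # exploit current estimate
    reward = float(rng.random() < p_correct[s, a])       # correct classification?
    Q[s, a] += alpha * (reward - Q[s, a])                # running estimate of success

for s in range(n_states):
    print(f"state {s}: preferred descriptor = {descriptors[int(np.argmax(Q[s]))]}")
```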
APA, Harvard, Vancouver, ISO, and other styles
50

Salazar, Rodrigo F. "Top-down signals and learning in visual cortices /." [Zürich], 2004. http://e-collection.ethbib.ethz.ch/show?type=diss&nr=15718.

Full text
APA, Harvard, Vancouver, ISO, and other styles