Academic literature on the topic 'Discriminative Pose Robust Descriptors'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Discriminative Pose Robust Descriptors.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Discriminative Pose Robust Descriptors"

1

Kniaz, V. V., V. V. Fedorenko, and N. A. Fomin. "DEEP LEARNING FOR LOW-TEXTURED IMAGE MATCHING." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2 (May 30, 2018): 513–18. http://dx.doi.org/10.5194/isprs-archives-xlii-2-513-2018.

Abstract:
Low-textured objects pose challenges for automatic 3D model reconstruction. Such objects are common in archeological applications of photogrammetry. Most common feature point descriptors fail to match local patches in featureless regions of an object. Hence, automatic documentation of the archeological process using Structure from Motion (SfM) methods is challenging. Nevertheless, such documentation is possible with the aid of a human operator. Deep learning-based descriptors have recently outperformed most common feature point descriptors. This paper is focused on the development of a new Wide Image Zone Adaptive Robust feature Descriptor (WIZARD) based on deep learning. We use a convolutional auto-encoder to compress discriminative features of a local patch into a descriptor code. We build a codebook to perform point matching on multiple images. The matching is performed using nearest neighbor search and a modified voting algorithm. We present a new “Multi-view Amphora” (Amphora) dataset for evaluation of point matching algorithms. The dataset includes images of an Ancient Greek vase found on the Taman Peninsula in Southern Russia. The dataset provides color images, a ground truth 3D model, and a ground truth optical flow. We evaluated the WIZARD descriptor on the “Amphora” dataset to show that it outperforms the SIFT and SURF descriptors on complex patch pairs.
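The codebook-based matching step described in the abstract (nearest-neighbor search over descriptor codes, followed by a filtering rule) can be sketched in plain NumPy. This is a generic illustration using Lowe's ratio test as the filtering criterion, not the authors' WIZARD voting algorithm; all names here are hypothetical.

```python
import numpy as np

def match_descriptors(query, codebook, ratio=0.8):
    """Match each query descriptor to its nearest codebook entry,
    keeping only matches whose nearest neighbor is clearly closer
    than the second-nearest (Lowe's ratio test)."""
    # Pairwise Euclidean distances, shape (n_query, n_codebook)
    d = np.linalg.norm(query[:, None, :] - codebook[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    nearest, second = order[:, 0], order[:, 1]
    rows = np.arange(len(query))
    keep = d[rows, nearest] < ratio * d[rows, second]
    return [(i, nearest[i]) for i in np.flatnonzero(keep)]

# Toy example: one query descriptor close to codebook entry 1
codebook = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
query = np.array([[9.9, 0.1]])
matches = match_descriptors(query, codebook)
```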
2

Rabba, Salah, Matthew Kyan, Lei Gao, Azhar Quddus, Ali Shahidi Zandi, and Ling Guan. "Discriminative Robust Head-Pose and Gaze Estimation Using Kernel-DMCCA Features Fusion." International Journal of Semantic Computing 14, no. 01 (March 2020): 107–35. http://dx.doi.org/10.1142/s1793351x20500014.

Abstract:
There remain outstanding challenges for improving the accuracy of multi-feature information for head-pose and gaze estimation. The proposed framework employs discriminative analysis for head-pose and gaze estimation using kernel discriminative multiple canonical correlation analysis (K-DMCCA). The feature extraction component of the framework includes spatial indexing, statistical and geometrical elements. Head-pose and gaze estimation is constructed by feature aggregation and transforming features into a higher dimensional space using K-DMCCA for accurate estimation. The two main contributions are: enhancing fusion performance through the use of kernel-based DMCCA, and introducing an improved iris region descriptor based on quadtree. The overall approach is also inclusive of statistical and geometrical indexing that is calibration free (does not require any subsequent adjustment). We validate the robustness of the proposed framework across a wide variety of datasets, which consist of different modalities (RGB and Depth), constraints (wide range of head-poses, not only frontal), quality (accurately labelled for validation), occlusion (due to glasses, hair bangs, facial hair) and illumination. Our method achieved accurate head-pose and gaze estimation of 4.8° using Cave, 4.6° using MPII, 5.1° using ACS, 5.9° using EYEDIAP, 4.3° using OSLO and 4.6° using UULM datasets.
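The fusion idea underlying the paper rests on canonical correlation analysis. A minimal plain (linear, two-view) CCA via whitening and SVD is sketched below; this is a simplified stand-in for the kernel-based, multi-view DMCCA the paper actually uses, and the regularization constant is an assumption for numerical stability.

```python
import numpy as np

def cca_correlations(X, Y, reg=1e-6):
    """Canonical correlations between two feature views X (n x p) and
    Y (n x q), computed by whitening each view and taking the SVD of
    the whitened cross-covariance. Returns correlations, largest first."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = len(X)
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)

# Toy check: if Y is a linear function of X, correlations approach 1
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Y = X @ rng.normal(size=(3, 2))
corrs = cca_correlations(X, Y)
```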
3

Kawulok, Michal, Jakub Nalepa, Jolanta Kawulok, and Bogdan Smolka. "Dynamics of facial actions for assessing smile genuineness." PLOS ONE 16, no. 1 (January 5, 2021): e0244647. http://dx.doi.org/10.1371/journal.pone.0244647.

Abstract:
Applying computer vision techniques to distinguish between spontaneous and posed smiles is an active research topic of affective computing. Although many works have been published addressing this problem and a couple of excellent benchmark databases have been created, the existing state-of-the-art approaches do not exploit the action units defined within the Facial Action Coding System, which has become a standard in facial expression analysis. In this work, we explore the possibilities of extracting discriminative features directly from the dynamics of facial action units to differentiate between genuine and posed smiles. We report the results of our experimental study, which shows that the proposed features offer competitive performance to those based on facial landmark analysis and on textural descriptors extracted from spatial-temporal blocks. We make these features publicly available for the UvA-NEMO and BBC databases, which will allow other researchers to further improve the classification scores, while preserving the interpretation capabilities attributed to the use of facial action units. Moreover, we have developed a new technique for identifying the smile phases, which is robust against noise and allows for continuous analysis of facial videos.
4

Sanyal, Soubhik, Sivaram Prasad Mudunuri, and Soma Biswas. "Discriminative pose-free descriptors for face and object matching." Pattern Recognition 67 (July 2017): 353–65. http://dx.doi.org/10.1016/j.patcog.2017.02.016.

5

Singh, Geetika, and Indu Chhabra. "Discriminative Moment Feature Descriptors for Face Recognition." International Journal of Computer Vision and Image Processing 5, no. 2 (July 2015): 81–97. http://dx.doi.org/10.4018/ijcvip.2015070105.

Abstract:
Zernike Moment (ZM) is a promising technique to extract invariant features for face recognition. It has been modified in previous studies to Discriminative ZM (DZM), which selects the most discriminative features to perform recognition and shows improved results. The present paper proposes a modification of DZM, named Modified DZM (MDZM), which selects coefficients based on their discriminative ability by considering the extent of variability between their class averages. This reduces within-class variations while maintaining between-class differences. The study also investigates this idea of feature selection on the recently introduced Polar Complex Exponential Transform (PCET) (named discriminative or DPCET). Performance of the techniques is evaluated on the ORL, Yale and FERET databases against pose, illumination, expression and noise variations. Accuracy improves by up to 3.1% with MDZM at reduced dimensions over ZM and DZM. DPCET shows a further 1.9% improvement at lower computational complexity. Performance is also tested on the LFW database and compared with many other state-of-the-art approaches.
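The selection principle in the abstract (keep coefficients whose class averages vary strongly between classes relative to their within-class spread) is the classic Fisher-ratio criterion. Below is a generic sketch of that criterion; it illustrates the idea behind MDZM but is not the paper's implementation, and the function names are hypothetical.

```python
import numpy as np

def fisher_scores(features, labels):
    """Score each feature column by between-class variance of the class
    means divided by total within-class variance."""
    classes = np.unique(labels)
    overall = features.mean(axis=0)
    between = np.zeros(features.shape[1])
    within = np.zeros(features.shape[1])
    for c in classes:
        Xc = features[labels == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def select_top(features, labels, k):
    """Indices of the k most discriminative coefficients."""
    return np.argsort(fisher_scores(features, labels))[::-1][:k]

# Toy example: column 0 separates the two classes, column 1 is noise
labels = np.array([0, 0, 1, 1])
feats = np.array([[0.0, 5.0], [0.1, 5.1], [10.0, 5.05], [10.1, 4.95]])
top = select_top(feats, labels, 1)
```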
6

Hajraoui, Abdellatif, and Mohamed Sabri. "Generic and Robust Method for Head Pose Estimation." Indonesian Journal of Electrical Engineering and Computer Science 4, no. 2 (November 1, 2016): 439. http://dx.doi.org/10.11591/ijeecs.v4.i2.pp439-446.

Abstract:
Head pose estimation has fascinated the research community due to its applications in facial motion capture, human-computer interaction and video conferencing. It is a pre-requisite to gaze tracking, face recognition, and facial expression analysis. In this paper, we present a generic and robust method for model-based global 2D head pose estimation from a single RGB image. In our approach, we use Gabor filters to design a pose descriptor that is robust to illumination and facial expression variations and that targets the pose information. We then classify these descriptors using an SVM classifier. The approach has proved effective, given the rate of correct pose estimates obtained.
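A Gabor-based descriptor of the kind mentioned above starts from a bank of oriented Gabor kernels. The sketch below builds such kernels with the standard real Gabor formula; the parameter values are illustrative defaults, not those of the paper.

```python
import numpy as np

def gabor_kernel(size, theta, lam=8.0, sigma=4.0, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel: a Gaussian envelope modulating a cosine
    wave at orientation `theta` with wavelength `lam`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)     # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam + psi)
    return envelope * carrier

# A small bank of orientations, as typically used for pose descriptors
bank = [gabor_kernel(15, t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving an image with each kernel in the bank and pooling the responses yields the orientation-sensitive features an SVM can then classify.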
7

Lin, Guojun, Meng Yang, Linlin Shen, Mingzhong Yang, and Mei Xie. "Robust and discriminative dictionary learning for face recognition." International Journal of Wavelets, Multiresolution and Information Processing 16, no. 02 (March 2018): 1840004. http://dx.doi.org/10.1142/s0219691318400040.

Abstract:
For face recognition, conventional dictionary learning (DL) methods have some disadvantages. First, face images of the same person vary with facial expressions, pose, illumination and disguises, so it is hard to obtain a robust dictionary for face recognition. Second, they don't cover important components (e.g., particularity and disturbance) completely, which limits their performance. In this paper, we propose a novel robust and discriminative DL (RDDL) model. The proposed model uses sample diversities of the same face image to learn a robust dictionary, which includes class-specific dictionary atoms and disturbance dictionary atoms. These atoms can well represent the data from different classes. Discriminative regularizations on the dictionary and the representation coefficients are used to exploit discriminative information, which effectively improves the classification capability of the dictionary. The proposed RDDL is extensively evaluated on benchmark face image databases, and it shows superior performance to many state-of-the-art dictionary learning methods for face recognition.
8

Singh, Geetika, and Indu Chhabra. "Integrating Global Zernike and Local Discriminative HOG Features for Face Recognition." International Journal of Image and Graphics 16, no. 04 (October 2016): 1650021. http://dx.doi.org/10.1142/s0219467816500212.

Abstract:
Extraction of global face appearance and local interior differences is essential for any face recognition application. This paper presents a novel framework for face recognition that combines two effective descriptors, namely Zernike moments (ZM) and histogram of oriented gradients (HOG). ZMs are global descriptors that are invariant to image rotation, noise and scale. HOGs capture local details and are robust to illumination changes. Fusion of these two descriptors combines the merits of both local and global approaches and is effective against diverse variations present in face images. Further, as the processing time of HOG features is high owing to their large dimensionality, the study proposes to improve performance by selecting only the most discriminative HOG features (named discriminative HOG (DHOG)) for performing recognition. Efficacy of the proposed methods (DHOG, [Formula: see text] and [Formula: see text]) is tested on the ORL, Yale and FERET databases. DHOG provides an improvement of 3% to 5% over the existing HOG approach. Recognition results achieved by [Formula: see text] and [Formula: see text] are up to 15% and 18% higher, respectively, than those obtained with these descriptors individually. Performance is also analyzed on the LFW face database and compared with recent and state-of-the-art methods.
9

SCHWARTZ, WILLIAM ROBSON, and HELIO PEDRINI. "IMPROVED FRACTAL IMAGE COMPRESSION BASED ON ROBUST FEATURE DESCRIPTORS." International Journal of Image and Graphics 11, no. 04 (October 2011): 571–87. http://dx.doi.org/10.1142/s0219467811004251.

Abstract:
Fractal image compression is one of the most promising techniques for image compression due to advantages such as resolution independence and fast decompression. It exploits the fact that natural scenes present self-similarity to remove redundancy and obtain high compression rates with smaller quality degradation compared to traditional compression methods. The main drawback of fractal compression is its computationally intensive encoding process, due to the need for searching regions with high similarity in the image. Several approaches have been developed to reduce the computational cost to locate similar regions. In this work, we propose a method based on robust feature descriptors to speed up the encoding time. The use of robust features provides more discriminative and representative information for regions of the image. When the regions are better represented, the search for similar parts of the image can be reduced to focus only on the most likely matching candidates, which leads to reduction on the computational time. Our experimental results show that the use of robust feature descriptors reduces the encoding time while keeping high compression rates and reconstruction quality.
10

Chen, Si, Dong Yan, and Yan Yan. "Directional Correlation Filter Bank for Robust Head Pose Estimation and Face Recognition." Mathematical Problems in Engineering 2018 (October 21, 2018): 1–10. http://dx.doi.org/10.1155/2018/1923063.

Abstract:
During the past few decades, face recognition has been an active research area in pattern recognition and computer vision due to its wide range of applications. However, one of the most challenging problems encountered by face recognition is the difficulty of handling large head pose variations. Therefore, the efficient and effective head pose estimation is a critical step of face recognition. In this paper, a novel feature extraction framework, called Directional Correlation Filter Bank (DCFB), is presented for head pose estimation. Specifically, in the proposed framework, the 1-Dimensional Optimal Tradeoff Filters (1D-OTF) corresponding to different head poses are simultaneously and jointly designed in the low-dimensional linear subspace. Different from the traditional methods that heavily rely on the precise localization of the key facial feature points, our proposed framework exploits the frequency domain of the face images, which effectively captures the high-order statistics of faces. As a result, the obtained features are compact and discriminative. Experimental results on public face databases with large head pose variations show the superior performance obtained by the proposed framework on the tasks of both head pose estimation and face recognition.
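The key idea above, working in the frequency domain of face images rather than on localized landmarks, is well illustrated by phase correlation, the simplest frequency-domain matching technique. The sketch below recovers a circular shift between two images; it demonstrates the general principle, not the paper's DCFB filters.

```python
import numpy as np

def phase_correlate(a, b):
    """Recover the translation taking image `b` to image `a` via the
    normalized cross-power spectrum. Returns (dy, dx), modulo size."""
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    F /= np.abs(F) + 1e-12                  # keep phase only
    r = np.fft.ifft2(F).real                # peak marks the shift
    return np.unravel_index(np.argmax(r), r.shape)

# Toy example: a circularly shifted copy of a random image
rng = np.random.default_rng(1)
img = rng.random((32, 32))
shifted = np.roll(img, (3, 5), axis=(0, 1))
dy, dx = phase_correlate(shifted, img)
```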

Dissertations / Theses on the topic "Discriminative Pose Robust Descriptors"

1

Ferraz, Colomina Luis. "Viewpoint invariant features and robust monocular Camera pose estimation." Doctoral thesis, Universitat Autònoma de Barcelona, 2016. http://hdl.handle.net/10803/368568.

Abstract:
Camera pose with respect to a real world scene determines the perspective projection of the scene on the image plane. The analysis of the deformations between pairs of images due to perspective and camera pose has led many Computer Vision researchers to deal with problems such as the ability to detect and match the same local features in different images, or recovering for each image its original camera pose. The difference between both problems lies in the locality of the image information: while for local features we look for local invariance, for camera pose we look for more global information sources, like sets of local features. Local feature detection is a cornerstone of a wide range of Computer Vision applications since it allows matching and localizing specific image regions. In the first part of this work, local invariance of features is tackled by proposing algorithms to improve the robustness to image perturbations, perspective changes and discriminative power from two points of view: (i) accurate detection of non-redundant corner and blob image structures based on their movement along different scales, and (ii) learning robust descriptors. Concretely, we propose three scale invariant detectors, one of which detects corners and blobs simultaneously with a negligible computational overhead. We also propose one affine invariant blob detector. In terms of descriptors, we propose to learn them using Convolutional Neural Networks and large datasets of image regions annotated under different image conditions. Despite being a topic researched for decades, camera pose estimation is still an open challenge. The goal of the Perspective-n-Point (PnP) problem is to estimate the location and orientation of a calibrated camera from n known 3D-to-2D point correspondences between a previously known 3D model of a real scene and 2D features obtained from a single image.
In the second part of this thesis, camera pose estimation is addressed with novel PnP approaches, which drastically reduce the computational cost, allowing real-time applications independently of the number of correspondences. In addition, we provide an integrated outlier rejection mechanism with a negligible computational overhead and a novel method to increase the accuracy by modelling the reprojection error of each correspondence. Finally, in the case of complex and huge scenarios, with perhaps hundreds of thousands of features, it is difficult and computationally expensive to find correct 3D-to-2D correspondences. In this case, a robust and accurate top-down approach for camera pose estimation is proposed. Our approach takes advantage of high-level classifiers, which estimate a rough camera pose, in order to constrain the 3D-to-2D correspondences to be used by our accurate and outlier-robust PnP method.
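The quantity at the heart of the PnP formulation above is the per-correspondence reprojection error: project each 3D point through the candidate pose and intrinsics, and measure the pixel distance to its matched 2D feature. A minimal sketch of that computation (not the thesis's solver) is:

```python
import numpy as np

def reprojection_errors(K, R, t, pts3d, pts2d):
    """Pixel reprojection error for each 3D-to-2D correspondence.
    K: 3x3 intrinsics, R: 3x3 rotation, t: 3-vector translation,
    pts3d: Nx3 world points, pts2d: Nx2 observed image points."""
    cam = pts3d @ R.T + t            # world frame -> camera frame
    proj = cam @ K.T                 # pinhole projection (homogeneous)
    proj = proj[:, :2] / proj[:, 2:3]
    return np.linalg.norm(proj - pts2d, axis=1)

# Toy example with an identity pose: errors are (near) zero by construction
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
pts3d = np.array([[0.0, 0.0, 2.0], [0.5, 0.2, 4.0]])
pts2d = np.array([[320.0, 240.0], [382.5, 265.0]])
errs = reprojection_errors(K, R, t, pts3d, pts2d)
```

A PnP solver searches for the (R, t) minimizing these errors; a RANSAC-style outlier rejection simply thresholds them.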
2

Sanyal, Soubhik. "Discriminative Descriptors for Unconstrained Face and Object Recognition." Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4177.

Abstract:
Face and object recognition is a challenging problem in the field of computer vision. It deals with identifying faces or objects from an image or video. Due to its numerous applications in biometrics, security, multimedia processing, on-line shopping, psychology and neuroscience, automated vehicle parking systems, autonomous driving and machine inspection, it has drawn attention from many researchers. Researchers have studied different aspects of this problem. Among them, pose robust matching is a very important problem with various applications, like recognizing faces and objects in uncontrolled scenarios in which the images appear in a wide variety of pose and illumination conditions along with low resolution. In this thesis, we propose three discriminative pose-free descriptors, Subspace Point Representation (DPF-SPR), Layered Canonical Correlated (DPF-LCC) and Aligned Discriminative Pose Robust (ADPR) descriptors, for matching faces and objects across pose. They are also robust for recognition in low resolution and varying illumination. We use training examples at very few poses to generate virtual intermediate pose subspaces. An image is represented by a feature set obtained by projecting its low-level feature on these subspaces. This way we gather more information regarding the unseen poses by generating synthetic data and make our features more robust towards unseen pose variations. Then we apply a discriminative transform to make this feature set suitable for recognition, generating two of our descriptors, namely DPF-SPR and DPF-LCC. In one approach, we transform it to a vector by using a subspace-to-point representation technique, which generates our DPF-SPR descriptor. In the second approach, layered structures of canonically correlated subspaces are formed, onto which the feature set is projected, which generates our DPF-LCC descriptor. 
In a third approach, we first align the remaining subspaces with the frontal one before learning the discriminative metric, and concatenate the aligned discriminative projected features to generate ADPR. We conduct experiments on recognizing faces and objects across varying pose. Specifically, we experiment on the MultiPIE and Surveillance Cameras Face databases for face recognition, and on the COIL-20 and RGB-D datasets for object recognition. We show that our approaches can even improve the recognition rate over state-of-the-art deep learning approaches. We also perform an extensive analysis of our three descriptors to gain a better qualitative understanding. We compare with the state-of-the-art to show the effectiveness of the proposed approaches.
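The common skeleton of the three descriptors above is representing an image by its projections onto several pose subspaces. The sketch below builds PCA subspaces and concatenates the projections; it is a much-simplified illustration of the subspace-projection step and omits the discriminative transform, subspace-to-point mapping, and alignment that distinguish DPF-SPR, DPF-LCC and ADPR.

```python
import numpy as np

def pca_basis(samples, dim):
    """Orthonormal basis (n_features x dim) of the top principal
    directions of a set of samples, e.g. images rendered at one pose."""
    X = samples - samples.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:dim].T

def multi_subspace_feature(x, bases):
    """Represent a low-level feature vector by its coordinates in each
    pose subspace, concatenated into one descriptor."""
    return np.concatenate([B.T @ x for B in bases])

# Toy example: two 2-D "pose subspaces" of a 4-D feature space
rng = np.random.default_rng(2)
bases = [pca_basis(rng.normal(size=(20, 4)), 2) for _ in range(2)]
f = multi_subspace_feature(rng.normal(size=4), bases)
```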

Book chapters on the topic "Discriminative Pose Robust Descriptors"

1

Chuang, Meng-Che, Jenq-Neng Hwang, and Kresimir Williams. "Automatic Fish Segmentation and Recognition for Trawl-Based Cameras." In Computer Vision, 847–74. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5204-8.ch034.

Abstract:
Camera-based fish abundance estimation with the aid of visual analysis techniques has drawn increasing attention. Live fish segmentation and recognition in open aquatic habitats, however, suffers from fast light attenuation, ubiquitous noise and non-lateral views of fish. In this chapter, an automatic live fish segmentation and recognition framework for trawl-based cameras is proposed. To mitigate the illumination issues, double local thresholding method is integrated with histogram backprojection to produce an accurate shape of fish segmentation. For recognition, a hierarchical partial classification is learned so that the coarse-to-fine categorization stops at any level where ambiguity exists. Attributes from important fish anatomical parts are focused to generate discriminative feature descriptors. Experiments on mid-water image sets show that the proposed framework achieves up to 93% of accuracy on live fish recognition based on automatic and robust segmentation results.
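The histogram backprojection step mentioned in the abstract scores each image pixel by how common its value is in a model region. A minimal 1-D (hue-channel) sketch follows; the chapter combines this with double local thresholding, which is not shown here, and the bin count is an illustrative choice.

```python
import numpy as np

def backproject(model_hue, image_hue, bins=32):
    """Score each image pixel by the (max-normalized) frequency of its
    hue in the model region. Hues are assumed to lie in [0, 1)."""
    hist, _ = np.histogram(model_hue, bins=bins, range=(0.0, 1.0))
    hist = hist / hist.max()
    idx = np.clip((image_hue * bins).astype(int), 0, bins - 1)
    return hist[idx]                 # same shape as image_hue

# Toy example: model region is uniformly hue 0.5
model = np.full(100, 0.5)
image = np.array([[0.5, 0.9], [0.51, 0.1]])
scores = backproject(model, image)
```

Thresholding the score map then yields a segmentation mask of model-colored pixels.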

Conference papers on the topic "Discriminative Pose Robust Descriptors"

1

Sanyal, Soubhik, Devraj Mandal, and Soma Biswas. "Aligned discriminative pose robust descriptors for face and object recognition." In 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 2017. http://dx.doi.org/10.1109/icip.2017.8296395.

2

Sanyal, Soubhik, Sivaram Prasad Mudunuri, and Soma Biswas. "Discriminative Pose-Free Descriptors for Face and Object Matching." In 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015. http://dx.doi.org/10.1109/iccv.2015.437.

3

Seo, Jeong-Jik, Hyung-Il Kim, and Yong Man Ro. "Pose-Robust and Discriminative Feature Representation by Multi-task Deep Learning for Multi-view Face Recognition." In 2015 IEEE International Symposium on Multimedia (ISM). IEEE, 2015. http://dx.doi.org/10.1109/ism.2015.93.

4

Nayak, Anshul, Azim Eskandarian, Prasenjit Ghorai, and Zachary Doerzaph. "A Comparative Study on Feature Descriptors for Relative Pose Estimation in Connected Vehicles." In ASME 2021 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2021. http://dx.doi.org/10.1115/imece2021-70693.

Abstract:
In cooperative perception, reliable detection and localization of surrounding objects and communicating the information between vehicles are necessary for safety. However, V2V transmission of huge datasets or images can be computationally expensive and pose bandwidth issues, often making real-time implementation infeasible. An efficient and robust approach for ensuring such cooperation can be achieved by relative pose estimation between two vehicles sharing a common field of view. Especially when an object is not in the field of view of an ego vehicle, dynamically detecting the object and transferring its location in real-time to the ego vehicle is necessary. In such scenarios, reliable and robust pose recovery at each instant ensures accurate trajectory estimation by the ego vehicle. In our current study, pose recovery is achieved through common visual features present in a pair of images. Traditionally, algorithms like SIFT and KAZE have been used to detect and match features between an image pair from sensors looking from a different perspective. However, with the recent advent of binary detection and description algorithms like ORB and AKAZE, we analyze and present a comparative study of the efficacy and robustness of such methods for feature matching. The performance metrics for each method are based on total detected features, the number of good matches, and the computation time. The current study also tests the performance of each method under varying degrees of angular orientation and camera exposure settings, which can be helpful for motion estimation under dim light. Overall, AKAZE was computationally fastest, while ORB and SIFT fared equally on the other parameters. The corresponding research can be a precursor to future trajectory prediction of dynamic objects by the ego vehicle when there is a sudden loss of communication with the lead vehicle after the initial data transfer.
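One practical reason binary descriptors such as ORB and AKAZE are fast is that matching reduces to Hamming distance on packed bit strings. A generic brute-force Hamming matcher is sketched below (not the paper's evaluation code); real descriptors would come from a feature detector, here they are tiny hand-made byte arrays.

```python
import numpy as np

def hamming_match(desc_a, desc_b):
    """For each binary descriptor in desc_a (rows of packed uint8 bytes),
    return the index of its nearest row in desc_b and the bit distance."""
    # XOR the byte arrays, then count differing bits per pair
    x = np.bitwise_xor(desc_a[:, None, :], desc_b[None, :, :])
    dist = np.unpackbits(x, axis=2).sum(axis=2)
    nearest = dist.argmin(axis=1)
    return nearest, dist[np.arange(len(desc_a)), nearest]

# Toy 16-bit descriptors (2 bytes each)
a = np.array([[0b00001111, 0]], dtype=np.uint8)
b = np.array([[0b00001111, 0], [0, 0]], dtype=np.uint8)
nearest, d = hamming_match(a, b)
```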
5

Zhang, Yichen, Jiehong Lin, Ke Chen, Zelin Xu, Yaowei Wang, and Kui Jia. "Manifold-Aware Self-Training for Unsupervised Domain Adaptation on Regressing 6D Object Pose." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/193.

Abstract:
Domain gap between synthetic and real data in visual regression (e.g., 6D pose estimation) is bridged in this paper via global feature alignment and local refinement on the coarse classification of discretized anchor classes in target space, which imposes a piece-wise target manifold regularization into domain-invariant representation learning. Specifically, our method incorporates an explicit self-supervised manifold regularization, revealing consistent cumulative target dependency across domains, to a self-training scheme (e.g., the popular Self-Paced Self-Training) to encourage more discriminative transferable representations of regression tasks. Moreover, learning unified implicit neural functions to estimate relative direction and distance of targets to their nearest class bins aims to refine target classification predictions, which can gain robust performance against inconsistent feature scaling sensitive to UDA regressors. Experiment results on three public benchmarks of the challenging 6D pose estimation task can verify the effectiveness of our method, consistently achieving superior performance to the state-of-the-art for UDA on 6D pose estimation. Codes and pre-trained models are available https://github.com/Gorilla-Lab-SCUT/MAST.
6

Sarkhel, Ritesh, and Arnab Nandi. "Deterministic Routing between Layout Abstractions for Multi-Scale Classification of Visually Rich Documents." In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/466.

Abstract:
Classifying heterogeneous visually rich documents is a challenging task. Difficulty of this task increases even more if the maximum allowed inference turnaround time is constrained by a threshold. The increased overhead in inference cost, compared to the limited gain in classification capabilities make current multi-scale approaches infeasible in such scenarios. There are two major contributions of this work. First, we propose a spatial pyramid model to extract highly discriminative multi-scale feature descriptors from a visually rich document by leveraging the inherent hierarchy of its layout. Second, we propose a deterministic routing scheme for accelerating end-to-end inference by utilizing the spatial pyramid model. A depth-wise separable multi-column convolutional network is developed to enable our method. We evaluated the proposed approach on four publicly available, benchmark datasets of visually rich documents. Results suggest that our proposed approach demonstrates robust performance compared to the state-of-the-art methods in both classification accuracy and total inference turnaround.
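The spatial pyramid idea in the abstract, extracting features at several grid resolutions and concatenating them, can be sketched with simple intensity histograms. This is a generic illustration of the multi-scale pooling scheme, not the paper's deep feature extractor; the level and bin counts are illustrative.

```python
import numpy as np

def spatial_pyramid(img, levels=3, bins=8):
    """Multi-scale descriptor: per-cell intensity histograms over an
    L-level spatial pyramid (1x1, 2x2, 4x4 grids), concatenated.
    Each histogram is normalized by its cell's pixel count."""
    feats = []
    for level in range(levels):
        cells = 2 ** level
        for rows in np.array_split(img, cells, axis=0):
            for block in np.array_split(rows, cells, axis=1):
                hist, _ = np.histogram(block, bins=bins, range=(0, 256))
                feats.append(hist / (block.size + 1e-12))
    return np.concatenate(feats)

# Toy example: an all-zero 16x16 image
f = spatial_pyramid(np.zeros((16, 16), dtype=np.uint8))
```

Finer levels capture local layout while the coarsest level summarizes the whole document, which is what makes the representation "multi-scale".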
7

Bhabhrawala, Talib, and Venkat Krovi. "Shape Recovery From Medical Image Data Using Extended Superquadrics." In ASME 2005 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. ASMEDC, 2005. http://dx.doi.org/10.1115/detc2005-84738.

Abstract:
Rapid and representative reconstruction of geometric shape models from surface measurements has applications in diverse arenas ranging from industrial product design to biomedical organ/tissue modeling. However, despite the large body of work, most shape models have had limited success in bridging the gap between reconstruction, recognition, and analysis due to conflicting requirements. On one hand, large numbers of shape parameters are necessary to obtain meaningful information from noisy sensor data. On the other hand, search and recognition techniques require shape parameterizations/abstractions employing few robust shape descriptors. The extension of such shape models to encompass various analysis modalities (in the form of kinematics, dynamics and FEA) now necessitates the inclusion of the appropriate physics (preferably in parametric form) to support the simulation based refinement process. Thus, in this paper we discuss development of a class of parametric shape abstraction models termed as extended superquadrics. The underlying geometric and computational data structure intimately ties together implicit-, explicit- and parametric- surface representation together with a volumetric solid representation that makes them well suited for shape representation. Furthermore, such models are well suited for transitioning to analysis, as for example, in model-based non rigid structure and motion recovery or for mesh generation and simplified volumetric-FEA applications. However, the development of the concomitant methods and benchmarking is necessary prior to widespread acceptance. We will explore some of these aspects further in this paper supported with case studies of shape abstraction from image data in the biomedical/life-sciences arena whose diversity and irregularities pose difficulties for more traditional models.
