Academic literature on the topic 'Scene coordinates regression (SCR)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Scene coordinates regression (SCR).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Scene coordinates regression (SCR)":

1

Huang, Min, Zexu Liu, Tianen Liu, and Jingyang Wang. "CCDS-YOLO: Multi-Category Synthetic Aperture Radar Image Object Detection Model Based on YOLOv5s." Electronics 12, no. 16 (August 18, 2023): 3497. http://dx.doi.org/10.3390/electronics12163497.

Abstract:
Synthetic Aperture Radar (SAR) is an active microwave sensor that has attracted widespread attention due to its ability to observe the ground around the clock. Research on multi-scale and multi-category target detection methods holds great significance in the fields of maritime resource management and wartime reconnaissance. However, complex scenes often influence SAR object detection, and the diversity of target scales also brings challenges to research. This paper proposes a multi-category SAR image object detection model, CCDS-YOLO, based on YOLOv5s, to address these issues. Embedding the Convolutional Block Attention Module (CBAM) in the feature extraction part of the backbone network enhances the model's ability to extract and fuse spatial information and channel information. The 1 × 1 convolution in the feature pyramid network and the first layer convolution of the detection head are replaced with the expanded convolution, Coordinate Convolution (CoordConv), forming a CRD-FPN module. This module more accurately perceives the spatial details of the feature map, enhancing the model's ability to handle regression tasks compared to traditional convolution. In the detector segment, a decoupled head is utilized for feature extraction, offering optimal and effective feature information for the classification and regression branches separately. The traditional Non-Maximum Suppression (NMS) is substituted with Soft Non-Maximum Suppression (Soft-NMS), successfully reducing the model's duplicate detection rate for compact objects. Based on the experimental findings, the approach presented in this paper demonstrates excellent results in multi-category target recognition for SAR images. Empirical comparisons are conducted on the filtered MSAR dataset. Compared with YOLOv5s, the performance of CCDS-YOLO is significantly improved: the mAP@0.5 value increases by 3.3% to 92.3%, the precision increases by 3.4%, and the mAP@0.5:0.95 increases by 6.7%. Furthermore, in comparison with other mainstream detection models, CCDS-YOLO stands out in overall performance and anti-interference ability.
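To illustrate the CoordConv operation mentioned in this abstract (a convolution applied after appending normalized x/y coordinate channels), here is a minimal PyTorch sketch; the layer name, channel counts, and feature-map sizes are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class CoordConv2d(nn.Module):
    """Convolution that first appends normalized x/y coordinate channels.

    Minimal sketch of the CoordConv idea; hyperparameters are illustrative,
    not taken from the CCDS-YOLO paper.
    """
    def __init__(self, in_channels, out_channels, kernel_size=1, **kwargs):
        super().__init__()
        # +2 input channels for the x and y coordinate maps
        self.conv = nn.Conv2d(in_channels + 2, out_channels, kernel_size, **kwargs)

    def forward(self, x):
        b, _, h, w = x.shape
        ys = torch.linspace(-1.0, 1.0, h, device=x.device)
        xs = torch.linspace(-1.0, 1.0, w, device=x.device)
        yy, xx = torch.meshgrid(ys, xs, indexing="ij")
        coords = torch.stack([xx, yy]).unsqueeze(0).expand(b, -1, -1, -1)
        return self.conv(torch.cat([x, coords], dim=1))

# Example: drop-in replacement for a 1x1 convolution in a neck or head
layer = CoordConv2d(256, 256, kernel_size=1)
out = layer(torch.randn(2, 256, 40, 40))   # -> (2, 256, 40, 40)
```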
2

He, Rongru, Xiwen Luo, Zhigang Zhang, Wenyu Zhang, Chunyu Jiang, and Bingxuan Yuan. "Identification Method of Rice Seedlings Rows Based on Gaussian Heatmap." Agriculture 12, no. 10 (October 20, 2022): 1736. http://dx.doi.org/10.3390/agriculture12101736.

Abstract:
The identification method of rice seedling rows based on machine vision is affected by environmental factors that decrease the accuracy and the robustness of the rice seedling row identification algorithm (e.g., ambient light transformation, similarity of weed and rice features, and lack of seedlings in rice rows). To solve the problem of the above environmental factors, a Gaussian Heatmap-based method is proposed for rice seedling row identification in this study. The proposed method is a CNN model that comprises the High-Resolution Convolution Module of the feature extraction model and the Gaussian Heatmap of the regression module of key points. The CNN model is guided using Gaussian Heatmap generated by the continuity of rice row growth and the distribution characteristics of rice in rice rows to learn the distribution characteristics of rice seedling rows in the training process, and the positions of the coordinates of the respective key point are accurately returned through the regression module. For the three rice scenarios (including normal scene, missing seedling scene and weed scene), the PCK and average pixel offset of the model were 94.33%, 91.48%, 94.36% and 3.09, 3.13 and 3.05 pixels, respectively, for the proposed method, and the forward inference speed of the model reached 22 FPS, which can meet the real-time requirements and accuracy of agricultural machinery in field management.
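As a minimal illustration of the Gaussian heatmap targets used by such key-point regression modules, the NumPy sketch below renders a per-point Gaussian; the map size and sigma are arbitrary choices, not the paper's settings.

```python
import numpy as np

def gaussian_heatmap(height, width, center_xy, sigma=3.0):
    """Render a 2D Gaussian centred on one key point.

    Illustrative target-generation step for heatmap-based key-point
    regression; sigma and the map size are arbitrary choices here.
    """
    cx, cy = center_xy
    xs = np.arange(width)[None, :]
    ys = np.arange(height)[:, None]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

# A row of seedlings becomes a set of per-point heatmaps (or their maximum)
points = [(20, 40), (60, 38), (100, 36)]
target = np.max([gaussian_heatmap(96, 128, p) for p in points], axis=0)
```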
3

Ballesta, Mónica, Luis Payá, Sergio Cebollada, Oscar Reinoso, and Francisco Murcia. "A CNN Regression Approach to Mobile Robot Localization Using Omnidirectional Images." Applied Sciences 11, no. 16 (August 16, 2021): 7521. http://dx.doi.org/10.3390/app11167521.

Abstract:
Understanding the environment is an essential ability for robots to be autonomous. In this sense, Convolutional Neural Networks (CNNs) can provide holistic descriptors of a scene. These descriptors have proved to be robust in dynamic environments. The aim of this paper is to perform hierarchical localization of a mobile robot in an indoor environment by means of a CNN. Omnidirectional images are used as the input of the CNN. Experiments include a classification study in which the CNN is trained so that the robot is able to find out the room where it is located. Additionally, a transfer learning technique transforms the original CNN into a regression CNN which is able to estimate the coordinates of the position of the robot in a specific room. Regarding classification, the room retrieval task is performed with considerable success. As for the regression stage, when it is performed along with an approach based on splitting rooms, it also provides relatively accurate results.
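Below is a minimal PyTorch sketch of the transfer-learning step described here: the classification head (room labels) is replaced by a two-unit (x, y) regression head trained with a mean-squared error. The backbone is a toy stand-in, not the CNN used in the paper.

```python
import torch
import torch.nn as nn

# Toy backbone standing in for the pretrained classification CNN
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

# Transfer learning: keep the feature extractor, replace the room classifier
# with a 2-unit head that regresses the (x, y) position inside the room.
model = nn.Sequential(backbone, nn.Linear(32, 2))

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.randn(4, 3, 128, 128)   # dummy omnidirectional images
positions = torch.randn(4, 2)          # dummy ground-truth (x, y) in metres
loss = criterion(model(images), positions)
loss.backward()
optimizer.step()
```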
4

Shen, Xiaoyan, Shinan Zhou, and Dongsheng Li. "Microdisplacement Measurement Based on F-P Etalon: Processing Method and Experiments." Sensors 21, no. 11 (May 28, 2021): 3749. http://dx.doi.org/10.3390/s21113749.

Abstract:
Herein, a processing method is proposed for accurate microdisplacement measurements from a 2D Fabry–Perot (F-P) fringe pattern. The core of the processing algorithm uses the F-P interference imaging concentric ring pattern to accurately calculate the centre coordinates of the concentric ring. The influencing factors of measurement were analysed, and the basic idea of data processing was provided. In particular, the coordinate rotation by the 45-degree method (CR) was improved; consequently, the virtual pixel interval was reduced by half, and the calculation accuracy of the circle centre coordinate was improved. Experiments were conducted to analyse the influence of the subdivision and circle fitting methods. The results show that the proposed secondary coordinate rotation (SCR) by 45 degrees method can obtain higher accuracy of the centre coordinate than the CR method, and that the multichord averaging method (MCAM) is more suitable for calculation of the centre coordinate than the circular regression method (CRM). Displacement measurement experiments were performed. The results show that the standard experimental deviation of the centre of the circle is approximately 0.009 µm, and the extended uncertainty of the displacement measurement in the range of 5 mm is approximately 0.03 μm. The data processing method studied in this study can be widely used in the field of F-P interferometry.
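For readers unfamiliar with centre estimation from points sampled on a ring, the sketch below shows a basic algebraic least-squares (Kasa) circle fit in NumPy; it is only a generic illustration, not the paper's CR/SCR or MCAM procedure.

```python
import numpy as np

def fit_circle_center(x, y):
    """Algebraic least-squares (Kasa) circle fit.

    Generic illustration of estimating a centre from points on a fringe;
    the paper's coordinate-rotation and multichord-averaging processing is
    a different, more refined procedure.
    """
    A = np.column_stack([x, y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    c = np.linalg.lstsq(A, b, rcond=None)[0]
    cx, cy = c[0] / 2.0, c[1] / 2.0
    r = np.sqrt(c[2] + cx ** 2 + cy ** 2)
    return cx, cy, r

theta = np.linspace(0, 2 * np.pi, 200)
x = 50.0 + 20.0 * np.cos(theta) + np.random.normal(0, 0.05, theta.size)
y = 30.0 + 20.0 * np.sin(theta) + np.random.normal(0, 0.05, theta.size)
print(fit_circle_center(x, y))   # ~ (50, 30, 20)
```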
5

Ma, Li, Ning Cao, Xiaoliang Feng, and Minghe Mao. "Indoor Positioning Algorithm Based on Maximum Correntropy Unscented Information Filter." ISPRS International Journal of Geo-Information 10, no. 7 (June 28, 2021): 441. http://dx.doi.org/10.3390/ijgi10070441.

Abstract:
In view of the fact that indoor positioning systems are usually affected by non-Gaussian noise in complex indoor environments, this paper tests data from the actual scene, analyzes the distribution characteristics of the noise, and proposes a new indoor positioning algorithm based on the maximum correntropy unscented information filter (MCUIF). The proposed indoor positioning algorithm includes three steps. First, the estimate of the state matrix and the corresponding covariance matrix are predicted through the unscented transformation (UT). Second, the observed information is reconstructed by using a nonlinear regression method on the basis of the maximum correntropy criterion (MCC). Third, the contribution of the information vector is obtained from the non-Gaussian measurement, and the predicted information vector is corrected by this contribution. Finally, the information filtering gain is obtained from the information entropy state matrix and the information entropy measurement matrix to calculate the position coordinates of the unknown nodes. This algorithm enhances the robustness of the MCUIF to non-Gaussian noise in complex indoor environments. The results of the indoor positioning experiments show that MCUIF outperforms traditional methods in state estimation and position location of the unknown nodes.
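The maximum correntropy criterion essentially replaces squared-error weighting with a Gaussian kernel that down-weights outliers. The toy NumPy sketch below applies this idea to a plain linear regression via iteratively reweighted least squares; the kernel bandwidth and model are hypothetical, and the paper embeds the criterion in an unscented information filter rather than this simple estimator.

```python
import numpy as np

def correntropy_weights(residuals, sigma=1.0):
    """Gaussian-kernel weights used under the maximum correntropy criterion;
    large (outlier) residuals receive exponentially small weight."""
    return np.exp(-residuals ** 2 / (2.0 * sigma ** 2))

def mcc_linear_regression(H, z, sigma=1.0, iters=10):
    """Toy MCC estimate of x in z = H x + noise, via iteratively
    reweighted least squares; illustrates the robustness mechanism only."""
    x = np.linalg.lstsq(H, z, rcond=None)[0]
    for _ in range(iters):
        w = correntropy_weights(z - H @ x, sigma)
        W = np.diag(w)
        x = np.linalg.solve(H.T @ W @ H, H.T @ W @ z)
    return x

H = np.random.randn(50, 2)
x_true = np.array([1.0, -2.0])
z = H @ x_true + 0.05 * np.random.randn(50)
z[:5] += 5.0                      # heavy-tailed outliers
print(mcc_linear_regression(H, z))   # ~ (1, -2) despite the outliers
```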
6

Spevakova, S. S., A. G. Spevakov, and I. V. Chernetskaya. "Mathematical Model of Multispectral Data Processing for a Mobile Ecology Monitoring Platform." Proceedings of the Southwest State University. Series: IT Management, Computer Science, Computer Engineering. Medical Equipment Engineering 13, no. 2 (August 3, 2023): 153–69. http://dx.doi.org/10.21869/2223-1536-2023-13-2-153-169.

Abstract:
The purpose of the research is a mathematical justification of the processing of multispectral data in order to detect local environmental pollution zones, with the possibility of classifying the pollutant. Methods. The fundamentals of the applied theory of stochastic systems, based on equations for multidimensional characteristic functions and functionals, are used as the basic mathematical apparatus. When determining a contaminant, a criterion reflecting the properties of objects obeying Lambert's law is used. To solve the object classification problem, approaches based on binary logistic regression are applied. Statistical methods of analysis were used to evaluate the results of the study. Results. The obtained partial mathematical models make it possible to take into account the many factors affecting mobile environmental monitoring platforms operating in automatic mode. They substantiate the possibility of remote analysis of local environmental pollution zones, with the possibility of determining pollutants such as hydrocarbons, phosphate ions, etc., as well as searching for unauthorized dumping sites of construction and household waste. They improve the accuracy by a factor of 1.3 when determining the parameters of selected objects, owing to the processing of data obtained in different spectral ranges. They also reduce the computational complexity of the classification algorithm by a factor of 1.1, by taking into account the volume of input data in a limited spectral range and reducing the resolution of the reference object, without affecting classification accuracy. Conclusion. A mathematical model has been developed for processing data and images obtained in several spectral ranges during the operation of a multispectral device on an autonomous mobile environmental monitoring platform. It makes it possible to identify objects in the device's field of view from the mobile platform and to obtain a detailed image of working-scene objects with spatial reference to the coordinate system used. Its distinctive features are increased accuracy in calculating the coordinates of local pollution zones and increased reliability of object classification based on diffuse reflectivity characteristics in different spectral ranges.
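The classification step mentioned here relies on binary logistic regression. The sketch below fits such a classifier on synthetic stand-in "reflectance" features with scikit-learn; the features, band count, and class structure are invented for illustration and are unrelated to the authors' data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for per-pixel reflectance features in four spectral
# bands; labels mark "pollutant" vs. "background". Purely illustrative.
rng = np.random.default_rng(0)
background = rng.normal(loc=0.3, scale=0.05, size=(200, 4))
pollutant = rng.normal(loc=0.5, scale=0.05, size=(200, 4))
X = np.vstack([background, pollutant])
y = np.concatenate([np.zeros(200), np.ones(200)])

# Binary logistic regression over the band features
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(rng.normal(0.45, 0.05, size=(1, 4))))
```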
7

Wang, Shuzhe, Zakaria Laskar, Iaroslav Melekhov, Xiaotian Li, Yi Zhao, Giorgos Tolias, and Juho Kannala. "HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer." International Journal of Computer Vision, February 6, 2024. http://dx.doi.org/10.1007/s11263-023-01982-9.

Abstract:
Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult for a single network. In this work, we present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image. The proposed method, which is an extension of HSCNet, allows us to train compact models which scale robustly to large environments. It sets a new state-of-the-art for single-image localization on the 7-Scenes, 12-Scenes, Cambridge Landmarks datasets, and the combined indoor scenes.
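The coarse-to-fine idea behind hierarchical scene coordinate prediction can be sketched as a per-pixel region classifier combined with a 3D offset regressor, as in the toy PyTorch module below; the backbone, conditioning mechanism, and transformer components of HSCNet++ are not reproduced, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class TinyHierarchicalSCRHead(nn.Module):
    """Coarse region classification + fine 3D offset regression per pixel.

    A toy sketch of the coarse-to-fine idea only; HSCNet++ itself uses a
    deeper backbone, conditioning layers, and a transformer, none of which
    are reproduced here.
    """
    def __init__(self, feat_dim=64, num_regions=25):
        super().__init__()
        self.cls = nn.Conv2d(feat_dim, num_regions, 1)   # which region?
        self.reg = nn.Conv2d(feat_dim, 3, 1)             # 3D offset inside it
        # Hypothetical per-region 3D centres (would come from clustering the map)
        self.register_buffer("centers", torch.randn(num_regions, 3))

    def forward(self, feats):
        logits = self.cls(feats)                            # (B, R, H, W)
        region = logits.argmax(dim=1)                       # (B, H, W)
        offsets = self.reg(feats)                           # (B, 3, H, W)
        centers = self.centers[region].permute(0, 3, 1, 2)  # (B, 3, H, W)
        return centers + offsets                            # predicted scene coords

head = TinyHierarchicalSCRHead()
coords = head(torch.randn(1, 64, 60, 80))                   # (1, 3, 60, 80)
```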
8

Zhang, Kai, Xiaolin Meng, and Qing Wang. "An End-to-end Learning Framework for Visual Camera Relocalization Using RGB and RGB-D Images." Measurement Science and Technology, May 22, 2024. http://dx.doi.org/10.1088/1361-6501/ad4f02.

Abstract:
Camera relocalization plays a vital role in the realms of machine perception, robotics, and augmented reality. Direct structure-based learning methods learn scene coordinates and use them for camera pose estimation. However, the two-stage learning of scene coordinate regression and camera pose estimation can result in some of the scene coordinate regression knowledge being lost throughout the learning process of the final pose estimation system, thereby reducing the accuracy of the pose estimation. This paper introduces an innovative end-to-end learning framework tailored for visual camera relocalization by employing both RGB and RGB-D images. Distinguished by its integration of scene coordinate regression with pose estimation into a concurrent inner and outer loop during a single training phase, this framework notably enhances pose estimation accuracy. Engineered for flexibility, it accommodates training with or without depth cues and necessitates merely a single RGB image during testing. Empirical evaluation substantiates the proposed method's state-of-the-art precision, attaining an average pose accuracy of 0.019 m and 0.74° on the indoor 7Scenes dataset, together with 0.162 m and 0.30° on the outdoor Cambridge Landmarks dataset.
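Once scene coordinates have been regressed for a set of pixels, a camera pose is conventionally recovered with a PnP-plus-RANSAC solver over the 2D-3D correspondences. The sketch below shows that step with OpenCV on dummy data; it is the standard building block, not the paper's differentiable end-to-end training loop.

```python
import numpy as np
import cv2

# Dummy data standing in for a regressed scene-coordinate map: 3D scene
# coordinates predicted for a set of pixel locations.
object_points = np.random.uniform(-1.0, 1.0, size=(100, 3))
K = np.array([[525.0, 0.0, 320.0],
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])
rvec_gt = np.array([[0.1], [0.2], [0.05]])
tvec_gt = np.array([[0.0], [0.0], [3.0]])
image_points, _ = cv2.projectPoints(object_points, rvec_gt, tvec_gt, K, None)

# Standard robust pose solver over the 2D-3D correspondences
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, None, reprojectionError=3.0)
print(ok, tvec.ravel())   # should recover roughly (0, 0, 3)
```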
9

Izquierdo, Rubén, Álvaro Quintanar, David Fernández Llorca, Iván García Daza, Noelia Hernández, Ignacio Parra, and Miguel Ángel Sotelo. "Vehicle trajectory prediction on highways using bird eye view representations and deep learning." Applied Intelligence, July 20, 2022. http://dx.doi.org/10.1007/s10489-022-03961-y.

Abstract:
This work presents a novel method for predicting vehicle trajectories in highway scenarios using efficient bird’s eye view representations and convolutional neural networks. Vehicle positions, motion histories, road configuration, and vehicle interactions are easily included in the prediction model using basic visual representations. The U-net model has been selected as the prediction kernel to generate future visual representations of the scene using an image-to-image regression approach. A method has been implemented to extract vehicle positions from the generated graphical representations to achieve subpixel resolution. The method has been trained and evaluated using the PREVENTION dataset, an on-board sensor dataset. Different network configurations and scene representations have been evaluated. This study found that U-net with 6 depth levels using a linear terminal layer and a Gaussian representation of the vehicles is the best performing configuration. The use of lane markings was found to produce no improvement in prediction performance. The average prediction error is 0.47 and 0.38 meters and the final prediction error is 0.76 and 0.53 meters for longitudinal and lateral coordinates, respectively, for a predicted trajectory length of 2.0 seconds. The prediction error is up to 50% lower compared to the baseline method.
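Reading a vehicle position out of a predicted image-like output with subpixel resolution can be done with an intensity-weighted centroid around the argmax, as in the NumPy sketch below; the paper's exact extraction procedure may differ in detail.

```python
import numpy as np

def subpixel_peak(heatmap, window=2):
    """Locate the peak of a heatmap with subpixel precision using the
    intensity-weighted centroid of a small window around the argmax.
    Illustrative only; the paper's extraction step may differ."""
    r, c = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    r0, r1 = max(r - window, 0), min(r + window + 1, heatmap.shape[0])
    c0, c1 = max(c - window, 0), min(c + window + 1, heatmap.shape[1])
    patch = heatmap[r0:r1, c0:c1]
    rows, cols = np.mgrid[r0:r1, c0:c1]
    total = patch.sum()
    return (rows * patch).sum() / total, (cols * patch).sum() / total

# A blurred spot whose true centre lies between pixel grid positions
yy, xx = np.mgrid[0:64, 0:64]
hm = np.exp(-((xx - 30.4) ** 2 + (yy - 21.7) ** 2) / (2 * 1.5 ** 2))
print(subpixel_peak(hm))   # ~ (21.7, 30.4)
```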
10

Ameperosa, Ezra, and Pranav A. Bhounsule. "Domain Randomization Using Deep Neural Networks for Estimating Positions of Bolts." Journal of Computing and Information Science in Engineering 20, no. 5 (May 26, 2020). http://dx.doi.org/10.1115/1.4047074.

Abstract:
Current manual practices of replacing bolts on structures are time-consuming and costly, especially because of numerous bolts. Thus, an automated method that can visually detect and localize bolt positions would be highly beneficial. We demonstrate the use of deep neural networks using domain randomization for detecting and localizing bolts on a workpiece. In contrast to previous approaches that require training on real images, the use of domain randomization enables all training in simulation. The key idea is to create a wide variety of computer-generated synthetic images by varying the texture, color, camera position and orientation, distractor objects, and noise, and train the neural network on these images such that the neural network is robust to scene variability and hence provides accurate results when deployed on real images. Using domain randomization, we train two neural networks, a faster regional convolutional neural network for detecting the bolt and placing a bounding box, and a regression convolutional neural network for estimating the x- and y-position of the bolts relative to the coordinates fixed to the workpiece. Our results indicate that in the best case, we can detect bolts with 85% accuracy and can predict 75% of bolts within 1.27 cm accuracy. The novelty of this work is in using domain randomization to detect and localize: (1) multiples of a single object and (2) small-sized objects (0.6 cm × 2.5 cm).
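The gist of domain randomization is to synthesise large numbers of labelled training images with randomized nuisance factors. The toy NumPy generator below varies background colour, distractors, and noise around a "bolt" at a known position; it is a schematic stand-in, not the authors' rendering setup.

```python
import numpy as np

def random_synthetic_image(size=128, rng=None):
    """Toy domain-randomization generator: random background colour, random
    distractor patches, a square 'bolt' at a known (x, y), and pixel noise.
    The label is the bolt position, so position regression can be trained
    purely in simulation. Not the paper's actual rendering pipeline."""
    if rng is None:
        rng = np.random.default_rng()
    img = np.ones((size, size, 3)) * rng.uniform(0.0, 1.0, size=3)   # background
    for _ in range(rng.integers(0, 5)):                              # distractors
        dx, dy = rng.integers(0, size - 8, size=2)
        img[dy:dy + 8, dx:dx + 8] = rng.uniform(0.0, 1.0, size=3)
    x, y = rng.integers(10, size - 10, size=2)                       # bolt position
    img[y - 3:y + 3, x - 3:x + 3] = rng.uniform(0.0, 1.0, size=3)    # the "bolt"
    img += rng.normal(0.0, 0.02, img.shape)                          # sensor noise
    return np.clip(img, 0.0, 1.0), (float(x), float(y))

image, bolt_xy = random_synthetic_image()   # train a position regressor on many of these
```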

Dissertations / Theses on the topic "Scene coordinates regression (SCR)":

1

Martin-Lac, Victor. "Aerial navigation based on SAR imaging and reference geospatial data." Electronic Thesis or Diss., Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2024. http://www.theses.fr/2024IMTA0400.

Abstract:
We seek the algorithmic means of determining the kinematic state of an aerial device from an observed SAR image and reference geospatial data that may be SAR, optical or vector. We determine a transform that relates the observation and reference coordinates and whose parameters are the kinematic state. We follow three approaches. The first is based on detecting and matching structures such as contours. We propose an Iterative Closest Point (ICP) algorithm and demonstrate how it can serve to estimate the full kinematic state. We then propose a complete pipeline that includes a learned multimodal contour detector. The second approach is based on a multimodal similarity metric, which is a means of measuring the likelihood that two local patches of geospatial data represent the same geographic point. We determine the kinematic state under the hypothesis of which the SAR image is most similar to the reference geospatial data. The third approach is based on scene coordinate regression. We predict the geographic coordinates of random image patches and infer the kinematic state from these predicted correspondences. However, in this approach, we do not address the fact that the modality of the observation and the reference are different.
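In the scene coordinate regression approach described above, predicted patch-to-geographic correspondences must be turned into a transform estimate. The NumPy sketch below fits a 2D rigid transform (Kabsch/Procrustes) to dummy correspondences; the thesis estimates a full kinematic state with robust matching, which this sketch omits.

```python
import numpy as np

def fit_rigid_2d(src, dst):
    """Least-squares 2D rotation + translation mapping src -> dst (Kabsch).
    A minimal stand-in for turning predicted patch/geographic correspondences
    into a transform; robust estimation and the full kinematic-state model
    of the thesis are omitted."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

# Dummy correspondences: observation-image coordinates vs. predicted geographic ones
src = np.random.uniform(0, 100, size=(30, 2))
angle = np.deg2rad(12.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
dst = src @ R_true.T + np.array([250.0, -40.0])
R, t = fit_rigid_2d(src, dst)           # recovers the 12-degree rotation and offset
```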

Conference papers on the topic "Scene coordinates regression (SCR)":

1

Cai, Ming, Huangying Zhan, Chamara Saroj Weerasekera, Kejie Li, and Ian Reid. "Camera Relocalization by Exploiting Multi-View Constraints for Scene Coordinates Regression." In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). IEEE, 2019. http://dx.doi.org/10.1109/iccvw.2019.00469.

2

Freitas, Rafael, Thiago Paixão, Rodrigo Berriel, Alberto Souza, Claudine Badue, and Thiago Santos. "Relevant Traffic Light Localization via Deep Regression." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/eniac.2019.9306.

Abstract:
Advances in artificial intelligence play an important role in the development of self-driving cars, for instance by assisting the recognition of traffic lights. However, when relying on images of the scene alone, little progress has been made on selecting the traffic light that actually guides the car. Common detection approaches rely on an additional high-level decision-making process to select a relevant traffic light. This work addresses the problem by proposing a deep regression system with an outlier-resilient loss to predict the coordinates of the relevant traffic light in the image plane. The prediction can be used as a high-level decision-maker or as an assistant to a cheaper classifier working on a region of interest. Results for European scenes show success in about 88% of the cases.
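A common way to make coordinate regression resilient to outliers is a Huber-type loss, which is quadratic for small errors and linear for large ones. The PyTorch sketch below uses SmoothL1Loss on a hypothetical (u, v) regression head; the paper's specific loss may differ from this one.

```python
import torch
import torch.nn as nn

# Toy head regressing the (u, v) pixel coordinates of the relevant traffic
# light from an image feature vector; names and sizes are hypothetical.
feature_dim = 512
head = nn.Linear(feature_dim, 2)

# Smooth-L1 (Huber-type) loss: quadratic near zero, linear for large errors,
# so occasional bad targets pull the fit less strongly than MSE would.
criterion = nn.SmoothL1Loss()

features = torch.randn(8, feature_dim)    # dummy backbone outputs
target_uv = torch.rand(8, 2) * 640.0      # dummy ground-truth pixel coordinates
loss = criterion(head(features), target_uv)
loss.backward()
```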
3

Ameperosa, Ezra, and Pranav A. Bhounsule. "Domain Randomization for Detection and Position Estimation of Multiples of a Single Object With Applications to Localizing Bolts on Structures." In ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/detc2019-97393.

Abstract:
Fasteners such as bolts are an integral part of many structures (e.g., airplanes, cars, ships) and require periodic maintenance that may involve either their tightening or replacement. Current manual practices are time-consuming and costly, especially due to the large number of bolts. Thus, an automated method that is able to visually detect and localize bolt positions would be highly beneficial. In this paper, we demonstrate the use of a deep neural network using domain randomization for detecting and localizing multiple bolts on a workpiece. In contrast to previous deep learning approaches that require training on real images, the use of domain randomization allows for all training to be done in simulation. The key idea here is to create a wide variety of computer-generated synthetic images by varying the texture, color, camera position and orientation, distractor objects, and noise, and train the neural network on these images such that the neural network is robust to scene variability and hence provides accurate results when deployed on real images. Using domain randomization, we train two neural networks, a faster regional convolutional neural network for detecting the bolt and predicting a bounding box, and a regression convolutional neural network for estimating the x- and y-position of the bolt relative to the coordinates fixed to the workpiece. Our results indicate that in the best case we are able to detect bolts with 85% accuracy and are able to predict the position of 75% of bolts within 1.27 cm. The novelty of this work is in the use of domain randomization to detect and localize: (1) multiples of a single object, and (2) small-sized objects (0.6 cm × 2.5 cm).
