Journal articles on the topic "Estimation de poses humaines"

To see other types of publications on this topic, follow the link: Estimation de poses humaines.

Format your source in APA, MLA, Chicago, Harvard, and other citation styles.

Consult the top 50 journal articles for your research on the topic "Estimation de poses humaines".

Next to every work in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read its abstract online, provided the relevant details are available in the publication's metadata.

Browse journal articles across a wide range of disciplines and organise your bibliography correctly.

1

R, Jayasri. "HUMAN POSE ESTIMATION." International Scientific Journal of Engineering and Management 03, no. 03 (March 23, 2024): 1–9. http://dx.doi.org/10.55041/isjem01426.

Abstract:
In "Human Pose Estimation" with integrated feedback mechanisms to assess and guide users in achieving correct poses. Utilizing advanced deep learning techniques in computer vision, the system swiftly detects key points on the human body and provides instant feedback on pose accuracy. Built on convolutional neural networks trained on extensive pose datasets, the system includes pose detection, classification, and feedback stages. By comparing detected poses with predefined correct poses, the system delivers positive feedback for accurate poses and corrective guidance for deviations. Key Words: Human Pose Estimation, Pose Detection, Pose Classification, Correct Pose Assessment, Fitness Training, Key Points Detection, Correct Pose Thresholds
2

Lv Yao-wen, 吕耀文, 王建立 WANG Jian-li, 王昊京 WANG Hao-jing, 刘维 LIU Wei, 吴量 WU Liang, and 曹景太 CAO Jing-tai. "Estimation of camera poses by parabolic motion." Optics and Precision Engineering 22, no. 4 (2014): 1078–85. http://dx.doi.org/10.3788/ope.20142204.1078.

3

Shalimova, E. A., E. V. Shalnov, and A. S. Konushin. "Camera parameters estimation from pose detections." Computer Optics 44, no. 3 (June 2020): 385–92. http://dx.doi.org/10.18287/2412-6179-co-600.

Abstract:
Some computer vision tasks become easier with known camera calibration. We propose a method for estimating camera focal length, location, and orientation by observing human poses in the scene. Weak requirements on the observed scene make the method applicable to a wide range of scenarios. Our evaluation shows that, even when trained only on a synthetic dataset, the proposed method outperforms known solutions. Our experiments also show that using only human poses as the input allows the proposed method to calibrate dynamic visual sensors.
4

Mahajan, Priyanshu, Shambhavi Gupta, and Divya Kheraj Bhanushali. "Body Pose Estimation using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 11, no. 3 (March 31, 2023): 1419–24. http://dx.doi.org/10.22214/ijraset.2023.49688.

Abstract:
Healthcare, sports analysis, gaming, and entertainment are just some of the many fields that could benefit from solving the challenging problem of real-time human pose detection and recognition in computer vision. Capturing human motion, analysing physical exercise, and giving feedback on performance can all benefit from reliable detection and recognition of body poses. Recent progress in deep learning has made it possible to create real-time systems that can accurately and quickly recognise and identify human poses.
5

Aju, Abin, Christa Mathew, and O. S. Gnana Prakasi. "PoseNet based Model for Estimation of Karate Poses." Journal of Innovative Image Processing 4, no. 1 (May 16, 2022): 16–25. http://dx.doi.org/10.36548/jiip.2022.1.002.

Abstract:
In the domain of computer vision, human pose estimation is becoming increasingly significant. It is one of the most compelling areas of research, and it is gaining a lot of interest due to its usefulness and flexibility in a variety of fields, including healthcare, gaming, augmented reality, virtual training, and sports. Human pose estimation has opened a door of opportunities. This paper proposes a model for the estimation and classification of karate poses which can be used in virtual karate posture correction and training. A pretrained model, PoseNet, is used for pose estimation; from its output, the angles between specific joints are calculated and fed into a K-Nearest Neighbors classifier to classify the poses. The results obtained show that the model achieves an accuracy of 98.75%.
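Editorial note: the angle-based feature pipeline described in this abstract is simple enough to sketch. The following is a minimal illustration, not the authors' code; the keypoint triplets, the PoseNet keypoint ordering, and k=5 are assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def joint_angle(a, b, c):
    """Angle at keypoint b (degrees) formed by the segments b->a and b->c."""
    v1 = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    v2 = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Assumed triplets (shoulder-elbow-wrist, hip-knee-ankle, ...) in PoseNet's
# 17-keypoint indexing; the paper does not list the exact joints it uses.
ANGLE_TRIPLETS = [(5, 7, 9), (6, 8, 10), (11, 13, 15), (12, 14, 16)]

def pose_to_features(keypoints):
    """keypoints: (17, 2) array of (x, y) locations from the pose estimator."""
    return [joint_angle(keypoints[i], keypoints[j], keypoints[k])
            for i, j, k in ANGLE_TRIPLETS]

# Training on an annotated karate-pose dataset (hypothetical variables):
# clf = KNeighborsClassifier(n_neighbors=5).fit(
#     [pose_to_features(p) for p in train_poses], train_labels)
# label = clf.predict([pose_to_features(test_pose)])[0]
```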
6

Astuti, Ani Dwi, Tita Karlita, and Rengga Asmara. "Yoga Pose Rating using Pose Estimation and Cosine Similarity." Jurnal Ilmu Komputer dan Informasi 16, no. 2 (July 3, 2023): 115–24. http://dx.doi.org/10.21609/jiki.v16i2.1151.

Abstract:
One type of exercise that many people do today is yoga. However, doing yoga yourself without an instructor carries a risk of injury if not done correctly. This research proposes an application in the form of a website that can assess the accuracy of a person's yoga position, by using ResNet for pose estimation and cosine similarity for calculating the similarity of positions. The application will recognize a person's body pose and then compare it with the poses of professionals so that the accuracy of their position can be assessed. There are three types of datasets used, the first is the COCO dataset to train a pose estimation model so that it can recognize someone's pose, the second is a reference dataset that contains yoga poses performed by professionals, and the third is a dataset that contains pictures of yoga poses that are considered correct. There are 9 yoga poses used, namely Child's Pose, Swimmers, Downdog, Chair Pose, Crescent Lunge, Planks, Side Plank, Low Cobra, Namaste. The optimal pose estimation model has a precision value of 87% and a recall of 88.2%. The model was obtained using the Adam optimizer, 30 epochs, and a learning rate of 0.0001.
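Editorial note: the cosine-similarity scoring step lends itself to a short sketch. Below is a minimal, hedged illustration; the normalization scheme (centering and scaling the keypoint vector) is an assumption, since the abstract does not specify one.

```python
import numpy as np

def pose_similarity(user_kpts, reference_kpts):
    """Cosine similarity between two poses given as (N, 2) keypoint arrays."""
    def normalize(kpts):
        k = np.asarray(kpts, dtype=float)
        k = k - k.mean(axis=0)                    # remove translation
        return k / (np.linalg.norm(k) + 1e-8)     # remove scale
    u = normalize(user_kpts).ravel()
    r = normalize(reference_kpts).ravel()
    return float(u @ r)                           # 1.0 = identical configuration

# Rating a user's pose against a professional's reference pose:
# score = max(0.0, pose_similarity(user_pose, reference_pose)) * 100
```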
7

Jagtap, Aniket. "Yoga Guide: Yoga Pose Estimation Using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 12, no. 2 (February 29, 2024): 296–97. http://dx.doi.org/10.22214/ijraset.2024.58272.

Abstract:
A deep learning model is proposed that uses a convolutional neural network/LR algorithm for yoga pose identification, along with a human joint localization model, followed by a process for identifying errors in the pose. After obtaining all the information about the user's pose, the system gives feedback to improve or correct the user's posture. We propose an improved algorithm to calculate scores that can be applied to all poses. Our application is evaluated on different yoga poses under different scenes, and its robustness is guaranteed.
8

Sun, Jun, Mantao Wang, Xin Zhao, and Dejun Zhang. "Multi-View Pose Generator Based on Deep Learning for Monocular 3D Human Pose Estimation." Symmetry 12, no. 7 (July 4, 2020): 1116. http://dx.doi.org/10.3390/sym12071116.

Abstract:
In this paper, we study the problem of monocular 3D human pose estimation based on deep learning. Due to single-view limitations, monocular human pose estimation cannot avoid the inherent occlusion problem. Common methods use multi-view 3D pose estimation to solve this problem. However, single-view images cannot be used directly in multi-view methods, which greatly limits practical applications. To address the above-mentioned issues, we propose a novel end-to-end 3D pose estimation network for monocular 3D human pose estimation. First, we propose a multi-view pose generator to predict multi-view 2D poses from the 2D poses in a single view. Secondly, we propose a simple but effective data augmentation method for generating multi-view 2D pose annotations, because existing datasets (e.g., Human3.6M) do not contain a large number of 2D pose annotations in different views. Thirdly, we employ a graph convolutional network to infer a 3D pose from multi-view 2D poses. Experiments conducted on public datasets verify the effectiveness of our method. Furthermore, the ablation studies show that our method improves the performance of existing 3D pose estimation networks.
9

Su, Jianhua, Zhi-Yong Liu, Hong Qiao, and Chuankai Liu. "Pose-estimation and reorientation of pistons for robotic bin-picking." Industrial Robot: An International Journal 43, no. 1 (January 18, 2016): 22–32. http://dx.doi.org/10.1108/ir-06-2015-0129.

Abstract:
Purpose – Picking up pistons in arbitrary poses is an important step on a car engine assembly line. A vision system is usually used to estimate the pose of a piston and then guide a stable grasp. However, a piston in some poses, e.g. with its mouth facing forward, can hardly be grasped directly by the gripper. Thus, the piston needs to be reoriented to a desired pose, i.e. with its mouth facing upward, for grasping. Design/methodology/approach – This paper presents a vision-based picking system that can grasp pistons in arbitrary poses. The whole picking process is divided into two stages. At the localization stage, a hierarchical approach is proposed to estimate the piston's pose from images that usually involve both heavy noise and edge distortions. At the grasping stage, multi-step robotic manipulations are designed to enable the piston to follow a nominal trajectory to reach the minimum of the distance between the piston's center and the support plane. That is, under the designed input, the piston is pushed to achieve a desired orientation. Findings – A target piston in an arbitrary pose can be picked from the conveyor belt by the gripper with the proposed method. Practical implications – The designed vision-based robotic bin-picking system offers an advantage in terms of flexibility in the automobile manufacturing industry. Originality/value – The authors develop a methodology that uses a pneumatic gripper and 2D vision information for picking up multiple pistons in arbitrary poses. The rough poses of the parts are detected based on a hierarchical approach for the detection of multiple ellipses in environments that usually involve edge distortions. The pose uncertainties of the piston are eliminated by multi-step robotic manipulations.
10

Fujita, Kohei, and Tsuyoshi Tasaki. "PYNet: Poseclass and Yaw Angle Output Network for Object Pose Estimation." Journal of Robotics and Mechatronics 35, no. 1 (February 20, 2023): 8–17. http://dx.doi.org/10.20965/jrm.2023.p0008.

Abstract:
The issue of estimating the poses of simple-shaped objects, such as retail store goods, has been addressed to ease the grasping of objects by robots. Conventional methods that estimate poses with an RGBD camera mounted on a robot have difficulty estimating the three-dimensional poses of simple-shaped objects with few shape features. Therefore, in this study, we propose a new class called "poseclass" to indicate the grounding face of an object. The poseclass is a discrete value and solvable as a classification problem; it can be estimated with high accuracy; in addition, the three-dimensional pose estimation problem can be simplified into a one-dimensional pose estimation problem of estimating the yaw angle on the grounding face. We have developed a new neural network (PYNet) to estimate the poseclass and yaw angle, and compared it with conventional methods in terms of the ratio of unknown simple-shaped object poses estimated with an angle error of 30° or less. The ratio of PYNet (68.9%) is 18.1 pt higher than that of the conventional methods (50.8%). Additionally, a PYNet-implemented robot successfully grasped convenience store goods.
11

Tang, Danhang, Hyung Jin Chang, Alykhan Tejani, and Tae-Kyun Kim. "Latent Regression Forest: Structured Estimation of 3D Hand Poses." IEEE Transactions on Pattern Analysis and Machine Intelligence 39, no. 7 (July 1, 2017): 1374–87. http://dx.doi.org/10.1109/tpami.2016.2599170.

12

Cheng, Yu, Bo Wang, Bo Yang, and Robby T. Tan. "Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (May 18, 2021): 1157–65. http://dx.doi.org/10.1609/aaai.v35i2.16202.

Abstract:
Despite the recent progress, 3D multi-person pose estimation from monocular videos is still challenging due to the commonly encountered problem of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses that does not require camera parameters. In particular, we introduce a human-joint GCN, which unlike the existing GCN, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results. We also introduce a human-bone GCN, which models the bone connections and provides more information beyond human joints. The two GCNs work together to estimate the spatial frame-wise 3D poses and can make use of both visible joint and bone information in the target frame to estimate the occluded or missing human-part information. To further refine the 3D pose estimation, we use our temporal convolutional networks (TCNs) to enforce the temporal and human-dynamics constraints. We use a joint-TCN to estimate person-centric 3D poses across frames, and propose a velocity-TCN to estimate the speed of 3D joints to ensure the consistency of the 3D pose estimation in consecutive frames. Finally, to estimate the 3D human poses for multiple persons, we propose a root-TCN that estimates camera-centric 3D poses without requiring camera parameters. Quantitative and qualitative evaluations demonstrate the effectiveness of the proposed method.
13

Li, Jiaman, C. Karen Liu, and Jiajun Wu. "Ego-Body Pose Estimation via Ego-Head Pose Estimation." AI Matters 9, no. 2 (June 2023): 20–23. http://dx.doi.org/10.1145/3609468.3609473.

Abstract:
Estimating 3D human motion from an egocentric video, which records the environment viewed from the first-person perspective with a front-facing monocular camera, is critical to applications in VR/AR. However, naively learning a mapping between egocentric videos and full-body human motions is challenging for two reasons. First, modeling this complex relationship is difficult; unlike reconstructing motion from third-person videos, the human body is often out of view of an egocentric video. Second, learning this mapping requires a large-scale, diverse dataset containing paired egocentric videos and the corresponding 3D human poses. Creating such a dataset requires meticulous instrumentation for data acquisition, and unfortunately, such a dataset does not currently exist. As such, existing works have only worked on small-scale datasets with limited motion and scene diversity (yuan20183d; yuan2019ego; luo2021dynamics).
14

Li, Haolun, and Chi-Man Pun. "CEE-Net: Complementary End-to-End Network for 3D Human Pose Generation and Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (June 26, 2023): 1305–13. http://dx.doi.org/10.1609/aaai.v37i1.25214.

Abstract:
The limited number of actors and actions in existing datasets make 3D pose estimators tend to overfit, which can be seen from the performance degradation of the algorithm on cross-datasets, especially for rare and complex poses. Although previous data augmentation works have increased the diversity of the training set, the changes in camera viewpoint and position play a dominant role in improving the accuracy of the estimator, while the generated 3D poses are limited and still heavily rely on the source dataset. In addition, these works do not consider the adaptability of the pose estimator to generated data, and complex poses will cause training collapse. In this paper, we propose the CEE-Net, a Complementary End-to-End Network for 3D human pose generation and estimation. The generator extremely expands the distribution of each joint-angle in the existing dataset and limits them to a reasonable range. By learning the correlations within and between the torso and limbs, the estimator can combine different body-parts more effectively and weaken the influence of specific joint-angle changes on the global pose, improving the generalization ability. Extensive ablation studies show that our pose generator greatly strengthens the joint-angle distribution, and our pose estimator can utilize these poses positively. Compared with the state-of-the-art methods, our method can achieve much better performance on various cross-datasets, rare and complex poses.
15

Zhang, Maomao, Ao Li, Honglei Liu, and Minghui Wang. "Coarse-to-Fine Hand–Object Pose Estimation with Interaction-Aware Graph Convolutional Network." Sensors 21, no. 23 (December 3, 2021): 8092. http://dx.doi.org/10.3390/s21238092.

Abstract:
The analysis of hand–object poses from RGB images is important for understanding and imitating human behavior and acts as a key factor in various applications. In this paper, we propose a novel coarse-to-fine two-stage framework for hand–object pose estimation, which explicitly models hand–object relations in 3D pose refinement rather than in the process of converting 2D poses to 3D poses. Specifically, in the coarse stage, 2D heatmaps of hand and object keypoints are obtained from RGB image and subsequently fed into pose regressor to derive coarse 3D poses. As for the fine stage, an interaction-aware graph convolutional network called InterGCN is introduced to perform pose refinement by fully leveraging the hand–object relations in 3D context. One major challenge in 3D pose refinement lies in the fact that relations between hand and object change dynamically according to different HOI scenarios. In response to this issue, we leverage both general and interaction-specific relation graphs to significantly enhance the capacity of the network to cover variations of HOI scenarios for successful 3D pose refinement. Extensive experiments demonstrate state-of-the-art performance of our approach on benchmark hand–object datasets.
16

Aoki, Koki, Tomoya Sato, Eijiro Takeuchi, Yoshiki Ninomiya, and Junichi Meguro. "Error Covariance Estimation of 3D Point Cloud Registration Considering Surrounding Environment." Journal of Robotics and Mechatronics 35, no. 2 (April 20, 2023): 435–44. http://dx.doi.org/10.20965/jrm.2023.p0435.

Abstract:
To realize autonomous vehicle safety, it is important to accurately estimate the vehicle's pose. As one of the localization techniques, 3D point cloud registration is commonly used. However, pose errors are likely to occur when there are few features in the surrounding environment. Although many studies have been conducted on estimating the error distribution of 3D point cloud registration, the real environment is not reflected. This paper presents real-time error covariance estimation in 3D point cloud registration according to the surrounding environment. The proposed method provides multiple initial poses for the iterative optimization in the registration method. Using the converged poses from the multiple searches, an error covariance reflecting the real environment is obtained. The initial poses are limited to the directions in which pose errors are likely to occur, so the limited search efficiently finds the local optima of the registration. The process runs within 10 Hz, the laser imaging detection and ranging (LiDAR) period; however, the execution time exceeded 100 ms in some places, so further improvement is necessary.
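Editorial note: the core idea, deriving a covariance from the spread of registration results started from perturbed initial poses, can be sketched in a few lines. This is a generic illustration under an assumed [x, y, yaw] parameterization, not the paper's implementation.

```python
import numpy as np

def registration_covariance(converged_poses):
    """Sample covariance of converged registration results.

    converged_poses: (K, 3) array of [x, y, yaw] estimates, each produced by
    re-running scan registration from a different perturbed initial pose.
    """
    poses = np.asarray(converged_poses, dtype=float)
    centered = poses - poses.mean(axis=0)
    return centered.T @ centered / (len(poses) - 1)

# Perturb the initial guess along the error-prone direction (e.g., the road
# axis in a feature-poor corridor), re-run the registration from each guess,
# and estimate the covariance from the converged results:
# cov = registration_covariance(np.stack(results))
```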
17

Chen, Ning, Shaopeng Wu, Yupeng Chen, Zhanghua Wang, and Ziqian Zhang. "A Pose Estimation Algorithm for Multimodal Data Fusion." Traitement du Signal 39, no. 6 (December 31, 2022): 1971–79. http://dx.doi.org/10.18280/ts.390609.

Abstract:
In response to the problem that previous pose detection systems are not effective under conditions such as severe occlusion or uneven illumination, this paper focuses on multimodal information fusion for pose estimation. The main contribution is a multimodal data fusion pose estimation algorithm for pose estimation in complex scenes with low-texture targets and poor lighting conditions. The network takes images and point clouds as input and extracts local color and spatial features of the target object using improved DenseNet and PointNet++ networks, which are combined with a differentiable pose iteration network to achieve end-to-end pose estimation. Excellent detection accuracy for pose estimation was obtained on the two benchmark datasets LineMOD (97.8%) and YCB-Video (95.3%). The algorithm is able to obtain accurate poses of target objects in complex scenes, providing accurate, real-time and robust relative poses for object tracking in motion and wave compensation.
18

Guo, Fangtai, Zaixing He, Shuyou Zhang, and Xinyue Zhao. "Estimation of 3D human hand poses with structured pose prior." IET Computer Vision 13, no. 8 (December 2019): 683–90. http://dx.doi.org/10.1049/iet-cvi.2018.5480.

19

Niwaya, Haruo, Haruki Imaoka, and Atsuo Shibuya. "Estimation of the Distribution of Garment Pressure for Several Poses." Sen'i Gakkaishi 52, no. 5 (1996): 248–52. http://dx.doi.org/10.2115/fiber.52.5_248.

20

Favorskaya, M., D. Novikov, and Y. Savitskaya. "HUMAN ACTION POSELETS ESTIMATION VIA COLOR G-SURF IN STILL IMAGES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XL-5/W6 (May 18, 2015): 51–58. http://dx.doi.org/10.5194/isprsarchives-xl-5-w6-51-2015.

Abstract:
Human activity has been a persistent subject of interest over the last decade. On the one hand, video sequences provide a huge volume of motion information for recognizing active human actions. On the other hand, spatial information about static human poses is valuable for human action recognition. Poselets were introduced as latent variables representing a configuration of mutual locations of body parts and allowing different views of description. In the current research, some modifications of Speeded-Up Robust Features (SURF) invariant to affine geometric transforms and illumination changes were tested. First, a grid of rectangles is imposed on the object of interest in a still image. Second, a sparse descriptor based on Gauge-SURF (G-SURF), invariant to color/lighting changes, is constructed for each rectangle separately. A common Spatial POselet Descriptor (SPOD) aggregates the SPODs of the rectangles, followed by random forest classification to obtain fast classification results. The proposed approach was tested on samples from the PASCAL Visual Object Classes (VOC) Dataset and Challenge 2010, providing an accuracy of 61-68% for all possible 3D pose locations and 82-86% for frontal pose locations across nine action categories.
21

Qian, Junpeng, Xiaogang Cheng, Bin Yang, Zhe Li, Junchi Ren, Thomas Olofsson, and Haibo Li. "Vision-Based Contactless Pose Estimation for Human Thermal Discomfort." Atmosphere 11, no. 4 (April 12, 2020): 376. http://dx.doi.org/10.3390/atmos11040376.

Abstract:
Real-time and effective human thermal discomfort detection plays a critical role in achieving energy efficient control of human centered intelligent buildings because estimation results can provide effective feedback signals to heating, ventilation and air conditioning (HVAC) systems. How to detect occupant thermal discomfort is a challenge. Unfortunately, contact or semi-contact perception methods are inconvenient in practical application. From the contactless perspective, a kind of vision-based contactless human discomfort pose estimation method was proposed in this paper. Firstly, human pose data were captured from a vision-based sensor, and corresponding human skeleton information was extracted. Five thermal discomfort-related human poses were analyzed, and corresponding algorithms were constructed. To verify the effectiveness of the algorithms, 16 subjects were invited for physiological experiments. The validation results show that the proposed algorithms can recognize the five human poses of thermal discomfort.
22

Kim, Jong-Wook, Jin-Young Choi, Eun-Ju Ha, and Jae-Ho Choi. "Human Pose Estimation Using MediaPipe Pose and Optimization Method Based on a Humanoid Model." Applied Sciences 13, no. 4 (February 20, 2023): 2700. http://dx.doi.org/10.3390/app13042700.

Abstract:
Seniors who live alone at home are at risk of falling and injuring themselves and, thus, may need a mobile robot that monitors and recognizes their poses automatically. Even though deep learning methods are actively evolving in this area, they have limitations in estimating poses that are absent or rare in training datasets. For a lightweight approach, an off-the-shelf 2D pose estimation method, a more sophisticated humanoid model, and a fast optimization method are combined to estimate joint angles for 3D pose estimation. As a novel idea, the depth ambiguity problem of 3D pose estimation is solved by adding a loss function penalizing deviation of the center of mass from the center of the supporting feet, together with penalty functions enforcing appropriate joint-angle rotation ranges. To verify the proposed pose estimation method, six daily poses were estimated with a mean joint coordinate difference of 0.097 m and an average angle difference per joint of 10.017 degrees. In addition, to confirm practicality, videos of exercise activities and a scene of a person falling were filmed, and the joint angle trajectories were produced as the 3D estimation results. The optimized execution time per frame was measured at 0.033 s on a single-board computer (SBC) without GPU, showing the feasibility of the proposed method as a real-time system.
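Editorial note: the depth-ambiguity terms described above, a center-of-mass deviation loss plus joint-range penalties, reduce to a few lines of arithmetic. The sketch below uses illustrative names, shapes, and weights; it is not the authors' implementation.

```python
import numpy as np

def com_deviation_loss(joint_positions, mass_fractions, support_center):
    """Squared horizontal deviation of the center of mass from the feet.

    joint_positions: (J, 3) segment coordinates of the humanoid model;
    mass_fractions: (J,) per-segment mass fractions summing to 1;
    support_center: (2,) x-y midpoint of the supporting feet.
    """
    m = np.asarray(mass_fractions, dtype=float)
    com = (m[:, None] * np.asarray(joint_positions, dtype=float)).sum(axis=0)
    return float(np.sum((com[:2] - np.asarray(support_center)) ** 2))

def joint_range_penalty(angles, lower, upper, weight=10.0):
    """Quadratic penalty for joint angles outside an anatomical range."""
    a = np.asarray(angles, dtype=float)
    below = np.minimum(a - lower, 0.0)
    above = np.maximum(a - upper, 0.0)
    return weight * float(np.sum(below ** 2) + np.sum(above ** 2))

# The optimizer would minimize something like (reprojection_error and fk,
# the forward-kinematics map, are hypothetical here):
# total = reprojection_error(angles) \
#         + com_deviation_loss(fk(angles), masses, feet_center) \
#         + joint_range_penalty(angles, lo_limits, hi_limits)
```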
23

Zhao, Wenhui, Bin Xu, and Xinzhong Wu. "Robot grasping system based on deep learning target detection." Journal of Physics: Conference Series 2450, no. 1 (March 1, 2023): 012071. http://dx.doi.org/10.1088/1742-6596/2450/1/012071.

Abstract:
The traditional robot grasping system often uses fixed-point grasping or demonstrative grasping, but with the increasing diversity of grasping targets and the randomness of their poses, the traditional grasping method is no longer sufficient. A robot grasping method based on deep learning target detection is proposed to address the high error rate of target recognition and the low success rate of grasping in the robot grasping process. The method investigates robotic-arm hand-eye calibration and a deep learning-based target detection and pose estimation algorithm. A Basler camera is used as the visual perception tool of the robot arm, an AUBO i10 robot arm is used as the main body of the experiment, and the PP-YOLO deep learning algorithm performs target detection and pose estimation on the object. Through the collection of experimental data, several grasping experiments were conducted on diverse targets randomly placed in various poses in real scenes. The results showed that the success rate of grasping target detection was 94.93% and the robot grasping success rate was 93.37%.
24

McCall, Sheldon, Liyun Gong, Afreen Naz, Syed Waqar Ahmed, Wing On Tam, and Miao Yu. "A Novel Mobile Vision Based Technique for 3D Human Pose Estimation." European Journal of Electrical Engineering and Computer Science 7, no. 6 (December 26, 2023): 82–87. http://dx.doi.org/10.24018/ejece.2023.7.6.573.

Abstract:
In this work, we propose a novel technique for accurately reconstructing 3D human poses from mobile phone camera recordings. From the recorded video frames, a Mask R-CNN network is first applied to detect the human body and extract 2D body skeletons. Based on the 2D skeletons, a temporal convolutional network (TCN) is then applied to lift the 2D skeletons to 3D ones for 3D human pose estimation. The experimental evaluations show that the proposed technique accurately reconstructs 3D human poses from mobile phone camera recordings, with results very close to those of a specialized motion capture system.
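Editorial note: as described, the pipeline is a straightforward two-stage composition. The sketch below shows the orchestration only; detect_2d (standing in for the Mask R-CNN keypoint detector) and lift_tcn (the temporal convolutional lifter) are hypothetical callables, and the window size is an assumption.

```python
def lift_video_to_3d(frames, detect_2d, lift_tcn, window=27):
    """Per-frame 2D skeleton detection followed by temporal lifting to 3D.

    frames: iterable of video frames; detect_2d(frame) -> (17, 2) skeleton;
    lift_tcn(list_of_2d_skeletons) -> (17, 3) pose for the center frame.
    """
    skeletons_2d = [detect_2d(frame) for frame in frames]
    poses_3d = []
    for t in range(len(skeletons_2d)):
        lo = max(0, t - window // 2)
        hi = min(len(skeletons_2d), t + window // 2 + 1)
        poses_3d.append(lift_tcn(skeletons_2d[lo:hi]))   # temporal context
    return poses_3d
```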
25

Šajina, Romeo, and Marina Ivašić-Kos. "3D Pose Estimation and Tracking in Handball Actions Using a Monocular Camera." Journal of Imaging 8, no. 11 (November 10, 2022): 308. http://dx.doi.org/10.3390/jimaging8110308.

Abstract:
Player pose estimation is particularly important for sports because it provides more accurate monitoring of athlete movements and performance, recognition of player actions, analysis of techniques, and evaluation of action execution accuracy. All of these tasks are extremely demanding and challenging in sports that involve rapid movements of athletes with inconsistent speed and position changes, at varying distances from the camera with frequent occlusions, especially in team sports when there are more players on the field. A prerequisite for recognizing a player's actions in video footage and comparing their poses during the execution of an action is the detection of the player's pose in each element of an action or technique. First, a 2D pose of the player is determined in each video frame and converted into a 3D pose; then, using the tracking method, all the player poses are grouped into a sequence to construct a series of elements of a particular action. Considering that action recognition and comparison depend significantly on the accuracy of the methods used to estimate and track player pose in real-world conditions, the paper provides an overview and analysis of the methods that can be used for player pose estimation and tracking using a monocular camera, along with evaluation metrics on the example of handball scenarios. We have evaluated the applicability and robustness of 12 selected 2-stage deep learning methods for 3D pose estimation on a public dataset and a custom dataset of handball jump shots on which they had not been trained and where never-before-seen poses may occur. Furthermore, this paper proposes methods for retargeting and smoothing the 3D sequence of poses that have experimentally shown a performance improvement for all tested models. Additionally, we evaluated the applicability and robustness of five state-of-the-art tracking methods on a public dataset and a custom dataset of handball training recorded with a monocular camera. The paper ends with a discussion highlighting the shortcomings of the pose estimation and tracking methods, reflected in the problems of locating key skeletal points and generating poses that do not follow possible human structures, which consequently reduces the overall accuracy of action recognition.
26

Hu, Xiaoling, and Chang Liu. "Animal Pose Estimation Based on Contrastive Learning with Dynamic Conditional Prompts." Animals 14, no. 12 (June 7, 2024): 1712. http://dx.doi.org/10.3390/ani14121712.

Abstract:
Traditional animal pose estimation techniques based on images face significant hurdles, including scarce training data, costly data annotation, and challenges posed by non-rigid deformation. Addressing these issues, we proposed dynamic conditional prompts for the prior knowledge of animal poses in language modalities. Then, we utilized a multimodal (language–image) collaborative training and contrastive learning model to estimate animal poses. Our method leverages text prompt templates and image feature conditional tokens to construct dynamic conditional prompts that integrate rich linguistic prior knowledge in depth. The text prompts highlight key points and relevant descriptions of animal poses, enhancing their representation in the learning process. Meanwhile, transformed via a fully connected non-linear network, image feature conditional tokens efficiently embed the image features into these prompts. The resultant context vector, derived from the fusion of the text prompt template and the image feature conditional token, generates a dynamic conditional prompt for each input sample. By utilizing a contrastive language–image pre-training model, our approach effectively synchronizes and strengthens the training interactions between image and text features, resulting in an improvement to the precision of key-point localization and overall animal pose estimation accuracy. The experimental results show that language–image contrastive learning based on dynamic conditional prompts enhances the average accuracy of animal pose estimation on the AP-10K and Animal Pose datasets.
27

Zhou, Xiaolong, Tian Jin, Yongpeng Dai, Yongping Song, and Kemeng Li. "Three-Dimensional Human Pose Estimation from Micro-Doppler Signature Based on SISO UWB Radar." Remote Sensing 16, no. 7 (April 6, 2024): 1295. http://dx.doi.org/10.3390/rs16071295.

Abstract:
In this paper, we propose an innovative approach for transforming 2D human pose estimation into 3D models using Single Input–Single Output (SISO) Ultra-Wideband (UWB) radar technology. This method addresses the significant challenge of reconstructing 3D human poses from 1D radar signals, a task traditionally hindered by low spatial resolution and complex inverse problems. The difficulty is further exacerbated by the ambiguity in 3D pose reconstruction, as multiple 3D poses may correspond to similar 2D projections. Our solution, termed the Radar PoseLifter network, leverages the micro-Doppler signatures inherent in 1D radar echoes to effectively convert 2D pose information into 3D structures. The network is specifically designed to handle the long-range dependencies present in sequences of 2D poses. It employs a fully convolutional architecture, enhanced with a dilated temporal convolutions network, for efficient data processing. We rigorously evaluated the Radar PoseLifter network using the HPSUR dataset, which includes a diverse range of human movements. This dataset comprises data from five individuals with varying physical characteristics, performing a variety of actions. Our experimental results demonstrate the method’s robustness and accuracy in estimating complex human poses, highlighting its effectiveness. This research contributes significantly to the advancement of human motion capture using radar technology. It presents a viable solution for applications where precision and reliability in motion capture are paramount. The study not only enhances the understanding of 3D pose estimation from radar data but also opens new avenues for practical applications in various fields.
28

Zhang, Beichen, and Yue Bao. "Age Estimation of Faces in Videos Using Head Pose Estimation and Convolutional Neural Networks." Sensors 22, no. 11 (May 31, 2022): 4171. http://dx.doi.org/10.3390/s22114171.

Abstract:
Age estimation from human faces is an important yet challenging task in computer vision because of the large differences between physical age and apparent age. Due to the differences including races, genders, and other factors, the performance of a learning method for this task strongly depends on the training data. Although many inspiring works have focused on the age estimation of a single human face through deep learning, the existing methods still have lower performance when dealing with faces in videos because of the differences in head pose between frames, which can lead to greatly different results. In this paper, a combined system of age estimation and head pose estimation is proposed to improve the performance of age estimation from faces in videos. We use deep regression forests (DRFs) to estimate the age of facial images, while a multiloss convolutional neural network is also utilized to estimate the head pose. Accordingly, we estimate the age of faces only for head poses within a set degree threshold to enable value refinement. First, we divided the images in the Cross-Age Celebrity Dataset (CACD) and the Asian Face Age Dataset (AFAD) according to the estimated head pose degrees and generated separate age estimates for images with different poses. The experimental results showed that the accuracy of age estimation from frontal facial images was better than that for faces at different angles, thus demonstrating the effect of head pose on age estimation. Further experiments were conducted on several videos to estimate the age of the same person with his or her face at different angles, and the results show that our proposed combined system can provide more precise and reliable age estimates than a system without head pose estimation.
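Editorial note: the combination rule, accepting an age estimate only when the head pose is within a set degree threshold, is a simple gate. A sketch with hypothetical estimator callables and assumed thresholds:

```python
def gated_age_estimate(face_crops, estimate_age, estimate_head_pose,
                       yaw_limit=20.0, pitch_limit=15.0):
    """Estimate a person's age from video, using only near-frontal faces.

    estimate_age stands in for the deep-regression-forest age model and
    estimate_head_pose for the multi-loss head-pose CNN; both hypothetical.
    Returns the median estimate over accepted frames, or None.
    """
    accepted = []
    for face in face_crops:
        yaw, pitch, _roll = estimate_head_pose(face)
        if abs(yaw) <= yaw_limit and abs(pitch) <= pitch_limit:
            accepted.append(estimate_age(face))
    if not accepted:
        return None
    accepted.sort()
    return accepted[len(accepted) // 2]   # median for robustness
```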
29

Wei, Yongfeng, Hanmeng Zhang, Caili Gong, Dong Wang, Ming Ye, and Yupu Jia. "Study of Pose Estimation Based on Spatio-Temporal Characteristics of Cow Skeleton." Agriculture 13, no. 8 (August 1, 2023): 1535. http://dx.doi.org/10.3390/agriculture13081535.

Abstract:
The pose of cows reflects their body condition, and the information contained in the skeleton can provide data support for lameness, estrus, milk yield, and contraction behavior detection. This paper presents an algorithm for automatically detecting the condition of cows in a real farm environment based on skeleton spatio-temporal features. The cow skeleton is obtained by matching Partial Confidence Maps (PCMs) and Partial Affinity Fields (PAFs). The effectiveness of skeleton extraction was validated by testing 780 images for three different poses (standing, walking, and lying). The results indicate that the Average Precision of Keypoints (APK) for the pelvis is highest in the standing and lying poses, achieving 89.52% and 90.13%, respectively. For walking, the highest APK for the legs was 88.52%, while the back APK was the lowest across all poses. To estimate the pose, a Multi-Scale Temporal Convolutional Network (MS-TCN) was constructed, and comparative experiments were conducted to compare different attention mechanisms and activation functions. Among the tested models, the CMS-TCN with Coord Attention and Gaussian Error Linear Unit (GELU) activation functions achieved precision, recall, and F1 scores of 94.71%, 86.99%, and 90.69%, respectively. This method demonstrates a relatively high detection rate, making it a valuable reference for animal pose estimation in precision livestock farming.
30

Xu, Xixia, Qi Zou, and Xue Lin. "Adaptive Hypergraph Neural Network for Multi-Person Pose Estimation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 2955–63. http://dx.doi.org/10.1609/aaai.v36i3.20201.

Abstract:
This paper proposes a novel two-stage hypergraph-based framework, dubbed ADaptive Hypergraph Neural Network (AD-HNN), to estimate multiple human poses from a single image, with a keypoint localization network and an Adaptive-Pose Hypergraph Neural Network (AP-HNN) added onto the former network. To provide better guided representations for AP-HNN, we employ a Semantic Interaction Convolution (SIC) module within the initial localization network to acquire more explicit predictions. Building upon this, we design a novel adaptive hypergraph to represent a human body for capturing high-order semantic relations among different joints. Notably, it can adaptively adjust the relations between joints and seek the most reasonable structure for variable poses to benefit the keypoint localization. The two stages are combined and trained in an end-to-end fashion. Unlike traditional Graph Convolutional Networks (GCNs) that are based on a fixed tree structure, AP-HNN can deal with ambiguity in human pose estimation. Experimental results demonstrate that AD-HNN achieves state-of-the-art performance on the MS-COCO, MPII, and CrowdPose datasets.
31

Su, Yongzhi, Jason Rambach, Alain Pagani, and Didier Stricker. "SynPo-Net—Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training." Sensors 21, no. 1 (January 5, 2021): 300. http://dx.doi.org/10.3390/s21010300.

Abstract:
Estimation and tracking of 6DoF poses of objects in images is a challenging problem of great importance for robotic interaction and augmented reality. Recent approaches applying deep neural networks for pose estimation have shown encouraging results. However, most of them rely on training with real images of objects, with severe limitations concerning ground-truth pose acquisition, full coverage of possible poses, and training dataset scaling and generalization capability. This paper presents a novel approach using a Convolutional Neural Network (CNN) trained exclusively on single-channel Synthetic images of objects to regress 6DoF object Poses directly (SynPo-Net). SynPo-Net comprises a network architecture specifically designed for pose regression and a domain adaptation scheme that transforms real and synthetic images into an intermediate domain better suited for establishing correspondences. The extensive evaluation shows that our approach significantly outperforms the state of the art using synthetic training in terms of both accuracy and speed. Our system can be used to estimate the 6DoF pose from a single frame, or be integrated into a tracking system to provide the initial pose.
32

Fan, Zhen, Xiu Li, and Yipeng Li. "Multi-Agent Deep Reinforcement Learning for Online 3D Human Poses Estimation." Remote Sensing 13, no. 19 (October 6, 2021): 3995. http://dx.doi.org/10.3390/rs13193995.

Abstract:
Most multi-view based human pose estimation techniques assume the cameras are fixed. While in dynamic scenes, the cameras should be able to move and seek the best views to avoid occlusions and extract 3D information of the target collaboratively. In this paper, we address the problem of online view selection for a fixed number of cameras to estimate multi-person 3D poses actively. The proposed method exploits a distributed multi-agent based deep reinforcement learning framework, where each camera is modeled as an agent, to optimize the action of all the cameras. An inter-agent communication protocol was developed to transfer the cameras’ relative positions between agents for better collaboration. Experiments on the Panoptic dataset show that our method outperforms other view selection methods by a large margin given an identical number of cameras. To the best of our knowledge, our method is the first to address online active multi-view 3D pose estimation with multi-agent reinforcement learning.
33

Barfoot, Timothy D., and Paul T. Furgale. "Associating Uncertainty With Three-Dimensional Poses for Use in Estimation Problems." IEEE Transactions on Robotics 30, no. 3 (June 2014): 679–93. http://dx.doi.org/10.1109/tro.2014.2298059.

34

Mekami, Hayet, Abdennacer Bounoua, and Sidahmed Benabderrahmane. "Leveraging deep learning with symbolic sequences for robust head poses estimation." Pattern Analysis and Applications 23, no. 3 (November 7, 2019): 1391–406. http://dx.doi.org/10.1007/s10044-019-00857-5.

35

Desai, Miral, and Hiren Mewada. "A novel approach for yoga pose estimation based on in-depth analysis of human body joint detection accuracy." PeerJ Computer Science 9 (January 13, 2023): e1152. http://dx.doi.org/10.7717/peerj-cs.1152.

Abstract:
Virtual motion and pose from images and video can be estimated by detecting body joints and their interconnections. The human body takes diverse and complicated poses in yoga, making their classification challenging. This study estimates yoga poses from images using a neural network. Five different yoga poses, viz. downdog, tree, plank, warrior2, and goddess, in the form of RGB images are used as the target inputs. The BlazePose model was used to localize the body joints of the yoga poses. It detects a maximum of 33 body joints, referred to as keypoints, covering almost all the body parts. Keypoints obtained from the model are treated as predicted joint locations. True keypoints, the ground-truth body joints for individual yoga poses, are identified manually using the open-source image annotation tool Makesense AI. A detailed analysis of body joint detection accuracy is proposed in the form of the percentage of correct keypoints (PCK) and the percentage of detected joints (PDJ) for individual body parts and individual body joints, respectively. An algorithm is designed to measure PCK and PDJ in which the distance between the predicted joint location and the true joint location is calculated. The experimental evaluation suggests that the adopted model obtained its maximum PCK of 93.9% for the goddess pose. PDJ evaluation was carried out in staggering mode, where the maximum PDJ obtained was 90% to 100% for almost all the body joints.
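Editorial note: PCK and PDJ are standard metrics, so the measurement step can be stated directly in code. A minimal sketch; the normalizing lengths and thresholds are conventional choices, not necessarily the paper's.

```python
import numpy as np

def pck(pred, truth, reference_len, alpha=0.2):
    """Percentage of Correct Keypoints for one image.

    pred, truth: (J, 2) predicted and ground-truth joint locations; a joint is
    correct when its error is within alpha * reference_len (e.g., torso size).
    """
    dists = np.linalg.norm(np.asarray(pred) - np.asarray(truth), axis=1)
    return 100.0 * float(np.mean(dists <= alpha * reference_len))

def pdj(preds, truths, torso_diameters, joint, frac=0.05):
    """Percentage of Detected Joints for one joint index across a dataset."""
    dists = np.array([np.linalg.norm(np.asarray(p)[joint] - np.asarray(t)[joint])
                      for p, t in zip(preds, truths)])
    return 100.0 * float(np.mean(dists <= frac * np.asarray(torso_diameters)))
```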
36

Hong, Sungjin, and Yejin Kim. "Dynamic Pose Estimation Using Multiple RGB-D Cameras." Sensors 18, no. 11 (November 10, 2018): 3865. http://dx.doi.org/10.3390/s18113865.

Abstract:
Human poses are difficult to estimate due to the complicated body structure and the self-occlusion problem. In this paper, we introduce a marker-less system for human pose estimation by detecting and tracking key body parts, namely the head, hands, and feet. Given color and depth images captured by multiple red, green, blue, and depth (RGB-D) cameras, our system constructs a graph model with segmented regions from each camera and detects the key body parts as a set of extreme points based on accumulative geodesic distances in the graph. During the search process, local detection using a supervised learning model is utilized to match local body features. A final set of extreme points is selected with a voting scheme and tracked with physical constraints from the unified data received from the multiple cameras. During the tracking process, a Kalman filter-based method is introduced to reduce positional noises and to recover from a failure of tracking extremes. Our system shows an average of 87% accuracy against the commercial system, which outperforms the previous multi-Kinects system, and can be applied to recognize a human action or to synthesize a motion sequence from a few key poses using a small set of extremes as input data.
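Editorial note: the Kalman filter-based smoothing and recovery step can be illustrated with a constant-velocity filter per tracked extreme. The motion model and noise parameters below are assumptions, not the authors' exact design.

```python
import numpy as np

class JointKalman:
    """Constant-velocity Kalman filter for one tracked extreme point (x, y, z)."""

    def __init__(self, dt=1 / 30, q=1e-3, r=1e-2):
        self.x = np.zeros(6)                      # state: [position, velocity]
        self.P = np.eye(6)
        self.F = np.eye(6)
        self.F[:3, 3:] = dt * np.eye(3)           # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])
        self.Q = q * np.eye(6)                    # process noise
        self.R = r * np.eye(3)                    # measurement noise

    def step(self, z=None):
        """Predict; update only when a detection z = (x, y, z) is available."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if z is not None:                         # skip update on tracking failure
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ (np.asarray(z, dtype=float) - self.H @ self.x)
            self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]                         # smoothed joint position
```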
37

Bautembach, Dennis, Iason Oikonomidis, and Antonis Argyros. "Filling the Joints: Completion and Recovery of Incomplete 3D Human Poses." Technologies 6, no. 4 (October 30, 2018): 97. http://dx.doi.org/10.3390/technologies6040097.

Abstract:
We present a comparative study of three matrix completion and recovery techniques based on matrix inversion, gradient descent, and Lagrange multipliers, applied to the problem of human pose estimation. 3D human pose estimation algorithms may exhibit noise or may completely fail to provide estimates for some joints. A post-process is often employed to recover the missing joints’ locations from the remaining ones, typically by enforcing kinematic constraints or by using a prior learned from a database of natural poses. Matrix completion and recovery techniques fall into the latter category and operate by filling-in missing entries of a matrix whose available/non-missing entries may be additionally corrupted by noise. We compare the performance of three such techniques in terms of the estimation error of their output as well as their runtime, in a series of simulated and real-world experiments. We conclude by recommending use cases for each of the compared techniques.
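Editorial note: of the three compared techniques, the gradient-descent variant is the easiest to sketch. Below is a generic low-rank matrix completion in that spirit; the matrix layout, rank, and learning rate are assumptions, not the paper's formulation.

```python
import numpy as np

def complete_pose_matrix(M, mask, rank=4, lr=0.01, iters=2000, seed=0):
    """Fill missing joint coordinates via a low-rank factorization M ~ U @ V.

    M: (frames, 3 * joints) matrix of stacked poses, zeros at missing entries;
    mask: same shape, 1.0 where an entry was observed, 0.0 where missing.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((M.shape[0], rank))
    V = 0.1 * rng.standard_normal((rank, M.shape[1]))
    for _ in range(iters):
        residual = mask * (U @ V - M)      # error on observed entries only
        grad_U, grad_V = residual @ V.T, U.T @ residual
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V                           # read recovered joints off this matrix
```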
38

Liu, Yuanyuan, Xingmei Li, Fang Fang, Fayong Zhang, Jingying Chen, and Zhizhong Zeng. "Visual Focus of Attention and Spontaneous Smile Recognition Based on Continuous Head Pose Estimation by Cascaded Multi-Task Learning." International Journal of Pattern Recognition and Artificial Intelligence 33, no. 07 (June 7, 2019): 1940006. http://dx.doi.org/10.1142/s0218001419400068.

Abstract:
Multi-person visual focus of attention (M-VFOA) and spontaneous smile (SS) recognition are important for understanding and analyzing students' behavior in class. Recently, promising results have been reported using special hardware in constrained environments. However, M-VFOA and SS remain challenging problems in natural, crowded classroom environments, e.g. with various poses, occlusions, expressions, illumination changes, and poor image quality. In this study, a robust and non-invasive M-VFOA and SS recognition system has been developed based on continuous head pose estimation in the natural classroom. A novel cascaded multi-task Hough forest (CM-HF) combined with weighted Hough voting and multi-task learning is proposed for continuous head pose estimation, tip-of-the-nose location, and SS recognition, which improves recognition accuracy and reduces training time. M-VFOA can then be recognized based on estimated head poses, environmental cues, and prior states in the natural classroom. Meanwhile, SS is classified using CM-HF with local cascaded mouth-eye areas normalized by the estimated head poses. The method is rigorously evaluated for continuous head pose estimation, multi-person VFOA recognition, and SS recognition on several publicly available datasets and real classroom video sequences. Experimental results show that our method greatly reduces training time and outperforms state-of-the-art methods in both performance and robustness, with an average accuracy of 83.5% on head pose estimation, 67.8% on M-VFOA recognition, and 97.1% on SS recognition in challenging environments.
39

Ravan, Aniket, Ruopei Feng, Martin Gruebele, and Yann R. Chemla. "Rapid automated 3-D pose estimation of larval zebrafish using a physical model-trained neural network." PLOS Computational Biology 19, no. 10 (October 23, 2023): e1011566. http://dx.doi.org/10.1371/journal.pcbi.1011566.

Abstract:
Quantitative ethology requires an accurate estimation of an organism’s postural dynamics in three dimensions plus time. Technological progress over the last decade has made animal pose estimation in challenging scenarios possible with unprecedented detail. Here, we present (i) a fast automated method to record and track the pose of individual larval zebrafish in a 3-D environment, applicable when accurate human labeling is not possible; (ii) a rich annotated dataset of 3-D larval poses for ethologists and the general zebrafish and machine learning community; and (iii) a technique to generate realistic, annotated larval images in different behavioral contexts. Using a three-camera system calibrated with refraction correction, we record diverse larval swims under free swimming conditions and in response to acoustic and optical stimuli. We then employ a convolutional neural network to estimate 3-D larval poses from video images. The network is trained against a set of synthetic larval images rendered using a 3-D physical model of larvae. This 3-D model samples from a distribution of realistic larval poses that we estimate a priori using a template-based pose estimation of a small number of swim bouts. Our network model, trained without any human annotation, performs larval pose estimation three orders of magnitude faster and with accuracy comparable to the template-based approach, capturing detailed kinematics of 3-D larval swims. It also applies accurately to other datasets collected under different imaging conditions and containing behavioral contexts not included in our training.
40

Wang, X., H. Yu, and D. Feng. "Pose estimation in runway end safety area using geometry structure features." Aeronautical Journal 120, no. 1226 (April 2016): 675–91. http://dx.doi.org/10.1017/aer.2016.16.

Abstract:
A novel image-based method is presented in this paper to estimate the poses of commercial aircraft in a runway end safety area. Based on the fact that similar poses of an aircraft have similar geometric structures, this method first extracts features describing the structure of an aircraft's fuselage and aerofoil using the RANdom SAmple Consensus (RANSAC) algorithm, and then uses central moments to obtain the aircraft's pose information. Based on the proposed pose information, a two-step feature matching strategy is further designed to identify an aircraft's particular pose. In order to validate the accuracy of the pose estimation and the effectiveness of the proposed algorithm, we construct a pose database of two aircraft common in Asia. The experiments show that the designed low-dimension features can accurately capture the aircraft's pose information and the proposed algorithm can achieve satisfactory matching accuracy.
41

Wang, Xianghan, Jie Jiang, Yanming Guo, Lai Kang, Yingmei Wei, and Dan Li. "CFAM: Estimating 3D Hand Poses from a Single RGB Image with Attention." Applied Sciences 10, no. 2 (January 15, 2020): 618. http://dx.doi.org/10.3390/app10020618.

Abstract:
Precise 3D hand pose estimation can be used to improve the performance of human–computer interaction (HCI). Specifically, computer-vision-based hand pose estimation can make this process more natural. Most traditional computer-vision-based hand pose estimation methods use depth images as the input, which requires complicated and expensive acquisition equipment. Estimation through a single RGB image is more convenient and less expensive. Previous methods based on RGB images utilize only 2D keypoint score maps to recover 3D hand poses but ignore the hand texture features and the underlying spatial information in the RGB image, which leads to a relatively low accuracy. To address this issue, we propose a channel fusion attention mechanism that combines 2D keypoint features and RGB image features at the channel level. In particular, the proposed method replans weights by using cascading RGB images and 2D keypoint features, which enables rational planning and the utilization of various features. Moreover, our method improves the fusion performance of different types of feature maps. Multiple contrast experiments on public datasets demonstrate that the accuracy of our proposed method is comparable to the state-of-the-art accuracy.
42

El Kaid, Amal, and Karim Baïna. "A Systematic Review of Recent Deep Learning Approaches for 3D Human Pose Estimation." Journal of Imaging 9, no. 12 (December 12, 2023): 275. http://dx.doi.org/10.3390/jimaging9120275.

Abstract:
Three-dimensional human pose estimation has made significant advancements through the integration of deep learning techniques. This survey provides a comprehensive review of recent 3D human pose estimation methods, with a focus on monocular images, videos, and multi-view cameras. Our approach stands out through a systematic literature review methodology, ensuring an up-to-date and meticulous overview. Unlike many existing surveys that categorize approaches based on learning paradigms, our survey offers a fresh perspective, delving deeper into the subject. For image-based approaches, we not only follow existing categorizations but also introduce and compare significant 2D models. Additionally, we provide a comparative analysis of these methods, enhancing the understanding of image-based pose estimation techniques. In the realm of video-based approaches, we categorize them based on the types of models used to capture inter-frame information. Furthermore, in the context of multi-person pose estimation, our survey uniquely differentiates between approaches focusing on relative poses and those addressing absolute poses. Our survey aims to serve as a pivotal resource for researchers, highlighting state-of-the-art deep learning strategies and identifying promising directions for future exploration in 3D human pose estimation.
43

Liu, Pengpeng, Tao Yu, Zhi Zeng, Yebin Liu, Guixuan Zhang, and Zhen Song. "Relative Pose Estimation for RGB-D Human Input Scans via Implicit Function Reconstruction." Wireless Communications and Mobile Computing 2022 (February 11, 2022): 1–9. http://dx.doi.org/10.1155/2022/4351951.

Abstract:
To achieve promising performance on relative pose estimation for RGB-D scans, most existing methods require considerable overlap between the two RGB-D inputs. However, in many practical applications involving human scans, we often have to estimate relative poses under arbitrary overlap, which is challenging for existing methods. To deal with this problem, this paper presents a novel end-to-end, coarse-to-fine optimization method. Our method is self-supervised and is the first to combine implicit function reconstruction with differentiable rendering for relative pose estimation of RGB-D human input scans at arbitrary overlap. The insight is to exploit the underlying human geometry prior as much as possible. First, for stable coarse poses, we use implicit function reconstruction to dig out abundant hidden cues from unseen regions in the initialization module. To further refine the poses, differentiable rendering is leveraged to establish a self-supervision mechanism in the optimization module, which is independent of standard pipelines for feature extraction and accurate correspondence matching. More importantly, our proposed method can be flexibly extended to multi-view input scans. The results and evaluations demonstrate that our optimization module is robust to real-world noisy inputs, and our approach considerably outperforms standard pipelines in non-overlapping setups.
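The self-supervision mechanism, backpropagating a render-versus-observation loss directly into the 6-DoF pose, can be sketched with a toy differentiable "renderer" that returns per-vertex depth; a real pipeline would rasterize the reconstructed implicit surface instead. Everything below is an illustrative assumption, not the paper's implementation.

```python
import torch

def axis_angle_to_matrix(r):
    """Rodrigues' formula, differentiable in the axis-angle vector r."""
    theta = r.norm() + 1e-8
    k = r / theta
    zero = torch.zeros((), dtype=r.dtype)
    K = torch.stack([
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def render_depth(vertices, pose):
    """Toy differentiable 'renderer': transforms the mesh and returns
    per-vertex depth instead of rasterizing a depth image."""
    R = axis_angle_to_matrix(pose[:3])
    return (vertices @ R.T + pose[3:])[:, 2]

# Self-supervised refinement: gradients of the rendering loss flow
# straight into the 6-DoF pose; no correspondence matching is required.
vertices = torch.rand(500, 3)
target = render_depth(vertices, torch.tensor([0.1, -0.2, 0.05, 0.0, 0.0, 0.3]))
pose = torch.tensor([0.01, 0.01, 0.01, 0.0, 0.0, 0.0], requires_grad=True)
opt = torch.optim.Adam([pose], lr=1e-2)
for _ in range(300):
    loss = torch.nn.functional.l1_loss(render_depth(vertices, pose), target)
    opt.zero_grad(); loss.backward(); opt.step()
```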
44

Jo, BeomJun, and SeongKi Kim. "Comparative Analysis of OpenPose, PoseNet, and MoveNet Models for Pose Estimation in Mobile Devices." Traitement du Signal 39, no. 1 (February 28, 2022): 119–24. http://dx.doi.org/10.18280/ts.390111.

Abstract:
Pose estimation is a significant strategy that has been actively researched in various fields. For example, it has been adopted for motion capture in moviemaking and for character control in video games. It can also be applied to implement the user interfaces of mobile devices through human poses. Therefore, this paper compares and analyzes four popular pose estimation models, namely OpenPose, PoseNet, MoveNet Lightning, and MoveNet Thunder, using pre-classified images. The results show that MoveNet Lightning was the fastest and OpenPose the slowest among the four models; however, OpenPose was the only model capable of estimating the poses of multiple persons. The accuracies of OpenPose, PoseNet, MoveNet Lightning, and MoveNet Thunder were 86.2%, 97.6%, 75.1%, and 80.6%, respectively.
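Running one of these models takes only a few lines. Below is a sketch of single-pose MoveNet Lightning via TensorFlow Hub; the model URL and output layout reflect the TF Hub release and should be verified against current documentation.

```python
import tensorflow as tf
import tensorflow_hub as hub

# MoveNet Lightning, single-pose variant (TF Hub URL assumed current).
model = hub.load("https://tfhub.dev/google/movenet/singlepose/lightning/4")
movenet = model.signatures["serving_default"]

image = tf.io.decode_jpeg(tf.io.read_file("person.jpg"))
inp = tf.cast(tf.image.resize_with_pad(image[tf.newaxis], 192, 192), tf.int32)

# Output: [1, 1, 17, 3] -> 17 COCO keypoints as (y, x, confidence).
keypoints = movenet(inp)["output_0"][0, 0].numpy()
```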
45

Yuan, Honglin, Tim Hoogenkamp, and Remco C. Veltkamp. "RobotP: A Benchmark Dataset for 6D Object Pose Estimation." Sensors 21, no. 4 (February 11, 2021): 1299. http://dx.doi.org/10.3390/s21041299.

Abstract:
Deep learning has achieved great success on robotic vision tasks. However, compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation because of the inherent difficulty of data collection. In this paper, we propose the RobotP dataset, consisting of commonly used objects, for benchmarking 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground-truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground-truth 6D poses. Our dataset is freely distributed to research groups through the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested on the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and that photo-realistic images help increase the performance of pose estimation algorithms.
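A common way to score predictions against such ground-truth poses is the average distance (ADD) metric; the benchmark's exact evaluation protocol may differ, so the numpy sketch below is only illustrative.

```python
import numpy as np

def add_metric(model_pts, R_gt, t_gt, R_est, t_est):
    """Average distance (ADD): mean distance between the object's model
    points under the ground-truth and estimated 6D poses. Lower is better;
    a common acceptance threshold is 10% of the object diameter."""
    gt = model_pts @ R_gt.T + t_gt
    est = model_pts @ R_est.T + t_est
    return np.linalg.norm(gt - est, axis=1).mean()
```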
46

Miyazaki, Jisho, and Keiji Matsumoto. "Imaginarity-free quantum multiparameter estimation." Quantum 6 (March 10, 2022): 665. http://dx.doi.org/10.22331/q-2022-03-10-665.

Abstract:
Multiparameter quantum estimation is made difficult by the following three obstacles. First, incompatibility among different physical quantities poses a limit on the attainable precision. Second, the ultimate precision is not saturated until the optimal measurement is discovered. Third, the optimal measurement may in general depend on the target values of the parameters, and thus may be impossible to perform on unknown target states. We present a method to circumvent these three obstacles. A class of quantum statistical models, which utilize antiunitary symmetries or, equivalently, real density matrices, offers compatible multiparameter estimation. The symmetries accompany target-independent optimal measurements for pure-state models. Based on this finding, we propose methods to implement antiunitary symmetries in quantum metrology schemes. We further introduce a function that measures the antiunitary asymmetry of quantum statistical models as a potential tool to characterize the quantumness of phase transitions.
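The incompatibility obstacle has a standard formulation. As a sketch in conventional notation (not taken from the paper itself): the symmetric logarithmic derivatives (SLDs) of a model define the quantum Fisher information matrix, and the multiparameter SLD Cramér–Rao bound is asymptotically attainable exactly when the weak commutativity condition holds, a condition real density matrices satisfy automatically because real SLDs make every trace real.

```latex
% SLDs and the quantum Fisher information matrix (standard definitions):
\partial_i \rho_\theta = \tfrac{1}{2}\,(\rho_\theta L_i + L_i \rho_\theta),
\qquad
J_{ij}(\theta) = \operatorname{Re}\,\operatorname{Tr}(\rho_\theta L_i L_j).

% Weak commutativity (compatibility) condition for attaining
% the multiparameter SLD Cramer-Rao bound:
\operatorname{Im}\,\operatorname{Tr}(\rho_\theta L_i L_j) = 0 \quad \forall\, i, j.
```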
47

Wang, Fei, Chen Liang, Changlei Ru, and Hongtai Cheng. "An Improved Point Cloud Descriptor for Vision Based Robotic Grasping System." Sensors 19, no. 10 (May 14, 2019): 2225. http://dx.doi.org/10.3390/s19102225.

Abstract:
In this paper, a novel global point cloud descriptor is proposed for reliable object recognition and pose estimation, which can be effectively applied to robot grasping operations. The viewpoint feature histogram (VFH) is widely used for three-dimensional (3D) object recognition and pose estimation in real scenes captured by depth sensors because of its recognition performance and computational efficiency. However, when an object has a mirrored structure, it is often difficult to distinguish mirrored poses relative to the viewpoint using VFH. To solve this difficulty, this study presents an improved feature descriptor named the orthogonal viewpoint feature histogram (OVFH), which contains two components: a surface shape component and an improved viewpoint direction component. The improved viewpoint component is calculated from the vector orthogonal to the viewpoint direction, which is obtained based on a reference frame estimated for the entire point cloud. An evaluation of OVFH on a publicly available dataset indicates that it enhances the ability to distinguish between mirrored poses while preserving object recognition performance. The proposed method uses OVFH to recognize and register objects in the database and obtains precise poses by using the iterative closest point (ICP) algorithm. The experimental results show that the proposed approach can effectively guide a robot to grasp objects with mirrored poses.
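The recognize-then-refine pipeline (a global descriptor match for a coarse pose, then ICP for the precise pose) can be sketched with Open3D; the paper presumably builds on PCL-style descriptors, so Open3D and the file names here are stand-ins.

```python
import numpy as np
import open3d as o3d

scene = o3d.io.read_point_cloud("scene_segment.pcd")       # segmented target
model = o3d.io.read_point_cloud("matched_db_model.pcd")    # recognized model

# Coarse pose from descriptor matching (placeholder: identity).
init = np.eye(4)

# ICP refines the coarse pose to a precise, grasp-ready transform.
result = o3d.pipelines.registration.registration_icp(
    model, scene, max_correspondence_distance=0.01, init=init,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
print(result.transformation)
```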
48

Amadi, Lawrence, and Gady Agam. "Weakly Supervised 2D Pose Adaptation and Body Part Segmentation for Concealed Object Detection." Sensors 23, no. 4 (February 10, 2023): 2005. http://dx.doi.org/10.3390/s23042005.

Abstract:
Weakly supervised pose estimation can be used to assist unsupervised body part segmentation and concealed item detection. The accuracy of pose estimation is essential for precise body part segmentation and accurate concealed item detection. In this paper, we show how poses obtained from an RGB-pretrained 2D pose detector can be adapted to the backscatter image domain. The 2D poses are refined using RANSAC bundle adjustment to minimize the projection loss in 3D. Furthermore, we show how 2D poses can be optimized using a newly proposed 3D-to-2D pose correction network, weakly supervised with pose prior regularizers and multi-view pose and posture consistency losses. The optimized 2D poses are used to segment human body parts. We then train a body-part-aware anomaly detection network to detect foreign (concealed threat) objects on segmented body parts. Our work is applied to the TSA passenger screening dataset, which contains millimeter-wave scan images of airport travelers annotated with only binary labels indicating whether a foreign object is concealed on a body part. Our proposed approach significantly improves on the detection accuracy of existing work on TSA 2D backscatter images, with a state-of-the-art performance of 97% F1-score, 0.0559 log-loss on the TSA-PSD test set, and a 74% reduction in 2D pose error.
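The multi-view consistency idea, lifting 2D joints to 3D and penalizing the reprojection residual in each view, can be sketched with OpenCV triangulation. This is a generic two-view illustration, not the paper's RANSAC bundle adjustment.

```python
import cv2
import numpy as np

def reprojection_error(P1, P2, joints1, joints2):
    """Triangulate 2D joints from two views (3x4 projection matrices P1, P2,
    joint arrays of shape (J, 2)), reproject, and return the mean pixel
    residual used as a multi-view consistency loss."""
    pts4d = cv2.triangulatePoints(P1, P2, joints1.T, joints2.T)  # 4 x J
    pts3d = (pts4d[:3] / pts4d[3]).T                             # J x 3
    err = 0.0
    for P, joints in ((P1, joints1), (P2, joints2)):
        proj = P @ np.c_[pts3d, np.ones(len(pts3d))].T           # 3 x J
        proj = (proj[:2] / proj[2]).T
        err += np.linalg.norm(proj - joints, axis=1).mean()
    return err / 2
```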
49

Zhao, Shida, Zongchun Bai, Lili Meng, Guofeng Han, and Enze Duan. "Pose Estimation and Behavior Classification of Jinling White Duck Based on Improved HRNet." Animals 13, no. 18 (September 10, 2023): 2878. http://dx.doi.org/10.3390/ani13182878.

Abstract:
In duck breeding, obtaining pose information is vital for perceiving the birds' physiological health, ensuring breeding welfare, and monitoring environmental comfort. This paper proposes a pose estimation method that combines HRNet and CBAM to achieve automatic and accurate detection of a duck's multiple poses. Through comparison, HRNet-32 is identified as the optimal option for duck pose estimation. Based on this, multiple CBAM modules are densely embedded into the HRNet-32 network to obtain a pose estimation model, HRNet-32-CBAM, realizing accurate detection and association of eight keypoints across six different behaviors. Furthermore, the model's generalization ability is tested under different illumination conditions, and its comprehensive detection abilities are evaluated on Cherry Valley ducklings of 12 and 24 days of age. Moreover, the model is compared with mainstream pose estimation methods to reveal its advantages and disadvantages, and its real-time performance is tested on images of 256 × 256, 512 × 512, and 728 × 728 pixels. The experimental results indicate that the proposed method achieves an average precision (AP) of 0.943 on the duck pose estimation dataset, has strong generalization ability, and can estimate a duck's multiple poses in real time across different ages, breeds, and farming modes. This study provides a technical reference and basis for the intelligent farming of poultry.
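CBAM itself is compact enough to sketch: channel attention from pooled channel descriptors, then spatial attention from channel-pooled maps. The module below follows the standard CBAM formulation from the original paper; the dense embedding into HRNet-32 is not shown.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel then spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(           # shared MLP for channel attention
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        ca = torch.sigmoid(self.mlp(x.mean((2, 3))) +
                           self.mlp(x.amax((2, 3)))).view(b, c, 1, 1)
        x = x * ca
        # Spatial attention: 7x7 conv over channel-wise mean and max maps.
        sa = torch.sigmoid(self.conv(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa

out = CBAM(32)(torch.rand(2, 32, 64, 64))
```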
50

Borthakur, Debanjan, Arindam Paul, Dev Kapil, and Manob Jyoti Saikia. "Yoga Pose Estimation Using Angle-Based Feature Extraction." Healthcare 11, no. 24 (December 9, 2023): 3133. http://dx.doi.org/10.3390/healthcare11243133.

Abstract:
Objective: This research addresses the challenges of maintaining proper yoga postures, an issue that has been exacerbated by the COVID-19 pandemic and the subsequent shift to virtual platforms for yoga instruction. It aims to develop a mechanism for detecting correct yoga poses and providing real-time feedback through the application of computer vision and machine learning (ML) techniques.

Methods and Procedures: This study utilized computer-vision-based pose estimation methods to extract features and calculate yoga pose angles. A variety of models, including extremely randomized trees, logistic regression, random forest, gradient boosting, extreme gradient boosting, and deep neural networks, were trained and tested to classify yoga poses. The study employed the Yoga-82 dataset, consisting of many yoga pose images downloaded from the web.

Results: The extremely randomized trees model outperformed the other models, achieving the highest prediction accuracy of 91% on the test dataset and 92% in a fivefold cross-validation experiment. Random forest, gradient boosting, extreme gradient boosting, and deep neural networks achieved accuracies of 90%, 89%, 90%, and 85%, respectively, while logistic regression underperformed with the lowest accuracy.

Conclusion: This research concludes that the extremely randomized trees model offers superior predictive power for yoga pose recognition, suggesting a valuable avenue for future exploration in this domain. Moreover, the approach has significant potential for implementation on low-powered smartphones with minimal latency, thereby enabling real-time feedback for users practicing yoga at home.
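The angle-based feature extraction reduces each detected pose to a vector of joint angles before classification. A minimal scikit-learn sketch follows; the keypoint source, angle set, and placeholder data are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c, e.g. the elbow
    angle from shoulder-elbow-wrist keypoints."""
    v1, v2 = a - b, c - b
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def pose_to_features(keypoints, triplets):
    # keypoints: (J, 2) array from any 2D pose estimator;
    # triplets: index triples defining which joint angles to extract.
    return np.array([joint_angle(keypoints[i], keypoints[j], keypoints[k])
                     for i, j, k in triplets])

# Illustrative training on precomputed angle features (placeholder data).
X = np.random.rand(200, 8) * 180        # 8 joint angles per pose
y = np.random.randint(0, 5, 200)        # 5 yoga pose classes
clf = ExtraTreesClassifier(n_estimators=300).fit(X, y)
```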
