
Journal articles on the topic 'Human keypoint detection'

Consult the top 21 journal articles for your research on the topic 'Human keypoint detection.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Zhang, Jing, Zhe Chen, and Dacheng Tao. "Towards High Performance Human Keypoint Detection." International Journal of Computer Vision 129, no. 9 (July 1, 2021): 2639–62. http://dx.doi.org/10.1007/s11263-021-01482-8.

2

Gajic, Dusan, Gorana Gojic, Dinu Dragan, and Veljko Petrovic. "Comparative evaluation of keypoint detectors for 3d digital avatar reconstruction." Facta universitatis - series: Electronics and Energetics 33, no. 3 (2020): 379–94. http://dx.doi.org/10.2298/fuee2003379g.

Abstract:
Three-dimensional personalized human avatars have been successfully utilized in shopping, entertainment, education, and health applications. However, it is still a challenging task to obtain both a complete and highly detailed avatar automatically. One approach is to use general-purpose, photogrammetry-based algorithms on a series of overlapping images of the person. We argue that the quality of avatar reconstruction can be increased by modifying parts of the photogrammetry-based algorithm pipeline to be more specifically tailored to the human body shape. In this context, we perform an extensive, standalone evaluation of eleven algorithms for keypoint detection, which is the first phase of the photogrammetry-based reconstruction pipeline. We include the well-established, patented Distinctive Image Features from Scale-Invariant Keypoints (SIFT) and Speeded-Up Robust Features (SURF) detection algorithms as a baseline, since they are widely incorporated into photogrammetry-based software. All experiments are conducted on a dataset of 378 images of the human body captured in a controlled, multi-view stereo setup. Our findings are that binary detectors substantially outperform the commonly used SIFT-like detectors in the avatar reconstruction task, both in terms of detection speed and in the number of detected keypoints.
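The corner-response idea behind such detector comparisons can be illustrated with a minimal, NumPy-only Harris corner response (an illustrative sketch, not one of the eleven detectors evaluated in the paper):

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner response: R = det(M) - k * trace(M)^2,
    where M is the structure tensor summed over a 3x3 window."""
    iy, ix = np.gradient(img.astype(float))  # gradients along rows, cols

    def box_sum(a):  # sum over each pixel's 3x3 neighbourhood
        p = np.pad(a, 1)
        h, w = a.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

    sxx, syy, sxy = box_sum(ix * ix), box_sum(iy * iy), box_sum(ix * iy)
    return (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2

# toy image: a bright square on a dark background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

Corners give a strongly positive response, straight edges a negative one, and flat regions roughly zero, which is exactly the signal a keypoint detector thresholds on.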
3

Jeong, Jeongseok, Byeongjun Park, and Kyoungro Yoon. "3D Human Skeleton Keypoint Detection Using RGB and Depth Image." Transactions of The Korean Institute of Electrical Engineers 70, no. 9 (September 30, 2021): 1354–61. http://dx.doi.org/10.5370/kiee.2021.70.9.1354.

4

Xu, Ruinian, Fu-Jen Chu, Chao Tang, Weiyu Liu, and Patricio Vela. "An Affordance Keypoint Detection Network for Robot Manipulation." IEEE Robotics and Automation Letters 6, no. 2 (April 2021): 2870–77. http://dx.doi.org/10.1109/lra.2021.3062560.

5

Wang, Jue, and Zhigang Luo. "Pointless Pose: Part Affinity Field-Based 3D Pose Estimation without Detecting Keypoints." Electronics 10, no. 8 (April 13, 2021): 929. http://dx.doi.org/10.3390/electronics10080929.

Abstract:
Human pose estimation finds its application in an extremely wide domain and is therefore never pointless. We propose in this paper a new approach that, unlike any prior one that we are aware of, bypasses the 2D keypoint detection step based on which the 3D pose is estimated, and is thus pointless. Our motivation is rather straightforward: 2D keypoint detection is vulnerable to occlusions and out-of-image absences, in which case the 2D errors propagate to 3D recovery and deteriorate the results. To this end, we resort to explicitly estimating the human body regions of interest (ROI) and their 3D orientations. Even if a portion of the human body, like the lower arm, is partially absent, the predicted orientation vector pointing from the upper arm will take advantage of the local image evidence and recover the 3D pose. This is achieved, specifically, by deforming a skeleton-shaped puppet template to fit the estimated orientation vectors. Despite its simple nature, the proposed approach yields truly robust and state-of-the-art results on several benchmarks and in-the-wild data.
6

Tinchev, Georgi, Adrian Penate-Sanchez, and Maurice Fallon. "SKD: Keypoint Detection for Point Clouds Using Saliency Estimation." IEEE Robotics and Automation Letters 6, no. 2 (April 2021): 3785–92. http://dx.doi.org/10.1109/lra.2021.3065224.

7

Apurupa, Leela, J. D.Dorathi Jayaseeli, and D. Malathi. "An Integrated Technique for Image Forgery Detection using Block and Keypoint Based Feature Techniques." International Journal of Engineering & Technology 7, no. 3.12 (July 20, 2018): 505. http://dx.doi.org/10.14419/ijet.v7i3.12.16168.

Abstract:
The growth of the Internet has brought remarkable developments to celebrated research fields such as medicine, satellite imaging, image processing, security, biometrics, and genetics. Although the algorithms implemented in the twenty-first century have made human life more comfortable and secure, protecting the authenticity of original documents belonging to a genuine person remains a concern in the digital image processing domain. A new study is proposed in this paper to detect image forgery. The suspected regions are detected via adaptive, non-overlapping, irregular blocks, a process carried out using an adaptive over-segmentation algorithm. Feature points are extracted by performing matching between each block and its features. The feature points are then progressively replaced by superpixels in the proposed Forgery Region Extraction algorithm, which merges neighboring blocks with similar local color features into feature blocks to generate unified regions; finally, a morphological operation is applied to the unified regions to obtain the detected forgery regions. The proposed forgery detection algorithm achieves far better detection results than earlier methods, even under various challenging conditions. We have analyzed the results obtained with both SIFT and SURF, and the proposed SURF-based technique gives more satisfactory results in both subjective and objective analysis.
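The copy-move detection idea can be sketched with a naive exact-duplicate block search (illustrative only; the paper's method uses adaptive over-segmentation and SIFT/SURF features, which tolerate post-processing that would defeat this toy version):

```python
import numpy as np

def find_duplicate_blocks(img, b=4, min_shift=4):
    """Report pairs of identical b-by-b blocks whose positions differ by
    more than min_shift pixels: a crude copy-move forgery cue."""
    h, w = img.shape
    seen, matches = {}, []
    for y in range(h - b + 1):
        for x in range(w - b + 1):
            key = img[y:y + b, x:x + b].tobytes()
            if key in seen:
                sy, sx = seen[key]
                if abs(sy - y) + abs(sx - x) > min_shift:
                    matches.append(((sy, sx), (y, x)))
            else:
                seen[key] = (y, x)
    return matches

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (24, 24), dtype=np.uint8)
img[12:18, 12:18] = img[2:8, 2:8]   # simulate a copy-move forgery
matches = find_duplicate_blocks(img)
```

Every matched pair here shares the same displacement, which is the telltale pattern forgery detectors cluster on before extracting the forged region.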
8

T. Psota, Eric, Ty Schmidt, Benny Mote, and Lance C. Pérez. "Long-Term Tracking of Group-Housed Livestock Using Keypoint Detection and MAP Estimation for Individual Animal Identification." Sensors 20, no. 13 (June 30, 2020): 3670. http://dx.doi.org/10.3390/s20133670.

Abstract:
Tracking individual animals in a group setting is an exigent task for computer vision and animal science researchers. When the objective is months of uninterrupted tracking and the targeted animals lack discernible differences in their physical characteristics, this task introduces significant challenges. To address these challenges, a probabilistic tracking-by-detection method is proposed. The tracking method uses, as input, visible keypoints of individual animals provided by a fully-convolutional detector. Individual animals are also equipped with ear tags that are used by a classification network to assign unique identification to instances. The fixed cardinality of the targets is leveraged to create a continuous set of tracks and the forward-backward algorithm is used to assign ear-tag identification probabilities to each detected instance. Tracking achieves real-time performance on consumer-grade hardware, in part because it does not rely on complex, costly, graph-based optimizations. A publicly available, human-annotated dataset is introduced to evaluate tracking performance. This dataset contains 15 half-hour long videos of pigs with various ages/sizes, facility environments, and activity levels. Results demonstrate that the proposed method achieves an average precision and recall greater than 95% across the entire dataset. Analysis of the error events reveals environmental conditions and social interactions that are most likely to cause errors in real-world deployments.
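The forward-backward identity assignment the abstract mentions can be illustrated on a toy two-identity HMM, with sticky transitions standing in for track continuity and made-up ear-tag classifier scores as emission likelihoods:

```python
import numpy as np

def forward_backward(pi, A, lik):
    """Posterior identity probabilities per frame for an HMM.
    pi: initial distribution; A: transition matrix;
    lik[t, s]: likelihood of frame t's observation under identity s."""
    T, S = lik.shape
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))
    alpha[0] = pi * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                      # forward pass
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        alpha[t] /= alpha[t].sum()             # rescale for stability
    for t in range(T - 2, -1, -1):             # backward pass
        beta[t] = A @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

A = np.array([[0.9, 0.1], [0.1, 0.9]])   # identities persist across frames
lik = np.array([[0.8, 0.2],              # frame 0: tag reads "pig 0"
                [0.5, 0.5],              # frame 1: tag occluded
                [0.9, 0.1]])             # frame 2: tag reads "pig 0"
gamma = forward_backward(np.array([0.5, 0.5]), A, lik)
```

The point of the algorithm is visible in the middle frame: an occluded ear tag is disambiguated by the confident detections on either side.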
9

Wang, Yuan-Kai, Hong-Yu Chen, and Jian-Ru Chen. "Unobtrusive Sleep Monitoring Using Movement Activity by Video Analysis." Electronics 8, no. 7 (July 20, 2019): 812. http://dx.doi.org/10.3390/electronics8070812.

Abstract:
Sleep healthcare at home is a new research topic that needs to develop new sensors, hardware and algorithms with the consideration of convenience, portability and accuracy. Monitoring sleep behaviors by visual sensors represents one new unobtrusive approach to facilitating sleep monitoring and benefits sleep quality. The challenge of video surveillance for sleep behavior analysis is that we have to tackle bad image illumination and large pose variations during sleep. This paper proposes a robust method for sleep pose analysis with a human joints model. The method first tackles the illumination variation of infrared videos to improve image quality and help better feature extraction. Image matching by keypoint features is proposed to detect and track the positions of human joints and build a human model robust to occlusion. Sleep poses are then inferred from joint positions by probabilistic reasoning in order to tolerate occluded joints. Experiments are conducted on video polysomnography data recorded in a sleep laboratory. Sleep pose experiments are given to examine the accuracy of joint detection and tracking, and the accuracy of sleep poses. The high accuracy of the experiments demonstrates the validity of the proposed method.
10

Herpers, R., L. Witta, J. Bruske, and G. Sommer. "Dynamic Cell Structures for the Evaluation of Keypoints in Facial Images." International Journal of Neural Systems 08, no. 01 (February 1997): 27–39. http://dx.doi.org/10.1142/s0129065797000057.

Abstract:
In this contribution Dynamic Cell Structures (DCS network) are applied to classify local image structures at particular facial landmarks. The facial landmarks such as the corners of the eyes or intersections of the iris with the eyelid are computed in advance by a combined model and data driven sequential search strategy. To reduce the detection error after the processing of the sequential search strategy, the computed image positions are verified applying a DCS network. The DCS network is trained by supervised learning with feature vectors which encode spatially arranged edge and structural information at the keypoint position considered. The model driven localization as well as the data driven verification are based on steerable filters, which build a representation comparable with one provided by a receptive field in the human visual system. We apply a DCS based classifier because of its ability to grasp the topological structure of complex input spaces and because it has proved successful in a number of other classification tasks. In our experiments the average error resulting from false positive classifications is less than 1%.
11

Jin, Ren, Jiaqi Jiang, Yuhua Qi, Defu Lin, and Tao Song. "Drone Detection and Pose Estimation Using Relational Graph Networks." Sensors 19, no. 6 (March 26, 2019): 1479. http://dx.doi.org/10.3390/s19061479.

Abstract:
With the upsurge in the use of Unmanned Aerial Vehicles (UAVs), drone detection and pose estimation using optical sensors has become an important research subject in cooperative flight and low-altitude security. The existing technology only obtains the position of the target UAV based on object detection methods. To achieve better adaptability and enhanced cooperative performance, the attitude information of the target drone becomes a key message for understanding its state and intention, e.g., the acceleration of quadrotors. At present, most object 6D pose estimation algorithms depend on accurate pose annotation or a 3D target model, which costs a lot of human effort and is difficult to apply to non-cooperative targets. To overcome these problems, a quadrotor 6D pose estimation algorithm is proposed in this paper. It is based on keypoint detection (requiring only keypoint annotations), a relational graph network and the perspective-n-point (PnP) algorithm, and achieves state-of-the-art performance both in simulation and in real scenarios. In addition, the inference ability of our relational graph network on the keypoints of the four motors is also evaluated. The accuracy and speed are improved significantly compared with the state-of-the-art keypoint detection algorithm.
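The keypoints-plus-PnP step can be illustrated with the textbook DLT estimate of a projection matrix from 2D-3D correspondences (in practice one would call a dedicated PnP solver such as OpenCV's; this NumPy sketch with synthetic data only shows the idea):

```python
import numpy as np

def dlt_projection(X, uv):
    """Estimate a 3x4 projection matrix from >=6 non-coplanar 3D points X
    and their 2D (normalized-camera) projections uv, via the SVD null vector."""
    rows = []
    for (x, y, z), (u, v) in zip(X, uv):
        Xh = [x, y, z, 1.0]
        rows.append([*Xh, 0, 0, 0, 0, *(-u * c for c in Xh)])
        rows.append([0, 0, 0, 0, *Xh, *(-v * c for c in Xh)])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 4)   # defined up to scale

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (8, 3))                     # e.g. motor keypoints
P_true = np.hstack([np.eye(3), [[0], [0], [5]]])   # camera 5 units away
proj = (P_true @ np.c_[X, np.ones(8)].T).T
uv = proj[:, :2] / proj[:, 2:3]

P = dlt_projection(X, uv)                          # recover the pose
rep = (P @ np.c_[X, np.ones(8)].T).T
uv_hat = rep[:, :2] / rep[:, 2:3]                  # reprojected keypoints
```

With exact correspondences the reprojection error is numerically zero; detected (noisy) keypoints are what makes the inference network's accuracy matter.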
12

Borgmann, B., M. Hebel, M. Arens, and U. Stilla. "INFORMATION ACQUISITION ON PEDESTRIAN MOVEMENTS IN URBAN TRAFFIC WITH A MOBILE MULTI-SENSOR SYSTEM." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2021 (June 28, 2021): 131–38. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2021-131-2021.

Abstract:
This paper presents an approach which combines LiDAR sensors and cameras of a mobile multi-sensor system to obtain information about pedestrians in the vicinity of the sensor platform. Such information can be used, for example, in the context of driver assistance systems. In the first step, our approach starts by using LiDAR sensor data to detect and track pedestrians, benefiting from LiDAR’s capability to directly provide accurate 3D data. After LiDAR-based detection, the approach leverages the typically higher data density provided by 2D cameras to determine the body pose of the detected pedestrians. The approach combines several state-of-the-art machine learning techniques: it uses a neural network and a subsequent voting process to detect pedestrians in LiDAR sensor data. Based on the known geometric constellation of the different sensors and the knowledge of the intrinsic parameters of the cameras, image sections are generated with the respective regions of interest showing only the detected pedestrians. These image sections are then processed with a method for image-based human pose estimation to determine keypoints for different body parts. These keypoints are finally projected from 2D image coordinates to 3D world coordinates using the assignment of the original LiDAR points to a particular pedestrian.
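The final projection of 2D keypoints to 3D relies on the standard pinhole model; with known intrinsics and a LiDAR-derived depth, the lifting step can be sketched as follows (the intrinsic values are illustrative assumptions, not from the paper):

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) with known metric depth to camera-frame 3D."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def project(p, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point back to pixels."""
    return np.array([fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy])

K = dict(fx=800.0, fy=800.0, cx=320.0, cy=240.0)  # assumed intrinsics
p3d = backproject(400.0, 300.0, depth=8.0, **K)   # keypoint seen at 8 m
uv = project(p3d, **K)                            # round-trip check
```

The depth here is exactly what the assigned LiDAR points supply per pedestrian; without it, a single camera ray leaves the 3D position ambiguous.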
13

Kanase, Rahul Ravikant, Akash Narayan Kumavat, Rohit Datta Sinalkar, and Sakshi Somani. "Pose Estimation and Correcting Exercise Posture." ITM Web of Conferences 40 (2021): 03031. http://dx.doi.org/10.1051/itmconf/20214003031.

Abstract:
Our posture has an impact on health, both mentally and physically. Various methods have been proposed to detect different postures of a human being. Posture analysis also plays an essential role in the field of medicine, such as finding out the sleeping posture of a patient. Image processing based and sensor based approaches are the leading posture analysis approaches. The sensor based approach is used by numerous models to focus on posture detection, in which the person needs to wear particular devices or sensors; this is helpful in cases such as fall detection. The image processing based approach helps to analyze postures such as standing and sitting. Fitness exercises are exceptionally beneficial to individual health, but they can also be ineffectual and quite possibly harmful if performed incorrectly. When someone does not use the proper posture, exercise mistakes occur. This proposed application utilizes pose estimation to detect the user’s exercise posture and provides detailed, customized recommendations on how the user can improve it. A pose estimator called OpenPose is used in this application. OpenPose is a pre-trained model composed of a multi-stage CNN to detect a user’s posture. The application then evaluates the vector geometry of the pose through an exercise to provide helpful feedback. Pose estimation is a method in which the spatial locations of key body joints are calculated using an image or video of the person. This computer vision technique detects human posture in images or videos and shows keypoints such as the elbow or knee in the output image.
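The "vector geometry of the pose" evaluation reduces to joint angles; the angle at a joint given three detected keypoints can be computed as follows (hypothetical coordinates, not OpenPose output):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b formed by a-b-c,
    e.g. shoulder-elbow-wrist for a bicep-curl form check."""
    u = np.asarray(a, float) - np.asarray(b, float)
    v = np.asarray(c, float) - np.asarray(b, float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# shoulder, elbow, wrist in image coordinates (made-up example)
elbow_angle = joint_angle([0, 0], [1, 0], [1, 1])
```

Feedback rules then compare such angles against target ranges for each exercise phase.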
14

Kashevnik, Alexey, Walaa Othman, Igor Ryabchikov, and Nikolay Shilov. "Estimation of Motion and Respiratory Characteristics during the Meditation Practice Based on Video Analysis." Sensors 21, no. 11 (May 29, 2021): 3771. http://dx.doi.org/10.3390/s21113771.

Abstract:
Meditation practice is mental health training. It helps people to reduce stress and suppress negative thoughts. In this paper, we propose a camera-based meditation evaluation system that helps meditators to improve their performance. We rely on two main criteria to measure focus: the breathing characteristics (respiratory rate, breathing rhythmicity and stability), and the body movement. We introduce a contactless sensor to measure the respiratory rate based on a smartphone camera by detecting the chest keypoint at each frame, using an optical flow based algorithm to calculate the displacement between frames, filtering and de-noising the chest movement signal, and calculating the number of real peaks in this signal. We also present an approach to detecting the movement of different body parts (head, thorax, shoulders, elbows, wrists, stomach and knees). We have collected a non-annotated dataset of meditation practice videos consisting of ninety videos, and an annotated dataset consisting of eight videos. The non-annotated dataset was categorized into beginner and professional meditators and was used for the development of the algorithm and for tuning the parameters. The annotated dataset was used for evaluation and showed that human activity during meditation practice could be correctly estimated by the presented approach, and that the mean absolute error for the respiratory rate is around 1.75 BPM, which can be considered tolerable for the meditation application.
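The peak-counting step for the respiratory rate can be sketched on a synthetic, noise-free chest-displacement trace (the actual pipeline first applies optical flow, filtering and de-noising):

```python
import numpy as np

def breaths_per_minute(signal, fps):
    """Count local maxima above the signal mean and convert to BPM."""
    m = signal.mean()
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > m
             and signal[i] > signal[i - 1]
             and signal[i] > signal[i + 1]]
    duration_min = len(signal) / fps / 60.0
    return len(peaks) / duration_min

fps = 10.0
t = np.arange(0, 60, 1 / fps)             # 60 s of video at 10 fps
chest_y = np.sin(2 * np.pi * 0.25 * t)    # 0.25 Hz breathing -> 15 BPM
bpm = breaths_per_minute(chest_y, fps)
```

On real, noisy displacement signals this naive local-maximum test over-counts, which is why the authors filter and de-noise before counting "real" peaks.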
15

Gu, Yanlei, Huiyang Zhang, and Shunsuke Kamijo. "Multi-Person Pose Estimation using an Orientation and Occlusion Aware Deep Learning Network." Sensors 20, no. 6 (March 12, 2020): 1593. http://dx.doi.org/10.3390/s20061593.

Abstract:
Image based human behavior and activity understanding has been a hot topic in the field of computer vision and multimedia. As an important part, skeleton estimation, also called pose estimation, has attracted a lot of interest. For pose estimation, most deep learning approaches mainly focus on the joint feature. However, the joint feature is not sufficient, especially when the image includes multiple people and the pose is occluded or not fully visible. This paper proposes a novel multi-task framework for multi-person pose estimation. The proposed framework is developed based on Mask Region-based Convolutional Neural Networks (R-CNN) and extended to integrate the joint feature, body boundary, body orientation and occlusion condition together. In order to further improve the performance of multi-person pose estimation, this paper proposes to organize the different information in serial multi-task models instead of the widely used parallel multi-task network. The proposed models are trained on the public dataset Common Objects in Context (COCO), which is further augmented by ground truths of body orientation and mutual-occlusion mask. Experiments demonstrate the performance of the proposed method for multi-person pose estimation and body orientation estimation. The proposed method achieves 84.6% Percentage of Correct Keypoints (PCK) and an 83.7% Correct Detection Rate (CDR). Comparisons further illustrate that the proposed model can reduce over-detection compared with other methods.
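The PCK figure quoted above is the fraction of predicted keypoints falling within a distance threshold of the ground truth; a minimal version (toy coordinates, and without the per-person normalization full benchmarks apply):

```python
import numpy as np

def pck(pred, gt, threshold):
    """Percentage of Correct Keypoints: share of predictions within
    `threshold` (same units as the coordinates) of the ground truth."""
    dists = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=-1)
    return float((dists <= threshold).mean())

gt   = np.array([[10, 10], [20, 20], [30, 30], [40, 40]], float)
pred = np.array([[11, 10], [20, 22], [30, 30], [90, 90]], float)
score = pck(pred, gt, threshold=2.0)   # 3 of 4 keypoints are "correct"
```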
16

Malleson, Charles, John Collomosse, and Adrian Hilton. "Real-Time Multi-person Motion Capture from Multi-view Video and IMUs." International Journal of Computer Vision 128, no. 6 (December 17, 2019): 1594–611. http://dx.doi.org/10.1007/s11263-019-01270-5.

Abstract:
A real-time motion capture system is presented which uses input from multiple standard video cameras and inertial measurement units (IMUs). The system is able to track multiple people simultaneously and requires no optical markers, specialized infra-red cameras or foreground/background segmentation, making it applicable to general indoor and outdoor scenarios with dynamic backgrounds and lighting. To overcome limitations of prior video or IMU-only approaches, we propose to use flexible combinations of multiple-view, calibrated video and IMU input along with a pose prior in an online optimization-based framework, which allows the full 6-DoF motion to be recovered including axial rotation of limbs and drift-free global position. A method for sorting and assigning raw input 2D keypoint detections into corresponding subjects is presented which facilitates multi-person tracking and rejection of any bystanders in the scene. The approach is evaluated on data from several indoor and outdoor capture environments with one or more subjects and the trade-off between input sparsity and tracking performance is discussed. State-of-the-art pose estimation performance is obtained on the Total Capture (multi-view video and IMU) and Human 3.6M (multi-view video) datasets. Finally, a live demonstrator for the approach is presented showing real-time capture, solving and character animation using a light-weight, commodity hardware setup.
17

Papakostas, Michalis, Akilesh Rajavenkatanarayanan, and Fillia Makedon. "CogBeacon: A Multi-Modal Dataset and Data-Collection Platform for Modeling Cognitive Fatigue." Technologies 7, no. 2 (June 13, 2019): 46. http://dx.doi.org/10.3390/technologies7020046.

Abstract:
In this work, we present CogBeacon, a multi-modal dataset designed to target the effects of cognitive fatigue in human performance. The dataset consists of 76 sessions collected from 19 male and female users performing different versions of a cognitive task inspired by the principles of the Wisconsin Card Sorting Test (WCST), a popular cognitive test in experimental and clinical psychology designed to assess cognitive flexibility, reasoning, and specific aspects of cognitive functioning. During each session, we record and fully annotate user EEG functionality, facial keypoints, real-time self-reports on cognitive fatigue, as well as detailed information of the performance metrics achieved during the cognitive task (success rate, response time, number of errors, etc.). Along with the dataset we provide free access to the CogBeacon data-collection software to provide a standardized mechanism to the community for collecting and annotating physiological and behavioral data for cognitive fatigue analysis. Our goal is to provide other researchers with the tools to expand or modify the functionalities of the CogBeacon data-collection framework in a hardware-independent way. As a proof of concept we show some preliminary machine learning-based experiments on cognitive fatigue detection using the EEG information and the subjective user reports as ground truth. Our experiments highlight the meaningfulness of the current dataset, and encourage our efforts towards expanding the CogBeacon platform. To our knowledge, this is the first multi-modal dataset specifically designed to assess cognitive fatigue and the only free software available to allow experiment reproducibility for multi-modal cognitive fatigue analysis.
18

Wang, Xupeng, Mohammed Bennamoun, Ferdous Sohel, and Hang Lei. "Diffusion Geometry Derived Keypoints and Local Descriptors for 3D Deformable Shape Analysis." Journal of Circuits, Systems and Computers, July 18, 2020, 2150016. http://dx.doi.org/10.1142/s021812662150016x.

Abstract:
Geometric analysis of three-dimensional (3D) surfaces with local deformations is a challenging task, required by mobile devices. In this paper, we propose a new local feature-based method derived from diffusion geometry, including a keypoint detector named persistence-based Heat Kernel Signature (pHKS), and a feature descriptor named Heat Propagation Strips (HeaPS). The pHKS detector first constructs a scalar field using the heat kernel signature function. The scalar field is generated at a small scale to capture fine geometric information of the local surface. Persistent homology is then computed to extract all the local maxima from the scalar field, and to provide a measure of persistence. Points with a high persistence are selected as pHKS keypoints. In order to describe a keypoint, an intrinsic support region is generated by the diffusion area. This support region is more robust than its geodesic distance counterpart, and provides a local surface with adaptive scale for subsequent feature description. The HeaPS descriptor is then developed by encoding the information contained in both the spatial and temporal domains of the heat kernel. We conducted several experiments to evaluate the effectiveness of the proposed method. On the TOSCA Dataset, the HeaPS descriptor achieved a high performance in terms of descriptiveness. The feature detector and descriptor were then tested on the SHREC 2010 Feature Detection and Description Dataset, and produced results that were better than the state-of-the-art methods. Finally, their application to shape retrieval was evaluated. The proposed pHKS detector and HeaPS descriptor achieved a notable improvement on the SHREC 2014 Human Dataset.
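The heat kernel signature underlying pHKS has a compact spectral form, HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2, where (lambda_i, phi_i) are eigenpairs of the Laplacian. A toy 3-node path graph can stand in for a mesh Laplacian:

```python
import numpy as np

# graph Laplacian of a 3-node path: node 1 is the centre,
# nodes 0 and 2 are symmetric endpoints
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
lam, phi = np.linalg.eigh(L)   # eigenvalues and eigenvectors

def hks(t):
    """Heat kernel signature per vertex: sum_i exp(-lam_i t) * phi_i(x)^2."""
    return (phi ** 2) @ np.exp(-lam * t)

sig = hks(1.0)
```

Symmetric vertices receive identical signatures while distinct local geometry is separated, which is exactly the property a keypoint detector and descriptor on deformable shapes needs.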
19

Lee, Sarada M. W., Andrew Shaw, Jodie L. Simpson, David Uminsky, and Luke W. Garratt. "Differential cell counts using center-point networks achieves human-level accuracy and efficiency over segmentation." Scientific Reports 11, no. 1 (August 19, 2021). http://dx.doi.org/10.1038/s41598-021-96067-3.

Abstract:
Obtaining differential cell counts is a challenging task when applying computer vision algorithms to pathology. Existing approaches to training cell recognition require high availability of multi-class segmentation and/or bounding box annotations and suffer in performance when objects are tightly clustered. We present the differential count network (“DCNet”), an annotation-efficient modality that utilises keypoint detection to locate, in brightfield images, the centre points of cells (not nuclei) and their cell class. The single centre-point annotation for DCNet lowered the burden on experts to generate ground truth data by 77.1% compared to bounding box labeling. Yet centre-point annotation still enabled high accuracy when training DCNet as a multi-class algorithm on whole-cell features, matching human experts in all 5 object classes in average precision and outperforming humans in consistency. The efficacy and efficiency of the DCNet end-to-end system represent significant progress toward an open-source, fully computational approach to differential cell count based diagnosis that can be adapted to any pathology need.
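Centre-point detectors of this kind are typically decoded by non-maximum suppression on a predicted heatmap; a toy decode on a synthetic heatmap (illustrative only, not the authors' network) looks like:

```python
import numpy as np

def decode_centres(heat, thresh=0.5):
    """Return (row, col) of local maxima above thresh: each one a centre."""
    p = np.pad(heat, 1, constant_values=-np.inf)
    h, w = heat.shape
    neigh = np.stack([p[i:i + h, j:j + w]
                      for i in range(3) for j in range(3)])
    # a peak is >= all values in its 3x3 neighbourhood and above threshold
    is_peak = (heat >= neigh.max(axis=0)) & (heat > thresh)
    return sorted(zip(*np.nonzero(is_peak)))

# synthetic heatmap with two Gaussian blobs ("cells") at (5, 5) and (14, 10)
yy, xx = np.mgrid[0:20, 0:20]
heat = (np.exp(-((yy - 5) ** 2 + (xx - 5) ** 2) / 4.0)
        + np.exp(-((yy - 14) ** 2 + (xx - 10) ** 2) / 4.0))
centres = decode_centres(heat)
```

Counting per class then reduces to decoding one such heatmap per cell type.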
20

Khalifa, Intissar, Ridha Ejbali, Raimondo Schettini, and Mourad Zaied. "Deep Multi-Stage Approach For Emotional Body Gesture Recognition In Job Interview." Computer Journal, April 19, 2021. http://dx.doi.org/10.1093/comjnl/bxab011.

Abstract:
Affective computing is a key research topic in artificial intelligence which is applied to psychology and machines. It consists of the estimation and measurement of human emotions. A person’s body language is one of the most significant sources of information during a job interview, and it reflects a deep psychological state that is often missing from other data sources. In our work, we combine the two tasks of pose estimation and emotion classification for emotional body gesture recognition, and propose a deep multi-stage architecture that is able to deal with both tasks. Our deep pose decoding method detects and tracks the candidate’s skeleton in a video using a combination of a depthwise convolutional network and a detection-based method for 2D pose reconstruction. Moreover, we propose a representation technique based on the superposition of skeletons to generate, for each video sequence, a single image synthesizing the different poses of the subject. We call this image the ‘history pose image’, and it is used as input to a convolutional neural network model based on the Visual Geometry Group architecture. We demonstrate the effectiveness of our method in comparison with other methods in the state of the art on the standard Common Objects in Context keypoint dataset and the Face and Body gesture video database.
21

Yang, Zhihui, Xiangyu Tang, Lijuan Zhang, and Zhiling Yang. "A combined local and global structure module for human pose estimation." Journal of Computational Methods in Sciences and Engineering, August 13, 2021, 1–11. http://dx.doi.org/10.3233/jcm-215210.

Abstract:
Human pose estimation can be used in action recognition, video surveillance and other fields, and has received a lot of attention. Since the flexibility of human joints and environmental factors greatly influence pose estimation accuracy, related research is confronted with many challenges. In this paper, we incorporate pyramid convolution and an attention mechanism into the residual block, and introduce a hybrid structure model which synthetically applies the local and global information of the image for keypoint detection. In addition, our improved structure model adopts grouped convolution, and the attention module used is lightweight, which reduces the computational cost of the network. Simulation experiments based on the MS COCO human body keypoint detection dataset show that, compared with the Simple Baseline model, our model is similar in parameters and GFLOPs (giga floating-point operations), but performs better on detection accuracy in multi-person scenes.
