Follow this link to see other types of publications on the topic: 2D Encoding representation.

Journal articles on the topic "2D Encoding representation"

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 47 journal articles for your research on the topic "2D Encoding representation."

Next to every source in the list of references, there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read the abstract of the work online, if the latter is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

He, Qingdong, Hao Zeng, Yi Zeng, and Yijun Liu. "SCIR-Net: Structured Color Image Representation Based 3D Object Detection Network from Point Clouds". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4486–94. http://dx.doi.org/10.1609/aaai.v36i4.20371.

Abstract:
3D object detection from point cloud data has become an indispensable part of autonomous driving. Previous works for processing point clouds rely on either projection or voxelization. However, projection-based methods suffer from information loss, while voxelization-based methods bring huge computational cost. In this paper, we propose to encode point clouds into a structured color image representation (SCIR) and utilize a 2D CNN to fulfill the 3D detection task. Specifically, we use the structured color image encoding module to convert the irregular 3D point clouds into a squared 2D tensor image, where each point corresponds to a spatial point in the 3D space. Furthermore, in order to fit the Euclidean structure, we apply feature normalization to parameterize the 2D tensor image onto a regular dense color image. Then, we conduct repeated multi-scale fusion at different levels so as to augment the initial features and learn scale-aware feature representations for box prediction. Extensive experiments on the KITTI benchmark, the Waymo Open Dataset and the more challenging nuScenes dataset show that our proposed method yields decent results and demonstrates the effectiveness of such representations for point clouds.
2

Wu, Banghe, Chengzhong Xu, and Hui Kong. "LiDAR Road-Atlas: An Efficient Map Representation for General 3D Urban Environment". Field Robotics 3, no. 1 (January 10, 2023): 435–59. http://dx.doi.org/10.55417/fr.2023014.

Abstract:
In this work, we propose the LiDAR Road-Atlas, a compact and efficient 3D map representation for autonomous robot or vehicle navigation in general urban environments. The LiDAR Road-Atlas can be generated by an online mapping framework which incrementally merges local 2D occupancy grid maps (2D-OGMs). Specifically, the contributions of our method are threefold. First, we solve the challenging problem of creating local 2D-OGMs in non-structured urban scenes based on a real-time delimitation of traversable and curb regions in a LiDAR point cloud. Second, we achieve accurate 3D mapping in multiple-layer urban road scenarios by a probabilistic fusion scheme. Third, we achieve a very efficient 3D map representation of a general environment thanks to the automatic local-OGM-induced traversable-region labeling and a sparse probabilistic local point-cloud encoding. Given the LiDAR Road-Atlas, one can achieve accurate vehicle localization, path planning, and other tasks. Our map representation is insensitive to dynamic objects, which can be filtered out of the resulting map based on the probabilistic fusion. Empirically, we compare our map representation with a couple of map representations popular in the robotics community, and ours is more favorable in terms of efficiency, scalability, and compactness. Additionally, we also evaluate localization performance given the LiDAR Road-Atlas representations on two public datasets. With a 16-channel LiDAR sensor, our method achieves an average global localization error of 0.26 m (translation) and 1.07 (rotation) on the Apollo dataset, and 0.89 m (translation) and 1.29 (rotation) on the MulRan dataset, respectively, at 10 Hz, which validates its promising performance. The code for this work is open-sourced at https://github.com/IMRL/Lidar-road-atlas.
3

Yuan, Hangjie, and Dong Ni. "Learning Visual Context for Group Activity Recognition". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 3261–69. http://dx.doi.org/10.1609/aaai.v35i4.16437.

Abstract:
Group activity recognition aims to recognize an overall activity in a multi-person scene. Previous methods strive to reason on individual features. However, they under-explore the person-specific contextual information, which is significant and informative in computer vision tasks. In this paper, we propose a new reasoning paradigm to incorporate global contextual information. Specifically, we propose two modules to bridge the gap between group activity and visual context. The first is Transformer based Context Encoding (TCE) module, which enhances individual representation by encoding global contextual information to individual features and refining the aggregated information. The second is Spatial-Temporal Bilinear Pooling (STBiP) module. It firstly further explores pairwise relationships for the context encoded individual representation, then generates semantic representations via gated message passing on a constructed spatial-temporal graph. On their basis, we further design a two-branch model that integrates the designed modules into a pipeline. Systematic experiments demonstrate each module's effectiveness on either branch. Visualizations indicate that visual contextual cues can be aggregated globally by TCE. Moreover, our method achieves state-of-the-art results on two widely used benchmarks using only RGB images as input and 2D backbones.
4

Yang, Xiaobao, Shuai He, Junsheng Wu, Yang Yang, Zhiqiang Hou, and Sugang Ma. "Exploring Spatial-Based Position Encoding for Image Captioning". Mathematics 11, no. 21 (November 4, 2023): 4550. http://dx.doi.org/10.3390/math11214550.

Abstract:
Image captioning has become a hot topic in artificial intelligence research and sits at the intersection of computer vision and natural language processing. Most recent image captioning models have adopted an “encoder + decoder” architecture, in which the encoder is generally employed to extract the visual features, while the decoder generates the descriptive sentence word by word. However, the visual features need to be flattened into sequence form before being forwarded to the decoder, and this results in the loss of the 2D spatial position information of the image. This limitation is particularly pronounced in the Transformer architecture since it is inherently not position-aware. Therefore, in this paper, we propose a simple coordinate-based spatial position encoding method (CSPE) to remedy this deficiency. CSPE firstly creates the 2D position coordinates for each feature pixel, and then encodes them by row and by column separately via trainable or hard encoding, effectively strengthening the position representation of visual features and enriching the generated description sentences. In addition, in order to reduce the time cost, we also explore a diagonal-based spatial position encoding (DSPE) approach. Compared with CSPE, DSPE is slightly inferior in performance but has a faster calculation speed. Extensive experiments on the MS COCO 2014 dataset demonstrate that CSPE and DSPE can significantly enhance the spatial position representation of visual features. CSPE, in particular, improves the BLEU-4 and CIDEr metrics by 1.6% and 5.7%, respectively, compared with a baseline model without sequence-based position encoding, and also outperforms current sequence-based position encoding approaches by a significant margin. In addition, the robustness and plug-and-play ability of the proposed method are validated on a medical caption generation model.
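As a concrete illustration of the row/column idea in this abstract, here is a hypothetical sketch of the "hard" (non-trainable, sinusoidal) variant: each feature pixel's row and column indices are encoded separately and the two encodings are concatenated. The function names and the concatenation choice are assumptions of this sketch, not the authors' code.

```python
import math

def sinusoidal_1d(pos, dim):
    """Standard sinusoidal encoding of a single integer position into `dim` values."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc[:dim]

def coordinate_position_encoding(height, width, dim):
    """Encode row and column indices separately and concatenate them,
    giving each of the H*W feature pixels a 2*dim position vector."""
    table = []
    for r in range(height):
        row_enc = sinusoidal_1d(r, dim)
        row = []
        for c in range(width):
            col_enc = sinusoidal_1d(c, dim)
            row.append(row_enc + col_enc)  # [row part | column part]
        table.append(row)
    return table

# A 7x7 feature map with a 64-dim encoding per axis -> 128-dim per pixel.
pe = coordinate_position_encoding(7, 7, 64)
```

In a real model this table would be added to (or concatenated with) the flattened visual features before the Transformer decoder, restoring the 2D position signal the flattening discards.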
5

Rebollo-Neira, Laura, and Aurelien Inacio. "Enhancing sparse representation of color images by cross channel transformation". PLOS ONE 18, no. 1 (January 26, 2023): e0279917. http://dx.doi.org/10.1371/journal.pone.0279917.

Abstract:
Transformations for enhancing sparsity in the approximation of color images by 2D atomic decomposition are discussed. The sparsity is firstly considered with respect to the most significant coefficients in the wavelet decomposition of the color image. The discrete cosine transform is singled out as an effective 3 point transformation for this purpose. The enhanced feature is further exploited by approximating the transformed arrays using an effective greedy strategy with a separable highly redundant dictionary. The relevance of the achieved sparsity is illustrated by a simple encoding procedure. On typical test images the compression at high quality recovery is shown to significantly improve upon JPEG and WebP formats.
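The "3 point transformation" across color channels can be sketched as a length-3 DCT applied pixel-wise to the (R, G, B) values: because the channels of natural images are highly correlated, most of the energy lands in the first coefficient, which is what makes the representation sparser. A minimal illustration (the orthonormal DCT-II normalization is my assumption, not necessarily the paper's exact convention):

```python
import math

def dct3(v):
    """Orthonormal DCT-II of a 3-point vector, e.g. one pixel's (R, G, B) values."""
    n = 3
    out = []
    for k in range(n):
        s = sum(v[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n)) for j in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

# A grey-ish pixel (strongly correlated channels): the energy concentrates
# in the first coefficient, leaving the other two near zero.
coeffs = dct3([120, 122, 119])
```

After this cross-channel step, the per-channel arrays are far more compressible, which is the property the greedy atomic decomposition then exploits.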
6

Tripura Sundari, Yeluripati Bala, and K. Usha Mahalakshmi. "Enhancing Brain Tumor Diagnosis: A 3D Auto-Encoding Approach for Accurate Classification". International Journal of Scientific Methods in Engineering and Management 01, no. 09 (2023): 38–46. http://dx.doi.org/10.58599/ijsmem.2023.1905.

Abstract:
The brain’s capacity to control and coordinate the body’s other organs makes it an integral part of the nervous system. Brain tumours, which form when abnormal cells in the brain grow uncontrollably, may be deadly if not diagnosed and treated promptly. Image processing technology is essential in the quest to identify malignancies in medical imaging, as it makes the final depiction more complete. Because of the complex spatial structure of 3D shapes, only a tiny percentage of 3D shape instances are viable for feature learning. These issues have inspired potential solutions such as automatic encoders for learning properties from 2D images and the translation of 3D shapes into 2D space. With the help of camera images and state-space structures, the suggested 3D-based Spatial Auto Encoder method can automatically learn a representation of the state. Autoencoders can be taught to use the prototypes they generate to rebuild a picture, and the resulting learned coefficients can be put to use in 3D shape matching and retrieval. The auto-encoder’s impressive results in image retrieval have been attributed, at least in part, to the ease with which it can learn new features from existing ones.
7

Rybińska-Fryca, Anna, Anita Sosnowska, and Tomasz Puzyn. "Representation of the Structure—A Key Point of Building QSAR/QSPR Models for Ionic Liquids". Materials 13, no. 11 (May 30, 2020): 2500. http://dx.doi.org/10.3390/ma13112500.

Abstract:
The process of encoding the structure of chemicals by molecular descriptors is a crucial step in quantitative structure-activity/property relationships (QSAR/QSPR) modeling. Since ionic liquids (ILs) are disconnected structures, various ways of representing their structure are used in the QSAR studies: the models can be based on descriptors either derived for particular ions or for the whole ionic pair. We have examined the influence of the type of IL representation (separate ions vs. ionic pairs) on the model’s quality, the process of the automated descriptors selection and reliability of the applicability domain (AD) assessment. The result of the benchmark study showed that a less precise description of ionic liquid, based on the 2D descriptors calculated for ionic pairs, is sufficient to develop a reliable QSAR/QSPR model with the highest accuracy in terms of calibration as well as validation. Moreover, the process of a descriptors’ selection is more effective when the possible number of variables can be decreased at the beginning of model development. Additionally, 2D descriptors usually demand less effort in mechanistic interpretation and are more convenient for virtual screening studies.
8

Cohen, Lear, Ehud Vinepinsky, Opher Donchin, and Ronen Segev. "Boundary vector cells in the goldfish central telencephalon encode spatial information". PLOS Biology 21, no. 4 (April 25, 2023): e3001747. http://dx.doi.org/10.1371/journal.pbio.3001747.

Abstract:
Navigation is one of the most fundamental cognitive skills for the survival of fish, the largest vertebrate class, and almost all other animal classes. Space encoding in single neurons is a critical component of the neural basis of navigation. To study this fundamental cognitive component in fish, we recorded the activity of neurons in the central area of the goldfish telencephalon while the fish were freely navigating in a quasi-2D water tank embedded in a 3D environment. We found spatially modulated neurons with firing patterns that gradually decreased with the distance of the fish from a boundary in each cell’s preferred direction, resembling the boundary vector cells found in the mammalian subiculum. Many of these cells exhibited beta rhythm oscillations. This type of spatial representation in fish brains is unique among space-encoding cells in vertebrates and provides insights into spatial cognition in this lineage.
9

Ciprian, David, and Vasile Gui. "2D Sensor Based Design of a Dynamic Hand Gesture Interpretation System". Advanced Engineering Forum 8-9 (June 2013): 553–62. http://dx.doi.org/10.4028/www.scientific.net/aef.8-9.553.

Abstract:
A complete 2D sensor based system for dynamic gesture interpretation is presented in this paper. A hand model is devised for this purpose, composed of the palm area and the fingertips. Multiple cues are integrated in a feature space, and segmentation is carried out in this space to output the hand model. The robust technique of mean shift mode estimation is used to estimate the parameters of the hand model, making it adaptive and robust. The model is validated in various experiments covering difficult situations such as occlusion, varying illumination, and camouflage. Real-time requirements are also met. The gesture interpretation approach addresses dynamic hand gestures: a sequence of fingertip locations is extracted from the hand model, and the tensor voting approach is used to smooth and reconstruct the trajectory. The final output is an encoding sequence of local trajectory directions, obtained by mean shift mode detection on the trajectory's representation in Radon space. This module was tested and proved highly accurate.
10

Huang, Yuhao, Sanping Zhou, Junjie Zhang, Jinpeng Dong, and Nanning Zheng. "Voxel or Pillar: Exploring Efficient Point Cloud Representation for 3D Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 3 (March 24, 2024): 2426–35. http://dx.doi.org/10.1609/aaai.v38i3.28018.

Abstract:
Efficient representation of point clouds is fundamental for LiDAR-based 3D object detection. While recent grid-based detectors often encode point clouds into either voxels or pillars, the distinctions between these approaches remain underexplored. In this paper, we quantify the differences between the current encoding paradigms and highlight the limited vertical learning within. To tackle these limitations, we propose a hybrid detection framework named Voxel-Pillar Fusion (VPF), which synergistically combines the unique strengths of both voxels and pillars. To be concrete, we first develop a sparse voxel-pillar encoder that encodes point clouds into voxel and pillar features through 3D and 2D sparse convolutions respectively, and then introduce the Sparse Fusion Layer (SFL), facilitating bidirectional interaction between sparse voxel and pillar features. Our computationally efficient, fully sparse method can be seamlessly integrated into both dense and sparse detectors. Leveraging this powerful yet straightforward representation, VPF delivers competitive performance, achieving real-time inference speeds on the nuScenes and Waymo Open Dataset.
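The two encoding paradigms this abstract contrasts can be sketched as two grouping rules over the same point list: voxels quantize all three coordinates, whereas pillars quantize only x and y and keep each cell's full vertical extent. A toy illustration (the cell size and function names are mine, not the paper's):

```python
from collections import defaultdict

def voxelize(points, cell=0.5):
    """Voxel encoding: quantize x, y and z, so vertical structure
    is split across several cells."""
    grid = defaultdict(list)
    for x, y, z in points:
        grid[(int(x // cell), int(y // cell), int(z // cell))].append((x, y, z))
    return dict(grid)

def pillarize(points, cell=0.5):
    """Pillar encoding: quantize only x and y; each pillar spans
    the whole z column, so vertical detail must be learned, not indexed."""
    grid = defaultdict(list)
    for x, y, z in points:
        grid[(int(x // cell), int(y // cell))].append((x, y, z))
    return dict(grid)

# Two points in the same column at different heights:
# they fall into two voxels but a single pillar.
pts = [(0.1, 0.1, 0.2), (0.1, 0.1, 1.7)]
```

This is exactly the trade-off VPF targets: voxels preserve vertical resolution at the cost of 3D sparse convolutions, pillars collapse it in exchange for cheap 2D convolutions.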
11

Zafar, Bushra, Rehan Ashraf, Nouman Ali, Muhammad Iqbal, Muhammad Sajid, Saadat Dar, and Naeem Ratyal. "A Novel Discriminating and Relative Global Spatial Image Representation with Applications in CBIR". Applied Sciences 8, no. 11 (November 14, 2018): 2242. http://dx.doi.org/10.3390/app8112242.

Abstract:
The requirement for effective image search, which motivates the use of Content-Based Image Retrieval (CBIR) and the search for similar multimedia contents on the basis of a user query, remains an open research problem for computer vision applications. The application domains for Bag of Visual Words (BoVW) based image representations are object recognition, image classification and content-based image analysis. Interest points are quantized in the feature space, and the final histogram or image signature does not retain any detail about the co-occurrences of features in the 2D image space. This spatial information is crucial, as its loss adversely affects the performance of an image classification-based model. The most notable contribution in this context is Spatial Pyramid Matching (SPM), which captures the absolute spatial distribution of visual words. However, SPM is sensitive to image transformations such as rotation, flipping and translation. When images are not well-aligned, SPM may lose its discriminative power. This paper introduces a novel approach to encoding the relative spatial information for the histogram-based representation of the BoVW model. This is established by computing the global geometric relationship between pairs of identical visual words with respect to the centroid of an image. The proposed research is evaluated on five different datasets. Comprehensive experiments demonstrate the robustness of the proposed image representation as compared to the state-of-the-art methods in terms of precision and recall values.
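The centroid-relative pairwise idea can be sketched roughly as follows. This toy function (its name, the angle-histogram formulation, and the bin count are my assumptions, not the paper's exact model) histograms the angle subtended at the image centroid by every pair of occurrences of one visual word; being built from relative geometry, the histogram is unchanged when the whole image is translated.

```python
import math

def relative_pair_histogram(points, centroid, bins=8):
    """For every pair of occurrences of the same visual word, bin the angle
    subtended at the image centroid -- a translation-robust summary of the
    pair's relative geometry."""
    cx, cy = centroid
    hist = [0] * bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            a1 = math.atan2(points[i][1] - cy, points[i][0] - cx)
            a2 = math.atan2(points[j][1] - cy, points[j][0] - cx)
            diff = abs(a1 - a2) % (2 * math.pi)
            hist[min(int(diff / (2 * math.pi) * bins), bins - 1)] += 1
    return hist

# Two occurrences of one word, 90 degrees apart as seen from the centroid.
h = relative_pair_histogram([(1, 0), (0, 1)], centroid=(0, 0))
```

One such histogram per visual word, concatenated, would then augment the plain BoVW occurrence histogram.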
12

Cao, Hezhi, Xia Xi, Guan Wu, Ruizhen Hu, and Ligang Liu. "ScanBot: Autonomous Reconstruction via Deep Reinforcement Learning". ACM Transactions on Graphics 42, no. 4 (July 26, 2023): 1–16. http://dx.doi.org/10.1145/3592113.

Abstract:
Autoscanning of an unknown environment is the key to many AR/VR and robotic applications. However, autonomous reconstruction with both high efficiency and quality remains a challenging problem. In this work, we propose a reconstruction-oriented autoscanning approach, called ScanBot, which utilizes hierarchical deep reinforcement learning techniques for global region-of-interest (ROI) planning to improve the scanning efficiency and local next-best-view (NBV) planning to enhance the reconstruction quality. Given the partially reconstructed scene, the global policy designates an ROI with insufficient exploration or reconstruction. The local policy is then applied to refine the reconstruction quality of objects in this region by planning and scanning a series of NBVs. A novel mixed 2D-3D representation is designed for these policies, where a 2D quality map with tailored quality channels encoding the scanning progress is consumed by the global policy, and a coarse-to-fine 3D volumetric representation that embodies both local environment and object completeness is fed to the local policy. These two policies iterate until the whole scene has been completely explored and scanned. To speed up the learning of complex environmental dynamics and enhance the agent's memory for spatial-temporal inference, we further introduce two novel auxiliary learning tasks to guide the training of our global policy. Thorough evaluations and comparisons are carried out to show the feasibility of our proposed approach and its advantages over previous methods. Code and data are available at https://github.com/HezhiCao/Scanbot.
13

Basak, Krishna, Nilamadhab Mishra, and Hsien-Tsung Chang. "TranStutter: A Convolution-Free Transformer-Based Deep Learning Method to Classify Stuttered Speech Using 2D Mel-Spectrogram Visualization and Attention-Based Feature Representation". Sensors 23, no. 19 (September 22, 2023): 8033. http://dx.doi.org/10.3390/s23198033.

Abstract:
Stuttering, a prevalent neurodevelopmental disorder, profoundly affects fluent speech, causing involuntary interruptions and recurrent sound patterns. This study addresses the critical need for the accurate classification of stuttering types. The researchers introduce “TranStutter”, a pioneering Convolution-free Transformer-based DL model, designed to excel in speech disfluency classification. Unlike conventional methods, TranStutter leverages Multi-Head Self-Attention and Positional Encoding to capture intricate temporal patterns, yielding superior accuracy. In this study, the researchers employed two benchmark datasets: the Stuttering Events in Podcasts Dataset (SEP-28k) and the FluencyBank Interview Subset. SEP-28k comprises 28,177 audio clips from podcasts, meticulously annotated into distinct dysfluent and non-dysfluent labels, including Block (BL), Prolongation (PR), Sound Repetition (SR), Word Repetition (WR), and Interjection (IJ). The FluencyBank subset encompasses 4144 audio clips from 32 People Who Stutter (PWS), providing a diverse set of speech samples. TranStutter’s performance was assessed rigorously. On SEP-28k, the model achieved an impressive accuracy of 88.1%. Furthermore, on the FluencyBank dataset, TranStutter demonstrated its efficacy with an accuracy of 80.6%. These results highlight TranStutter’s significant potential in revolutionizing the diagnosis and treatment of stuttering, thereby contributing to the evolving landscape of speech pathology and neurodevelopmental research. The innovative integration of Multi-Head Self-Attention and Positional Encoding distinguishes TranStutter, enabling it to discern nuanced disfluencies with unparalleled precision. This novel approach represents a substantial leap forward in the field of speech pathology, promising more accurate diagnostics and targeted interventions for individuals with stuttering disorders.
14

Mieites, Verónica, José A. Gutiérrez-Gutiérrez, José M. López-Higuera, and Olga M. Conde. "Single-Image Multi-Parametric Representation of Optical Properties through Encodings to the HSV Color Space". Applied Sciences 14, no. 1 (December 23, 2023): 155. http://dx.doi.org/10.3390/app14010155.

Abstract:
The visualization of 2D clinical data often relies on color-coded images, but different colormaps can introduce cognitive biases, impacting result interpretation. Moreover, when using color for diagnosis with multiple biomarkers, the application of distinct colormaps for each parameter can hinder comparisons. Our aim was to introduce a visualization technique that utilizes the hue (H), saturation (S), and value (V) in a single image to convey multi-parametric data on various optical properties in an effective manner. To achieve this, we conducted a study involving two datasets, one comprising multi-modality measurements of the human aorta and the other featuring multiple parameters of dystrophic mice muscles. Through this analysis, we determined that H is best suited to emphasize differences related to pathology, while V highlights high-spatial-resolution data disparities, and color alterations effectively indicate changes in chemical component concentrations. Furthermore, encoding structural information as S and V within the same image assists in pinpointing the specific locations of these variations. In cases where all data are of a high resolution, H remains the optimal indicator of pathology, ensuring results’ interpretability. This approach simplifies the selection of an appropriate colormap and enhances the ability to grasp a sample’s characteristics at a single glance.
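The mapping the authors describe, pathology-related contrast to hue and structural maps to saturation and value, can be sketched with the standard library's colorsys conversion. The function name and the assumption that all three co-registered maps are normalized to [0, 1] are mine, not the paper's:

```python
import colorsys

def encode_hsv(pathology, structure_s, structure_v):
    """Fuse three co-registered parameter maps into one RGB image:
    the pathology-related map drives hue, the two structural maps drive
    saturation and value. All inputs assumed normalized to [0, 1]."""
    rgb_image = []
    for h_row, s_row, v_row in zip(pathology, structure_s, structure_v):
        rgb_image.append([colorsys.hsv_to_rgb(h, s, v)
                          for h, s, v in zip(h_row, s_row, v_row)])
    return rgb_image

# A 1x2 toy image: hue 0.0 renders red, hue ~0.66 renders blue,
# so pathology differences show up as a color change in a single image.
rgb = encode_hsv([[0.0, 0.66]], [[1.0, 1.0]], [[1.0, 1.0]])
```

Because all three parameters live in one image, no per-parameter colormap choice is needed, which is the cognitive-bias problem the paper is addressing.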
15

Kumar and Benbasat. "The Effect of Relationship Encoding, Task Type, and Complexity on Information Representation: An Empirical Evaluation of 2D and 3D Line Graphs". MIS Quarterly 28, no. 2 (2004): 255. http://dx.doi.org/10.2307/25148635.

16

Xidias, E. K., P. Th Zacharia, and N. A. Aspragathos. "Time-optimal task scheduling for articulated manipulators in environments cluttered with obstacles". Robotica 28, no. 3 (August 24, 2009): 427–40. http://dx.doi.org/10.1017/s0263574709005748.

Abstract:
This paper proposes a new approach for solving a generalization of the task scheduling problem for articulated robots (either redundant or non-redundant), where the robot's 2D environment is cluttered with obstacles of arbitrary size, shape and location, while a set of task-points are located in the robot's free space. The objective is to determine the optimal collision-free tour of the robot's tip through all task-points, passing through each one exactly once and returning to the initial task-point. This scheduling problem combines two computationally NP-hard problems: the optimal scheduling of robot tasks and the collision-free motion planning between the task-points. The proposed approach employs the bump-surface (B-Surface) concept for the representation of the robot's 2D environment by a B-Spline surface embedded in 3D Euclidean space. The time-optimal task schedule is searched for on the generated B-Surface using a genetic algorithm (GA) with a special encoding in order to take into consideration the infinite configurations corresponding to each task-point. The result of the GA's search constitutes the solution to the task scheduling problem and optimally satisfies the task scheduling criteria and objectives. Extensive experimental results show the efficiency and effectiveness of the proposed method in determining collision-free motion among obstacles.
17

Hu, Yubin, Sheng Ye, Wang Zhao, Matthieu Lin, Yuze He, Yu-Hui Wen, Ying He, and Yong-Jin Liu. "O^2-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 3 (March 24, 2024): 2285–93. http://dx.doi.org/10.1609/aaai.v38i3.28002.

Abstract:
Occlusion is a common issue in 3D reconstruction from RGB-D videos, often blocking the complete reconstruction of objects and presenting an ongoing problem. In this paper, we propose a novel framework, empowered by a 2D diffusion-based in-painting model, to reconstruct complete surfaces for the hidden parts of objects. Specifically, we utilize a pre-trained diffusion model to fill in the hidden areas of 2D images. Then we use these in-painted images to optimize a neural implicit surface representation for each instance for 3D reconstruction. Since creating the in-painting masks needed for this process is tricky, we adopt a human-in-the-loop strategy that involves very little human engagement to generate high-quality masks. Moreover, some parts of objects can be totally hidden because the videos are usually shot from limited perspectives. To ensure recovering these invisible areas, we develop a cascaded network architecture for predicting signed distance field, making use of different frequency bands of positional encoding and maintaining overall smoothness. Besides the commonly used rendering loss, Eikonal loss, and silhouette loss, we adopt a CLIP-based semantic consistency loss to guide the surface from unseen camera angles. Experiments on ScanNet scenes show that our proposed framework achieves state-of-the-art accuracy and completeness in object-level reconstruction from scene-level RGB-D videos. Code: https://github.com/THU-LYJ-Lab/O2-Recon.
18

A. Aljazaery, Ibtisam, Haider Th Salim Alrikabi, and Abdul Hadi M. Alaidi. "Encryption of Color Image Based on DNA Strand and Exponential Factor". International Journal of Online and Biomedical Engineering (iJOE) 18, no. 03 (March 8, 2022): 101–13. http://dx.doi.org/10.3991/ijoe.v18i03.28021.

Abstract:
In this study, a new method is presented for encoding 2D and 3D color images. The DNA strand construction is used as the basis for structuring the method, which consists of two main stages, encryption and decryption, each comprising several operations to reach the desired goal. In the encoding stage, a special table was prepared to show the mechanism of work. It starts by encoding each DNA base as a two-bit binary code; two zeros are then added so that the string finally consists of four binary bits, matching the size of the binary representation of a hexadecimal digit. An XOR operation is then performed between the two values, so that the result is completely different from the original code. The binary values obtained are converted to decimal values that are placed in an array of the same size as the image to be encoded. Finally, this array is processed with the exponential function factor, so the final result is a fully encoded image. In the decoding stage, another algorithm was built that reverses the work of the encryption stage, and its result is an exact copy of the original image. It is worth noting that standard images of different sizes were used as test images. The performance of the method was evaluated based on several factors: MSE, PSNR, and the time required to perform the encoding and decoding process. The method achieved good results when compared with other methods in terms of quality and time.
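The base-to-bits and XOR steps described in this abstract can be sketched in a few lines. This is a hypothetical reconstruction of that single step only (the base-to-bits table values, function names, and key handling are my assumptions; the exponential-factor stage is omitted):

```python
# Hypothetical coding table: each DNA base as a 2-bit code, zero-padded to 4 bits.
DNA_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def encode_pixel(base, key_nibble):
    """XOR the zero-padded 4-bit base code with a 4-bit hexadecimal key value."""
    padded = DNA_BITS[base]          # upper two bits are already zero
    return padded ^ (key_nibble & 0xF)

def decode_pixel(value, key_nibble):
    """XOR is its own inverse, so decoding reuses the same key nibble."""
    bits = value ^ (key_nibble & 0xF)
    inverse = {v: k for k, v in DNA_BITS.items()}
    return inverse[bits & 0b11]

# 'G' = 0b0010; XOR with key nibble 0xB (0b1011) gives 0b1001 = 9,
# which no longer resembles the original code; decoding recovers 'G'.
cipher = encode_pixel("G", 0xB)
assert decode_pixel(cipher, 0xB) == "G"
```

The decimal results of this per-pixel XOR would fill an array the size of the image, which the paper then perturbs further with the exponential factor.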
19

Cao, Chen, Baocheng Yu, Wenxia Xu, Guojun Chen, and Yuming Ai. "End-to-End Implicit Object Pose Estimation". Sensors 24, no. 17 (September 3, 2024): 5721. http://dx.doi.org/10.3390/s24175721.

Abstract:
To accurately estimate the 6D pose of objects, most methods employ a two-stage algorithm. While such two-stage algorithms achieve high accuracy, they are often slow. Additionally, many approaches utilize encoding–decoding to obtain the 6D pose, with many employing bilinear sampling for decoding. However, bilinear sampling tends to sacrifice the accuracy of precise features. In our research, we propose a novel solution that utilizes implicit representation as a bridge between discrete feature maps and continuous feature maps. We represent the feature map as a coordinate field, where each coordinate pair corresponds to a feature value. These feature values are then used to estimate feature maps of arbitrary scales, replacing upsampling for decoding. We apply the proposed implicit module to a bidirectional fusion feature pyramid network. Based on this implicit module, we propose three network branches: a class estimation branch, a bounding box estimation branch, and the final pose estimation branch. For this pose estimation branch, we propose a miniature dual-stream network, which estimates object surface features and complements the relationship between 2D and 3D. We represent the rotation component using the SVD (Singular Value Decomposition) representation method, resulting in a more accurate object pose. We achieved satisfactory experimental results on the widely used 6D pose estimation benchmark dataset Linemod. This innovative approach provides a more convenient solution for 6D object pose estimation.
ABNT, Harvard, Vancouver, APA, and other styles
20

Miao, Jun, Maoxuan Zhang, Yiru Chang and Yuanhua Qiao. "Transformer-Based Recognition Model for Ground-Glass Nodules from the View of Global 3D Asymmetry Feature Representation". Symmetry 15, no. 12 (December 12, 2023): 2192. http://dx.doi.org/10.3390/sym15122192.

Full text source
Abstract:
Ground-glass nodules (GGN) are the main manifestation of early lung cancer, and their accurate and efficient identification is of great significance for the treatment of lung diseases. In response to the problems that traditional machine learning requires manual feature extraction and that most deep learning models are applied only to 2D image classification, this paper proposes a Transformer-based recognition model for ground-glass nodules from the view of global 3D asymmetry feature representation. Firstly, a 3D convolutional neural network is used as the backbone to automatically extract features from three-dimensional CT-image blocks of pulmonary nodules; secondly, positional encoding information is added to the extracted feature map, which is input into the Transformer encoder layer for further extraction of global 3D asymmetry features, preserving more spatial information and yielding a higher-order asymmetry feature representation; finally, the extracted asymmetry features are fed into a support vector machine or ELM-KNN model to further improve the recognition ability of the model. The experimental results show that the recognition accuracy of the proposed method reaches 95.89%, which is 4.79, 2.05, 4.11, and 2.74 percentage points higher than that of the common deep learning models AlexNet, DenseNet121, GoogLeNet, and VGG19, respectively; compared with the latest models proposed in the field of pulmonary nodule classification, the accuracy is improved by 2.05, 2.05, and 0.68 percentage points, respectively, effectively improving the recognition accuracy of ground-glass nodules.
ABNT, Harvard, Vancouver, APA, and other styles
21

Tasnim, Nusrat, and Joong-Hwan Baek. "Deep Learning-Based Human Action Recognition with Key-Frames Sampling Using Ranking Methods". Applied Sciences 12, no. 9 (April 20, 2022): 4165. http://dx.doi.org/10.3390/app12094165.

Full text source
Abstract:
Nowadays, the demand for human–machine or object interaction is growing tremendously owing to its diverse applications. The massive advancement in modern technology has greatly influenced researchers to adopt deep learning models in the fields of computer vision and image processing, particularly human action recognition. Many methods have been developed to recognize human activity, but they are limited in effectiveness, efficiency, and the data modalities they use. Very few methods have used depth sequences, introducing different encoding techniques to represent an action sequence in a spatial format called a dynamic image and then applying a 2D convolutional neural network (CNN) or traditional machine learning algorithms for action recognition. Such methods are completely dependent on the effectiveness of the spatial representation. In this article, we propose a novel ranking-based approach to select key frames and adopt a 3D-CNN model for action classification, using the raw sequence directly instead of generating a dynamic image. We investigate the recognition results at various levels of sampling to show the competency and robustness of the proposed system, and we examine the universality of the proposed method on three benchmark human action datasets: DHA (depth-included human action), MSR-Action3D (Microsoft Action 3D), and UTD-MHAD (University of Texas at Dallas Multimodal Human Action Dataset). The proposed method secures better performance than state-of-the-art techniques using depth sequences.
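A rough sketch of ranking-based key-frame selection; the frame-difference score used here is an assumption for illustration, since the abstract does not specify the paper's ranking functions:

```python
# Illustrative key-frame sampling: rank frames by how much they change
# relative to the previous frame, keep the top-k in temporal order.
# (The scoring function is assumed, not the paper's ranking method.)
def select_key_frames(frames, k):
    scores = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        scores.append(sum(abs(a - b) for a, b in zip(prev, cur)))
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore temporal order
    return [frames[i] for i in keep]
```

The selected frames would then be stacked and fed to the 3D-CNN instead of a dynamic image.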
ABNT, Harvard, Vancouver, APA, and other styles
22

Mohd Hanafi, F., and M. I. Hassan. "THE INTEGRATION OF 3D SPATIAL AND NON – SPATIAL COMPONENT FOR STRATA MANAGEMENT". ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4/W16 (October 1, 2019): 417–24. http://dx.doi.org/10.5194/isprs-archives-xlii-4-w16-417-2019.

Full text source
Abstract:
Abstract. Nowadays, rapid development and large populations, especially in urban areas, have caused the indoor spaces of buildings to become bigger and more complex. In most developing countries, advanced cadastre systems and land administration are vital owing to rapid development and dense population, especially in city centres such as Kuala Lumpur. Growing populations lead to more limited space, which explains the need to build more vertical buildings. Consequently, efficient strata management is required for managing strata titles. A study of a country-based profile on the cadastre domain standard has been conceptualized for land administration in Malaysia that allows the integration of 2D and 3D representations of spatial units, with support for both formal and informal Rights, Restrictions and Responsibilities (RRR). Since this research used Malaysian cadastre management as a case study, the proposed model for the Malaysian land administration country profile was embedded in the integration model. Meanwhile, a new working item proposal for LADM Edition II has been introduced on the idea of encoding further integration of land administration with an existing standard such as IndoorGML. Hence, this paper proposes a conceptual model for the integration of legal (indoor) space and legal objects using the LADM Edition II and IndoorGML standards for strata purposes. Three objectives were defined to achieve the aim of the study: first, to identify the integration of spatial and non-spatial components for strata management; second, to develop a conceptual data model for strata integrating LADM Edition II and IndoorGML; and lastly, to develop a prototype to validate the proposed conceptual data model. The development of the conceptual model may provide insights or ideas for future work and land administration for strata purposes.
ABNT, Harvard, Vancouver, APA, and other styles
23

Vargas, David, Ivan Vasconcelos, Yanadet Sripanich and Matteo Ravasi. "Scattering-based focusing for imaging in highly complex media from band-limited, multicomponent data". GEOPHYSICS 86, no. 5 (September 1, 2021): WC141—WC157. http://dx.doi.org/10.1190/geo2020-0939.1.

Full text source
Abstract:
Reconstructing the details of subsurface structures deep beneath complex overburden structures, such as subsalt, remains a challenge for seismic imaging. Over the past few years, the Marchenko redatuming approach has proven to reliably retrieve full-wavefield information in the presence of complex overburden effects. When used for redatuming, current practical Marchenko schemes cannot make use of a priori subsurface models with sharp contrasts because of their requirements regarding initial focusing functions, which for sufficiently complex media can result in redatumed fields with significant waveform inaccuracies. Using a scattering framework, we evaluate an alternative form of the Marchenko representation that aims at retrieving only the unknown perturbations to focusing functions and redatumed fields. From this framework, we have developed a two-step practical focusing-based redatuming scheme that first solves an inverse problem for the background focusing functions, which are then used to estimate the perturbations to focusing functions and redatumed fields. In our scheme, initial focusing functions are significantly different from previous approaches because they contain complex waveforms encoding the full transmission response of the a priori model. Our goal is the handling of not only highly complex media but also realistic data — band-limited, unevenly sampled, free-surface-multiple contaminated data. To that end, we combine the versatility of Rayleigh-Marchenko redatuming with our scattering-based scheme allowing an extended version of the method able to handle single-sided band-limited multicomponent data. This scattering-Rayleigh-Marchenko strategy accurately retrieves wavefields while requiring minimum preprocessing of the data. In support of the new methods, we evaluate a comprehensive set of numerical tests using a complex 2D subsalt model. 
Our numerical results indicate that the scattering approaches retrieve accurate redatumed fields that appropriately account for the complexity of the a priori model. We find that the improvements in wavefield retrieval translate into measurable improvements in our subsalt images.
ABNT, Harvard, Vancouver, APA, and other styles
24

Kong, Weiwei, Yusheng Du, Leilei He and Zejiang Li. "Improved 3D Object Detection Based on PointPillars". Electronics 13, no. 15 (July 24, 2024): 2915. http://dx.doi.org/10.3390/electronics13152915.

Full text source
Abstract:
Despite recent advancements in 3D object detection, conventional 3D point cloud detection algorithms exhibit limited accuracy for small objects. To address the challenge of poor small-object detection, this paper adopts the PointPillars algorithm as the baseline model and proposes a two-stage 3D target detection approach. Point cloud processing is performed using Transformer models, and a redefined attention mechanism is introduced to further enhance the detection capabilities of the algorithm. In the first stage, the algorithm uses PointPillars as the baseline model, whose central concept is to divide the point cloud space into equal-sized pillars. During feature extraction, when the features from all pillars are transformed into pseudo-images, the proposed algorithm incorporates attention mechanisms adapted from the Squeeze-and-Excitation (SE) method to emphasize or suppress feature information. Furthermore, the 2D convolution of the traditional backbone network is replaced by dynamic convolution, and the added attention mechanism further improves the feature representation ability of the network. In the second stage, the candidate frames generated in the first stage are refined using a Transformer-based approach: the encoder constructs the initial point features from the candidate frames, while the decoder applies channel weighting to enhance channel information, improving detection accuracy and reducing false detections. Experimental results on the KITTI dataset verify the effectiveness of this method for small-object detection: compared with the baseline PointPillars, the average precision (AP) values for cars, pedestrians, and cyclists in the moderate-difficulty category increased by 5.30%, 8.1%, and 10.6%, respectively. Moreover, the proposed method surpasses existing mainstream approaches in the cyclist category.
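The SE-style channel weighting mentioned above can be sketched roughly as follows; the gating inputs and function names are illustrative assumptions, not the paper's implementation:

```python
import math

# Minimal squeeze-and-excitation style channel reweighting on a tiny
# feature map. The (omitted) excitation MLP would normally produce the
# per-channel gate logits; here they are passed in directly.
def se_reweight(channels, gate_logits):
    """channels: list of 2D feature maps (one per channel);
    gate_logits: one raw logit per channel."""
    gates = [1.0 / (1.0 + math.exp(-w)) for w in gate_logits]  # sigmoid
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(channels, gates)]
```

Channels with high gate values are emphasized, while channels with low gate values are suppressed.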
ABNT, Harvard, Vancouver, APA, and other styles
25

Shrivastava, Aditya Divyakant, Neil Swainston, Soumitra Samanta, Ivayla Roberts, Marina Wright Muelas and Douglas B. Kell. "MassGenie: A Transformer-Based Deep Learning Method for Identifying Small Molecules from Their Mass Spectra". Biomolecules 11, no. 12 (November 30, 2021): 1793. http://dx.doi.org/10.3390/biom11121793.

Full text source
Abstract:
The ‘inverse problem’ of mass spectrometric molecular identification (‘given a mass spectrum, calculate/predict the 2D structure of the molecule whence it came’) is largely unsolved, and is especially acute in metabolomics where many small molecules remain unidentified. This is largely because the number of experimentally available electrospray mass spectra of small molecules is quite limited. However, the forward problem (‘calculate a small molecule’s likely fragmentation and hence at least some of its mass spectrum from its structure alone’) is much more tractable, because the strengths of different chemical bonds are roughly known. This kind of molecular identification problem may be cast as a language translation problem in which the source language is a list of high-resolution mass spectral peaks and the ‘translation’ a representation (for instance in SMILES) of the molecule. It is thus suitable for attack using the deep neural networks known as transformers. We here present MassGenie, a method that uses a transformer-based deep neural network, trained on ~6 million chemical structures with augmented SMILES encoding and their paired molecular fragments as generated in silico, explicitly including the protonated molecular ion. This architecture (containing some 400 million elements) is used to predict the structure of a molecule from the various fragments that may be expected to be observed when some of its bonds are broken. Despite being given essentially no detailed nor explicit rules about molecular fragmentation methods, isotope patterns, rearrangements, neutral losses, and the like, MassGenie learns the effective properties of the mass spectral fragment and valency space, and can generate candidate molecular structures that are very close or identical to those of the ‘true’ molecules. We also use VAE-Sim, a previously published variational autoencoder, to generate candidate molecules that are ‘similar’ to the top hit. 
In addition to using the ‘top hits’ directly, we can produce a rank order of these by ‘round-tripping’ candidate molecules and comparing them with the true molecules, where known. As a proof of principle, we confine ourselves to positive electrospray mass spectra from molecules with a molecular mass of 500 Da or lower, including those in the last CASMI challenge (for which the results are known), getting 49/93 (53%) precisely correct. The transformer method, applied here for the first time to mass spectral interpretation, works extremely effectively both for mass spectra generated in silico and for experimentally obtained mass spectra from pure compounds. It seems to act as a Las Vegas algorithm, in that it either gives the correct answer or simply states that it cannot find one. The ability to create and to ‘learn’ millions of fragmentation patterns in silico, and therefrom generate candidate structures (that do not have to be in existing libraries) directly, thus opens up entirely the field of de novo small-molecule structure prediction from experimental mass spectra.
ABNT, Harvard, Vancouver, APA, and other styles
26

Cheng, Ta-Ying, Hsuan-Ru Yang, Niki Trigoni, Hwann-Tzong Chen and Tyng-Luh Liu. "Pose Adaptive Dual Mixup for Few-Shot Single-View 3D Reconstruction". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 427–35. http://dx.doi.org/10.1609/aaai.v36i1.19920.

Full text source
Abstract:
We present a pose adaptive few-shot learning procedure and a two-stage data interpolation regularization, termed Pose Adaptive Dual Mixup (PADMix), for single-image 3D reconstruction. While augmentations via interpolating feature-label pairs are effective in classification tasks, they fall short in shape predictions potentially due to inconsistencies between interpolated products of two images and volumes when rendering viewpoints are unknown. PADMix targets this issue with two sets of mixup procedures performed sequentially. We first perform an input mixup which, combined with a pose adaptive learning procedure, is helpful in learning 2D feature extraction and pose adaptive latent encoding. The stagewise training allows us to build upon the pose invariant representations to perform a follow-up latent mixup under one-to-one correspondences between features and ground-truth volumes. PADMix significantly outperforms previous literature on few-shot settings over the ShapeNet dataset and sets new benchmarks on the more challenging real-world Pix3D dataset.
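The interpolation rule underlying both mixup stages can be sketched as below; PADMix applies it to feature/volume and latent pairs, while this minimal version only shows the rule itself:

```python
# Mixup-style linear interpolation of two vectors with coefficient lam:
# x_mix = lam * x1 + (1 - lam) * x2. PADMix performs this first on
# inputs and then on pose-invariant latents (this sketch shows only
# the interpolation itself, not the training procedure).
def mixup(x1, x2, lam):
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
```

The same coefficient is applied to the corresponding ground-truth volumes so that feature-label pairs stay consistent.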
ABNT, Harvard, Vancouver, APA, and other styles
27

Moreno-Avendano, Santiago, Daniel Mejia-Parra and Oscar Ruiz-Salguero. "Triangle mesh skeletonization using non-deterministic voxel thinning and graph spectrum segmentation". MATEC Web of Conferences 336 (2021): 02030. http://dx.doi.org/10.1051/matecconf/202133602030.

Full text source
Abstract:
In the context of shape processing, the estimation of the medial axis is relevant for the simplification and re-parameterization of 3D bodies. The currently used methods are based on (1) general fields, (2) geometric methods, and (3) voxel-based thinning. They present shortcomings such as (1) overrepresentation and non-smoothness of the medial axis due to high-frequency nodes and (2) biased skeletons due to skewed thinning. To partially overcome these limitations, this article presents a non-deterministic algorithm for the estimation of the 1D skeleton of triangular B-Reps or voxel-based body representations. Our method articulates (1) a novel randomized thinning algorithm that avoids possible skewing of the final skeleton, (2) spectral-based segmentation that eliminates short dead-end branches, and (3) a maximal excursion method for the reduction of high frequencies. The test results show that the randomized order in the removal of the instantaneous skin of the solid region eliminates bias in the skeleton, thus respecting features of the initial solid. An Alpha Shape-based inversion of the skeleton encoding results in triangular boundary representations of the original body, which present reasonable quality for fast non-minute scenes. Future work is needed to (a) tune the spectral filtering of high frequencies off the basic skeleton and (b) extend the algorithm to solid regions whose skeletons mix 1D and 2D entities.
ABNT, Harvard, Vancouver, APA, and other styles
28

Ruan, Xiongtao, and Robert F. Murphy. "Evaluation of methods for generative modeling of cell and nuclear shape". Bioinformatics 35, no. 14 (December 7, 2018): 2475–85. http://dx.doi.org/10.1093/bioinformatics/bty983.

Full text source
Abstract:
Motivation: Cell shape provides both geometry for, and a reflection of, cell function. Numerous methods for describing and modeling cell shape have been described, but previous evaluation of these methods in terms of the accuracy of generative models has been limited. Results: Here we compare traditional methods and deep autoencoders to build generative models for cell shapes in terms of the accuracy with which shapes can be reconstructed from models. We evaluated the methods on different collections of 2D and 3D cell images, and found that none of the methods gave accurate reconstructions using low-dimensional encodings. As expected, much higher accuracies were observed using high-dimensional encodings, with outline-based methods significantly outperforming image-based autoencoders. The latter tended to encode all cells as having smooth shapes, even for high dimensions. For complex 3D cell shapes, we developed an improved method based on the spherical harmonic transform that performs significantly better than other methods. We obtained similar results for the joint modeling of cell and nuclear shape. Finally, we evaluated the modeling of shape dynamics by interpolation in the shape space. We found that our modified method provided lower deformation energies along linear interpolation paths than other methods. This allows practical shape evolution in high-dimensional shape spaces. We conclude that our improved spherical harmonic based methods are preferable for cell and nuclear shape modeling, providing better representations, higher computational efficiency and requiring fewer training images than deep learning methods. Availability and implementation: All software and data are available at http://murphylab.cbd.cmu.edu/software. Supplementary information: Supplementary data are available at Bioinformatics online.
ABNT, Harvard, Vancouver, APA, and other styles
29

Rudolph, Michael, Stefan Schneegass and Amr Rizk. "Transcoding V-PCC Point Cloud Streams in Real-time". ACM Transactions on Multimedia Computing, Communications, and Applications, August 2024. http://dx.doi.org/10.1145/3682062.

Full text source
Abstract:
Dynamic Point Clouds are a representation for 3D immersive media that allows users to freely navigate a scene while consuming the content. However, this comes at the cost of substantial data size, requiring efficient compression techniques to make point cloud videos accessible. Addressing this, Video-based Point Cloud Compression (V-PCC) projects points into 2D patches to compress video frames, leveraging the high compression efficiency of legacy video codecs and exploiting temporal correlations in the 2D images. However, clustering and projecting points into meaningful 2D patches is computationally intensive, leading to high encoding latency in V-PCC. Applying adaptive streaming techniques, originating from traditional video streaming, multiplies the computational effort as multiple encodings of the same content are required. In this light, transcoding a compressed representation into lower qualities for dynamic adaptation to user requirements is gaining popularity. To address the high latency when employing the full decoder-encoder stack of V-PCC during transcoding, we propose RABBIT, a novel technique that only re-encodes the underlying video sub-streams. This is in contrast to slow V-PCC transcoding that reconstructs and re-encodes the raw point cloud at a new quality setting. By eliminating expensive overhead resulting from calculations based on the 3D space representation, the latency of RABBIT is bounded by the latency of transcoding the underlying video streams, allowing optimized video codec implementations to be used to meet the real time requirements of adaptive streaming systems. Our evaluations of RABBIT, using various optimized video codec implementations, shows on-par quality with the baseline V-PCC transcoding given a high-quality representation. Given unicast or multicast distribution of a point cloud stream and in-network or edge transcoders, our evaluations show the tradeoff between rate-distortion performance and the required network bandwidth.
ABNT, Harvard, Vancouver, APA, and other styles
30

ALGÜL, Enes. "Classifying RNA Strands with A Novel Graph Representation Based on the Sequence Free Energy". Türk Doğa ve Fen Dergisi, May 5, 2023. http://dx.doi.org/10.46810/tdfd.1240075.

Full text source
Abstract:
ABSTRACT Ribonucleic acids (RNA) are macromolecules present in all living cells, and they act as mediators between DNA and protein. Structurally, RNAs are more similar to DNA. In this paper, we introduce a compact graph representation utilizing the Minimum Free Energy (MFE) of RNA molecules' secondary structure. It represents the structural components of secondary RNA structures as graph edges, with the MFE of each component as its edge weight. A labeling process determines these weights by considering both the MFE of the 2D RNA structures and the specific settings within them, making the representation more compact by giving each secondary structural element a unique graph representation. Armed with this representation, we apply graph-based algorithms to categorize RNA molecules. We also present the results of cutting-edge graph-based methods (All Paths Cycle Embeddings (APC), Shortest Paths Kernel/Embedding (SP), and Weisfeiler-Lehman and Optimal Assignment Kernel (WLOA)) on our dataset [1] using this new graph representation. Finally, we compare the results of the graph-based algorithms to a standard bioinformatics algorithm (Needleman-Wunsch) used for DNA and RNA comparison.
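A minimal sketch of the edge-weighted graph construction described above, with made-up element names and MFE values for illustration only:

```python
# Illustrative construction of a weighted graph whose edges are
# secondary-structure elements and whose weights are MFE values in
# kcal/mol. Element names and energies here are invented examples.
def build_rna_graph(elements):
    """elements: iterable of (node_u, node_v, mfe_kcal_per_mol) triples."""
    graph = {}
    for u, v, mfe in elements:
        graph.setdefault(u, {})[v] = mfe
        graph.setdefault(v, {})[u] = mfe  # undirected adjacency
    return graph
```

Graph kernels such as SP or WLOA would then operate on this adjacency structure and its edge weights.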
ABNT, Harvard, Vancouver, APA, and other styles
31

Wang, N. F., and K. Tai. "Design of 2-DOF Compliant Mechanisms to Form Grip-and-Move Manipulators for 2D Workspace". Journal of Mechanical Design 132, no. 3 (March 1, 2010). http://dx.doi.org/10.1115/1.4001213.

Full text source
Abstract:
This paper demonstrates the design of compliant grip-and-move manipulators by structural optimization using genetic algorithms. The manipulator is composed of two compliant mechanisms (each with two degrees of freedom) that work like two fingers so that the manipulator can grip an object and convey it from one point to another anywhere within a two-dimensional workspace. The synthesis of such compliant mechanisms is accomplished by formulating the problem as a structural topology and shape optimization problem with multiple objectives and constraints to achieve the desired behavior of the manipulator. A multiobjective genetic algorithm is then applied coupled with an enhanced morphological representation for defining and encoding the structural geometry variables. The solution framework is integrated with a nonlinear finite element code for large-displacement analyses of the compliant structures to compute the paths generated by these mechanisms, with the resulting optimal designs used to realize various manipulator configurations.
ABNT, Harvard, Vancouver, APA, and other styles
32

Khandhadia, Amit P., Aidan P. Murphy, Kenji W. Koyano, Elena M. Esch and David A. Leopold. "Encoding of 3D physical dimensions by face-selective cortical neurons". Proceedings of the National Academy of Sciences 120, no. 9 (February 21, 2023). http://dx.doi.org/10.1073/pnas.2214996120.

Full text source
Abstract:
Neurons throughout the primate inferior temporal (IT) cortex respond selectively to visual images of faces and other complex objects. The response magnitude of neurons to a given image often depends on the size at which the image is presented, usually on a flat display at a fixed distance. While such size sensitivity might simply reflect the angular subtense of retinal image stimulation in degrees, one unexplored possibility is that it tracks the real-world geometry of physical objects, such as their size and distance to the observer in centimeters. This distinction bears fundamentally on the nature of object representation in IT and on the scope of visual operations supported by the ventral visual pathway. To address this question, we assessed the response dependency of neurons in the macaque anterior fundus (AF) face patch to the angular versus physical size of faces. We employed a macaque avatar to stereoscopically render three-dimensional (3D) photorealistic faces at multiple sizes and distances, including a subset of size/distance combinations designed to cast the same size retinal image projection. We found that most AF neurons were modulated principally by the 3D physical size of the face rather than its two-dimensional (2D) angular size on the retina. Further, most neurons responded strongest to extremely large and small faces, rather than to those of normal size. Together, these findings reveal a graded encoding of physical size among face patch neurons, providing evidence that category-selective regions of the primate ventral visual pathway participate in a geometric analysis of real-world objects.
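The dissociation between 2D angular size and 3D physical size rests on the standard angular-size relation; a small sketch (the formula is standard, the values illustrative):

```python
import math

# Angular size of an object of physical size s (cm) at distance d (cm):
# theta = 2 * atan(s / (2 * d)). Scaling size and distance together keeps
# the retinal (angular) size constant, which is how size/distance
# combinations casting the same retinal projection can be constructed.
def angular_size_deg(s_cm, d_cm):
    return math.degrees(2.0 * math.atan(s_cm / (2.0 * d_cm)))
```

Neurons modulated by the left-hand variables (s, d) rather than by theta alone are encoding physical, not angular, size.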
ABNT, Harvard, Vancouver, APA, and other styles
33

Liu, Yongtao, Bryan D. Huey, Maxim A. Ziatdinov and Sergei V. Kalinin. "Physical discovery in representation learning via conditioning on prior knowledge". Journal of Applied Physics 136, no. 6 (August 14, 2024). http://dx.doi.org/10.1063/5.0222403.

Full text source
Abstract:
Recent advances in electron, scanning probe, optical, and chemical imaging and spectroscopy yield bespoke data sets containing the information of structure and functionality of complex systems. In many cases, the resulting data sets are underpinned by low-dimensional simple representations encoding the factors of variability within the data. The representation learning methods seek to discover these factors of variability, ideally further connecting them with relevant physical mechanisms. However, generally, the task of identifying the latent variables corresponding to actual physical mechanisms is extremely complex. Here, we present an empirical study of an approach based on conditioning the data on the known (continuous) physical parameters and systematically compare it with the previously introduced approach based on the invariant variational autoencoders. The conditional variational autoencoder (cVAE) approach does not rely on the existence of the invariant transforms and hence allows for much greater flexibility and applicability. Interestingly, cVAE allows for limited extrapolation outside of the original domain of the conditional variable. However, this extrapolation is limited compared to the cases when true physical mechanisms are known, and the physical factor of variability can be disentangled in full. We further show that introducing the known conditioning results in the simplification of the latent distribution if the conditioning vector is correlated with the factor of variability in the data, thus allowing us to separate relevant physical factors. We initially demonstrate this approach using 1D and 2D examples on a synthetic data set and then extend it to the analysis of experimental data on ferroelectric domain dynamics visualized via piezoresponse force microscopy.
ABNT, Harvard, Vancouver, APA, and other styles
34

Zhang, Jianqun, Qing Zhang, Xianrong Qin and Yuantao Sun. "Robust fault diagnosis of quayside container crane gearbox based on 2D image representation in frequency domain and CNN". Structural Health Monitoring, April 24, 2023, 147592172311688. http://dx.doi.org/10.1177/14759217231168877.

Full text source
Abstract:
To accurately diagnose quayside container crane (QCC) gearbox faults, this article proposes a method that combines the frequency-domain Markov transition field (FDMTF) and a multi-branch residual convolutional neural network (MBRCNN). Firstly, the gearbox vibration signal is converted into the frequency domain to reveal the components and amplitudes of signals stably and concisely. Then, the one-dimensional frequency signal is encoded into a two-dimensional image by the Markov transition field to capture the dynamic characteristics of signals. Thirdly, the MBRCNN network is constructed, which can extract multi-scale features and alleviate the problems caused by a deep network structure. Finally, the FDMTF image is fed into the constructed MBRCNN model for pattern recognition. The effectiveness of the proposed FDMTF–MBRCNN method is verified by two case studies. In Case 1, the diagnosis results on a benchmark dataset achieve 100% accuracy, better than seven state-of-the-art methods published in the last three years. In Case 2, the diagnosis results on a dataset collected from a 1:4 scaled test rig achieve 98.85% accuracy, better than eleven encoding methods and four convolutional neural network methods. The method also obtains a recognition accuracy of more than 94% under conditions of small samples, different network hyper-parameters, or variable loads, which verifies its robustness. These case studies show that the FDMTF–MBRCNN method is expected to be applicable to the actual fault diagnosis of QCC gearboxes.
ABNT, Harvard, Vancouver, APA, and other styles
35

Chakravarthula, Praneeth, Ethan Tseng, Henry Fuchs and Felix Heide. "Hogel-free Holography". ACM Transactions on Graphics, March 30, 2022. http://dx.doi.org/10.1145/3516428.

Full text source
Abstract:
Holography is a promising avenue for high-quality displays without requiring bulky, complex optical systems. While recent work has demonstrated accurate hologram generation of 2D scenes, high-quality holographic projection of 3D scenes has been out of reach until now. Existing multiplane 3D holography approaches fail to model wavefronts in the presence of partial occlusion, while holographic stereogram methods have to make a fundamental trade-off between spatial and angular resolution. In addition, existing 3D holographic display methods rely on heuristic encoding of complex amplitude into phase-only pixels, which results in holograms with severe artifacts. Fundamental limitations of the input representation, wavefront modeling, and optimization methods prohibit artifact-free 3D holographic projections in today’s displays. To lift these limitations, we introduce hogel-free holography, which optimizes for true 3D holograms, supporting both depth- and view-dependent effects for the first time. Our approach overcomes the fundamental spatio-angular resolution trade-off typical of stereogram approaches. Moreover, it avoids heuristic encoding schemes to achieve high image fidelity over a 3D volume. We validate that the proposed method achieves a 10 dB PSNR improvement on simulated holographic reconstructions. We also validate our approach on an experimental prototype with accurate parallax and depth focus effects.
36

A, Hepzibah Christinal, Kowsalya G, Abraham Chandy D, Jebasingh S, and Chandrajit Bajaj. "Analysis of the Measurement Matrix in Directional Predictive Coding for Compressive Sensing of Medical Images". ELCVIA Electronic Letters on Computer Vision and Image Analysis 20, no. 2 (25 January 2022). http://dx.doi.org/10.5565/rev/elcvia.1412.

Abstract:
Compressive sensing of 2D signals involves three fundamental steps: sparse representation, linear measurement matrix, and recovery of the signal. This paper focuses on analyzing the efficiency of various measurement matrices for compressive sensing of medical images based on theoretical predictive coding. During encoding, the prediction is efficiently chosen by four directional predictive modes for block-based compressive sensing measurements. In this work, Gaussian, Bernoulli, Laplace, Logistic, and Cauchy random matrices are used as the measurement matrices. While decoding, the same optimal prediction is de-quantized. Peak-signal-to-noise ratio and sparsity are used for evaluating the performance of measurement matrices. The experimental result shows that the spatially directional predictive coding (SDPC) with Laplace measurement matrices performs better compared to scalar quantization (SQ) and differential pulse code modulation (DPCM) methods. The results indicate that the Laplace measurement matrix is the most suitable in compressive sensing of medical images.
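For concreteness, the sketch below generates the five random measurement-matrix families compared in the abstract and applies each to one vectorized image block (y = Φx). The distribution scalings and the block size are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def measurement_matrix(kind, m, n, rng):
    """Random m x n measurement matrices from the families compared above."""
    if kind == "gaussian":
        return rng.normal(0.0, 1.0 / np.sqrt(m), (m, n))
    if kind == "bernoulli":
        return rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
    if kind == "laplace":
        return rng.laplace(0.0, 1.0 / np.sqrt(2 * m), (m, n))
    if kind == "logistic":
        return rng.logistic(0.0, np.sqrt(3.0) / (np.pi * np.sqrt(m)), (m, n))
    if kind == "cauchy":
        return rng.standard_cauchy((m, n)) / m
    raise ValueError(kind)

rng = np.random.default_rng(0)
block = rng.normal(size=256)          # stand-in for one vectorized 16x16 image block
# Block-based measurements y = Phi @ x at a 4:1 compression ratio
measurements = {k: measurement_matrix(k, 64, 256, rng) @ block
                for k in ("gaussian", "bernoulli", "laplace", "logistic", "cauchy")}
```

Recovery (and hence the PSNR comparison) would then proceed per family with a sparse solver, which is beyond this sketch.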
37

Wirnsberger, Gregor, Iva Pritišanac, Gustav Oberdorfer, and Karl Gruber. "Flattening the curve—How to get better results with small deep‐mutational‐scanning datasets". Proteins: Structure, Function, and Bioinformatics, 19 March 2024. http://dx.doi.org/10.1002/prot.26686.

Abstract:
Proteins are used in various biotechnological applications, often requiring the optimization of protein properties by introducing specific amino‐acid exchanges. Deep mutational scanning (DMS) is an effective high‐throughput method for evaluating the effects of these exchanges on protein function. DMS data can then inform the training of a neural network to predict the impact of mutations. Most approaches use some representation of the protein sequence for training and prediction. As proteins are characterized by complex structures and intricate residue interaction networks, directly providing structural information as input reduces the need to learn these features from the data. We introduce a method for encoding protein structures as stacked 2D contact maps, which capture residue interactions, their evolutionary conservation, and mutation‐induced interaction changes. Furthermore, we explored techniques to augment neural network training performance on smaller DMS datasets. To validate our approach, we trained three neural network architectures originally used for image analysis on three DMS datasets, and we compared their performances with networks trained solely on protein sequences. The results confirm the effectiveness of the protein structure encoding in machine learning efforts on DMS data. Using structural representations as direct input to the networks, along with data augmentation and pretraining, significantly reduced demands on training data size and improved prediction performance, especially on smaller datasets, while performance on large datasets was on par with state‐of‐the‐art sequence convolutional neural networks. The methods presented here have the potential to provide the same workflow as DMS without the experimental and financial burden of testing thousands of mutants.
Additionally, we present an open‐source, user‐friendly software tool to make these data analysis techniques accessible, particularly to biotechnology and protein engineering researchers who wish to apply them to their mutagenesis data.
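The stacked-2D-contact-map input described above can be sketched as follows. The 8 Å cutoff is a common convention rather than the authors' stated choice, and the toy coordinates plus duplicated channel merely stand in for the conservation and mutation-change channels the paper adds.

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Binary residue-residue contact map from C-alpha coordinates.

    coords: (L, 3) array; returns an (L, L) float map with 1.0 where the
    pairwise distance is below the cutoff (8 A is a common convention)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    return (dist < cutoff).astype(np.float32)

# Toy "structure": 10 residues along a helix-like curve
t = np.linspace(0, 3 * np.pi, 10)
coords = np.stack([np.cos(t) * 2.3, np.sin(t) * 2.3, t * 1.5], axis=1)
cmap = contact_map(coords)

# Stack channels image-style, as the paper describes (here the duplicated
# map stands in for conservation / mutation-induced-change channels)
stacked = np.stack([cmap, cmap], axis=0)   # shape (channels, L, L)
```

A stack like this can be consumed directly by image-analysis network architectures.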
38

Gantzer, Philippe, Ruben Staub, Yu Harabuchi, Satoshi Maeda, and Alexandre Varnek. "Chemography‐guided analysis of a reaction path network for ethylene hydrogenation with a model Wilkinson's catalyst". Molecular Informatics, 9 August 2024. http://dx.doi.org/10.1002/minf.202400063.

Abstract:
Visualization and analysis of large chemical reaction networks become rather challenging when conventional graph‐based approaches are used. As an alternative, we propose to use the chemical cartography ("chemography") approach, describing the data distribution on a 2‐dimensional map. Here, the Generative Topographic Mapping (GTM) algorithm, an advanced chemography approach, has been applied to visualize the reaction path network of a simplified Wilkinson's catalyst‐catalyzed hydrogenation containing some 10^5 structures generated with the help of the Artificial Force Induced Reaction (AFIR) method, using either Density Functional Theory or a Neural Network Potential (NNP) for potential energy surface calculations. Using new atom-permutation-invariant 3D descriptors for structure encoding, we demonstrated that GTM is able to cluster structures that share the same 2D representation, visualize the potential energy surface, provide insight into the reaction path exploration as a function of time, and compare reaction path networks obtained with different methods of energy assessment.
39

Šuba, Radan. "Vario-scale data structures". Architecture and the Built Environment, 2018. http://dx.doi.org/10.59490/abe.2017.18.3592.

Abstract:
The previous chapter presented the state of the art in map generalization at NMAs and in continuous generalization. There is a noticeable technological shift towards continuous generalisation, which supports interactive map use where users can zoom in, zoom out and navigate in a more gradual way. Despite some research efforts, there is no satisfactory solution yet. Therefore, this chapter introduces the truly smooth vario-scale structure for geographic information, where a small step in the scale dimension leads to a small change in the representation of the geographic features shown on the map. With this approach there is no (or minimal) geometric data redundancy, and there is no (temporal) delay any more between the availability of data sets at different map scales (as was and is the case with more traditional approaches to multi-scale representations). Moreover, continuous generalisation of real-world features is based on a structure that can be used to present a smooth zoom action to the user. More specifically, Sections 3.1 and 3.2 provide a historical overview of the development and the theoretical framework for vario-scale representations: the tGAP structure (topological Generalized Area Partitioning). Section 3.3 describes the initial effort to generate better cartographic content: the concept of the constraint tGAP. Section 3.4 explains the 3D SSC (Space-Scale Cube) encoding of 2D truly vario-scale data. Section 3.5 shows how to combine multiple levels of detail in one map. Section 3.6 summarizes the open questions of the vario-scale concept and indicates the research covered in the following chapters. Section 3.7 presents vario-scale data research on progressive data transfer carried out in parallel to this PhD. Finally, Section 3.8 summarises the chapter.
40

"DESIGN ANALYSIS OF 2-D DWT BASED IMAGE COMPRESSION USING FPGA FOR MEMORY SPEED OPTIMIZER." July-2020 9, no. 7 (2020): 214–20. http://dx.doi.org/10.29121/ijesrt.v9.i7.2020.22.

Abstract:
The wavelet transform has been successfully applied in a number of fields, covering anything from pure mathematics to applied science. Numerous studies of the wavelet transform have proven its advantages in image processing and data compression and have made it an encoding technique in recent data compression standards, along with multi-resolution decomposition in signal and image processing applications. Pure software implementations of the Discrete Wavelet Transform (DWT), however, remain the performance bottleneck in real-time systems. Therefore, hardware acceleration of the DWT has developed into a topic of contemporary research. For image compression using the 2-dimensional DWT (2D-DWT), two filters are widely used: a highpass and a lowpass filter. Because the filter coefficients are irrational numbers, it is advocated that they be approximated by binary fractions. The accuracy and efficiency with which the filter coefficients are rationalized in the implementation affect the compression quality and critical hardware properties such as throughput and power consumption. A high-precision representation ensures good compression performance, but at the expense of increased hardware resources and processing time. Conversely, lower precision in the filter coefficients results in smaller, faster hardware, but at the expense of poorer compression performance.
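The precision trade-off described above can be made concrete. The sketch below rounds the irrational CDF 9/7 analysis lowpass coefficients (the filter used in JPEG 2000's irreversible transform) to binary fractions with a chosen number of fractional bits; the bit widths shown are illustrative, not the paper's design points.

```python
import numpy as np

# CDF 9/7 analysis lowpass filter coefficients (irrational in exact form)
h = np.array([0.026748757411, -0.016864118443, -0.078223266529,
              0.266864118443, 0.602949018236, 0.266864118443,
              -0.078223266529, -0.016864118443, 0.026748757411])

def binary_fraction(coeffs, bits):
    """Round coefficients to signed binary fractions with `bits` fractional
    bits (i.e., integer multiples of 2**-bits), as in fixed-point hardware."""
    scale = 2 ** bits
    return np.round(coeffs * scale) / scale

# More fractional bits -> smaller coefficient error -> better compression,
# at the cost of wider multipliers and higher power consumption.
for bits in (4, 8, 12):
    err = np.abs(h - binary_fraction(h, bits)).max()
    print(f"{bits:2d} fractional bits: max coefficient error {err:.2e}")
```

Round-to-nearest guarantees a worst-case coefficient error of 2^-(bits+1), which is the quantity traded against hardware area and throughput.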
41

Chen, Xi, Leah Varghese, Suzanne L. Baker, and William J. Jagust. "Age‐related and AD pathology‐related differences in neural activation and neural specificity in memory". Alzheimer's & Dementia 19, S24 (December 2023). http://dx.doi.org/10.1002/alz.082868.

Abstract:
Background: Beta‐amyloid (Aβ) and tau deposition is spatially selective in the early stages of Alzheimer's Disease (AD), with Aβ depositing predominately in posterior‐medial (PM) regions and tau accumulating in the medial temporal lobe (MTL) with increasing age before spreading to anterior‐temporal (AT) regions. Functionally, AT and PM are linked to object and scene processing, respectively, and the hippocampus in the MTL is important for integration and forming associations between objects and scenes. Given the differential vulnerability of AT, PM, and MTL to Aβ and tau, we examined the behavioral and neural differences in memory for objects, scenes, and object‐in‐scene pairs between young and cognitively normal older adults with and without AD pathologies.
Method: Twenty‐three young (19‐34 yrs) and 32 older (60‐91 yrs) participants completed an incidental encoding task on object, scene, and object‐in‐scene pair images during fMRI and a recognition test outside the scanner (Fig.1). Thirty older participants underwent PiB and FTP PET that measured Aβ and tau. We conducted univariate whole‐brain analysis and multi‐voxel pattern analysis (MVPA) of AT, PM, and the hippocampus. The MVPA trains a support vector machine to classify the stimulus category (object, scene, pair) based on the ROI's activation pattern. A higher classification accuracy represents informative neural representation that reflects functional specificity of the region.
Result: Behaviorally, older people had worse object (ps<.027) and pair (ps<.017) memory. Greater temporal meta‐ROI tau was related to worse object (p = .032) and pair (p = .022) memory. Whole‐brain analyses showed activations in AT and PM networks for object and scene processing, and additional frontal, temporal, and occipital regions for pair processing (Fig.2A‐C), where older adults, compared to younger adults, showed activation reductions in fusiform, parahippocampal, precuneus, and occipital regions (Fig.2D). The MVPA revealed age‐related reductions in classification accuracy for pairs (Fig.3A) and that greater Aβ and tau were related to lower accuracy in the hippocampus (Fig.3B).
Conclusion: We found evidence supporting specific vulnerabilities in object and associative memory in older adults correlated with tau pathology. fMRI findings suggest both lower neural activation and lower neural specificity in older adults when processing complex associations. AD pathology appears to contribute to lower neural specificity in the hippocampus.
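The MVPA step can be sketched with synthetic data. The paper trains a support vector machine; here a dependency-free leave-one-out nearest-centroid decoder stands in for it (an assumed substitute), illustrating the same logic: above-chance classification of stimulus category from an ROI's multi-voxel activation pattern indicates an informative, functionally specific representation.

```python
import numpy as np

# Synthetic stand-in for ROI multi-voxel patterns: 3 stimulus categories
# (object / scene / pair), 20 trials each, 50 voxels, plus trial noise.
rng = np.random.default_rng(0)
n_per, n_vox = 20, 50
labels = np.repeat([0, 1, 2], n_per)                 # 0=object, 1=scene, 2=pair
category_mean = rng.normal(size=(3, n_vox))          # one mean pattern per category
patterns = category_mean[labels] + rng.normal(size=(3 * n_per, n_vox))

def loo_decode(X, y):
    """Leave-one-out nearest-centroid decoding of stimulus category.

    Returns classification accuracy; chance level is 1 / n_categories."""
    hits = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i                # hold out trial i
        cents = np.stack([X[mask & (y == c)].mean(0) for c in np.unique(y)])
        hits += int(np.argmin(((cents - X[i]) ** 2).sum(1)) == y[i])
    return hits / len(y)

acc = loo_decode(patterns, labels)   # well above the 1/3 chance level here
```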
42

Agarwal, Abhishek, Sriram Goverapet Srinivasan, and Beena Rai. "Data-Driven Discovery of 2D Materials for Solar Water Splitting". Frontiers in Materials 8 (16 September 2021). http://dx.doi.org/10.3389/fmats.2021.679269.

Abstract:
Hydrogen economy, wherein hydrogen is used as the fuel in the transport and energy sectors, holds significant promise in mitigating the deleterious effects of global warming. Photocatalytic water splitting using sunlight is perhaps the cleanest way of producing the hydrogen fuel. Among various other factors, widespread adoption of this technology has mainly been stymied by the lack of a catalyst material with high efficiency. 2D materials have shown significant promise as efficient photocatalysts for water splitting. The availability of open databases containing the "computed" properties of 2D materials and advancements in deep learning now enable us to do "inverse" design of these 2D photocatalysts for water splitting. We use one such database (Jain et al., ACS Energ. Lett. 2019, 4, 6, 1410–1411) to build a generative model for the discovery of novel 2D photocatalysts. The structures of the materials were converted into a 3D image-based representation that was used to train a cell autoencoder, a basis autoencoder, and a segmentation network to ascertain the lattice parameters as well as positions of atoms from the images. Subsequently, the cell and basis encodings were used to train a conditional variational autoencoder (CVAE) to learn a continuous representation of the materials in a latent space. The latent space of the CVAE was then sampled to generate several new 2D materials that were likely to be efficient photocatalysts for water splitting. The bandgap of the generated materials was predicted using a graph neural network model while the band edge positions were obtained via empirical correlations. Although our generative modeling framework was used to discover novel 2D photocatalysts for the water splitting reaction, it is generic in nature and can be used directly to discover novel materials for other applications as well.
43

Li, Liujunli, Timo Flesch, Ce Ma, Jingjie Li, Yizhou Chen, Hung-Tu Chen, and Jeffrey C. Erlich. "Encoding of 2D self-centered plans and world-centered positions in the rat frontal orienting field." Journal of Neuroscience, 12 August 2024, e0018242024. http://dx.doi.org/10.1523/jneurosci.0018-24.2024.

Abstract:
The neural mechanisms of motor planning have been extensively studied in rodents. Preparatory activity in the frontal cortex predicts upcoming choice, but limitations of typical tasks have made it challenging to determine whether the spatial information is in a self-centered direction reference frame or a world-centered position reference frame. Here, we trained male rats to make delayed visually-guided orienting movements to six different directions, with four different target positions for each direction, which allowed us to disentangle direction versus position tuning in neural activity. We recorded single unit activity from the rat frontal orienting field (FOF) in the secondary motor cortex, a region involved in planning orienting movements. Population analyses revealed that the FOF encodes two separate 2D maps of space. First, a 2D map of the planned and ongoing movement in a self-centered direction reference frame. Second, a 2D map of the animal's current position on the port wall in a world-centered reference frame. Thus, preparatory activity in the FOF represents self-centered upcoming movement directions, but FOF neurons multiplex both self- and world-reference frame variables at the level of single neurons. Neural network model comparison supports the view that despite the presence of world-centered representations, the FOF receives the target information as self-centered input and generates self-centered planning signals.
Significance Statement: Motor planning in the real world involves complex coordinate transformations: eye to head to body, hand, etc. Typical rodent tasks (e.g., go/no-go or two-alternative) are too simple for studying these processes. We trained rats to perform delayed, visually-guided movements in multiple directions from varied start positions to explore coordinate systems in planning.
We found that the Frontal Orienting Field (FOF) encodes two separate maps: one for planning in self-centered coordinates and another for encoding current position in world-centered coordinates. Additionally, position and direction information are multiplexed at the single-neuron level. Our task and findings provide a foundation for understanding complex motor planning at a circuit level.
44

Stouffer, Kaitlin M., Alain Trouvé, Laurent Younes, Michael Kunst, Lydia Ng, Hongkui Zeng, Manjari Anant, et al. "Cross-modality mapping using image varifolds to align tissue-scale atlases to molecular-scale measures with application to 2D brain sections". Nature Communications 15, no. 1 (25 April 2024). http://dx.doi.org/10.1038/s41467-024-47883-4.

Abstract:
This paper explicates a solution to building correspondences between molecular-scale transcriptomics and tissue-scale atlases. This problem arises in atlas construction and cross-specimen/technology alignment where specimens per emerging technology remain sparse and conventional image representations cannot efficiently model the high dimensions from subcellular detection of thousands of genes. We address these challenges by representing spatial transcriptomics data as generalized functions encoding position and high-dimensional feature (gene, cell type) identity. We map onto low-dimensional atlas ontologies by modeling regions as homogeneous random fields with unknown transcriptomic feature distribution. We solve simultaneously for the minimizing geodesic diffeomorphism of coordinates through LDDMM and for these latent feature densities. We map tissue-scale mouse brain atlases to gene-based and cell-based transcriptomics data from MERFISH and BARseq technologies and to histopathology and cross-species atlases to illustrate integration of diverse molecular and cellular datasets into a single coordinate system as a means of comparison and further atlas construction.
45

Stark, Philipp, Efe Bozkir, Weronika Sójka, Markus Huff, Enkelejda Kasneci, and Richard Göllner. "The impact of presentation modes on mental rotation processing: a comparative analysis of eye movements and performance". Scientific Reports 14, no. 1 (29 May 2024). http://dx.doi.org/10.1038/s41598-024-60370-6.

Abstract:
Mental rotation is the ability to rotate mental representations of objects in space. Shepard and Metzler's shape-matching tasks, frequently used to test mental rotation, involve presenting pictorial representations of 3D objects. This stimulus material has raised questions regarding the ecological validity of the test for mental rotation with actual visual 3D objects. To systematically investigate differences in mental rotation with pictorial and visual stimuli, we compared data of N = 54 university students from a virtual reality experiment. Comparing both conditions within subjects, we found higher accuracy and faster reaction times for 3D visual figures. We expected eye tracking to reveal differences in participants' stimulus processing and mental rotation strategies induced by the visual differences. We statistically compared fixations (locations), saccades (directions), pupil changes, and head movements. Supplementary Shapley values of a Gradient Boosting Decision Tree algorithm were analyzed, which correctly classified the two conditions using eye and head movements. The results indicated that with visual 3D figures, the encoding of spatial information was less demanding, and participants may have used egocentric transformations and perspective changes. Moreover, participants showed eye movements associated with more holistic processing for visual 3D figures and more piecemeal processing for pictorial 2D figures.
46

Baker, Nicholas, and Philip J. Kellman. "Shape from dots: a window into abstraction processes in visual perception". Frontiers in Computer Science 6 (16 May 2024). http://dx.doi.org/10.3389/fcomp.2024.1367534.

Abstract:
Introduction: A remarkable phenomenon in perception is that the visual system spontaneously organizes sets of discrete elements into abstract shape representations. We studied perceptual performance with dot displays to discover what spatial relationships support shape perception.
Methods: In Experiment 1, we tested conditions that lead dot arrays to be perceived as smooth contours vs. having vertices. We found that the perception of a smooth contour vs. a vertex was influenced by spatial relations between dots beyond the three points that define the angle of the point in question. However, there appeared to be a hard boundary around 90° such that any angle of 90° or less was perceived as a vertex regardless of the spatial relations of ancillary dots. We hypothesized that dot arrays whose triplets were perceived as smooth curves would be more readily perceived as a unitary object because they can be encoded more economically. In Experiment 2, we generated dot arrays with and without such "vertex triplets" and compared participants' phenomenological reports of a unified shape with smooth curves vs. shapes with angular corners. Observers gave higher shape ratings for dot arrays from curvilinear shapes. In Experiment 3, we tested shape encoding using a mental rotation task. Participants judged whether two dot arrays were the same or different at five angular differences. Subjects responded reliably faster for displays without vertex triplets, suggesting economical encoding of smooth displays. We followed this up in Experiment 4 using a visual search task. Shapes with and without vertex triplets were embedded in arrays with 25 distractor dots. Participants were asked to detect which display in a 2IFC paradigm contained a shape against a distractor with random dots. Performance was better when the dots were sampled from a smooth shape than when they were sampled from a shape with vertex triplets.
Results and discussion: These results suggest that the visual system processes dot arrangements as coherent shapes automatically using precise smoothness constraints. This ability may be a consequence of processes that extract curvature in defining object shape and is consistent with recent theory and evidence suggesting that 2D contour representations are composed of constant curvature primitives.
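The hard ~90° boundary reported in Experiment 1 suggests a simple geometric classification rule. The sketch below (threshold and sample dot arrays are illustrative assumptions) computes the interior angle at each consecutive dot triplet and flags arrays that contain a "vertex triplet".

```python
import numpy as np

def triplet_angle(p0, p1, p2):
    """Interior angle in degrees at p1, formed by the dot triplet p0-p1-p2."""
    v1, v2 = np.asarray(p0, float) - p1, np.asarray(p2, float) - p1
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def has_vertex_triplet(dots, threshold=90.0):
    """True if any consecutive triplet forms an angle at or below the
    threshold -- mirroring the hard ~90-degree boundary from Experiment 1."""
    return any(triplet_angle(dots[i - 1], dots[i], dots[i + 1]) <= threshold
               for i in range(1, len(dots) - 1))

# Dots sampled from a smooth circular arc vs. a right-angle corner
arc = [(np.cos(a), np.sin(a)) for a in np.linspace(0, np.pi / 2, 8)]
corner = [(0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
```

On the arc every triplet's interior angle stays far above 90°, so the array reads as a smooth contour; the corner triplet hits exactly 90° and is flagged as a vertex.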
47

Altenhöner, Reinhard, Ina Blümel, Franziska Boehm, Jens Bove, Katrin Bicher, Christian Bracht, Ortrun Brand, et al. "NFDI4Culture - Consortium for research data on material and immaterial cultural heritage". Research Ideas and Outcomes 6 (31 July 2020). http://dx.doi.org/10.3897/rio.6.e57036.

Abstract:
Digital data on tangible and intangible cultural assets is an essential part of daily life, communication and experience. It has a lasting influence on the perception of cultural identity as well as on the interactions between research, the cultural economy and society. Throughout the last three decades, many cultural heritage institutions have contributed a wealth of digital representations of cultural assets (2D digital reproductions of paintings, sheet music, 3D digital models of sculptures, monuments, rooms, buildings), audio-visual data (music, film, stage performances), and procedural research data such as encoding and annotation formats. The long-term preservation and FAIR availability of research data from the cultural heritage domain is fundamentally important, not only for future academic success in the humanities but also for the cultural identity of individuals and society as a whole. Up to now, no coordinated effort for professional research data management on a national level exists in Germany. NFDI4Culture aims to fill this gap and create a user-centered, research-driven infrastructure that will cover a broad range of research domains from musicology, art history and architecture to performance, theatre, film, and media studies. The research landscape addressed by the consortium is characterized by strong institutional differentiation. Research units in the consortium's community of interest comprise university institutes, art colleges, academies, galleries, libraries, archives and museums. This diverse landscape is also characterized by an abundance of research objects, methodologies and a great potential for data-driven research. In a unique effort carried out by the applicant and co-applicants of this proposal and ten academic societies, this community is interconnected for the first time through a federated approach that is ideally suited to the needs of the participating researchers. 
To promote collaboration within the NFDI, to share knowledge and technology and to provide extensive support for its users have been the guiding principles of the consortium from the beginning and will be at the heart of all workflows and decision-making processes. Thanks to these principles, NFDI4Culture has gathered strong support ranging from individual researchers to high-level cultural heritage organizations such as the UNESCO, the International Council of Museums, the Open Knowledge Foundation and Wikimedia. On this basis, NFDI4Culture will take innovative measures that promote a cultural change towards a more reflective and sustainable handling of research data and at the same time boost qualification and professionalization in data-driven research in the domain of cultural heritage. This will create a long-lasting impact on science, cultural economy and society as a whole.