Academic literature on the topic 'Multi-Branch generative models'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Multi-Branch generative models.'

Journal articles on the topic "Multi-Branch generative models"

1

Xiong, Zuobin, Wei Li, and Zhipeng Cai. "Federated Generative Model on Multi-Source Heterogeneous Data in IoT." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10537–45. http://dx.doi.org/10.1609/aaai.v37i9.26252.

Abstract:
The study of generative models is a promising branch of deep learning techniques, which has been successfully applied to different scenarios, such as Artificial Intelligence and the Internet of Things. In most existing works, however, the generative models are realized as a centralized structure, which raises security and privacy threats and imposes excessive communication costs. Few efforts have been devoted to investigating distributed generative models, especially when the training data comes from multiple heterogeneous sources under realistic IoT settings. In this paper, to handle this challenging problem, we design a federated generative model framework that can learn a powerful generator for the hierarchical IoT systems. Particularly, our generative model framework can solve the problem of distributed data generation on multi-source heterogeneous data in two scenarios, i.e., feature related scenario and label related scenario. In addition, in our federated generative models, we develop synchronous and asynchronous updating methods to satisfy different application requirements. Extensive experiments on a simulated dataset and multiple real datasets are conducted to evaluate the data generation performance of our proposed generative models through comparison with state-of-the-art methods.
2

Safarov, Furkat, Ugiloy Khojamuratova, Misirov Komoliddin, Furkat Bolikulov, Shakhnoza Muksimova, and Young-Im Cho. "MBGPIN: Multi-Branch Generative Prior Integration Network for Super-Resolution Satellite Imagery." Remote Sensing 17, no. 5 (February 25, 2025): 805. https://doi.org/10.3390/rs17050805.

Abstract:
Achieving super-resolution with satellite images is a critical task for enhancing the utility of remote sensing data across various applications, including urban planning, disaster management, and environmental monitoring. Traditional interpolation methods often fail to recover fine details, while deep-learning-based approaches, including convolutional neural networks (CNNs) and generative adversarial networks (GANs), have significantly advanced super-resolution performance. Recent studies have explored large-scale models, such as Transformer-based architectures and diffusion models, demonstrating improved texture realism and generalization across diverse datasets. However, these methods frequently have high computational costs and require extensive datasets for training, making real-world deployment challenging. We propose the multi-branch generative prior integration network (MBGPIN) to address these limitations. This novel framework integrates multiscale feature extraction, hybrid attention mechanisms, and generative priors derived from pretrained VQGAN models. The dual-pathway architecture of the MBGPIN includes a feature extraction pathway for spatial features and a generative prior pathway for external guidance, dynamically fused using an adaptive generative prior fusion (AGPF) module. Extensive experiments on benchmark datasets such as UC Merced, NWPU-RESISC45, and RSSCN7 demonstrate that the MBGPIN achieves superior performance compared to state-of-the-art methods, including large-scale super-resolution models. The MBGPIN delivers a higher peak signal-to-noise ratio (PSNR) and higher structural similarity index measure (SSIM) scores while preserving high-frequency details and complex textures. The model also achieves significant computational efficiency, with reduced floating point operations (FLOPs) and faster inference times, making it scalable for real-world applications.
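The abstract above reports results in terms of PSNR. As a reading aid, here is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), in plain Python. It is not taken from the MBGPIN paper: the function name `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as nested lists of pixel rows (an illustrative choice;
    real pipelines would typically use array libraries instead).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and of the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means a smaller mean squared error relative to the peak pixel value, which is why super-resolution papers report it alongside perceptual measures such as SSIM.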
3

Niu, Zhenye, Yuxia Li, Yushu Gong, Bowei Zhang, Yuan He, Jinglin Zhang, Mengyu Tian, and Lei He. "Multi-Class Guided GAN for Remote-Sensing Image Synthesis Based on Semantic Labels." Remote Sensing 17, no. 2 (January 20, 2025): 344. https://doi.org/10.3390/rs17020344.

Abstract:
In the scenario of limited labeled remote-sensing datasets, the model’s performance is constrained by the insufficient availability of data. Generative model-based data augmentation has emerged as a promising solution to this limitation. While existing generative models perform well in natural scene domains (e.g., faces and street scenes), their performance in remote sensing is hindered by severe data imbalance and the semantic similarity among land-cover classes. To tackle these challenges, we propose the Multi-Class Guided GAN (MCGGAN), a novel network for generating remote-sensing images from semantic labels. Our model features a dual-branch architecture with a global generator that captures the overall image structure and a multi-class generator that improves the quality and differentiation of land-cover types. To integrate these generators, we design a shared-parameter encoder for consistent feature encoding across two branches, and a spatial decoder that synthesizes outputs from the class generators, preventing overlap and confusion. Additionally, we employ perceptual loss (LVGG) to assess perceptual similarity between generated and real images, and texture matching loss (LT) to capture fine texture details. To evaluate the quality of image generation, we tested multiple models on two custom datasets (one from Chongzhou, Sichuan Province, and another from Wuzhen, Zhejiang Province, China) and a public dataset LoveDA. The results show that MCGGAN achieves improvements of 52.86 in FID, 0.0821 in SSIM, and 0.0297 in LPIPS compared to the Pix2Pix baseline. We also conducted comparative experiments to assess the semantic segmentation accuracy of the U-Net before and after incorporating the generated images. The results show that data augmentation with the generated images leads to an improvement of 4.47% in FWIoU and 3.23% in OA across the Chongzhou and Wuzhen datasets. 
Experiments show that MCGGAN can be effectively used as a data augmentation approach to improve the performance of downstream remote-sensing image segmentation tasks.
4

Meng, Xiang Bao, Lei Wang, and Zi Jian Pan. "Parametric Modeling of Transition Tube with Constant Section Area along Straight, Circular and Oblique Central Route on CATIA." Advanced Materials Research 619 (December 2012): 18–21. http://dx.doi.org/10.4028/www.scientific.net/amr.619.18.

Abstract:
Parametric modeling of transition tubes was implemented based on a constant cross-section area assumption along the main central routes in CATIA software. The key objective of modeling these similar structures is to provide more geometric configuration options and modifications of micro channels for multi-phase flow systems. The modeling processes were parameterized and analyzed with the CATIA “Generative Shape Design” module with the help of the “Parameters” and “Relations” functions. The surface models are all designed with circular cross sections that are constrained in two ways: one is perpendicular to the main central routes of the tube, for planar transitional junctions, and the other is perpendicular to the sub-branch central routes, for three-dimensional oblique transitional junctions. Future work will focus on numerical simulation and experimental investigation with these geometric structures in a multi-phase flow system.
5

Shen, Qiwei, Junjie Xu, Jiahao Mei, Xingjiao Wu, and Daoguo Dong. "EmoStyle: Emotion-Aware Semantic Image Manipulation with Audio Guidance." Applied Sciences 14, no. 8 (April 10, 2024): 3193. http://dx.doi.org/10.3390/app14083193.

Abstract:
With the flourishing development of generative models, image manipulation is receiving increasing attention. Rather than text modality, several elegant designs have delved into leveraging audio to manipulate images. However, existing methodologies mainly focus on image generation conditional on semantic alignment, ignoring the vivid affective information depicted in the audio. We propose an Emotion-aware StyleGAN Manipulator (EmoStyle), a framework where affective information from audio can be explicitly extracted and further utilized during image manipulation. Specifically, we first leverage the multi-modality model ImageBind for initial cross-modal retrieval between images and music, and select the music-related image for further manipulation. Simultaneously, by extracting sentiment polarity from the lyrics of the audio, we generate an emotionally rich auxiliary music branch to accentuate the affective information. We then leverage pre-trained encoders to encode audio and the audio-related image into the same embedding space. With the aligned embeddings, we manipulate the image via a direct latent optimization method. We conduct objective and subjective evaluations on the generated images, and our results show that our framework is capable of generating images with specified human emotions conveyed in the audio.
6

Guo, Xiaoqiang, Xinhua Liu, Grzegorz Królczyk, Maciej Sulowicz, Adam Glowacz, Paolo Gardoni, and Zhixiong Li. "Damage Detection for Conveyor Belt Surface Based on Conditional Cycle Generative Adversarial Network." Sensors 22, no. 9 (May 3, 2022): 3485. http://dx.doi.org/10.3390/s22093485.

Abstract:
The belt conveyor is an essential piece of equipment in coal mining for coal transportation, and its stable operation is key to efficient production. The belt surface of the conveyor is vulnerable to foreign bodies, which can be extremely destructive. In the past decades, much research and numerous approaches to inspect belt status have been proposed, and machine learning-based non-destructive testing (NDT) methods are becoming more and more popular. Deep learning (DL), as a branch of machine learning (ML), has been widely applied in data mining, natural language processing, pattern recognition, image processing, etc. Generative adversarial networks (GAN) are one of the deep learning methods based on generative models and have been proved to be of great potential. In this paper, a novel multi-classification conditional CycleGAN (MCC-CycleGAN) method is proposed to generate and discriminate surface images of conveyor belt damage. A novel architecture of improved CycleGAN is designed to enhance the classification performance using a limited-capacity image dataset. Experimental results show that the proposed deep learning network can generate realistic belt surface images with defects and efficiently classify different damaged images of the conveyor belt surface.
7

Wang, Jiawei, and Zhen Chen. "Factor-GAN: Enhancing stock price prediction and factor investment with Generative Adversarial Networks." PLOS ONE 19, no. 6 (June 25, 2024): e0306094. http://dx.doi.org/10.1371/journal.pone.0306094.

Abstract:
Deep learning, a pivotal branch of artificial intelligence, has increasingly influenced the financial domain with its advanced data processing capabilities. This paper introduces Factor-GAN, an innovative framework that utilizes Generative Adversarial Networks (GAN) technology for factor investing. Leveraging a comprehensive factor database comprising 70 firm characteristics, Factor-GAN integrates deep learning techniques with the multi-factor pricing model, thereby elevating the precision and stability of investment strategies. To explain the economic mechanisms underlying deep learning, we conduct a subsample analysis of the Chinese stock market. The findings reveal that the deep learning-based pricing model significantly enhances return prediction accuracy and factor investment performance in comparison to linear models. Particularly noteworthy is the superior performance of the long-short portfolio under Factor-GAN, demonstrating an annualized return of 23.52% with a Sharpe ratio of 1.29. During the transition from state-owned enterprises (SOEs) to non-SOEs, our study discerns shifts in factor importance, with liquidity and volatility gaining significance while fundamental indicators diminish. Additionally, A-share listed companies display a heightened emphasis on momentum and growth indicators relative to their dual-listed counterparts. This research holds profound implications for the expansion of explainable artificial intelligence research and the exploration of financial technology applications.
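The abstract above quotes an annualized return together with a Sharpe ratio. For readers unfamiliar with the metric, the following is a minimal sketch of one common way to annualize a Sharpe ratio from periodic returns. It is not the Factor-GAN authors' evaluation code: the function name `annualized_sharpe`, the monthly default of 12 periods per year, and the zero default risk-free rate are assumptions made for illustration.

```python
import math
import statistics


def annualized_sharpe(period_returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualized Sharpe ratio: mean excess return divided by its sample
    standard deviation, scaled by the square root of periods per year."""
    excess = [r - risk_free_per_period for r in period_returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns to estimate volatility")
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility; ratio is undefined")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)
```

The square-root scaling assumes returns are independent across periods, which is a convention rather than a property of any particular strategy.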
8

Ao, Zhuoyu, Weixi Wang, Yaoyu Li, Hongsheng Huang, Xiaoming Li, Renzhong Guo, and Shengjun Tang. "Structured Generation Method of 3D Synthetic Tree Models for Precision Assessment." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLVIII-1-2024 (May 10, 2024): 7–12. http://dx.doi.org/10.5194/isprs-archives-xlviii-1-2024-7-2024.

Abstract:
The technology for 3D reconstruction of tree models based on point clouds has been extensively researched, necessitating effective datasets for the study of branch and leaf separation, skeleton point extraction, and tree parameter extraction methods. However, existing datasets for 3D tree models face several challenges, including insufficient data volume for deep learning network training, low accuracy of model ground truth impeding effective method precision evaluation, and a lack of dataset richness to satisfy the needs of multi-type method assessments. In response to these challenges, this paper introduces, for the first time, a fully automated method for generating structured three-dimensional synthetic tree models, and constructs a large-scale 3D synthetic tree dataset enriched with comprehensive structural information. This method facilitates automated computation across several processes, including the mass generation of simulated trees, separation of branches and leaves, noise generation, extraction of skeleton points, and volume calculation. To validate the usability of this dataset across various applications, this paper employs state-of-the-art (SoTA) algorithms to verify the accuracy of methods in 3D tree model reconstruction and carbon stock calculation, thereby thoroughly demonstrating the dataset’s effectiveness.
9

Mednikov, Aleksandr, Alexey Maksimov, and Elina Tyurina. "Mathematical modeling of mini-CHP based on biomass." E3S Web of Conferences 69 (2018): 02005. http://dx.doi.org/10.1051/e3sconf/20186902005.

Abstract:
One of the promising directions of small-scale distributed power generation for Russia is the use of biomass. The present work is devoted to studies of a mini-CHP based on multi-stage biomass gasification. Mathematical models of the elements and of the mini-CHP as a whole were constructed based on technological schemes. The mathematical models were constructed with the software developed at Melentiev Energy Systems Institute of Siberian Branch of the Russian Academy of Sciences. The calculations were made for two sizes of internal combustion engines. Thus, we obtained the values of flow rates and temperatures of heat carriers at various points of the flow charts of the plants.
10

Rebuffel, Clement, Marco Roberti, Laure Soulier, Geoffrey Scoutheeten, Rossella Cancelliere, and Patrick Gallinari. "Controlling hallucinations at word level in data-to-text generation." Data Mining and Knowledge Discovery 36, no. 1 (October 22, 2021): 318–54. http://dx.doi.org/10.1007/s10618-021-00801-4.

Abstract:
Data-to-Text Generation (DTG) is a subfield of Natural Language Generation aiming at transcribing structured data in natural language descriptions. The field has been recently boosted by the use of neural-based generators which exhibit on one side great syntactic skills without the need of hand-crafted pipelines; on the other side, the quality of the generated text reflects the quality of the training data, which in realistic settings only offer imperfectly aligned structure-text pairs. Consequently, state-of-the-art neural models include misleading statements (usually called hallucinations) in their outputs. The control of this phenomenon is today a major challenge for DTG, and is the problem addressed in the paper. Previous work deals with this issue at the instance level, using an alignment score for each table-reference pair. In contrast, we propose a finer-grained approach, arguing that hallucinations should rather be treated at the word level. Specifically, we propose a Multi-Branch Decoder which is able to leverage word-level labels to learn the relevant parts of each training instance. These labels are obtained following a simple and efficient scoring procedure based on co-occurrence analysis and dependency parsing. Extensive evaluations, via automated metrics and human judgment on the standard WikiBio benchmark, show the accuracy of our alignment labels and the effectiveness of the proposed Multi-Branch Decoder. Our model is able to reduce and control hallucinations, while keeping fluency and coherence in generated texts. Further experiments on a degraded version of ToTTo show that our model could be successfully used on very noisy settings.

Dissertations / Theses on the topic "Multi-Branch generative models"

1

Pinton, Noel Jeffrey. "Reconstruction synergique TEP/TDM à l'aide de l'apprentissage profond." Electronic Thesis or Diss., Brest, 2024. http://www.theses.fr/2024BRES0123.

Abstract:
L’adoption généralisée des scanners hybrides Tomographie à émission de positons (TEP)/Tomodensitométrie (TDM) a conduit à une augmentation significative de la disponibilité des données d’imagerie combinées TEP/TDM. Cependant, les méthodologies actuelles traitent souvent chaque modalité de manière indépendante, négligeant ainsi le potentiel d’amélioration de la qualité des images grâce à l’exploitation des informations anatomiques et fonctionnelles complémentaires propres à chaque modalité. Exploiter ces informations intermodales pourrait améliorer les reconstructions TEP et TDM en fournissant une vision synergique des détails anatomiques et fonctionnels. Cette thèse propose une méthode innovante de reconstruction synergique d’images médicales via des modèles génératifs multibranches. En exploitant des autoencodeurs variationnels (VAE) multi-branches, notre approche apprend conjointement des images TEP et TDM, assurant un débruitage efficace et une reconstruction haute-fidélité. Ce cadre améliore la qualité des images et ouvre de nouvelles perspectives pour l’imagerie médicale multimodale en contexte clinique et de recherche
The widespread adoption of hybrid positron emission tomography (PET)/computed tomography (CT) scanners has led to a significant increase in the availability of combined PET/CT imaging data. However, current methodologies often process each modality independently, overlooking the potential to enhance image quality by leveraging the complementary anatomical and functional information intrinsic to each modality. Exploiting intermodal information has the potential to improve both PET and CT reconstructions by providing a synergistic view of anatomical and functional details. This thesis introduces a novel approach for synergistic reconstruction of medical images using multi-branch generative models. By employing variational autoencoders (VAEs) with a multi-branch architecture, our model simultaneously learns from paired PET and CT images, allowing for effective joint denoising and high-fidelity reconstruction of both modalities. Beyond improving image quality, this framework also paves the way for future advancements in multi-modal medical imaging, highlighting the transformative potential of integrated approaches for hybrid imaging modalities in clinical and research settings.

Book chapters on the topic "Multi-Branch generative models"

1

He, Xiaoxu, and Mingyu Sun. "Biomimetic Form-Finding Study of Bone Needle Microstructure Based on Sponge Regeneration Behavior." In Computational Design and Robotic Fabrication, 90–101. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-99-8405-3_8.

Abstract:
The concept of “nature-algorithm-structure” refers to a digital design method in architecture that draws inspiration from nature, extracting its mathematical and physical conceptual models to construct structural systems with parameters. This study aims to address the challenge of parametric form-finding in reticular tension structures. By observing the phenomenon of “sponge regeneration”, we further illustrate the generation and optimization of reticular tension structures through the hierarchical structures of “monomer”-“path”-“mesh”. Tensile structural systems are rebound forms, and their analytical models must account for their nonlinear characteristics and the existence of equilibrium self-course. Starting from the growth dynamics of “sponge regeneration behavior”, this paper extracts the logic behind it: sponge monomers combine randomly into partial units under the condition of shredding and discrete, forming a single organism through aggregation. The multi-dimensional bone needle serves as a structural component, enabling multi-axis reorganization, while the multi-directional mesh surface as a morphological component realizes multi-branch reproduction, forming a natural “network tension structure”. This study focuses on the biomimetic form-finding of bone needle microstructure, drawing inspiration from sponge regeneration behavior. By analyzing the growth dynamics of sponge regeneration, we aim to develop a better understanding of the principles behind the formation of bone needle microstructure. This finding provides significant reference for the development of modern structures and promotes the bioshape and optimization of tensile structures.

Conference papers on the topic "Multi-Branch generative models"

1

Ling, Zeyu, Bo Han, Yongkang Wong, Han Lin, Mohan Kankanhalli, and Weidong Geng. "MCM: Multi-condition Motion Synthesis Framework." In Thirty-Third International Joint Conference on Artificial Intelligence {IJCAI-24}. California: International Joint Conferences on Artificial Intelligence Organization, 2024. http://dx.doi.org/10.24963/ijcai.2024/120.

Abstract:
Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions. Text and audio represent the two predominant modalities employed as HMS control conditions. While existing research has primarily focused on single conditions, the multi-condition human motion synthesis remains underexplored. In this study, we propose a multi-condition HMS framework, termed MCM, based on a dual-branch structure composed of a main branch and a control branch. This framework effectively extends the applicability of the diffusion model, which is initially predicated solely on textual conditions, to auditory conditions. This extension encompasses both music-to-dance and co-speech HMS while preserving the intrinsic quality of motion and the capabilities for semantic association inherent in the original model. Furthermore, we propose the implementation of a Transformer-based diffusion model, designated as MWNet, as the main branch. This model adeptly apprehends the spatial intricacies and inter-joint correlations inherent in motion sequences, facilitated by the integration of multi-wise self-attention modules. Extensive experiments show that our method achieves competitive results in single-condition and multi-condition HMS tasks.
2

Li, Yu-Lei. "Unsupervised Embedding and Association Network for Multi-Object Tracking." In Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}. California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/157.

Abstract:
How to generate robust trajectories of multiple objects without using any manual identity annotation? Recently, identity embedding features from Re-ID models are adopted to associate targets into trajectories. However, most previous methods equipped with embedding features heavily rely on manual identity annotations, which bring a high cost for the multi-object tracking (MOT) task. To address the above problem, we present an unsupervised embedding and association network (UEANet) for learning discriminative embedding features with pseudo identity labels. Specifically, we firstly generate the pseudo identity labels by adopting a Kalman filter tracker to associate multiple targets into trajectories and assign a unique identity label to each trajectory. Secondly, we train the transformer-based identity embedding branch and MLP-based data association branch of UEANet with these pseudo labels, and UEANet extracts branch-dependent features for the unsupervised MOT task. Experimental results show that UEANet confirms the outstanding ability to suppress IDS and achieves comparable performance compared with state-of-the-art methods on three MOT datasets.
3

Urata, Kazuya, Ryo Tsumoto, Kentaro Yaji, and Kikuo Fujita. "Multi-Stage Optimal Design for Turbulent Pipe Systems by Data-Driven Morphological Exploration and Evolutionary Shape Optimization." In ASME 2024 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2024. http://dx.doi.org/10.1115/detc2024-143383.

Abstract:
Turbulence fields typically cause cumbersome multimodality in their solution space and the optimal shape of flow paths could be strongly dependent on a huge number of branch patterns. These features hinder gradient-based structural optimization frameworks from finding promising solutions for turbulent pipe systems. In this paper, we propose a multi-stage framework that integrates data-driven morphological exploration and evolutionary shape optimization to address the challenges posed by the complexity of turbulent pipe systems. Our framework begins with data-driven morphological exploration, aiming to find promising topologies as well as shapes for selecting a reasonable number of candidates for the next shape refinement stage. Herein, we employ data-driven topology design, a gradient-free and multiobjective optimization methodology incorporating a deep generative model and the concept of evolutionary algorithms to generate promising arrangements. Subsequently, representative shapes are extracted through a deep clustering strategy. The final stage involves refining these shapes through shape optimization using a genetic algorithm. Focusing on a two-dimensional turbulent pipe system with a min/max objective, our numerical results show the effectiveness of the proposed framework in delivering high-performance solutions for the turbulent flow optimization problem with branching.
4

Gijrath, Hans, and Mats Åbom. "A Matrix Formalism for Fluid-Borne Sound in Pipe Systems." In ASME 2002 International Mechanical Engineering Congress and Exposition. ASMEDC, 2002. http://dx.doi.org/10.1115/imece2002-33356.

Abstract:
In this paper a general matrix formalism for predicting fluid-borne sound in gas filled pipe systems of arbitrary geometry is presented. Based on the formalism, a code, valid from the low frequency plane wave range up to frequencies where a large number of modes propagate in each pipe, has been developed. The formalism is based on representing the pipe system as an equivalent network of acoustical 2-ports, where each 2-port corresponds to a physical pipe element. Interfaces or branch points between N (≥ 2) pipes in the physical system are represented as node points, which are modelled as multi-ports of order N. For the low frequency range the ports of the equivalent network are defined using travelling pressure wave amplitudes as the state variables. This gives a so-called scattering-matrix formalism that has been described earlier in the literature. For the high frequency multi-mode range it is demonstrated that the same formalism still holds if the state variables are defined via acoustic power. Furthermore, compared to the standard power flow models used today, e.g., the VDI 3733 standard, the suggested matrix formulation can also include the effect of reflections. To enable modelling both of sound generation from fluid machines (fans, compressors,…) and flow generated sound, e.g., from flow separation at constrictions (valves) and bends, both the 2-ports and multi-ports are allowed to be active. In the first version of the code semi-empirical models for flow generated sound from, e.g., valves have been included.
5

Guo, Hang, Tao Dai, Guanghao Meng, and Shu-Tao Xia. "Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/87.

Abstract:
Scene text image super-resolution (STISR), aiming to improve image quality while boosting downstream scene text recognition accuracy, has recently achieved great success. However, most existing methods treat the foreground (character regions) and background (non-character regions) equally in the forward process, and neglect the disturbance from the complex background, thus limiting the performance. To address these issues, in this paper, we propose a novel method LEMMA that explicitly models character regions to produce high-level text-specific guidance for super-resolution. To model the location of characters effectively, we propose the location enhancement module to extract character region features based on the attention map sequence. Besides, we propose the multi-modal alignment module to perform bidirectional visual-semantic alignment to generate high-quality prior guidance, which is then incorporated into the super-resolution branch in an adaptive manner using the proposed adaptive fusion module. Experiments on TextZoom and four scene text recognition benchmarks demonstrate the superiority of our method over other state-of-the-art methods. Code is available at https://github.com/csguoh/LEMMA.
6

Wu, Tong, Bicheng Dai, Shuxin Chen, Yanyun Qu, and Yuan Xie. "Meta Segmentation Network for Ultra-Resolution Medical Images." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/76.

Abstract:
Despite recent great progress in semantic segmentation, huge challenges remain in ultra-resolution medical image segmentation. Methods based on a multi-branch structure can strike a good balance between computational burden and segmentation accuracy. However, the fusion structure in these methods must be elaborately designed to achieve desirable results, which leads to model redundancy. In this paper, we propose Meta Segmentation Network (MSN) to solve this challenging problem. With the help of meta-learning, the fusion module of MSN is quite simple but effective. MSN can quickly generate the weights of fusion layers through a simple meta-learner, requiring only a few training samples and epochs to converge. In addition, to avoid learning all branches from scratch, we further introduce a particular weight sharing mechanism to realize fast knowledge adaptation and share the weights among multiple branches, resulting in improved performance and a significant reduction in parameters. The experimental results on two challenging ultra-resolution medical datasets, BACH and ISIC, show that MSN achieves the best performance compared with state-of-the-art approaches.
7

Erol, Anil, Saad Ahmed, Paris von Lockette, and Zoubeida Ounaies. "Analysis of Microstructure-Based Network Models for the Nonlinear Electrostriction Modeling of Electro-Active Polymers." In ASME 2017 Conference on Smart Materials, Adaptive Structures and Intelligent Systems. American Society of Mechanical Engineers, 2017. http://dx.doi.org/10.1115/smasis2017-3979.

Abstract:
Relaxor ferroelectric polymers are a unique branch of electro-active polymers (EAPs) that generate high electromechanical strain with relatively low hysteresis and high nonlinearity. Polyvinylidene fluoride-based EAPs possess these qualities due to the semicrystalline nature of their microstructure. The interactions of electric dipoles within the microstructure of the material generate large strains under an external electric field, and the reduced crystalline domain sizes yield a relaxor effect by exhibiting low hysteresis and hyperelastic properties. This phenomenon has been partially modeled by previous works, but micro-electro-mechanisms for electrostriction in the microstructure have been largely ignored. This study focuses on the effects of various microstructural frameworks on the nonlinear dielectric behavior of dipole-based, semicrystalline EAPs. The Helmholtz free energy function of a microscopic representative volume element (RVE) is composed of an electrostatic energy and an elastic energy. The dipole-dipole interaction energy is prescribed for the electrostatic forces observed among the crystalline regions, and the elastic component attributed to the relaxation of the amorphous phase is modeled by the hyperelastic eight-chain model, which is microstructure-based. The RVE of the system is modeled by a central dipole surrounded by dipoles whose relative spatial locations are determined by a probability distribution function (PDF). The hyperelastic amorphous phase constitutes the volume separating the central and surrounding dipoles. The free energy of the RVE is implemented into a continuum description of the equilibrium of the system to obtain electromechanical relations. Additionally, this electromechanical response data is applied to a 1D structural mechanics model for simulating the large deformation of a multi-layered beam. 
The effects of microstructure on electrostrictive coupling are explored by varying the centers and deviations of dipole locations within the PDF. Discrete microstructural arrangements representing 3-chain network averaging schemes may be studied alongside more continuous ellipsoidal or random models of dipole spatial arrangements. The simulation results of the PDF-based networks are in good agreement with experimental data. The results indicate that the electrostrictive behavior of EAPs is strongly dependent on (1) the relative dipole spatial locations and (2) the extent of the regions containing dipoles, which represent crystalline domains. The model finds that adding extra crystalline domains in the network averaging schemes generates a better characteristic behavior due to a broader averaging of spatial orientations. These results offer a gateway to predicting microstructurally-dependent dipole-based behavior that can lead to the predictive theoretical tailoring of microstructures for desired electromechanical properties.