Academic literature on the topic "End-to-end multimodal modelling"

Author: Grafiati

Published: 20 July 2024

Contents

Journal articles
Theses
Book chapters

Consult the topical lists of journal articles, books, theses, conference reports, and other academic sources on the topic "End-to-end multimodal modelling".

Journal articles on the topic "End-to-end multimodal modelling"

1

Bazaras, Darius, and Ramūnas Palšaitis. "MULTIMODAL APPROACH TO THE INTERNATIONAL TRANSIT TRANSPORT." TRANSPORT 18, no. 6 (31 December 2003): 248–54. http://dx.doi.org/10.3846/16483840.2003.10414106.

Abstract:

The article analyses the problems of multimodal and intermodal transport in Lithuania, the concept of transit, and the factors that stimulate the transit system; it also presents models of transit transport and freight flows. The main part of the article analyses the present situation in Lithuania: the place of the transport sector in the market for transit services is examined, and the profit from transit for the Lithuanian economy is evaluated. Conclusions and proposals are given at the end of the article.

2

Crespi, Pietro, Alberto Franchi, and Nicola Giordano. "Multimodal Pushover Analysis for R.C. Bridges." Applied Mechanics and Materials 725-726 (January 2015): 888–95. http://dx.doi.org/10.4028/www.scientific.net/amm.725-726.888.

Abstract:

In recent years, the Italian technical and scientific community has taken increasing interest in evaluating the seismic response of existing structures. Among this wide range of structures, reinforced concrete bridges stand out for their strategic relevance and technical complexity. Most of these structures were built in the 1960s and 1970s, according to design procedures that did not reflect current knowledge in seismic engineering. It has therefore become essential to evaluate the real strength capacity of these structures with modern analysis techniques, so as to determine their safety level in case of an earthquake. In particular, for the assessment of several bridges of a motorway network, a multimodal pushover analysis approach has been considered. This analysis technique accounts for the nonlinear behaviour and the complex dynamic response of such structures without incurring high computational costs. Some basic rules were defined (constitutive laws of materials, finite element type, plastic hinge models, etc.) for the modelling of bridges, in order to guarantee homogeneous, comparable results among different structures of a network. Finally, some of the results are compared to show how the verification level varies with both the number of modes considered and the accuracy of the analysis in terms of the number of loading steps.

3

Dietze, E., F. Maussion, M. Ahlborn, B. Diekmann, K. Hartmann, K. Henkel, T. Kasper, G. Lockot, S. Opitz, and T. Haberzettl. "Sediment transport processes across the Tibetan Plateau inferred from robust grain-size end members in lake sediments." Climate of the Past 10, no. 1 (16 January 2014): 91–106. http://dx.doi.org/10.5194/cp-10-91-2014.

Abstract:

Grain-size distributions offer powerful proxies of past environmental conditions that are related to sediment sorting processes. However, they are often of multimodal character because sediments can get mixed during deposition. To facilitate the use of grain size as a palaeoenvironmental proxy, this study aims to distinguish the main detrital processes that contribute to lacustrine sedimentation across the Tibetan Plateau using grain-size end-member modelling analysis. Between three and five robust grain-size end-member subpopulations were distinguished at different sites from similarly likely end-member model runs. Their main modes were grouped and linked to common sediment transport and depositional processes that can be associated with contemporary Tibetan climate (precipitation patterns and lake ice phenology, gridded wind and shear stress data from the High Asia Reanalysis) and local catchment configurations. The coarse sands and clays with grain-size modes >250 μm and <2 μm were probably transported by fluvial processes. Aeolian sands (~200 μm) and coarse local dust (~60 μm), transported by saltation and in near-surface suspension clouds, are probably related to occasional westerly storms in winter and spring. Coarse regional dust with modes ~25 μm may derive from nearby sources and stay in longer-term suspension. The continuous background dust is differentiated into two robust end members (modes: 5–10 and 2–5 μm) that may represent different sources, wind directions and/or sediment trapping dynamics from long-range, upper-level westerly and episodic northerly wind transport. According to this study, grain-size end members of only fluvial origin contribute small amounts to mean Tibetan lake sedimentation (19 ± 5%), whereas local to regional aeolian transport and background dust deposition dominate the clastic sedimentation in Tibetan lakes (contributions: 42 ± 14% and 51 ± 11%). However, fluvial and alluvial reworking of aeolian material from nearby slopes during summer seems to limit end-member interpretation and should be crosschecked with other proxy information. If not considered as a stand-alone proxy, a high transferability to other regions and sediment archives allows helpful reconstructions of past sedimentation history.

4

Dietze, E., F. Maussion, M. Ahlborn, B. Diekmann, K. Hartmann, K. Henkel, T. Kasper, G. Lockot, S. Opitz, and T. Haberzettl. "Sediment transport processes across the Tibetan Plateau inferred from robust grain size end-members in lake sediments." Climate of the Past Discussions 9, no. 4 (21 August 2013): 4855–92. http://dx.doi.org/10.5194/cpd-9-4855-2013.

Abstract:

Grain size distributions offer powerful proxies of past environmental conditions that are related to sediment sorting processes. However, they are often of multimodal character because sediments can get mixed during deposition. To facilitate the use of grain size as a palaeoenvironmental proxy, this study aims to distinguish the main detrital processes that contribute to lacustrine sedimentation across the Tibetan Plateau using grain size end-member modelling analysis. Between three and five robust grain size end-member subpopulations were distinguished at different sites from similarly likely end-member model runs. Their main modes were grouped and linked to sediment transport and depositional processes associated with certain climatic background and catchment configurations. The coarse sands and clays with grain size modes >250 μm and <2 μm were probably transported by fluvial processes. Aeolian sands (~200 μm) and coarse local dust (~60 μm), transported by saltation and in near-surface suspension clouds, are probably related to occasional westerly storms in winter and spring. Coarse regional dust with modes ~25 μm may derive from nearby sources and stay in longer-term suspension. The continuous background dust is differentiated into two robust end-members (modes: 5–10 and 2–5 μm) that may represent different sources, wind directions and/or sediment trapping dynamics from long-range, upper-level westerly and episodic northerly wind transport. According to this study, grain size end-members of only fluvial origin contribute small amounts to mean Tibetan lake sedimentation (19 ± 5%), whereas local to regional aeolian transport and background dust deposition dominate the clastic sedimentation in Tibetan lakes (contributions: 42 ± 14% and 51 ± 11%). However, fluvial and alluvial reworking of aeolian material from nearby slopes during summer seems to limit end-member interpretation and should be crosschecked with other proxy information. If not considered as a stand-alone proxy, a high transferability to other regions and sediment archives allows helpful reconstructions of past sedimentation history.

5

Riva, Marco, Patrick Hiepe, Mona Frommert, Ignazio Divenuto, Lorenzo G. Gay, Tommaso Sciortino, Marco Conti Nibali, Marco Rossi, Federico Pessina, and Lorenzo Bello. "Intraoperative Computed Tomography and Finite Element Modelling for Multimodal Image Fusion in Brain Surgery." Operative Neurosurgery 18, no. 5 (25 July 2019): 531–41. http://dx.doi.org/10.1093/ons/opz196.

Abstract:

BACKGROUND: Intraoperative computer tomography (iCT) and advanced image fusion algorithms could improve the management of brainshift and the navigation accuracy. OBJECTIVE: To evaluate the performance of an iCT-based fusion algorithm using clinical data. METHODS: Ten patients with brain tumors were enrolled; preoperative MRI was acquired. The iCT was applied at the end of microsurgical resection. Elastic image fusion of the preoperative MRI to iCT data was performed by deformable fusion employing a biomechanical simulation based on a finite element model. Fusion accuracy was evaluated: the target registration error (TRE, mm) was measured for rigid and elastic fusion (Rf and Ef) and anatomical landmark pairs were divided into test and control structures according to distinct involvement by the brainshift. Intraoperative points describing the stereotactic position of the brain were also acquired and a qualitative evaluation of the adaptive morphing of the preoperative MRI was performed by 5 observers. RESULTS: The mean TRE for control and test structures with Rf was 1.81 ± 1.52 and 5.53 ± 2.46 mm, respectively. No significant change was observed applying Ef to control structures; the test structures showed reduced TRE values of 3.34 ± 2.10 mm after Ef (P < .001). A 32% average gain (range 9%-54%) in accuracy of image registration was recorded. The morphed MRI showed robust matching with iCT scans and intraoperative stereotactic points. CONCLUSIONS: The evaluated method increased the registration accuracy of preoperative MRI and iCT data. The iCT-based non-linear morphing of the preoperative MRI can potentially enhance the consistency of neuronavigation intraoperatively.

6

Li, Zhigang, Aimei Dong, and Jing Zhou. "Research of Low-Rank Representation and Discriminant Correlation Analysis for Alzheimer’s Disease Diagnosis." Computational and Mathematical Methods in Medicine 2020 (19 March 2020): 1–8. http://dx.doi.org/10.1155/2020/5294840.

Abstract:

As population aging becomes more common worldwide, applying artificial intelligence to the diagnosis of Alzheimer’s disease (AD) has become critical to improving diagnostic quality. In early diagnosis of AD, fusing the complementary information contained in multimodality data (e.g., magnetic resonance imaging (MRI), positron emission tomography (PET), and cerebrospinal fluid (CSF)) has achieved considerable success. Detecting Alzheimer’s disease using multimodality data poses two difficulties: (1) multimodal data contain noise; (2) an effective mathematical model of the relationship between the modalities must be established. To this end, we propose a method named LDF, which combines low-rank representation and discriminant correlation analysis (DCA) to fuse multimodal datasets. Specifically, the low-rank representation method is used to extract the latent features of the submodal data, removing the noise in the submodal data. Then, discriminant correlation analysis is used to fuse the submodal data, so that the complementary information can be fully utilized. The experimental results indicate the effectiveness of this method.

7

Abourraja, Mohamed Nezar, Mustapha Oudani, Mohamed Yassine Samiri, Jaouad Boukachour, Abdelaziz Elfazziki, Abdelhadi Bouain, and Mehdi Najib. "An improving agent-based engineering strategy for minimizing unproductive situations of cranes in a rail–rail transshipment yard." SIMULATION 94, no. 8 (6 October 2017): 681–705. http://dx.doi.org/10.1177/0037549717733050.

Abstract:

Nowadays, seaports seek to achieve a better massification (massive transportation of containers) share of their hinterland transport by promoting rail and river connections in order to more rapidly evacuate increasing container traffic shipped by sea and to avoid landside congestion. The attractiveness of a seaport to shipping enterprises depends not only on its reliability and nautical qualities but also on its massified hinterland connection capacity. Contrary to what has been observed in Europe, the massification share of Le Havre seaport has stagnated in recent years. To overcome this situation, Le Havre Port Authority is putting into service a multimodal hub terminal linked only with massified modes. In this study, we focus on rail–rail transshipment of this new terminal, specifically on minimizing unproductive situations of cranes to improve crane productivity and to speed up freight train processing. To this end, an improving agent-based engineering strategy called the “crane anti-collision strategy” is proposed and tested using multi-method simulation software (Anylogic). In a numerical study, the simulation results reveal that our developed model is very satisfactory and outperforms other existing simulation models.

8

Prokofieva, E. S., and V. V. Panin. "UNIFORM PRINCIPLES OF ORGANIZATION OF RAIL FREIGHT TRANSPORTATION OPERATIONS." World of Transport and Transportation 17, no. 5 (7 June 2020): 186–98. http://dx.doi.org/10.30932/1992-3252-2019-17-5-186-198.

Abstract:

Improving the operational efficiency of transport is one of the main catalysts for raising the quality of life and for socio-economic development. The prevailing global trends are aimed at establishing logistic chains from producers to consumers. The cross-border movement of world production within the framework of globalization and the deepening of integration processes between countries highlight the need for a methodology for planning and evaluating international transport corridors from various points of view, including economic, technical, social, environmental and other aspects. An urgent issue is the harmonization of the technical and regulatory framework for organizing "seamless" transportation and the synchronization of the technological cycles of various modes of transport, including the operation of multimodal transport hubs. A large number of studies are dedicated to the modelling and control of traffic flows, new methods of forecasting the transport situation using the results of computer dynamic transport modelling, analysis of existing infrastructure resources and elimination of bottlenecks in the existing railway system (including through the expansion of the railway network and innovative methods of traffic control). Since the target efficiency parameters for the production activity of all participants in railway transportation can be achieved through qualitative changes in the organization of the transportation process, the objective of the research was to study technological aspects of improving the quality of functioning of the transport system in relation to the activity of the Russian railways. The research applied methods of system analysis of scientific and technical data and of technical and economic indicators of complex systems. The issues of implementing end-to-end principles of transportation process management are considered, and the concept of the "technological polygon of transportation process management" is expanded. The analysis and grouping of methods for improving the efficiency of railway transport was carried out with regard to improving the operation quality indicators and the refined model of rolling stock management, increasing the efficiency of the use of traction equipment, profiling of railway itineraries in respect of the preferred types of traffic, establishing sections well-provided for safe passage of freight trains and synchronized with the operation ranges of locomotives, organizing freightage in wagons with increased axle load, removing barriers and restrictions in the power supply sector, etc. The transition to the "polygon" model of organizing the technological process, the unification and optimization of the use of available resources and, as a result, the reduction of risks of losses from transportation, service provision and unplanned use of resources are effective tools for improving the quality of rail operations.

9

Horoshkova, Lidiia, Olena Vasyl’yeva, Oksana Maslova, and Alexander Sumets. "River logistics amid war and post-war recovery in Ukraine: current situation and prospects." University Economic Bulletin, no. 56 (31 March 2023): 113–25. http://dx.doi.org/10.31470/2306-546x-2023-56-113-125.

Abstract:

Relevance of the research topic. The recovery of Ukraine, overcoming the consequences of hostilities on its territory, and resolving national security and defense capability issues are pressing today. Logistical and infrastructure problems can be addressed through inland water transport: modernizing its infrastructure, expanding the network of river ports and raising the efficiency of their facilities, encouraging private investment, and stimulating the development of inland water transport. Problem statement. There is a need to build a logistics management system for navigable inland waterways in the south of Ukraine, namely the "Danube - Black Sea" route (Bystre rivermouth), to expand the use of the Reni, Izmail and Ust-Dunaysk ports. Analysis of recent research and publications. An analysis of key publications reveals that river logistics development is considered by many researchers. The works of Krčum, M., Plazibat V., Gorana J. M., Wójcikiewicza R., Kaupb M., Nowakowski T., Kulczyk J., Skupień E. & Tubis A., Kolář J. & Stopka, O., Krile S. focus on solving particular national problems, for example: the Croatian transport system and ways of integrating sea and river ports, the peculiarities of inland water transport in Szczecin (Poland), the determinants of the transformation of Lower Vistula river ports, the location of a port's multimodal logistics center on the Labe (Elbe) River, the development of the inland water transport market, and the strengthening of European cooperation in the field of inland water transport. Unsolved parts of the general problem. Given the importance for Ukraine of reviving shipping on the Danube River, it is advisable to study the waterways and the possibilities for increasing the efficiency of the Izmail, Reni and Ust-Dunaysk seaports. Study task and objective. The following scientific tasks are addressed with a view to their practical implementation: analysis of the current position and dynamics of Izmail port's development; predictive modelling of opportunities and conditions for improving its performance while solving wartime transport problems and in the context of the post-war recovery of Ukraine. Research method and methodology. General scientific methods (analysis and synthesis, induction and deduction, group analysis) and special methods (abstraction, modelling, etc.) for studying economic phenomena and processes were used. The main material (study results). Ukraine is among the largest European countries with strong potential for the development of inland waterways and river transport. Nevertheless, the performance of the river freight and passenger transport industry has been gradually deteriorating for more than twenty years. When the war started, the maritime territory was occupied, and a significant part of the country's infrastructure (roads, bridges, railways) was destroyed, river transport supported the transport logistics needs not only of Ukraine but also of other countries. Intensified navigation on the Danube (Bystre rivermouth) and the activity of the Reni, Izmail and Ust-Dunaysk sea ports played an important role in this. Analysis of the performance of SE "Izmail Commercial Sea Port" revealed lower port efficiency in recent years. Significant positive changes took place in 2022. The cargo handling volume of Izmail port was 3.84 million tons in 2021 and 8.89 million tons in 2022. Thus, according to 2022 results, Izmail port exceeded 2021 indicators by 218%. The total cargo handling volume of Izmail sea port in February 2023 was 1 million 345 thousand tons, i.e. 24% more than in January 2023. Today Izmail port exceeds the plan by 274%. The dredging works in Izmail and Reni ports contributed to a significant improvement in the indicators of the Danube region ports. A forecast of the main activity indicators of Izmail sea port until 2026 was made on the basis of its activity data. The positive indicators obtained in 2022 were taken into account when forecasting the port's performance indicators. A further increase in the cargo turnover of SE "Izmail Commercial Sea Port" will be facilitated by the planned feeder container ship service (March 2023), which has operated since the end of 2022 between the Romanian Port Constanța and the Ukrainian Reni port. The feeder container ship service based on the facilities of SE "Izmail Commercial Sea Port" and Port Constanța will handle containers of Arkas, ZIM, Maersk, Hapag-Lloyd, SMA and other lines. The containers will travel in transshipment mode, that is, there will be no need to issue transit customs documents at Port Constanța. Conclusions. The study shows that Ukraine, which has strong potential for the development of inland waterways and river transport, has recently not made sufficient use of it. As a result, the performance of the river freight and passenger transport industry has gradually deteriorated. It has been shown that amid the war in 2022, the performance indicators of the Danube ports improved significantly. The war induced this cargo transportation, as the operation of a number of Ukrainian ports was made impossible. The analysis of the current position and efficiency of SE "Izmail Commercial Sea Port" demonstrated that its performance indicators can be improved. The forecast modelling of its activity until 2026 determined the expected volume of freight transportation and the structure of the freight flow. The expediency and effectiveness of a feeder container ship service on the route SE "Izmail Commercial Sea Port" - Port Constanța have been substantiated.

10

Yurttas, Can, Oliver M. Fisher, Delia Cortés-Guiral, Sebastian P. Haen, Ingmar Königsrainer, Alfred Königsrainer, Stefan Beckert, Winston Liauw, and Markus W. Löffler. "Cytoreductive surgery and HIPEC in colorectal cancer—did we get hold of the wrong end of the stick?" memo - Magazine of European Medical Oncology 13, no. 4 (20 October 2020): 434–39. http://dx.doi.org/10.1007/s12254-020-00653-6.

Abstract:

Summary: Cytoreductive surgery (CRS) and hyperthermic intraperitoneal chemotherapy (HIPEC) are a multimodal treatment approach combining surgical interventions of varying extent with administration of heated cytostatic drugs flushed through the abdominal cavity. Hitherto, this treatment has been popular for peritoneal metastasis (PM), e.g. from colorectal cancer (CRC). Recent randomized controlled trials (RCT) question the benefit of HIPEC in its present form for CRC treatment and raise fundamental issues, eliciting discussions and expert statements regarding HIPEC relevance and interpretation of these results. Unfortunately, such discussions have to remain uninformed, due to the lack of publication of crucial peer-reviewed RCT results. Novel basic research aware of HIPEC futility suggests there may be systematic limitations. Innovative modelling approaches for HIPEC may shed light on the reasons for therapeutic failure of frequently used drugs and may lead the way to select better alternatives and/or more rational approaches for the design of HIPEC procedures (e.g. regarding exposure time or temperature). Available evidence strongly supports the notion that CRS is the mainstay for the treatment effects observed in PM from CRC. Unfortunately, HIPEC has become a surrogate for surgical expertise in the field and optimal surgery may therefore outweigh the potentially harmful effects of HIPEC treatment, particularly in lieu of modern systemic chemotherapies. The current situation, which is frequently assumed to be deadlocked, should be regarded as a challenge to investigate HIPEC with well-designed prospective clinical trials, potentially even constituting an opportunity for introducing innovative trial designs that solve the multifaceted issues of a very heterogeneous treatment approach.

Theses on the topic "End-to-end multimodal modelling"

1

Labbé, Etienne. "Description automatique des événements sonores par des méthodes d'apprentissage profond." Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES054.

Abstract:

In the audio research field, the majority of machine learning systems focus on recognizing a limited number of sound events. However, when a machine interacts with real data, it must be able to handle much more varied and complex situations. To tackle this problem, annotators use natural language, which allows any sound information to be summarized. Automated Audio Captioning (AAC) was introduced recently to develop systems capable of automatically producing a text description of any type of sound. This task concerns all kinds of sound events, such as environmental, urban and domestic sounds, sound effects, music or speech. This type of system could be used by people who are deaf or hard of hearing, and could improve the indexing of large audio databases. In the first part of this thesis, we present the state of the art of the AAC task through a global description of public datasets, learning methods, architectures and evaluation metrics. Using this knowledge, we then present the architecture of our first AAC system, which obtains encouraging scores on the main AAC metric, SPIDEr: 24.7% on the Clotho corpus and 40.1% on the AudioCaps corpus. In the second part, we explore many aspects of AAC systems. We first focus on evaluation methods through the study of SPIDEr. For this, we propose a variant called SPIDEr-max, which considers several candidates for each audio file, and which shows that the SPIDEr metric is very sensitive to the predicted words. Then, we improve our reference system by exploring different architectures and numerous hyperparameters to exceed the state of the art on AudioCaps (SPIDEr of 49.5%). Next, we explore a multi-task learning method aimed at improving the semantics of the sentences generated by our system. Finally, we build a general and unbiased AAC system called CONETTE, which can generate different types of descriptions that approximate those of the target datasets. In the third and last part, we study the ability of an AAC system to automatically search for audio content in a database. Our approach obtains scores comparable to those of systems dedicated to this task, while using fewer parameters. We also introduce semi-supervised methods to improve our system using new unlabeled audio data, and we show how pseudo-label generation can affect an AAC model. Finally, we study AAC systems in languages other than English: French, Spanish and German. In addition, we propose a system capable of producing captions in all four languages at the same time, and we compare it with systems specialized in each language.

Book chapters on the topic "End-to-end multimodal modelling"

1

Huseyinov, Ilham N. "Fuzzy Linguistic Modelling in Multi Modal Human Computer Interaction." In Speech, Image, and Language Processing for Human Computer Interaction, 64–79. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-4666-0954-9.ch004.

Abstract:

The purpose of this chapter is to explore fuzzy logic based methodology for computing an adaptive interface in an environment of imperfect, vague, multimodal, complex nonlinear hyper information space. To this end, based on fuzzy linguistic modelling and fuzzy multi level granulation an adaptation strategy to cognitive/learning styles is presented. The granulated fuzzy if-then rules are utilized to adaptively map cognitive/learning styles of users to their information navigation and presentation preferences through natural language expressions. The important implications of this approach are that, first, uncertain and vague information is handled; second, a mechanism for approximate adaptation at a variety of granulation levels is provided; third, a qualitative linguistic model of adaptation is presented. The proposed approach is close to human reasoning and thereby lowers the cost of solution, and facilitates the design of human computer interaction systems with high level intelligence capability.
