Ready-made bibliography on the topic "Cross-modal document classification"
Create correct references in APA, MLA, Chicago, Harvard, and many other citation styles
Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Cross-modal document classification".
An "Add to bibliography" button is available next to each work in the list. Use it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of a scholarly publication in ".pdf" format and read its abstract online, whenever these details are available in the source's metadata.
Journal articles on the topic "Cross-modal document classification"
Zeng, Dehong, Xiaosong Chen, Zhengxin Song, Yun Xue, and Qianhua Cai. "Multimodal Interaction and Fused Graph Convolution Network for Sentiment Classification of Online Reviews". Mathematics 11, no. 10 (May 17, 2023): 2335. http://dx.doi.org/10.3390/math11102335.
Bakkali, Souhail, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, and Oriol Ramos Terrades. "VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification". Pattern Recognition, February 2023, 109419. http://dx.doi.org/10.1016/j.patcog.2023.109419.
Doctoral dissertations on the topic "Cross-modal document classification"
Bakkali, Souhail. "Multimodal Document Understanding with Unified Vision and Language Cross-Modal Learning". Electronic Thesis or Diss., La Rochelle, 2022. http://www.theses.fr/2022LAROS046.
The frameworks developed in this thesis are the outcome of an iterative process of analysis and synthesis between existing theories and the studies we performed. More specifically, we study cross-modality learning for contextualized comprehension of document components across language and vision. The main idea is to leverage multimodal information from document images in a common semantic space. This thesis focuses on advancing research on cross-modality learning and makes contributions on four fronts: (i) proposing a cross-modal approach with deep networks that jointly leverages visual and textual information in a common semantic representation space to automatically make predictions about multimodal documents (i.e., the subject matter they are about); (ii) investigating competitive strategies to address the tasks of cross-modal document classification, content-based retrieval, and few-shot document classification; (iii) addressing data-related issues, such as learning when data is not annotated, by proposing a network that learns generic representations from a collection of unlabeled documents; and (iv) exploiting few-shot learning settings when data contains only a few examples.
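To make the idea of a common semantic space more concrete, the following is a minimal sketch in PyTorch, not the thesis's actual architecture: a two-branch network projects visual and textual features of a document into a shared space and classifies their fused representation. Feature dimensions, layer sizes, and the late-fusion strategy are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the thesis's model): project each modality
# into a common semantic space and classify the fused representation.
import torch
import torch.nn as nn

class CrossModalClassifier(nn.Module):
    def __init__(self, vis_dim=2048, txt_dim=768, joint_dim=512, num_classes=16):
        super().__init__()
        # Project each modality into the common semantic space.
        self.vis_proj = nn.Sequential(nn.Linear(vis_dim, joint_dim), nn.ReLU())
        self.txt_proj = nn.Sequential(nn.Linear(txt_dim, joint_dim), nn.ReLU())
        # Classify the concatenated (late-fused) joint representation.
        self.classifier = nn.Linear(2 * joint_dim, num_classes)

    def forward(self, vis_feat, txt_feat):
        v = self.vis_proj(vis_feat)        # (batch, joint_dim)
        t = self.txt_proj(txt_feat)        # (batch, joint_dim)
        fused = torch.cat([v, t], dim=-1)  # simple late fusion
        return self.classifier(fused)

# Toy usage with random tensors standing in for CNN / text-encoder outputs.
model = CrossModalClassifier()
vis = torch.randn(4, 2048)   # e.g. pooled visual features of document images
txt = torch.randn(4, 768)    # e.g. pooled embeddings of the OCR'd text
logits = model(vis, txt)     # (4, 16) class scores
```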
Tran, Thi Quynh Nhi. "Robust and comprehensive joint image-text representations". Thesis, Paris, CNAM, 2017. http://www.theses.fr/2017CNAM1096/document.
This thesis investigates the joint modeling of the visual and textual content of multimedia documents to address cross-modal problems. Such tasks require the ability to match information across modalities. A common representation space, obtained for example with Kernel Canonical Correlation Analysis, on which images and text can both be represented and directly compared, is the generally adopted solution. Nevertheless, such a joint space still suffers from several deficiencies that may hinder the performance of cross-modal tasks. An important contribution of this thesis is therefore to identify two major limitations of such a space. The first limitation concerns information that is poorly represented in the common space yet very significant for a retrieval task. The second limitation consists in a separation between modalities in the common space, which leads to coarse cross-modal matching. To deal with the first limitation, concerning poorly represented data, we put forward a model which first identifies such information and then finds ways to combine it with data that is relatively well represented in the joint space. Evaluations on text-illustration tasks show that by appropriately identifying and taking such information into account, the results of cross-modal retrieval can be strongly improved. The major work in this thesis aims to cope with the separation between modalities in the joint space in order to enhance the performance of cross-modal tasks. We propose two representation methods for bi-modal or uni-modal documents that aggregate information from both the visual and textual modalities projected onto the joint space. Specifically, for uni-modal documents we suggest a completion process relying on an auxiliary dataset to find the corresponding information in the absent modality and then use this information to build a final bi-modal representation for a uni-modal document. Evaluations show that our approaches achieve state-of-the-art results on several standard and challenging datasets for cross-modal retrieval as well as bi-modal and cross-modal classification.
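As a rough illustration of the kind of joint space described above, the sketch below uses plain linear CCA from scikit-learn (the thesis refers to Kernel CCA; kernelization is omitted here for brevity) on synthetic paired features, and performs text-to-image retrieval by nearest-neighbour search in the common space. Dimensions and data are placeholder assumptions, not the thesis's setup.

```python
# Minimal sketch of a CCA joint image-text space with cross-modal retrieval.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
n_train, n_test, latent = 200, 10, 32

# Synthetic paired features: both modalities derive from a shared latent topic.
Z = rng.normal(size=(n_train + n_test, latent))
X_img = Z @ rng.normal(size=(latent, 128)) + 0.1 * rng.normal(size=(n_train + n_test, 128))
Y_txt = Z @ rng.normal(size=(latent, 300)) + 0.1 * rng.normal(size=(n_train + n_test, 300))

# Learn the joint space on aligned image-text training pairs.
cca = CCA(n_components=latent)
cca.fit(X_img[:n_train], Y_txt[:n_train])

# Project held-out pairs into the common space.
X_c, Y_c = cca.transform(X_img[n_train:], Y_txt[n_train:])

# Text-to-image retrieval: rank test images by cosine similarity to each text.
sims = cosine_similarity(Y_c, X_c)   # rows: texts, columns: images
top1 = np.argmax(sims, axis=1)
print(top1)  # ideally top1[i] == i, i.e. each text retrieves its paired image
```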
Conference abstracts on the topic "Cross-modal document classification"
Bakkali, Souhail, Zuheng Ming, Mickael Coustaty, and Marcal Rusinol. "Cross-Modal Deep Networks For Document Image Classification". In 2020 IEEE International Conference on Image Processing (ICIP). IEEE, 2020. http://dx.doi.org/10.1109/icip40778.2020.9191268.