Academic literature on the topic 'Pretrained models'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Pretrained models.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Pretrained models":

1

Hofmann, Valentin, Goran Glavaš, Nikola Ljubešić, Janet B. Pierrehumbert, and Hinrich Schütze. "Geographic Adaptation of Pretrained Language Models." Transactions of the Association for Computational Linguistics 12 (2024): 411–31. http://dx.doi.org/10.1162/tacl_a_00652.

Abstract:
While pretrained language models (PLMs) have been shown to possess a plethora of linguistic knowledge, the existing body of research has largely neglected extralinguistic knowledge, which is generally difficult to obtain by pretraining on text alone. Here, we contribute to closing this gap by examining geolinguistic knowledge, i.e., knowledge about geographic variation in language. We introduce geoadaptation, an intermediate training step that couples language modeling with geolocation prediction in a multi-task learning setup. We geoadapt four PLMs, covering language groups from three geographic areas, and evaluate them on five different tasks: fine-tuned (i.e., supervised) geolocation prediction, zero-shot (i.e., unsupervised) geolocation prediction, fine-tuned language identification, zero-shot language identification, and zero-shot prediction of dialect features. Geoadaptation is very successful at injecting geolinguistic knowledge into the PLMs: The geoadapted PLMs consistently outperform PLMs adapted using only language modeling (by especially wide margins on zero-shot prediction tasks), and we obtain new state-of-the-art results on two benchmarks for geolocation prediction and language identification. Furthermore, we show that the effectiveness of geoadaptation stems from its ability to geographically retrofit the representation space of the PLMs.
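As a rough illustration of the multi-task setup described in this abstract (a sketch only, not the authors' code: the pooling, head shapes, and loss weighting are assumptions), geoadaptation-style training can couple a masked-language-modeling head with a geolocation head on top of a shared pretrained encoder:

```python
# Hypothetical geoadaptation-style joint objective: masked language modeling
# plus geolocation prediction over the same encoder representations.
import torch.nn as nn
import torch.nn.functional as F

class GeoAdaptationHeads(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.mlm_head = nn.Linear(hidden_size, vocab_size)  # token-level LM head
        self.geo_head = nn.Linear(hidden_size, 2)            # predicts (lat, lon)

    def forward(self, hidden_states, mlm_labels, geo_targets, alpha=0.5):
        # hidden_states: (batch, seq_len, hidden) from any pretrained encoder
        mlm_logits = self.mlm_head(hidden_states)
        mlm_loss = F.cross_entropy(mlm_logits.view(-1, mlm_logits.size(-1)),
                                    mlm_labels.view(-1), ignore_index=-100)
        pooled = hidden_states.mean(dim=1)                    # simple mean pooling
        geo_loss = F.mse_loss(self.geo_head(pooled), geo_targets.float())
        return alpha * mlm_loss + (1.0 - alpha) * geo_loss    # multi-task loss
```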
2

Bear Don’t Walk IV, Oliver J., Tony Sun, Adler Perotte, and Noémie Elhadad. "Clinically relevant pretraining is all you need." Journal of the American Medical Informatics Association 28, no. 9 (June 21, 2021): 1970–76. http://dx.doi.org/10.1093/jamia/ocab086.

Abstract:
Clinical notes present a wealth of information for applications in the clinical domain, but heterogeneity across clinical institutions and settings presents challenges for their processing. The clinical natural language processing field has made strides in overcoming domain heterogeneity, while pretrained deep learning models present opportunities to transfer knowledge from one task to another. Pretrained models have performed well when transferred to new tasks; however, it is not well understood if these models generalize across differences in institutions and settings within the clinical domain. We explore if institution or setting specific pretraining is necessary for pretrained models to perform well when transferred to new tasks. We find no significant performance difference between models pretrained across institutions and settings, indicating that clinically pretrained models transfer well across such boundaries. Given a clinically pretrained model, clinical natural language processing researchers may forgo the time-consuming pretraining step without a significant performance drop.
3

Basu, Sourya, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Vijil Chenthamarakshan, Kush R. Varshney, Lav R. Varshney, and Payel Das. "Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (June 26, 2023): 6788–96. http://dx.doi.org/10.1609/aaai.v37i6.25832.

Abstract:
We introduce equi-tuning, a novel fine-tuning method that transforms (potentially non-equivariant) pretrained models into group equivariant models while incurring minimum L_2 loss between the feature representations of the pretrained and the equivariant models. Large pretrained models can be equi-tuned for different groups to satisfy the needs of various downstream tasks. Equi-tuned models benefit from both group equivariance as an inductive bias and semantic priors from pretrained models. We provide applications of equi-tuning on three different tasks: image classification, compositional generalization in language, and fairness in natural language generation (NLG). We also provide a novel group-theoretic definition for fairness in NLG. The effectiveness of this definition is shown by testing it against a standard empirical method of fairness in NLG. We provide experimental results for equi-tuning using a variety of pretrained models: Alexnet, Resnet, VGG, and Densenet for image classification; RNNs, GRUs, and LSTMs for compositional generalization; and GPT2 for fairness in NLG. We test these models on benchmark datasets across all considered tasks to show the generality and effectiveness of the proposed method.
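As a hedged sketch of the group-averaging idea behind equi-tuning (the paper's general construction also transforms the outputs by the group action; for classification logits that action is trivial, so plain averaging is shown here), a pretrained image classifier can be wrapped to be invariant to 90-degree rotations and then fine-tuned:

```python
# Illustrative wrapper (not the authors' implementation): average a pretrained
# classifier's outputs over the C4 rotation group, then fine-tune the wrapper.
import torch
import torch.nn as nn

class C4EquiWrapper(nn.Module):
    def __init__(self, pretrained: nn.Module):
        super().__init__()
        self.pretrained = pretrained

    def forward(self, x):
        # Run the pretrained model on each rotated copy of the input and average;
        # the resulting logits are invariant to 90-degree rotations of x.
        outs = [self.pretrained(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
        return torch.stack(outs, dim=0).mean(dim=0)
```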
4

Wang, Canjun, Zhao Li, Tong Chen, Ruishuang Wang, and Zhengyu Ju. "Research on the Application of Prompt Learning Pretrained Language Model in Machine Translation Task with Reinforcement Learning." Electronics 12, no. 16 (August 9, 2023): 3391. http://dx.doi.org/10.3390/electronics12163391.

Abstract:
With the continuous advancement of deep learning technology, pretrained language models have emerged as crucial tools for natural language processing tasks. However, optimization of pretrained language models is essential for specific tasks such as machine translation. This paper presents a novel approach that integrates reinforcement learning with prompt learning to enhance the performance of pretrained language models in machine translation tasks. In our methodology, a “prompt” string is incorporated into the input of the pretrained language model, to guide the generation of an output that aligns closely with the target translation. Reinforcement learning is employed to train the model in producing optimal translation results. During this training process, the target translation is utilized as a reward signal to incentivize the model to generate an output that aligns more closely with the desired translation. Experimental results validated the effectiveness of the proposed approach. The pretrained language model trained with prompt learning and reinforcement learning exhibited superior performance compared to traditional pretrained language models in machine translation tasks. Furthermore, we observed that different prompt strategies significantly impacted the model’s performance, underscoring the importance of selecting an optimal prompt strategy tailored to the specific task. The results suggest that using techniques such as prompt learning and reinforcement learning can improve the performance of pretrained language models for tasks such as text generation and machine translation. The method proposed in this paper not only offers a fresh perspective on leveraging pretrained language models in machine translation and other related tasks but also serves as a valuable reference for further research in this domain. By combining reinforcement learning with prompt learning, researchers can explore new avenues for optimizing pretrained language models and improving their efficacy in various natural language processing tasks.
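The recipe described above (prepend a prompt, sample a translation, score it against the reference, and reward-weight its likelihood) can be sketched with a plain REINFORCE-style loss. This is a simplified illustration; the prompt string, reward function, and training loop below are assumptions, not the paper's exact setup:

```python
# Minimal REINFORCE-style sketch for prompt-guided translation fine-tuning.
import torch

def reinforce_loss(log_probs, sampled_tokens, reward):
    # log_probs: (seq_len, vocab_size) log-probabilities of the sampled sequence
    # sampled_tokens: (seq_len,) token ids actually sampled; reward: scalar score
    token_logp = log_probs.gather(1, sampled_tokens.unsqueeze(1)).squeeze(1)
    return -(reward * token_logp.sum())       # maximize reward-weighted likelihood

def toy_reward(hypothesis: str, reference: str) -> float:
    # Placeholder reward: unigram overlap with the reference translation.
    hyp, ref = hypothesis.split(), reference.split()
    return len(set(hyp) & set(ref)) / max(len(hyp), 1)

prompt = "Translate English to German: "      # hypothetical prompt string
# The model input would be prompt + source sentence; model and sampling omitted.
```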
5

Parmonangan, Ivan Halim, Marsella Marsella, Doharfen Frans Rino Pardede, Katarina Prisca Rijanto, Stephanie Stephanie, Kreshna Adhitya Chandra Kesuma, Valentina Tiara Cahyaningtyas, and Maria Susan Anggreainy. "Training CNN-based Model on Low Resource Hardware and Small Dataset for Early Prediction of Melanoma from Skin Lesion Images." Engineering, MAthematics and Computer Science (EMACS) Journal 5, no. 2 (May 31, 2023): 41–46. http://dx.doi.org/10.21512/emacsjournal.v5i2.9904.

Abstract:
Melanoma is a rare skin cancer that can spread quickly to other skin layers and to the organs beneath. Melanoma is known to be curable only if it is diagnosed at an early stage, which makes accurate early prediction critical to cutting the number of deaths it causes. Deep learning methods have recently shown promising performance in classifying images accurately, but they require many samples to generalize well, while the number of melanoma sample images is limited. To address this, transfer learning is widely adopted to transfer the knowledge of a pretrained model to another domain or to a new dataset with fewer samples or different tasks. This study aims to find which method best achieves this for early melanoma prediction from skin lesion images. We investigated three pretrained and one non-pretrained image classification models, specifically choosing pretrained models that are efficient to train on a small training sample and with low hardware resources. The results show that, with limited sample images and low hardware resources, the pretrained image models yield better overall accuracy and recall than the non-pretrained model, suggesting that pretrained models are more suitable for this task under data and hardware constraints.
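A minimal transfer-learning sketch in the spirit of this study (the backbone, frozen layers, and two-class head are assumptions, not the paper's exact models) reuses an ImageNet-pretrained CNN and trains only a small classification head, which suits small datasets and limited hardware:

```python
# Fine-tune only a new binary head on top of a frozen ImageNet-pretrained CNN.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # pretrained backbone
for param in model.parameters():
    param.requires_grad = False                     # freeze pretrained weights
model.fc = nn.Linear(model.fc.in_features, 2)       # melanoma vs. benign head
# During training, only model.fc.parameters() are passed to the optimizer.
```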
6

Edman, Lukas, Gabriele Sarti, Antonio Toral, Gertjan van Noord, and Arianna Bisazza. "Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation." Transactions of the Association for Computational Linguistics 12 (2024): 392–410. http://dx.doi.org/10.1162/tacl_a_00651.

Abstract:
Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing tasks. However, there has been little research on their effectiveness for neural machine translation (NMT), particularly within the popular pretrain-then-finetune paradigm. This work performs an extensive comparison across multiple languages and experimental conditions of character- and subword-level pretrained models (ByT5 and mT5, respectively) on NMT. We show the effectiveness of character-level modeling in translation, particularly in cases where fine-tuning data is limited. In our analysis, we show how character models’ gains in translation quality are reflected in better translations of orthographically similar words and rare words. While evaluating the importance of source texts in driving model predictions, we highlight word-level patterns within ByT5, suggesting an ability to modulate word-level and character-level information during generation. We conclude by assessing the efficiency tradeoff of byte models, suggesting their usage in non-time-critical scenarios to boost translation quality.
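For readers who want to reproduce the pretrain-then-finetune comparison, both model families are available as public checkpoints. The snippet below is a sketch assuming the "google/byt5-small" and "google/mt5-small" checkpoints and the Hugging Face transformers library, and only contrasts their input lengths, which drives the efficiency tradeoff discussed above:

```python
# Load byte-level and subword seq2seq checkpoints and compare input lengths.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

for name in ("google/byt5-small", "google/mt5-small"):
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)   # ready for NMT fine-tuning
    ids = tokenizer("Translate English to German: The cat sleeps.",
                    return_tensors="pt").input_ids
    print(name, "input length:", ids.shape[-1])           # byte inputs are much longer
```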
7

Won, Hyun-Sik, Min-Ji Kim, Dohyun Kim, Hee-Soo Kim, and Kang-Min Kim. "University Student Dropout Prediction Using Pretrained Language Models." Applied Sciences 13, no. 12 (June 13, 2023): 7073. http://dx.doi.org/10.3390/app13127073.

Abstract:
Predicting student dropout from universities is an imperative but challenging task. Numerous data-driven approaches that utilize both student demographic information (e.g., gender, nationality, and high school graduation year) and academic information (e.g., GPA, participation in activities, and course evaluations) have shown meaningful results. Recently, pretrained language models have achieved very successful results in understanding the tasks associated with structured data as well as textual data. In this paper, we propose a novel student dropout prediction framework based on demographic and academic information, using a pretrained language model to capture the relationship between different forms of information. To this end, we first formulate both types of information in natural language form. We then recast the student dropout prediction task as a natural language inference (NLI) task. Finally, we fine-tune the pretrained language models to predict student dropout. In particular, we further enhance the model using a continuous hypothesis. The experimental results demonstrate that the proposed model is effective for the freshmen dropout prediction task. The proposed method exhibits significant improvements of as much as 9.00% in terms of F1-score compared with state-of-the-art techniques.
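The core reformulation, verbalizing structured student records and recasting dropout prediction as NLI, can be sketched as below. The field names and phrasing are invented for illustration; the paper's templates and continuous-hypothesis refinement differ:

```python
# Turn a structured student record into a premise/hypothesis pair for an NLI PLM.
def verbalize(record: dict) -> tuple[str, str]:
    premise = (f"The student is {record['gender']}, from {record['nationality']}, "
               f"graduated from high school in {record['hs_year']}, "
               f"and has a GPA of {record['gpa']}.")
    hypothesis = "This student will drop out of university."
    return premise, hypothesis

premise, hypothesis = verbalize(
    {"gender": "female", "nationality": "Korea", "hs_year": 2021, "gpa": 3.4})
# A pretrained NLI model is then fine-tuned so that its entailment score for
# (premise, hypothesis) matches the observed dropout label.
```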
8

Zhou, Shengchao, Gaofeng Meng, Zhaoxiang Zhang, Richard Yi Da Xu, and Shiming Xiang. "Robust Feature Rectification of Pretrained Vision Models for Object Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3796–804. http://dx.doi.org/10.1609/aaai.v37i3.25492.

Abstract:
Pretrained vision models for object recognition often suffer a dramatic performance drop with degradations unseen during training. In this work, we propose a RObust FEature Rectification module (ROFER) to improve the performance of pretrained models against degradations. Specifically, ROFER first estimates the type and intensity of the degradation that corrupts the image features. Then, it leverages a Fully Convolutional Network (FCN) to rectify the features from the degradation by pulling them back to clear features. ROFER is a general-purpose module that can address various degradations simultaneously, including blur, noise, and low contrast. Besides, it can be plugged into pretrained models seamlessly to rectify the degraded features without retraining the whole model. Furthermore, ROFER can be easily extended to address composite degradations by adopting a beam search algorithm to find the composition order. Evaluations on CIFAR-10 and Tiny-ImageNet demonstrate that the accuracy of ROFER is 5% higher than that of SOTA methods on different degradations. With respect to composite degradations, ROFER improves the accuracy of a pretrained CNN by 10% and 6% on CIFAR-10 and Tiny-ImageNet respectively.
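The plug-in nature of the module can be pictured as a small residual convolutional rectifier inserted between a frozen backbone and its classifier head. This is a rough sketch; the layer sizes and structure here are assumptions, not ROFER's actual architecture:

```python
# Toy feature-rectification block: map degraded backbone features toward clean ones.
import torch.nn as nn

class FeatureRectifier(nn.Module):
    def __init__(self, channels: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1))

    def forward(self, feats):
        # Residual correction: the backbone and classifier stay frozen; only this
        # rectifier is trained to pull degraded features back toward clear features.
        return feats + self.net(feats)
```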
9

Elazar, Yanai, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, and Yoav Goldberg. "Measuring and Improving Consistency in Pretrained Language Models." Transactions of the Association for Computational Linguistics 9 (2021): 1012–31. http://dx.doi.org/10.1162/tacl_a_00410.

Abstract:
Consistency of a model (that is, the invariance of its behavior under meaning-preserving alternations in its input) is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel🤘, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel🤘, we show that the consistency of all PLMs we experiment with is poor, though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.
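A toy version of the consistency measure (the fraction of paraphrase pairs for which a cloze model returns the same top answer) looks like this; predict_fn is a placeholder for a PLM fill-mask call, and the paper's exact metric may differ:

```python
# Pairwise agreement of a model's top answers across paraphrased cloze queries.
from itertools import combinations

def consistency(paraphrases: list[str], predict_fn) -> float:
    answers = [predict_fn(query) for query in paraphrases]
    pairs = list(combinations(answers, 2))
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 1.0
```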
10

Takeoka, Kunihiro. "Low-resource Taxonomy Enrichment with Pretrained Language Models." Journal of Natural Language Processing 29, no. 1 (2022): 259–63. http://dx.doi.org/10.5715/jnlp.29.259.


Dissertations / Theses on the topic "Pretrained models":

1

Neupane, Aashish. "Visual Saliency Analysis on Fashion Images Using Image Processing and Deep Learning Approaches." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/theses/2784.

Abstract:
AASHISH NEUPANE, for the Master of Science degree in BIOMEDICAL ENGINEERING, presented on July 35, 2020, at Southern Illinois University Carbondale. TITLE: VISUAL SALIENCY ANALYSIS ON FASHION IMAGES USING IMAGE PROCESSING AND DEEP LEARNING APPROACHES. MAJOR PROFESSOR: Dr. Jun Qin. State-of-the-art computer vision technologies have been applied to fashion in multiple ways, and saliency modeling is one of those applications. In computer vision, a saliency map is a 2D topological map that indicates the probabilistic distribution of visual attention priorities. This study focuses on the analysis of visual saliency on fashion images using multiple saliency models, evaluated with several evaluation metrics. A human subject study was conducted to collect people’s visual attention on 75 fashion images. Binary ground-truth fixation maps for these images were created from the experimentally collected visual attention data using a Gaussian blurring function. Saliency maps for the 75 fashion images were then generated using multiple conventional saliency models as well as deep feature-based state-of-the-art models. DeepFeat was studied extensively, with 44 sets of saliency maps exploiting features extracted from GoogLeNet and ResNet50, and seven other saliency models were also used to predict saliency maps for these images. The results were compared over five evaluation metrics: AUC, CC, KL divergence, NSS, and SIM. The performance of all eight saliency models in predicting visual attention on fashion images was comparable to the benchmarked scores across all five metrics. Furthermore, the models performed consistently well over multiple evaluation metrics, indicating that saliency models could in fact be applied to effectively predict salient regions in random fashion advertisement images.
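As one concrete example of the metrics listed above, Normalized Scanpath Saliency (NSS) averages the standardized saliency values at human fixation locations. A minimal version, assuming same-shape NumPy maps with a binary fixation map, is:

```python
# Minimal NSS: mean of the z-scored saliency map at ground-truth fixation points.
import numpy as np

def nss(saliency_map: np.ndarray, fixation_map: np.ndarray) -> float:
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return float(s[fixation_map > 0].mean())
```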
2

Pelloin, Valentin. "La compréhension de la parole dans les systèmes de dialogues humain-machine à l'heure des modèles pré-entraînés" [Spoken language understanding in human-machine dialogue systems in the era of pretrained models]. Electronic Thesis or Diss., Le Mans, 2024. http://www.theses.fr/2024LEMA1002.

Abstract:
In this thesis, spoken language understanding (SLU) is studied in the application context of goal-oriented telephone dialogues (hotel booking, for example). Historically, SLU was performed by a cascade of systems: a speech recognition system first transcribed the speech into words, and a natural language understanding system then linked those words to a semantic annotation. The development of deep neural methods led to the emergence of end-to-end architectures, where the understanding task is performed by a single system applied directly to the speech signal to extract the semantic annotation. Recently, so-called self-supervised learning (SSL) pretrained models have brought new advances in natural language processing (NLP). Trained generically on very large datasets, they can then be adapted for other applications. To date, the best SLU results have been obtained with cascade systems incorporating SSL models. However, neither the cascade nor the end-to-end architecture is perfect. In this thesis, we study these architectures and propose hybrid versions that attempt to combine the advantages of each. After developing a state-of-the-art end-to-end SLU model, we evaluated different hybridization strategies. The advances brought by SSL models during the course of this thesis led us to integrate them into our hybrid architecture.
3

Kulhánek, Jonáš. "End-to-end dialogové systémy s předtrénovanými jazykovými modely" [End-to-end dialogue systems with pretrained language models]. Master's thesis, 2021. http://www.nusl.cz/ntk/nusl-448383.

Abstract:
Current dialogue systems typically consist of separate components, which are manually engineered to a large part and need extensive annotation. End-to-end trainable systems exist but produce lower-quality, unreliable outputs. The recent transformer-based pretrained language models such as GPT-2 brought considerable progress to language modelling, but they rely on huge amounts of textual data, which are not available for common dialogue domains. Therefore, training these models runs a high risk of overfitting. To overcome these obstacles, we propose a novel end-to-end dialogue system called AuGPT. We add auxiliary training objectives to use training data more efficiently, and we use massive data augmentation via back-translation and pretraining on multiple datasets to increase data volume and diversity. We evaluate our system using automatic methods (corpus-based metrics, user simulation), human evaluation as part of the DSTC 9 shared task challenge (where our system placed 3rd out of 10), as well as extensive manual error analysis. Our method substantially outperforms the baseline on the MultiWOZ benchmark and shows competitive results with state-of-the-art end-to-end dialogue systems.
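One of the ingredients mentioned above, data augmentation via back-translation, can be sketched with public MarianMT checkpoints. This is a hedged illustration; AuGPT's actual augmentation pipeline and models may differ:

```python
# Back-translation paraphrasing: English -> German -> English round trip.
from transformers import MarianMTModel, MarianTokenizer

def translate(sentences, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    outputs = model.generate(**batch)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

def back_translate(sentences):
    german = translate(sentences, "Helsinki-NLP/opus-mt-en-de")
    return translate(german, "Helsinki-NLP/opus-mt-de-en")

print(back_translate(["I would like to book a cheap hotel in the centre."]))
```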

Book chapters on the topic "Pretrained models":

1

Gad, Ahmed Fawzy. "Deploying Pretrained Models." In Practical Computer Vision Applications Using Deep Learning with CNNs, 295–338. Berkeley, CA: Apress, 2018. http://dx.doi.org/10.1007/978-1-4842-4167-7_7.

2

Jain, Shashank Mohan. "Fine-Tuning Pretrained Models." In Introduction to Transformers for NLP, 137–51. Berkeley, CA: Apress, 2022. http://dx.doi.org/10.1007/978-1-4842-8844-3_6.

3

Sun, Kaili, Xudong Luo, and Michael Y. Luo. "A Survey of Pretrained Language Models." In Knowledge Science, Engineering and Management, 442–56. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-10986-7_36.

4

Souza, Fábio, Rodrigo Nogueira, and Roberto Lotufo. "BERTimbau: Pretrained BERT Models for Brazilian Portuguese." In Intelligent Systems, 403–17. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61377-8_28.

5

Song, Yunfeng, Xiaochao Fan, Yong Yang, Ge Ren, and Weiming Pan. "Large Pretrained Models on Multimodal Sentiment Analysis." In Lecture Notes in Electrical Engineering, 506–13. Singapore: Springer Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-9423-3_63.

6

Lovón-Melgarejo, Jesús, Jose G. Moreno, Romaric Besançon, Olivier Ferret, and Lynda Tamine. "Probing Pretrained Language Models with Hierarchy Properties." In Lecture Notes in Computer Science, 126–42. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-56060-6_9.

7

Yarlagadda, Madhulika, Susrutha Ettimalla, and Bhanu Sri Davuluri. "Zero-Shot Document Classification Using Pretrained Models." In Multifaceted approaches for Data Acquisition, Processing & Communication, 104–10. London: CRC Press, 2024. http://dx.doi.org/10.1201/9781003470939-14.

8

Tan, Zhen, Lu Cheng, Song Wang, Bo Yuan, Jundong Li, and Huan Liu. "Interpreting Pretrained Language Models via Concept Bottlenecks." In Advances in Knowledge Discovery and Data Mining, 56–74. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2259-4_5.

9

Hao, Kaifeng, Jianfeng Li, Cuiqin Hou, Xuexuan Wang, and Pengyu Li. "Combining Pretrained and Graph Models for Text Classification." In Communications in Computer and Information Science, 422–29. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-92307-5_49.

10

Ni, Bolin, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, and Haibin Ling. "Expanding Language-Image Pretrained Models for General Video Recognition." In Lecture Notes in Computer Science, 1–18. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19772-7_1.


Conference papers on the topic "Pretrained models":

1

Zhang, Zhiyuan, Xiaoqian Liu, Yi Zhang, Qi Su, Xu Sun, and Bin He. "Pretrain-KGE: Learning Knowledge Representation from Pretrained Language Models." In Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.findings-emnlp.25.

2

Chen, Catherine, Kevin Lin, and Dan Klein. "Constructing Taxonomies from Pretrained Language Models." In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.naacl-main.373.

3

Koto, Fajri, Jey Han Lau, and Timothy Baldwin. "Discourse Probing of Pretrained Language Models." In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.naacl-main.301.

4

Zhou, Jingren. "Large-scale Multi-Modality Pretrained Models." In MM '21: ACM Multimedia Conference. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3474085.3480241.

5

Davison, Joe, Joshua Feldman, and Alexander Rush. "Commonsense Knowledge Mining from Pretrained Models." In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/d19-1109.

6

Weller, Orion, Marc Marone, Vladimir Braverman, Dawn Lawrie, and Benjamin Van Durme. "Pretrained Models for Multilingual Federated Learning." In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.naacl-main.101.

7

Troshin, Sergey, and Nadezhda Chirkova. "Probing Pretrained Models of Source Codes." In Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.blackboxnlp-1.31.

8

Dalvi, Fahim, Hassan Sajjad, Nadir Durrani, and Yonatan Belinkov. "Analyzing Redundancy in Pretrained Transformer Models." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.398.

9

Tamkin, Alex, Trisha Singh, Davide Giovanardi, and Noah Goodman. "Investigating Transferability in Pretrained Language Models." In Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.findings-emnlp.125.

10

Kurita, Keita, Paul Michel, and Graham Neubig. "Weight Poisoning Attacks on Pretrained Models." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-main.249.


Reports on the topic "Pretrained models":

1

Lohn, Andrew. Poison in the Well: Securing the Shared Resources of Machine Learning. Center for Security and Emerging Technology, June 2021. http://dx.doi.org/10.51593/2020ca013.

Abstract:
Modern machine learning often relies on open-source datasets, pretrained models, and machine learning libraries from across the internet, but are those resources safe to use? Previously successful digital supply chain attacks against cyber infrastructure suggest the answer may be no. This report introduces policymakers to these emerging threats and provides recommendations for how to secure the machine learning supply chain.
2

Shrestha, Tanuja, Mir A. Matin, Vishwas Chitale, and Samuel Thomas. Exploring the potential of deep learning for classifying camera trap data: A case study from Nepal - working paper. International Centre for Integrated Mountain Development (ICIMOD), September 2023. http://dx.doi.org/10.53055/icimod.1016.

Abstract:
Data from camera trap networks provide crucial information on wildlife presence, movement, and behaviour. However, manually processing the large volumes of images captured is time- and resource-intensive. This study explores three different deep learning approaches to detect and classify images of key animal species collected from the ICIMOD Knowledge Park at Godavari, Nepal. It shows that transfer learning with ImageNet pretrained models (A1) can be used to detect animal species with minimal model training and testing. When scaled up, these methods offer tremendous scope for quicker and better-informed conflict management actions, including automated responses, which can help minimise human-wildlife conflict management costs across countries in the region.
