A selection of scholarly literature on the topic "Data-to-text generation"
Format sources in APA, MLA, Chicago, Harvard, and other citation styles
Consult the lists of current articles, books, dissertations, theses, and other scholarly sources on the topic "Data-to-text generation".
Journal articles on the topic "Data-to-text generation":
Yang, Sen, and Yang Liu. "Data-to-text Generation via Planning." Journal of Physics: Conference Series 1827, no. 1 (March 1, 2021): 012190. http://dx.doi.org/10.1088/1742-6596/1827/1/012190.
Puduppully, Ratish, Yao Fu, and Mirella Lapata. "Data-to-text Generation with Variational Sequential Planning." Transactions of the Association for Computational Linguistics 10 (2022): 697–715. http://dx.doi.org/10.1162/tacl_a_00484.
Gong, Heng, Xiaocheng Feng, and Bing Qin. "DiffuD2T: Empowering Data-to-Text Generation with Diffusion." Electronics 12, no. 9 (May 7, 2023): 2136. http://dx.doi.org/10.3390/electronics12092136.
Puduppully, Ratish, and Mirella Lapata. "Data-to-text Generation with Macro Planning." Transactions of the Association for Computational Linguistics 9 (2021): 510–27. http://dx.doi.org/10.1162/tacl_a_00381.
Zhang, Dell, Jiahao Yuan, Xiaoling Wang, and Adam Foster. "Probabilistic Verb Selection for Data-to-Text Generation." Transactions of the Association for Computational Linguistics 6 (December 2018): 511–27. http://dx.doi.org/10.1162/tacl_a_00038.
Li, Shujie, Liang Li, Ruiying Geng, Min Yang, Binhua Li, Guanghu Yuan, Wanwei He, et al. "Unifying Structured Data as Graph for Data-to-Text Pre-Training." Transactions of the Association for Computational Linguistics 12 (2024): 210–28. http://dx.doi.org/10.1162/tacl_a_00641.
Gong, Heng, Xiaocheng Feng, and Bing Qin. "Quality Control for Distantly-Supervised Data-to-Text Generation via Meta Learning." Applied Sciences 13, no. 9 (April 30, 2023): 5573. http://dx.doi.org/10.3390/app13095573.
Puduppully, Ratish, Li Dong, and Mirella Lapata. "Data-to-Text Generation with Content Selection and Planning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6908–15. http://dx.doi.org/10.1609/aaai.v33i01.33016908.
Gkatzia, Dimitra, Oliver Lemon, and Verena Rieser. "Data-to-Text Generation Improves Decision-Making Under Uncertainty." IEEE Computational Intelligence Magazine 12, no. 3 (August 2017): 10–17. http://dx.doi.org/10.1109/mci.2017.2708998.
Rebuffel, Clement, Marco Roberti, Laure Soulier, Geoffrey Scoutheeten, Rossella Cancelliere, and Patrick Gallinari. "Controlling hallucinations at word level in data-to-text generation." Data Mining and Knowledge Discovery 36, no. 1 (October 22, 2021): 318–54. http://dx.doi.org/10.1007/s10618-021-00801-4.
Dissertations on the topic "Data-to-text generation":
Gkatzia, Dimitra. "Data-driven approaches to content selection for data-to-text generation." Thesis, Heriot-Watt University, 2015. http://hdl.handle.net/10399/3003.
Hill, Geoffrey. "Sensemaking in Big Data: Conceptual and Empirical Approaches to Actionable Knowledge Generation from Unstructured Text Streams." Kent State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=kent1433597354.
Pereira, José Casimiro. "Natural language generation in the context of multimodal interaction in Portuguese : Data-to-text based in automatic translation." Doctoral thesis, Universidade de Aveiro, 2017. http://hdl.handle.net/10773/21767.
Abstract in Portuguese not available.
To enable interaction by text and/or speech, it is essential that we devise systems capable of translating internal data into sentences or texts that can be shown on screen or heard by users. In this context, it is essential that these natural language generation (NLG) systems produce sentences in the native languages of the users (in our case European Portuguese) and enable an easy development and integration process while providing output that is perceived as natural. The creation of high-quality NLG systems is not an easy task, even for a small domain. The main difficulties arise from: classic approaches being very demanding in know-how and development time; a lack of variability in the sentences generated by most generation methods; the difficulty of easily accessing complete tools; a shortage of resources, such as large corpora; and support being available in only a limited number of languages. The main goal of this work was to propose, develop, and test a method to convert Data-to-Portuguese that can be developed with the smallest possible amount of time and resources, while remaining capable of generating utterances with variability and quality. The thesis defended argues that this goal can be achieved by adopting data-driven language generation (more precisely, generation based on language translation) and following an Engineering Research Methodology. In this thesis, two Data2Text NLG systems are presented. They were designed to provide a way to quickly develop an NLG system that can generate sentences of good quality. The proposed systems use tools that are freely available and can be developed by people with limited linguistic skills. One important characteristic is the use of statistical machine translation techniques; this approach requires only a small natural language corpus, resulting in easier and cheaper development when compared to more common approaches.
The main result of this thesis is the demonstration that, by following the proposed approach, it is possible to create systems capable of translating information/data into good-quality sentences in Portuguese. This is done without major effort regarding resource creation and with the common knowledge of an experienced application developer. The systems created, particularly the hybrid system, provide a good solution to problems in data-to-text conversion.
Shimorina, Anastasia. "Natural Language Generation : From Data Creation to Evaluation via Modelling." Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0080.
Natural language generation is the process of generating a natural language text from some input. This input can be texts, documents, images, tables, knowledge graphs, databases, dialogue acts, meaning representations, etc. Recent methods in natural language generation, mostly based on neural modelling, have yielded significant improvements in the field. Despite this recent success, numerous issues with generation prevail, such as faithfulness to the source, the development of multilingual models, and few-shot generation. This thesis explores several facets of natural language generation, from creating training datasets and developing models to evaluating proposed methods and model outputs. In this thesis, we address the issue of multilinguality and propose possible strategies to semi-automatically translate corpora for data-to-text generation. We show that named entities constitute a major stumbling block in translation, as exemplified by the English-Russian translation pair. We proceed to handle rare entities in data-to-text modelling by exploring two mechanisms: copying and delexicalisation. We demonstrate that rare entities strongly impact performance and that the impact of these two mechanisms varies greatly depending on how datasets are constructed. Returning to multilinguality, we also develop a modular approach for shallow surface realisation in several languages. Our approach splits the surface realisation task into three submodules: word ordering, morphological inflection, and contraction generation. We show, via delexicalisation, that the word ordering component mainly depends on syntactic information. Along with the modelling, we also propose a framework for error analysis, focused on word order, for the shallow surface realisation task. The framework provides linguistic insights into model performance at the sentence level and identifies patterns where models underperform.
Finally, we also touch upon evaluation design while assessing automatic and human metrics, highlighting the difference between sentence-level and system-level evaluation.
Faille, Juliette. "Data-Based Natural Language Generation : Evaluation and Explainability." Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0305.
Recent Natural Language Generation (NLG) models achieve very high average performance. Their output texts are generally grammatically and syntactically correct, which makes them sound natural. Though the semantics of the texts are right in most cases, even state-of-the-art NLG models still produce texts with partially incorrect meanings. In this thesis, we propose evaluating and analyzing content-related issues of models used in the NLG tasks of Resource Description Framework (RDF) graph verbalization and conversational question generation. First, we focus on the task of RDF verbalization and the omission and hallucination of RDF entities, i.e. when an automatically generated text does not mention all the input RDF entities or mentions entities other than those in the input. We evaluate 25 RDF verbalization models on the WebNLG dataset. We develop a method to automatically detect omissions and hallucinations of RDF entities in the outputs of these models. We propose a metric based on omission and hallucination counts to quantify the semantic adequacy of NLG models. We find that this metric correlates well with what human annotators consider semantically correct and show that even state-of-the-art models are subject to omissions and hallucinations. Following this observation about the tendency of RDF verbalization models to generate texts with content-related issues, we propose to analyze the encoders of two such state-of-the-art models, BART and T5. We use the probing explainability method and introduce two probing classifiers (one parametric and one non-parametric) to detect omissions and distortions of RDF input entities in the embeddings of the encoder-decoder models. We find that such probing classifiers are able to detect these mistakes in the encodings, suggesting that the encoder of the models is responsible for some loss of information about omitted and distorted entities.
Finally, we propose a T5-based conversational question generation model that, given an input RDF graph and a conversational context, generates both a question and its corresponding RDF triples. This setting allows us to introduce a fine-grained evaluation procedure that automatically assesses coherence with the conversation context and semantic adequacy with respect to the input RDF. Our contributions belong to the fields of NLG evaluation and explainability and use techniques and methodologies from these two research fields in order to work towards providing more reliable NLG models.
Vaudry, Pierre-Luc. "Narrative generation by associative network extraction from real-life temporal data." Thèse, 2016. http://hdl.handle.net/1866/18473.
Data about events abounds in our technological society. An attractive way of presenting real-life temporal data to facilitate its interpretation is an automatically generated narrative. Narrative comprehension involves the construction of a causal network by the reader. Narrative data-to-text systems seem to acknowledge causal relations as important; however, these relations play a secondary role in their document planners, and their identification relies mostly on domain knowledge. This thesis proposes a model of assisted temporal data interpretation by narrative generation, in which narratives are structured with the help of a mix of automatically mined and manually defined association rules. The associations suggest causal hypotheses to the reader, who can thus more easily construct a causal representation of the events. This model should be applicable to any repetitive temporal data, preferably including actions or activities, such as Activities of Daily Living (ADL) data. Sequential association rules are selected based on the criteria of confidence and statistical significance as measured in training data. World and domain knowledge association rules are based on the similarity of some aspect of a pair of events or on causal patterns that are difficult to detect statistically. To interpret a specific period to be summarized, pairs of events for which an association rule applies are associated. Some extra associations are then derived. Together, the events and associations form an associative network. The most important step of the Natural Language Generation (NLG) pipeline is document planning, comprising event selection and document structuring. For event selection, the model relies on the confidence of sequential associations to select the most unusual facts. The assumption is that an event that is implied by another one with a relatively high probability may be left implicit in the text.
The structure of the narrative is called the connecting associative thread because it allows the reader to follow associations from the beginning to the end of the text. It takes the form of a spanning tree over the previously selected associative sub-network. The associations it contains are selected based on association type preferences and relative temporal distance. The connecting associative thread is then segmented into paragraphs, sentences, and phrases, and the associations are translated into rhetorical relations. The microplanning step defines lexico-syntactic templates describing each event type. When two event descriptions need to be assembled in the same sentence, a discourse marker expressing the specified rhetorical relation is employed. A main event and a preceding main event are determined for each sentence. When the associative-thread parent of the main event is not the preceding main event, an anaphor is added to the sentence-initial discourse marker. Surface realization can be performed in English or French thanks to bilingual lexico-syntactic specifications and the SimpleNLG-EnFr Java library. The results of a textual quality evaluation show that the texts are understandable and the lexical choices adequate.
Books on the topic "Data-to-text generation":
McKeown, Kathleen R. Text generation: Using discourse strategies and focus constraints to generate natural language text. Cambridge [Cambridgeshire]: Cambridge University Press, 1985.
Bizyuk, Aleksandr. Fundamentals of abnormal psychology. INFRA-M Academic Publishing LLC, 2020. http://dx.doi.org/10.12737/974663.
McKeown, Kathleen R. Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Text (Studies in Natural Language Processing). Cambridge University Press, 1992.
Henderson, Peter A. Southwood's Ecological Methods. 5th ed. Oxford University Press, 2021. http://dx.doi.org/10.1093/oso/9780198862277.001.0001.
Ondercin, Heather L. The Evolution of Women’s (and Men’s) Partisan Attachments. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780190265144.003.0003.
Mackenzie, Simon. Transnational Criminology. Policy Press, 2020. http://dx.doi.org/10.1332/policypress/9781529203783.001.0001.
Brantingham, Patricia L., Paul J. Brantingham, Justin Song, and Valerie Spicer. Advances in Visualization for Theory Testing in Environmental Criminology. Edited by Gerben J. N. Bruinsma and Shane D. Johnson. Oxford University Press, 2018. http://dx.doi.org/10.1093/oxfordhb/9780190279707.013.37.
Lovasi, Gina S., Ana V. Diez Roux, and Jennifer Kolker, eds. Urban Public Health. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780190885304.001.0001.
Ufimtseva, Nataliya V., Iosif A. Sternin, and Elena Yu Myagkova. Russian psycholinguistics: results and prospects (1966–2021): a research monograph. Institute of Linguistics, Russian Academy of Sciences, 2021. http://dx.doi.org/10.30982/978-5-6045633-7-3.
Book chapters on the topic "Data-to-text generation":
Gardent, Claire. "Syntax and Data-to-Text Generation." In Statistical Language and Speech Processing, 3–20. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-11397-5_1.
Upadhyay, Ashish, Stewart Massie, Ritwik Kumar Singh, Garima Gupta, and Muneendra Ojha. "A Case-Based Approach to Data-to-Text Generation." In Case-Based Reasoning Research and Development, 232–47. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86957-1_16.
Rebuffel, Clément, Laure Soulier, Geoffrey Scoutheeten, and Patrick Gallinari. "A Hierarchical Model for Data-to-Text Generation." In Lecture Notes in Computer Science, 65–80. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-45439-5_5.
Wang, Mengda, Jianjun Cao, Xu Yu, and Zibo Nie. "A Data-to-Text Generation Model with Deduplicated Content Planning." In Big Data, 92–103. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-8331-3_6.
Mota, Abelardo Vieira, Ticiana Linhares Coelho da Silva, and José Antônio Fernandes De Macêdo. "Template-Based Multi-solution Approach for Data-to-Text Generation." In Advances in Databases and Information Systems, 157–70. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-54832-2_13.
Pandey, Abhishek Kumar, and Sanjiban Sekhar Roy. "Attention Based Bidirectional LSTM Model for Data-to-text Generation." In Advances in Computational Intelligence and Its Applications, 228–35. London: CRC Press, 2024. http://dx.doi.org/10.1201/9781003488682-29.
Upadhyay, Ashish, and Stewart Massie. "CBR Assisted Context-Aware Surface Realisation for Data-to-Text Generation." In Case-Based Reasoning Research and Development, 34–49. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-40177-0_3.
Belz, Anja, and Eric Kow. "Assessing the Trade-Off between System Building Cost and Output Quality in Data-to-Text Generation." In Empirical Methods in Natural Language Generation, 180–200. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-15573-4_10.
Roberti, Marco, Giovanni Bonetta, Rossella Cancelliere, and Patrick Gallinari. "Copy Mechanism and Tailored Training for Character-Based Data-to-Text Generation." In Machine Learning and Knowledge Discovery in Databases, 648–64. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-46147-8_39.
Upadhyay, Ashish, and Stewart Massie. "A Case-Based Approach for Content Planning in Data-to-Text Generation." In Case-Based Reasoning Research and Development, 380–94. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-14923-8_25.
Conference papers on the topic "Data-to-text generation":
Kale, Mihir, and Abhinav Rastogi. "Text-to-Text Pre-Training for Data-to-Text Tasks." In Proceedings of the 13th International Conference on Natural Language Generation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.inlg-1.14.
Kasner, Zdeněk, and Ondřej Dušek. "Data-to-Text Generation with Iterative Text Editing." In Proceedings of the 13th International Conference on Natural Language Generation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.inlg-1.9.
Liu, Mengzhu, Zhaonan Mu, Jieping Sun, and Cheng Wang. "Data-to-text Generation with Pointer-Generator Networks." In 2020 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA). IEEE, 2020. http://dx.doi.org/10.1109/aeeca49918.2020.9213600.
Perez-Beltrachini, Laura, and Claire Gardent. "Analysing Data-To-Text Generation Benchmarks." In Proceedings of the 10th International Conference on Natural Language Generation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/w17-3537.
Puduppully, Ratish, Li Dong, and Mirella Lapata. "Data-to-text Generation with Entity Modeling." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1195.
Lin, Shuai, Wentao Wang, Zichao Yang, Xiaodan Liang, Frank F. Xu, Eric Xing, and Zhiting Hu. "Data-to-Text Generation with Style Imitation." In Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.findings-emnlp.144.
Xu, Xinnuo, Ivan Titov, and Mirella Lapata. "Compositional Generalization for Data-to-Text Generation." In Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.findings-emnlp.623.
Chang, Ernie, Xiaoyu Shen, Dawei Zhu, Vera Demberg, and Hui Su. "Neural Data-to-Text Generation with LM-based Text Augmentation." In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.eacl-main.64.
Burgdorf, Andreas, Micaela Barkmann, André Pomp, and Tobias Meisen. "Domain-independent Data-to-Text Generation for Open Data." In 11th International Conference on Data Science, Technology and Applications. SCITEPRESS - Science and Technology Publications, 2022. http://dx.doi.org/10.5220/0011272900003269.
GONG, Li, Josep Crego, and Jean Senellart. "Enhanced Transformer Model for Data-to-Text Generation." In Proceedings of the 3rd Workshop on Neural Generation and Translation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/d19-5615.
Reports of organizations on the topic "Data-to-text generation":
Ma, Yue, and Felix Distel. Learning Formal Definitions for Snomed CT from Text. Technische Universität Dresden, 2013. http://dx.doi.org/10.25368/2022.193.
Foundation models such as ChatGPT through the prism of the UNESCO Recommendation on the Ethics of Artificial Intelligence. UNESCO, 2023. http://dx.doi.org/10.54678/bgiv6160.