Dissertations / Theses on the topic 'HTML documents'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'HTML documents.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Xie, Wei. "Classification of HTML Documents." University of Ballarat, 2006. http://archimedes.ballarat.edu.au:8080/vital/access/HandleResolver/1959.17/12774.
Master of Computing
Levering, Ryan Reed. "Multi-stage modeling of HTML documents." Diss., Online access via UMI, 2004.
Stachowiak, Maciej 1976. "Automated extraction of structured data from HTML documents." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9896.
Includes bibliographical references (leaf 45).
by Maciej Stachowiak.
M.Eng.
Nálevka, Petr. "Compound XML documents." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-1746.
Temelkuran, Baris 1980. "Hap-Shu : a language for locating information in HTML documents." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87882.
Meziane, Souad. "Analyse et conversion de documents : du pixel au langage HTML." Lyon, INSA, 1998. http://www.theses.fr/1998ISAL0128.
This work is part of the "Document Analysis" research theme at the Laboratoire Reconnaissance de Forme et Vision (RFV). To build an analysis system able to interpret documents and restore their structure, the methodologies we have chosen rely on several approaches, in particular the syntactic and structural approach to pattern recognition. The aim of this work is to convert paper documents into HTML, since HTML documents are the most widely used on the Internet. The application domain of such systems could be general; however, we concentrate on a particular type of document with rich typography: summaries. In this context, we have built a system that exploits, on the one hand, information about the content of the document, such as its physical and logical structure, and, on the other hand, two-level grammars. The system is composed of two grammars: a meta-grammar and a hyper-grammar. The role of the meta-grammar is to describe the physical and logical structure of the document; the hyper-grammar consists of a set of calculation rules and describes the processing required to convert the document into HTML. The summary analysis is done in two steps: analysis and identification of the document, then translation into HTML. During the first step, the system builds a learning base using grammatical inference; this base contains several patterns of summaries to identify. An unknown document submitted to the system is identified by matching it against the patterns in the base, using all the attributes obtained in the analysis step. The layout of the HTML document is constructed through the grammatical analysis of the hyper-grammar, obtained by translating the logical labels and some typographic parameters into HTML commands. The result of this grammatical analysis is the structured HTML document corresponding to the source document, which can then be viewed in a web browser.
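As an illustration of the final translation step this abstract describes (turning logical labels and typographic parameters into HTML commands), here is a minimal Python sketch; the label set and tag mapping are invented for illustration and are not the thesis's actual grammar rules.

# Hypothetical logical labels, as a document-analysis step might emit them;
# the mapping below is illustrative only.
LABEL_TO_TAG = {
    "title": "h1",
    "section": "h2",
    "author": "em",
    "page-number": "span",
}

def to_html(elements):
    # Render (label, text) pairs as an HTML fragment.
    parts = []
    for label, text in elements:
        tag = LABEL_TO_TAG.get(label, "p")  # unknown labels become paragraphs
        parts.append("<%s>%s</%s>" % (tag, text, tag))
    return "\n".join(parts)

print(to_html([("title", "Proceedings"), ("section", "Image Analysis"), ("page-number", "17")]))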
Mohammadzadeh, Hadi. "Improving Retrieval Accuracy in Main Content Extraction from HTML Web Documents." Doctoral thesis, Universitätsbibliothek Leipzig, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-130500.
The rapid growth of text-based information on the World Wide Web and the variety of applications that use this data make it necessary to develop efficient and effective methods for identifying the main content of a web document and separating it from additional content objects such as navigation menus, advertisements, design elements or disclaimers. In this thesis we first examine, develop and evaluate R2L, DANA, DANAg and AdDANAg, a family of novel algorithms for extracting the main content of web documents. The basic concept behind R2L, which also led to the development of the three further algorithms, exploits the peculiarities of right-to-left languages. Since the Latin character set and the right-to-left character sets are encoded in different sections of the Unicode character set, right-to-left characters are easily distinguished from Latin characters in an HTML file. This allows the R2L approach to detect regions of an HTML file with a high density of right-to-left characters and few Latin characters, and then to extract the right-to-left characters from those regions. The first extension, DANA, improves the effectiveness of the baseline algorithm by using an HTML parser in the post-processing phase of R2L to extract the content from regions with a high density of right-to-left characters. DANAg extends the R2L approach so that language independence is achieved. The third extension, AdDANAg, integrates a new preprocessing step to, among other things, normalize web links. The presented approaches are analyzed with respect to efficiency and effectiveness, and a comparison with several established main-content extraction algorithms shows that they are superior on both counts. Beyond this, the extraction of headlines from web articles has many applications; for this we develop TitleFinder, an approach that relies only on the textual content and is language-dependent. The presented method outperforms, in terms of effectiveness and efficiency, known approaches that rely on structural and visual properties of the HTML file.
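The core R2L idea, classifying characters by their Unicode range and keeping regions dense in right-to-left characters, can be sketched in a few lines of Python; the character ranges and the threshold below are illustrative simplifications, not the algorithm's exact parameters.

import re

def is_rtl(ch):
    # Hebrew (U+0590-U+05FF) and Arabic (U+0600-U+06FF); an illustrative
    # subset of the right-to-left Unicode blocks.
    return "\u0590" <= ch <= "\u06ff"

def rtl_dense_lines(html, min_ratio=0.5):
    # Keep lines whose letters are dominated by right-to-left characters.
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
    kept = []
    for line in text.splitlines():
        letters = [c for c in line if c.isalpha()]
        if letters and sum(map(is_rtl, letters)) / len(letters) >= min_ratio:
            kept.append(line.strip())
    return kept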
Yerra, Rajiv. "Detecting Similar HTML Documents Using A Sentence-Based Copy Detection Approach." Diss., CLICK HERE for online access, 2005. http://contentdm.lib.byu.edu/ETD/image/etd977.pdf.
Singer, Ron. "Comparing machine learning and hand-crafted approaches for information extraction from HTML documents." Thesis, McGill University, 2003. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=79127.
Full textJABOUR, IAM VITA. "THE IMPACT OF STRUCTURAL ATTRIBUTES TO IDENTIFY TABLES AND LISTS IN HTML DOCUMENTS." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2010. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=17247@1.
The segmentation of HTML documents has been essential to information extraction tasks, as shown by several studies in this area. This work studies the link between an HTML document and its visual representation to show how it helps segment identification using a structural approach. For this, we investigate how tree edit distance algorithms can find structural similarities in a DOM tree, using two tasks to run our experiments. The first one is the identification of genuine tables, where we obtained a 90.40% F1 score using the corpus provided by (Wang and Hu, 2002). We show through an experimental study that this result is competitive with the best results in the area. The second task studied is the identification of product listings in e-commerce sites. Here we obtained a 94.95% F1 score using a corpus with 1114 HTML documents from 8 distinct sites. We conclude that tree-similarity algorithms provide competitive results for both tasks, making them good candidates for identifying other types of segments.
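The flavor of the structural comparison can be shown with a short Python sketch: a simplified top-down tree distance over DOM-like trees, which stands in for the full tree edit distance algorithms used in the dissertation (node labels here are invented).

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def size(t):
    return 1 + sum(size(c) for c in t.children)

def dist(a, b):
    # Label mismatch cost plus an edit-distance alignment of the child lists.
    cost = 0 if a.label == b.label else 1
    ca, cb = a.children, b.children
    m, n = len(ca), len(cb)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(ca[i - 1])  # delete subtree
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(cb[j - 1])  # insert subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + size(ca[i - 1]),
                          d[i][j - 1] + size(cb[j - 1]),
                          d[i - 1][j - 1] + dist(ca[i - 1], cb[j - 1]))
    return cost + d[m][n]

row = lambda: Node("tr", [Node("td"), Node("td")])
print(dist(Node("table", [row(), row()]), Node("table", [row()])))  # 3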
Mysore, Gopinath Abhijith Athreya. "Automatic Detection of Section Title and Prose Text in HTML Documents Using Unsupervised and Supervised Learning." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1535371714338677.
Al Assimi, Abdel-Basset. "Gestion de l'évolution non centralisée de documents parallèles multilingues." Université Joseph Fourier (Grenoble), 2000. http://www.theses.fr/2000GRE10127.
Full textBukovčák, Jakub. "Extrakce informací z webových stránek." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403184.
Mohammadzadeh, Hadi [Verfasser], Gerhard [Akademischer Betreuer] Heyer, Gerhard [Gutachter] Heyer, and Jinan [Gutachter] Fiaidhi. "Improving Retrieval Accuracy in Main Content Extraction from HTML Web Documents / Hadi Mohammadzadeh ; Gutachter: Gerhard Heyer, Jinan Fiaidhi ; Betreuer: Gerhard Heyer." Leipzig : Universitätsbibliothek Leipzig, 2013. http://d-nb.info/1237818303/34.
Cheriat, Ahmed. "Une Méthode de correction de la structure de documents XML dans le cadre d'une validation incrémentale." Tours, 2006. http://www.theses.fr/2006TOUR4022.
XML has become the main tool used to exchange data on the web. In this context, XML documents should respect schema constraints that describe the structural form of XML documents. Validating an XML document with respect to a schema consists in testing whether the document satisfies the set of structural specifications described by that schema. When updates are applied to the document, an incremental validator verifies whether the updated document still complies with the schema by validating only the parts of the document involved in the updates (to reduce the cost of validating the whole XML document from scratch). In this thesis we associate the validation process with correction proposals. During the execution of our validation method, if a constraint violation is found, a correction routine is called in order to propose local solutions that allow the validation process to continue. First, we are interested in a special case of this problem, which consists in correcting a word with respect to another word (the initial valid word) and with respect to a regular language. Indeed, correcting an XML document having only a root and its children corresponds to correcting a word (formed by the children of the root) with respect to a regular language (the constraint associated with the root). Second, we extend this idea to the correction of XML trees (the structure of documents) with respect to tree languages (the schema associated with an XML document). The correction applies a minimal number of modifications (insertions, deletions or replacements of elements) to an invalid XML document in order to obtain a valid one. The algorithms presented in this thesis were implemented in Java and experimental results are reported.
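The word-correction subproblem has a compact illustration: a shortest-path search for the minimum number of edits that make a word acceptable to a finite automaton. The following Python sketch uses a toy automaton and unit costs; it is an assumption-laden stand-in, not the thesis's algorithm.

import heapq

def correct_word(word, delta, start, accepting):
    # Minimum insert/delete/substitute edits so the DFA accepts the word.
    # delta: dict mapping (state, symbol) -> state.
    alphabet = {sym for (_, sym) in delta}
    dist = {(0, start): 0}
    heap = [(0, 0, start)]
    while heap:
        d, i, q = heapq.heappop(heap)
        if d > dist.get((i, q), float("inf")):
            continue
        if i == len(word) and q in accepting:
            return d
        cand = []
        if i < len(word):
            cand.append((i + 1, q, 1))  # delete word[i]
        for a in alphabet:
            nq = delta.get((q, a))
            if nq is None:
                continue
            cand.append((i, nq, 1))  # insert a
            if i < len(word):
                cand.append((i + 1, nq, 0 if word[i] == a else 1))  # match/substitute
        for ni, nq, c in cand:
            if d + c < dist.get((ni, nq), float("inf")):
                dist[(ni, nq)] = d + c
                heapq.heappush(heap, (d + c, ni, nq))
    return None

# Toy schema: the root's children must match a(b)*c.
delta = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
print(correct_word("abbc", delta, 0, {2}))  # 0 edits: already valid
print(correct_word("bbc", delta, 0, {2}))   # 1 edit: insert the leading 'a'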
Grabs, Torsten. "Storage and retrieval of XML documents with a cluster of database systems /." Berlin : Aka, 2003. http://www.loc.gov/catdir/toc/fy0713/2007435297.html.
Eeg-Tverbakk, Camilla. "Theatre-ting : toward a materialist practice of staging documents." Thesis, University of Roehampton, 2016. https://pure.roehampton.ac.uk/portal/en/studentthesis/theatre-–-ting(dd5f299e-6fdc-4c69-bed6-7ae690de6a8d).html.
Kocman, Radim. "Podpora dynamického DOM v zobrazovacím stroji HTML." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236139.
Milosevic, Nikola. "A multi-layered approach to information extraction from tables in biomedical documents." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/a-multilayered-approach-to-information-extraction-from-tables-in-biomedical-documents(c2edce9c-ae7f-48fa-81c2-14d4bb87423e).html.
Merkl-Davies, Doris. "The obfuscation hypothesis re-examined : analyzing impression management in corporate narrative report documents." Thesis, Bangor University, 2007. https://research.bangor.ac.uk/portal/en/theses/the-obfuscation-hypothesis-reexamined--analyzing-impression-management-in-corporate-narrative-report-documents(3fd58e2c-790a-44b7-80c8-2c4b41ef72c3).html.
Tao, Cui. "Schema Matching and Data Extraction over HTML Tables." Diss., CLICK HERE for online access, 2003. http://contentdm.lib.byu.edu/ETD/image/etd279.pdf.
Haber, Renato Ribeiro. "Uma Ferramenta de Importação de Documentos HTML para um Ambiente de Ensino." Universidade de São Paulo, 1999. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-09032018-141601/.
This work presents a tool prototype, Html2flip, that provides an environment for importing documents described in the HTML (HyperText Markup Language) standard and adapting them to the internal representation of SASHE (Hypermedia System for Authorship and Supporting Educational Applications), which is based on the structural organization of multimedia objects proposed by the MCA (Nested Contexts Model). Moreover, this work extended the capacity of the text information node editor of the previous prototype concerning the processing of text files described in the RTF (Rich Text Format) standard. In this way, SASHE became capable of processing and organizing instructional materials prepared in its own environment, in the WWW (World Wide Web) environment, as well as in common word processors.
Silva, Patrick Pedreira. "ExtraWeb: um sumarizador de documentos Web baseado em etiquetas HTML e ontologia." Universidade Federal de São Carlos, 2006. https://repositorio.ufscar.br/handle/ufscar/322.
Financiadora de Estudos e Projetos
This dissertation presents an automatic summarizer of Web documents based on both HTML tags and ontological knowledge, derived from two independent approaches: one that focuses solely on HTML tags and another that focuses only on ontological knowledge. The three approaches were implemented and assessed, indicating that combining both knowledge types yields a promising descriptive power for Web documents. The resulting prototype is named ExtraWeb. ExtraWeb explores the HTML structure of Web documents in Portuguese and semantic information from the Yahoo ontology in Portuguese, enriched with additional terms extracted from a thesaurus, Diadorim, and from Wikipedia. In a simulated Web search, ExtraWeb achieved a utility degree similar to Google's, showing its potential to signal, through extracts, the relevance of the retrieved documents. This has become an important issue recently: extracts may be particularly useful as surrogates for the descriptions currently provided by search engines, and may even substitute for the corresponding source documents. In the former case, those descriptions do not necessarily convey the relevant content of the documents; in the latter, reading full documents demands substantial effort from Web users. In both cases, extracts may improve the search task, provided that they actually signal relevant content. ExtraWeb is thus a potential plug-in for search engines, improving the way their results are presented, although its scalability and deployment in a real setting have not yet been explored.
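A minimal Python sketch of the tag-based half of such a summarizer follows; the tag weights and the scoring rule are invented for illustration, and the ontology lookup is reduced to a plain set of terms.

import re

# Hypothetical weights: text inside prominent HTML tags counts more.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.5, "h2": 2.0, "b": 1.5, "p": 1.0}

def score_sentences(tagged_sentences, terms):
    # tagged_sentences: (tag, sentence) pairs already extracted from the page;
    # score = tag weight * count of query/ontology terms in the sentence.
    scored = []
    for tag, sent in tagged_sentences:
        words = re.findall(r"\w+", sent.lower())
        hits = sum(w in terms for w in words)
        scored.append((TAG_WEIGHTS.get(tag, 0.5) * hits, sent))
    return [s for sc, s in sorted(scored, reverse=True) if sc > 0]

page = [("h1", "Summarization of Web documents"),
        ("p", "We combine HTML tags with an ontology."),
        ("p", "Unrelated footer text.")]
print(score_sentences(page, {"summarization", "ontology", "documents"}))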
Oliver, Robert W. "The vocation of the laity to evangelization an ecclesiological inquiry into the Synod on the laity (1987), Christifideles laici (1989), and documents of the NCCB (1987-1996) /." Roma : Editrice Pontificia Università Gregoriana, 1997. http://catalog.hathitrust.org/api/volumes/oclc/37849170.html.
Chen, Xueqi. "Query Rewriting for Extracting Data behind HTML Forms." Diss., CLICK HERE for online access, 2004. http://contentdm.lib.byu.edu/ETD/image/etd406.Chen.
Full textStewart, Jeffrey D. "An XML-based knowledge management system of port information for U.S. Coast Guard Cutters." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03Mar%5FStewart.pdf.
Thesis advisor(s): Magdi N. Kamel, Gordon H. Bradley. Includes bibliographical references (p. 101-103). Also available online.
Chen, Benfeng. "Transforming Web pages to become standard-compliant through reverse engineering /." View abstract or full-text, 2006. http://library.ust.hk/cgi/db/thesis.pl?COMP%202006%20CHEN.
Mull, Randall Franklin. "Teaching web design at the higher education level." Morgantown, W. Va. : [West Virginia University Libraries], 2001. http://etd.wvu.edu/templates/showETD.cfm?recnum=1954.
Title from document title page. Document formatted into pages; contains iii, 47 p. Vita. Includes abstract. Includes bibliographical references (p. 36-37).
Nicoletti, Alberto. "Conversione di documenti DOCX in formato RASH." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12298/.
Majdzadeh Khandani, Kourosh. "Rights and liabilities of the consignees/endorsees : a comparative study of the Rotterdam Rules and English Law." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/rights-and-liabilities-of-the-consigneesendorsees-a-comparative-study-of-the-rotterdam-rules-and-english-law(aa10e154-facf-4573-a10f-30786c51e4c0).html.
Full textParker, Rembert N. "An introduction to computer programming for complete beginners using HTML, JavaScript, and C#." CardinalScholar 1.0, 2008. http://liblink.bsu.edu/uhtbin/catkey/1465970.
Department of Computer Science
Cohen, Eric Joseph. "An investigation into World Wide Web publishing with the Hypertext Markup Language /." Online version of thesis, 1995. http://hdl.handle.net/1850/12229.
Full textHan, Wei. "Wrapper application generation for semantic web." Diss., Georgia Institute of Technology, 2003. http://hdl.handle.net/1853/5407.
Full textPaolucci, Francesco. "A Fitting Algorithm: applicazione automatica di vincoli tipografici per la stampa di documenti testuali su browser." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20534/.
Full textTandon, Seema Amit. "Web Texturizer: Exploring intra web document dependencies." CSUSB ScholarWorks, 2004. https://scholarworks.lib.csusb.edu/etd-project/2539.
Full textMoura, Antonio Gilberto de. "Proposta de um sistema para geração de Applets Java para animação de paginas HTML destinadas a educação a distancia." [s.n.], 2002. http://repositorio.unicamp.br/jspui/handle/REPOSIP/260239.
Full textDissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação
Master's degree
Al-Dallal, Ammar Sami. "Enhancing recall and precision of web search using genetic algorithm." Thesis, Brunel University, 2012. http://bura.brunel.ac.uk/handle/2438/7379.
Full textRubano, Vincenzo. "L'(in)accessibilità degli articoli scientifici sul Web e l'uso di RASH e EPUB." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2016. http://amslaurea.unibo.it/12281/.
Full textWest, Philip. "A framework for responsive content adaptation in electronic display networks." Thesis, Rhodes University, 2006. http://hdl.handle.net/10962/d1004824.
Full textTicona, Quispe Miguel. "La conservacion preventiva y curativa de los documentos publicos oficiales en la Biblioteca Central de la Universidad Mayor de San Andres." Universidad Mayor de San Andrés. Programa Cybertesis BOLIVIA, 2003. http://www.cybertesis.umsa.bo:8080/umsa/2007/ticona_qm/html/index-frames.html.
Full textGonzalez-Ayala, Sofia Natalia. "Black, Afro-Colombian, Raizal and Palenquero communities at the National Museum of Colombia : a reflexive ethnography of (in)visibility, documentation and participatory collaboration." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/black-afrocolombian-raizal-and-palenquero-communities-at-the-national-museum-of-colombia-a-reflexive-ethnography-of-invisibility-documentation-and-participatory-collaboration(e40c8594-35c7-49b9-af1c-ccca82cb335f).html.
Full textNagrath, Vineet. "Software architectures for cloud robotics : the 5 view Hyperactive Transaction Meta-Model (HTM5)." Thesis, Dijon, 2015. http://www.theses.fr/2015DIJOS005/document.
Full textSoftware development for cloud connected robotic systems is a complex software engineeringendeavour. These systems are often an amalgamation of one or more robotic platforms, standalonecomputers, mobile devices, server banks, virtual machines, cameras, network elements and ambientintelligence. An agent oriented approach represents robots and other auxiliary systems as agents inthe system.Software development for distributed and diverse systems like cloud robotic systems require specialsoftware modelling processes and tools. Model driven software development for such complexsystems will increase flexibility, reusability, cost effectiveness and overall quality of the end product.The proposed 5-view meta-model has separate meta-models for specifying structure, relationships,trade, system behaviour and hyperactivity in a cloud robotic system. The thesis describes theanatomy of the 5-view Hyperactive Transaction Meta-Model (HTM5) in computation independent,platform independent and platform specific layers. The thesis also describes a domain specificlanguage for computation independent modelling in HTM5.The thesis has presented a complete meta-model for agent oriented cloud robotic systems and hasseveral simulated and real experiment-projects justifying HTM5 as a feasible meta-model
Baroni, Andrea. "Pattern design per documenti strutturati: il problema della conversione da e per formati documentali tradizionali." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14422/.
Full textSebastião, Cláudio Barradas. "Proposta de um modelo conceitual de ferramenta para monitoramento de documento na web." Florianópolis, SC, 2003. http://repositorio.ufsc.br/xmlui/handle/123456789/84744.
Full textMade available in DSpace on 2012-10-20T12:07:30Z (GMT). No. of bitstreams: 1 198175.pdf: 1739707 bytes, checksum: 290e2c7ab594e697dc4fabda07e89adb (MD5)
The Web can be seen in two ways: services and content. Content is the set of electronic information that can be published through the Web medium, and by services we mean the set of functionalities that enable the extraction, integration, publication and visualization of that content. With this view, this study contemplates a broad structuring of how to develop Web pages and manage them in a practical, safe and responsible way, making use of all the options that the numerous Web development tools provide.
Sire, Guillaume. "La production journalistique et Google : chercher à ce que l’information soit trouvée." Thesis, Paris 2, 2013. http://www.theses.fr/2013PA020040/document.
Full textIn this thesis, we aim to disentangle the cooperative but also competitive relationship between Google and news publishers, which is at the same time technical, economic, legal, social, political and certainly communicational. In order to do so, we trace the historical development of two singular universes, describing what publishers can do to overcome the search engine and optimize their ranking. We then analyse how Google can influence publishers’ conduct, by studying power relations, respective incentives, aims, and informational and socio-economic backgrounds. Finally, we report on actual practices of French traditional news publishers: what they communicate to Google, by which means and at what price, for which expected results, after which concessions, detours and controversies. Thus, we explain how search engine optimization is likely to affect the way content is valued, its production organisation, the website’s structure, journalists’ prac tice an editorial policy. We show a back and forth movement between performative utterances and performed circumstances, having an effect on and by texts, architexts and hypertexts. To sum up, this thesis is dedicated to understanding what happens to news and publishers once they strive for their information to be found by Google's users
Björk, Linus. "Avancerad webbteknologi i mobila webbläsare." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-65921.
Full textThe web develops fast and web applications are getting more advanced. At the same time the mobile browsers develop at a rapid pace. However, it still differs a lot between a mobile browser and a standard web browser. You also interact with a mobile phone in a different way than what you do with a computer. This thesis examines whether it is possible to create advanced web applications that by utilizing the latest web technologies can replace ordinary mobile applications. The investigation is done by creating a lightweight version of a phone application, Mobile Documents on Symbian S60, which is an application that manages documents, emails and attachments. The development is done in Google Web Toolkit and technologies such as AJAX and Comet are both used. As the number of different types of phones with touch screens is very large the investigation only will target a small number of phones running web browsers as Mobile Safari, microB and Android Browser. The conclusions of this report is that the JavaScript support of today's browsers is enough to run advanced web applications. However, it differs a lot between browsers and the main problem is to create a functional user interface that works equally well on all phones and with all the different interaction possibilities that a mobile phone gives.
Hendges, Graciela Rabuske. "Tackling genre classification." Florianópolis, SC, 2007. http://repositorio.ufsc.br/xmlui/handle/123456789/90448.
Full textMade available in DSpace on 2012-10-23T10:39:26Z (GMT). No. of bitstreams: 1 249271.pdf: 3171345 bytes, checksum: 00f207cece278de30d1f5b7fd246c496 (MD5)
Recent research on scientific communication has revealed that since the late 1990s the use of academic journals has shifted from print to electronic media (Tenopir, 2002, 2003; Tenopir & King, 2001, 2002), and consequently it has been predicted that by around 2010 some 80% of journals would have online versions only (Harnad, 1998). However, this research also shows that not all disciplines are migrating to the Internet at the same speed. While fields such as Information Science, Archivology, Web design and Medicine have shown interest and concern in understanding and explaining this phenomenon, in Applied Linguistics, particularly in Genre Analysis, studies are still scarce. In this work, therefore, I investigate to what extent the electronic medium (the Internet) affects the research article genre in its move from print to electronic media. More specifically, I examine research articles in HTML in the fields of Linguistics and Medicine in order to verify whether this hypertext is a new genre or not. The methodological approach adopted in this research derives from the proposals of Askehave and Swales (2001) and Swales (2004), in which the predominant criterion for classifying a genre is its communicative purpose, which can only be defined on the basis of textual as well as contextual analysis. Accordingly, both textual and contextual data were collected and analysed, and the results of both analyses reveal that the research article in HTML is a new genre, whose communicative purpose is realized by hyperlinks, and that this genre is therefore deeply dependent on the electronic medium.
Costa, José Henrique Calenzo. "Filtered-page ranking." Repositório Institucional da UFSC, 2016. https://repositorio.ufsc.br/xmlui/handle/123456789/167840.
Full textMade available in DSpace on 2016-09-20T04:25:42Z (GMT). No. of bitstreams: 1 341906.pdf: 4935734 bytes, checksum: 5630ca8c10871314b7f54120d18ae335 (MD5) Previous issue date: 2016
Web page ranking algorithms can be created using content-based, structure-based or user search-based techniques. This research addresses a user search-based approach applied over previously filtered documents, which relies on a segmentation process to extract irrelevant content from documents before ranking. The process splits the document into three categories of blocks in order to fragment the document and eliminate irrelevant content. The ranking method, called Filtered-Page Ranking (FPR), has two main steps: (i) irrelevant content extraction; and (ii) document ranking. The focus of the extraction step is to eliminate irrelevant content from the document, by means of the Query-Based Blocks Mining (QBM) algorithm, creating a tree that is evaluated in the ranking process. During the ranking step, the focus is to calculate the relevance of each document for a given query, using criteria that give importance to specific parts of the document and to the highlighted features of some HTML elements. Our proposal is compared to two baselines, the classic vector space model and the CETR noise-removal algorithm, and the results demonstrate that our irrelevant content removal algorithm improves the results and that our relevance criteria are pertinent to the process.
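A toy Python sketch of the filter-then-rank idea follows; simple query-term overlap stands in for QBM's block categories and for the dissertation's actual relevance criteria.

import re

def terms(text):
    return re.findall(r"\w+", text.lower())

def filter_blocks(blocks, query):
    # Keep only blocks sharing at least one term with the query
    # (a stand-in for the QBM irrelevant-content elimination).
    q = set(terms(query))
    return [b for b in blocks if q & set(terms(b))]

def score(blocks, query):
    # Rank value for a page: query-term frequency over its kept blocks.
    q = set(terms(query))
    words = [w for b in filter_blocks(blocks, query) for w in terms(b)]
    return sum(w in q for w in words) / (len(words) or 1)

pages = {
    "page1": ["HTML ranking algorithms", "site navigation menu", "contact us"],
    "page2": ["cookie banner", "advertising block"],
}
query = "ranking HTML documents"
print(sorted(pages, key=lambda p: score(pages[p], query), reverse=True))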
Maddipudi, Koushik. "Efficient Architectures for Retrieving Mixed Data with Rest Architecture Style and HTML5 Support." TopSCHOLAR®, 2013. http://digitalcommons.wku.edu/theses/1251.