Academic literature on the topic 'HTML documents'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'HTML documents.'

Next to every source in the list of references is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "HTML documents"

1. Bonhomme, Stéphane, and Cécile Roisin. "Interactively restructuring HTML documents." Computer Networks and ISDN Systems 28, no. 7-11 (May 1996): 1075–84. http://dx.doi.org/10.1016/0169-7552(96)00042-6.

2. Sato, S. Y. "Dynamic rewriting of HTML documents." Computer Networks and ISDN Systems 27, no. 2 (November 1994): 307–8. http://dx.doi.org/10.1016/s0169-7552(94)90147-3.

3. von Tetzchner, J. Stephenson. "Converting formatted documents to HTML." Computer Networks and ISDN Systems 27, no. 2 (November 1994): 309–10. http://dx.doi.org/10.1016/s0169-7552(94)90154-6.

4. O, Geum-Yong, and In-Jun Hwang. "Automatically Converting HTML Documents with Similar Pattern into XML Documents." KIPS Transactions: Part D 9D, no. 3 (June 1, 2002): 355–64. http://dx.doi.org/10.3745/kipstd.2002.9d.3.355.

5. Kaji, Nobuhiro, and Masaru Kitsuregawa. "Acquiring Polar Sentences from HTML Documents." Journal of Natural Language Processing 15, no. 3 (2008): 77–90. http://dx.doi.org/10.5715/jnlp.15.3_77.

6. Gupta, Suhit, Gail E. Kaiser, Peter Grimm, Michael F. Chiang, and Justin Starren. "Automating Content Extraction of HTML Documents." World Wide Web 8, no. 2 (June 2005): 179–224. http://dx.doi.org/10.1007/s11280-004-4873-3.

7. Vállez, Mari, Rafael Pedraza-Jiménez, Lluís Codina, Saúl Blanco, and Cristòfol Rovira. "A semi-automatic indexing system based on embedded information in HTML documents." Library Hi Tech 33, no. 2 (June 15, 2015): 195–210. http://dx.doi.org/10.1108/lht-12-2014-0114.

Abstract: Purpose – The purpose of this paper is to describe and evaluate the tool DigiDoc MetaEdit, which allows the semi-automatic indexing of HTML documents. The tool works by identifying and suggesting keywords from a thesaurus according to the embedded information in HTML documents. This enables keyword assignment to be parameterized by how frequently terms appear in the document, by the relevance of their position, or by a combination of both. Design/methodology/approach – To evaluate the efficiency of the indexing tool, the descriptors/keywords suggested by the tool are compared with the keywords assigned manually by human experts. For this comparison, a corpus of HTML documents was randomly selected from a journal devoted to Library and Information Science. Findings – The evaluation shows that, first, there is close to a 50 per cent match or overlap between the two indexing systems; if related terms and narrower terms are taken into consideration, the matches reach 73 per cent. Second, the first terms identified by the tool are the most relevant. Originality/value – The tool presented identifies the most important keywords in an HTML document based on its embedded information. Representing the contents of documents with keywords is now an essential practice in areas such as information retrieval and e-commerce. (A rough Python sketch of this kind of frequency-and-position keyword scoring appears at the end of this list.)

8. Thiemann, Peter. "A typed representation for HTML and XML documents in Haskell." Journal of Functional Programming 12, no. 4-5 (July 2002): 435–68. http://dx.doi.org/10.1017/s0956796802004392.

Abstract: We define a family of embedded domain-specific languages for generating HTML and XML documents. Each language is implemented as a combinator library in Haskell. The generated HTML/XML documents are guaranteed to be well-formed. In addition, each library can guarantee, to a certain extent, that the generated documents are valid XML documents (for HTML only a weaker guarantee is possible). On top of the libraries, Haskell serves as a meta language to define parameterized documents, to map structured documents to HTML/XML, to define conditional content, or to define entire web sites. The combinator libraries support element-transforming style, a programming style that allows programs to have a visual appearance similar to HTML/XML documents, without modifying the syntax of Haskell. (A loose Python analogue of this well-formed-by-construction approach appears at the end of this list.)

9. Gupta, Shivangi, and Mukesh Rawat. "Keyword based Automatic Summarization of HTML Documents." International Journal of Computer Applications 127, no. 8 (October 15, 2015): 24–29. http://dx.doi.org/10.5120/ijca2015906421.

10. Wu, Qi, Xing-shu Chen, Kai Zhu, and Chun-hui Wang. "Relevance-based content extraction of HTML documents." Journal of Central South University 19, no. 7 (July 2012): 1921–26. http://dx.doi.org/10.1007/s11771-012-1226-8.

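To make entry 7's indexing idea concrete, here is a minimal Python sketch of frequency-and-position keyword scoring: candidate thesaurus terms earn more weight when they appear in the title than in headings, and more in headings than in body text. The weights, the toy thesaurus, and all helper names are illustrative assumptions for this sketch, not the DigiDoc MetaEdit implementation.

    import re
    from collections import defaultdict

    # Illustrative position weights (assumed values): title beats headings,
    # headings beat body text.
    WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

    # A toy "thesaurus" of candidate descriptors (assumed vocabulary).
    THESAURUS = {"information retrieval", "indexing", "metadata", "html"}

    def extract_regions(html: str) -> dict:
        """Split a document into title, heading, and body text regions."""
        title = " ".join(re.findall(r"<title[^>]*>(.*?)</title>", html, re.I | re.S))
        headings = " ".join(re.findall(r"<h[1-6][^>]*>(.*?)</h[1-6]>", html, re.I | re.S))
        body = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping; includes all text
        return {"title": title, "heading": headings, "body": body}

    def score_terms(html: str) -> dict:
        """Score each thesaurus term by position-weighted frequency."""
        scores = defaultdict(float)
        for region, text in extract_regions(html).items():
            text = text.lower()
            for term in THESAURUS:
                scores[term] += text.count(term) * WEIGHTS[region]
        return dict(scores)

    doc = ("<html><head><title>Indexing HTML</title></head>"
           "<body><h1>Metadata and indexing</h1>"
           "<p>Notes on information retrieval.</p></body></html>")
    for term, score in sorted(score_terms(doc).items(), key=lambda kv: -kv[1]):
        if score > 0:
            print(f"{term}: {score}")
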
More sources
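
Entry 8's central idea is that markup generated through typed combinators is well-formed by construction. Haskell's static guarantees do not carry over to a dynamically typed language, but the by-construction part does: build documents as trees, never as strings. A loose Python sketch of that idea, with all names invented for illustration:

    from html import escape

    class Element:
        """An HTML element; rendering always emits matching, nested tags."""
        def __init__(self, tag, *children, **attrs):
            self.tag, self.children, self.attrs = tag, children, attrs

        def render(self) -> str:
            attrs = "".join(f' {k}="{escape(str(v), quote=True)}"'
                            for k, v in self.attrs.items())
            inner = "".join(c.render() if isinstance(c, Element) else escape(str(c))
                            for c in self.children)
            return f"<{self.tag}{attrs}>{inner}</{self.tag}>"

    # Combinator-style helpers: documents are composed as ordinary expressions.
    def html(*c): return Element("html", *c)
    def body(*c): return Element("body", *c)
    def p(*c, **a): return Element("p", *c, **a)

    # Since output only exists via render(), no tag can be left unclosed and
    # all text is escaped: a weak dynamic analogue of the paper's guarantee.
    print(html(body(p("Fish & chips ", Element("em", "<cheap>")))).render())
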

Dissertations / Theses on the topic "HTML documents"

1. Xie, Wei. "Classification of HTML Documents." Master's thesis, University of Ballarat, 2006. http://archimedes.ballarat.edu.au:8080/vital/access/HandleResolver/1959.17/12774.

Abstract: Text classification is the task of mapping a document into one or more classes based on the presence or absence of words (or features) in the document. It has been studied intensively, and different classification techniques and algorithms have been developed. This thesis focuses on the classification of online documents, which has become more critical with the development of the World Wide Web. The WWW vastly increases the availability of online documents in digital format and has highlighted the need to classify them. Against this background, "automatic Web classification" has emerged: classifying HTML-like documents into classes or categories using not only methods inherited from traditional text classification, but also the extra information provided only by Web pages. Our work is based on the fact that Web documents contain not only ordinary features (words) but also extra information, such as metadata and hyperlinks, that can benefit the classification process. The aim of this research is to study various ways of using this extra information, in particular the hyperlink information provided by HTML documents (Web pages). The merit of the approach developed in this thesis is its simplicity compared with existing approaches. We present different ways of using hyperlink information to improve the effectiveness of Web classification. Unlike other work in this area, we use only the mappings between linked documents and their own class or classes: we add a few features, called linked-class features, to the datasets, and then apply classifiers for classification. In the numerical experiments we adopted two well-known text classification algorithms, Support Vector Machines and BoosTexter. The results show that classification accuracy can be improved by using mixtures of ordinary and linked-class features. Moreover, out-links usually work better than in-links in classification. We also analyse and discuss the reasons behind this improvement. (A duplicate record of the same thesis is available at http://archimedes.ballarat.edu.au:8080/vital/access/HandleResolver/1959.17/15628.)

2. Levering, Ryan Reed. "Multi-stage modeling of HTML documents." Diss., online access via UMI, 2004.

3. Stachowiak, Maciej. "Automated extraction of structured data from HTML documents." Thesis (M.Eng.), Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 1998. http://hdl.handle.net/1721.1/9896. Includes bibliographical references (leaf 45).

4. Nálevka, Petr. "Compound XML documents." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-1746.

Abstract: This thesis examines various characteristics of compound documents and shows the potential advantages of using such documents in today's Web environment. The main focus is on the problems associated with validating compound documents, and the thesis surveys different approaches to solving them. The NVDL (Namespace-based Validation Dispatching Language) validation method is described in detail: the thesis explains the main principles of NVDL, examines its advantages and disadvantages compared with other approaches, and introduces JNVDL, a complete implementation of the NVDL specification written in Java as part of this work. JNVDL is described not only in its technical aspects but also from the user's perspective. To verify its usability, JNVDL was integrated into Relaxed, an existing web-document validation project, to make compound-document validation easily accessible to web content authors.

5. Temelkuran, Baris. "Hap-Shu: a language for locating information in HTML documents." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87882.

6. Meziane, Souad. "Analyse et conversion de documents : du pixel au langage HTML." Lyon, INSA, 1998. http://www.theses.fr/1998ISAL0128.

Abstract: This work is part of the "Document Analysis" research theme of the Reconnaissance de Forme et Vision (RFV) laboratory. To build a system able to analyse documents and restore their structure, the chosen methodologies lean on several approaches, in particular the syntactic and structural approach to pattern recognition. The aim of this work is to convert paper documents into electronic documents such as HTML documents, since these are the documents most widely used on the Internet. The application domain of such a system could be general; however, we concentrate first on a particular type of document with rich typography: summaries (tables of contents). In this context, we have built a system that exploits, on the one hand, the physical and logical structures of the document and, on the other hand, the inference of a two-level grammar composed of a meta-grammar and a hyper-grammar. The meta-grammar describes the physical and logical structures of the document; the hyper-grammar describes the processing required to convert the document into HTML. A summary is analysed in two steps: analysis and identification of the document, then translation into HTML. In the first step, the system builds a learning base using grammatical inference; this base contains several models of summaries to be identified. An unknown document submitted to the system is identified by matching it against the models in the base, using all the information produced by the analysis stage. The layout of the HTML document is based on the grammatical analysis of the hyper-grammar, obtained by translating the logical labels and typographic parameters into HTML commands. The result is an HTML document equivalent to the source document, which can be viewed in a browser.

7. Mohammadzadeh, Hadi. "Improving Retrieval Accuracy in Main Content Extraction from HTML Web Documents." Doctoral thesis, Universitätsbibliothek Leipzig, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-130500.

Abstract: The rapid growth of text-based information on the World Wide Web, and the various applications making use of this data, motivate the need for efficient and effective methods to identify and separate the "main content" from additional content items such as navigation menus, advertisements, design elements, or legal disclaimers. Firstly, this thesis studies, develops, and evaluates R2L, DANA, DANAg, and AdDANAg, a family of novel algorithms for extracting the main content of web documents. The main concept behind R2L, which also provided the initial idea and motivation for the other three algorithms, is to exploit particularities of right-to-left languages to obtain the main content of web pages. As the English character set and the right-to-left character sets are encoded in different intervals of the Unicode character set, right-to-left characters can be efficiently distinguished from English ones in an HTML file. This enables the R2L approach to recognize areas of the HTML file with a high density of right-to-left characters and a low density of English characters; having recognized these areas, R2L extracts just the right-to-left characters. The first extension of R2L, DANA, improves the effectiveness of the baseline algorithm by employing an HTML parser in a post-processing phase to extract the main content from areas with a high density of right-to-left characters. DANAg, the second extension, generalizes the idea of R2L to render it language-independent. AdDANAg, the third extension, integrates a new preprocessing step to normalize hyperlink tags. The presented approaches are analyzed in terms of efficiency and effectiveness; compared with several established main content extraction algorithms, they extend the state of the art on both counts. Secondly, automatically extracting the headlines of web articles has many applications. The thesis develops and evaluates a content-based and language-independent approach, TitleFinder, for unsupervised extraction of the headlines of web articles. The proposed method achieves high effectiveness and efficiency and outperforms approaches operating on structural and visual features. (A simplified reconstruction of the R2L density idea appears at the end of this list.)

8. Yerra, Rajiv. "Detecting Similar HTML Documents Using a Sentence-Based Copy Detection Approach." Diss., Brigham Young University, 2005. http://contentdm.lib.byu.edu/ETD/image/etd977.pdf.

9. Singer, Ron. "Comparing machine learning and hand-crafted approaches for information extraction from HTML documents." Thesis, McGill University, 2003. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=79127.

Abstract: The problem of automatically extracting information from web pages is becoming very important, due to the explosion of information available on the World Wide Web. In this thesis, we explore and compare hand-crafted information extraction tools with tools constructed using machine learning algorithms. The task we consider is the extraction of organization names and contact information, such as addresses and phone numbers, from web pages. Given the huge number of company web pages on the Internet, automating this task is of great practical interest. The system we developed consists of two components. The first component labels, or tags, named entities (such as company names, addresses, and phone numbers) in HTML documents; we compare the performance of hand-coded regular expressions and decision trees for this task, and find that decision trees generate tagging rules that are significantly more accurate. The second component establishes relationships between the named entities in order to structure the data into a useful record (a contact or an organization). For this task we experimented with two approaches: an aggregator implementing human-generated heuristics to relate the tags and create the records sought, and an approach based on Hidden Markov Models (HMMs). As far as we know, HMMs have not previously been used to establish relationships between more than two tagged entities. Our empirical results suggest that HMMs compare favorably with the hand-crafted aggregator in terms of performance and ease of development.

More sources
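
Thesis entry 7 (Mohammadzadeh) lends itself to a compact illustration: right-to-left scripts occupy known Unicode blocks, so the ratio of right-to-left characters to Latin characters in each chunk of an HTML file flags the main content of a right-to-left page. The sketch below is a simplified reconstruction under assumed character ranges and an assumed density threshold, not the evaluated R2L/DANAg code.

    import re

    # A few right-to-left Unicode blocks (Hebrew, Arabic, Arabic Supplement),
    # assumed sufficient for this sketch.
    RTL = re.compile(r"[\u0590-\u05FF\u0600-\u06FF\u0750-\u077F]")
    LATIN = re.compile(r"[A-Za-z]")

    def rtl_density(text: str) -> float:
        """Fraction of right-to-left characters among RTL + Latin letters."""
        rtl = len(RTL.findall(text))
        latin = len(LATIN.findall(text))
        return rtl / (rtl + latin) if rtl + latin else 0.0

    def extract_main_content(html: str, threshold: float = 0.6) -> str:
        """Keep the text of chunks whose RTL density exceeds the threshold."""
        kept = []
        for line in html.splitlines():
            text = re.sub(r"<[^>]+>", " ", line).strip()  # drop markup, keep text
            if text and rtl_density(text) > threshold:
                kept.append(text)
        return "\n".join(kept)

    page = ('<div class="nav">Home | About | Contact</div>\n'
            '<p>שלום עולם, זהו התוכן המרכזי.</p>')
    print(extract_main_content(page))  # keeps only the Hebrew content line
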

Books on the topic "HTML documents"

1. DeRose, Steven J. The SGML FAQ book: Understanding the foundation of HTML and XML. Boston: Kluwer Academic Publishers, 1997.

2. The Document Object Model: Processing structured documents. New York: McGraw-Hill/Osborne, 2002.

3. Heslop, Brent. HTML Publishing on the Internet: Create great-looking documents online: home pages, newsletters, catalogs, ads and forms. Research Triangle Park, NC: Ventana Communications, 1996.

4. Guthrie, Malcolm. Forms: Interactivity for the World Wide Web: creating HTML and PDF form documents. San Jose, Calif.: Adobe Press, 1998.

5. Heslop, Brent. HTML publishing on the Internet for Windows: Create great-looking documents online: home pages, newsletters, catalogs, ads & forums. Chapel Hill, NC: Ventana Press, 1995.

6. Heslop, Brent D. HTML publishing on the Internet for Macintosh: Create great-looking documents online: home pages, newsletters, catalogs, ads & forms. Research Triangle Park, NC: Ventana, 1995.

7. Hauser, Tobias, ed. HTML. Boston, MA: Prentice Hall, 2002.

More sources

Book chapters on the topic "HTML documents"

1. Freeman, Adam. "Creating HTML Documents." In The Definitive Guide to HTML5, 117–50. Berkeley, CA: Apress, 2011. http://dx.doi.org/10.1007/978-1-4302-3961-1_7.

2. White, Bebo. "Authoring HTML Documents." In HTML and the Art of Authoring for the World Wide Web, 151–84. Boston, MA: Springer US, 1996. http://dx.doi.org/10.1007/978-1-4613-1351-9_7.

3. White, Bebo. "Dynamically Created HTML Documents." In HTML and the Art of Authoring for the World Wide Web, 223–30. Boston, MA: Springer US, 1996. http://dx.doi.org/10.1007/978-1-4613-1351-9_12.

4. White, Bebo. "Converting Formatted Documents to HTML." In HTML and the Art of Authoring for the World Wide Web, 213–14. Boston, MA: Springer US, 1996. http://dx.doi.org/10.1007/978-1-4613-1351-9_10.

5. Wang, Yalin, and Jianying Hu. "Detecting Tables in HTML Documents." In Lecture Notes in Computer Science, 249–60. Berlin, Heidelberg: Springer Berlin Heidelberg, 2002. http://dx.doi.org/10.1007/3-540-45869-7_29.

6. Liu, Mengchi. "Capturing Semantics in HTML Documents." In Lecture Notes in Computer Science, 103–12. Berlin, Heidelberg: Springer Berlin Heidelberg, 2002. http://dx.doi.org/10.1007/3-540-46146-9_11.

7. Ciancarini, Paolo, Cecilia Mascolo, and Fabio Vitali. "Visualizing Z Notation in HTML Documents." In ZUM ’98: The Z Formal Specification Notation, 81–95. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998. http://dx.doi.org/10.1007/978-3-540-49676-2_7.

8. Faghani, Shabanali, Ali Hadian, and Behrouz Minaei-Bidgoli. "Charset Encoding Detection of HTML Documents." In Information Retrieval Technology, 215–26. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-28940-3_17.

9. Lim, Seung-Jin, and Yiu-Kai Ng. "A Heuristic Approach for Converting HTML Documents to XML Documents." In Computational Logic — CL 2000, 1182–96. Berlin, Heidelberg: Springer Berlin Heidelberg, 2000. http://dx.doi.org/10.1007/3-540-44957-4_79.

10. Schultz, David, and Craig Cook. "Adding Style to Your Documents: CSS." In Beginning HTML with CSS and XHTML, 227–50. Berkeley, CA: Apress, 2007. http://dx.doi.org/10.1007/978-1-4302-0350-6_9.


Conference papers on the topic "HTML documents"

1. Kim, Yeon-seok, and Kyong-ho Lee. "Generating Structured Documents from HTML Tables." In 2006 International Conference on Hybrid Information Technology. IEEE, 2006. http://dx.doi.org/10.1109/ichit.2006.253669.

2. Molinari, Andrea, Gabriella Pasi, and R. A. Marques Pereira. "An indexing model of HTML documents." In the 2003 ACM symposium. New York, New York, USA: ACM Press, 2003. http://dx.doi.org/10.1145/952532.952697.

3. Rapela, Joaquin. "Automatically combining ranking heuristics for HTML documents." In Proceedings of the third international workshop. New York, New York, USA: ACM Press, 2001. http://dx.doi.org/10.1145/502932.502945.

4. Gupta, Suhit, Gail Kaiser, David Neistadt, and Peter Grimm. "DOM-based content extraction of HTML documents." In the twelfth international conference. New York, New York, USA: ACM Press, 2003. http://dx.doi.org/10.1145/775152.775182. (A schematic sketch of DOM-based extraction by link density appears at the end of this list.)

5. Kirschning, Ingrid, and Joaquín O. Rueda. "Animated agents and TTS for HTML documents." In the 2005 Latin American conference. New York, New York, USA: ACM Press, 2005. http://dx.doi.org/10.1145/1111360.1111375.

6. Burget, R. "Layout Based Information Extraction from HTML Documents." In Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 2. IEEE, 2007. http://dx.doi.org/10.1109/icdar.2007.4376990.

7. "Concepts Extraction Based on HTML Documents Structure." In International Conference on Agents and Artificial Intelligence. SciTePress - Science and Technology Publications, 2012. http://dx.doi.org/10.5220/0003748305030506.

8. Hwangbo, Hoon, and Hongchul Lee. "Reusing of information constructed in HTML documents: A conversion of HTML into OWL." In 2008 International Conference on Control, Automation and Systems (ICCAS). IEEE, 2008. http://dx.doi.org/10.1109/iccas.2008.4694654.

9. Canan Pembe, F., and Tunga Gungor. "Heading-based sectional hierarchy identification for HTML documents." In 2007 22nd International Symposium on Computer and Information Sciences. IEEE, 2007. http://dx.doi.org/10.1109/iscis.2007.4456839.

10. Jern, Mikael, Jakob Rogstadius, Tobias Åström, and Anders Ynnerman. "Visual Analytics Presentation Tools Applied in HTML Documents." In 2008 12th International Conference Information Visualisation (IV). IEEE, 2008. http://dx.doi.org/10.1109/iv.2008.22.

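Several items above, notably entry 4 (Gupta et al.), concern DOM-based content extraction. A classic heuristic in that line of work is to walk the DOM and discard subtrees dominated by anchor text, which tend to be navigation rather than content. The sketch below uses BeautifulSoup and an assumed 50 per cent link-density threshold; the published algorithm applies many more filters than this.

    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    def link_density(tag) -> float:
        """Share of a tag's text that sits inside <a> elements."""
        total = len(tag.get_text())
        linked = sum(len(a.get_text()) for a in tag.find_all("a"))
        return linked / total if total else 1.0

    def extract_content(html: str, threshold: float = 0.5) -> str:
        """Drop link-dominated blocks, then return the remaining text."""
        soup = BeautifulSoup(html, "html.parser")
        for tag in soup.find_all(["div", "ul", "table"]):
            if not tag.decomposed and link_density(tag) > threshold:
                tag.decompose()  # remove navigation-like subtree
        return soup.get_text(" ", strip=True)

    page = """<html><body>
    <div><a href="/">Home</a> <a href="/news">News</a> <a href="/contact">Contact</a></div>
    <div><p>The actual article text lives here, with hardly any links.</p></div>
    </body></html>"""
    print(extract_content(page))  # navigation block removed, article text kept
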

Reports on the topic "HTML documents"

1. Gupta, Suhit, Gail Kaiser, David Neistadt, and Peter Grimm. DOM-based Content Extraction of HTML Documents. Fort Belvoir, VA: Defense Technical Information Center, January 2005. http://dx.doi.org/10.21236/ada437440.

2. Palme, J., A. Hopmann, and N. Shelness. MIME Encapsulation of Aggregate Documents, such as HTML (MHTML). RFC Editor, March 1999. http://dx.doi.org/10.17487/rfc2557. (A minimal example of this MIME encapsulation appears at the end of this list.)

3. Palme, J., and A. Hopmann. MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML). RFC Editor, March 1997. http://dx.doi.org/10.17487/rfc2110.

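The two RFCs above define MHTML: a root HTML document and its inline resources packed into a single multipart/related MIME message, with Content-Location headers tying resource parts to the URLs referenced by the HTML. Python's standard email package can produce such an aggregate document; the following is a minimal sketch of the encapsulation the RFCs describe (the URLs and the image bytes are placeholders).

    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.mime.image import MIMEImage

    html = ('<html><body><h1>Hello</h1>'
            '<img src="http://example.com/logo.png"></body></html>')

    # Per RFC 2557, the container is multipart/related with a "type"
    # parameter naming the media type of the root part.
    msg = MIMEMultipart("related", type="text/html")
    msg["Subject"] = "Example aggregate HTML document"

    root = MIMEText(html, "html")
    root["Content-Location"] = "http://example.com/page.html"  # placeholder URL
    msg.attach(root)

    image = MIMEImage(b"\x89PNG\r\n\x1a\n...", _subtype="png")  # placeholder bytes
    image["Content-Location"] = "http://example.com/logo.png"
    msg.attach(image)

    with open("page.mht", "wb") as out:
        out.write(msg.as_bytes())  # viewers resolve the img src to the image part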