Academic literature on the topic 'Web page data extraction'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Web page data extraction.'
Journal articles on the topic "Web page data extraction"
Ahmad Sabri, Ily Amalina, and Mustafa Man. "Improving Performance of DOM in Semi-structured Data Extraction using WEIDJ Model." Indonesian Journal of Electrical Engineering and Computer Science 9, no. 3 (March 1, 2018): 752. http://dx.doi.org/10.11591/ijeecs.v9.i3.pp752-763.
Ahamed, B. Bazeer, D. Yuvaraj, S. Shitharth, Olfat M. Mizra, Aisha Alsobhi, and Ayman Yafoz. "An Efficient Mechanism for Deep Web Data Extraction Based on Tree-Structured Web Pattern Matching." Wireless Communications and Mobile Computing 2022 (May 27, 2022): 1–10. http://dx.doi.org/10.1155/2022/6335201.
Ahmad Sabri, Ily Amalina, and Mustafa Man. "A deep web data extraction model for web mining: a review." Indonesian Journal of Electrical Engineering and Computer Science 23, no. 1 (July 1, 2021): 519. http://dx.doi.org/10.11591/ijeecs.v23.i1.pp519-528.
Liu, Hong, and Yin Xiao Ma. "Web Data Extraction Research Based on Wrapper and XPath Technology." Advanced Materials Research 271-273 (July 2011): 706–12. http://dx.doi.org/10.4028/www.scientific.net/amr.271-273.706.
Ibrahim, Nadia, Alaa Hassan, and Marwah Nihad. "Big Data Analysis of Web Data Extraction." International Journal of Engineering & Technology 7, no. 4.37 (December 13, 2018): 168. http://dx.doi.org/10.14419/ijet.v7i4.37.24095.
Kayed, Mohammed, and Chia-Hui Chang. "FiVaTech: Page-Level Web Data Extraction from Template Pages." IEEE Transactions on Knowledge and Data Engineering 22, no. 2 (February 2010): 249–63. http://dx.doi.org/10.1109/tkde.2009.82.
Deshmukh, Shilpa, et al. "Efficient Methodology for Deep Web Data Extraction." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 1S (April 11, 2021): 286–93. http://dx.doi.org/10.17762/turcomat.v12i1s.1769.
Gao, Xiaoying, Mengjie Zhang, and Peter Andreae. "Automatic Pattern Construction for Web Information Extraction." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 12, no. 04 (August 2004): 447–70. http://dx.doi.org/10.1142/s0218488504002928.
Patnaik, Sudhir Kumar, and C. Narendra Babu. "Trends in web data extraction using machine learning." Web Intelligence 19, no. 3 (December 16, 2021): 169–90. http://dx.doi.org/10.3233/web-210465.
Kumaresan, Umamageswari, and Kalpana Ramanujam. "A Framework for Automated Scraping of Structured Data Records From the Deep Web Using Semantic Labeling." International Journal of Information Retrieval Research 12, no. 1 (January 2022): 1–18. http://dx.doi.org/10.4018/ijirr.290830.
Dissertations / Theses on the topic "Web page data extraction"
Alves, Ricardo João de Freitas. "Declarative approach to data extraction of web pages." Master's thesis, Faculdade de Ciências e Tecnologia, 2009. http://hdl.handle.net/10362/5822.
In the last few years, we have witnessed a noticeable evolution of the Web, with significant technological improvements such as the emergence of XHTML, CSS, JavaScript, and Web 2.0, to name a few. This, combined with other factors such as the physical expansion of the Web and its low cost, has strongly motivated organizations and the general public to join, with a consequent growth in the number of users, which in turn increases the volume of the largest global data repository. As a consequence, there is an increasing need for regular data acquisition from the Web, which, because of its frequency, length, or complexity, is only viable through automatic extractors. However, two main difficulties are inherent to automatic extractors. First, much of the Web's information is presented in visual formats mainly intended for human reading. Second, dynamic webpages are assembled in local memory from different sources, so some pages have no source file. Therefore, this thesis proposes a new and more modern extractor, capable of supporting the Web's evolution, generic enough to be used in any situation, and capable of being extended and easily adapted to more particular uses. This project is an extension of an earlier one that could perform extractions on semi-structured text files. It has evolved into a modular extraction system capable of extracting data from webpages and semi-structured text files, and it can be expanded to support other data source types. It also contains a more complete and generic validation system and a new data delivery system capable of performing the earlier deliveries as well as new generic ones.
A graphical editor was also developed to support the extraction system's features and to allow a domain expert without computer knowledge to create extractions with only a few simple and intuitive interactions on the rendered webpage.
Cheng, Wang. "AMBER : a domain-aware template based system for data extraction." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:ff49d786-bfd8-4cd4-a69c-19e81cb95920.
Anderson, Neil David Alan. "Data extraction & semantic annotation from web query result pages." Thesis, Queen's University Belfast, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.705642.
Wu, Yongliang. "Aggregating product reviews for the Chinese market." Thesis, KTH, Kommunikationssystem, CoS, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-91484.
By December 2007, the number of Internet users in China had grown to 210 million. The annual growth rate reached 53.3 percent in 2008, with the number of Internet users increasing by an average of 200,000 people per day. China's Internet population is currently slightly lower than the 215 million Internet users in the United States.[1] Despite the rapid growth of the Chinese economy in the global Internet market, China's e-commerce does not follow the traditional pattern of commerce; instead, it has developed based on user demand. This growth has extended into every area of the Internet. In the West, expert reviews have proven to be an important element in a user's purchase decision. The higher the quality of the product reviews customers receive, the more products they buy from online shops. As the number of products and options increases, Chinese customers need impersonal, unbiased, and detailed product reviews. This thesis focuses on online reviews and how they affect Chinese customers' purchase decisions. E-commerce is a complex system. As a typical e-commerce model, we examine a Business to Consumer (B2C) online retail site and consider a number of factors, including some seemingly subtle ones, that may influence a customer's eventual decision to shop on the website. Specifically, this thesis project examines aggregated product reviews from different online sources by analyzing some existing Western companies. The thesis then shows how to aggregate product reviews for an e-business website. During this thesis project we found that existing data mining techniques made it straightforward to collect reviews. These reviews were stored in a database, and web applications can query this database to provide a user with a set of relevant product reviews.
One of the important issues, just as with search engines, is providing relevant product reviews and determining the order in which to present them. In our work we selected reviews by matching the product (although in some cases it is ambiguous whether two products are truly identical) and ordered the matching reviews by date, with the most recent reviews presented first. Some open questions remain for future work: (1) improving the matching, to avoid ambiguity about whether reviews concern the same product, and (2) determining whether the available reviews actually affect a Chinese user's decision to buy a product.
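The selection and ordering rule described in the abstract above (keep the reviews that match a product, then present the most recent first) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the `Review` record, its field names, the `reviews_for` helper, and the sample data are all hypothetical, and product matching is reduced to exact equality on an identifier, which sidesteps the ambiguity the abstract mentions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Review:
    product_id: str   # hypothetical normalized product identifier
    source: str       # hypothetical name of the site the review came from
    published: date   # publication date used for ordering
    text: str


def reviews_for(product_id: str, reviews: list[Review]) -> list[Review]:
    """Return the reviews whose product_id matches, most recent first."""
    matching = [r for r in reviews if r.product_id == product_id]
    return sorted(matching, key=lambda r: r.published, reverse=True)


# Hypothetical sample data standing in for an aggregated review database.
sample = [
    Review("phone-a", "site-1", date(2009, 1, 5), "Solid battery life."),
    Review("phone-b", "site-2", date(2009, 2, 1), "Screen scratches easily."),
    Review("phone-a", "site-2", date(2009, 3, 9), "Camera is average."),
]
latest_first = reviews_for("phone-a", sample)
```

A real aggregator would likely need fuzzy product matching rather than exact identifiers, which is the first open question the abstract raises.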
Malchik, Alexander 1975. "An aggregator tool for extraction and collection of data from web pages." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86522.
Includes bibliographical references (p. 54-56).
by Alexander Malchik.
M.Eng.
Kolečkář, David. "Systém pro integraci webových datových zdrojů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2020. http://www.nusl.cz/ntk/nusl-417239.
Mazal, Zdeněk. "Extrakce textových dat z internetových stránek." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-219347.
Weng, Daiyue. "Extracting structured data from Web query result pages." Thesis, Queen's University Belfast, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709858.
Смілянець, Федір Андрійович. "Екстракція структурованої інформації з множини веб-сторінок." Master's thesis, КПІ ім. Ігоря Сікорського, 2020. https://ela.kpi.ua/handle/123456789/39926.
Relevance of the research topic. The modern Internet is a considerable source of data for scientific and business applications. The ability to extract up-to-date data is frequently crucial for reaching the necessary goals; however, modern high-quality solutions to this problem, which use computer vision and other technologies, may be financially demanding to acquire or develop, so solutions that are simple and cheap to develop, maintain, and use are needed. The purpose of the study is to create a software tool for extracting structured data from news websites for use in news trustworthiness classification. The following tasks were outlined and carried out to achieve this goal: - Outline existing approaches and analogues in the areas of data extraction and news classification; - Design and develop extraction, preparation, and classification algorithms; - Compare the results achieved with the developed extraction algorithm against those of an existing software solution, including the machine learning accuracies obtained with both extractors. The object of the study is the process of text data extraction with subsequent machine learning analysis. The subjects of the study are the methods and tools for extracting and analyzing text data. Scientific novelty of the obtained results. A simple greedy algorithm was created that combines link discovery and data extraction. The expediency of using simple web data extraction algorithms to compose machine learning datasets was proven. It was also proven that classical machine learning algorithms can achieve results similar to those of neural networks such as LSTM. The ability of machine learning systems to function efficiently in a bilingual context was also shown. Publications.
Materials related to this study were published at the All-Ukrainian Scientific and Practical Conference of Young Scientists and Students “Information Systems and Management Technologies” (ISTU-2019) under the title “News trustworthiness classification with machine learning”.
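The abstract above mentions a simple algorithm that combines link discovery with data extraction but gives no details. One plausible reading, sketched here purely as an assumption and not as the thesis's actual algorithm, is a crawl in which each page is parsed once and that single pass yields both the paragraph text and the links to visit next. The `PageParser` class, the `crawl` function, the `fetch` callback, and the in-memory `SITE` are hypothetical names introduced for illustration; only Python standard-library calls are used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects paragraph text and outgoing links in one pass over a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        # Text is kept only while inside a <p> element.
        if self._in_p:
            self.paragraphs[-1] += data


def crawl(start_url, fetch, max_pages=10):
    """Visit pages in discovery order, extracting text and links from each once."""
    queue, seen, records = [start_url], {start_url}, {}
    while queue and len(records) < max_pages:
        url = queue.pop(0)
        html = fetch(url)  # fetch is a caller-supplied function: URL -> HTML or None
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        records[url] = " ".join(p.strip() for p in parser.paragraphs if p.strip())
        for href in parser.links:
            target = urljoin(url, href)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return records


# Hypothetical in-memory site standing in for real HTTP fetches.
SITE = {
    "http://news.example/": '<a href="/a">A</a><a href="/b">B</a>',
    "http://news.example/a": '<p>First story.</p><a href="/b">B</a>',
    "http://news.example/b": '<p>Second <b>story</b>.</p><a href="/">home</a>',
}
articles = crawl("http://news.example/", SITE.get)
```

Passing `fetch` as a parameter keeps the sketch testable offline; a real extractor would substitute an HTTP client and would probably need rules for telling article pages apart from index pages.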
Hou, Jingyu. "Discovering web page communities for web-based data management." University of Southern Queensland, Faculty of Sciences, 2002. http://eprints.usq.edu.au/archive/00001447/.
Books on the topic "Web page data extraction"
Palade, Vasile, 1964-, ed. Adaptive web sites: A knowledge extraction from web data approach. Amsterdam: IOS Press, 2008.
Developments in data extraction, management, and analysis. Hershey, PA: Information Science Reference, 2012.
McFedries, Paul, ed. The complete idiot's guide to creating a Web page. 4th ed. Indianapolis, Ind: Que, 1999.
The complete idiot's guide to creating a Web page. 5th ed. Indianapolis, IN: Alpha, 2002.
McFedries, Paul. The complete idiot's guide to creating a Web page. 4th ed. Indianapolis, Ind: Que, 2000.
Explorer's guide to the Semantic Web. Greenwich: Manning, 2004.
Shroff, Gautam. The Intelligent Web. Oxford University Press, 2013. http://dx.doi.org/10.1093/oso/9780199646715.001.0001.
Jamaludin, Zulikha, and Wan Hussain Wan Ishak. Do it Yourself: Bina Laman Sesawang Statik & Dinamik. UUM Press, 2010. http://dx.doi.org/10.32890/9789675311314.
Book chapters on the topic "Web page data extraction"
Kravchenko, Andrey, Ruslan R. Fayzrakhmanov, and Emanuel Sallinger. "Web Page Representations and Data Extraction with BERyL." In Current Trends in Web Engineering, 22–30. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-03056-8_3.
Grigalis, Tomas, Lukas Radvilavičius, Antanas Čenys, and Juozas Gordevičius. "Clustering Visually Similar Web Page Elements for Structured Web Data Extraction." In Lecture Notes in Computer Science, 435–38. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-31753-8_38.
Chang, Chia-Hui, Yen-Ling Lin, Kuan-Chen Lin, and Mohammed Kayed. "Page-Level Wrapper Verification for Unsupervised Web Data Extraction." In Lecture Notes in Computer Science, 454–67. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-41230-1_38.
Hu, Dongdong, and Xiaofeng Meng. "Automatic Data Extraction from Data-Rich Web Pages." In Database Systems for Advanced Applications, 828–39. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11408079_75.
Palekar, Vikas R. "A Visual Based Page Segmentation for Deep Web Data Extraction." In Advances in Intelligent and Soft Computing, 791–804. New Delhi: Springer India, 2012. http://dx.doi.org/10.1007/978-81-322-0491-6_72.
Carchiolo, Vincenza, Alessandro Longheu, and Michele Malgeri. "Extraction of Hidden Semantics from Web Pages." In Intelligent Data Engineering and Automated Learning — IDEAL 2002, 117–22. Berlin, Heidelberg: Springer Berlin Heidelberg, 2002. http://dx.doi.org/10.1007/3-540-45675-9_20.
Chang, Chia-Hui, Shih-Chien Kuo, Kuo-Yu Hwang, Tsung-Hsin Ho, and Chih-Lung Lin. "Automatic Information Extraction for Multiple Singular Web Pages." In Advances in Knowledge Discovery and Data Mining, 297–303. Berlin, Heidelberg: Springer Berlin Heidelberg, 2002. http://dx.doi.org/10.1007/3-540-47887-6_29.
Lassri, Safae, El Habib Benlahmar, and Abderrahim Tragha. "Web Page Classification Based on an Accurate Technique for Key Data Extraction." In Advanced Intelligent Systems for Sustainable Development (AI2SD’2020), 1124–31. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-90639-9_91.
Kolla, Bhanu Prakash, and Arun Raja Raman. "Data Engineered Content Extraction Studies for Indian Web Pages." In Advances in Intelligent Systems and Computing, 505–12. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8055-5_45.
Li, Long, Dandan Song, and Lejian Liao. "Vertical Classification of Web Pages for Structured Data Extraction." In Information Retrieval Technology, 486–95. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-35341-3_44.
Conference papers on the topic "Web page data extraction"
Kayed, Mohammed, Chia-Hui Chang, Khaled Shaalan, and Moheb Ramzy Girgis. "FiVaTech: Page-Level Web Data Extraction from Template Pages." In 2007 Seventh IEEE International Conference on Data Mining - Workshops (ICDM Workshops). IEEE, 2007. http://dx.doi.org/10.1109/icdmw.2007.95.
Győrödi, Cornelia, Robert Győrödi, Mihai Cornea, and George Pecherle. "Automated internal web page clustering for improved data extraction." In the 2nd International Conference. New York, New York, USA: ACM Press, 2012. http://dx.doi.org/10.1145/2254129.2254209.
Xingyi Li, Yanyan Kong, and Huaji Shi. "Web page repetitive structure and URL feature based Deep Web data extraction." In 2010 Second International Conference on Communication Systems, Networks and Applications (ICCSNA). IEEE, 2010. http://dx.doi.org/10.1109/iccsna.2010.5588744.
Yang, Jufeng, Guangshun Shi, Yan Zheng, and Qingren Wang. "Data Extraction from Deep Web Pages." In 2007 International Conference on Computational Intelligence and Security (CIS 2007). IEEE, 2007. http://dx.doi.org/10.1109/cis.2007.39.
Hong-ping, Chen, Fang Wei, Yang Zhou, Zhuo Lin, and Cui Zhi-Ming. "Automatic Data Records Extraction from List Page in Deep Web Sources." In 2009 Asia-Pacific Conference on Information Processing, APCIP. IEEE, 2009. http://dx.doi.org/10.1109/apcip.2009.100.
Wang, Yun, Bicheng Li, and Chen Lin. "Data extraction from Web forums based on similarity of page layout." In 2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE). IEEE, 2009. http://dx.doi.org/10.1109/nlpke.2009.5313736.
Gong, Jibing, Xiaomeng Kou, Hanyun Zhang, Jiquan Peng, Shishan Gong, and Shuli Wang. "Automatic web page data extraction through MD5 trigeminal tree and improved BIRCH." In International Conference on Electronic Information Engineering, Big Data, and Computer Technology (EIBDCT 2022), edited by Xuexia Ye and Guoqiang Zhong. SPIE, 2022. http://dx.doi.org/10.1117/12.2635678.
Guo, Jinsong, Valter Crescenzi, Tim Furche, Giovanni Grasso, and Georg Gottlob. "RED: Redundancy-Driven Data Extraction from Result Pages?" In The World Wide Web Conference. New York, New York, USA: ACM Press, 2019. http://dx.doi.org/10.1145/3308558.3313529.
Zhang, Mingzhu, Zhongguo Yang, Sikandar Ali, and Weilong Ding. "Web Page Information Extraction Service Based on Graph Convolutional Neural Network and Multimodal Data Fusion." In 2021 IEEE International Conference on Web Services (ICWS). IEEE, 2021. http://dx.doi.org/10.1109/icws53863.2021.00094.
Hui Song, Suraj Giri, and Fanyuan Ma. "Data extraction and annotation for dynamic Web pages." In IEEE International Conference on e-Technology, e-Commerce and e-Service, 2004. EEE '04. 2004. IEEE, 2004. http://dx.doi.org/10.1109/eee.2004.1287353.
Reports on the topic "Web page data extraction"
Chang, Kevin C., Truman Shuck, and Govind Kabra. Web-Scale Search-Based Data Extraction and Integration. Fort Belvoir, VA: Defense Technical Information Center, October 2011. http://dx.doi.org/10.21236/ada554205.