Application of Integrated Interface Schema (IIS) Over Multiple WDBS To Enhance Data Unit Annotation

M Vamsikrishna, G G.Parameswarakumar

Abstract


A growing number of databases have become web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded into the result pages dynamically for human browsing. In this paper we present an automatic annotation approach that first aligns the data units on a result page into different groups such that the data in the same group have the same semantics. Then, for each group, we annotate it from different aspects and aggregate the different annotations to predict a final annotation label for it. An annotation wrapper for the search site is automatically built and can be used to annotate new result pages from the same web database. Our experiments indicate that the proposed approach is highly effective.
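
The align, annotate, and aggregate steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the positional alignment, the three toy annotators, their confidence values, and the noisy-OR style combination rule are all assumptions chosen only to make the idea concrete, and every function name is hypothetical.

```python
import re
from collections import defaultdict


def align_data_units(records):
    """Group data units by column position across search result records.

    Assumption: units at the same position share the same semantics.
    """
    groups = defaultdict(list)
    for record in records:
        for column, unit in enumerate(record):
            groups[column].append(unit)
    return [groups[column] for column in sorted(groups)]


# Hypothetical basic annotators: each returns (label, confidence) or None.
def price_pattern_annotator(group):
    if all(re.fullmatch(r"\$\d+(\.\d{2})?", unit) for unit in group):
        return ("Price", 0.9)
    return None


def currency_symbol_annotator(group):
    if all(unit.startswith("$") for unit in group):
        return ("Price", 0.6)
    return None


def year_pattern_annotator(group):
    if all(re.fullmatch(r"(19|20)\d{2}", unit) for unit in group):
        return ("Year", 0.8)
    return None


def aggregate_annotations(group, annotators):
    """Combine annotator votes per label and return the best (label, score).

    Assumption: votes are combined as 1 - prod(1 - confidence), so that
    agreeing annotators reinforce each other.
    """
    miss_probability = defaultdict(lambda: 1.0)
    for annotator in annotators:
        vote = annotator(group)
        if vote is not None:
            label, confidence = vote
            miss_probability[label] *= 1.0 - confidence
    if not miss_probability:
        return None  # no annotator recognised this group
    scores = {label: 1.0 - miss for label, miss in miss_probability.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


if __name__ == "__main__":
    # Toy result page with two search result records (made-up data).
    records = [
        ["Data Mining", "2005", "$45.00"],
        ["Web Search", "2009", "$30.50"],
    ]
    annotators = [
        price_pattern_annotator,
        currency_symbol_annotator,
        year_pattern_annotator,
    ]
    for group in align_data_units(records):
        print(group, aggregate_annotations(group, annotators))
```

In this sketch a group that no annotator recognises stays unlabeled, and a group matched by two annotators with the same label receives a higher combined score than either annotator would give alone.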


Keywords


Data alignment, data annotation, data unit, search result record, search pattern, semantic, text node, wrapper generation.



Copyright © 2013, All rights reserved.| ijseat.com

International Journal of Science Engineering and Advance Technology is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJSEat. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.