A Study To Support First-N Queries And Incremental Updates To Answer Multi Keyword Queries

Ala Rajitha; Fasi Ahmed Parvez

A Study To Support First-N Queries And Incremental Updates To Answer Multi Keyword Queries

Ala Rajitha, Fasi Ahmed Parvez

Abstract

Most search engines and online search forms maintain auto completion which demonstrates suggested queries or even answers on the fly as a user types in a keyword query character by character. As many search systems accumulate their information in a backend relational DBMS. Some databases such as Oracle and SQL server already support prefix search and we could use this feature to do search-as-you-type. Still not all databases provide this feature. For this reason we study new methods that can be used in all databases. One approach is to expand a separate application layer on the database to construct indexes and execute algorithms for answering queries. While this approach has the benefit of achieving a high performance its main drawback is duplicating data and indexes resulting in additional hardware costs. Another approach is to use database extenders such as DB2 Extenders, Informix Data Blades, Microsoft SQL Server Common Language Runtime (CLR) integration and Oracle Cartridges which permit developers to implement new functionalities to a DBMS. This approach is not possible for databases that do not provide such an extender interface such as MySQL. Because it needs to utilize proprietary interfaces provided by database vendors a solution for one database may not be portable to others. In addition an extender-based solution particularly those implemented in C/C++ could cause severe dependability and security problems to database engines.

Keywords

Search-as-you-type, databases, SQL, fuzzy search.

References

S. Agrawal, K. Chakrabarti, S. Chaudhuri, and V. Ganti, â€œScalable Ad-Hoc Entity Extraction from Text Collections,â€ Proc. VLDB Endowment, vol. 1, no. 1, pp. 945-957, 2008.

S. Agrawal, S. Chaudhuri, and G. Das, â€œDBXplorer: A System for Keyword-Based Search over Relational Data Bases,â€ Proc. 18th Intâ€™l Conf. Data Eng. (ICDE â€™02), pp. 5-16, 2002.

A. Arasu, V. Ganti, and R. Kaushik, â€œEfficient Exact Set-Similarity Joins,â€ Proc. 32nd Intâ€™l Conf. Very Large Data Bases (VLDB â€™06), pp. 918-929, 2006.

H. Bast, A. Chitea, F.M. Suchanek, and I. Weber, â€œESTER: Efficient Search on Text, Entities, and Relations,â€ Proc. 30th Ann. Intâ€™l ACM SIGIR Conf. Research and Development in Information Retrieval (SIGIR â€™07), pp. 671-678, 2007.

H. Bast and I. Weber, â€œType Less, Find More: Fast Autocompletion Search with a Succinct Index,â€ Proc. 29th Ann. Intâ€™l ACM SIGIR Conf. Research and Development in Information Retrieval

(SIGIR â€™06), pp. 364-371, 2006.

H. Bast and I. Weber, â€œThe Complete Search Engine: Interactive, Efficient, and Towards IR & DB Integration,â€ Proc. Conf. Innovative Data Systems Research (CIDR), pp. 88-95, 2007.

R.J. Bayardo, Y. Ma, and R. Srikant, â€œScaling up all Pairs Similarity Search,â€ Proc. 16th Intâ€™l Conf. World Wide Web (WWW â€™07), pp. 131- 140, 2007.

G. Bhalotia, A. Hulgeri, C. Nakhe, S. Chakrabarti, and S. Sudarshan, â€œKeyword Searching and Browsing in Data Bases Using Banks,â€ Proc. 18th Intâ€™l Conf. Data Eng. (ICDE â€™02), pp. 431- 440, 2002.

K. Chakrabarti, S. Chaudhuri, V. Ganti, and D. Xin, â€œAn Efficient Filter for Approximate Membership Checking,â€ Proc. ACM SIGMOD Intâ€™l Conf. Management of Data (SIGMOD â€™08), pp. 805- 818, 2008.

S. Chaudhuri, K. Ganjam, V. Ganti, R. Kapoor, V. Narasayya, and T. Vassilakis, â€œData Cleaning in Microsoft SQL Server 2005,â€ Proc. ACM SIGMOD Intâ€™l Conf. Management of Data (SIGMOD â€™05), pp. 918-920, 2005.

S. Chaudhuri, K. Ganjam, V. Ganti, and R. Motwani, â€œRobust and Efficient Fuzzy Match for Online Data Cleaning,â€ Proc. ACM SIGMOD Intâ€™l Conf. Management of Data (SIGMOD â€™03), pp. 313- 324, 2003.

S. Chaudhuri, V. Ganti, and R. Kaushik, â€œA Primitive Operator for Similarity Joins in Data Cleaning,â€ Proc. 22nd Intâ€™l Conf. Data Eng. (ICDE â€™06), pp. 5-16, 2006.

S. Chaudhuri, V. Ganti, and R. Motwani, â€œRobust Identification of Fuzzy Duplicates,â€ Proc. 21st Intâ€™l Conf. Data Eng. (ICDE), pp. 865- 876, 2005.

S. Chaudhuri and R. Kaushik, â€œExtending Autocompletion to Tolerate Errors,â€ Proc. 35th ACM SIGMOD Intâ€™l Conf. Management of Data (SIGMOD â€™09), pp. 433-439, 2009.

B.B. Dalvi, M. Kshirsagar, and S. Sudarshan, â€œKeyword Search on External Memory Data Graphs,â€ Proc. VLDB Endowment, vol. 1, no. 1, pp. 1189-1204, 2008.

B. Ding, J.X. Yu, S. Wang, L. Qin, X. Zhang, and X. Lin, â€œFinding Top-K Min-Cost Connected Trees in Data Bases,â€ Proc. IEEE 23rd Intâ€™l Conf. Data Eng. (ICDE â€™07), pp. 836-845, 2007.

L. Gravano, P.G. Ipeirotis, H.V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava, â€œApproximate String Joins in a Data Base (Almost) for Free,â€ Proc. 27th Intâ€™l Conf. Very Large Data Bases (VLDB â€™01), pp. 491-500, 2001.

M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava, â€œFast Indexes and Algorithms for Set Similarity Selection Queries,â€ Proc. IEEE 24th Intâ€™l Conf. Data Eng. (ICDE â€™08), pp. 267-276, 2008.

Full Text: PDF[FULL TEXT]

Refbacks

There are currently no refbacks.

International Journal of Science Engineering and Advance Technology is licensed under a Creative Commons Attribution 3.0 Unported License.Based on a work at IJSEat , Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.

Username
Password
Remember me

A Study To Support First-N Queries And Incremental Updates To Answer Multi Keyword Queries

Abstract

Keywords

References

Refbacks

Copyright Â© 2013, All rights reserved.| ijseat.com