Efficient clustering and document retrieval by query keywords

Pendyala Santhosh Krishna, Nadella Sunil


User preferences are expressed as a set of keywords. A central server monitors the document stream and continuously reports to each user the top-k documents that are most relevant to her keywords. Our objective is to support large numbers of users and high stream rates while refreshing the top-k results almost instantly. Our solution abandons the traditional frequency-ordered indexing approach. Instead, it follows an identifier-ordering paradigm that better suits the nature of the problem. When complemented with a new, locally adaptive technique, our method offers proven optimality with respect to the number of queries considered per stream event, and an order of magnitude shorter response time than the current state of the art.
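To make the problem setting concrete, the following is a minimal sketch of a naive continuous top-k monitor. It is not the identifier-ordered method described above and makes no optimality claim. The class name `TopKMonitor`, its methods, and the scoring rule (the sum of occurrences of a query's keywords in the document) are assumptions introduced only for illustration. Document expiration, such as a sliding window, is not handled.

```python
import heapq
from collections import defaultdict


class TopKMonitor:
    """Naive continuous top-k monitor over a document stream (illustrative only)."""

    def __init__(self, k):
        self.k = k
        self.queries = {}                         # query id -> set of keywords
        self.term_to_queries = defaultdict(set)   # keyword -> ids of queries using it
        self.results = defaultdict(list)          # query id -> min-heap of (score, doc id)

    def register(self, query_id, keywords):
        """Register a user's keyword query."""
        terms = {w.lower() for w in keywords}
        self.queries[query_id] = terms
        for term in terms:
            self.term_to_queries[term].add(query_id)

    def process(self, doc_id, text):
        """Handle one stream event; return the ids of queries whose top-k changed."""
        counts = defaultdict(int)
        for token in text.lower().split():
            counts[token] += 1

        # Only queries sharing at least one term with the document are considered.
        candidates = set()
        for term in counts:
            candidates |= self.term_to_queries.get(term, set())

        changed = []
        for qid in candidates:
            # Assumed scoring rule: total occurrences of the query's keywords.
            score = sum(counts[t] for t in self.queries[qid] if t in counts)
            heap = self.results[qid]
            if len(heap) < self.k:
                heapq.heappush(heap, (score, doc_id))
                changed.append(qid)
            elif (score, doc_id) > heap[0]:
                # The new document beats the current k-th result; ties favor
                # the larger (assumed more recent) document id.
                heapq.heapreplace(heap, (score, doc_id))
                changed.append(qid)
        return sorted(changed)

    def top_k(self, query_id):
        """Return the current result of a query as doc ids, best first."""
        return [doc for _, doc in sorted(self.results[query_id], reverse=True)]


monitor = TopKMonitor(k=2)
monitor.register("alice", ["stream", "index"])
monitor.register("bob", ["music"])
monitor.process(1, "stream index stream")
monitor.process(2, "music news")
print(monitor.top_k("alice"))
```

In this sketch every query that shares a term with the arriving document is scored. The number of queries considered per stream event is the cost that the abstract says the proposed method keeps provably minimal.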




Copyright © 2013, All rights reserved.| ijseat.com

International Journal of Science Engineering and Advance Technology is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJSEat. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.