Personalizing search queries through the analysis of surfing behavior and context information

Jeroen Liebaert, Eline De Meyer

Abstract – We designed a program that produces better recommendations than those given by most search engines. To do this, each user has his own profile in which his search history is stored. Recommendations are generated by combining this profile with ontology-based reasoning. Afterwards the recommendations are enriched with context information. The program was tested on a data set obtained through a questionnaire.

Introduction

A well-known problem with search engines is that they often display many unwanted results. You have to scroll a long way down before you find what you are searching for. That is because the search engine does not keep track of your search behavior or your current context.

Imagine someone who enjoys cooking but has no idea what to make for dinner. He starts up his computer, opens a web browser and searches for “Shrimps”, hoping to get some nice recipes. Instead, he gets results about the anatomy of the shrimp. Luckily, another application has also analyzed his search query, and as a result all the recipes he hoped for are presented. On top of that, a list of restaurants in the neighbourhood is shown, just in case he prefers going out for dinner instead of cooking himself. The application running in the background is the subject of this abstract.

We start by proposing a search engine built on the knowledge supplied by different ontologies. It makes use of context information to personalize the outcome. In the next section we discuss the correctness of the recommendations, and finally we present concluding remarks and future work.

A personalized search engine

Analyze surf behavior

Personalized search systems focus on active context-aware retrieval. This is a technique to infer information from the knowledge accumulated about the user. In this specific case, that knowledge is an overview of his search history.

To this end, the knowledge domains are represented by ontologies, and each user has a profile in which a score is stored for every item of the knowledge domains. That score represents the interest of the user in the item. When a user surfs the Internet, a web crawler analyzes all terms and phrases on the visited web page. Afterwards, each item that is also present in the domains of interest receives a new score in accordance with its number of occurrences. The new score is propagated through the ontology network to all subclasses. The score is also subject to degradation over time, so changing trends in the surf behavior can be picked up more easily.
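As an illustration, the update step described above could be sketched as follows. This is a minimal sketch and not the implementation used in the program: the toy ontology, the degradation factor of 0.9 and the propagation factor of 0.5 are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical toy ontology: each class maps to its direct subclasses.
SUBCLASSES = {
    "Seafood": ["Shrimp", "Mussel"],
    "Shrimp": [],
    "Mussel": [],
}

DECAY = 0.9        # assumed degradation factor applied before each update
PROPAGATION = 0.5  # assumed share of a score passed on per subclass level


def _propagate(profile, item, amount):
    """Add amount to item and a reduced amount to all of its subclasses."""
    profile[item] = profile.get(item, 0.0) + amount
    for child in SUBCLASSES.get(item, []):
        _propagate(profile, child, amount * PROPAGATION)


def update_profile(profile, page_terms):
    """Return a new profile after the user has visited one page."""
    # Degrade all existing scores so that older interests fade over time.
    updated = {item: score * DECAY for item, score in profile.items()}
    counts = Counter(page_terms)
    for item in SUBCLASSES:
        occurrences = counts.get(item, 0)
        if occurrences:
            # The score increase follows the number of occurrences on the page.
            _propagate(updated, item, float(occurrences))
    return updated
```

Terms that do not occur in the ontology, such as "Pasta" in this toy setting, are ignored. A page that mentions "Seafood" twice raises the score of "Seafood" by 2 and the score of each of its subclasses by 1.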

Generate recommendations

All collected information is used to generate personalized recommendations. Each time a search query is launched, the query is analyzed to extract all the keywords for which recommendations can be given.

Afterwards, two lists of terms are generated. First, item-based filtering is used to make a list of terms related to the keywords. Second, based on the stored scores, the items the user is most interested in are selected. By merging these lists in a smart way, the most appropriate recommendations are obtained.
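The text does not specify how the two lists are merged. One possible approach, shown here only as a sketch, is to rank every candidate term by a weighted sum of its relatedness to the query and the interest score from the profile. The function name, the weight of 0.5 and the assumption that both scores are normalized to the range 0 to 1 are hypothetical.

```python
def merge_recommendations(related, interests, weight=0.5, top_n=3):
    """Rank candidate terms by a weighted sum of relatedness and interest.

    related:   term -> similarity to the query keywords (0..1),
               as produced by item-based filtering
    interests: term -> normalized profile score (0..1)
    """
    candidates = set(related) | set(interests)
    combined = {
        term: weight * related.get(term, 0.0)
        + (1 - weight) * interests.get(term, 0.0)
        for term in candidates
    }
    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(combined, key=lambda term: (-combined[term], term))
    return ranked[:top_n]
```

With such a scheme, a term that is strongly related to the query but of no interest to the user (for example the anatomy of the shrimp) is ranked below a term that scores reasonably well on both lists.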

Enrich the recommendations

Analyzing the context is a good way to personalize an application because context tells a lot about the user himself. Context is a broad concept, but in general it covers all information that can be used to characterize a situation, person, object or place. It is important to select the context that will be useful. In this application, context information about the current location of the user is used to enrich the user profile. This makes it possible to adapt the recommendations so that only events in the neighbourhood of the user are shown.
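A location filter of this kind could look as follows. The sketch assumes that the user and every event have a latitude and longitude, and that "in the neighbourhood" means within a fixed radius. The haversine formula, the radius of 10 km and the dictionary keys are choices made for the example, not details taken from the program.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby_events(events, user_lat, user_lon, radius_km=10.0):
    """Keep only the events within radius_km of the user's location."""
    return [
        event
        for event in events
        if distance_km(user_lat, user_lon, event["lat"], event["lon"]) <= radius_km
    ]
```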

An extension of context analysis is the use of a social network. Such a network makes it possible to stay in touch with friends and share interesting data. The FOAF project uses XML (in practice RDF/XML) to describe such a network. One file is enough to share information about a person, and the file can be extended with specific data such as the search history. For that reason, it is used in the application to enrich the current profile with recommendations from friends by means of user-based filtering.
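User-based filtering over the profiles of friends could be sketched as below. The sketch assumes that each profile has already been read from a friend's FOAF file into a dictionary mapping items to scores, and it uses cosine similarity to weigh each friend. Both the data layout and the choice of similarity measure are assumptions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two item -> score profiles."""
    shared = set(a) & set(b)
    dot = sum(a[key] * b[key] for key in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def friend_recommendations(user, friends, top_n=3):
    """Suggest items the user does not have yet, weighted by friend similarity."""
    totals = {}
    for friend in friends:
        similarity = cosine_similarity(user, friend)
        if similarity <= 0:
            continue  # friends without any shared interest are ignored
        for item, score in friend.items():
            if item not in user:
                totals[item] = totals.get(item, 0.0) + similarity * score
    return sorted(totals, key=lambda item: (-totals[item], item))[:top_n]
```

A friend whose profile has nothing in common with that of the user contributes no recommendations, so only like-minded friends influence the result.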

Combining ontologies

The program can in the first place be used to obtain recommendations about all kinds of events. In addition, it can be extended to other domains of knowledge. Each of these domains needs to be defined in a corresponding ontology and should be linked with the first ontology. To limit the scope, the application makes use of only one supplementary ontology, which describes dishes and their ingredients.

By combining the different ontologies, the recommendations given can become more precise. When a link exists, the reasoning will be extended to the other ontology as well. This will make it possible to plan a whole night out in just one click.
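The idea of following a link from one ontology into the other can be illustrated with plain dictionaries. The dishes, the events and the "serves" link below are invented for the example; in the program these relations would be defined in the ontologies themselves.

```python
# Hypothetical dish ontology: dish -> ingredients.
DISH_INGREDIENTS = {
    "Shrimp scampi": {"Shrimp", "Garlic"},
    "Paella": {"Shrimp", "Rice", "Mussel"},
    "Lasagne": {"Pasta", "Beef"},
}

# Hypothetical link between the two ontologies: event -> dishes served there.
EVENT_SERVES = {
    "Seafood festival": {"Paella", "Shrimp scampi"},
    "Italian evening": {"Lasagne"},
}


def dishes_with(ingredient):
    """All dishes in the dish ontology that contain the ingredient."""
    return {
        dish
        for dish, ingredients in DISH_INGREDIENTS.items()
        if ingredient in ingredients
    }


def events_for(ingredient):
    """Follow the link from the dish ontology into the event ontology."""
    dishes = dishes_with(ingredient)
    return sorted(
        event for event, served in EVENT_SERVES.items() if served & dishes
    )
```

A query about an ingredient then yields both matching dishes and the events where those dishes are served.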

Correctness of the recommendations

Personalized search engines will only be used when the output is useful, in other words when the user is interested in the recommendations. To test the program we needed real data, so we conducted a questionnaire. The respondents indicated how fond they were of certain products, dishes and events.

Only the scores concerning products were injected into the application. Afterwards the output of different search queries was compared with the scores the respondent had given to the recommended dishes and events.

In most cases, we noted that the recommended items were also rated highly by the respondent. This indicates that the generated recommendations are well targeted and useful.
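Such a comparison can be expressed as a simple precision measure: the fraction of recommended items that the respondent rated highly. The sketch below assumes a rating scale on which a score of 4 or more counts as "high". The scale, the threshold and the function name are assumptions, and the code does not reproduce any result of the questionnaire.

```python
def precision(recommended, ratings, threshold=4):
    """Fraction of the rated recommended items with a rating >= threshold.

    recommended: list of recommended dishes and events
    ratings:     item -> rating given by the respondent
    Items the respondent did not rate are left out of the calculation.
    """
    rated = [item for item in recommended if item in ratings]
    if not rated:
        return 0.0
    hits = sum(1 for item in rated if ratings[item] >= threshold)
    return hits / len(rated)
```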

Conclusion

Most search engines rarely give exactly the recommendations you are looking for, so you have to scroll a long way down before you find what you need. Moreover, the information you receive is mostly limited to one domain.

By giving each user his own profile and storing his search behavior, this is no longer a problem. It is even possible to enrich this profile with context information. Our test indicates that the recommendations are relevant and useful, even though we use only two ontologies and a very limited amount of context information.

Personalized applications will become more important due to the use of mobile devices and the extra opportunities they offer. The recommendation engine can be improved by adding more ontologies and other context information. More efficient score calculations and other types of item-based filtering could refine the reasoning so that even more precise recommendations can be given.

University or college: Universiteit Gent
Thesis year: 2011