  |
The Structure of Broad Topics on the Web - http://www2002.org/CDROM/refereed/338/
By Soumen Chakrabarti, Mukul M. Joshi, Kunal Punera and David M. Pennock, IIT Bombay and NEC Research Institute. In: Proceedings of the 11th international conference on World Wide Web, 2002. Many studies on the Web graph concentrate on the graph structure, and do not consider textual properties of the nodes. The authors propose that a topic taxonomy such as Yahoo or ODP provides a useful framework for understanding the structure of content-based clusters and communities, and they present measurements that may prove valuable in the design of community-specific crawlers and link-based ranking systems. The experiments are based on ODP data. |
  |
Accelerated Focused Crawling through Online Relevance Feedback - http://www.cse.iitb.ac.in/soumen/doc/www2002m/p336-chakrabarti.pdf
By Soumen Chakrabarti, Kunal Punera, IIT Bombay, India, and Mallela Subramanyam, University of Texas, Austin, USA. In: Proceedings of the 11th international conference on World Wide Web, 2002. The ODP taxonomy is used for the experiments. |
  |
The Influence of Caption Features on Clickthrough Patterns in Web Search - http://research.microsoft.com/~ryenw/papers/ClarkeSIGIR2007.pdf
By Charles L. A. Clarke, University of Waterloo, Eugene Agichtein, Emory University, and Susan Dumais and Ryen W. White, Microsoft. In: Proceedings of the 30th Annual International ACM SIGIR Conference, July 2007. The results of the study suggest that relatively simple caption features, such as the presence of all query terms, the readability of the snippet, and the length of the URL shown in the caption, can significantly influence users' Web search behavior. The experiments are based on the Windows Live search engine, which may use ODP titles and descriptions when generating captions. |
  |
Topic Sensitive PageRank - http://www2002.org/presentations/haveliwala-rp17.pdf
By T. Haveliwala. In: Proceedings of the Eleventh International World Wide Web Conference, May 2002. The author proposes bringing topical information into the PageRank calculation, using pages listed in the ODP. |
  |
A General Evaluation Framework for Topical Crawlers - http://dollar.biz.uiowa.edu/~pant/Papers/crawl_framework.pdf
By P. Srinivasan, F. Menczer and G. Pant. In: Information Retrieval, 2005. The ODP hierarchy is used as the source of topics. |
  |
Mapping Ontologies into Cyc - http://www.cyc.com/doc/white_papers/mapping-ontologies-into-cyc_v31.pdf
By Stephen L. Reed and Douglas B. Lenat, Cycorp Inc., Austin, USA, 2002. The authors present the process by which several ontologies have been mapped or integrated with Cyc, a large commonsense knowledge base, over 15 years. ODP was among the chosen ontologies but was removed because the constant enhancements in the directory created a high maintenance burden. |
  |
The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases - http://139.91.183.30:9090/RDF/publications/semweb2001.html
By Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis and Karsten Tolle. 2001. The ODP RDF dump was used as a testbed for a suite of tools for RDF validation, storage and querying. |
  |
Lycos Retriever: An Information Fusion Engine - http://vistology.com/papers/retriever-final.pdf
By Brian Ulicny. In: Proceedings of the Human Language Technology Conference of the NAACL. June 2006. The Lycos Retriever automatically generates coherent topical summaries of popular web query topics. The ODP hierarchy is used for topic categorization and disambiguation. |
  |
OCELOT: A System for Summarizing Web Pages - http://portal.acm.org/citation.cfm?id=345565&dl=ACM&coll=portal
By Adam L. Berger, Carnegie Mellon University, and Vibhu O. Mittal, Just Research, Pittsburgh, USA. In: Proceedings of the 23rd Annual International ACM SIGIR Conference, 2000. Probabilistic models are used to select and order words into a gist. The paper describes a technique for learning these models automatically from a collection of human-summarized web pages; the authors used ODP data for this purpose. |
  |
Enhanced Word Clustering for Hierarchical Text Classification - http://www.cs.utexas.edu/users/inderjit/public_papers/hierdist.pdf
By Inderjit S. Dhillon, Subramanyam Mallela and Rahul Kumar, University of Texas, Austin, USA. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, 2002. The authors propose a new information-theoretic divisive algorithm for word clustering applied to text classification. Experimental results are based on the 20 Newsgroups data set and a 3-level hierarchy of HTML documents collected from ODP's Science top-level category. |
  |
Using ODP Metadata to Personalize Search - http://www.l3s.de/~chirita/publications/chirita05using.pdf
By Paul Alexandru Chirita, Wolfgang Nejdl, Raluca Paiu and Christian Kohlschütter, L3S and University of Hannover, Germany. In: Proceedings of the 28th Annual International ACM SIGIR Conference, August 2005. The paper discusses how ODP metadata can be exploited to achieve high quality personalized web search. |
  |
Algorithmic Computation and Approximation of Semantic Similarity - http://informatics.indiana.edu/fil/Papers/semsim_extended.pdf
By A. Maguitman, F. Menczer, F. Erdinc, H. Roinestad and A. Vespignani, Indiana University. In: World Wide Web, Volume 9, Issue 4, 2006. An information-theoretic measure of semantic similarity between pages exploiting both hierarchical and non-hierarchical ODP structure improves on taxonomy-based approaches. |
  |
Summarizing Web Sites Automatically - http://flame.cs.dal.ca/~yongzhen/publication/paper/ai03.pdf
By Y. Zhang, N. Zincir-Heywood and E. Milios, Dalhousie University, Canada. In: Proceedings of the Sixteenth Conference of the Canadian Society for Computational Studies of Intelligence, 2003. Machine learning and natural language processing techniques are employed to automatically summarize web pages. The summaries are compared with ODP descriptions and with the results of browsing experiments. |
  |
Index Construction for Linear Categorisation - http://www.seg.rmit.edu.au/research/download.php?manuscript=142
By Vaughan R. Shanks and Hugh E. Williams, RMIT University, Melbourne, Australia. In: Proceedings of the twelfth international conference on Information and knowledge management, 2003. A problem with iterative training techniques for automatic text categorisation, such as Support Vector Machines (SVM), is that during the learning phase they require the entire training collection to be held in main memory, which is infeasible for large training collections such as DMOZ or large news wire feeds. The authors present techniques which permit automatic categorisation using very large training collections, vocabularies, and numbers of categories. ODP is mentioned as a possible set of training data. |
  |
THESUS: Organizing Web Document Collections Based on Link Semantics - http://www.db-net.aueb.gr/index.php/corporate/content/download/159/416/file/HNVV03_VLDBJ.pdf
By Maria Halkidi, Benjamin Nguyen, Iraklis Varlamis and Michalis Vazirgiannis. In: The VLDB Journal - The International Journal on Very Large Data Bases, 2003. Currently, Web documents are classified based on their content, without taking into account the fact that they are connected to each other by links. The authors claim that a page’s classification is enriched by the detection of its incoming links’ semantics. An ontology based on ODP's Arts/Music branch is used for the experiments. |
 |
OCFS: Optimal Orthogonal Centroid Feature Selection for Text Categorization - http://research.microsoft.com/research/pubs/view.aspx?pubid=1460
By Jun Yan, Ning Liu, Benyu Zhang, Shuicheng Yan, Zheng Chen, Qiansheng Cheng, Weiguo Fan and Wei-Ying Ma. In: Proceedings of the 28th Annual International ACM SIGIR Conference, August 2005. Experiments based on 20 Newsgroups (20NG), Reuters Corpus Volume 1 (RCV1) and ODP data show that OCFS is a consistently better feature selection method than Information Gain (IG) and the χ²-test (CHI).
 |
Web Content Categorization Using Link Information - http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&doc=2006-17&format=pdf&compression=&name=2006-17.pdf
By Zoltan Gyongyi and Hector Garcia-Molina, Stanford University, and Jan Pedersen, Yahoo. Technical Report, June 2006. Introduces a link-based approach to classification, which can be used in isolation or in conjunction with text-based classification. The Yahoo web index and ODP are used for the experiments. |
 |
Improving Web Search Results Using Affinity Graph - http://research.microsoft.com/apps/pubs/default.aspx?id=67818
By Benyu Zhang, Hua Li, Yi Liu, Lei Ji, Wensi Xi, Weiguo Fan, Zheng Chen and Wei-Ying Ma. In: Proceedings of the 28th Annual International ACM SIGIR Conference, August 2005. The authors propose a ranking scheme named Affinity Ranking (AR). Yahoo, ODP and newsgroup data are used for the experiments. |
 |
Web-Page Summarization Using Clickthrough Data - http://research.microsoft.com/apps/pubs/default.aspx?id=69202
By Jian-Tao Sun, Dou Shen, HuaJun Zeng, Qiang Yang, Yuchang Lu and Zheng Chen. In: Proceedings of the 28th Annual International ACM SIGIR Conference, August 2005. The authors propose two adapted summarization methods that take advantage of the relationships discovered from clickthrough data. For those pages not covered by clickthrough data, they put forward a thematic lexicon approach to generate implicit knowledge. The methods are evaluated on a relatively small dataset consisting of manually annotated pages as well as a large dataset crawled from ODP. |