Categorical Searching and Learning on Information Networks

July 10, 2018

Information search is an essential behavior in our everyday lives. With the advent of the World Wide Web, the amount and variety of accessible information exploded, bringing with it the challenge of information overload. Crucially, efficiency in information search depends on the way knowledge is structured and on the strategy used to find the target information. To design better information spaces for humans to search and learn in, it is important to understand which factors influence our searching behavior. In her talk at DNDS, first-year PhD student Manran Zhu presented this topic as it is being developed for her thesis proposal.

Manran’s research aims to investigate the dynamics of navigation processes on knowledge networks using a data-driven approach. By analyzing large datasets of users’ search trails, Manran hopes to reveal how the knowledge structure is reflected in the minds of searchers and how search strategies emerge and change during the navigation process. In particular, her research aims to explain the relationship between the two main phases of search, namely exploration and exploitation, and the structured knowledge acquired during the process. Navigation strategies and the overall learning behavior of information seekers are known to vary widely between individuals. Another goal is therefore to uncover the main characteristic categories of this behavior and, where possible, to relate them to cognitive patterns.

Manran began the presentation by introducing the idea of navigability as it is used to represent search processes on networks. In short, navigability is defined as the ability to find the shortest path between a source node and a target node using only local information. Manran first discussed navigation on geographical networks and showed that topology matters there. She then examined navigation on social networks, where the main message was that hierarchy is key to searching them. Building on this line of research, Manran went on to illustrate how navigation is characterized on information networks (e.g. Google, Wikipedia). As an example, she considered Wikispeedia, a human-computation game in which players try to navigate from a given source article to a given target article on Wikipedia by clicking only Wikipedia links. The question therefore is: is Wikipedia a navigable network? If so, is this due to the hierarchical structure of the Wikipedia network, as in the case of social networks?
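To make the notion of navigability more concrete, here is a minimal Python sketch of greedy navigation on a toy network. It is not taken from the talk: the graph, the node coordinates in `POS`, and the rule of always stepping to the neighbor that looks closest to the target are all assumptions chosen for illustration. The searcher only sees the neighbors of the current node, and the result is compared with the true shortest path found by breadth-first search over the whole graph.

```python
from collections import deque

# Toy undirected network; nodes and links are invented for illustration.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}

# Hypothetical coordinates standing in for whatever cue a searcher uses
# to judge how close a node is to the target.
POS = {
    "A": (0, 0), "B": (1, 1), "C": (1, -1),
    "D": (2, 0), "E": (2, -2), "F": (3, -1),
}


def distance(u, v):
    """Euclidean distance between the assumed positions of two nodes."""
    (x1, y1), (x2, y2) = POS[u], POS[v]
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5


def greedy_path(source, target):
    """Navigate with local information only: at each step, move to the
    neighbor that appears closest to the target. Returns None if the
    walk gets stuck (no neighbor is closer than the current node)."""
    path = [source]
    current = source
    while current != target:
        best = min(GRAPH[current], key=lambda n: distance(n, target))
        if distance(best, target) >= distance(current, target):
            return None
        path.append(best)
        current = best
    return path


def shortest_path_length(source, target):
    """Number of hops on the true shortest path (breadth-first search),
    which requires global knowledge of the graph."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for neighbor in GRAPH[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


path = greedy_path("A", "F")
optimal = shortest_path_length("A", "F")
print(path, len(path) - 1, optimal)
```

On this toy network the greedy walk happens to be as short as the optimal path. In general, the ratio between the two lengths is one simple way to quantify how navigable a network is for a given strategy.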

In answer to the first question, Manran drew on existing literature to argue that, in the Wikispeedia game, the efficient paths found by humans are typically not much longer than the optimal paths, which supports the navigability assumption. The answer to the second question is less clear. One reason for the difficulty is that the background knowledge structure a player uses in the game is highly personal and not trivially equal to the category structure of Wikipedia. A navigation task from the page “Quantum Mechanics” to “Cat”, for example, would be easy for a physicist but not so obvious for someone unfamiliar with Schrödinger’s cat. To understand the role of background knowledge, Manran argued, we first need to know what it looks like. Several perspectives were presented, including the use of Markov chain models to describe players’ navigation from one page to another.
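As a rough illustration of the Markov chain perspective, the sketch below estimates a first-order Markov chain from click trails: the probability of moving to a page depends only on the current page and is estimated from observed transition counts. The trails are invented for this example and are not real Wikispeedia data, and the function name `fit_markov_chain` is hypothetical. The models discussed in the talk may differ, for instance in their order.

```python
from collections import Counter, defaultdict


def fit_markov_chain(trails):
    """Estimate first-order transition probabilities P(next | current)
    from a list of navigation trails (each trail is a list of pages)."""
    counts = defaultdict(Counter)
    for trail in trails:
        # Count every consecutive pair of pages in the trail.
        for current_page, next_page in zip(trail, trail[1:]):
            counts[current_page][next_page] += 1

    transitions = {}
    for page, next_counts in counts.items():
        total = sum(next_counts.values())
        transitions[page] = {
            nxt: count / total for nxt, count in next_counts.items()
        }
    return transitions


# Invented trails from "Quantum Mechanics" to "Cat" (illustration only).
trails = [
    ["Quantum Mechanics", "Physics", "Biology", "Cat"],
    ["Quantum Mechanics", "Erwin Schrödinger", "Cat"],
    ["Quantum Mechanics", "Physics", "Science", "Biology", "Cat"],
]

model = fit_markov_chain(trails)
for page, probabilities in model.items():
    print(page, probabilities)
```

Comparing transition probabilities estimated from different groups of players is one possible way to see how background knowledge shapes the routes they take.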

Based on the above discussion, Manran will be developing her thesis during the next year to answer the following questions:

1. How is individuals’ knowledge structure distorted relative to the real one? Taking the example of Wikipedia: what are the similarities and differences between the network structure of the knowledge represented in Wikipedia and that of the knowledge in individuals’ minds?

2. How does the reflected knowledge structure influence navigation and other possible activities on the network? Clearly, besides individual strategy, this reflected knowledge network has the largest impact on the search process.

3. How does individuals’ knowledge structure evolve during the navigation process? At the beginning, the reflected information structure is based on knowledge acquired prior to the search; however, it changes continuously as a consequence of the process, which in turn influences the search itself.

Blog post by Tamer Khraisha