Social Knowledge Dynamics: A Case Study on Modeling Wikipedia. Presenter: Benyun Shi. The 10th HKBU-CSD Postgraduate Research Symposium. Supervisor: Prof. Jiming Liu, Department of Computer Science, Hong Kong Baptist University. September, 2009



Social Knowledge Dynamics:

A Case Study on Modeling Wikipedia

Presenter: Benyun Shi

The 10th HKBU-CSD Postgraduate Research Symposium

Supervisor:

Prof. Jiming Liu

Department of Computer Science

Hong Kong Baptist University

September, 2009


Outline

• Wikipedia and Social Knowledge Dynamics
• Previous Work on Wikipedia
– Degree distribution
– Reciprocity and feedback loops
– Motifs
• Modeling Wikipedia’s Growth
– A model about reference
– A model about degree distribution
• AOC-based Models
• Conclusion


Wikipedia

• Anyone can create, edit, and delete articles;
• Some properties:
– Each article can be treated as the collective “knowledge” of a group of users;
– Users can exchange “knowledge” through “talk” pages;
– Users with similar “knowledge” may form communities;
– The underlying structure of some articles may in turn influence users’ “knowledge”;


Social Knowledge Dynamics

“Knowledge is embodied in people gathered in communities and networks. The road to knowledge is via people, conversations, connections and relationships. Knowledge surfaces through dialog, all knowledge is socially mediated and access to knowledge is by connecting to people that know or know who to contact.”

-- Denham Grey

• Social dynamics:
– A society of individuals reacts to inner and/or outer changes;
– Global patterns can emerge even from simple individuals, e.g., phase transitions, catastrophes, etc.

• Social dynamics includes:
– Social knowledge dynamics
– Culture dynamics
– Language dynamics
– Crowd behaviors
– …


Difficulties and Motivations

• Two levels of difficulty in discovering global emergence from local dynamic models:
– Defining sensible and realistic microscopic models (intact data is needed);
– The usual problem of inferring macroscopic phenomena from the microscopic dynamic models;
• Motivations for studying Wikipedia:
– The formation of Wikipedia is a kind of social knowledge dynamics (if articles are treated as knowledge);
– Intact data is available for download: articles, categories, images and multimedia, talk pages, redirects and broken links, and so on.


Related Analysis on Wikipedia

• Treat Wikipedia as a complex network, where articles are the nodes and hyperlinks are the links.

Degree distribution

Reciprocity and feedback loops

Motifs


Degree distribution

• Degree: the number of articles that link into (in-degree) or out of (out-degree) an article.
• Meanings of degree:
– Two articles sharing a link have some kind of relation in terms of their contents;
– Articles with high degree are more likely to be common knowledge;
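As a rough illustration of the degree definition, the sketch below counts in- and out-degrees from a list of directed hyperlinks. The function name `in_out_degrees` and the toy article names are assumptions made for this example; they do not come from the cited studies.

```python
from collections import Counter

def in_out_degrees(links):
    """Count in- and out-degree per article.

    `links` is an iterable of (source, target) pairs, one per hyperlink.
    Duplicate pairs are collapsed so each directed link counts once.
    """
    unique_links = set(links)
    out_degree = Counter(src for src, _ in unique_links)
    in_degree = Counter(dst for _, dst in unique_links)
    return in_degree, out_degree

# Hypothetical toy data: three articles and three hyperlinks.
toy_links = [("Physics", "Energy"), ("Physics", "Matter"), ("Energy", "Matter")]
in_deg, out_deg = in_out_degrees(toy_links)
```

A histogram of the values in `in_deg` or `out_deg` is what a degree-distribution plot summarizes.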


Observations: Scale-free

The in-degree distribution of the Japanese Wikipedia (adapted from Fig. 3 in ref. [1]).

The out-degree distribution of the Japanese Wikipedia (adapted from Fig. 3 in ref. [1]).

Reference: [1] V. Zlatic, M. Bozicevic, H. Stefancic, and M. Domazet, “Wikipedias: Collaborative Web-based Encyclopedias as Complex Networks”, Physical Review E 74, 016615, 2006.


Scale-free and Phase Transition

“The theory of phase transitions told us loud and clear that the road from disorder to order is maintained by the powerful forces of self-organization and is paved by power laws. It told us that power laws are the patent signatures of self-organization in complex systems….”

--Barabasi AL. 2002. Linked: The new science of networks. Cambridge: Perseus Publishing.

Similar results can be observed for Wikipedias in other languages.

What is the fundamental principle behind this similar type of growth?
– Preferential attachment?


Reciprocity and Feedback Loops

• Reciprocal links: a link pointing from node i to node j is reciprocal if there also exists a link pointing from node j to node i.
– Reciprocity quantifies the mutual “exchange” between two articles.
– The density of the links is $a = L / (N(N-1))$, where L is the number of directed links and N is the number of nodes.
• Feedback loops: a loop of directed links that starts from and ends at the same node.
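The two quantities on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: self-links are ignored, the helper names `link_density` and `reciprocal_fraction` are made up for this example, and the fraction of reciprocal links shown here is a simple descriptive measure rather than the exact reciprocity statistic used in the cited work.

```python
def _clean(links):
    """Drop self-links and duplicates from an iterable of (i, j) pairs."""
    return {(i, j) for i, j in links if i != j}

def link_density(links, n_nodes):
    """Density a = L / (N (N - 1)) of a directed network with N nodes."""
    return len(_clean(links)) / (n_nodes * (n_nodes - 1))

def reciprocal_fraction(links):
    """Fraction of directed links i -> j for which j -> i also exists."""
    unique = _clean(links)
    reciprocal = sum(1 for (i, j) in unique if (j, i) in unique)
    return reciprocal / len(unique)

# Hypothetical toy network: 4 nodes, 4 links, one reciprocal pair (0 <-> 1).
toy = [(0, 1), (1, 0), (1, 2), (2, 3)]
density = link_density(toy, n_nodes=4)
fraction = reciprocal_fraction(toy)
```

Comparing the reciprocal fraction with the density indicates whether mutual links occur more often than they would if links were placed at random.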


Feedback Loops in Ecological System

The ecological study [2] observed that the number of feedback loops in the species network is correlated with the system lifetime.

(Figure panels: normal state; state before crash.)

Reference: [2] R. Mehrotra, V. Soni, and S. Jain. Diversity sustains an evolving network. Journal of the Royal Society Interface, 6(38):793–799, 2009.


Motifs

• Motifs [3] are small subgraphs of networks, which are used to systematically study similarity in the local structure of networks.

(Figure labels: triadic subgraphs; feedback loops.)

Questions:
– Do Wikipedias in different languages share the same functions?
– Is the formation of social knowledge driven by the same fundamental function?

Reference: [3] R. Milo, S. Itzkovitz, N. Kashtan, R. Levitt, S. Shen-Orr, I. Ayzenshtat, M. Sheffer, and U. Alon. Superfamilies of evolved and designed networks. Science, 303(5663):1538–1542, 2004.


Modeling Reference Growth

Frequency distribution of the expected and actual number of references added each month to each article (adapted from Fig. 3b in [4]).

At each time step t:

• A number of entries and $r_t$ references are added;
• The references are distributed among all entries following the probability

$$p(k_{i,t}) = \frac{k_{i,t}}{\sum_j k_{j,t}}$$

• The expected number of references added to entry i at time t is

$$E(k_{i,t}) = r_t \, p(k_{i,t})$$
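The expectation above can be worked through numerically. The sketch below assumes that k_{i,t} is the number of references entry i already has at time t; the function name `expected_new_references` and the toy counts are hypothetical and only serve to illustrate the formula.

```python
def expected_new_references(ref_counts, r_t):
    """Expected references added to each entry in one time step.

    ref_counts maps entry -> k_{i,t} (assumed: current reference count).
    r_t is the total number of references added at this step.
    Implements E(k_{i,t}) = r_t * k_{i,t} / sum_j k_{j,t}.
    """
    total = sum(ref_counts.values())
    if total == 0:
        raise ValueError("at least one entry must already have a reference")
    return {entry: r_t * k / total for entry, k in ref_counts.items()}

# Hypothetical toy data: three entries, 20 new references this month.
expected = expected_new_references({"A": 6, "B": 3, "C": 1}, r_t=20)
```

In this toy case the entry holding 60% of the existing references is expected to receive 60% of the new ones, which is the "rich get richer" behavior the slide compares against the actual data.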

Reference: [4] D. Spinellis and P. Louridas. The collaborative organization of knowledge. Communications of the ACM, 51(8):68–73, 2008.


Modeling Degree Distribution

• The model consists of two steps:
– A new node t attaches to the network with m outgoing links. The probability that a given link attaches to some node s is proportional to the in-degree $k_i(s)$ of node s.
– For every new link, with probability r a reciprocal link is also formed from node s to node t.

Comparison of in-degree distributions. Chosen parameters are t = 94094, m = 16.75, r = 0.18 (adapted from [5]).
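The two steps above can be sketched as a small simulation. This is not the implementation from [5]; it is a simplified illustration under several assumptions that are flagged in the comments: m is an integer, a +1 offset is added to the in-degree so that nodes without incoming links can still be chosen, and duplicate targets are merged. The name `grow_network` is hypothetical.

```python
import random

def grow_network(n_nodes, m, r, seed=0):
    """Grow a directed network by in-degree preferential attachment.

    Step 1: each new node t sends up to m links to existing nodes s,
            chosen with probability proportional to in_degree[s] + 1
            (the +1 is an assumption of this sketch, not of the slide).
    Step 2: for every new link t -> s, a reciprocal link s -> t is
            added with probability r.
    """
    rng = random.Random(seed)
    in_degree = [0] * n_nodes
    edges = set()
    for t in range(1, n_nodes):
        existing = range(t)
        weights = [in_degree[s] + 1 for s in existing]
        # Sampling is with replacement; duplicates are merged, so a node
        # may end up with fewer than m outgoing links (a simplification).
        targets = set(rng.choices(existing, weights=weights, k=min(m, t)))
        for s in targets:
            edges.add((t, s))
            in_degree[s] += 1
            if rng.random() < r:
                edges.add((s, t))
                in_degree[t] += 1
    return edges, in_degree

# Small run with parameter values chosen only for illustration.
edges, in_degree = grow_network(n_nodes=1000, m=3, r=0.18, seed=42)
```

Plotting a histogram of `in_degree` on log-log axes gives a qualitative feel for the heavy-tailed in-degree distribution that the model is meant to reproduce.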

Reference: [5] V. Zlatić and H. Štefančić. Model of Wikipedia growth based on information exchange via reciprocal arcs. Physics and Society, 2009.


Insufficiency (1)

• The above two models seem to reflect preferential attachment as the principle behind scale-free phenomena.
– However, other researchers have shown that selective removal [6] can also produce a scale-free distribution.

• Models for scale-free networks can be divided into two groups:
i) Scale-free as the result of an optimization or phase transition process;
ii) Scale-free as the result of a growth model, such as preferential attachment.

Reference: [6] M. Salathé, R. M. May, and S. Bonhoeffer, “The Evolution of Network Topology by Selective Removal”, Journal of the Royal Society Interface, 2(5): 533–536, 2005.


Insufficiency (2)

• The above two models are based on simple stochastic processes.
– We should realize that the real Wikipedia is driven by social dynamics, including user-user, user-group, and group-group interactions, rather than by simple stochastic processes.


AOC-based Models

• Autonomy-Oriented Computing (AOC) is used to solve large-scale, dynamically evolving, and/or highly distributed computational problems.
• Components of Autonomy-Oriented Computing:
– Entities;
– Interactions;
– Behavioral rules;
– Self-organization:
• Collective regulations;
• Aggregations;
• Corresponding elements in Wikipedia:
– Users;
– Interactions on a page;
– Behaviors;
– Self-organized groups;
– Feedback;

(Diagram labels: relationships; behaviors.)



Questions

• What are the fundamental behavioral rules (e.g., explicit/implicit optimization objectives) by which entities form the global patterns of Wikipedia?

• How do entities self-organize during the evolution of Wikipedia?

• Do these rules and this self-organization reflect the formation rules of social knowledge and social organization?


Three Possible Directions-1

• Wikipedia as a system
– As a collaborative system based solely on users’ spontaneous actions, what drives its birth, boom, and death?
• Existing results on ecosystems:
– Large randomly assembled ecosystems tend to be less stable as they increase in complexity, where complexity is measured by the connectance and the average interaction strength between species.
– The typical lifetime of the system increases with the diversity of its components.


Three Possible Directions-2

• Topic evolution on Wikipedia
– We can treat topic evolution on Wikipedia as a result of user-to-user interactions, or even of interactions among groups of users (like cultural dynamics).
• Existing work:
– Static data mining (time windows for dynamic data mining);
– Semantic/content analysis (what is the driving force?).


Three Possible Directions-3

• User community dynamics on Wikipedia
– Each user may be associated with multiple articles;
– Each article has multiple users acting on it;
– Communities may emerge from entities’ local interactions and may change over time;
• Existing work:
– Modularity: $Q = \sum_i (e_{ii} - a_i^2)$
– This linkage-based measurement cannot reflect multiple relationships.
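For reference, the modularity Q can be computed from an edge list and a community assignment as sketched below. The sketch assumes the standard reading of the symbols for an undirected network: e_ii is the fraction of links that fall inside community i, and a_i is the fraction of link ends attached to community i. The function name `modularity` and the toy graph are made up for this example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q = sum_i (e_ii - a_i^2) for an undirected network.

    edges: list of (u, v) pairs, each undirected link listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    e_within = defaultdict(float)  # e_ii: share of links inside community i
    a = defaultdict(float)         # a_i: share of link ends in community i
    for u, v in edges:
        cu, cv = community[u], community[v]
        a[cu] += 0.5 / m
        a[cv] += 0.5 / m
        if cu == cv:
            e_within[cu] += 1.0 / m
    return sum(e_within[c] - a[c] ** 2 for c in a)

# Hypothetical toy graph: two triangles joined by a single link.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(toy_edges, toy_groups)
```

Because each node carries exactly one label in `community`, this measure cannot represent a user who belongs to several groups at once, which is the limitation the slide points out.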


Three Levels of Consideration

• Describing the structure
– Such as food webs in ecosystems, neural networks in organisms, etc.
• How the structure influences what happens in the system
– For example, the food-web structure affects the population dynamics of species;
• How the structure changes over time
– For example, species going extinct will influence the food-web structure


Conclusion

• The relation of Wikipedia and social knowledge; (Motivations)

• The current studies on Wikipedia and their insufficiency;

• The possibility of adopting AOC-based modeling;

• Three research directions;


Thanks!