Visualising the locality of participation and voice on Wikipedia

On Sunday, I will use an extended stopover in San Francisco airport to pop into the Wikimedia Headquarters and chat about uneven geographies of voice and representation on the platform. In looking through some of the previous work that Ralph Straumann, Bernie Hogan, Ahmed Medhat, and I did on the topic, I noticed a few results that we haven’t yet had a chance to blog about.

The graph above shows us something very interesting about the locality of participation and voice on Wikipedia.

It investigates the topic by looking at the proportion of within-region edits to Wikipedia articles.

Each region of the world is assigned a colour (North America – brown; Oceania – dark blue; Europe – light blue; Asia – grey; Latin America – yellow; Middle East – green; and Sub-Saharan Africa – red). The vertical axis shows the proportion of edits to articles in the region that come from that region [“received edits”], and the horizontal axis shows the proportion of a region’s committed edits that stay within that region [“committed edits”]. Note that the data are shown for anonymous edits only (+), registered edits only (×) and both edit types combined (●).
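To make the two axes concrete, here is a minimal sketch of how the two proportions could be computed from a table of edit counts. The function name, the data layout and the counts for the two made-up regions "A" and "B" are assumptions for illustration only; they are not the data or code behind the graph.

```python
from collections import defaultdict


def locality_shares(edit_counts):
    """Compute within-region shares per region.

    edit_counts maps (editor_region, article_region) -> number of edits.
    This layout is a hypothetical one chosen for the example.
    """
    committed = defaultdict(int)  # all edits made by editors located in a region
    received = defaultdict(int)   # all edits made to articles in a region
    within = defaultdict(int)     # edits where editor and article share a region

    for (editor_region, article_region), n in edit_counts.items():
        committed[editor_region] += n
        received[article_region] += n
        if editor_region == article_region:
            within[editor_region] += n

    regions = sorted(set(committed) | set(received))
    return {
        r: {
            # horizontal axis: share of a region's committed edits that stay within it
            "committed_within": within[r] / committed[r] if committed[r] else None,
            # vertical axis: share of a region's received edits that come from within it
            "received_within": within[r] / received[r] if received[r] else None,
        }
        for r in regions
    }


# Invented counts: region "A" has many editors, region "B" has few.
toy_counts = {
    ("A", "A"): 900,
    ("A", "B"): 100,
    ("B", "B"): 40,
    ("B", "A"): 20,
}

shares = locality_shares(toy_counts)
for region, values in shares.items():
    print(region, values)
```

In this toy example, editors in "B" keep two thirds of their edits within their own region (40 of 60), yet only about 29 percent of the edits that "B" receives come from within (40 of 140), because the far larger editor base in "A" contributes more edits to "B" than "B" does itself. The two axes can therefore diverge sharply for regions with few editors.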

On the vertical axis of the figure we can see a clear division between regions that are largely able to define themselves and regions that are largely defined by others. The world regions separate into two distinct groups of three (with Asia in the middle): Sub-Saharan Africa, Middle East & North Africa, and Latin America & Caribbean receive comparatively few edits from within their territories (around 25 percent). Europe, Oceania and North America, on the other hand, receive edits primarily from within (around 75 percent). Asia is edited from within and from outside to almost equal degrees. In other words, there are significant parts of the world in which a majority of content is not locally generated.

Asia, Europe and North America all have 73–80 percent of their committed edits staying within their own region. Interestingly, Sub-Saharan Africa commits only slightly fewer within-region edits, at 67 percent. Oceania, Latin America & Caribbean and especially Middle East & North Africa fall behind with 36–60 percent. In other words, not only does the Middle East & North Africa have a lot of non-locally generated content written about it, but many of the edits coming from the region are also used to write about other parts of the world.

What does this mean?

• Even when editors from Sub-Saharan Africa spend most of their edits within region, their small numbers mean that most content still comes from elsewhere.

• The global cores of North America and Europe self-represent very effectively by focusing on their own regions.

• Content appears to be very sensitive to feedback loops. A lot of content about an area in one language tends to lead to more content in other languages in the form of translations, rather than to comparable locally produced content.

The global cores focus their editing primarily within their own territories. Large amounts of geospatial content show no sign of deterring people from further contributions and editing: the more content exists, the more articles there are to amend, augment, update and build upon. It is possible that a stock of good content may be an attractive “editing ground” for Wikipedians, whereas a scarcity of content, beneath a certain unknown threshold, may – somewhat paradoxically – discourage people from filling in the blanks. A relative lack of content may further reinforce perceptions amongst editors that little content equates to a small audience that is not worth writing for.

We expand on all of these ideas in a recent report that we published:

Graham, M. and B. Hogan. 2014. Uneven Openness: Barriers to MENA Representation on Wikipedia. Oxford Internet Institute Report, Oxford, UK.

See also:

Graham, M., Hogan, B., Straumann, R. K., and Medhat, A. 2014. Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers (forthcoming).

These ideas will also appear in a new paper (that we’re currently working on) that focuses specifically on these issues of self-representation.