Full disclosure: Diary of an internet geography project #1

OII research fellow Mark Graham and DPhil student Heather Ford (both part of the CII group) are working with a group of computer scientists including Brent Hecht, Dave Musicant and Shilad Sen to understand how close Wikipedia has come to representing ‘the sum of all human knowledge’. As part of the project, they will be making explicit the methods that they use to analyse millions of data records from Wikipedia articles about places in many languages. The hope is that, by experimenting with a reflexive method of doing a multidisciplinary ‘big data’ project, the team can offer a model that others might use for pursuing their own analyses in the future. This is the first post in a series in which Heather outlines the team’s plans and processes.

Clockwise from top left: Dave Musicant, Brent Hecht and Shilad Sen (Minnesota) and Mark Graham and Heather Ford (Oxford) in their inaugural Skype meeting

It was a beautiful day in Oxford and we wanted to show our Minnesotan friends some Harry Pottery architecture, so Mark and I sat on a bench in the Balliol gardens while we called Brent, Dave and Shilad, who are based in Minnesota, for our inaugural Skype meeting. I have worked with Dave and Shilad on a paper about Wikipedia sources in the past, and Mark and Brent know each other because they have both produced great work on Wikipedia geography, but we’ve never all worked together as a team. A recent grant from Oxford University’s John Fell Fund provided the impetus for the five of us to get together and pool our efforts in a short, multidisciplinary project that will hopefully catalyse further collaborative work in the future.

In last week’s meeting, we talked about our goals and timing and how we wanted to work as a team. Since we’re a multidisciplinary group who really value both quantitative and qualitative approaches, we thought that it might make sense to present our goals as consisting of two main strands: 1) to investigate the origins of knowledge about places on Wikipedia in many languages, and 2) to do this in a way that is both transparent and reflexive.

In her eight ‘big tent’ criteria for excellent qualitative research, Sarah Tracy (2010, PDF) includes self-reflexivity and transparency in her conception of researcher ‘sincerity’. Tracy believes that sincerity is a valuable quality that relates to researchers being earnest and vulnerable in their work and ‘considering not only their own needs but also those of their participants, readers, coauthors and potential audiences’. Despite the focus on qualitative research in Tracy’s influential paper, we think that practicing transparency and reflexivity can have enormous benefits for quantitative research as well. One of the challenges is finding ways to pursue transparency and reflexivity as a team rather than as individual researchers.

Transparency

Tracy writes that transparency is about researchers being honest about the research process.

‘Transparent research is marked by disclosure of the study’s challenges and unexpected twists and turns and revelation of the ways research foci transformed over time.’

She writes that, in practice, transparency requires a formal audit trail of all research decisions and activities. For this project, we’ve set up a series of Google docs folders for our meeting agendas, minutes, Skype calls, screenshots of our video call as well as any related spreadsheets and analyses produced during the week. After each session, I clean up the meeting minutes that we’ve co-produced on the Google doc while we’re talking, and write a more narrative account about what we did and what we learned beneath that.

Although we’re co-editing these documents as a team, it’s important to note that, as the documenter of the process, it’s my perspective that is foregrounded, and I have to be really mindful of this as I reflect on what happened. Our team meetings are occasions for discussion of the week’s activities, challenges and revelations, which I try to document as accurately as possible, but I will probably also need to conduct interviews with individual members of the team further along in the process in order to capture individual responses to the project and the process that aren’t necessarily accommodated in the weekly meetings.

Reflexivity

According to Tracy, self-reflexivity involves ‘honesty and authenticity with one’s self, one’s research and one’s audience’. Apart from the focus on interrogating our own biases as researchers, reflexivity is about being frank about our strengths and weaknesses, and, importantly, about examining our impact on the scene and asking for feedback from participants.

Soliciting feedback from participants is something quite rare in quantitative research but we believe that gaining input from Wikipedians and other stakeholders can be extremely valuable for improving the rigor of our results and for providing insight into the humans behind the data.

As an example, a few years ago when I was at a Wikimedia Kenya meetup, I asked what editors thought about Mark Graham’s Swahili Wikipedia maps. One respondent was immediately able to explain the concentration of geolocated articles from Turkey because he knew the editor who was known as a specialist in Turkey geography stubs. Suddenly the map took on a more human form: a reflection of the relationships between real people trying to represent their world. More recently, a Swahili Wikipedian contacted Mark about the same maps and engaged him in a conversation about how they could be made better. Inspired by these engagements, we want to really encourage those conversations and invite people to comment on our process as it evolves. To do this, we’ll be blogging about the progress of the project and inviting particular groups of stakeholders to provide comments and questions. We’ll then discuss those comments and questions in our weekly meetings and try to respond to as many of them as possible in thinking about how we move the analysis forward.

In conclusion, transparency and reflexivity are two really important aspects of researcher sincerity. The challenge with this project is trying to put them into practice in a quantitative rather than qualitative project, and one driven by a team rather than an individual researcher. Potential risks are that I inaccurately report on what we’re doing, or expose something about our process that is considered inappropriate. What I’m hoping is that we can mark these entries clearly as my initial, necessarily incomplete reflections on our process and that this can feed into the team’s reflections going forward. Knowing the researchers in the team and having worked with all of them in the past, I aim to reflect the ways in which they bring what Tracy values in ‘sincere’ researchers: the empathy, kindness, self-awareness and self-deprecation that I know all of these team members display in their daily work.

Heather Ford

I am a University Academic Fellow at the University of Leeds in the School of Media and Communication where I study and teach about power, representation, governance and politics online.