Although the internet was founded on ideals of freedom and openness, censorship is rampant, with more than 60 countries engaging in some form of state-sponsored censorship. A research project at the University of Cambridge aims to uncover the scale of this censorship, and to understand how it affects users and publishers of information.

Censorship over the internet can potentially achieve unprecedented scale

Sheharbano Khattak

For all the controversy it caused, Fitna is not a great film. The 17-minute short, by the Dutch far-right politician Geert Wilders, was a way for him to express his opinion that Islam is an inherently violent religion. Understandably, the rest of the world did not see things the same way. In advance of its release in 2008, the film received widespread condemnation, especially within the Muslim community.

When a trailer for Fitna was released on YouTube, authorities in Pakistan demanded that it be removed from the site. YouTube offered to block the video in Pakistan, but would not agree to remove it entirely. When YouTube relayed this decision back to the Pakistan Telecommunication Authority (PTA), the authority responded by blocking YouTube altogether.

Although Pakistan has been intermittently blocking content since 2006, a more persistent blocking policy was implemented in 2011, when pornographic content was blocked in response to a media report that ranked Pakistan as the top country for porn-related searches. Then, in 2012, YouTube was blocked for three years when a video, deemed blasphemous, appeared on the website. Only in January this year was the ban lifted, when Google, which owns YouTube, launched a Pakistan-specific version and introduced a process by which governments can request the blocking of access to offending material.

All of this raises the thorny issue of censorship. Those censoring might raise objections to material on the basis of offensiveness or incitement to violence (more than a dozen people died in Pakistan following widespread protests over the video uploaded to YouTube in 2012). But when users aren’t able to access a particular site, they often don’t know whether it’s because the site is down, or if some force is preventing them from accessing it. How can users know what is being censored and why?

“The goal of a censor is to disrupt the flow of information,” says Sheharbano Khattak, a PhD student in Cambridge’s Computer Laboratory, who studies internet censorship and its effects. “Internet censorship threatens free and open access to information. There’s no code of conduct when it comes to censorship: those doing the censoring – usually governments – aren’t in the habit of revealing what they’re blocking access to.” The goal of her research is to make the hidden visible.

She explains that we haven’t got a clear understanding of the consequences of censorship: how it affects different stakeholders, the steps those stakeholders take in response to censorship, how effective an act of censorship is, and what kind of collateral damage it causes.

Because censorship operates in an inherently adversarial environment, gathering relevant datasets is difficult. Much of the key information, such as what was censored and how, is missing. In her research, Khattak has developed methodologies that enable her to monitor censorship by characterising what normal data looks like and flagging anomalies within the data that are indicative of censorship.
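To illustrate the idea of flagging anomalies against a picture of “normal” data, here is a minimal sketch in Python. The data, baseline window and threshold are purely hypothetical and are not Khattak’s actual methodology or datasets; the point is only that a sudden departure from a site’s historical behaviour, such as a spike in failed requests, can hint at blocking.

```python
# Hypothetical sketch: flag days on which a site's request-failure rate
# deviates sharply from its historical baseline. Data and thresholds are
# illustrative only, not the project's real measurements.
from statistics import mean, stdev

def flag_anomalies(rates, baseline_days=5, threshold=3.0):
    """Return indices of days whose failure rate sits more than
    `threshold` standard deviations above the baseline period."""
    baseline = rates[:baseline_days]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return []
    return [i for i, rate in enumerate(rates)
            if (rate - mu) / sigma > threshold]

# Example: a sudden spike in failed requests to one site on day 5.
daily_failure_rates = [0.02, 0.03, 0.02, 0.04, 0.03, 0.61, 0.02]
print(flag_anomalies(daily_failure_rates))  # -> [5]
```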

She designs experiments to measure various aspects of censorship, to detect censorship in actively and passively collected data, and to measure how censorship affects various players.

The primary reasons for government-mandated censorship are political, religious or cultural. A censor might take a range of steps: stopping information from being published in the first place, disrupting the link between the user and the publisher, or directly preventing users from reaching that information. But the key point is to stop that information from being disseminated.

Internet censorship takes two main forms: user-side and publisher-side. In user-side censorship, the censor disrupts the link between the user and the publisher. The interruption can be made at various points in the process between a user typing an address into their browser and being served a site on their screen. Users may see a variety of different error messages, depending on what the censor wants them to know. 
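One way to see where in that chain an interruption happens is to probe each step in turn. The short sketch below, in Python, is a simplified illustration of that idea under assumed conditions (a plain HTTP site on port 80); it is not the project’s measurement tooling, and real censorship measurement is considerably more careful.

```python
# Hypothetical probe: classify where a request fails between typing an
# address and seeing a page -- DNS resolution, the TCP connection, or the
# HTTP response itself. A simplified sketch, not the study's actual tools.
import socket
import urllib.error
import urllib.request

def probe(hostname, timeout=5):
    """Return a coarse label for where a request to `hostname` fails, if anywhere."""
    try:
        ip = socket.gethostbyname(hostname)           # DNS tampering often surfaces here
    except socket.gaierror:
        return "dns-failure"
    try:
        with socket.create_connection((ip, 80), timeout=timeout):
            pass                                      # IP/port blocking surfaces here
    except OSError:
        return "tcp-failure"
    try:
        with urllib.request.urlopen(f"http://{hostname}/", timeout=timeout) as resp:
            return f"http-{resp.status}"
    except urllib.error.HTTPError as err:
        return f"http-{err.code}"                     # a block page may return 403 or a notice
    except urllib.error.URLError:
        return "http-failure"

print(probe("example.com"))
```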

“The thing is, even in countries like Saudi Arabia, where the government tells people that certain content is censored, how can we be sure of everything they’re stopping their citizens from being able to access?” asks Khattak. “When a government has the power to block access to large parts of the internet, how can we be sure that they’re not blocking more than they’re letting on?”

What Khattak does is characterise the demand for blocked content and try to work out where it goes. In the case of the blocking of YouTube in 2012 in Pakistan, a lot of the demand went to rival video sites like Dailymotion. But in the case of pornographic material, which is also heavily censored in Pakistan, the government censors didn’t have a comprehensive list of sites that were blacklisted, so plenty of pornographic content slipped through the censors’ nets.

Despite any government’s best efforts, there will always be individuals and publishers who can get around censors, and access or publish blocked content through the use of censorship resistance systems. A desirable property of any censorship resistance system is to ensure that users are not traceable, but usually users have to combine them with anonymity services such as Tor.

“It’s like an arms race, because the technology which is used to retrieve and disseminate information is constantly evolving,” says Khattak. “We now have social media sites which have loads of user-generated content, so it’s very difficult for a censor to retain control of this information because there’s so much of it. And because this content is hosted by sites like Google or Twitter that integrate a plethora of services, wholesale blocking of these websites is not an option most censors might be willing to consider.”

In addition to traditional censorship, Khattak also highlights a new kind of censorship – publisher-side censorship – where websites refuse to offer services to a certain class of users. Specifically, she looks at the differential treatment of Tor users by some parts of the web. The issue with services like Tor is that visitors to a website are anonymised, so the owner of the website doesn’t know where their visitors are coming from. There is increasing use of publisher-side censorship by site owners who want to block users of Tor or other anonymising systems.
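A rough way to picture how such differential treatment might be observed is to request the same page twice, once directly and once through Tor, and compare what comes back. The sketch below assumes a local Tor client listening on its default SOCKS port (127.0.0.1:9050) and the `requests` library installed with SOCKS support (`pip install requests[socks]`); it illustrates the idea only, not the method used in the study.

```python
# Sketch: fetch the same URL directly and via a local Tor SOCKS proxy,
# then compare the responses. Assumes Tor on 127.0.0.1:9050 and
# requests[socks]; an illustration, not the study's actual methodology.
import requests

TOR_PROXIES = {"http": "socks5h://127.0.0.1:9050",
               "https": "socks5h://127.0.0.1:9050"}

def compare(url, timeout=15):
    """Return the HTTP status seen directly and via Tor for the same URL."""
    direct = requests.get(url, timeout=timeout)
    via_tor = requests.get(url, proxies=TOR_PROXIES, timeout=timeout)
    return direct.status_code, via_tor.status_code

# A 200 directly but a 403 (or a CAPTCHA page) via Tor suggests the
# publisher is treating traffic from Tor exit nodes differently.
print(compare("https://example.com/"))
```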

“Censorship is not a new thing,” says Khattak. “Those in power have used censorship to suppress speech or writings deemed objectionable for as long as human discourse has existed. However, censorship over the internet can potentially achieve unprecedented scale, while possibly remaining discreet so that users are not even aware that they are being subjected to censored information.”

Professor Jon Crowcroft, who Khattak works with, agrees: “It’s often said that, online, we live in an echo chamber, where we hear only things we agree with. This is a side of the filter bubble that has its flaws, but is of our own choosing. The darker side is when someone else gets to determine what we see, despite our interests. This is why internet censorship is so concerning.”

“While the cat and mouse game between the censors and their opponents will probably always exist,” says Khattak, “I hope that studies such as mine will illuminate and bring more transparency to this opaque and complex subject, and inform policy around the legality and ethics of such practices.”

