Monday, December 21, 2009

The ACMA blacklist, and can it be distributed securely?

One sore point is that the ACMA Blacklist for online restricted content is "closed" - that is, we the public have no way at the present time of viewing what is on it. The pro-filtering advocates quite validly state that opening up the ACMA Blacklist would basically be publishing URLs for naughty people to view - an "illegal content directory", if you will.

So what? If they want to find it, they'll find it - whether it is public or not.

The current downside though is that the blacklist can't be easily used by third-party filter software producers without what I understand to be an elaborate and expensive process.

So not only is it currently impossible for the public to vet the list to make sure only illegal content makes it on there, but it also means it can't be widely used everywhere unless you're a company with a lot of money to burn.

It seems like a bit of a silly situation to be in, doesn't it?

So, is it feasible to distribute the list in some encrypted way? How easy would it be to find what is on the list itself? This is a good question. The true answer is "no, it isn't feasible." Like everything technological, the true question is how much effort you're willing to go to in order to hide said list.

The ACMA blacklist is integrated into a few products which are currently available. The problem is hiding the URLs from the user. Software hackers are a clever bunch. If your computer runs the software, then the key needed to decrypt the URL list has to be on your computer too, so it is very possible to work out how to decrypt the list and use it. So simply shipping the list of URLs with the software - encrypted or not - is just never going to be secure. I believe this is how the ACMA blacklist was leaked to Wikileaks earlier in 2009.

There already exists a perfectly good way to distribute this sort of URL blacklist - e.g. Google Safe Browsing. The ACMA could take the list of URLs, convert them to sets of MD5 strings to match against, and distribute that. They could distribute this openly - so everyone and anyone who wished to filter content based on this list could do so without having to pay the ACMA some stupid amount of money. Finally, it means that web site owners could compare their own URLs against the content of the blacklist to see if any of their pages are on it. It may not be that feasible for very large, dynamic URL sites - but it certainly is more feasible than what can be done today.
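As a rough sketch of that idea: the snippet below assumes one MD5 digest per URL and a very simple normalisation step (lowercase scheme and host, fragment dropped). The function names and example URLs are made up for illustration, and this is not the format the ACMA or Google actually use.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Assumed canonical form: lowercase scheme and host, no fragment."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def url_digest(url):
    """MD5 hex digest of the normalised URL."""
    return hashlib.md5(normalise(url).encode("utf-8")).hexdigest()


def build_hashed_blacklist(urls):
    """What the list owner would publish: digests only, no cleartext URLs."""
    return {url_digest(u) for u in urls}


def is_blocked(url, hashed_blacklist):
    """What a filter (or a curious site owner) would run against the list."""
    return url_digest(url) in hashed_blacklist


# Hypothetical entries, for illustration only.
published = build_hashed_blacklist(
    ["http://bad.example/page1", "http://Bad.Example/page2#top"]
)

print(is_blocked("http://bad.example/page2", published))
print(is_blocked("http://good.example/", published))
```

The same `is_blocked` check is all a web site owner would need in order to test their own pages against the published list.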

If the ACMA did this then I'd even write up a Squid plugin to filter against said ACMA blacklist. Small companies and schools could then use it for free. That would get the ACMA blacklist more exposure - which benefits the ACMA as much as it benefits anti-censorship advocates. More use will translate to a larger cross-section of visited web sites - so people will be more likely to discover if something which shouldn't be blocked suddenly appears on the blacklist.

But is it truly secure? There's currently no way to take an MD5 string and turn it back into a URL. You could theoretically generate a set of URLs which would hash to that MD5 string but it'd take a damned long time. So, for all practical purposes, it can't be reverse engineered.

But what can be done is to log the URLs which match the filter and slowly build up a list of sites that way. Naughty people could then publish the set of URLs which match the blacklist rules. There's no technological method of avoiding that. If people discover a URL has been filtered, they may just share the link online.
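To see why hashing doesn't prevent this, here is a small sketch. It assumes the list is published as plain MD5 hex digests of exact URL strings, and the function names and URLs are hypothetical. Anyone with a proxy log or a crawl can hash the URLs they have seen and keep the ones that match.

```python
import hashlib


def md5_hex(url):
    """MD5 hex digest of the exact URL string (assumed list format)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def recover_urls(hashed_blacklist, observed_urls):
    """Map each matching digest back to a cleartext URL that was observed."""
    recovered = {}
    for url in observed_urls:
        digest = md5_hex(url)
        if digest in hashed_blacklist:
            recovered[digest] = url
    return recovered


# Hypothetical published list and hypothetical proxy log.
published = {md5_hex("http://bad.example/page1"), md5_hex("http://bad.example/page2")}
proxy_log = [
    "http://good.example/",
    "http://bad.example/page1",
    "http://news.example/story",
]

for digest, url in recover_urls(published, proxy_log).items():
    print(digest, url)
```

Nothing here breaks MD5. It only needs someone to have seen the URL, which is exactly what a widely deployed filter guarantees.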

The only real way the government has to counter sharing the cleartext URLs from the blacklist would be to make it illegal and enforce that law very strictly. This means enforcing it when naughty stuff is shared - but it also means that anyone who publishes URLs for content which should not be on the list may also get punished. That is a whole other debate.

So in summary - yes, the ACMA could publish the blacklist in a way that is more secure than the way it is distributed today. They could publish it - like Google does - to the public, so it can be integrated into arbitrary pieces of software. This may help it be more widely adopted and tested. But they will never be able to publish the list in a way that makes it impossible to identify and publish cleartext URLs.

Let me be clear here - there is no technological method for restricting what information people can share with each other, and this includes URLs identified to be on the ACMA blacklist.
