Each day, a team of analysts faces a seemingly endless mountain of horrors. The team of 21, who work at the Internet Watch Foundation’s office in Cambridgeshire, spend hours trawling through images and videos containing child sexual abuse. Each time they find a photo or piece of footage, it needs to be assessed and labelled. Last year alone the team identified 153,383 webpages with links to child sexual abuse imagery. This creates a vast database which can then be shared internationally in an attempt to stem the flow of abuse. The problem? Different countries have different ways of categorising images and videos.
Until now, analysts at the UK-based child protection charity have checked to see whether the material they find falls into one of three categories: A, B or C. These groupings are based on the UK’s laws and sentencing guidelines for child sexual abuse and broadly set out types of abuse. Images in category A, the most severe classification, include the worst crimes against children. These classifications are then used to work out how long someone convicted of a crime should be sentenced for. But other countries use different classifications.
Now the IWF believes a data breakthrough could remove some of these differences. The group has rebuilt its hashing software, dubbed Intelligrade, to automatically match up images and videos to the rules and laws of Australia, Canada, New Zealand, the US and the UK, also known as the Five Eyes countries. The change should mean less duplication of analytical work and make it easier for tech companies to prioritise the most serious images and videos of abuse.
“We believe that we are better able to share data so that it can be used in meaningful ways by more people, rather than all of us just working in our own little silos,” says Chris Hughes, the director of the IWF’s reporting hotline. “When we share data [currently] it is very difficult to get any meaningful comparisons against the data because they just simply don’t mesh correctly.”
Countries place different weightings on images based on what happens in them and the age of the children involved. Some countries classify images based on whether children are prepubescent or pubescent as well as the crime that is taking place. The UK’s most serious category, A, includes penetrative sexual activity, bestiality and sadism. It doesn’t necessarily include acts of masturbation, Hughes says, whereas in the US this falls in a higher category. “At the moment, the US requesting IWF category A images would be missing out on that level of content,” Hughes says.
All the photos and videos the IWF looks at are given a hash, essentially a code, that’s shared with tech companies and law enforcement agencies around the world. These hashes are used to detect known abuse content and block it from being uploaded to the web again. The hashing system has had a substantial impact on the spread of child sexual abuse material online, but the IWF’s latest tool adds significant new information to each hash.
The IWF’s secret weapon is metadata. This is data that’s about data – it can be the what, who, how and when of what is contained in the images. Metadata is a powerful tool for investigators as it allows them to spot patterns in people’s actions and analyse them for trends. Among the biggest proponents of metadata are spies, who say it can be more revealing than the content of people’s messages.
The IWF has ramped up the amount of metadata it creates for each image and video it adds to its hash list, Hughes says. Each new image or video it looks at is being assessed in more detail than ever before. As well as working out if sexual abuse content falls under the UK’s three groups, its analysts are now adding up to 20 different pieces of information to their reports. These fields match what is needed to determine the classifications of an image in the other Five Eyes countries – the charity’s policy staff compared each of the laws and worked out what metadata is needed. “We decided to provide a high level of granularity about describing the age, a high level of granularity in terms of depicting what’s taking place in the image and also confirming gender,” Hughes says.