Hash/Hash-database – Glossary of Platform Law and Policy Terms

Cite this article as:

Courtney Radsch (17/12/2021). Hash/Hash-database. In Belli, L.; Zingales, N. & Curzi, Y. (Eds.), Glossary of Platform Law and Policy Terms (online). FGV Direito Rio. https://platformglossary.info/hash-hash-database/.

Author: Courtney Radsch

A hash is the output of a function that maps data of any size to a fixed-size value, which can serve as a compact identifier for that data. Such values are used for several purposes, including as lookup keys in hash tables. With respect to content moderation and platform governance, a hash is akin to a digital fingerprint derived from a piece of multimedia (photos, video, etc.). It provides a practically unique identifier, enables that content to be identified across the internet, and allows the search for, and removal of, the content associated with the hash to be automated.
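As a rough illustration of the fingerprint idea, the sketch below computes a hash of a file's bytes with Python's standard `hashlib` module. SHA-256 is used here only as a stand-in: it is a cryptographic hash that matches exact copies, whereas tools used for content moderation generally rely on perceptual hashes intended to survive resizing or re-encoding. The function name `fingerprint` and the sample bytes are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a hex digest that identifies this exact sequence of bytes."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical stand-ins for the raw bytes of media files.
original = b"example image bytes"
exact_copy = b"example image bytes"
altered = b"example image bytes!"

# Identical bytes always produce the same fingerprint.
same = fingerprint(original) == fingerprint(exact_copy)
# Changing even one byte produces a different fingerprint.
different = fingerprint(original) != fingerprint(altered)

print(same, different)
```

Because any change to the bytes changes a cryptographic hash, this approach would miss a cropped or recompressed copy, which is one reason perceptual hashing is preferred in practice.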

Hash databases enable the sharing of these unique identifiers, or hashes, across platforms without having to share the content itself. Hashing enables coordinated action, such as content takedown, and allows companies to share information across different services about content deemed unacceptable for a given platform. Hashing technology such as PhotoDNA (Microsoft, n.d.)¹ has been used to combat the spread of child pornography, terrorist content, and other unwanted or illegal content, such as extremist content.
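The sharing model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how PhotoDNA or any real hash-sharing service is implemented: the `HashDatabase` class, the `sha256_hex` helper, and the sample bytes are all hypothetical, and SHA-256 again stands in for a perceptual hash.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Hash raw bytes into a hex string (stand-in for a perceptual hash)."""
    return hashlib.sha256(content).hexdigest()


class HashDatabase:
    """Hypothetical shared store that holds only hashes, never the content."""

    def __init__(self):
        self._hashes = set()

    def add(self, content_hash: str) -> None:
        self._hashes.add(content_hash)

    def contains(self, content_hash: str) -> bool:
        return content_hash in self._hashes


shared_db = HashDatabase()

# Platform A flags a file and contributes only its hash to the shared database.
shared_db.add(sha256_hex(b"flagged video bytes"))

# Platform B hashes a new upload locally and checks it against the database.
upload = b"flagged video bytes"
is_match = shared_db.contains(sha256_hex(upload))

print(is_match)
```

In this sketch the flagged file itself never leaves Platform A. Platform B learns only whether an upload's hash is already in the database, and what to do with a match (review, removal, reporting) remains a policy decision outside the code.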

In 2009, Microsoft and Dartmouth College launched PhotoDNA (Gregoire, 2015)² to help combat the trafficking and sexual exploitation of children, and in 2018 extended it to video (Langston, 2018)³. The hash database is provided for free to law enforcement and civil society partners and is overseen by the National Center for Missing & Exploited Children (NCMEC) in the United States (Microsoft, n.d.)⁴.

In 2016, Facebook, together with Google and Microsoft, created a hash database of ISIS videos to coordinate the removal of terrorist content. This collaboration formed the basis for the creation of the Global Internet Forum to Counter Terrorism (GIFCT), which grew around the hash database to include dozens of companies that coordinate on content removal and was spun off into a stand-alone organization in mid-2020. Critics have raised concerns about the opaque nature of this collaboration and the failure of the companies involved to maintain a database or other form of access to affected content that researchers and independent auditors could review and study. Although the founding companies said the hash database would only include Al Qaeda and ISIS-related propaganda, in the wake of the 2019 Christchurch massacre of Muslims in New Zealand there was pressure to expand the remit of the database to include other forms of extremism. As of 2018, there were more than 200,000 pieces of content in the database, according to the GIFCT transparency report (GIFCT, 2020)⁵.

Critics of the GIFCT and the approach to coordinated content takedown via hash databases express concern about the potential for the technology and approach to be co-opted to eradicate other types of content, such as hate speech or misinformation. It is also not entirely clear under data protection law how content associated with such hash databases ought to be saved, categorized, and made available for independent, third-party oversight and research.

References

Microsoft. (2020). PhotoDNA. https://www.microsoft.com/en-us/photodna.
Gregoire, C. (2016). First Microsoft PhotoDNA update adds Linux and OS X support, detections up to 20 times faster. Microsoft on the Issues.
Langston, J. (2018). How PhotoDNA for Video is being used to fight online child exploitation. On the Issues. https://news.microsoft.com/on-the-issues/2018/09/12/how-photodna-for-video-is-being-used-to-fight-online-child-exploitation.
Microsoft. (2020). PhotoDNA. https://www.microsoft.com/en-us/photodna.
GIFCT. (2020a). GIFCT Transparency Report. https://gifct.org/wp-content/uploads/2020/10/GIFCT-Transparency-Report-July-2020-Final.pdf.

Courtney Radsch is an American journalist. She holds a Ph.D. in international relations and is the author of Cyberactivism and Citizen Journalism in Egypt: Digital Dissidence and Political Change. She worked as the advocacy director for the Committee to Protect Journalists until 2021.
