Thanks to Innovate UK Grant, Hanzo Is Training AI to Recognise Inappropriate Behaviour on Slack

by Aidan Randle-Conde | Hanzo

Hanzo recently announced that we’ve been awarded a grant from Innovate UK’s Sustainable Innovation Fund to expand the reach of Hanzo Hold, our purpose-built solution for preserving and collecting Slack content. Innovate UK is providing over £130 million in funding across more than 1,100 projects to help businesses across the UK rebuild in the wake of the COVID-19 pandemic.

For our part, we’re looking to develop new capabilities in Hanzo Hold to proactively identify new workplace risks that have been exacerbated by the upswing in remote work. We’re extending Hanzo Hold’s functionality from preserving information for litigation or investigations to actually helping organisations spot the need for internal investigation and corrective action in the first instance, increasing its value for our clients.

As the lead data scientist here at Hanzo, I’m excited to give you an overview of what we’re planning to do with these grant funds.

HOW COLLABORATION PLATFORMS HAVE SOLVED SOME COVID-19 PROBLEMS … BUT CREATED OTHERS

Since the emergence of COVID-19 and the shift to working from home, organisations have dramatically expanded their use of collaboration platforms like Slack to keep employees connected and productive despite being physically apart. While those platforms are tremendously effective at their intended aim of streamlining communication, they also introduce novel risks, primarily in two categories:

information security risks, which arise when confidential personal information (such as email addresses or identification numbers) or business knowledge (such as patents, contracts, or applications for funding) is intentionally or inadvertently shared in a public forum or with an audience that should not have access to it; and
human resources risks, which involve managing discrimination, harassment, bullying, or other policy violations in new communication media. Such behaviours can create a hostile work environment, whether the disparate treatment is based on race, gender, or other characteristics.

Organisations have developed mechanisms to recognise and interrupt these risks in a traditional office environment. Where employees can be easily observed or overheard, they’re less likely to engage in inappropriate behaviour, and any such behaviour is more likely to be detected so that corrective training or discipline can follow. But at the moment, organisations don’t have an easy way to translate those safeguards of physical proximity into the digital work environment of collaboration platforms.

To make matters worse, working remotely during a global pandemic has already given rise to considerable stress and isolation, leading to mental health challenges and increased vulnerabilities. When you’re not physically sharing space, it’s quite a bit harder to spot changes in employees’ responses or working relationships and to determine the causes of those changes. At the same time, much of the world is navigating a fraught political environment and an increased demand for diversity, equality, and inclusion, sparking resistance and pushback from some employees.

What organisations need is a way to identify the novel risks posed by collaboration platforms without relying on sharing a physical office space.

THE INNOVATE UK GRANT

We decided to seek funding for a project that would help us accelerate the use of artificial intelligence and advanced language analytics to better identify issues and reduce these information security and HR risks. This research is critical because of the challenges posed by collaboration platforms like Slack. Every day, users create huge volumes of dynamic, complex content, sprinkled with shorthand, abbreviations, emojis, GIFs, reactions, and more. As conversations play out over time, it can be enormously difficult to glean the context of a communication from any individual message standing alone. In the document-centric world of ediscovery, collaboration platforms create uncertainties, including what exactly counts as a “document” or a “conversation thread”.

Additionally, analysing collaboration content over time could help organisations identify atypical or troubling patterns of behaviour, which they could then evaluate to determine whether intervention would be warranted to protect individual employees or the organisation as a whole. Fundamental to this analysis is developing a means not only to measure behaviour at a given moment but also to identify trends over time.
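
To make the idea of measuring trends over time a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not a description of Hanzo's method. It assumes some upstream model has already produced one aggregate score per week for a channel (for example, an average toxicity score). The function name, the four-week baseline, and the threshold are all hypothetical choices. Each week is compared with the weeks immediately before it, and weeks that sit far above that trailing baseline are surfaced for a human to look at.

```python
from statistics import mean, stdev


def flag_unusual_weeks(weekly_scores, baseline_weeks=4, threshold=2.0):
    """Return indices of weeks whose score is far above the trailing baseline.

    weekly_scores: one aggregate score per week (assumed to come from an
    upstream model; the scoring itself is out of scope for this sketch).
    """
    flagged = []
    for i in range(baseline_weeks, len(weekly_scores)):
        baseline = weekly_scores[i - baseline_weeks:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: a z-score is not meaningful here
        z = (weekly_scores[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)  # surfaced for human review, not auto-actioned
    return flagged


# Made-up scores for illustration: week 5 spikes relative to weeks 1-4.
scores = [0.10, 0.12, 0.11, 0.09, 0.10, 0.45, 0.11]
unusual = flag_unusual_weeks(scores)
```

A trailing baseline is used rather than a fixed cut-off so that each channel is judged against its own recent history. The output is only a list of weeks worth a second look.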

PLANS FOR THE FUTURE

Existing analysis solutions are limited to basic keyword matching or message-level rules, which can miss the broader patterns of problematic behaviour that we’re interested in. We’re looking for a way to analyse the language in collaboration platforms like Slack and Microsoft Teams to identify personal or confidential information that’s related to data loss and leakage and to detect entities, sentiment, toxicity, emotion, and so on. Such analysis will enable recognition of patterns of abnormal behaviour, including discrimination and harassment. We’ll dive into these two distinct areas in upcoming blog posts.
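
The gap between message-level rules and broader patterns can be shown with a small, hedged Python sketch. It assumes each message already carries a score between 0 and 1 from some upstream classifier. The field names (`id`, `sender`, `target`, `score`), the thresholds, and the sample data are all hypothetical and chosen only for illustration. A per-message rule sees nothing alarming in any single message, while grouping by sender and recipient reveals a sustained pattern.

```python
from collections import defaultdict


def message_level_flags(messages, threshold=0.8):
    """Flag individual messages whose score crosses a high threshold."""
    return [m["id"] for m in messages if m["score"] >= threshold]


def pair_level_flags(messages, min_count=3, min_mean=0.4):
    """Flag (sender, target) pairs with repeated, moderately scored messages."""
    by_pair = defaultdict(list)
    for m in messages:
        by_pair[(m["sender"], m["target"])].append(m["score"])
    return sorted(
        pair
        for pair, scores in by_pair.items()
        if len(scores) >= min_count and sum(scores) / len(scores) >= min_mean
    )


# Made-up sample: no single message is extreme, but one pair stands out.
sample = [
    {"id": 1, "sender": "alice", "target": "bob", "score": 0.50},
    {"id": 2, "sender": "carol", "target": "dave", "score": 0.10},
    {"id": 3, "sender": "alice", "target": "bob", "score": 0.55},
    {"id": 4, "sender": "carol", "target": "dave", "score": 0.20},
    {"id": 5, "sender": "alice", "target": "bob", "score": 0.60},
]
single = message_level_flags(sample)
pairs = pair_level_flags(sample)
```

Here `single` is empty while `pairs` contains the one sender and recipient pair whose messages are consistently elevated. A result like that is a prompt for a reviewer to read the conversation in context, not a conclusion in itself.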

It is important to point out that the software we’re developing isn’t intended to reach its own conclusions or automatically trigger any action. We’re designing a tool that will bring situations to light for a human to evaluate. We don’t intend for anyone to rely on these capabilities to abdicate their own responsibility to make decisions.

At the moment, we’re using public data sets—including the entirety of the English-language Wikipedia encyclopaedia—to train our models to recognise different types of language and to create “heat map” visualisations that will be immediately useful to our human reviewers. We’re particularly sensitive to the need to design solutions that will be easily extended to other themes or contexts; perhaps an organisation needs to identify references to on-the-job injuries or employee break times. I’m confident that we can meet these challenges, and I’m excited to see what other applications our customers will come up with. Watch this space for periodic follow-up posts reporting on our results and progress.
