Words don’t come easy: identifying perinatal self-harm in healthcare records

Dr Karyn Ayre is an NIHR (National Institute for Health Research) Doctoral Research Fellow in the Section of Women's Mental Health. In this blog she discusses the value of a novel way to research self-harm in an extremely vulnerable group – pregnant women and new mothers with severe mental illness. This new approach extracts data from the text in healthcare records with the aim of understanding self-harm better and potentially informing the support for this group.

Self-harm is a distressing and complex topic, and there are many different ways to define it. It can mean different things to different people, which can make it difficult to identify when someone needs support. This is a particularly serious issue for women during what’s known as ‘the perinatal period’ – a window of time encompassing pregnancy and the first year after their baby’s birth.

During this perinatal period, it’s estimated that between 5% and 14% of women experience thoughts of self-harm, and evidence suggests perinatal self-harm is more common in women with serious mental illness. As such, it’s crucial to understand how common it is in this group and which risk factors could help inform the support they need.

Research Challenges

Currently, there still exists an evidence gap around acts of perinatal self-harm. While there has been plenty of research into rates of self-harm, especially in young people, comparatively little has so far focussed on those women who are pregnant, or those who have had a baby in the last year.

Some of this is related to the more general challenges of accuracy around the measurement of rates of self-harm. Surveys and interview studies are costly and time-consuming, while looking at A&E attendances often provides an under-estimate as most people who self-harm don’t attend hospital immediately.

Natural Language Processing as a solution

Our project explored a third option to measure rates: looking for mentions of self-harm in people’s healthcare records, using a technique known as Natural Language Processing (NLP).

While the number and length of these records makes it far too labour-intensive for any person to study them one by one, NLP uses computer coding to allow a researcher to identify mentions of a specific concept of interest (in this case, self-harm) within large volumes of electronic text.

Our project focused on developing an NLP tool for the purpose of detecting acts of perinatal self-harm in the healthcare records of women with serious mental illness. These records were de-identified, meaning all information that could potentially allow an individual to be recognised had been removed. The tool set out to recognise mentions of acts of self-harm, as well as to classify whether the self-harm had actually occurred and whether it was current or in the past.
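To give a flavour of what "recognising and classifying mentions" can involve, here is a minimal rule-based sketch in Python. It is not the tool described in this blog: the term lists, the `Mention` class and the `find_mentions` function are hypothetical and deliberately tiny, and a real system would need far richer vocabulary and context handling.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately small term lists (illustration only).
MENTION = re.compile(r"\bself[- ]?harm(?:ed|ing|s)?\b|\boverdose[ds]?\b", re.I)
NEGATION = re.compile(r"\b(?:no|not|denies|denied|without|never)\b", re.I)
PAST = re.compile(
    r"\b(?:history of|previously|in the past|years? ago|as a (?:child|teenager))\b",
    re.I,
)


@dataclass
class Mention:
    text: str          # the matched words, as written in the note
    negated: bool      # True if a negation cue precedes the mention
    temporality: str   # "past" or "current"


def find_mentions(note: str) -> list[Mention]:
    """Find mentions sentence by sentence and attach simple status labels."""
    mentions = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for match in MENTION.finditer(sentence):
            # Negation cues are only looked for before the mention.
            before = sentence[: match.start()]
            negated = bool(NEGATION.search(before))
            # Past cues may appear anywhere in the same sentence.
            temporality = "past" if PAST.search(sentence) else "current"
            mentions.append(Mention(match.group(0), negated, temporality))
    return mentions


if __name__ == "__main__":
    example = (
        "She denies self-harm. Took an overdose last week. "
        "History of self-harm as a teenager."
    )
    for mention in find_mentions(example):
        print(mention)
```

Even this toy version shows why the task is hard: each rule depends on a fixed list of words, while clinicians are free to phrase the same thing in many different ways.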

We created an NLP app capable of accurately identifying mentions of self-harm recorded in healthcare records of people accessing mental healthcare in South London during their perinatal period. In South London and Maudsley NHS Foundation Trust (SLaM), all people who access mental healthcare have electronic healthcare records and the NIHR Maudsley BRC Clinical Record Interactive Search system (or ‘CRIS’, for short) is an ethically-approved database of these de-identified records that can be used for research purposes.

The purpose of developing the NLP app was to enable an effective search for mentions of self-harm in the records created during the time someone was pregnant or within the year after they delivered, without revealing their identity. Similar work has been done in the US, but our methodology is different and, to our knowledge, we are the first to do so using UK data.
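Restricting a search to the perinatal period comes down to a date check on each record. The sketch below assumes, purely for illustration, a fixed 280-day pregnancy and a 365-day postnatal year counted from a known delivery date; the function name and both constants are assumptions, not the definitions used in our study.

```python
from datetime import date, timedelta

# Assumed window lengths (illustration only).
GESTATION = timedelta(days=280)   # nominal length of pregnancy
POSTNATAL = timedelta(days=365)   # first year after delivery


def in_perinatal_window(note_date: date, delivery_date: date) -> bool:
    """Return True if a note was written during pregnancy or the postnatal year."""
    window_start = delivery_date - GESTATION
    window_end = delivery_date + POSTNATAL
    return window_start <= note_date <= window_end


if __name__ == "__main__":
    delivery = date(2020, 6, 1)
    print(in_perinatal_window(date(2020, 1, 15), delivery))  # during pregnancy
    print(in_perinatal_window(date(2021, 7, 1), delivery))   # after the postnatal year
```

In practice the start of pregnancy is rarely known this precisely, which is one reason a narrow window adds uncertainty.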

Important learnings

Our research, published in PLOS ONE, describes our thought process throughout the whole study. While the app performs well in terms of identifying mentions of self-harm in electronic healthcare records, it is far from perfect, as self-harm is an incredibly complex topic and there are lots of different ways to express it in language.

One key insight from the project was the high variability in the words and formulations used by clinicians in their note taking. Freedom to use different terms provides a richer description of the person’s experience but also makes this type of research trickier. Time is also a complicating issue, as we had to decipher whether the patient in question was referring to a recent act of self-harm or if it was a historic case. Trying to identify self-harm occurring within a relatively narrow timeframe (i.e. nine months of pregnancy and a year after birth) therefore makes this even harder.

With further development we should be able to more accurately tell the difference between historic and recent admissions of self-harm but, as it stands, the system is likely over-estimating the prevalence.

The ultimate goal of this technology is to help us better identify and understand the needs of mothers at high risk of self-harming during the perinatal period. The tool could potentially also be adapted to ascertain prevalence of self-harm in other groups, such as adolescent populations and among women with eating disorders.

If you’ve been affected by any of the topics covered in this article and would like more information on the symptoms and effects of self-harm, please visit the NHS website.


Tags: research stories - CRIS

By NIHR Maudsley BRC at 5 Aug 2021, 20:27 PM

