eRisk 2021:

Early risk prediction on the Internet


CLEF 2021 Workshop

Bucharest, 21-24 September 2021


eRisk explores the evaluation methodology, effectiveness metrics and practical applications of early risk detection on the Internet. Early detection technologies can be employed in different areas, particularly those related to health and safety. For instance, early alerts could be sent when a predator starts interacting with a child for sexual purposes, or when a potential offender starts publishing antisocial threats on a blog, forum or social network. Our main goal is to pioneer a new interdisciplinary research area that would be potentially applicable to a wide variety of situations and to many different personal profiles. Examples include potential paedophiles, stalkers, individuals who could fall into the hands of criminal organisations, people with suicidal inclinations, or people susceptible to depression.

Participate


This is the fifth year of eRisk and the lab plans to organize three tasks:

Task 1: Early Detection of Signs of Pathological Gambling

This is a new task. The challenge consists of sequentially processing pieces of evidence and detecting early traces of pathological gambling (also known as compulsive gambling or disordered gambling) as soon as possible. The task is mainly concerned with evaluating Text Mining solutions and, thus, it concentrates on texts written in Social Media. Texts should be processed in the order they were created. In this way, systems that effectively perform this task could be applied to sequentially monitor user interactions in blogs, social networks, or other types of online media.

The test collection for this task has the same format as the collection described in [Losada & Crestani 2016]. The source of data is also the same as the one used in previous eRisk editions. It is a collection of writings (posts or comments) from a set of Social Media users. There are two categories of users, pathological gamblers and non-pathological gamblers, and, for each user, the collection contains a sequence of writings (in chronological order).

In 2019, we moved from a chunk-based release of data (used in 2017 and 2018) to an item-by-item release of data. We set up a server that iteratively gives user writings to the participating teams. More information about the server is given here. In 2021, the server will be used to provide the users' writings during the test stage.

This will be a "test only" task: no training data will be provided.

  • Test stage. The test stage will consist of a period of time where the participants have to connect to our server and iteratively get user writings and send responses. More information on the eRisk server that will be used at test time is available here.
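The sequential setting described above can be illustrated with a minimal sketch. It does not talk to the eRisk server, whose interface is not described here; instead it assumes that each round arrives as a plain dictionary mapping a user to one new writing. The names `run_sequential` and `keyword_score`, the threshold, and the rule that a decision is kept once made are all assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, List


def run_sequential(rounds: Iterable[Dict[str, str]],
                   score: Callable[[List[str]], float],
                   threshold: float = 0.5) -> Dict[str, int]:
    """Process writings round by round, in creation order.

    Returns, for each flagged user, the round number at which the
    alert was raised (hypothetical policy, not the official one).
    """
    history: Dict[str, List[str]] = {}
    alerts: Dict[str, int] = {}
    for round_no, writings in enumerate(rounds, start=1):
        for user, text in writings.items():
            history.setdefault(user, []).append(text)
            # Assumption: once a user is flagged, the decision is kept.
            if user not in alerts and score(history[user]) >= threshold:
                alerts[user] = round_no
    return alerts


def keyword_score(texts: List[str]) -> float:
    """Toy scorer: fraction of writings containing a gambling keyword."""
    keywords = ("casino", "gambl", "jackpot")
    hits = sum(any(k in t.lower() for k in keywords) for t in texts)
    return hits / len(texts)


# Two rounds, one new writing per user per round (made-up data).
rounds = [
    {"user_a": "Nice weather today", "user_b": "Lost again at the casino"},
    {"user_a": "Watching a film", "user_b": "I cannot stop gambling"},
]
alerts = run_sequential(rounds, keyword_score)
print(alerts)
```

A real system would replace the toy scorer with a trained model, but the loop shape stays the same: the decision for a user at a given round may only depend on the writings seen up to that round.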

Evaluation: The evaluation will take into account not only the correctness of the system's output (i.e. whether or not the user is a pathological gambler) but also the delay taken to emit its decision. To meet this aim, we will consider the ERDE metric proposed in [Losada & Crestani 2016] and other alternative evaluation measures. A full description of the evaluation metrics can be found in the eRisk 2020 overview.
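As a rough illustration of how a delay-aware measure works, the sketch below follows the usual formulation of ERDE from [Losada & Crestani 2016]: a false positive costs c_fp, a false negative costs c_fn, a true negative costs nothing, and a true positive costs c_tp scaled by a latency factor that grows with the number k of writings seen before the decision. The function names and the cost values used in the example are placeholders, not the official evaluation settings; the eRisk 2020 overview remains the reference for the exact definition.

```python
import math
from typing import List, Tuple


def latency_cost(k: int, o: int) -> float:
    """lc_o(k) = 1 - 1 / (1 + exp(k - o)), written in an overflow-safe form."""
    x = k - o
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def erde(decision: bool, truth: bool, k: int, o: int,
         c_fp: float, c_fn: float = 1.0, c_tp: float = 1.0) -> float:
    """Error for one user whose decision was emitted after k writings."""
    if decision and not truth:
        return c_fp                        # false positive
    if not decision and truth:
        return c_fn                        # false negative
    if decision and truth:
        return latency_cost(k, o) * c_tp   # true positive, penalised by delay
    return 0.0                             # true negative


def mean_erde(results: List[Tuple[bool, bool, int]], o: int, c_fp: float) -> float:
    """Average ERDE over users; each item is (decision, truth, k)."""
    return sum(erde(d, t, k, o, c_fp) for d, t, k in results) / len(results)


# Placeholder costs: c_fp = 0.1 is an arbitrary value chosen for the example.
early = erde(True, True, k=2, o=5, c_fp=0.1)
late = erde(True, True, k=20, o=5, c_fp=0.1)
print(early, late)
```

With these settings a correct alert after 2 writings costs far less than the same alert after 20 writings, which is the behaviour the measure is designed to reward.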

The proceedings of the lab will be published in the online CEUR-WS Proceedings and on the conference website.

To gain access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed agreement form, you can proceed to register for the lab at the CLEF 2021 Labs Registration site.

Important Dates

Task 2: Early Detection of Signs of Self-Harm

This is a continuation of eRisk 2019's T2 and 2020's T1 tasks.

The challenge consists of sequentially processing pieces of evidence and detecting early traces of self-harm as soon as possible. The task is mainly concerned with evaluating Text Mining solutions and, thus, it concentrates on texts written in Social Media. Texts should be processed in the order they were created. In this way, systems that effectively perform this task could be applied to sequentially monitor user interactions in blogs, social networks, or other types of online media.

The test collection for this task has the same format as the collection described in [Losada & Crestani 2016]. The source of data is also the same as the one used in previous eRisk editions. It is a collection of writings (posts or comments) from a set of Social Media users. There are two categories of users, self-harm and non-self-harm, and, for each user, the collection contains a sequence of writings (in chronological order).

In 2019, we moved from a chunk-based release of data (used in 2017 and 2018) to an item-by-item release of data. We set up a server that iteratively gives user writings to the participating teams. More information about the server is given here. In 2021, the server will be used to provide the users' writings during the test stage.

The task is organized into two different stages:

  • Training stage. Initially, the teams that participate in this task will have access to a training stage where we will release the whole history of writings for a set of training users (we will provide all writings of all training users), and we will indicate which users have explicitly mentioned that they have self-harmed. The participants can therefore tune their systems with the training data. In 2021, the training data for Task 2 is composed of all 2019's T2 users (T2 2019 training users + T2 2019 test users) plus the test users of 2020's T1 task.
  • Test stage. The test stage will consist of a period of time where the participants have to connect to our server and iteratively get user writings and send responses. More information on the eRisk server that will be used at test time is available here.

Evaluation: The evaluation will take into account not only the correctness of the system's output (i.e. whether or not the user has committed self-harm) but also the delay taken to emit its decision. To meet this aim, we will consider the ERDE metric proposed in [Losada & Crestani 2016] and other alternative evaluation measures. A full description of the evaluation metrics can be found in the eRisk 2020 overview.

The proceedings of the lab will be published in the online CEUR-WS Proceedings and on the conference website.

To gain access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed agreement form, you can proceed to register for the lab at the CLEF 2021 Labs Registration site.

Important Dates

Task 3: Measuring the severity of the signs of depression

This is a continuation of eRisk 2019's T3 and 2020's T2 tasks. The task consists of estimating the level of depression from a thread of user submissions. For each user, the participants will be given a history of postings and will have to fill in a standard depression questionnaire (based on the evidence found in the history of postings).

The questionnaires are defined from Beck's Depression Inventory (BDI), which assesses the presence of feelings like sadness, pessimism, loss of energy, etc. The questionnaire has the following 21 questions:

	Instructions:

This questionnaire consists of 21 groups of statements. Please read each group of statements
carefully, and then pick out the one statement in each group that best describes the way you feel.
If several statements in the group seem to apply equally well, choose the highest
number for that group.

1. Sadness
0. I do not feel sad.
1. I feel sad much of the time.
2. I am sad all the time.
3. I am so sad or unhappy that I can't stand it.

2. Pessimism
0. I am not discouraged about my future.
1. I feel more discouraged about my future than I used to be.
2. I do not expect things to work out for me.
3. I feel my future is hopeless and will only get worse.

3. Past Failure
0. I do not feel like a failure.
1. I have failed more than I should have.
2. As I look back, I see a lot of failures.
3. I feel I am a total failure as a person.

4. Loss of Pleasure
0. I get as much pleasure as I ever did from the things I enjoy.
1. I don't enjoy things as much as I used to.
2. I get very little pleasure from the things I used to enjoy.
3. I can't get any pleasure from the things I used to enjoy.

5. Guilty Feelings
0. I don't feel particularly guilty.
1. I feel guilty over many things I have done or should have done.
2. I feel quite guilty most of the time.
3. I feel guilty all of the time.

6. Punishment Feelings
0. I don't feel I am being punished.
1. I feel I may be punished.
2. I expect to be punished.
3. I feel I am being punished.

7. Self-Dislike
0. I feel the same about myself as ever.
1. I have lost confidence in myself.
2. I am disappointed in myself.
3. I dislike myself.

8. Self-Criticalness
0. I don't criticize or blame myself more than usual.
1. I am more critical of myself than I used to be.
2. I criticize myself for all of my faults.
3. I blame myself for everything bad that happens.

9. Suicidal Thoughts or Wishes
0. I don't have any thoughts of killing myself.
1. I have thoughts of killing myself, but I would not carry them out.
2. I would like to kill myself.
3. I would kill myself if I had the chance.

10. Crying
0. I don't cry any more than I used to.
1. I cry more than I used to.
2. I cry over every little thing.
3. I feel like crying, but I can't.

11. Agitation
0. I am no more restless or wound up than usual.
1. I feel more restless or wound up than usual.
2. I am so restless or agitated that it's hard to stay still.
3. I am so restless or agitated that I have to keep moving or doing something.

12. Loss of Interest
0. I have not lost interest in other people or activities.
1. I am less interested in other people or things than before.
2. I have lost most of my interest in other people or things.
3. It's hard to get interested in anything.

13. Indecisiveness
0. I make decisions about as well as ever.
1. I find it more difficult to make decisions than usual.
2. I have much greater difficulty in making decisions than I used to.
3. I have trouble making any decisions.

14. Worthlessness
0. I do not feel I am worthless.
1. I don't consider myself as worthwhile and useful as I used to.
2. I feel more worthless as compared to other people.
3. I feel utterly worthless.

15. Loss of Energy
0. I have as much energy as ever.
1. I have less energy than I used to have.
2. I don't have enough energy to do very much.
3. I don't have enough energy to do anything.

16. Changes in Sleeping Pattern
0. I have not experienced any change in my sleeping pattern.
1a. I sleep somewhat more than usual.
1b. I sleep somewhat less than usual.
2a. I sleep a lot more than usual.
2b. I sleep a lot less than usual.
3a. I sleep most of the day.
3b. I wake up 1-2 hours early and can't get back to sleep.

17. Irritability
0. I am no more irritable than usual.
1. I am more irritable than usual.
2. I am much more irritable than usual.
3. I am irritable all the time.

18. Changes in Appetite
0. I have not experienced any change in my appetite.
1a. My appetite is somewhat less than usual.
1b. My appetite is somewhat greater than usual.
2a. My appetite is much less than before.
2b. My appetite is much greater than usual.
3a. I have no appetite at all.
3b. I crave food all the time.

19. Concentration Difficulty
0. I can concentrate as well as ever.
1. I can't concentrate as well as usual.
2. It's hard to keep my mind on anything for very long.
3. I find I can't concentrate on anything.

20. Tiredness or Fatigue
0. I am no more tired or fatigued than usual.
1. I get more tired or fatigued more easily than usual.
2. I am too tired or fatigued to do a lot of the things I used to do.
3. I am too tired or fatigued to do most of the things I used to do.

21. Loss of Interest in Sex
0. I have not noticed any recent change in my interest in sex.
1. I am less interested in sex than I used to be.
2. I am much less interested in sex now.
3. I have lost interest in sex completely.

								

This task therefore aims to explore the viability of automatically estimating the severity of multiple symptoms associated with depression. Given the user's history of writings, the algorithms have to estimate the user's response to each individual question. We collected questionnaires filled in by Social Media users together with their history of writings (we extracted each history of writings right after the user provided us with the completed questionnaire). The questionnaires filled in by the users (ground truth) will be used to assess the quality of the responses provided by the participating systems.

The participants will be given a dataset with multiple users (for each user, their history of writings is provided) and they will be asked to produce a file with the following structure:

username1 answer1 answer2 .... answer21
username2 ....
....
								

Each line has the username and 21 values. These values correspond to the responses to the questions above. For questions 16 and 18 the possible values are 0, 1a, 1b, 2a, 2b, 3a and 3b; for the rest of the questions they are 0, 1, 2 and 3.

The 2021 participants will be given 2020's and 2019's questionnaires and the ground truth responses and, thus, those users can be used for training purposes.
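A small helper can catch formatting mistakes before a results file is submitted. The sketch below is only an illustration: the function name `format_line` is made up, and it assumes that fields are separated by a single space, as the structure shown above suggests.

```python
from typing import List

N_QUESTIONS = 21
SUFFIXED_QUESTIONS = {16, 18}  # questions whose options carry an a/b suffix
PLAIN_VALUES = {"0", "1", "2", "3"}
SUFFIXED_VALUES = {"0", "1a", "1b", "2a", "2b", "3a", "3b"}


def format_line(username: str, answers: List[str]) -> str:
    """Validate the 21 answers of one user and build one output line."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    for question, answer in enumerate(answers, start=1):
        allowed = SUFFIXED_VALUES if question in SUFFIXED_QUESTIONS else PLAIN_VALUES
        if answer not in allowed:
            raise ValueError(f"question {question}: invalid answer {answer!r}")
    return " ".join([username, *answers])


# Example with made-up answers: all zeros except questions 16 and 18.
answers = ["0"] * N_QUESTIONS
answers[15] = "2a"  # question 16
answers[17] = "3b"  # question 18
line = format_line("username1", answers)
print(line)
```

Validating each line at write time avoids discovering an invalid value such as "1a" on a question that only accepts 0 to 3 after the submission deadline.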

Evaluation will be based on:

  • the overlap between the questionnaire filled in by the real user and the questionnaire filled in by the system (number of correct responses).

  • the absolute difference between the levels of depression obtained from both questionnaires (level of depression obtained from the real questionnaire vs level of depression obtained from the estimated questionnaire). The level of depression is simply obtained by summing the numeric values of the responses to the individual questions. This gives an integer value in the range 0-63.

  • the depression category. The depression level obtained from this questionnaire is regularly used to categorize users as: minimal depression (0-9), mild depression (10-18), moderate depression (19-29), and severe depression (30-63). This third method of evaluation will consist of assessing the systems in terms of how many users are correctly categorized (automatic questionnaire vs real questionnaire).
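The three criteria above can be sketched for a single user as follows. This is an illustrative reading of the description, not the official scoring script: the function names are made up, a response is assumed to count as correct only when it matches exactly (so 1a and 1b differ), and the numeric value of a suffixed answer such as 2b is assumed to be its leading digit.

```python
from typing import List


def numeric(answer: str) -> int:
    """Numeric value of an answer; '2b' counts as 2 (assumption)."""
    return int(answer[0])


def hits(real: List[str], estimated: List[str]) -> int:
    """Number of questions answered exactly as the real user did."""
    return sum(r == e for r, e in zip(real, estimated))


def depression_level(answers: List[str]) -> int:
    """Sum of the numeric values of the 21 answers (range 0-63)."""
    return sum(numeric(a) for a in answers)


def category(level: int) -> str:
    """Map a depression level to the categories listed above."""
    if level <= 9:
        return "minimal"
    if level <= 18:
        return "mild"
    if level <= 29:
        return "moderate"
    return "severe"


# Made-up questionnaires for one user.
real = ["1"] * 21
real[15], real[17] = "1a", "1b"            # questions 16 and 18
estimated = list(real)
estimated[0] = "3"                         # question 1 overestimated
estimated[15] = "1b"                       # question 16: wrong variant

n_hits = hits(real, estimated)
level_gap = abs(depression_level(real) - depression_level(estimated))
same_category = category(depression_level(real)) == category(depression_level(estimated))
print(n_hits, level_gap, same_category)
```

In this example the system misses two individual answers, yet its overall level differs by only 2 points and falls in the same category, which shows why the three criteria can rank systems differently.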

To gain access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed agreement form, you can proceed to register for the lab at the CLEF 2021 Labs Registration site.

  • 16/11/2020: Registration for lab opens
  • 30/11/2020: Release of the training data (T2, T3)
  • 01/02/2021: Release of the data (T3). T1, T2: Beginning of test stage (server opens)
  • 16/04/2021: T1, T2: End of test stage (server closes). T3: Deadline for submitting participants' results
  • 07/05/2021: Release of evaluation results to all participants
  • 28/05/2021: Task participant papers due
  • 11/06/2021: Notification of acceptance
  • 02/07/2021: Camera ready. Task participant papers

Programme


Thursday Sep 23rd

eRisk Session 1 (15:30-17:00 GMT+3)

Chair: Patricia Martín-Rodilla

15:30-16:00

Overview of eRisk. Javier Parapar, Patricia Martín-Rodilla, David E. Losada, Fabio Crestani

Overview of eRisk at CLEF 2021: Early Risk Prediction on the Internet (Extended Overview)

16:00-16:20

Juan Martín Loyola, Sergio Burdisso, Horacio Thompson, Leticia Cagnina, Marcelo Errecalde

UNSL at eRisk 2021: A Comparison of Three Early Alert Policies for Early Risk Detection       

16:20-16:40

Diana Inkpen, Ruba Skaik, Prasadith Buddhitha, Dimo Angelov, Maxwell Thomas Fredenburgh

uOttawa at eRisk 2021: Automatic Filling of the Beck's Depression Inventory Questionnaire using Deep Learning

16:40-17:00

Diego Maupomé, Maxime D. Armstrong, Fanny Rancourt, Thomas Soulas, Marie-Jean Meurs

Early Detection of Signs of Pathological Gambling, Self-Harm and Depression through Topic Extraction and Neural Networks



eRisk Session 2 (17:30-19:00 GMT+3)

Chair: David E. Losada

17:30-17:50

Hassan Alhuzali, Tianlin Zhang, Sophia Ananiadou

Predicting Sign of Depression via Using Frozen Pre-trained Models and Random Forest Classifier

17:50-18:10

Tanmay Basu, Georgios V Gkoutos

Exploring the Performance of Baseline Text Mining Frameworks for Early Prediction of Self Harm Over Social Media

18:10-18:30

Raffaele Manna, Johanna Monti

UniOR NLP at eRisk 2021: Assessing the Severity of Depression with Part of Speech and Syntactic Features

18:30-18:50

Lucas Barros, Alina Trifan, José Luis Oliveira

VADER meets BERT: sentiment analysis for early detection of signs of self-harm through social mining



Friday Sep 24th

eRisk Session 3 (11:30-13:00 GMT+3)

Chair: Javier Parapar

11:30-11:50

Shih-Hung Wu, Zhao-Jun Qiu

A RoBERTa-based model on measuring the severity of the signs of depression   

11:50-12:10

Qamar Un Nisa, Rafi Muhammad

Towards transfer learning using BERT for early detection of self-harm of social media users       

12:10-12:30

Christoforos Spartalis, George Drosatos, Avi Arampatzis

Transfer Learning for Automated Responses to the BDI Questionnaire

12:30-12:50

Ana-Maria Bucur, Adrian Cosma, Liviu P. Dinu

Early Risk Detection of Pathological Gambling, Self-Harm and Depression Using BERT



eRisk Session 4 (14:00-15:30 GMT+3)

Chair: Fabio Crestani

14:00-14:20

Elena Campillo-Ageitos, Hermenegildo Fabregat, Lourdes Araujo, Juan Martinez-Romo

NLP-UNED at eRisk 2021: self-harm early risk detection with TF-IDF and linguistic features

14:20-14:40

Angelo Basile, Mara Chinea-Rios, Ana-Sabina Uban, Thomas Müller, Luise Rössler, Seren Yenikent, Berta Chulví, Paolo Rosso, Marc Franco-Salvador

UPV-Symanto at eRisk 2021: Mental Health Author Profiling for Early Risk Prediction on the Internet

14:40-15:00

Rui Pedro Lopes

CeDRI at eRisk 2021: A Naive Approach to Early Detection of Psychological Disorders in Social Media

15:00-15:30

eRisk Wrap-up Session


Organizers


More information


+34 881 016 027
