eRisk 2024:

Early risk prediction on the Internet


CLEF 2024 Workshop

Grenoble, 9-12 September 2024


eRisk explores the evaluation methodology, effectiveness metrics and practical applications of early risk detection on the Internet. Early detection technologies can be employed in different areas, particularly those related to health and safety. For instance, early alerts could be sent when a predator starts interacting with a child for sexual purposes, or when a potential offender starts publishing antisocial threats on a blog, forum or social network. Our main goal is to pioneer a new interdisciplinary research area that is potentially applicable to a wide variety of situations and to many different personal profiles. Examples include potential paedophiles, stalkers, individuals who could fall into the hands of criminal organisations, people with suicidal inclinations, or people susceptible to depression.

Participate


This is the eighth year of eRisk and the lab plans to organize three tasks:

Task 1: Search for symptoms of depression

This is a continuation of eRisk 2023's Task 1.

The task consists of ranking sentences from a collection of user writings according to their relevance to a depression symptom. The participants will have to provide rankings for the 21 symptoms of depression from the BDI Questionnaire. A sentence will be deemed relevant to a BDI symptom when it conveys information about the user's state concerning the symptom. That is, it may be relevant even when it indicates that the user is ok with the symptom.

We will release a TREC-formatted, sentence-tagged dataset (based on past eRisk data) together with the BDI questionnaire. Participants are free to decide on the best strategy to derive queries from the descriptions of the BDI symptoms in the questionnaire.

After receiving the runs from the participating teams, we will create the relevance judgements with the help of human assessors using pooling. We will use the resulting qrels to evaluate the systems with classical ranking metrics (e.g. MAP, nDCG). This new corpus with annotated sentences will be a valuable resource with multiple applications beyond eRisk.
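As a minimal sketch of the pooling and evaluation steps described above, the following Python code builds a depth-k pool from several runs and computes average precision (the per-symptom component of MAP) against a set of judged sentences. The function names, the pool depth and the toy sentence ids are illustrative assumptions, not the lab's actual tooling or settings.

```python
from collections import defaultdict


def build_pool(runs, depth):
    """Union of the top-`depth` sentence ids of every run, per symptom."""
    pool = defaultdict(set)
    for run in runs:
        for symptom, ranking in run.items():
            pool[symptom].update(ranking[:depth])
    return dict(pool)


def average_precision(ranking, relevant):
    """Average precision of one ranked list given the relevant sentence ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, sentence_id in enumerate(ranking, start=1):
        if sentence_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Two toy runs for symptom 1 (all ids are made up for illustration).
run_a = {1: ["s1", "s2", "s3", "s4"]}
run_b = {1: ["s5", "s3", "s1", "s6"]}

# Sentences that assessors would be asked to judge with a pool depth of 2.
pool = build_pool([run_a, run_b], depth=2)

# Hypothetical judgements: the assessors found s1 and s3 relevant.
qrels = {1: {"s1", "s3"}}

ap_a = average_precision(run_a[1], qrels[1])
ap_b = average_precision(run_b[1], qrels[1])
print(sorted(pool[1]), round(ap_a, 4), round(ap_b, 4))
```

MAP for a system would then be the mean of these average precision values over the 21 symptoms.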

The task is organized into two different stages:

  • Submission stage. After the release of the datasets, the participants will have time to produce their TREC-formatted runs and upload them to the FTP. Each participant may upload up to 5 files, corresponding to 5 systems. The format of the run should be as follows:
    1	Q0	sentence-id-121	0001	10	myGroupNameMyMethodName
    1	Q0	sentence-id-234	0002	9.5	myGroupNameMyMethodName
    1	Q0	sentence-id-345	0003	9	myGroupNameMyMethodName
    ...
    21	Q0	sentence-id-456	0998	1.25	myGroupNameMyMethodName
    21	Q0	sentence-id-242	0999	1	myGroupNameMyMethodName
    21	Q0	sentence-id-347	1000	0.9	myGroupNameMyMethodName	
    
    That is, the participants should submit up to 1000 results sorted by estimated relevance for each of the 21 symptoms of the BDI-II questionnaire. Each line contains: symptom_number, Q0, sentence-id, position_in_ranking, score, system_name.
  • Evaluation stage. Once the submission stage is closed, the submitted runs will be used to obtain the relevance judgements by classical pooling strategies using human assessors. The systems will be evaluated with those judgements.
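The run format above can be produced with a few lines of Python. This is only a sketch: the function name `format_run` and the toy scores are hypothetical, and the tab separator and four-digit zero-padded positions are assumptions read off the example lines rather than stated requirements.

```python
def format_run(scores_by_symptom, system_name, max_results=1000):
    """Return TREC-style run lines, one ranking per symptom.

    scores_by_symptom maps a symptom number (1-21) to a dict of
    {sentence_id: score}. Higher scores mean higher estimated relevance.
    """
    lines = []
    for symptom in sorted(scores_by_symptom):
        # Sort by descending score and keep at most max_results sentences.
        ranked = sorted(
            scores_by_symptom[symptom].items(),
            key=lambda item: item[1],
            reverse=True,
        )[:max_results]
        for position, (sentence_id, score) in enumerate(ranked, start=1):
            lines.append(
                f"{symptom}\tQ0\t{sentence_id}\t{position:04d}\t{score:g}\t{system_name}"
            )
    return lines


# Toy scores for symptom 1 only (made up for illustration).
toy_scores = {
    1: {"sentence-id-234": 9.5, "sentence-id-121": 10, "sentence-id-345": 9},
}
run_lines = format_run(toy_scores, "myGroupNameMyMethodName")
print("\n".join(run_lines))
```

Writing `run_lines` to a file, one line each, gives one of the up to 5 run files a team may upload.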

The proceedings of the lab will be published in the online CEUR-WS Proceedings and on the conference website.

To have access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed form, you can proceed to register for the lab at the CLEF 2024 Labs Registration site.

Important Dates

Task 2: Early Detection of Signs of Anorexia

This is a continuation of eRisk 2018's T2 and 2019's T1 tasks.

The challenge consists of sequentially processing pieces of evidence to detect early traces of anorexia as soon as possible. The task is mainly concerned with evaluating Text Mining solutions and, thus, it concentrates on texts written in Social Media. Texts should be processed in the order they were created. In this way, systems that effectively perform this task could be applied to sequentially monitor user interactions in blogs, social networks, or other types of online media.

The test collection for this task has the same format as the collection described in [Losada & Crestani 2016]. The source of data is also the same as that used for previous eRisk editions. It is a collection of writings (posts or comments) from a set of Social Media users. There are two categories of users, individuals suffering from anorexia and control users, and, for each user, the collection contains a sequence of writings (in chronological order).

In 2019, we moved from a chunk-based release of data (used in 2017 and 2018) to an item-by-item release of data. We set up a server that iteratively gives user writings to the participating teams. More information about the server is given here. In 2024, the server will be used to provide the users' writings during the test stage.

The task is organized into two different stages:

  • Training stage. Initially, the teams that participate in this task will have access to a training stage where we will release the whole history of writings for a set of training users (we will provide all writings of all training users), and we will indicate which users have explicitly mentioned that they have been diagnosed with anorexia. The participants can therefore tune their systems with the training data. In 2024, the training data for Task 2 is composed of both 2018 and 2019 users.
  • Test stage. The test stage will consist of a period of time where the participants have to connect to our server and iteratively get user writings and send responses. More information on the eRisk server that will be used at test time is available here.
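The item-by-item setting can be pictured with the following sketch, which reads a user's writings in chronological order and stops as soon as the accumulated evidence crosses a threshold. It does not talk to the eRisk server; `monitor_user`, `toy_score`, the cue words and the threshold are all hypothetical stand-ins for a real system.

```python
def monitor_user(writings, score_writing, threshold):
    """Process writings in order and decide as early as possible.

    Returns (decision, k): decision is True if an alert was raised, and k
    is the number of writings that had been read when the loop stopped.
    """
    evidence = 0.0
    for k, text in enumerate(writings, start=1):
        evidence += score_writing(text)
        if evidence >= threshold:
            return True, k
    return False, len(writings)


def toy_score(text):
    """Hypothetical stand-in for a trained model: counts two cue words."""
    cues = ("calories", "fasting")
    lowered = text.lower()
    return sum(lowered.count(cue) for cue in cues)


# Made-up writings of a single user, oldest first.
writings = [
    "Went hiking today",
    "Counting calories again",
    "Fasting and counting calories all week",
]

decision, delay = monitor_user(writings, toy_score, threshold=2)
print(decision, delay)
```

The pair returned here (decision and number of writings seen) is the kind of information a delay-aware evaluation needs for each user.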

Evaluation: The evaluation will take into account not only the correctness of the system's output (i.e. whether or not the user shows signs of anorexia) but also the delay taken to emit its decision. To meet this aim, we will consider the ERDE metric proposed in [Losada & Crestani 2016] and other alternative evaluation measures. A full description of the evaluation metrics can be found in the 2021 eRisk overview.
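A minimal sketch of ERDE follows, assuming the definition from [Losada & Crestani 2016]: a false positive costs c_fp, a false negative costs c_fn, a true negative costs nothing, and a true positive costs lc_o(k) · c_tp, where lc_o(k) = 1 - 1/(1 + e^(k-o)) grows with the number k of writings seen before the decision. The cost values in the example are illustrative placeholders, not the settings used by the lab.

```python
import math


def latency_cost(k, o):
    """lc_o(k) = 1 - 1 / (1 + exp(k - o)), written in a numerically stable form."""
    x = k - o
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def erde(decision, truth, k, o, c_fp, c_fn=1.0, c_tp=1.0):
    """Error for one user: decision/truth are booleans, k is the delay."""
    if decision and truth:
        return latency_cost(k, o) * c_tp  # correct alert, penalised by delay
    if decision and not truth:
        return c_fp  # false alarm
    if truth:
        return c_fn  # missed case
    return 0.0  # correct non-alert


def mean_erde(results, o, c_fp):
    """Average ERDE over (decision, truth, k) triples, one per user."""
    return sum(erde(d, t, k, o, c_fp) for d, t, k in results) / len(results)


# Toy results for four users (made up), with o = 5 and c_fp = 0.1.
toy_results = [
    (True, True, 5),    # alert exactly at k = o: cost 0.5
    (True, False, 3),   # false positive: cost 0.1
    (False, True, 40),  # false negative: cost 1.0
    (False, False, 40), # true negative: cost 0.0
]
print(mean_erde(toy_results, o=5, c_fp=0.1))
```

A late but correct alert is therefore scored almost like a miss, which is what makes the measure sensitive to delay.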

The proceedings of the lab will be published in the online CEUR-WS Proceedings and on the conference website.

To have access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed form, you can proceed to register for the lab at the CLEF 2024 Labs Registration site.

Important Dates

Task 3: Measuring the severity of the signs of Eating Disorders

This is a continuation of 2022 and 2023's Task 3. The task consists of estimating the level of features associated with a diagnosis of eating disorders from a thread of user submissions. For each user, the participants will be given a history of postings and will have to fill in a standard eating disorder questionnaire (based on the evidence found in the history of postings).

The questionnaire is derived from the Eating Disorder Examination Questionnaire (EDE-Q), a 28-item self-reported questionnaire adapted from the semi-structured interview Eating Disorder Examination (EDE). We will only use questions 1-12 and 19-28. It is designed to assess the range and severity of features associated with a diagnosis of eating disorder using 4 subscales (Restraint, Eating Concern, Shape Concern and Weight Concern) and a global score:

Instructions:

The following questions are concerned with the past four weeks (28 days) only. Please read each question carefully. Please answer all the questions. Thank you.

1. Have you been deliberately trying to limit the amount of food you eat to influence your shape or weight (whether or not you have succeeded)?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY					

2. Have you gone for long periods of time (8 waking hours or more) without eating anything at all in order to influence your shape or weight?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY

3. Have you tried to exclude from your diet any foods that you like in order to influence your shape or weight (whether or not you have succeeded)?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY

4. Have you tried to follow definite rules regarding your eating (for example, a calorie limit) in order to influence your shape or weight (whether or not you have succeeded)?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY
					  
5. Have you had a definite desire to have an empty stomach with the aim of influencing your shape or weight?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY
					  
6. Have you had a definite desire to have a totally flat stomach?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY

7. Has thinking about food, eating or calories made it very difficult to concentrate on things you are interested in (for example, working, following a conversation, or reading)?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY
					  
8. Has thinking about shape or weight made it very difficult to concentrate on things you are interested in (for example, working, following a conversation, or reading)?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY				

9. Have you had a definite fear of losing control over eating?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  					  
10. Have you had a definite fear that you might gain weight?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  
11. Have you felt fat?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  

12. Have you had a strong desire to lose weight?
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  
		
/* Questions 13 to 18 from EDE-Q v6.0 will not be used */
					  
19. Over the past 28 days, on how many days have you eaten in secret (ie, furtively)? ... Do not count episodes of binge eating.
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  
					  
20. On what proportion of the times that you have eaten have you felt guilty (felt that you’ve done wrong) because of its effect on your shape or weight? ... Do not count episodes of binge eating.
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY	
					  
					  
21. Over the past 28 days, how concerned have you been about other people seeing you eat? ... Do not count episodes of binge eating 
0. NO DAYS
1. 1-5 DAYS
2. 6-12 DAYS
3. 13-15 DAYS
4. 16-22 DAYS
5. 23-27 DAYS
6. EVERY DAY						  
					  		
					  				
22. Has your weight influenced how you think about (judge) yourself as a person? 
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)
					  
					  					  
23. Has your shape influenced how you think about (judge) yourself as a person?  
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)
					  		
					  				  									  
24. How much would it have upset you if you had been asked to weigh yourself once a week (no more, or less, often) for the next four weeks? 
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)		
					  		
					  		
25. How dissatisfied have you been with your weight? 
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)	
					  
					  
26. How dissatisfied have you been with your shape?
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)
					  
					  
27. How uncomfortable have you felt seeing your body (for example, seeing your shape in the mirror, in a shop window reflection, while undressing or taking a bath or shower)? 
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)
					  	
					  				  						  
28. How uncomfortable have you felt about others seeing your shape or figure (for example, in communal changing rooms, when swimming, or wearing tight clothes)? 
0. NOT AT ALL (0)
1. SLIGHTLY (1)
2. SLIGHTLY (2)
3. MODERATELY (3)
4. MODERATELY (4)
5. MARKEDLY (5)
6. MARKEDLY (6)
					  					  	

This task aims therefore at exploring the viability of automatically estimating the severity of multiple symptoms associated with eating disorders. Given the user's history of writings, the algorithms have to estimate the user's response to each individual question. We collected questionnaires filled by Social Media users together with their history of writings (we extracted each history of writings right after the user provided us with the filled questionnaire). The questionnaires filled by the users (ground truth) will be used to assess the quality of the responses provided by the participating systems.

The participants will be given a dataset with multiple users (for each user, its history of writings is provided) and they will be asked to produce a file with the following structure:

username1 answer1 answer2 .... answer28
username2 ....
....
								

Each line has the username and 22 values, one for each question used (questions 1-12 and 19-28). These values correspond to the responses to the questions above (the possible values are 0,1,2,3,4,5,6).
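A short sketch of how one line of this file could be built and checked is given below. The helper name `format_line` and the single-space separator are assumptions; the question numbers and the 0-6 range come from the description above.

```python
# Questions used in the task: 1-12 and 19-28, i.e. 22 values per user.
QUESTION_IDS = list(range(1, 13)) + list(range(19, 29))


def format_line(username, answers):
    """Build one output line from a dict {question number: answer (0-6)}."""
    missing = [q for q in QUESTION_IDS if q not in answers]
    if missing:
        raise ValueError(f"missing answers for questions {missing}")
    values = []
    for q in QUESTION_IDS:
        value = answers[q]
        if not isinstance(value, int) or not 0 <= value <= 6:
            raise ValueError(f"question {q}: answer must be an integer in 0..6")
        values.append(str(value))
    return " ".join([username] + values)


# Toy prediction: every question answered with 0 (made up for illustration).
toy_answers = {q: 0 for q in QUESTION_IDS}
line = format_line("username1", toy_answers)
print(line)
```

Validating the range and the number of values before uploading avoids submitting lines that cannot be scored.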

The task is organized into two different stages:

  • Training stage. Initially, the teams that participate in this task will have access to a training stage where we will release the whole history of writings for a set of training users (we will provide all writings of all training users), together with their answers to the EDE-Q questionnaire. The participants can therefore tune their systems with the training data. In 2024, the training data for Task 3 is composed of the 2022 and 2023 users.
  • Test stage. A new set of users' writing histories will be provided to the participants. They have to use their trained models to produce predictions for the EDE-Q questionnaire and upload the results in the aforementioned format to the FTP.

Evaluation: The evaluation will take into account the answers to the questionnaire given by the systems and the actual responses from the users. A full description of the evaluation metrics can be found in the 2022 eRisk overview.
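To give a feel for this kind of comparison, the sketch below computes a mean absolute error over the answered questions and the four subscale scores with a global score. It is not the official evaluation script: the item-to-subscale mapping is an assumption based on the usual EDE-Q 6.0 scoring and should be checked against the official scoring instructions, and the metrics actually used are those described in the overview.

```python
# Assumed EDE-Q 6.0 item-to-subscale mapping (verify before relying on it).
SUBSCALES = {
    "restraint": [1, 2, 3, 4, 5],
    "eating_concern": [7, 9, 19, 20, 21],
    "shape_concern": [6, 8, 10, 11, 23, 26, 27, 28],
    "weight_concern": [8, 12, 22, 24, 25],
}


def subscale_scores(answers):
    """Mean answer per subscale, plus the mean of the four subscale scores."""
    scores = {
        name: sum(answers[q] for q in items) / len(items)
        for name, items in SUBSCALES.items()
    }
    scores["global"] = sum(scores.values()) / len(SUBSCALES)
    return scores


def mean_absolute_error(predicted, actual):
    """Average absolute difference over the questions present in `actual`."""
    return sum(abs(predicted[q] - actual[q]) for q in actual) / len(actual)


# Toy questionnaires (made up): questions 1-12 and 19-28.
question_ids = list(range(1, 13)) + list(range(19, 29))
actual = {q: 5 for q in question_ids}
predicted = {q: 3 for q in question_ids}

print(mean_absolute_error(predicted, actual))
print(subscale_scores(predicted))
```

Comparing subscale scores as well as individual answers shows whether a system captures the overall severity profile even when single answers are off.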

To have access to the collection, all participants have to fill in, sign and send a user agreement form (follow the instructions provided here). Once you have submitted the signed form, you can proceed to register for the lab at the CLEF 2024 Labs Registration site.

Ongoing schedule



20 DEC
  • Registration for lab opens
  • 20/12/2023

22 DEC
  • Release of the training data (T1,T2,T3) and test dataset for T1
  • 22/12/2023

05 FEB
  • Release of the test data (T3). T2: Beginning of test stage (server opens)
  • 05/02/2024

01 APR
  • T1 deadline for submitting participants' results to FTP
  • 01/04/2024

12 APR
  • T2 End of test stage (server closes). T3: deadline for submitting participants' results to FTP
  • 12/04/2024

01 MAY
  • Release of evaluation results to all participants
  • 01/05/2024

31 MAY
  • Task participant papers due
  • 31/05/2024

24 JUN
  • Notification of acceptance
  • 24/06/2024

08 JUL
  • Camera ready. Task participant papers
  • 08/07/2024

Programme


Monday Sep 9th

eRisk Session Task 1 (14:00-15:30 CET)

Chair:

14:00-14:10

Javier Parapar, Patricia Martín-Rodilla, David E. Losada, Fabio Crestani

Overview of eRisk at CLEF 2024: Early Risk Prediction on the Internet (Task 1)

14:10-14:20

Alba María Mármol-Romero, Adrián Moreno-Muñoz, Pablo Álvarez-Ojeda, Karla María Valencia-Segura, Eugenio Martínez-Cámara, Manuel García-Vega, Arturo Montejo-Ráez

SINAI at eRisk@ CLEF 2024: Approaching the Search for Symptoms of Depression and Early Detection of Anorexia Signs using Natural Language Processing.

14:20-14:30

Diego Maupomé, Yves Ferstler, Sébastien Mosser, Marie-Jean Meurs

Automatically Finding Evidence and Predicting Answers in Mental Health Self-Report Questionnaires

14:30-14:40

Beng Heng Ang, Sujatha Das Gollapalli, See-Kiong Ng

NUS-IDS@eRisk2024: Ranking Sentences for Depression Symptoms using Early Maladaptive Schemas and Ensembles

14:40-14:50

Raluca-Maria Hanciu

MindwaveML at eRisk 2024: Identifying Depression Symptoms in Reddit Users

14:50-15:00

Anna Barachanou, Filareti Tsalakanidou, Symeon Papadopoulos

REBECCA at eRisk 2024: Search for Symptoms of Depression Using Sentence Embeddings and Prompt-Based Filtering

15:00-15:10

David Guecha, Aaryan Potdar, Anthony Miyaguchi

DS@GT eRisk 2024: Sentence Transformers for Social Media Risk Assessment

15:10-15:20

Alejandro Pardo Bascuñana, Isabel Segura Bedmar

APB-UC3M at eRisk 2024: Natural Language Processing and Deep Learning for the Early Detection of Mental Disorders



Monday Sep 9th

Poster Session (15:30-16:30 CET)

All participants



Monday Sep 9th

eRisk Session Task 2 (16:30-18:00 CET)

Chair:

16:30-16:40

Javier Parapar, Patricia Martín-Rodilla, David E. Losada, Fabio Crestani

Overview of eRisk at CLEF 2024: Early Risk Prediction on the Internet (Task 2)

16:40-16:50

Prateek Sarangi, Sumit Kumar, Shraddha Agarwal, Tanmay Basu

A Natural Language Processing Based Framework for Early Detection of Anorexia via Sequential Text Processing

16:50-17:00

Oskar Riewe-Perła, Agata Filipowska

Combining Recommender Systems and Language Models in Early Detection of Signs of Anorexia

17:00-17:10

Horacio Thompson, Marcelo Errecalde

A Time-Aware Approach to Early Detection of Anorexia: UNSL at eRisk 2024

17:10-17:20

Ronghao Pan, José Antonio García-Díaz, Tomás Bernal-Beltrán, Rafael Valencia-Garcia

UMUTeam at eRisk@CLEF 2024: Fine-Tuning Transformer Models with Sentiment Features for Early Detection and Severity Measurement of Eating Disorders

17:20-17:30

Andreu Casamayor, Vicent Ahuir, Antonio Molina, Lluís-Felip Hurtado

ELiRF-VRAIN at eRisk 2024: Using LongFormers for Early Detection of Signs of Anorexia

17:30-17:40

Hermenegildo Fabregat, Daniel Deniz, Andres Duque, Lourdes Araujo, Juan Martinez-Romo

NLP-UNED at eRisk 2024: Approximate Nearest Neighbors with Encoding Refinement for Early Detecting Signs of Anorexia

17:40-17:50

Alba María Mármol-Romero, Adrián Moreno-Muñoz, Pablo Álvarez-Ojeda, Karla María Valencia-Segura, Eugenio Martínez-Cámara, Manuel García-Vega, Arturo Montejo-Ráez

SINAI at eRisk@ CLEF 2024: Approaching the Search for Symptoms of Depression and Early Detection of Anorexia Signs using Natural Language Processing.

17:50-18:00

Alejandro Pardo Bascuñana, Isabel Segura Bedmar

APB-UC3M at eRisk 2024: Natural Language Processing and Deep Learning for the Early Detection of Mental Disorders



Tuesday Sep 10th

eRisk Session Task 3 and Wrap-Up (11:10-12:40 CET)

Chair:

11:10-11:20

Javier Parapar, Patricia Martín-Rodilla, David E. Losada, Fabio Crestani

Overview of eRisk at CLEF 2024: Early Risk Prediction on the Internet (Task 3)

11:20-11:30

Alejandro Pardo Bascuñana, Isabel Segura Bedmar

APB-UC3M at eRisk 2024: Natural Language Processing and Deep Learning for the Early Detection of Mental Disorders

11:30-11:40

David Guecha, Aaryan Potdar, Anthony Miyaguchi

DS@GT eRisk 2024: Sentence Transformers for Social Media Risk Assessment

11:40-11:50

Diego Maupomé, Yves Ferstler, Sébastien Mosser, Marie-Jean Meurs

Automatically Finding Evidence and Predicting Answers in Mental Health Self-Report Questionnaires

11:50-12:00

Sachin Prasanna, Abhayjit Singh Gulati, Subhojit Karmakar, M Yoga Hiranmayi, Anand Kumar Madasamy

Measuring the severity of the signs of Eating Disorders using Machine Learning Techniques

12:00-12:10

Ronghao Pan, José Antonio García-Díaz, Tomás Bernal-Beltrán, Rafael Valencia-Garcia

UMUTeam at eRisk@CLEF 2024: Fine-Tuning Transformer Models with Sentiment Features for Early Detection and Severity Measurement of Eating Disorders

12:10-12:40

Javier Parapar, Patricia Martín-Rodilla, David E. Losada, Fabio Crestani

eRisk wrap-up session: feedback, closing and future.



Organizers


More information


+34 881 016 027

CLEF 2024 Conference & CLEF initiative:

CLEF 2024
CLEF

Funded by Big-eRisk: Predicción temprana de riesgos personales en conjuntos de datos masivos. Ministerio de Ciencia e Innovación, Agencia Estatal de Investigación, Plan de Recuperación, Transformación y Resiliencia, Unión Europea-Next Generation EU PLEC2021-007662
