Creating screening questionnaires from archival social media data

By Ortal Ashkenazi
Location: Bloomfield 527
Advisor(s): Dr. Elad Yom-Tov and Dr. Ofra Amir
Academic Program: IE
Sunday 03 May 2020, 14:30 - 15:30

Screening questionnaires are used in medicine as a diagnostic aid. Creating them is a long and expensive process, which might be automated through analysis of social media posts related to symptoms and behaviors prior to diagnosis. Here we propose a method for generating a screening questionnaire for a given medical condition from social media postings. The method first identifies a cohort of users through their posts in dedicated patient groups, and a control group of users who reported similar symptoms but did not report being diagnosed with the condition of interest. Posts made prior to diagnosis are used to generate decision rules that differentiate between the groups, by clustering the symptoms mentioned therein and training a decision tree. We validate the generated rules by correlating them with the scores given by medical doctors to matching cases. Questionnaires for three conditions (endometriosis, lupus, and gout) were produced using the data of several hundred users from Reddit and rated by doctors. The average Pearson correlation between the medical doctors' scores and the decision rules was 0.58 (endometriosis), 0.40 (lupus), and 0.27 (gout). Our results suggest that the process of questionnaire generation can be, at least partly, automated. The generated questionnaires are advantageous in that they are based on real-person experience, but they currently lack the ability to capture the context, duration, and timing of symptoms.
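
As an illustration of the validation step described in the abstract, the following is a minimal Python sketch of a Pearson correlation between rule outputs and doctor ratings. It is not the speaker's implementation: the function name `pearson`, the variable names, and all numbers are hypothetical toy values chosen for the example, and the abstract does not specify how the rule outputs or the doctors' scores were scaled.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, NOT taken from the study:
# one rule-based score and one doctor rating per case.
rule_scores = [0.9, 0.2, 0.7, 0.1, 0.6]
doctor_scores = [5, 2, 4, 1, 3]

r = pearson(rule_scores, doctor_scores)
print(f"Pearson r = {r:.2f}")
```

A value of r near 1 would mean the rules rank cases much as the doctors do; the values reported in the abstract (0.27 to 0.58) indicate weaker, condition-dependent agreement.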