3. Evaluation

Version 11.1 by William OGrady on 2024/03/25 14:15

This section describes a study that evaluates the effectiveness of the system's information retrieval capabilities.

Contents:

a. Prototype

b. Pilot Study

c. Test

Setup

Concrete Assignment: The experiment will be run with groups of 3-4 participants, each given a 10-minute time slot to interact with the NAO. Because of time constraints and the language limitations of the NAO, we are unable to test with Dutch people with dementia (PwD). Instead, to simulate the effect of early-stage dementia symptoms, the NAO is loaded with data for a pre-defined persona, and the participants are tasked with retrieving key elements of that persona by interacting with the NAO. Since this is not their own history, it will be as if they have 'forgotten' this information. After the 10 minutes of interaction, the participants will complete two evaluation surveys. The first measures how much of the persona's information was discovered and retained through use of the NAO. The second lets participants indicate their emotional experience with the NAO.
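As an illustration of how the first survey could be scored, the sketch below computes the fraction of persona facts a participant answered correctly. The function name `recall_score`, the fact keys, and the example persona are all hypothetical and not part of the actual study materials; exact matching after normalizing case and whitespace is also an assumption, and free-text answers would probably need manual grading.

```python
def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences don't count as errors."""
    return " ".join(text.lower().split())


def recall_score(persona_facts: dict[str, str], answers: dict[str, str]) -> float:
    """Return the fraction of persona facts the participant answered correctly.

    persona_facts: ground-truth facts stored for the persona (hypothetical keys).
    answers: the participant's survey answers, keyed the same way.
    Missing answers count as incorrect.
    """
    if not persona_facts:
        raise ValueError("persona_facts must not be empty")
    correct = sum(
        1
        for key, truth in persona_facts.items()
        if normalize(answers.get(key, "")) == normalize(truth)
    )
    return correct / len(persona_facts)


# Hypothetical persona and survey answers, for illustration only.
persona = {
    "spouse_name": "Anna",
    "hometown": "Delft",
    "former_job": "teacher",
    "pet": "cat",
}
answers = {
    "spouse_name": " anna ",   # correct after normalization
    "hometown": "Leiden",      # incorrect
    "former_job": "Teacher",   # correct after normalization
    # "pet" left unanswered, counted as incorrect
}

score = recall_score(persona, answers)
print(f"Retrieved {score:.0%} of the persona facts")
```

A per-participant score like this could then be averaged per group, which keeps the retrieval measure separate from the emotional-experience survey.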

Notes:

We are testing out use-case 2.

Setup:

We recruit random test persons. Each person takes the role of someone with early-stage dementia who has forgotten relationships (we might have to focus specifically on memory).

The person's goal is to find out as much information as possible from the memory bank of a person they don't know.

We mainly test whether the robot and the implementation are effective in memory retrieval.

We ONLY test with students, not PwD.

PILOT STUDY!!!

Task list:

0. What do we want to evaluate?

  a. Work out which use case we want to evaluate and which assumptions/claims we want to test.

1. Write down test proposal & content

  a. introduction information

  b. how it gets processed

  c. informed consent, including that participants can leave at any time

  d. questions for evaluation of system

  e. data analysis we are going to do

2. Obtain ethical approval from the relevant ethics board at TU Delft

3. Implement the test on the robot

4. Evaluate with students