Changes for page 4. Evaluation Methods

Last modified by Manali Shah on 2023/04/10 12:28

From version 3.1
edited by Manali Shah
on 2023/03/22 23:08
To version 6.1
edited by Manali Shah
on 2023/03/30 18:43

Page properties
Content
... ... @@ -1,15 +1,12 @@
1 -The following steps will be used to design and evaluate the prototype proposed against the corresponding control condition:
1 +The following steps were used to design and evaluate the proposed prototype against the corresponding control condition:
2 2  
3 -~1. Confirm the prototype: The prototype for the scenario to be tested, and the control situation will first be setup, and preliminary testing will be done by the team members. This includes the robots with and without interactive storytelling which should be confirmed and working.
3 +~1. Confirm the prototype: For the pilot study, the scenario to be tested and the control situation were set up at the Insyght Lab at TU Delft, and preliminary testing was done by the team members. This included confirming that the robots with and without interactive storytelling were working. The voice input and touch input to the robot were also verified.
4 4  
5 -2. Develop Questions:
5 +2. Develop Questions: We then developed the metrics on which the robot was to be evaluated. We decided to use a modified version of the Godspeed questionnaire, which each participant filled in after interacting with the robot. This questionnaire is described below.
6 6  
7 -3. Design Methods
7 +3. Invite participants: Due to limited time and resources, patients with dementia (the actual users) could not be recruited for the study. We instead used TU Delft students to test the prototype.
8 8  
9 -4. Implement and adapt:
10 10  
11 -5. Make decisions:
12 -
13 13  **Research Question**
14 14  
15 15  "Is interactive storytelling more engaging and beneficial than storytelling in the third person for persons suffering from dementia?"
... ... @@ -23,9 +23,19 @@
23 23  
24 24  **Summative Evaluation**
25 25  
26 -We will evaluate the prototype's effectiveness at the end of the experiment, i.e whether interactive storytelling was beneficial as compared to non interactive storytelling. Thus, we follow summative evaluation. Using a questionnaire, we will try to assess the usefulness and effectiveness of the robot. Due to limited time of the course, this will be the last evaluation. However, in the absence of time constraints, we would need to do a formative evaluation to get feedback for the next versions of the robot.
23 +We will evaluate the prototype's effectiveness at the end of the experiment, i.e., whether interactive storytelling was beneficial compared to non-interactive storytelling. Since we are comparing two robots, we follow a summative evaluation approach. Using a questionnaire, we will try to assess the usefulness and effectiveness of the robot. Due to the limited duration of the course, this will be the last evaluation. However, in the absence of time constraints, we would need to do a formative evaluation to get feedback for the next versions of the robot.
27 27  
28 28  
26 +**Questionnaire**
27 +
28 +We used a modified version of the Godspeed questionnaire for our evaluation [1]. It measures the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of the robot. Each item is rated on a scale from 1 to 5, with the two endpoints representing opposite poles. To measure whether patients with dementia completed the activity they were meant to do, and to evaluate whether storytelling made a difference to their meal, we added the following questions:
29 +
30 +
31 +-Modified Godspeed questionnaire for the robot
32 +
33 +-Statistical test (p-value) for evaluation
34 +
35 +
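The two notes above (scoring the questionnaire and running a statistical test) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the analysis actually performed: it assumes the two conditions are rated by independent groups of participants, that each subscale is scored as the mean of its 1-5 item ratings, and it uses an exact permutation test on the difference in group means because that works with very small samples and needs only the Python standard library. The function names and all ratings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def subscale_score(item_ratings):
    """Score one subscale for one participant as the mean of its 1-5 item ratings."""
    return mean(item_ratings)


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Enumerates every way of splitting the pooled scores into groups of the
    original sizes and counts how often the absolute mean difference is at
    least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so floating-point rounding does not exclude ties.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical likeability scores (one subscale mean per participant).
# These numbers are made up for illustration and are not study results.
interactive = [4.2, 4.6, 3.8, 4.4, 4.0]
non_interactive = [3.4, 3.8, 3.0, 4.0, 3.2]

p_value = exact_permutation_p_value(interactive, non_interactive)
print("mean (interactive):    ", round(mean(interactive), 2))
print("mean (non-interactive):", round(mean(non_interactive), 2))
print("permutation p-value:   ", round(p_value, 4))
```

Treating ratings as numeric and comparing means is itself an assumption; a rank-based test would be a reasonable alternative for ordinal data. If the same participants rated both robots, a paired analysis would be needed instead.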
29 29  **Prototype**
30 30  
31 31  We present a low-fidelity prototype of the robot, that is, a simple demonstration of the initial stages of the robot, meant for formative feedback. We use a Wizard-of-Oz approach, and for now present just one story (in interactive and non-interactive modes) for the purposes of the experiment. The final robot is expected to have various templates of stories.
... ... @@ -37,3 +37,7 @@
37 37  **Since we don't have many participants, should we skip the statistical test? Can we just report average values of responses for both scenarios?**
38 38  
39 39  **Should the questionnaire be a formal one, or should we ask 4-5 questions through Pepper? Or both?**
47 +
48 +
49 +
50 +[1] C. Bartneck, D. Kulić, E. Croft, and S. Zoghbi, “Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots,” International Journal of Social Robotics, vol. 1, no. 1, pp. 71–81, 2008.