Wiki source code of b. Test

Last modified by Manali Shah on 2023/04/11 18:38

= 1. Introduction =

We aim to measure the effectiveness of a robot with interactive storytelling, which provides personalization and opportunities for interaction and activities with the family members of the patients. The control situation was a storytelling robot that narrates the story without any interaction or personalization. We evaluate the claims made earlier using a modified Godspeed questionnaire. The questions added were:

~1. The mood of the patient after the meal with storytelling.

2. The feedback of the patient for the story.

3. The enjoyment level of the patient (given by the caregiver).

4. Was the meal completed? (given by the caregiver)

5. Time taken to complete the meal: a long duration could mean either that the patient did not enjoy the meal or that they were so engaged that it took longer, so this measure needs critical analysis.

The negative effects of Pepper were also measured in the questionnaire (covered by the Godspeed items):

~1. Pepper was annoying.

2. Pepper was not human.

3. Pepper was disturbing.

= 2. Method =

== 2.1 Participants ==

A total of 14 TU Delft students participated in the study: 9 male and 5 female. Due to limited time, the target users (patients with dementia) could not be included in the study, so all participants were students.

== 2.2 Experimental design ==

A within-subject experimental design was chosen because of the limited number of participants, so each participant interacted with the robot twice. The group was split in half: the first half interacted with the non-interactive storytelling robot first (control situation) and then with the interactive storytelling robot. For the second half of the group, this order was reversed. This counterbalancing was done to offset any carryover effect of interacting with one version before the other.

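The counterbalanced split described above can be sketched in a few lines of Python. This is only an illustration: the function name `assign_orders`, the participant IDs 1 to 14 and the random allocation with a fixed seed are assumptions, since the report does not state how participants were allocated to the two halves.

```python
import random


def assign_orders(participant_ids, seed=0):
    """Split participants into two equal halves with opposite condition orders.

    Hypothetical helper: the random shuffle is an assumption, the study only
    states that one half started with the control situation.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    control_first = ("non-interactive", "interactive")
    experimental_first = ("interactive", "non-interactive")
    # First half starts with the control situation, second half with the
    # experimental situation.
    orders = {pid: control_first for pid in ids[:half]}
    orders.update({pid: experimental_first for pid in ids[half:]})
    return orders


# 14 participants, as in the study; the IDs themselves are made up.
orders = assign_orders(range(1, 15))
for pid in sorted(orders):
    print(pid, "->", " then ".join(orders[pid]))
```

With 14 participants, 7 start with the control situation and 7 with the experimental situation, so any order effect is spread evenly over both versions of the robot.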
== 2.3 Tasks ==

The Pepper robot was powered on and connected to the laptop before the participants came in. Each participant signed the consent form, interacted with the robot twice (once per version) and filled in the questionnaire after each interaction.

== 2.4 Measures ==

The experiment measured the differences between the non-interactive storytelling robot (control situation) and the interactive storytelling robot (experimental situation). After each interaction, the participant filled in a questionnaire about their experience, with questions answered on a scale of 1 to 5. The link to the questionnaire is [[here>>https://tudelft.eu.qualtrics.com/jfe/preview/previewId/e0402219-f101-4c82-b651-98bbf7cdcc58/SV_8Gtn6s6loYOYXl4?Q_CHL=preview&Q_SurveyVersionID=current]].

The answers to both questionnaires were recorded, and a p-value was calculated for each question to assess the significance of any difference.

== 2.5 Procedure ==

For the experiment, the following steps were performed for each participant:

~1. Welcome the participant and explain their tasks.

2. Have them sign the consent form, which makes them aware of the data being collected.

3. Power on the version of the robot that their group starts with.

4. Have them interact with this first version of the robot.

5. Once completed, give them the questionnaire to record their experience.

6. Power on the second version of the robot and have them interact with it.

7. Once completed, give them the same questionnaire to record their experience.

8. The experiment is now complete and they can leave.

The non-interactive and interactive stories were hard-coded for this experiment and personalized based on a made-up scenario. In a real deployment, they must be adapted to the patient's own experiences. The same story was used in both situations. The only difference was that the non-interactive robot narrated the story without any scope for interaction between family members, whereas the interactive robot used the same story to promote engagement with the family.

== 2.6 Material ==

The following materials were used during the experiments:

~1. Laptops for the consent forms, the questionnaires and the code.

2. The Pepper robot.

= 3. Results =

With the answers from both questionnaires (i.e. the ones from the control and experimental situations), the following means were observed for the questions rated on a 1 to 5 scale:

[[image:1681231071484-260.png||height="326" width="444"]]

The x-axis denotes the question number from the [[questionnaire>>https://tudelft.eu.qualtrics.com/jfe/preview/previewId/e0402219-f101-4c82-b651-98bbf7cdcc58/SV_8Gtn6s6loYOYXl4?Q_CHL=preview&Q_SurveyVersionID=current]] (these questions are also listed in the table below) and the y-axis denotes the mean rating of that question across all participants. Interactive storytelling has a higher mean than non-interactive storytelling for most questions, which implies a more positive response from the participants towards the robot with interactive storytelling. This suggests that the interactive storytelling robot might also elicit a more positive response from patients with dementia.

To test whether these differences were significant, we performed a one-tailed paired-sample t-test for each question. The table below summarizes the p-values obtained for all the questions (robot characteristics).

(% style="width:509px" summary="Table showing obtained p-values for all questions" %)
|=(% style="width: 47px;" %)Question no.|=(% style="width: 279px;" %)Robot Characteristics (Questions from Questionnaire)|=(% style="width: 179px;" %)p-value
|(% style="width:47px" %)1|(% style="width:279px" %)//**1 - Fake, 5 - Natural**//|(% style="width:179px" %)//**0.04113**//
|(% style="width:47px" %)2|(% style="width:279px" %)1 - Machine-like, 5 - Human-like|(% style="width:179px" %)0.20205
|(% style="width:47px" %)3|(% style="width:279px" %)1 - Unconscious, 5 - Conscious|(% style="width:179px" %)0.14581
|(% style="width:47px" %)4|(% style="width:279px" %)1 - Artificial, 5 - Lifelike|(% style="width:179px" %)0.09371
|(% style="width:47px" %)5|(% style="width:279px" %)1 - Moving rigidly, 5 - Moving elegantly|(% style="width:179px" %)0.15096
|(% style="width:47px" %)6|(% style="width:279px" %)1 - Dead, 5 - Alive|(% style="width:179px" %)0.5
|(% style="width:47px" %)7|(% style="width:279px" %)1 - Stagnant, 5 - Lively|(% style="width:179px" %)0.30576
|(% style="width:47px" %)8|(% style="width:279px" %)1 - Mechanical, 5 - Organic|(% style="width:179px" %)0.40923
|(% style="width:47px" %)9|(% style="width:279px" %)1 - Artificial, 5 - Lifelike|(% style="width:179px" %)0.26641
|(% style="width:47px" %)10|(% style="width:279px" %)//**1 - Inert, 5 - Interactive**//|(% style="width:179px" %)//**0.00495**//
|(% style="width:47px" %)11|(% style="width:279px" %)//**1 - Apathetic, 5 - Responsive**//|(% style="width:179px" %)//**0.00759**//
|(% style="width:47px" %)12|(% style="width:279px" %)//**1 - Dislike, 5 - Like**//|(% style="width:179px" %)//**0.00143**//
|(% style="width:47px" %)13|(% style="width:279px" %)1 - Unfriendly, 5 - Friendly|(% style="width:179px" %)0.10392
|(% style="width:47px" %)14|(% style="width:279px" %)//**1 - Unkind, 5 - Kind**//|(% style="width:179px" %)//**0.02226**//
|(% style="width:47px" %)15|(% style="width:279px" %)//**1 - Unpleasant, 5 - Pleasant**//|(% style="width:179px" %)//**0.00845**//
|(% style="width:47px" %)16|(% style="width:279px" %)//**1 - Awful, 5 - Nice**//|(% style="width:179px" %)//**0.00329**//
|(% style="width:47px" %)17|(% style="width:279px" %)1 - Ignorant, 5 - Knowledgeable|(% style="width:179px" %)0.13154
|(% style="width:47px" %)18|(% style="width:279px" %)//**1 - Irresponsible, 5 - Responsible**//|(% style="width:179px" %)//**0.01425**//
|(% style="width:47px" %)19|(% style="width:279px" %)1 - Unintelligent, 5 - Intelligent|(% style="width:179px" %)0.37610
|(% style="width:47px" %)20|(% style="width:279px" %)1 - Foolish, 5 - Sensible|(% style="width:179px" %)0.06823
|(% style="width:47px" %)21|(% style="width:279px" %)1 - Anxious, 5 - Relaxed|(% style="width:179px" %)0.16778
|(% style="width:47px" %)22|(% style="width:279px" %)1 - Agitated, 5 - Calm|(% style="width:179px" %)0.05548
|(% style="width:47px" %)23|(% style="width:279px" %)//**1 - Quiescent, 5 - Surprised**//|(% style="width:179px" %)//**0.04806**//
|(% style="width:47px" %)24|(% style="width:279px" %)1 - Demotivated, 5 - Motivated|(% style="width:179px" %)0.10392
|(% style="width:47px" %)25|(% style="width:279px" %)//**Mood of the patient after the activity.**//|(% style="width:179px" %)//**0.00055**//
|(% style="width:47px" %)26|(% style="width:279px" %)//**Patient's feedback about the story experience**//|(% style="width:179px" %)//**0.00037**//
|(% style="width:47px" %)27|(% style="width:279px" %)//**Patient's enjoyment**//|(% style="width:179px" %)//**0.00016**//
|(% style="width:47px" %)28|(% style="width:279px" %)Did the patient complete the activity?|(% style="width:179px" %)NA
|(% style="width:47px" %)29|(% style="width:279px" %)How many minutes did the patient take to complete the activity?|(% style="width:179px" %)NA

With a significance level of 0.05, the questions with a p-value below this threshold are highlighted in the table. We observe that 12 of the 27 rated questions are statistically significant, including all 3 custom questions (Qs 25, 26, 27). We could not evaluate the last two questions (whether the patient completed the activity, and the time taken to complete it) because of the limited scope of the experiment: the participants were neither dementia patients nor actually eating, so it would not be meaningful to base any results on these questions.

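For readers who want to reproduce this kind of analysis, the following is a minimal sketch of a one-tailed paired-sample t-test for a single question, using only the Python standard library. The ratings are made-up illustrative values, not the study data, and the helper names (`t_pdf`, `t_sf`, `paired_t_one_tailed`) are hypothetical. The direction of the test (interactive rated higher than non-interactive) is also an assumption, since the report does not state it.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t), integrating the density with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3  # P(T > |t|), using the symmetry of the density
    return upper if t >= 0 else 1 - upper


def paired_t_one_tailed(experimental, control):
    """Return (t, p) for the alternative 'experimental > control' on paired ratings."""
    diffs = [e - c for e, c in zip(experimental, control)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, t_sf(t, n - 1)


# Illustrative 1-to-5 ratings for one question from 14 participants (NOT the study data).
interactive = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 4, 4, 5]
non_interactive = [3, 4, 4, 3, 4, 3, 4, 4, 3, 3, 4, 4, 3, 4]

t_stat, p_value = paired_t_one_tailed(interactive, non_interactive)
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.5f}")
```

The test works on the per-participant differences, which is why the within-subject design matters: each participant acts as their own control. When the two means are equal, the statistic is 0 and the one-tailed p-value is 0.5, as reported for question 6.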
The following graphs show the difference in means for the statistically significant questions:

[[image:https://lh6.googleusercontent.com/nTIJLrcxdbvmHRNU7AqEsBcRRH_29l73c4o1dbeVDAwvC9sKoupoUEHvoCqf3_LIAD2Uk51kRURT6jj5sMxTDnexvBHbWbyy9e-wSLmIyvowqEcNmeacWsCdbSXjGg0vVwtcPRg8hF_cSrtetKObNmITTg=s2048||height="361" width="509"]]

Difference in means for the questions in the Godspeed questionnaire

[[image:https://lh4.googleusercontent.com/I3voGI5LPNJH_jm41KqgOFHY1PixCJRBbZJosr0KlcuLrTlA7dC8dod_rU0Oex5kkjmADyc9e4rcNkhXsBV7IMPe-vVXj5gq2i7KOclqr_q5UdVijJe4f1dX9MWaejgSPdFkDlskEKrlG7djOYGMDFFUEg=s2048||height="312" width="518"]]

Difference in means for the custom questions

= 4. Discussion =

From the above results, we do observe a difference between interactive and non-interactive storytelling. Overall, the interactive storytelling robot was perceived as more likeable, kind and responsive, which could make a difference for patients with dementia. It was rated significantly more interactive than the non-interactive storytelling robot, and it is designed to promote conversations between the patient and the family, which is beneficial for the patient. Though it was not rated significantly more intelligent than the robot in the control situation, the robot with interactive storytelling was perceived as more pleasant and natural. As mentioned above, meal completion and the time taken to complete the meal could not be judged due to the limited scope of the experiment. Overall, these results answer the research questions in an encouraging direction.

However, the statistical testing does not account for the fact that multiple questions were tested, and the significance threshold might need to be adjusted to factor this in. In other words, we are testing multiple features on the same data, which increases the probability that at least some features appear significant even if they are not. Hence, ideally we should divide the significance threshold by 27 (once for each of the 27 questions), i.e. apply a Bonferroni correction, for a more conservative result. Another limitation is that we could not experiment with actual patients: our participants were students who took the place of these patients, which could have affected the final results. Finally, interactive storytelling requires story templates from the patient's life for a biographical story experience, which could raise privacy concerns. It also requires an investment of time and work from the family, as the patient with dementia might not be able to recall the stories of their past.

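To make the effect of dividing the threshold by 27 concrete, here is a small sketch that applies it to the p-values reported in the results table. The variable names are illustrative, and this is only one simple way to correct for multiple comparisons.

```python
# p-values for questions 1 to 27, copied from the results table.
p_values = {
    1: 0.04113, 2: 0.20205, 3: 0.14581, 4: 0.09371, 5: 0.15096,
    6: 0.5, 7: 0.30576, 8: 0.40923, 9: 0.26641, 10: 0.00495,
    11: 0.00759, 12: 0.00143, 13: 0.10392, 14: 0.02226, 15: 0.00845,
    16: 0.00329, 17: 0.13154, 18: 0.01425, 19: 0.37610, 20: 0.06823,
    21: 0.16778, 22: 0.05548, 23: 0.04806, 24: 0.10392, 25: 0.00055,
    26: 0.00037, 27: 0.00016,
}

alpha = 0.05
corrected_alpha = alpha / len(p_values)  # threshold divided by the number of questions

significant = sorted(q for q, p in p_values.items() if p < alpha)
significant_corrected = sorted(q for q, p in p_values.items() if p < corrected_alpha)

print(f"Corrected threshold: {corrected_alpha:.5f}")
print("Significant at 0.05:", significant)
print("Significant after correction:", significant_corrected)
```

With the corrected threshold of about 0.00185, only question 12 (Dislike/Like) and the three custom questions (25, 26, 27) remain below the cut-off, compared with 12 questions at the uncorrected 0.05 level. This correction is known to be conservative, so it gives a lower bound on what can be claimed rather than a final verdict.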
All in all, factoring in the limitations, we see promising results in this research towards the well-being of patients with dementia. More relevant experimentation would give more accurate results, and formative assessment at each stage could steer the research in the direction that is most beneficial for the patients. In the future, more extensive research must be performed with patients, using stories curated to their experiences. For the most reliable results, the experiment should be organic, with the patient's family members involved, and conducted in the environment where the robot will actually be deployed.

= 5. Conclusions =

In this report, we presented research on how an intervention with robots can help patients with dementia. Stories about a patient's life and past can help promote memory retention, and talking about them with the people around the patient can help with mood management and with connecting to others. Building on this, we built a robot for interactive biographical storytelling, i.e. a robot that uses the patient's past experiences to spark conversations and interactions with family members. We conducted experiments using a non-interactive storytelling robot as the control situation and observed hopeful results towards the well-being of patients with dementia. These experiments have their limitations (as mentioned above) and can be improved in future work, but they give an optimistic start to future research in the field.