b. Test

Last modified by Manali Shah on 2023/04/11 18:38

1. Introduction

We aim to measure the effectiveness of a robot with interactive storytelling, which provides personalization and opportunities for interaction and activities with the patient's family members. The control situation was a storytelling robot that narrates the story without any interaction or personalization. We evaluate the claims made earlier using a modified Godspeed questionnaire. The following questions were added:

1. The mood of the patient after the meal with storytelling.

2. The feedback of the patient for the story.

3. The enjoyment level of the patient (given by caregiver)

4. Was the meal completed? (given by caregiver)

5. Time taken to complete the meal: Too much time could mean the patient did not enjoy the meal, or that they were too engaged and hence it took longer. A critical analysis is needed to evaluate this measure.

The negative effects of Pepper were also measured in the questionnaire (covered in Godspeed):

1. Pepper was annoying.

2. Pepper was not human.

3. Pepper was disturbing.

2. Method

2.1 Participants

A total of 14 TU Delft students participated in the study: 9 male and 5 female. Due to limited time, the target users (patients with dementia) could not be included in the study, so all participants were students.

2.2 Experimental design

A within-subject experimental design was chosen because of the limited number of participants, so each participant interacted with the robot twice. The group was split in half: the first half interacted with the non-interactive storytelling robot first (control situation) and then with the interactive storytelling robot. For the second half of the group, this order was reversed. This counterbalancing was done to offset any carryover bias from interacting with one version of the robot before the other.
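The counterbalanced assignment described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `assign_orders`, the condition labels, and the random shuffle with a fixed seed are all assumptions, since the text only states that the group was split in half.

```python
import random

def assign_orders(participant_ids, seed=0):
    """Split participants into two halves with opposite condition orders.

    Shuffling before the split is an assumption; the report only says
    that the group was divided in half.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        # First half: control situation first, then the experimental one.
        orders[pid] = ("non_interactive", "interactive")
    for pid in ids[half:]:
        # Second half: reversed order.
        orders[pid] = ("interactive", "non_interactive")
    return orders

# 14 participants, numbered 1..14 (hypothetical IDs).
orders = assign_orders(range(1, 15))
control_first = [pid for pid, order in orders.items() if order[0] == "non_interactive"]
```

With 14 participants this yields 7 people per order, so each condition appears first and second equally often.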

2.3 Tasks

The Pepper robot was powered on and connected to the laptop before the participants came in. Each participant had to sign the consent form, interact with the robot twice (once per version) and fill in the questionnaire after each interaction.

2.4 Measures

The experiment measured the differences between the non-interactive storytelling robot (control situation) and the interactive storytelling robot (experimental situation). After each interaction, the participant filled in a questionnaire about their experience, with questions answered on a scale of 1 to 5. The link to the questionnaire is here.

The answers to both questionnaires were recorded, and a p-value was calculated for each question to assess the significance of the differences (if any).
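As the Results section states, the test used is a one-tailed paired-sample t-test. The sketch below shows how such a p-value can be computed for a single question using only the Python standard library. The ratings are hypothetical placeholder values, not the study data, and the helper names (`t_pdf`, `upper_tail_p`, `paired_t_one_tailed`) are assumptions made for this illustration. The tail probability is obtained by numerically integrating the Student's t density.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

def paired_t_one_tailed(interactive, control):
    """One-tailed paired t-test; H1: interactive ratings are higher."""
    diffs = [a - b for a, b in zip(interactive, control)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, upper_tail_p(t, n - 1)

# Hypothetical 1-to-5 ratings from 14 participants for one question.
interactive = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 4, 4, 3]
control     = [3, 4, 4, 3, 4, 3, 4, 4, 3, 3, 4, 4, 3, 3]
t_stat, p_value = paired_t_one_tailed(interactive, control)
```

The test is paired because each participant rated both versions of the robot, so the analysis works on per-participant differences rather than on two independent samples.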

2.5 Procedure

For the experiment, the following steps were performed for each participant:

1. Welcome the participant, and explain their tasks.

2. Ask them to sign the consent form, which makes them aware of the data being collected.

3. Power on the first version of the robot, depending on which group the participant is in.

4. Let them interact with the first version of the robot.

5. Once completed, give them a questionnaire in which to document their experience.

6. Power on the second version of the robot they have to interact with, and let them interact with it.

7. Once completed, give them the same questionnaire to fill in their experience.

8. The experiment is now completed and they can leave.

The non-interactive and interactive stories were hard-coded for now. They were personalized based on a made-up scenario; in a real deployment, the stories must be adapted to the patient's own experiences. The same story was used for both situations. The only difference was that the non-interactive robot narrated the story without any scope for interaction between family members, whereas the interactive robot used the same story to promote engagement with the family.

2.6 Material

The following materials were used during the experiments:

1. Laptops for consent forms, questionnaires and the code.

2. The Pepper robot.

3. Results

With the answers from both questionnaires (i.e., the ones in the control and experimental situations), the following means were observed for the questions with ratings on a 1 to 5 scale:

1681231071484-260.png

The x-axis denotes the question number from the questionnaire (these questions can also be seen in the table below) and the y-axis denotes the mean rating of that question calculated over all participants. Interactive storytelling has a higher mean than non-interactive storytelling for most questions, which indicates a more positive response from the participants towards the robot with interactive storytelling. This suggests that the interactive storytelling robot might elicit a more positive response from patients with dementia.

To determine whether these differences were significant, we performed a one-tailed paired-sample t-test. The table below summarizes the p-values obtained for all the questions (robot characteristics).

Qs Num | Robot Characteristics (Questions from Questionnaire) | p-value
1 | 1 - Fake, 5 - Natural | 0.04113 *
2 | 1 - Machine-like, 5 - Human-like | 0.20205
3 | 1 - Unconscious, 5 - Conscious | 0.14581
4 | 1 - Artificial, 5 - Lifelike | 0.09371
5 | 1 - Moving rigidly, 5 - Moving elegantly | 0.15096
6 | 1 - Dead, 5 - Alive | 0.5
7 | 1 - Stagnant, 5 - Lively | 0.30576
8 | 1 - Mechanical, 5 - Organic | 0.40923
9 | 1 - Artificial, 5 - Lifelike | 0.26641
10 | 1 - Inert, 5 - Interactive | 0.00495 *
11 | 1 - Apathetic, 5 - Responsive | 0.00759 *
12 | 1 - Dislike, 5 - Like | 0.00143 *
13 | 1 - Unfriendly, 5 - Friendly | 0.10392
14 | 1 - Unkind, 5 - Kind | 0.02226 *
15 | 1 - Unpleasant, 5 - Pleasant | 0.00845 *
16 | 1 - Awful, 5 - Nice | 0.00329 *
17 | 1 - Ignorant, 5 - Knowledgeable | 0.13154
18 | 1 - Irresponsible, 5 - Responsible | 0.01425 *
19 | 1 - Unintelligent, 5 - Intelligent | 0.37610
20 | 1 - Foolish, 5 - Sensible | 0.06823
21 | 1 - Anxious, 5 - Relaxed | 0.16778
22 | 1 - Agitated, 5 - Calm | 0.05548
23 | 1 - Quiescent, 5 - Surprised | 0.04806 *
24 | 1 - Demotivated, 5 - Motivated | 0.10392
25 | Mood of the patient after the activity | 0.00055 *
26 | Patient's feedback about the story experience | 0.00037 *
27 | Patient's enjoyment | 0.00016 *
28 | Did the patient complete the activity? | NA
29 | How many minutes did the patient take to complete the activity? | NA

* p-value below 0.05

With the significance level chosen to be 0.05, the questions with a p-value below this threshold have been highlighted in the table. We observe that 12 of the 27 tested questions are statistically significant, including all 3 custom questions (Qs 25, 26, 27). We could not determine results for the last two questions (i.e., whether the patient completed the activity, and the time taken to complete it) because of the limited scope of the experiment. The participants were neither dementia patients nor actually eating, so it would not be meaningful to base any results on these questions.

The following graphs show the difference in means for the statistically significant questions:

https://lh6.googleusercontent.com/nTIJLrcxdbvmHRNU7AqEsBcRRH_29l73c4o1dbeVDAwvC9sKoupoUEHvoCqf3_LIAD2Uk51kRURT6jj5sMxTDnexvBHbWbyy9e-wSLmIyvowqEcNmeacWsCdbSXjGg0vVwtcPRg8hF_cSrtetKObNmITTg=s2048

Difference in means for Questions in Godspeed questionnaire

https://lh4.googleusercontent.com/I3voGI5LPNJH_jm41KqgOFHY1PixCJRBbZJosr0KlcuLrTlA7dC8dod_rU0Oex5kkjmADyc9e4rcNkhXsBV7IMPe-vVXj5gq2i7KOclqr_q5UdVijJe4f1dX9MWaejgSPdFkDlskEKrlG7djOYGMDFFUEg=s2048

Difference in means for custom questions

4. Discussion

The results above show a difference between interactive and non-interactive storytelling. The interactive storytelling robot was rated significantly more interactive (Q10), responsive (Q11), likeable (Q12), kind (Q14) and pleasant (Q15) than the robot in the control situation, which could make a difference for patients with dementia. It is also designed to promote conversations between the patient and the family, which is beneficial for the patient. It was not rated significantly more intelligent (Q19), friendly (Q13) or human-like (Q2) than the robot in the control situation, although it was rated significantly more natural (Q1). As mentioned above, meal completion and time taken to complete the meal could not be judged due to the limited scope of the experiment. Overall, the results point in an encouraging direction for the research questions, without answering them conclusively.
 

However, the statistical testing does not account for the fact that there are multiple questions, and the significance threshold might need to be adjusted to factor this in. In other words, we are testing multiple features on the same data, which increases the probability that at least some features appear significant by chance. Hence, ideally we must divide the threshold p-value by 27 (for 27 questions) for a more conservative result. Another limitation is that we could not experiment with actual patients: our participants were students who took the place of these patients. This could have affected the final results. Finally, interactive storytelling requires story templates from the patient's life for a biographical story experience, which could raise privacy concerns. It also needs an investment of time and work from the family, as the patient with dementia might not be able to recall the stories of their past.
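Dividing the threshold by the number of tests, as suggested above, is known as the Bonferroni correction. The following sketch applies it to the p-values reported in the Results table. The variable names are chosen for this illustration, and the code is only a re-check of the published p-values, not part of the original analysis.

```python
# p-values for Qs 1-27, copied from the Results table.
p_values = {
    1: 0.04113, 2: 0.20205, 3: 0.14581, 4: 0.09371, 5: 0.15096,
    6: 0.5, 7: 0.30576, 8: 0.40923, 9: 0.26641, 10: 0.00495,
    11: 0.00759, 12: 0.00143, 13: 0.10392, 14: 0.02226, 15: 0.00845,
    16: 0.00329, 17: 0.13154, 18: 0.01425, 19: 0.37610, 20: 0.06823,
    21: 0.16778, 22: 0.05548, 23: 0.04806, 24: 0.10392, 25: 0.00055,
    26: 0.00037, 27: 0.00016,
}

alpha = 0.05

# Uncorrected: every question is compared against alpha directly.
uncorrected = sorted(q for q, p in p_values.items() if p < alpha)

# Bonferroni: compare against alpha divided by the number of tests.
bonferroni_threshold = alpha / len(p_values)
corrected = sorted(q for q, p in p_values.items() if p < bonferroni_threshold)

print(f"Uncorrected (p < {alpha}): {len(uncorrected)} questions")
print(f"Bonferroni (p < {bonferroni_threshold:.5f}): questions {corrected}")
```

With the corrected threshold of 0.05 / 27 (about 0.00185), only Q12 (Dislike/Like) and the three custom questions (Qs 25, 26, 27) remain significant, compared with 12 questions without correction. Bonferroni is a conservative choice; less strict procedures exist but are not considered here.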

All in all, factoring in the limitations, we do see promising results in this research towards the well-being of patients with dementia. Experiments with the actual target group would give more accurate results, and formative assessment at each stage could steer the research in the direction that is most beneficial for the patients. In the future, more extensive research must be performed with patients, using stories curated to their experiences. For the most reliable results, the experiment should be organic, with the patient's family members involved, and carried out in the environment where the robot will actually be deployed.

5. Conclusions

In this report, we presented research on how an intervention with robots can help patients with dementia. Stories about a patient's life and past can help promote memory retention, and talking about them with the people around the patient can help with mood management and with connecting to others. Building on this, we built a robot for interactive biographical storytelling, i.e., a robot which uses the patient's past experiences to spark conversations and interactions with family members. We conducted experiments using a non-interactive storytelling robot as the control situation, and observed hopeful results towards the well-being of patients with dementia. These experiments have their limitations (as mentioned above) and can be improved in future work, but they give an optimistic start to future research in the field.