b. Test

Last modified by Clemente van der Aa on 2023/04/08 17:42

Introduction

The purpose of this study is to evaluate the effectiveness of a socially intelligent dog-robot, Dogg0, in providing companionship and reducing stress levels for people with dementia (PwD). The study aims to test the hypothesis that the interactions with the robot will improve the mood of the PwD and enhance their trust in the robot. To achieve this, we will measure trustworthiness, the effect on the mood of the PwD, and the functionalities of the robot. These aspects will be assessed using a questionnaire filled out by participants immediately after the experiment.

Method:

The prototype was evaluated through an in-person experiment involving multiple participants. Since we could not conduct the experiment with real PwD, fellow students taking the course and others were recruited as participants. All data collected will be anonymized to maintain confidentiality.

Experimental Design:

We used a within-subject design in which all participants interacted with the robot. 

Tasks:

Participants were instructed to interact with Dogg0 without prior knowledge of all its functionalities. They were free to engage with the robot as they wished.

Measures:

A trust score, as described in Gulati et al. (2019) (Design, development and evaluation of a human-computer trust scale), and the effect on the mood of the participant were measured using a questionnaire. The questionnaire consisted of sub-questions related to these aspects and used a 1-5 Likert scale to capture the participants' level of agreement and their feelings towards these aspects.

Trust Score:

Screenshot.png

Figure 1: Factors for Human Robot Trust

According to Gulati et al. (2019), the trust people have in robots consists of four factors:

1) The Perceived Risk of the Robot: This indicates how cautious people feel they have to be around the robot, or how risky they feel it is to interact with the robot. Inverted, this score shows how much people trust the robot.

2) The Benevolence of the Robot: This score shows how much people think a robot will act in their best interests.

3) The Competence of the Robot: This shows how well people think the robot is fit for its job.

4) The Reciprocity of the Robot: The Reciprocity score indicates how much people feel a connection with the robot.
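The way the four factors could be combined into a single trust score can be sketched as follows. This is a minimal sketch, not the scoring procedure from Gulati et al. (2019): it assumes that the risk score is inverted on the 1-5 scale as `6 - score` and that the overall trust score is the unweighted mean of the four factor scores. The function names and the example values are hypothetical.

```python
from statistics import mean

LIKERT_MIN, LIKERT_MAX = 1, 5


def invert(score: float) -> float:
    """Reverse a Likert score so that high perceived risk maps to low trust."""
    return LIKERT_MIN + LIKERT_MAX - score


def trust_score(risk: float, benevolence: float,
                competence: float, reciprocity: float) -> float:
    """Unweighted mean of the four factors, with risk inverted (assumed aggregation)."""
    return mean([invert(risk), benevolence, competence, reciprocity])


# Hypothetical factor scores for one participant, not data from this study.
print(trust_score(risk=2.0, benevolence=3.0, competence=4.0, reciprocity=3.0))  # 3.5
```

With this inversion, a participant who strongly agrees that the robot is risky (5) contributes the lowest possible value (1) to the trust score.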

Mood Score:

Our Mood Score is derived from the Oxford Happiness Questionnaire (Hills et al. (2002), The Oxford Happiness Questionnaire: a compact scale for the measurement of psychological well-being). The Oxford Happiness Questionnaire correlates with personality variables like satisfaction with life, self-esteem and happiness. This score can be used to measure the effect of the interaction with Dogg0 on people's happiness.

Procedure:

The procedure was conducted as follows:

  1. Participants were welcomed and informed about the purpose of the study.
  2. Participants signed a consent form to indicate their willingness to participate and allow researchers to analyze the data gathered from the experiment.
  3. Participants interacted with the robot.
  4. Participants completed a questionnaire that assessed their mood and their trust in Dogg0.

Materials:

Two main materials were used in this study. First, a consent form was used to ensure that participants were willing to participate and that their privacy was protected. Second, participants were exposed to the Miro robot, which performed a pre-programmed routine. The robot was programmed using MiroCloud and had the same behavior for every participant.

3. Results

The experiment was conducted with 10 participants. It yielded the following results:

Picture16.png

Figure 2: Trust Assessment of Dogg0

Picture15.png

Figure 3: Average Trust in Dogg0

Figure 2 shows the trust participants had in Dogg0. The height of each bar denotes the mean Likert score across participants, and the error bars show the standard deviation of the score. Participants viewed Dogg0 as above average in competence (3.4) and perceived it as not risky (3.6). Reciprocity and Benevolence both scored about 3 on the Likert scale, which means that participants neither agreed nor disagreed that Dogg0 was benevolent or reciprocal. Overall (Figure 3), participants trusted Dogg0 slightly (3.2).
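The bar heights and error bars described here could be computed along the following lines. This is a minimal sketch under two assumptions: that each factor has one score per participant, and that the sample standard deviation is used (the text does not say whether sample or population standard deviation was plotted). The responses below are hypothetical and are not the data collected in this study.

```python
from statistics import mean, stdev


def summarize(responses_by_factor):
    """Return {factor: (mean, sample standard deviation)} for 1-5 Likert scores."""
    return {
        factor: (mean(scores), stdev(scores))
        for factor, scores in responses_by_factor.items()
    }


# Hypothetical scores from five participants, NOT the data from this study.
hypothetical = {
    "competence": [3, 4, 3, 4, 3],
    "benevolence": [3, 3, 2, 4, 3],
}

for factor, (m, sd) in summarize(hypothetical).items():
    print(f"{factor}: mean={m:.2f}, sd={sd:.2f}")
```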

The figures below show the individual responses per trust factor. Note that Risk Perception is not yet inverted here, to better reflect the questionnaire. Interestingly, in the competency assessment participants overall did not think that the robot had all the functions they expected from a social companion robot. This might be due to the limitations of programming in MiroCode, but could also point to a more structural problem with our design. On the other hand, participants were very positive overall that the robot could keep them good company: statements like "I think that the robot is effective in keeping me company" and "I can always rely on Dogg0 for keeping me company" were rated very highly. Participants did feel that they had to be slightly cautious around Dogg0 (Figure 4), which again might be due to the limitations of MiroCode.

Picture14.png

Figure 4: Risk Perception of Dogg0

Picture20.png

Figure 5: Benevolence assessment

Picture17.png

Figure 6: Competency assessment

Picture12.png

Figure 7: Reciprocity assessment

Picture19.png

Figure 8: Mood assessment of Dogg0

The results of the mood assessment after using Dogg0 are shown above (Figure 8). On average, participants reported a slightly positive mood after their interaction with Dogg0. Some participants reported a slightly negative mood after their interaction (2.4), and some participants really enjoyed their interaction (4.8). The individual questions are shown below with their respective scores (Figure 9). Notably, most participants on average did not feel that they had accomplished something or that they were really stimulated to be active. On average, participants did feel slightly less safe with Dogg0 (2.9), possibly for the same reason that they felt they had to be more cautious around Dogg0 in the trust assessment. On the positive side, participants found the interaction amusing and enjoyable, and it made them happy. On average, they also felt like doing it again.

Picture22.png

Figure 9: Mood assessment of Dogg0 per question

Calculating the Pearson Product-Moment Correlation Coefficient for the mood scores and trust scores per participant gives us a correlation of 0.82, meaning both scores are strongly correlated. This indicates that if a participant enjoyed their interaction with Dogg0, they also trusted Dogg0 more or vice versa.
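For reference, the correlation coefficient mentioned above can be computed as in the sketch below. It implements the standard Pearson formula directly. The per-participant scores are hypothetical placeholders and do not reproduce the 0.82 reported here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations, and the root sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical per-participant scores, NOT the data from this study.
trust = [2.8, 3.1, 3.5, 3.0, 3.9]
mood = [2.6, 3.2, 3.8, 3.1, 4.4]
print(round(pearson_r(trust, mood), 2))
```

Note that a correlation computed this way only describes how the two scores move together. It does not indicate which of the two influences the other.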

4. Discussion

Small Test Population

Dogg0's trust score is grounded in the literature and indicates how much participants trusted Dogg0. It also gives us insight into the factors that reduced the trust in Dogg0. However, our test population of 10 is far too small to draw reliable conclusions from the experiment. If we were to repeat the experiment in the future, we would want a larger test population in order to obtain significant results.

Mood Score before/after

We used questions inspired by the Oxford Happiness Questionnaire to determine whether interacting with Dogg0 made participants happy. While this gave us some useful insights into how participants valued the interaction with Dogg0, it did not actually tell us anything about their change in mood. It would have been better to assess participants' mood before and after the experiment, to see how Dogg0 changed their mood in the short term. Even better would be to assess their mood again after a longer period, to see whether interacting with Dogg0 had longer-term effects on mood.
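A before/after measurement of this kind could be analysed as in the sketch below. This was not part of the experiment: it assumes one mood score per participant before and after the interaction, and it uses a paired t statistic as one possible way to summarise the change. All values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_change(before, after):
    """Return (mean change, paired t statistic) for per-participant mood scores."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need equally long lists with at least 2 participants")
    diffs = [a - b for b, a in zip(before, after)]
    d_mean, d_sd = mean(diffs), stdev(diffs)
    if d_sd == 0:
        raise ValueError("all changes are identical; t statistic is undefined")
    return d_mean, d_mean / (d_sd / sqrt(len(diffs)))


# Hypothetical mood scores for four participants, NOT collected in this study.
before = [3.0, 2.5, 3.5, 3.0]
after = [3.5, 3.0, 3.5, 4.0]
change, t = paired_change(before, after)
print(f"mean change={change:.2f}, t={t:.2f}")
```

The t statistic would still have to be compared against a t distribution with n - 1 degrees of freedom, and with a small test population it would give only a rough indication.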

Experiment instruction

We started by giving participants only minimal instructions for interacting with Dogg0, as we expected Dogg0 to trigger some intuitive interaction. However, when we noticed that participants seemed confused about what to do in the interaction, we started giving more detailed instructions. Surprisingly, this did not seem to affect the average mood or trust, as can be seen in the figures below.

 

Picture132.png

Figure 10: Trust Score per participant

Picture1323.png

Figure 11: Mood Score per participant

MiroCode

MiroCode was used to program the robot. However, it did not provide a great degree of control, and the robot's movements often looked finicky and unresponsive. This may well have been the reason that participants felt they had to be cautious around Dogg0 and did not think they could depend on it completely. To improve the reactiveness and functionality of Dogg0, ROS could be used instead of MiroCode.

Dementia patients

This experiment was done with fit college students, not with elderly dementia patients. It therefore gives an indication of how enjoyable the interaction with Dogg0 is, but it does not necessarily mean that elderly dementia patients will feel the same about it. A good evaluation of Dogg0 would test it with the actual target audience.

Better Sensors and Actuators

Dogg0 had numerous sensing problems. It would often register its own movement as a clap, touch sensors would activate when certain movements were made, and black or grey floors were perceived as a cliff. Because of these sensing problems, we had to tune down the reactiveness of Dogg0 for the experiment and work around the sensor issues. Overall, this may have made the interaction less enjoyable and natural.

Measuring Intuitiveness 

While conducting the experiment, we switched from giving no instructions to giving more context and explanation about the functions of Dogg0. We saw a difference in how the two groups of participants reacted to Dogg0. This observation was merely anecdotal, and it would have been interesting to measure systematically whether giving prior instructions affects the mood of the participants. This could potentially give some meaningful results on how intuitive Dogg0 is, which is one of the objectives.

5. Conclusions

The average trust in Dogg0 was 3.2, so slightly positive. Participants were neutral about the benevolence and reciprocity of Dogg0, but were positive about its competence and did not perceive it as a risk.

On average, participants also slightly enjoyed their interaction with Dogg0 (3.3). A correlation was found between the trust score and the mood score per participant, which might indicate that participants who trusted Dogg0 more also enjoyed the interaction more, or that participants who enjoyed the interaction more also trusted Dogg0 more.

Overall, this experiment was done with too few participants and was conducted with students. To draw any significant conclusions about Dogg0, the evaluation should be repeated with more participants from the correct target group.