Wiki source code of b. Test

Last modified by Lucia Serrano Ruber on 2025/04/23 15:41

1 = 1. Introduction =
2
3 People with early-stage dementia (PwD) often struggle with keeping appointments, planning, and organizing. This schedule reminder system, which runs on the Pepper robot, is designed to help with these tasks and, in the long run, to alleviate stress on caregivers. The specific claims to be tested are:
4
5 (((
6 1. Promote independence by clarifying daily activities
7 1. Promote engagement by inviting and scheduling social activities for PwD
8 1. Make the usability of the system with Pepper as good as with a nurse.
9
10 Throughout this section we use "the system" as a generic way to refer to either the nurse or the Pepper robot, the two conditions of our independent variable.
11 )))
12
13 = 2. Method =
14
15
16 == 2.1 Participants ==
17
18 The evaluation had 4 participants, all Engineering Master's students in their 20s. This sample is too small and does not represent the target audience of the prototype, so it does not support any strong claims about the prototype.
19
20 == 2.2 Experimental design ==
21
22 The experiment followed a within-participants design so that we could get the most out of our small number of participants. Every participant performed the experiment twice: once with the nurse and once with Pepper. This design can introduce order effects (for example, a participant remembers the schedule better in the second interaction simply because they have already heard it once). We mitigated this by alternating the order of the systems for each participant, so that half of the participants interacted with Pepper first and the other half interacted with the nurse first.
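
The alternating order described above can be sketched as follows. This is only a minimal illustration, not tooling used in the study; the function name `assign_orders` and the choice of which system the first participant starts with are assumptions.

```python
from typing import List, Tuple

SYSTEMS = ("Pepper", "nurse")


def assign_orders(n_participants: int) -> List[Tuple[str, str]]:
    """Alternate the order of the two systems across participants."""
    orders = []
    for i in range(n_participants):
        # Assumption: even-indexed participants start with Pepper,
        # odd-indexed participants start with the nurse.
        first, second = SYSTEMS if i % 2 == 0 else SYSTEMS[::-1]
        orders.append((first, second))
    return orders


orders = assign_orders(4)
for participant, order in enumerate(orders, start=1):
    print(f"Participant {participant}: {order[0]} first, then {order[1]}")
```

With 4 participants this gives two participants per order, which matches the equal split described above.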
23
24 == 2.3 Tasks ==
25
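26 The action sequence of our experiment is as follows. First, the system greets the PwD with the current date and time. Then it offers them an overview of their schedule for the day. If there is free time in the schedule that overlaps with a scheduled group activity, the system invites them to join it. The steps are described in more detail below.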
27
28 Throughout the experiment the participant is asked to act as if they were Margaret (our PwD, see [[Personas>>doc:Main.sdf.Personas.WebHome]]). When the participant approaches the system, it tells them today's date and time and asks whether they want to see today's schedule. When the participant agrees, the system reminds them of their schedule for that day. The nurse informs the participant of the schedule verbally. Pepper also states the schedule verbally and, in addition, shows it on its tablet.
29
30 If there is an overlap between free time in the schedule and a previously scheduled group activity, the system asks the participant whether they would like to join this activity. The participant is free to accept or decline the suggestion. With the nurse this is done verbally, whereas with Pepper the participant can use either verbal or tactile input (pressing a button on the tablet). If they accept, the system adds the group activity to their schedule. Otherwise the system simply accepts their decision and moves on.
31
32 == 2.4 Measures ==
33
34 In [[2. Specification>>doc:2\. Specification.WebHome]] we stated that our robotic system would "promote independence by minimising confusion about daily schedule". To measure this, the Self-Assessment Manikin (SAM) was used. SAM has 3 scales, each ranging from 1 to 9, which measure valence, arousal, and dominance respectively (Bynion & Feldner).
35
36 Another goal was to ensure that the usability of the Pepper system was not significantly worse than having a nurse do this task. To evaluate this, the System Usability Scale (SUS) was used. This is a set of 10 statements covering different aspects of a system's usability, for each of which the user reports their (dis)agreement on a scale from 1 to 5 (Brooke).
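
For reference, the standard SUS scoring procedure from Brooke (1996) combines the 10 responses into a single score from 0 to 100. The sketch below shows that standard calculation; whether an overall score of this kind was computed in this evaluation is not stated here, and the function name `sus_score` is our own.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Standard SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("Each response must be between 1 and 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution = response - 1
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution = 5 - response
            total += 5 - response
    # The summed contributions (0-40) are scaled to 0-100.
    return total * 2.5


print(sus_score([3] * 10))                        # neutral answers everywhere
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # most favourable answers
```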
37
38 Lastly, we checked how much of the schedule a participant remembered using a simple point system. The participants were asked to retell what they remembered of the schedule, and the researchers awarded points for every event that was correctly recalled: 1 point for the correct event and 1 point for the correct time. For example, imagine the schedule includes a movie at 15:00. If the participant remembers that there is a movie at some point, that counts as 1 point. If they remember that there is some activity at 15:00, that is 1 point. If they remember that there is a movie at 15:00, that is 2 points. The participants are not told whether their recollection of the schedule is correct, and the researchers aim to remain neutral when performing this check.
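
The point system can be sketched in a few lines of code. This is a minimal illustration under one assumption that the description above leaves open: event names and times are scored independently of each other. The names `recall_points`, `Schedule`, and `Recall` are hypothetical, and the scoring in the experiment was done by hand by the researchers.

```python
from typing import List, Optional, Tuple

Schedule = List[Tuple[str, str]]                    # (event, time)
Recall = List[Tuple[Optional[str], Optional[str]]]  # None = not remembered


def recall_points(schedule: Schedule, recalled: Recall) -> int:
    """Award 1 point per correctly recalled event and 1 per correct time."""
    recalled_events = {event for event, _ in recalled if event is not None}
    recalled_times = {time for _, time in recalled if time is not None}

    points = 0
    for event, time in schedule:
        points += int(event in recalled_events)  # correct event: 1 point
        points += int(time in recalled_times)    # correct time: 1 point
    return points


schedule = [("movie", "15:00")]
print(recall_points(schedule, [("movie", None)]))     # remembers the movie only
print(recall_points(schedule, [(None, "15:00")]))     # remembers the time only
print(recall_points(schedule, [("movie", "15:00")]))  # remembers both
```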
39
40 == 2.5 Procedure ==
41
42 Once the participants had been recruited, they were first briefed by the researchers and given a consent form. The consent form stated all the necessary information and instructions for the participant throughout the experiment, minimizing the interaction with the researchers in their researcher role.
43
44 The participant was then told the schedule by the researchers at least 1 hour before the experiment outlined in 2.2 took place. In our case it was possible to do this briefing one day prior to the experiment. This step served to simulate a PwD's "initial state" when interacting with the system, as they would realistically have some notion of the schedule when interacting with either system. The idea was that this would allow for a fairer comparison between the two systems, because memory of the schedule before and after using each system could then be compared.
45
46 The next day, the actual experiment took place. Before interacting with either system, the participants were briefly reminded of their role and the flow of the experiment. The participant was then given a SAM form to indicate how they were feeling at that moment in relation to the schedule they had been given prior to the experiment. Next, the researchers checked the participant's memory of the schedule. This served as a baseline for comparing schedule recall before and after the interactions.
47
48 Following this initial state check, the participant proceeded to the interaction with the first system. For the specific procedure of the interaction see 2.3. After each interaction, the participant was asked to fill in the SAM and SUS forms, after which their memory of the schedule was checked again. For the sake of time, the participants interacted with the second system immediately afterwards, repeating the same procedure. Ideally there would be more time between the interactions, but this was not possible in the given timeframe.
49
50 At the end of the experiment, the participant was able to ask any questions about the experiment and could also give any feedback to the researchers about factors that could have affected their behaviour in the experiment.
51
52 == 2.6 Material ==
53
54 For this experiment, the following materials were necessary:
55
56 * Printed consent forms (1 per participant)
57 * SAM form (3 per participant)
58 * SUS form (2 per participant)
59 * Writing utensil
60 * Laptop with the PowerPoint prototype described in [[3.a. Prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]].
61
62 = 3. Results =
63
64
65 [[image:https://lh7-rt.googleusercontent.com/slidesz/AGV_vUewlqzJzS9rmll_BZ7_mIKv7RVOy7bipQ-rATa258TiL93_0HuR3T5awzdLK0IQUIyK2ftbIG82rKeRiCi3Mi9nfaNfymZpec_g0eZU0eJFtISDHnNnTw6M20f7xLz_18NL-L22WQ=s2048?key=Ih5jbDktAldGWqmL49OwQZQm||height="375px;" width="576px;"]]
66
67 To test whether our prototype promoted independence, we used the SAM. Here the most relevant dimension is the Dominance scale. A negative score indicates a stronger feeling of independence, whereas a positive score indicates a feeling of dependency. The results show that the participants reported similar scores in the Valence and Arousal dimensions for both the nurse and Pepper systems. However, a clear difference is seen in the Dominance dimension: participants generally reported feeling more independent after interacting with Pepper than after interacting with the nurse.
68
69
70 [[image:https://lh7-rt.googleusercontent.com/slidesz/AGV_vUekffrlH2UJjxQ3Lm7Nqi6awsf8Aun7wAsjmltm41l1GVtB9FwAPmmXGz6NwsX8v4UnIr339TWhjqQP9O2u83MNZyoerEOqaDHgYla7uZB6A4jHzd-V6yrSHp-yQNl_usxZfW14=s2048?key=Ih5jbDktAldGWqmL49OwQZQm||height="377px;" width="668px;"]]
71
72 The results for each memory check were averaged and normalized from 0 to 1. These results show that, on average, schedule recall was better after interacting with Pepper than after interacting with the nurse. Based on the feedback received from the participants during the debriefing, this may be partially due to the fact that Pepper shows the schedule on its tablet in addition to saying it aloud.
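
The averaging and normalization step can be illustrated as follows. The exact normalization is not spelled out above, so this sketch assumes that each participant's points are divided by the maximum attainable score (2 points per scheduled event) before averaging. The function name `normalized_mean_recall` and the example numbers are made up for illustration and are not the study's data.

```python
from statistics import mean
from typing import Sequence


def normalized_mean_recall(points: Sequence[int], n_events: int) -> float:
    """Average recall across participants on a 0-1 scale."""
    max_points = 2 * n_events  # assumption: 1 point for the event, 1 for the time
    return mean([p / max_points for p in points])


# Hypothetical points for 4 participants on a schedule with 2 events.
example_points = [2, 4, 3, 3]
print(normalized_mean_recall(example_points, n_events=2))
```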
73
74
75 [[image:https://lh7-rt.googleusercontent.com/slidesz/AGV_vUeZo-EUnwj_qElm1HFcJV8u9zCRsjF7Cnq4IlzyAYm6kUKtcUE5YnX6bdoIz-o2Fs0Qt8FQGHZzUb9r-X5S2n4z83IvfxGIq9G4I9jlwwuvs0hSjrw6-sL9xj_3D3DvEmV83rSn=s2048?key=Ih5jbDktAldGWqmL49OwQZQm||height="282px;" width="673px;"]]
76
77 From the SUS forms we find that, in general, the nurse is an easier "system" to interact with. This is to be expected, as most people are more used to interacting with humans than with robotic systems. However, participants reported a higher preference and a greater feeling of confidence when using the Pepper robot than when using the nurse system.
78
79 = 4. Discussion =
80
81 Overall, the results are satisfactory and seem to support our claims, within the limits of the small sample.
82
83 The results of the SAM scale indicate that participants did feel more independent while using the Pepper robot, without any loss in the Valence or Arousal dimensions compared to interacting with a nurse. After using Pepper, participants also scored better in the memory check. In the feedback received during the debriefing, some participants stated that the multimodal nature of Pepper and its tablet helped a lot with remembering the schedule afterwards. This makes sense because, according to Soni and to Oviatt and Cohen, such audio-visual multimodal output can reduce cognitive load, help convey complex information, and thereby improve retention of the communicated information.
84
85 Lastly, there was no notable decrease in usability when interacting with the Pepper robot compared to the nurse "system". As expected, the Pepper robot scored worse than the nurse on "awkward use of system", "unnecessary complexity", and "ease of use": most people (and certainly all participants) have interacted with other humans before, which makes interacting with a nurse easy, whereas not everyone has interacted with a robot before, so the learning curve is steeper. It is interesting to note, however, that participants seemed to prefer the Pepper robot over the nurse and felt more confident in their interactions with Pepper than with the nurse. This could indicate that the quality of the interaction is not substantially lower with a Pepper robot than with a nurse.
86
87 = 5. Conclusions =
88
89 First, it is necessary to point out that with such a small sample, which also does not represent the target audience of the prototype, it is not possible to draw any firm conclusions from this data. Instead, the results give a rough idea of the effects of our system, from which we could form a hypothesis to test in a future study.
90
91 If these results persisted in a study with significantly more (relevant) participants, they would indicate that participants felt more confident and independent with the Pepper robot and overall preferred interacting with it over the nurse. They would also show that Pepper is not only preferred but also more effective in reminding participants of their schedule, as recall of the schedule was more accurate with Pepper than with the nurse. Lastly, they would indicate that using the Pepper robot does not substantially decrease the quality of the interaction compared to a nurse.
92
93 For a future study with more time and resources, we would use a real Pepper robot in our experiment instead of a PowerPoint prototype, have a trained nurse instead of a researcher acting as a nurse, and include a more diverse group of participants.
94
95
96 **Literature**
97
98 Brooke, J.: SUS: A “Quick and Dirty” Usability Scale. In: Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L. (eds.) Usability Evaluation in Industry, pp. 189–194. Taylor & Francis, London (1996).
99
100 Bynion, T. M., & Feldner, M. T. (2017). Self-assessment manikin. //Encyclopedia of personality and individual differences//, //25//, 1-3.
101
102 Oviatt, S., Cohen, P.R. (2015). Aims and Advantages of Multimodal Interfaces. In: The Paradigm Shift to Multimodality in Contemporary Computer Interfaces. Synthesis Lectures on Human-Centered Informatics. Springer, Cham. [[https:~~/~~/doi.org/10.1007/978-3-031-02213-5_3>>https://doi.org/10.1007/978-3-031-02213-5_3]]
103
104 Soni, Aditya. “13 Advantages and Disadvantages of Visual Communication.” //Clearinfo//, 3 Jan. 2024, clearinfo.in/blog/advantages-and-disadvantages-of-visual-communication/.