b. Test - XWiki

=== 1. Introduction ===

This study aims to evaluate the effectiveness of the NAO robot in aiding memory retrieval, using a between-subjects design with a test group and a control group. We will simulate conditions of early-stage dementia in student participants to test whether interactions with the NAO enhance the retrieval of fabricated memories, which we created to mimic the lost memories typical of individuals with dementia.

Recall the model from [[3. Evaluation Methods>>Main.b\. Human Factors.Measuring Instruments.WebHome]], reproduced in the figure below. The first two concepts, Autonomy and Relatedness and memory self-efficacy, will be tested using validated surveys. Memory recall will be tested with a custom survey tailored to the robot and to the domain of interest: using the NAO as an encyclopedia for recalling familiar people.

[[image:Socio-Cognitive Engineering - Frame 1.jpg]]

=== 2. Method ===

[[Figure 2: procedure of the controlled experiment>>image:Procedure.png]]

==== 2.1 Participants ====

As actual PwDs are not available for this research, the participants will consist of several student volunteers. These will be divided into two groups: those using the Nao robot for memory retrieval (the test group) and those using text-based prompts (the control group).
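
The page does not state how participants were allocated to the two groups. As a minimal sketch, assuming simple random assignment into equally sized groups, the split could look as follows. The function name `assign_groups`, the numeric participant IDs and the seed are hypothetical and only serve the illustration.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into a test (NAO) and a control (text prompt) group.

    Assumption: simple random assignment with equal group sizes; the study page
    does not specify the actual allocation method.
    """
    rng = random.Random(seed)          # seeded generator so the split is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "test": sorted(shuffled[:half]),      # interacts with the NAO robot
        "control": sorted(shuffled[half:]),   # uses text-based prompts
    }


# Hypothetical participant IDs 1..20
groups = assign_groups(range(1, 21), seed=42)
print(groups)
```

Fixing the seed makes the allocation reproducible, so it can be documented alongside the consent forms.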

==== 2.2 Materials ====

Materials for the test group include the NAO robot, an OpenAI chatbot running on a mobile phone, and a laptop to control the human-like movements of NAO, as described in the [[prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]]. The control group requires written-out text prompts instead.

Both groups require a set of fabricated memories, memory quizzes, writing materials, and evaluation forms for feedback.

==== 2.3 Measures ====

Memory retrieval success will be measured using a [[quiz>>doc:.Quiz.WebHome]] based on the fabricated memories. Participant feedback and observational notes will also be collected to assess the user experience and interaction effectiveness. We will evaluate the recall and memory efficacy of both groups and try to relate the given answers to values such as autonomy and relatedness.

==== 2.4 Procedure ====

Participants will receive an overview of their role as a PwD and undergo a session with either Nao or text prompts to retrieve memories. Following the session, they will complete the memory quiz. Feedback will be gathered post-interaction.

//**Phase 1: Preparation**//

1. **NAO Robot Setup**: Confirm that Nao is fully operational and our team is equipped to facilitate the interactions by doing a pre-test/trial run with our prototype.
1. **Recruitment and Consent**: Recruit a diverse group of student volunteers, ensuring informed consent for participation.
1. **Training**: Brief participants on their roles and the concept of simulated memory loss without revealing specific details of the fabricated memories.

//**Phase 2: Baseline Assessment**//

1. **Pre-test**: Optionally, assess the students’ ability to recall general information unrelated to the fabricated memories to establish a baseline.
1. **Documentation**: Record responses to analyze as a baseline for later comparison.

//**Phase 3: Memory Recall**//

1. **Introduction**: Introduce participants in the test group to the NAO robot, explaining its purpose, and provide the control group with their first text-based memory prompt.
1. **Simulation**: Conduct sessions where the test group interacts with Nao aiming to recall a set of fabricated memories.
1. **Analog control group simulation**: Conduct sessions with an analog encyclopedia with equally complex information on relatives.
1. **Memory Quiz**: Following each session, provide a written quiz to both groups based on the memories they were asked to retrieve.
1. **Documentation**: Collect the completed quizzes, which will serve as the primary documentation of memory retrieval.

//**Phase 4: Evaluation**//

1. **Scoring**: Assess the memory quizzes from both groups to evaluate the quantity and accuracy of retrieved memories.
1. **Analysis**: Compare the effectiveness of memory retrieval between the NAO-assisted group and the text-prompt group.
1. **Participant Feedback**: Gather feedback from all participants regarding their experience with the memory retrieval process through a questionnaire.
1. **Observer Notes**: Collect observations from staff regarding the interaction between participants and the Nao robot or text prompts.
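
The comparison in the Analysis step can be sketched as a two-sided permutation test on the difference in mean quiz scores. This is one possible approach under stated assumptions, not necessarily the test used in this study, which the page does not name. The function `permutation_test` is hypothetical, and the scores below are made-up placeholder values, not study data.

```python
import random
from statistics import mean


def permutation_test(test_scores, control_scores, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed mean difference (test - control) and a p-value.
    """
    rng = random.Random(seed)
    observed = mean(test_scores) - mean(control_scores)
    pooled = list(test_scores) + list(control_scores)
    n_test = len(test_scores)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly relabel participants
        diff = mean(pooled[:n_test]) - mean(pooled[n_test:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Placeholder quiz scores for illustration only (NOT the study's data)
nao_scores = [7, 5, 8, 6, 9, 4, 7, 6, 8, 5]
text_scores = [6, 5, 7, 4, 8, 5, 6, 7, 5, 6]

observed, p_value = permutation_test(nao_scores, text_scores)
print(f"mean difference = {observed:.2f}, p = {p_value:.3f}")
```

A permutation test makes no normality assumption, which can be convenient for small groups of about ten participants each.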

//**Phase 5: Feedback and Improvement**//

1. **Participant Reflections**: Review participant feedback for insights into the memory retrieval process and suggestions for improvement.
1. **Staff and Observer Insights**: Analyze observations from staff and observers for additional perspectives on the effectiveness and emotional impact of the interactions.
1. **Improvement Strategies**: Based on the feedback and results, develop a plan to enhance the memory retrieval capabilities of the Nao robot and the overall test design.

=== 3. Results ===

Twenty university students participated in the controlled experiment (45% male, 55% female). After giving consent through the consent form, all participants were briefed to act as if experiencing the interaction from the viewpoint of a PwD. Half of the sample interacted with the NAO; the other half used only a voice-based version of the encyclopedia prototype.

The main finding of the experiment concerns Relatedness, as shown in Table 1: affect increased more in the test group than in the control group on both the Arousal and Valence dimensions. The control group's mean arousal and valence values showed no significant change.

[[Figure 3: Pie chart showing the gender distribution of our sample. ~| Figure 4: Boxplot chart showing the familiarity with robots and the attitude towards technology.>>image:gender & familiarity.png]]

Competence. Using the NAO had no significant effect on competence (p > 0.05).

Covariates. The covariates had no significant influence on recall score (p > 0.05).

For the other analyses, most resulting p-values were above the significance level of 0.05, so those results were not statistically significant.

[[Table 1: showing the Affect variables>>image:Table of means.png||height="214" width="737"]]

=== 4. Discussion ===

The NAO may positively affect participants' affective state. The other results, however, were not significant enough to draw valid conclusions from. This section elaborates on a few possible reasons for that.

**Sample Group and Size impact validity of the study**

The ecological validity of the study is limited by the fact that there were no PwD in our sample: the experiment was restricted to TU Delft students. Future research may therefore benefit from an experimental setup that is closer to the experience of PwD. Moreover, the controlled experiment was restricted to 20 participants; a larger sample size could improve the validity of the results.

**//Participants did not know about the points system, so they did not answer with "getting the most points" in mind//**

//Ambiguities in the evaluation briefing have led to several aspects of the results that might misrepresent the knowledge participants actually gathered. Points were awarded for certain key descriptors: each family member's role, occupation, likes, dislikes and so on. This was not known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they had heard and remembered them.//
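
The effect described above can be illustrated with a small scoring sketch. It assumes one point per key descriptor and simple keyword matching, whereas the actual grading may well have been done by hand. The family member, the keywords and the `score_answer` function are all hypothetical.

```python
def score_answer(answer, rubric):
    """Award one point per rubric descriptor whose keyword appears in the answer.

    Assumption: keyword matching stands in for the (possibly manual) grading.
    """
    text = answer.lower()
    return sum(
        1
        for keywords in rubric.values()
        if any(keyword.lower() in text for keyword in keywords)
    )


# Hypothetical rubric for one fabricated family member
rubric = {
    "role": ["aunt"],
    "occupation": ["teacher"],
    "likes": ["gardening"],
    "dislikes": ["loud music"],
}

full_answer = "She is my aunt, a teacher who loves gardening and hates loud music."
short_answer = "My aunt; she works as a teacher."

print(score_answer(full_answer, rubric))   # all four descriptors mentioned
print(score_answer(short_answer, rubric))  # likes and dislikes omitted
```

A participant who remembered every descriptor but wrote only the short answer would receive half the points, which is the misrepresentation the paragraph above describes.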

//**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//

//The choice to create a sprawling, multi-faceted database also had the side effect that participants found out a lot of information that the grading system did not reward in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. The database also contains ancillary characters, included to give it a sense of realism, but participants can easily get stuck learning about them because no hierarchy of importance is implied to the participant.//

**//The GPT Assistant can elaborate on any question, so the user does not know what belongs to the database or where to focus//**

//Another limitation of the evaluation is the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//

=== 5. Conclusions ===

===== //**Ethical Considerations**// =====

We commit to high ethical standards, respecting the sensitive nature of simulating dementia conditions and ensuring the well-being and dignity of all participants throughout the study. To this end, we present each participant with a form containing the ethical considerations.

===== //**Final Remarks**// =====

As for our final insights, there were some areas for improvement. Regarding the validity of the testing procedure, we had a relatively small sample size and none of the participants had dementia, which means that the target group for this NAO system was not tested. The sample was also limited to a specific group, namely students aged 20-25. There was also some ambiguity in the point system of the evaluation, and, with regard to the focus of the LLM, we found that the GPT deviated from the main objective. Despite these limitations, we were still able to evaluate the NAO system and could recognise that the NAO may positively affect participants' affective state.

===== **//Potential Future Work//** =====
