Changes for page b. Test

Last modified by Jean-Paul Smit on 2024/04/09 15:23

From version 23.2

edited by Jean-Paul Smit
on 2024/04/07 00:03

Change comment: There is no comment for this version

To version 25.1

edited by Jean-Paul Smit
on 2024/04/07 00:14

Change comment: There is no comment for this version

Summary

Page properties (1 modified, 0 added, 0 removed)

Details

Page properties

Content

@@ -9,7 +9,7 @@
  === 2. Method ===
--[[Figure: procedure of the controlled experiment>>image:Procedure.png]]
++[[Figure 2: procedure of the controlled experiment>>image:Procedure.png]]
  ==== 2.1 Participants ====
@@ -18,13 +18,13 @@
  ==== 2.2 Materials ====
--Materials for the test group include the NAO robot, an OpenAI chatbot running on a mobile phone, and a laptop to control the human-like movements of NAO, as described in the [[prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]]. The control group requires written out text prompts instead.
++Materials for the test group include the NAO robot, an OpenAI chatbot running on a mobile phone, and a laptop to control the human-like movements of NAO, as described in the [[prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]]. The control group requires written-out text prompts instead.
  Both groups require a set of fabricated memories, memory quizzes, writing materials, and evaluation forms for feedback.
  ==== 2.3 Measures ====
--Memory retrieval success will be measured using a [[quiz>>doc:.Quiz.WebHome]] based on the fabricated memories. Participant feedback and observational notes will also be collected to assess the user experience and interaction effectiveness. We will evaluate the recall and memory-efficacy of both groups and try to relate the given answers to values** **such as autonomy and relatedness.
++Memory retrieval success will be measured using a [[quiz>>doc:.Quiz.WebHome]] based on the fabricated memories. Participant feedback and observational notes will also be collected to assess the user experience and interaction effectiveness. We will evaluate the recall and memory efficacy of both groups and try to relate the given answers to values such as autonomy and relatedness.
  ==== 2.4 Procedure ====
@@ -65,35 +65,47 @@
  === 3. Results ===
--Twenty University students participated in the controlled experiment (M= 45%, F=55%). After gaining consent through the consent form, all participants were briefed to act as if experiencing the interaction from the viewpoint of a PwD. From the sample, half experienced
++Twenty university students participated in the controlled experiment (M = 45%, F = 55%). After giving consent through the consent form, all participants were briefed to act as if experiencing the interaction from the viewpoint of a PwD. Half of the sample were asked to interact with the NAO, and the other half used only a voice-based version of the encyclopedia prototype.
--[[Pie chart showing the gender distribution of our sample. / Boxplot chart showing the familiarity with robots and the attitude towards technology.>>image:gender & familiarity.png]]
++The main finding of the experiment concerns Relatedness, as can be seen in Table 1: the increase in affect in both the arousal and valence dimensions was higher for the test group than for the control group. The control group's mean arousal and valence values show no significant change.
--Most resulting p values had an alpha above 0.05 and thus were not significant enough.
++[[Figure 3: Pie chart showing the gender distribution of our sample. ~| Figure 4: Boxplot chart showing the familiarity with robots and the attitude towards technology.>>image:gender & familiarity.png]]
--[[Table showing the Affect variables>>image:Table of means.png||height="214" width="737"]]
++
++Competence. Using the NAO had no significant effect on competence (p > 0.05).
++
++Covariates. The covariates had no significant influence on the recall score (p > 0.05).
++
++
++For the other analyses, most resulting p values were above the significance level (alpha = 0.05) and were thus not significant.
++
++[[Table 1: showing the Affect variables>>image:Table of means.png||height="214" width="737"]]
++
  === 4. Discussion ===
++The NAO may positively affect participants' affective state. The other results, however, were not significant enough to support a valid conclusion. This section elaborates on a few possible reasons for that.
++
++
  **Sample Group and Size impact validity of the study**
--The ecological validity of the study is impacted by the fact that there were no PwD in our sample.The scope of the experiment was limited to TU Delft University students. That means that future research may benefit from a closer approach to an experiment which is closer to the experience of PwD. Moreover, the controlled experiment was restricted to a cohort of 20 participants, underscoring the potential for enhancing result validity through the utilization of a larger sample size
++The ecological validity of the study is impacted by the fact that there were no PwD in our sample. The scope of the experiment was limited to TU Delft University students. Future research may therefore benefit from an experimental setup that is closer to the experience of PwD. Moreover, the controlled experiment was restricted to a cohort of 20 participants, so a larger sample size could strengthen the validity of the results.
  **//Participants didn't know about the points system, so they didn't answer with "getting the most points" in mind//**
--//Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participant, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
++//Ambiguities in the evaluation briefing have led to several aspects of the results that might misrepresent the participants' gathered knowledge. Points were awarded for certain key descriptors, such as each family member's role, occupation, likes, and dislikes. This wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they had heard and remembered them.//
  //**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//
--//The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate about certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hiearchy of importance to the participant. //
++//The choice to create a sprawling, multi-faceted database also had the side effect that participants found out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database who act as ancillary characters and lend it a sense of realism, but participants can easily get stuck on learning about them, as no hierarchy of importance is implied to the participant.//
--**//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database, and don't know where to focus//**
++**//The GPT Assistant can elaborate on any question, so the user does not know what belongs to the database or where to focus//**
--//Another limiation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
++//Another limitation of the evaluation lies in the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
@@ -102,5 +102,3 @@
  ===== //**Ethical Considerations**// =====
  We commit to high ethical standards, respecting the sensitive nature of simulating dementia conditions and ensuring the well-being and dignity of all participants throughout the study. To ensure this, we present a form containing the ethical considerations to each participant.
--
--
