Changes for page b. Test

Last modified by Jean-Paul Smit on 2024/04/09 15:23

From version 28.1

edited by Pravesha Ramsundersingh
on 2024/04/08 10:56

Change comment: There is no comment for this version

To version 23.2

edited by Jean-Paul Smit
on 2024/04/07 00:03

Change comment: There is no comment for this version

Summary

Page properties (2 modified, 0 added, 0 removed)
Attachments (0 modified, 0 added, 1 removed)
- Test group data.jpg

Details

Page properties

Author

@@ -1,1 +1,1 @@
--xwiki:XWiki.PraveshaRamsundersingh
++xwiki:XWiki.jeanpaulsmit

Content

@@ -9,7 +9,7 @@
  === 2. Method ===
--[[Figure 2: procedure of the controlled experiment>>image:Procedure.png]]
++[[Figure: procedure of the controlled experiment>>image:Procedure.png]]
  ==== 2.1 Participants ====
@@ -18,13 +18,13 @@
  ==== 2.2 Materials ====
--Materials for the test group include the NAO robot, an OpenAI chatbot running on a mobile phone, and a laptop to control the human-like movements of NAO, as described in the [[prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]]. The control group requires written-out text prompts instead.
++Materials for the test group include the NAO robot, an OpenAI chatbot running on a mobile phone, and a laptop to control the human-like movements of NAO, as described in the [[prototype>>doc:3\. Evaluation.a\. Prototype.WebHome]]. The control group requires written out text prompts instead.
  Both groups require a set of fabricated memories, memory quizzes, writing materials, and evaluation forms for feedback. ** **
  ==== 2.3 Measures ====
--Memory retrieval success will be measured using a [[quiz>>doc:.Quiz.WebHome]] based on the fabricated memories. Participant feedback and observational notes will also be collected to assess the user experience and interaction effectiveness. We will evaluate the recall and memory efficacy of both groups and try to relate the given answers to values** **such as autonomy and relatedness.
++Memory retrieval success will be measured using a [[quiz>>doc:.Quiz.WebHome]] based on the fabricated memories. Participant feedback and observational notes will also be collected to assess the user experience and interaction effectiveness. We will evaluate the recall and memory-efficacy of both groups and try to relate the given answers to values** **such as autonomy and relatedness.
  ==== 2.4 Procedure ====
@@ -65,62 +65,42 @@
  === 3. Results ===
--Twenty University students participated in the controlled experiment (M= 45%, F=55%). After gaining consent through the consent form, all participants were briefed to act as if experiencing the interaction from the viewpoint of a PwD. From the sample, half were asked to interact with the NAO and half were only using a voice-based version of the encyclopedia prototype.
++Twenty University students participated in the controlled experiment (M= 45%, F=55%). After gaining consent through the consent form, all participants were briefed to act as if experiencing the interaction from the viewpoint of a PwD. From the sample, half experienced
--The main finding of the experiment is on Relatedness as can be seen in Table 1: the increase of affect in both Arousal and Valence dimensions during the experiment with the test group was higher than the experiment with the control group. The control group mean values in arousal and valence show no significant change at all.
++[[Pie chart showing the gender distribution of our sample. / Boxplot chart showing the familiarity with robots and the attitude towards technology.>>image:gender & familiarity.png]]
--[[Figure 3: Pie chart showing the gender distribution of our sample. ~| Figure 4: Boxplot chart showing the familiarity with robots and the attitude towards technology.>>image:gender & familiarity.png]]
++Most resulting p values had an alpha above 0.05 and thus were not significant enough.
++[[Table showing the Affect variables>>image:Table of means.png||height="214" width="737"]]
--
--Competence. No significant effect on using the NAO (p > 0.05)
--
--Covariates. had no significant influence on recall score (p > 0.05)
--
--
--For other analyses, most resulting p values had an alpha above 0.05 and thus were not significant enough.
--
--[[Table 1: showing the Affect variables>>image:Table of means.png||height="214" width="737"]]
--
  === 4. Discussion ===
--The NAO may positively affects participant’ s affective state. Yet the other results were not significant enough to make a valid decision about. There are a few possible reasons why that is that will be elaborated upon in this section.
--
--
  **Sample Group and Size impact validity of the study**
--The ecological validity of the study is impacted by the fact that there were no PwD in our sample. The scope of the experiment was limited to TU Delft University students. That means that future research may benefit from a closer approach to an experiment which is closer to the experience of PwD. Moreover, the controlled experiment was restricted to a cohort of 20 participants, underscoring the potential for enhancing result validity through the utilization of a larger sample size
++The ecological validity of the study is impacted by the fact that there were no PwD in our sample.The scope of the experiment was limited to TU Delft University students. That means that future research may benefit from a closer approach to an experiment which is closer to the experience of PwD. Moreover, the controlled experiment was restricted to a cohort of 20 participants, underscoring the potential for enhancing result validity through the utilization of a larger sample size
  **//Participants don't know about points system so they didn't answer with "getting the most points" in mind//**
--//Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
++//Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participant, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
  //**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//
--//The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hierarchy of importance to the participant. //
++//The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate about certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hiearchy of importance to the participant. //
--**//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database and don't know where to focus//**
++**//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database, and don't know where to focus//**
--//Another limitation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
++//Another limiation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
--=====   =====
--(% class="wikigeneratedid" id="HEthicalConsiderations" %)
--//**Ethical Considerations**//
--
--We commit to high ethical standards, respecting the sensitive nature of simulating dementia conditions, and ensuring the well-being and dignity of all participants throughout the study. To ensure this we present a form containing the ethical considerations to each participant.
--
--
  === 5. Conclusions ===
--===== //**Final Remarks**// =====
++===== //**Ethical Considerations**// =====
--With regards to our final insights, there were some areas of improvement. For example, relating to the validity of the testing procedure, we had a relatively small sample size and none of the participants had dementia. This means that the target group for this NAO system was not tested. It was also only a specific group of people, namely students with the age range of 20-25. There was also a level of ambiguity in the point system of the evaluation and with regards to the LLM model focus, we found deviations from the GPT regarding the main objective. However, although these were a few of the limitations throughout the process, we were still able to evaluate the NAO system and could recognise that the NAO may positively affects participant’ s affective state.
++We commit to high ethical standards, respecting the sensitive nature of simulating dementia conditions, and ensuring the well-being and dignity of all participants throughout the study. To ensure this we present a form containing the ethical considerations to each participant.
--
--===== **//Potential Future Work//** =====
++

Test group data.jpg

Author

@@ -1,1 +1,0 @@
--xwiki:XWiki.jeanpaulsmit

Size

@@ -1,1 +1,0 @@
--157.5 KB

Content
