Changes for page b. Test

Last modified by Jean-Paul Smit on 2024/04/09 15:23

From version 30.1

edited by Pravesha Ramsundersingh
on 2024/04/08 11:02

Change comment: There is no comment for this version

To version 34.1

edited by Jean-Paul Smit
on 2024/04/09 00:32

Change comment: There is no comment for this version

Summary

Page properties (2 modified, 0 added, 0 removed)

Details

Page properties

Author

@@ -1,1 +1,1 @@
--xwiki:XWiki.PraveshaRamsundersingh
++xwiki:XWiki.jeanpaulsmit

Content

@@ -95,17 +95,17 @@
  **//Participants don't know about the points system so they didn't answer with "getting the most points" in mind//**
--//Ambiguities in the evaluation briefing have led to several aspects of the results that might misrepresent the knowledge participants gathered. Points were awarded for certain key descriptors: each family member's role, occupation, likes, dislikes and so on. This wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
++Ambiguities in the evaluation briefing have led to several aspects of the results that might misrepresent the knowledge participants gathered. Points were awarded for certain key descriptors: each family member's role, occupation, likes, dislikes and so on. This wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.
  //**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//
--//The choice to create a sprawling, multi-faceted database also had the side effect that participants found out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database who act as ancillary characters to give the database a sense of realism, but participants can easily get stuck learning about them, as nothing signals to the participant which people are more important. //
++The choice to create a sprawling, multi-faceted database also had the side effect that participants found out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database who act as ancillary characters to give the database a sense of realism, but participants can easily get stuck learning about them, as nothing signals to the participant which people are more important.
  **//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database and does not know where to focus//**
--//Another limitation of the evaluation lies in the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
++Another limitation of the evaluation lies in the GPT Assistant's ability to consistently elaborate on any question posed by the participant.
  (% class="wikigeneratedid" id="HEthicalConsiderations" %)
@@ -116,11 +116,9 @@
  === 5. Conclusions ===
--===== //**Final Remarks**// =====
++With regard to our final insights, there were some areas for improvement. For example, concerning the validity of the testing procedure, we had a relatively small sample size and none of the participants had Dementia. This means that the target group for this NAO system was not tested. The participants were also a specific group of people, namely students aged 20-25. There was also a level of ambiguity in the point system of the evaluation and, with regard to the LLM's focus, we found that the GPT deviated from the main objective. However, despite these limitations, we were still able to evaluate the NAO system and could recognise that the NAO may positively affect participants' affective state.
--With regard to our final insights, there were some areas for improvement. For example, concerning the validity of the testing procedure, we had a relatively small sample size and none of the participants had dementia. This means that the target group for this NAO system was not tested. The participants were also a specific group of people, namely students aged 20-25. There was also a level of ambiguity in the point system of the evaluation and, with regard to the LLM's focus, we found that the GPT deviated from the main objective. However, despite these limitations, we were still able to evaluate the NAO system and could recognise that the NAO may positively affect participants' affective state.
++===== **//Avenues for Future Work//** =====
--===== **//Potential Future Work//** =====
--
  As mentioned, there were some limitations that should be taken into account in any continued work on the Personal Encyclopaedia. In future work, testing could be done with the correct target group, the GPT could be optimised to avoid deviations, and more focus could be placed on our last stage (Extension Phase) to see whether additional functionality is needed.
