Changes for page b. Test
Last modified by Jean-Paul Smit on 2024/04/09 15:23
From version 30.1
edited by Pravesha Ramsundersingh
on 2024/04/08 11:02
Change comment:
There is no comment for this version
To version 34.1
edited by Jean-Paul Smit
on 2024/04/09 00:32
Change comment:
There is no comment for this version
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
- Author
@@ -1,1 +1,1 @@
-xwiki:XWiki.PraveshaRamsundersingh
+xwiki:XWiki.jeanpaulsmit
- Content
@@ -95,17 +95,17 @@
 
 **//Participants don't know about points system so they didn't answer with "getting the most points" in mind//**
 
-//Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
+Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.
 
 
 //**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//
 
-//The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hierarchy of importance to the participant.//
+The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hierarchy of importance to the participant.
 
 
 **//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database and don't know where to focus//**
 
-//Another limitation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
+Another limitation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.
 
 
 (% class="wikigeneratedid" id="HEthicalConsiderations" %)
@@ -116,11 +116,9 @@
 
 === 5. Conclusions ===
 
-=====//**FinalRemarks**//=====
+With regards to our final insights, there were some areas of improvement. For example, relating to the validity of the testing procedure, we had a relatively small sample size and none of the participants had Dementia. This means that the target group for this NAO system was not tested. It was also only a specific group of people, namely students with the age range of 20-25. There was also a level of ambiguity in the point system of the evaluation and with regards to the LLM model focus, we found deviations from the GPT regarding the main objective. However, although these were a few of the limitations throughout the process, we were still able to evaluate the NAO system and could recognise that the NAO may positively affects participant’ s affective state.
 
-With regards to our final insights, there were some areas of improvement. For example, relating to the validity of the testing procedure, we had a relatively small sample size and none of the participants had dementia. This means that the target group for this NAO system was not tested. It was also only a specific group of people, namely students with the age range of 20-25. There was also a level of ambiguity in the point system of the evaluation and with regards to the LLM model focus, we found deviations from the GPT regarding the main objective. However, although these were a few of the limitations throughout the process, we were still able to evaluate the NAO system and could recognise that the NAO may positively affects participant’ s affective state.
 
+===== **//Avenues for Future Work//** =====
 
-===== **//Potential Future Work//** =====
-
 As mentioned, there were some limitations, which could be taken into consideration if wishing to continue work on the Personal Encyclopaedia. For future work, testing can be done on the correct target group, the GPT could be optimised to avoid deviations, and more focus could be placed on our last stage (Extension Phase) to see if additional functionalities are needed!