Changes for page b. Test
Last modified by Jean-Paul Smit on 2024/04/09 15:23
From version 35.1
edited by Jean-Paul Smit
on 2024/04/09 15:23
Change comment:
There is no comment for this version
To version 28.1
edited by Pravesha Ramsundersingh
on 2024/04/08 10:56
Change comment:
There is no comment for this version
Summary
-
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
-
- Author
-
... ... @@ -1,1 +1,1 @@
1 -xwiki:XWiki.jeanpaulsmit
1 +xwiki:XWiki.PraveshaRamsundersingh
- Content
-
... ... @@ -74,8 +74,13 @@
74 74
75 75
76 76
77 -The competence of participants, measured with a paired t-test between pre-study and post-study measures found no significant effect on using the NAO (p > 0.05) over the set up with only the conversational agent. Covariates had no significant influence on recall score (p > 0.05). Both student t-tests on the relationship between familiarity and attitude towards robots and conversational agents gave no significant difference between the two group's dependent variables. For other analyses, most resulting p-values had an alpha above 0.05 and thus were not significant enough.
77 +Competence. No significant effect on using the NAO (p > 0.05)
78 78
79 +Covariates. had no significant influence on recall score (p > 0.05)
80 +
81 +
82 +For other analyses, most resulting p values had an alpha above 0.05 and thus were not significant enough.
83 +
79 79 [[Table 1: showing the Affect variables>>image:Table of means.png||height="214" width="737"]]
80 80
81 81 === 4. Discussion ===
... ... @@ -90,19 +90,21 @@
90 90
91 91 **//Participants don't know about points system so they didn't answer with "getting the most points" in mind//**
92 92
93 -Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on. Of course, this wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.
98 +//Ambiguities in the evaluation briefing has led to several aspects in the results that might misrepresent the participants' gathered knowledge. Points were awarded to the participant for certain key descriptors, each family member's role, occupation, likes, dislikes and so on.
Of course, this wasn't known to the participants, so they might have omitted descriptors they deemed less important or trivial and therefore scored worse, even though they'd heard and remembered them.//
94 94
95 95
96 96 //**Participants don't know which people or facts are important, so they can get stuck in spots that are unrewarded**//
97 97
98 -The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hierarchy of importance to the participant.
103 +//The choice to create a sprawling, multi-faceted database also had the side-effect of participants finding out a lot of information that was not rewarded by the grading system in any way. For example, the user can ask the robot to elaborate on certain memories or character traits of family members. There are also people in the database that act as ancillary characters and to create a sense of realism to the database, but participants can likely get stuck on learning about them as there is no implied hierarchy of importance to the participant. //
99 99
100 100
101 101 **//GPT Assistant can elaborate on any question, and therefore the user does not know what belongs to the database and don't know where to focus//**
102 102
103 -Another limitation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.
108 +//Another limitation of the evaluation is with the GPT Assistant's ability to consistently elaborate on any question posed by the participant.//
104 104
105 105
111 +===== =====
112 +
106 106 (% class="wikigeneratedid" id="HEthicalConsiderations" %)
107 107 //**Ethical Considerations**//
108 108
... ... @@ -111,9 +111,9 @@
111 111
112 112 === 5. Conclusions ===
113 113
114 -With regards to our final insights, there were some areas of improvement. For example, relating to the validity of the testing procedure, we had a relatively small sample size and none of the participants had Dementia. This means that the target group for this NAO system was not tested. It was also only a specific group of people, namely students with the age range of 20-25. There was also a level of ambiguity in the point system of the evaluation and with regards to the LLM model focus, we found deviations from the GPT regarding the main objective. However, although these were a few of the limitations throughout the process, we were still able to evaluate the NAO system and could recognise that the NAO may positively affects participant’ s affective state.
121 +===== //**Final Remarks**// =====
115 115
123 +With regards to our final insights, there were some areas of improvement. For example, relating to the validity of the testing procedure, we had a relatively small sample size and none of the participants had dementia. This means that the target group for this NAO system was not tested. It was also only a specific group of people, namely students with the age range of 20-25. There was also a level of ambiguity in the point system of the evaluation and with regards to the LLM model focus, we found deviations from the GPT regarding the main objective. However, although these were a few of the limitations throughout the process, we were still able to evaluate the NAO system and could recognise that the NAO may positively affects participant’ s affective state.
116 116
117 -===== **//Avenues for Future Work//** =====
118 118
119 -As mentioned, there were some limitations, which could be taken into consideration if wishing to continue work on the Personal Encyclopaedia. For future work, testing can be done on the correct target group, the GPT could be optimised to avoid deviations, and more focus could be placed on our last stage (Extension Phase) to see if additional functionalities are needed!
126 +===== **//Potential Future Work//** =====