Changes for page b. Test
Last modified by Demi Breen on 2023/04/09 15:10
From version 39.1
edited by Maya Elasmar
on 2023/04/03 13:15
Change comment:
There is no comment for this version
To version 31.1
edited by Hugo van Dijk
on 2023/03/30 22:09
Change comment:
There is no comment for this version
Summary
-
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
-
- Author
-
... ... @@ -1,1 +1,1 @@ 1 -XWiki. MayaElasmar1 +XWiki.hjpvandijk - Content
-
... ... @@ -2,10 +2,6 @@ 2 2 3 3 For our research, we are looking into the effect of either using goal-based motivation or emotion-based motivation in promoting PwD for physical activity. Two systems will thus be designed; one motivating using emotion-based explanations and the other using goal-based motivation. The product will motivate the PwD to go for a walk in the park stimulating the amount of physical activity. It has been shown that physical activity, an increase in emotional stability and more goal-based activities can increase the mental and physical health of the PwD. Since 70% of the PwD have a lack of motivation, apathy and lack of interest in activities this project could have a great influence on the lives of these people. 4 4 5 -Thus our research question is: 6 -**What is the effect of goal-based and emotion-based explanations in prompting PwD for physical activity?** 7 - 8 - 9 9 The claims that need to be tested are thus: 10 10 11 11 - The effect of emotion-based motivation; The PwD can comprehend the emotion that is being conveyed and in that way is motivated to contribute to the activity of walking in the garden. ... ... @@ -60,7 +60,7 @@ 60 60 61 61 In an optimal scenario where we can test the robot on PwD. We would have measured the number of times a person went out. We would also have measured the effect of the goal and emotion-based motivation on the long-term over the people. Whether it will be less effective over time or not. We would also measure the emotional effect on the caregivers and the functional effect. By the functional effect, we mean whether they indeed have more time to do other tasks or not. It would also have been perfect if we could measure the effect of the walks on the PwD and their health. 62 62 63 -The questionnaire for the feedback is in the attachment (Questionnaire (2)). The questionnaire is based on a questionnaire in the paper " Measuring acceptance of an assistive social robot: a suggested toolkit " [5]. 
There are also 5 question at the end that we added ourselves, because we think it fits our experiment. 59 +The questionnaire for the feedback is in the attachment (Questionnaire (2)). 64 64 65 65 The questionnaire measures the experiment of the interaction of the students with the robot. By that we mean it measures: 66 66 ... ... @@ -114,14 +114,10 @@ 114 114 115 115 = 3. Results = 116 116 117 -=== Noteworthy answers === 113 +On average, participants only rejected the robot's persuasion attempts 0.5 times. The participants rated the robot a 2/5 in terms of being scary. They gave a 4/5 for it making life more interesting and it being good to make use of the robot. Questions related to the participant's enjoyment and fascination of the system and the robot were met with ratings between 3.8 and 4.1. The question "I think the staff would like me using the robot" was rated a 4/5 on average. 118 118 119 -On average, participants only rejected the robot's persuasion attempts 0.5 times. The participants rated the robot a 2/5 in terms of being scary. They gave a 4/5 for it making life more interesting and it being good to make use of the robot. Questions related to the participant's enjoyment and fascination with the system and the robot were met with ratings between 3.8 and 4.1. The question "I think the staff would like me using the robot" was rated a 4/5 on average. Finally, to the question of whether they would not have gone for a walk if the robot didn't ask them to, the average answer was 3.8/5. All these answers had a standard deviation of less than 1. 115 +Firstly, the Jarque-Bera test [2] was used to check for normality. When the answers for a question weren't normally distributed, the Mann-Whitney U-Test [3] was used. For normally distributed answers, the T-Test [4] was used. These tests used the null hypothesis that there is no significant difference between the two groups. 
When the calculated probability value (p-value) is less than 0.05, we can reject the null hypothesis and conclude that there is a significant difference between the two groups for the answers to that question. 120 120 121 -=== ANOVA === 122 - 123 -Firstly, the Jarque-Bera test [2] was used to check for normality. When the answers to a question weren't normally distributed, the Mann-Whitney U-Test [3] was used. For normally distributed answers, the T-Test [4] was used. These tests used the null hypothesis that there is no significant difference between the two groups. When the calculated probability value (p-value) is less than 0.05, we can reject the null hypothesis and conclude that there is a significant difference between the two groups for the answers to that question. 124 - 125 125 Even though the average rejections were higher for emotion-based (0,875) than for goal-based(0,125). This difference was not significant. 126 126 127 127 Furthermore, there was no significant difference in any of the questionnaire answers between the two groups. ... ... @@ -128,55 +128,26 @@ 128 128 129 129 [[This table>>doc:.p-values.WebHome]] shows the p-value per measure. 130 130 131 -=== Observations === 132 132 133 -General remarks made by participants evaluating the emotion-based system were only about the walking aspect of the robot, stating that the walking distance should be increased and the change in direction was quite sharp. Participants doing the goal-based evaluation commented on the badly performing speech recognition system and stated that it might be useful to start by asking how the participant feels. 134 - 135 135 When asked the reason that convinced the participant to join the robot on a walk, two out of the six participants that said yes eventually in the emotion-based system recited one of the persuasion subjects. For the goal-based system, this was three out of eight. 136 136 137 -When participants were standing too close to the robot, it wouldn't walk. 
This happened numerous times, resulting in conversation without walking. 126 +When participants were standing too close to the robot, it wouldn't walk. This happened in numerous times, resulting in conversation without walking. 138 138 128 +General remarks made by participants evaluating the emotion-based system were only about the walking aspect of the robot, stating that the walking distance should be increased and the change in direction was quite sharp. Participants doing the goal-based evaluation commented on the badly performing speech recognition system and stated that it might be useful to start by asking how the participant feels. 129 + 139 139 Even though it was specified at the start of every session that the participant can say either yes or no to the robot's persuasion attempts, we noticed that some participants did not seem to grasp the fact that they could say no. At the end of their session, one participant stated that he was not persuaded by the robot at all, even though they said yes on the robot's first persuasion attempt. 140 140 \\Another participant, who said no to all persuasion attempts, stated afterwards that they "Just wanted to see what would happen if I said no all the time". This indicated that some participants already had a plan of how many times they would reject the robot before starting, and did not really listen to the persuasions made. 141 141 142 142 As the robot's speech recognition could only understand single words due to its implementation, this resulted in numerous occasions where a participant was not understood and had to repeat themselves. It also occurred that the robot understood 'yes' when 'no' was said. 143 143 144 -- Mention something about only one participant going into Bob's character fully? And that he mentioned that the "no" he was giving was more attention-seeking than a real no. 145 145 146 -- Add that sometimes the robot cut participants off, if they were speaking slower or elaborating on their answer. 
147 147 148 - 149 - 150 150 = 4. Discussion = 151 151 152 -- In terms of research question, no significant differences were found. It could be that this is true in general, but it is very likely that this is influenced by the circumstances surrounding the design and the evaluation. 153 153 154 -- The design is rather limited and with limited capabilities, due to time constraints. Speech recognition didn't always work properly and were not as flexible as desired which makes the interactions less realistic for the participant. 155 - 156 -- There are also other constraints to the interaction, which has to be given as instructions to the participant before testing, such as at what distance to stay from the robot, when to join the robots side, how long to wait to speak after a certain prompt, etc. This further made it unnatural, but was necessary for the system to perform properly. 157 - 158 -- Since participants were also prompted to give shorter answers and try to keep to things like "yes" and "no" it greatly influenced the way participants interacted with the robot. 159 - 160 -- Further, it was (obviously) not possible to test the design with PwD. This was attempted to be resolved by providing a persona description for participants to keep in mind during the testing. Only one participant ended up embodying this character. 161 - 162 -- Results may have been different if participants outside of the course were used, since we are all very familiar with these robots and systems. On one hand it could be positive, since we have all researched dementia and have gained a lot of knowledge within this we could be better at simulating appropriate behavior with the robot or testing the systems in a reasonable way. But since participants also have an idea of how the robot works perhaps some mistakes or issues went undetected which could have appeared with individuals that are not familiar with the robot. 
Of course knowing about dementia is not the same thing as actually suffering from the diagnosis, so many aspects have most likely gone undetected there. 163 - 164 -- Results could also be influences by the sheer amount of participants, which concluded at 8 participants per group (8 for the goal-oriented approach, and 8 for the emotional approach). Perhaps with more participants the results would differ to a greater extent between the two approaches. Due to time constraints it was not possible to include more participants. 165 - 166 -- Further participants who started the interaction with a pre-disposed idea of what they wanted to do, like the participant mentioned above in the results section, definitely influenced the outcome, since this was no longer about listening to the prompts the robot was giving. 167 - 168 -- Interesting to consider is if participants are perhaps inclined to be positive, or feel like they need to be in such a project evaluation and if ideas like these also ended up affecting the outcome. ? 169 - 170 -- Normally, a robot should really take a walk outside. It should have been tested how a robot will do in actual garden, totally another surface then the room we did the experiment. Unfortunately, we could not do that, because we are not allowed to move th robot from the room. 171 - 172 -- In future studies the amount of participants should be considered, as well as testing the design on PwD and in a garden. Further improvements to the speech recognition are needed, as well as the smoothness of the walking and the distances travelled and the aspect of the participant's distance to the robot. Perhaps if the less realistic aspects discussed above are minimized, a robot that feels more realistic would result in participants listening to the actual prompts given, rather than going into the experiment with a predisposed idea of what they are going to do or answer and would also perhaps deter the participants from tending to reply positively. 
173 - 174 - 175 175 = 5. Conclusions = 176 176 177 -Both systems were deemed enjoyable and fascinating, and little rejections were made to both types of persuasions. No significant difference was found in any of the measures between the two groups. 178 178 179 - 180 180 == References == 181 181 182 182 [1] Brysbaert, M. (2019). How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. //Journal of Cognition//, //2//(1), 16. DOI: [[http:~~/~~/doi.org/10.5334/joc.72>>url:http://doi.org/10.5334/joc.72]] ... ... @@ -184,13 +184,13 @@ 184 184 [2] Thorsten Thadewald and Herbert Büning. “Jarque–Bera test and its competitors for testing 185 185 normality–a power comparison”. In: Journal of applied statistics 34.1 (2007), pp. 87–105. 186 186 150 + 187 187 [3] Nadim Nachar et al. “The Mann-Whitney U: A test for assessing whether two indepen- 188 188 dent samples come from the same distribution”. In: Tutorials in quantitative Methods for 189 189 Psychology 4.1 (2008), pp. 13–20. 190 190 155 + 191 191 [4] Tae Kyun Kim. “T test as a parametric statistic”. In: Korean journal of anesthesiology 68.6 192 192 (2015), pp. 540–546. 193 193 194 -[5] M. Heerink, B. Kröse, V. Evers, and B. Wielinga, “Measuring acceptance of an assistive social robot: a suggested toolkit .” [Online]. Available: https:~/~/mheerink.home.xs4all.nl/pdf/HeerinkRo-man09.pdf. [Accessed: 03-Apr-2023]. 195 - 196 196
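The test-selection step described in the Results section (check normality with the Jarque-Bera test, then use the T-Test for normally distributed answers and the Mann-Whitney U-Test otherwise) could be sketched roughly as follows. This is only an illustration and not the code used for the study: the page does not say which software was used, the function names `jarque_bera` and `choose_test` are hypothetical, and checking each group separately is an assumption. The p-value relies on the asymptotic chi-square approximation with 2 degrees of freedom, which is known to be rough for small groups such as 8 participants.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Biased central moments, as used in the usual JB definition.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    if m2 == 0:
        raise ValueError("sample has zero variance")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def choose_test(group_a, group_b, alpha=0.05):
    """Pick the two-sample test: assumption is that both groups must pass."""
    for group in (group_a, group_b):
        _, p_value = jarque_bera(group)
        if p_value < alpha:
            # Normality rejected for this group: use the rank-based test.
            return "mann-whitney-u"
    return "t-test"


if __name__ == "__main__":
    # Made-up ratings, only to show the call; not data from the study.
    print(choose_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

With the chosen test name in hand, the actual comparison could be run with an existing statistics package; for instance, SciPy provides `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu`.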