Changes for page Test
Last modified by Clara Stiller on 2022/04/05 13:44
From version
61.1


edited by Vishruty Mittal
on 2022/04/02 12:26
Change comment:
There is no comment for this version
To version
90.1


edited by Vishruty Mittal
on 2022/04/02 16:02
Change comment:
There is no comment for this version
- Page properties
-
- Content
-
@@ -1,6 +1,5 @@
1 -Evaluation is an iterative process where the initial iterations focus on examining if the proposed idea is working as intended. Therefore, we want to first understand how realistic and convincing the provided dialogues and suggested activities are, and would they be able to prevent people from wandering. To examine this, we conduct a small pilot study with students, who role-play having dementia. We then observe their interaction with Pepper to examine the effectiveness of our dialog flow in preventing people from wandering.
1 +Evaluation is an iterative process where the initial iterations focus on examining if the proposed idea is working as intended. Therefore, we want to first understand how realistic and convincing the provided dialogues and suggested activities are, and would they be able to prevent people from wandering. To examine this, we conduct a small pilot study with students, who role-play having dementia. We then observe their interaction with Pepper to examine the effectiveness of our dialogue flow in preventing people from wandering.
2 2
3 -
4 4 = Problem statement and research questions =
5 5
6 6 **Goal**: How effective is music and dialogue in preventing people with dementia from wandering?
@@ -23,86 +23,44 @@
23 23
24 24 = Method =
25 25
26 -A between-subject study with students who play the role of having dementia. Data will be collected with a questionnaire that participants fill out before and after interacting with Pepper. The questionnaire captures different aspects of the conversation along with their mood before and after the interaction with Pepper.
25 +We will conduct a between-subject study with students who play the role of having dementia. Data will be collected with a questionnaire that participants fill out before and after interacting with Pepper. The questionnaire captures different aspects of the conversation along with their mood before and after the interaction with Pepper.
27 27
28 28 For our between-subject study, our independent variable is Pepper trying to distract the users by mentioning different activities along with the corresponding music. Through this, we want to measure the effectiveness of music and activities in preventing people from leaving the care home, which is thereby our dependent variable. So we developed 2 different prototype designs-
29 29
30 -Design X - It is the full interaction flow where Pepper suggests activities and uses music to distract people from leaving.
31 -Design Y - It is the control condition where pepper simply tries to stop people from leaving by physically keeping its hand on the door.
29 +Design X - is the full interaction flow where Pepper suggests activities and uses music to distract people from leaving.
30 +Design Y - is the control condition where pepper simply tries to stop people from leaving by physically keeping its hand on the door.
32 32
33 33 == Participants ==
34 34
35 -17 students who play the role of having dementia. They will be divided into two groups. One group (11 participants) will be interacting with the intelligent (group 1) robot while the other group (6 students) will interact with the unintelligent robot (group 2).
36 -It is assumed that all participants are living at the same care center.
37 -Before they start, they can choose how stubborn they want to be and where they want to go.
34 +The ideal participants for our user study would have been people who have dementia. However, as the people in this section fall under vulnerable groups, testing with them would have been very difficult due to the current pandemic situation. Therefore we planned to conduct our experiments with students instead.
35 +Our experiment involves 17 students who play the role of having dementia. They will be divided into two groups. One group (11 participants) will be interacting with design X, while the other group (6 students) will interact with design Y.
38 38
39 39 == Experimental design ==
40 40
41 -All questions collect quantitative data, using a 5 point Likert scale wherever applicable.
39 +**Before Experiment:**
40 +We will explain to the participants the goal of this experiment and what do they need to do to prevent ambiguity. Therefore, as our participants are students and only playing the role of having dementia, we will give them a level of stubbornness/ willpower with which they are trying to leave the care home.
41 +Participants will also be given a reason to leave, from the below list:
42 42
43 -1. Observe the participant's mood and see how the conversation goes. Observe the level of aggression (tone, volume, pace)
44 -1. Observe whether the mood is improved and the decision has been changed.
45 -1. Observe how natural the conversation is. (conversation makes sense)
46 -1. Participants fill out questionnaires.
43 +* going to the supermarket
44 +* going to the office
45 +* going for a walk
47 47
48 -== Tasks ==
47 +After this preparation, the participant fills a part of the questionnaire.
49 49
50 -Because our participants only play the role of having dementia, we will give them a level of stubbornness/ willpower with they are trying to leave. We try to detect this level with the robot.
51 -Participants from group 1 (using intelligent robot) will also be given one of the reasons to leave, listed below:
49 +**Experiment:**
50 +The participant begins interacting with Pepper who is standing near the exit door. The participant and robot have an interaction where the robot is trying to convince him/her to stay inside.
52 52
53 -1. going to the supermarket
54 -1. going to the office
55 -1. going for a walk
52 +**After Experiment:**
53 +After the participant finishes interacting with Pepper, he/she will be asked to fill out the remaining questionnaire. Almost all the questions in the questionnaire collect quantitative data, using a 5 point Likert scale. The questionnaire also used images from Self Assessment Manikin (SAM) so that user can self attest to their mood before and after their interaction with Pepper.
56 56
57 -After this preparation, the participant is told to (try to) leave the building. The participant and robot have an interaction where the robot is trying to convince the participant to stay inside.
58 -
59 -
60 -== Measures ==
61 -
62 -We will be measuring this physically and emotionally.
63 -Physically: whether the participant was stopped from leaving the building or not.
64 -Emotionally: evaluate their responses to the robot and observe their mood before and after the interaction.
65 -
66 -
67 -== Procedure ==
68 -
69 -{{html}}
70 -<!-- Your HTML code here -->
71 -<table width='100%'>
72 -<tr>
73 -<th width='50%'>Group 1</th>
74 -<th width='50%'>Group 2</th>
75 -</tr>
76 -<tr>
77 -<td>intelligent robot</td>
78 -<td>unintelligent robot</td>
79 -</tr>
80 -<tr>
81 -<td>
82 -1. Starts with a short briefing on what we expect from the participant<br>
83 -2. Let them fill out the informed consent form<br>
84 -3. Tell them their level of stubbornness and reason to leave<br>
85 -4. Fill out question about current mood (in their role)<br>
86 -4. Let the user interact with the robot<br>
87 -5. While user is interacting, we will be observing the conversation with the robot<br>
88 -6. Let user fill out the questionnaire about their experience after the interaction
89 -</td>
90 -<td>
91 -1. Starts with a short briefing on what we expect from the participant<br>
92 -2. Let them fill out the informed consent form<br>
93 -4. Fill out question about current mood (in their role)<br>
94 -5. Let the user interact with the robot<br>
95 -6. Let user fill out the questionnaire about their experience after the interaction<br>
96 -</td>
97 -</tr>
98 -</table>
99 -
100 -{{/html}}
101 -
102 102 == Material ==
103 103
104 -Pepper, laptop, door, and music.
57 +The items required for this evaluation are the following:
105 105
59 +* Pepper
60 +* Door
61 +* Caretaker in a nearby room in case of emergency
62 +
106 106 = Results =
107 107
108 108 {{html}}
@@ -170,7 +170,8 @@
170 170 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ1.jpg?height=250&rev=1.1" />
171 171 </td>
172 172 <td>
173 -Comment on the graph
130 +We used a Likert scale for this question, 1 being the lowest and 5 being the highest. Participants who interacted with design Y tend to agree less to stay inside compared to the people who interacted with design X.
131 +
174 174 </td>
175 175 </tr>
176 176 </table>
@@ -185,7 +185,9 @@
185 185 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ2.jpg?height=250&rev=1.1" />
186 186 </td>
187 187 <td>
188 -Comment on the graph
146 +We notice a positive change in valence with the full flow i.e design X (although negligible). This can be because of the music. The valence does not decrease for the baseline which might be due to the novelty effect of seeing Pepper for the first time. The change in arousal in both scenarios is nearly negligible. This might be due to the fact that the interaction with Pepper was very short.
147 +Additionally, in the case of the full flow i.e design X, these values might have not changed significantly as per the expectation (valence higher, arousal lower) because the music was not personalized for participants.
148 +
189 189 </td>
190 190 </tr>
191 191 </table>
@@ -200,7 +200,9 @@
200 200 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ3.jpg?height=250&rev=1.1" />
201 201 </td>
202 202 <td>
203 -Comment on the graph
163 +We notice a very minute difference between the full flow i.e design X, and control condition, design Y. There might be many reasons behind this. The speech recognition module in Pepper was not very efficient to understand different accents and thereby misunderstood words in some cases. <br>
164 +The null hypothesis is perceived message understanding for both the conditions is equal. Given the p value, the null hypothesis can not be rejected. High variance in data and also restrictive sample size could be the reasons behind the insignificant result.
165 +
204 204 </td>
205 205 </tr>
206 206 </table>
@@ -215,7 +215,7 @@
215 215 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ4.jpg?height=250&rev=1.1" />
216 216 </td>
217 217 <td>
218 -Comment on the graph
180 +We found that participants who knew the songs, enjoyed the music and thought it fit the situation more than the ones who did not know the songs.
219 219 </td>
220 220 </tr>
221 221 </table>
@@ -230,7 +230,7 @@
230 230 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ5.jpg?height=250&rev=1.1" />
231 231 </td>
232 232 <td>
233 -Comment on the graph
195 +As per these results, we can say that if participants have a predilection toward the suggested activity, there is a higher chance of them staying in. Therefore there is a direct correlation between people staying in and their interest in the activity. After personalization, we expect the score to be further increased.
234 234 </td>
235 235 </tr>
236 236 </table>
@@ -245,7 +245,11 @@
245 245 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RQ6.jpg?height=250&rev=1.1" />
246 246 </td>
247 247 <td>
248 -Comment on the graph
210 +We find that the values for co-presence for both conditions are very similar. This may be attributed to the novelty effect and also to the fact that the face recognition module remains unchanged.
211 +The values for attention allocation are similar, but the controlled flow (design Y) has a higher value. We suspect that the potential reason might be, that people start to lose focus with the elongated conversations.
212 +
213 +Besides the co-presence, all the observations are not statistically significant because of the high variance in the limited responses.
214 +
249 249 </td>
250 250 </tr>
251 251 </table>
@@ -260,7 +260,7 @@
260 260 <img src="/xwiki/wiki/sce2022group05/download/Foundation/Operational%20Demands/Personas/WebHome/RelScores.jpg?height=250&rev=1.1" />
261 261 </td>
262 262 <td>
263 -Comment on the graph
229 +We achieved a high Cronbach's alpha score (>60%) for almost all the sections of our analysis. Thereby providing reliability to our evaluation.
264 264 </td>
265 265 </tr>
266 266 </table>
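
The reliability comment in the last hunk refers to Cronbach's alpha, which the page reports but does not show how it was computed. The sketch below is one way it could be calculated for a single questionnaire section, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for one questionnaire section.

    `responses` is a list of rows, one per participant; each row holds
    that participant's Likert scores for the k items of the section.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*responses)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert answers (NOT data from this study):
# 4 participants x 3 items of one questionnaire section.
example = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
]
print(round(cronbach_alpha(example), 3))
```

A value above the 0.6 threshold mentioned in the comment would be read as acceptable internal consistency for that section. With groups as small as 11 and 6 participants, the estimate is likely to be unstable.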