Changes for page Test

Last modified by Andrei Stefan on 2022/04/04 13:38

From version

67.1

edited by Xinqi Li
on 2022/04/01 22:32

Change comment: There is no comment for this version

To version

95.1

edited by Xinqi Li
on 2022/04/02 01:41

Change comment: There is no comment for this version

Summary

Page properties (1 modified, 0 added, 0 removed)
Attachments (0 modified, 4 added, 0 removed)

Details

Page properties

Content

@@ -15,7 +15,7 @@
  Besides, our group decided to use a mixed-method approach for the evaluation.
--* Quantitative data will be derived during the experiment such as the number of mistakes the participant makes during the quiz.
++* Quantitative data will be derived during the experiment such as the number of mistakes the participant makes during the quiz. Participants were also asked to give a score using the System Usability Scale^^1^^.
  * Qualitative data, such as how satisfied participants are with using the robot, will be gathered through questionnaires and also used for the evaluation.
  By measuring these two types of data, we can assess whether our claims hold and whether the research questions are answered.
@@ -41,47 +41,12 @@
 . What did you dislike most about the robot?
 . Do you have any further suggestions? (*optional)
--
--
  == Tasks ==
--**Event: Quiz**
++Participants are asked to memorize the associations between the given music and activities as well as they can while playing with the robot.
++The robot plays a piece of music and asks the participant to name the matching activity.
++At the end, the participant takes a final test and we count the number of correct answers.
--{{html}}
--<table>
--    <tr>
--        <td>No.</td>
--        <td>Group A with the intelligent robot</td>
--        <td>Group B with the dumb robot</td>
--    </tr>
--    <tr>
--        <td>1</td>
--        <td>Participants sign the consent form and read the instruction for the evaluation;</td>
--        <td>Participants sign the consent form and read the instruction for the evaluation;</td>
--    </tr>
--    <tr>
--        <td>2</td>
--        <td>Participants memorize six pieces of music corresponding with different activities;</td>
--        <td>Participants memorize six pieces of music corresponding with different activities;</td>
--    </tr>
--    <tr>
--        <td>3</td>
--        <td>Participants play quiz with the smart robot for three minutes, which will correct the participant when wrong answers are given;</td>
--        <td>Participants play quiz with the dumb robot for three minutes, which will not correct the participant when wrong answers are given;</td>
--    </tr>
--    <tr>
--        <td>4</td>
--        <td>Test how well participants remember the music-activity pairs by counting the mistakes made;</td>
--        <td>Test how well participants remember the music-activity pairs by counting the mistakes made;</td>
--    </tr>
--    <tr>
--        <td>5</td>
--        <td>Participants fill in the questionnaire and give the feedback;</td>
--        <td>Participants fill in the questionnaire and give the feedback;</td>
--    </tr>
--</table>
--{{/html}}
--
  == Measures ==
  Count the correct answers in the final test.
@@ -132,6 +132,36 @@
  = Results =
++[[image:result2.png||height="400px"]]
++The left figure shows the distribution of the number of correct answers. The average score across all participants is 3.6 out of 6 questions; group A averages 3.3 and group B averages 3.8. This difference may be explained by our groups being too small to average out differences in individual memory ability. However, no participant in group A scored 0, whereas several participants in group B did, so every participant in group A learned something. To this extent, the results suggest that our robot helps with memorization.
++
++The middle figure shows that participants in group A tend to think our robot helps with the memory task, and the right figure shows that only a few of them found the robot annoying.
++
++[[image:result4.png||height="400px"]]
++As shown in the figure above, group A, with our intelligent robot, gave the robot an average usability score of 66.7, and group B, with the dumb robot, gave 58.2. On this scale, participants appear more willing to play with our intelligent robot.
++
++We also collected feedback from the participants. Most of them liked the appearance of the robot, which is consistent with our reasons for choosing the NAO: people are more engaged and more willing to interact with a humanoid robot. Some participants complained about the robot's speech recognition.
++
  = Discussion =
++We assumed that our intelligent robot can help people strengthen the association between music and activities. The average number of correct answers did not support this, for several possible reasons. First, our participants were not real PwD and their memory abilities vary. Second, our group size (about 10 per group) was not large enough. Third, participants were given only a limited time. The short duration of the quiz and the lack of personalised music may also have contributed to this biased result. However, the overall usability scores of the two groups and some of the quantitative results above suggest that our claims that PwD are more willing to play with our intelligent robot and that PwD are happy to use the robot could still hold.
++
++In addition, our evaluation was limited by several key factors:
++
++* Due to limited time and resources, we could not evaluate all the claims made in the use cases. This limits the breadth of our conclusions about the effectiveness of the system.
++* As mentioned above, the small sample size makes the accuracy of the results doubtful. A larger and more diverse sample would allow us to predict real-world usage more accurately.
++* The accuracy of the NAO's speech recognition system and the availability of test subjects and robots also limited the evaluation.
++
++In the future, we could improve in the following aspects:
++
++* Test a full implementation of the system in a real setting with PwD.
++* Investigate whether the robot is actually necessary, or whether the advantages of the system could be achieved by a cheaper alternative, such as a virtual robot on a tablet. (This was also inspired by the feedback we received: one participant asked why we did not create an app.)
++
++
  = Conclusion =
++
++Based on our evaluation, participants who used our intelligent robot were more willing to play the quiz and more often considered that the robot helps them remember the task than participants in the control group. As discussed above, our robot still needs further improvement.
++
++= Reference =
++
++Bangor, A., Kortum, P. T., & Miller, J. T. (2008). An empirical evaluation of the system usability scale. Intl. Journal of Human–Computer Interaction, 24(6), 574-594.
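
For reference, the system usability scale cited above is conventionally scored from ten ratings on a 1 to 5 scale into a single value between 0 and 100. The page does not say how the group averages of 66.7 and 58.2 were computed, so the Python sketch below only assumes the standard scoring rule. The function names and the sample ratings are hypothetical and are not the study's data.

```python
from statistics import mean


def sus_score(responses):
    """Score one questionnaire under the standard SUS rule (assumed here).

    `responses` holds ten ratings from 1 to 5, in question order.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten ratings, each between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Questions 1, 3, 5, 7, 9 are positively worded: rating minus 1.
            total += rating - 1
        else:
            # Questions 2, 4, 6, 8, 10 are negatively worded: 5 minus rating.
            total += 5 - rating
    # The summed contributions (0 to 40) are scaled to the 0 to 100 range.
    return total * 2.5


def group_mean_sus(group):
    """Average SUS score over all questionnaires of one group."""
    return mean(sus_score(responses) for responses in group)


# Hypothetical ratings for two participants, for illustration only.
example_group = [[5, 1] * 5, [3] * 10]
print(group_mean_sus(example_group))
```

Under this rule a participant who answers every question neutrally scores 50, so a group average can be read relative to that midpoint.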

result1.png

Author

...	...	@@ -1,0 +1,1 @@
	1	+XWiki.mona98

Size

...	...	@@ -1,0 +1,1 @@
	1	+107.3 KB

Content

result2.png

Author

...	...	@@ -1,0 +1,1 @@
	1	+XWiki.mona98

Size

...	...	@@ -1,0 +1,1 @@
	1	+169.4 KB

Content

result3.png

Author

...	...	@@ -1,0 +1,1 @@
	1	+XWiki.mona98

Size

...	...	@@ -1,0 +1,1 @@
	1	+217.7 KB

Content

result4.png

Author

...	...	@@ -1,0 +1,1 @@
	1	+XWiki.mona98

Size

...	...	@@ -1,0 +1,1 @@
	1	+52.8 KB

Content