b. Test

= 1. Introduction =

In the section [[a. Prototype>>3\. Evaluation.a\. Prototype.WebHome]], two versions of the robot were presented: one with voice functionality and one without.

The main claims tested by this procedure concern the functionality and usability of the robot.

The participants will be other students taking the course. They will be placed in the shoes of a person with dementia (PwD) and asked to complete several basic actions with the robot while impaired in several known ways that simulate the difficulties of a PwD.

After the experiment, the participants will fill out a survey and answer some open-ended questions. The purpose is to understand how the interaction with the robot went and whether they have any concerns about the possible use of the system and its functions in a real-life setting.

In addition, a short questionnaire will be sent to several care homes throughout the Netherlands in the hope of getting a general idea of whether the caretakers at these facilities think the system would be a good fit for the proposed use case.

= 2. Method =

The prototypes are evaluated in a simulated setting: participants pretend to be PwDs while taking part in in-person experiments.

== 2.1 Participants ==

All students in CS4235 Socio-Cognitive Engineering (2022-2023) at TU Delft were invited to test the robot. In the end, 20 students took part.

== 2.2 Experimental design ==

== 2.3 Tasks ==

In the user test, participants were asked to complete the following tasks:

==== Reminders for activities ====

* (((
Add a reminder that a relative will visit tomorrow, using the format "sb. will come at 10 am on Friday". Set the reminder to alert you 10 minutes beforehand.
)))
* (((
Add a reminder USING A VOICE COMMAND for a general health checkup today at 2 pm.
)))
* (((
Check the reminders you have added for today and tomorrow.
)))

==== Personal profile ====

* (((
Add relatives as contacts in the "profile" section.
)))

==== Memory games ====

* (((
Go to the Games section and check what is included there.

(this might not be part of the experiment, as it might not be implemented in time)
)))

==== Medicine reminders ====

(for professional caregivers, write it just in case)

* (((
In the section “My Health”, add a medicine reminder to take the medicine **Donepezil** once a day at 9 PM, before going to bed.
)))
* (((
Check the medicines that have been added.
)))
* (((
Delete the medicines that have been added.
)))

==== About dementia ====

* (((
Go to the About dementia section and check the information provided.
)))
* (((
Click on the different chapters and have a look at them.
)))

==== General tasks ====

* (((

Turn on the robot.

)))

== 2.4 Measures ==

Two quantitative measures are used in the user evaluation. The first is a matrix question that rates attributes of the robot, including accessibility, trustworthiness, and comprehensibility. The second is the System Usability Scale (SUS).

**Interpretation for user evaluation**

A respondent with a total score of at least 60% (15 out of 25 on the matrix question) was considered to be satisfied with the application.
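
The satisfaction criterion can be written as a one-line check. The sketch below is illustrative only: the function name and parameters are hypothetical, and the comparison uses integer arithmetic so that a score of exactly 15 out of 25 counts as satisfied.

```python
def is_satisfied(total_score, max_score=25, threshold_percent=60):
    """Return True if the respondent reaches the satisfaction threshold.

    total_score: summed answers to the matrix question (hypothetical name).
    max_score: highest possible total, 25 according to the text.
    threshold_percent: minimum percentage, 60 according to the text.
    """
    if not 0 <= total_score <= max_score:
        raise ValueError("total_score must lie between 0 and max_score")
    # Compare in integers to avoid floating-point edge cases at the boundary.
    return total_score * 100 >= threshold_percent * max_score


print(is_satisfied(15))  # boundary case: exactly 60%
print(is_satisfied(14))  # just below the threshold
```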

**Scoring SUS**

* For odd-numbered items: subtract 1 from the user response.
* For even-numbered items: subtract the user response from 5.
* This rescales every item to a value from 0 to 4 (with 4 being the most positive response).
* Add up the converted responses for each user and multiply the total by 2.5. This converts the range of possible values from 0-40 to 0-100.
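
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard ten-item SUS questionnaire answered on a 1-5 scale; the function name and the example answers are made up for this sketch and are not data from the experiment.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    responses: the ten answers in questionnaire order, each an integer 1-5.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd item: response minus 1
        else:
            total += 5 - response   # even item: 5 minus response
    # total is now in the range 0-40; rescale to 0-100.
    return total * 2.5


example_answers = [4, 2, 5, 1, 4, 2, 4, 1, 5, 2]  # hypothetical participant
print(sus_score(example_answers))  # 85.0
```

For the example answers, the odd items contribute 17 points and the even items contribute 17 points, so the score is 34 × 2.5 = 85.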

**Interpreting Scores for SUS [[*>>https://measuringu.com/sus/]]**

Interpreting SUS scores can be complex. The participant’s score for each question is converted to a new number, and the converted scores are added together and multiplied by 2.5 to convert the original range of 0-40 to 0-100. Although the scores range from 0 to 100, they are not percentages and should be considered only in terms of their percentile ranking.

Based on research, a SUS score above 68 is considered above average and anything below 68 is below average. However, the best way to interpret the results is to “normalize” the scores to produce a percentile ranking.
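
One possible way to approximate such a percentile ranking is sketched below. It assumes that SUS scores across systems are roughly normally distributed around the average of 68 mentioned above; the standard deviation of 12.5 and the function name are assumptions made for this sketch, not values taken from this study.

```python
import math


def sus_percentile(score, mean=68.0, sd=12.5):
    """Approximate percentile rank of a SUS score.

    Assumes a normal distribution of SUS scores; mean=68 follows the text,
    sd=12.5 is an assumed value for illustration.
    """
    z = (score - mean) / sd
    # Normal cumulative distribution function, expressed as a percentage.
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(sus_percentile(68))  # 50.0, the average score sits at the 50th percentile
```

Under these assumptions, a score above 68 maps to a percentile above 50 and a score below 68 maps to a percentile below 50.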

== 2.5 Procedure ==

The procedure was conducted as follows:

1. Welcome the participants and give an introduction.
1. Ask them to sign a consent form.
1. Prepare them to pretend to be a person with dementia. *
1. Have them interact with the robot and complete the tasks.
1. Have them complete a questionnaire.
1. Hold a short interview with selected participants (if possible, 2 participants).

//* Several fingers are taped together (simulating a PwD's inability to control movements flexibly);//

// Very dirty glasses are worn (simulating blurred vision and degraded perception);//

// Headphones that broadcast murmurs are worn (simulating hearing degradation and a noisy environment).//

== 2.6 Material ==

1. Consent form. To protect the privacy of participants and ensure the evaluation process goes smoothly, we will ask participants to sign a consent form indicating that they are willing to take part in the evaluation and that the data gathered from the experiment will be analyzed by the researchers.
1. Pepper robot. <not sure how to elaborate on this>

= 3. Results =

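|=(% style="width: 199px;" %)Tasks|=(% style="width: 147px;" %)Succeeded by Themselves|=(% style="width: 146px;" %)Succeeded with Some Guidance|=(% style="width: 185px;" %)Succeeded with Detailed Explicit Instructions|=(% style="width: 175px;" %)Average Time to Complete Task (s)
|(% style="width:199px" %)Turning the robot on and off|(% style="width:147px" %) |(% style="width:146px" %) |(% style="width:185px" %) |(% style="width:175px" %) 
|(% style="width:199px" %)Add a reminder using "+" button|(% style="width:147px" %) |(% style="width:146px" %) |(% style="width:185px" %) |(% style="width:175px" %) 
|(% style="width:199px" %)Add a reminder using voice|(% style="width:147px" %) |(% style="width:146px" %) |(% style="width:185px" %) |(% style="width:175px" %) 
|(% style="width:199px" %)Check the reminder|(% style="width:147px" %) |(% style="width:146px" %) |(% style="width:185px" %) |(% style="width:175px" %) 
|(% style="width:199px" %)Create a personal profile|(% style="width:147px" %) |(% style="width:146px" %) |(% style="width:185px" %) |(% style="width:175px" %) 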

If possible, note the parts of each task where users struggled.

= 4. Discussion =

= 5. Conclusions =
