Step 4: Claims

|**Topic**|(% style="width:215px" %)**Question**|(% style="width:406px" %)**Answer**

|(((

[[image:12.png]]

//Measurements//

)))|(% style="width:215px" %)For each positive and negative effect listed in step 3, describe how you could evaluate (measure) whether they actually occur.|(% style="width:406px" %)(((

- Explicit feedback from patients, gathered in two ways: (1) the answers they give to the prompts, and (2) the mood feedback they give at the end of the session.

Control group: Storytelling without interaction.

Experiment group: Interactive Storytelling.

Quantify mood by plotting it on a mood graph and comparing it against a threshold value.

Because the patient (Georgina) is likely to retain some mobility and control over her own choices, the system could infer her mood from her feedback.

- Explicit feedback from caregivers

The system could ask the caregivers to enter Yes or No for whether each task was performed. For example: did the patient take their medicine after being reminded? Did the patient eat their meal happily?

- To measure how dependent the users are on the system, the system could be taken offline for a day or two. The caregiver, **Eleana**, could assist the patient instead and then answer questions on whether she was able to effectively perform the tasks otherwise automated by the AI system.

(This cannot be measured within the duration of the course.)

)))

|(((

[[image:14.png]]

//Benchmark//

)))|(% style="width:215px" %)For each measurement, what are the benchmarks (criteria)? (i.e., what are desired values?)|(% style="width:406px" %)(((

(Scenario A)

- For the mood graph, if the values range from 1 to 10, we could set a benchmark of around 5-6, so that the system adapts its behaviour to the patient's preferences. While the patient's mental state is not always under the system's control, the system could prove to be a stabilizing factor.

- For the explicit feedback, we could set a benchmark of around 70-80% positive feedback, which would imply that the patient was able to perform 70-80% of the tasks successfully.

- Null hypothesis: Interaction adds no value to the patient experience.

- Alternative hypothesis: Interactive storytelling improves the patient experience.

(Scenario B) - Will not be tested with the prototype

- To measure dependency, we could use the same explicit feedback but set a lower benchmark of 65-70%, since the system is removed from the interaction.

)))

|(((

[[image:13.png]]

//Demonstration of AI-functionality//

)))|(% style="width:215px" %)Can you describe how you could demonstrate that your AI function(s) achieve(s) the effects that you listed in the previous question?|(% style="width:406px" %)(((

- To demonstrate that the AI achieves the desired effects, we could plot the mood graph, which would ideally show a slightly increasing trend that stays above the threshold value.

- To demonstrate the usefulness of the system, we could compare the feedback from the control group with that from the experimental group and, ideally, show that the feedback is better with interaction.

)))
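
The checks described in the table could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the use of the lower end of each benchmark range (5 for mood, 70% for feedback), and the choice of a one-sided permutation test for comparing the control and experiment groups are all hypothetical and not part of the study design, and the sample values are invented.

```python
import random
from statistics import mean

# Assumed cut-offs: the lower end of each benchmark range named in the table.
MOOD_BENCHMARK = 5.0        # mood scale assumed to run from 1 to 10
FEEDBACK_BENCHMARK = 0.70   # share of positive (Yes) answers


def positive_rate(answers):
    """Share of 'Yes' answers in a non-empty list of Yes/No answers."""
    return sum(a.strip().lower() == "yes" for a in answers) / len(answers)


def meets_mood_benchmark(moods, benchmark=MOOD_BENCHMARK):
    """True if the average mood score reaches the benchmark."""
    return mean(moods) >= benchmark


def permutation_p_value(control, experiment, rounds=10_000, seed=0):
    """One-sided permutation test (an assumed choice of test).

    Estimates how often a random relabelling of the pooled scores gives a
    mean difference (experiment minus control) at least as large as the
    observed one. A small value counts against the null hypothesis that
    interaction adds no value.
    """
    rng = random.Random(seed)
    observed = mean(experiment) - mean(control)
    pooled = list(control) + list(experiment)
    n_control = len(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            hits += 1
    return hits / rounds


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    caregiver_answers = ["Yes", "Yes", "No", "Yes"]
    control_moods = [4, 5, 5, 6, 4, 5]
    experiment_moods = [6, 7, 5, 7, 6, 8]

    rate = positive_rate(caregiver_answers)
    print("positive feedback rate:", rate, "meets benchmark:", rate >= FEEDBACK_BENCHMARK)
    print("mood benchmark met:", meets_mood_benchmark(experiment_moods))
    print("p-value:", permutation_p_value(control_moods, experiment_moods))
```

A permutation test is used here only because it makes few assumptions about how mood scores are distributed; any other two-sample comparison could be substituted.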
