2. Evaluation Methods

Last modified by Deniz Cetin on 2025/11/09 23:26


Our product aims to address two main tasks. First, it should increase the patients' autonomy; second, it should provide some social interaction to improve their mood.

Usability metrics, as discussed by Nielsen [1], are a very important type of metric. We plan to address two of them:

  • Success Rate: We will measure the share of patients who performed a task by themselves with the aid of the system.
  • User Satisfaction: We plan to ask the patients who interacted with the system how satisfied they are, on a scale ranging from very unsatisfied to very satisfied.
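
As an illustration, the two usability metrics above could be computed roughly as follows. This is a minimal sketch, not a fixed analysis plan: the function names, the five-point label set, and the choice of the median as summary are all assumptions made for the example.

```python
from collections import Counter
from statistics import median_low

# Hypothetical 5-point scale; only the two end labels come from the text above.
SATISFACTION_SCALE = [
    "very unsatisfied", "unsatisfied", "neutral", "satisfied", "very satisfied",
]

def success_rate(outcomes):
    """Share of patients who completed the task by themselves with the system's aid.

    outcomes: one boolean per patient (True = task completed).
    """
    if not outcomes:
        raise ValueError("no outcomes recorded")
    return sum(1 for done in outcomes if done) / len(outcomes)

def satisfaction_summary(responses):
    """Return (count per scale label, median label) for a list of answers."""
    if not responses:
        raise ValueError("no responses recorded")
    unknown = set(responses) - set(SATISFACTION_SCALE)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(responses)
    per_label = {label: counts[label] for label in SATISFACTION_SCALE}
    # The scale is ordinal, so summarise with the (low) median, not the mean.
    ranks = [SATISFACTION_SCALE.index(r) for r in responses]
    return per_label, SATISFACTION_SCALE[median_low(ranks)]

# Example: three of four patients completed the task.
rate = success_rate([True, True, False, True])  # 0.75
```

The median is used here because satisfaction labels are ordered but not necessarily evenly spaced; a different summary may suit the final study design better.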

Another important aspect is usability and user experience goals, as presented by Winograd [2]. For these, we plan to use three metrics:

  • Cognitive Load: How much mental effort the system requires of the user. The system will be rated on a scale from very easy to use to very hard to use.
  • Emotional Impact: Engaging with the system should be a fun and entertaining experience for the users. Users will be asked to rate how they feel when interacting with the system, for example happy, bored, annoyed, or some other emotion.
  • User Engagement: Users will keep using a good system because it makes their tasks easier to perform. With this in mind, we will measure both the number of times users use the system and the total time they spend using it. Good user engagement shows as a high number of uses with long use times.
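
The engagement measure described above (number of uses and total time of use) could be derived from logged sessions along these lines. This is only a sketch: the `(start, end)` session format and the function name are assumptions, since the logging format of the system is not specified here.

```python
from datetime import datetime

def engagement(sessions):
    """Summarise one user's engagement.

    sessions: list of (start, end) datetime pairs, one per use of the system.
    Returns the number of uses and the total time of use in minutes.
    """
    durations = [(end - start).total_seconds() for start, end in sessions]
    if any(d < 0 for d in durations):
        raise ValueError("a session ends before it starts")
    return {"uses": len(durations), "total_minutes": sum(durations) / 60}

# Example: two sessions on the same day, 10 and 20 minutes long.
log = [
    (datetime(2025, 11, 9, 9, 0), datetime(2025, 11, 9, 9, 10)),
    (datetime(2025, 11, 9, 14, 0), datetime(2025, 11, 9, 14, 20)),
]
summary = engagement(log)  # {"uses": 2, "total_minutes": 30.0}
```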

Given that the system will use AI, there are other important metrics to discuss, such as accuracy and robustness [3], as well as trust and privacy [4]. Both accuracy and robustness are measured by the system's ability to correctly remember the tasks the user has to perform and to alert the user to them, as well as by the correctness of the information it provides. Trust will be measured by asking the patients and the caregivers whether they trust the system to properly aid the patient with their daily tasks. Privacy is not exactly a metric, since it must be maintained to respect the patient's rights, but it is important to address nonetheless.
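
One possible way to quantify how well the system remembers and alerts the user of tasks is to compare the tasks that were scheduled with the alerts that were actually delivered. The sketch below is an assumption about how this could be done, not the project's defined procedure: the task identifiers and the function name are hypothetical, and it does not cover the correctness of the information the system provides.

```python
def reminder_accuracy(scheduled, alerted):
    """Compare scheduled tasks with delivered alerts.

    scheduled: identifiers of tasks the user had to perform.
    alerted:   identifiers of tasks the system actually alerted the user about.
    """
    scheduled, alerted = set(scheduled), set(alerted)
    if not scheduled:
        raise ValueError("no scheduled tasks to evaluate against")
    correct = scheduled & alerted
    return {
        # Share of scheduled tasks for which an alert was delivered.
        "recall": len(correct) / len(scheduled),
        # Share of delivered alerts that matched a scheduled task.
        "precision": len(correct) / len(alerted) if alerted else None,
        "missed": sorted(scheduled - alerted),
        "spurious": sorted(alerted - scheduled),
    }

# Hypothetical task identifiers, for illustration only.
result = reminder_accuracy(
    scheduled=["meds_08", "lunch_12", "walk_15", "meds_20"],
    alerted=["meds_08", "lunch_12", "meds_20", "tv_21"],
)
# result["recall"] is 0.75 and result["missed"] is ["walk_15"]
```

Repeating the same comparison under varied conditions, for example rephrased or incomplete task entries, would give a rough indication of robustness.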

Since the evaluation will be conducted with dementia patients, all procedures must be tailored to dementia care. Dementia patients can become nervous when interacting with unfamiliar individuals. Therefore, evaluation sessions should be conducted by, or in the presence of, a caregiver whom the patient trusts.

Additionally, dementia patients will likely require guidance while completing questionnaires. The identity of the guiding person can influence the patient's mood and responses. Care must be taken to minimize stress and ensure that the evaluation environment is familiar and supportive.

[1] Nielsen, J. (1993). Usability Engineering. Academic Press.

[2] Winograd, T. (2001). Interaction Design: Getting the User's Perspective. ACM Press.

[3] Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). "Why should I trust you?" Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

[4] Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence.