Comments on 3. Evaluation

Last modified by Bernd Dudzik on 2025/11/03 23:39

  • Mark Neerincx
    Mark Neerincx, 2025/10/21 13:02

    1. Prototype Scope: “Is the Artifact or Prototype described with a clear scope, indicating which specific Requirements and Design Specifications it implements for testing?”

    Applies to: Prototype

    • Prototype:
      • Meets Criterion:
        • Clearly scopes a vertical prototype (focusing on specific function(s)) targeting a single interaction (companionship initiation) with four design variants (movement × sound).
        • Acknowledges that prototypes are animations (not interactive), which bounds the implemented functionality.
      • (Potential) Improvements
        • Does not state which Functions from the Design Specifications were implemented (see the comment on the Design Specification regarding the Use Case description format; an old format version was used).

    2. Methodological Rigor: Is the chosen Evaluation Method clearly described and justified as being appropriate for testing the specific Claims outlined in the Specification stage?

    Applies to: Test

    • Test:
      • Meets Criterion:
        • Proposes a between-subjects online study with random assignment; outlines participants and tasks succinctly.
        • Includes an attention check and describes materials and procedure steps.
      • (Potential) Improvements
        • Control of stimulus properties (e.g., audio levels, video duration, display conditions) is not described.

    3. Clarity of Measures: Are the Measures used in the test clearly defined and directly linked to the operationalization of the Claims?

    Applies to: Test

    • Test:
      • Meets Criterion:
        • Defines primary measure as correct intention identification; includes clarity/appropriateness ratings and selection count as a proxy for ambiguity.
      • (Potential) Improvements
        • The mapping from measures to specific (high-level) Claims remains implicit.
        • No manipulation checks confirming that the movement/sound differences are perceived as intended.

    4. Results and Claim Validation: Do the Evaluation Results provide clear empirical evidence that either supports or refutes the tested Claims?

    Applies to: Test

    • Test:
      • Meets Criterion:
        • Results are reported and claim selection is mentioned.
      • (Potential) Improvements
        • The relation between claims, measures, and results could be explicated in more depth.

    5. Discussion of Results: Is there a thorough discussion of the Evaluation Results, including an analysis of the study's limitations and any unexpected findings?

    Applies to: Test

    • Test:
      • Meets Criterion:
        • A discussion and limitations are provided.
      • (Potential) Improvements
        • The relations to the foundation could be further elaborated.

    6. Iterative Feedback Loop: Does the evaluation conclude with a clear analysis of how the Evaluation Results could inform the next iteration of the Foundation or Specification?

    Applies to: Test

    • Test:
      • Meets Criterion:
        • The need for more research into specific movement-sound combinations is mentioned.
      • (Potential) Improvements
        • No plan for how the findings will be used to select among design variants or to update the Requirements, Claims, or Interaction Patterns.

     

  • Bernd Dudzik
    Bernd Dudzik, 2025/11/03 23:36

    Feedback on Revised Draft

    1. Prototype Scope: "Is the Artifact or Prototype described with a clear scope, indicating which specific Requirements and Design Specifications it implements for testing?"

    Applies to: Prototype

    Prototype

    Meets the criterion:

    • Vertical prototype isolates companionship initiation with four controlled variants (movement × sound) rendered as animations.
    • Clearly states what is simulated (non-interactive video) and what aspect is tested (expressive layer).

    (Potential) Improvements:

    • Map each variant to requirement/claim IDs (C02/C03) and specify preconditions and dependencies; define expected Effects per variant.

    2. Methodological Rigor: "Is the chosen Evaluation Method clearly described and justified as being appropriate for testing the specific Claims outlined in the Specification stage?"

    Applies to: Test

    Test

    Meets the criterion:

    • Between-subjects design with random assignment to one video; procedure and materials laid out with attention check.
    • Stimulus control (equal loudness, duration, camera angle) is described.

    (Potential) Improvements:

    • Justify the proxy sample more extensively; add a sample-size rationale; expand the randomization and screening/exclusion rules; consider stratification.
    • Pre-register the analysis and define the handling of inattentive responses beyond a single check.

    3. Clarity of Measures: "Are the Measures used in the test clearly defined and directly linked to the operationalization of the Claims?"

    Applies to: Test

    Test

    Meets the criterion:

    • Primary measure: correct identification of intent; secondary: clarity/appropriateness ratings; composite score with specified weights.
    • Error pattern analysis connects misinterpretations to design choices.

    (Potential) Improvements:

    • No notes.

    4. Results and Claim Validation: "Do the Evaluation Results provide clear empirical evidence that either supports or refutes the tested Claims?"

    Applies to: Test

    Test

    Meets the criterion:

    • Reports per-condition differences in interpretation and no significant differences in clarity/appropriateness; identifies the best-performing combination.
    • Uses composite scoring to synthesize outcomes.

    (Potential) Improvements:

    • No notes.

    5. Discussion of Results: "Is there a thorough discussion of the Evaluation Results, including an analysis of the study's limitations and any unexpected findings?"

    Applies to: Test

    Test

    Meets the criterion:

    • Discusses trade-off between expressiveness and misinterpretation; positions subtle cues as more comfortable but potentially ambiguous.
    • Reflects on limitations of video-based evaluation and young adult sample.

    (Potential) Improvements:

    • Propose concrete design corrections; discuss ecological validity and next steps for live interaction.
    • Relate findings to HF concepts (clarity of intent, social comfort) to adjust IDPs.

    6. Iterative Feedback Loop: "Does the evaluation conclude with a clear analysis of how the Evaluation Results could inform the next iteration of the Foundation or Specification?"

    Applies to: Test

    Test

    Meets the criterion:

    • Selects best-performing video as baseline for physical prototype; plans to add verbal/musical cues and test with PwD.
    • Outlines movement toward real-world context tests.

    (Potential) Improvements:

    • Translate the selection into updated Claims (expected clarity/appropriateness targets) and Requirements (expressive profile parameters).