5. Evaluation

5.1 Qualitative Evaluation

Table 2 - Heuristic Evaluation
| Interface | Issue | Heuristic(s) | Frequency (0–4) | Impact (0–4) | Persistence (0–4) | Severity = (F + I + P) / 3 |
| --- | --- | --- | --- | --- | --- | --- |
| Game UI | The color contrast is not distinct, making it difficult for players to see clearly. | Aesthetic and minimalist design | 3 | 2 | 3 | (3 + 2 + 3) / 3 = 2.67 |
| Game Rules | Players think they can step on monsters and do not understand the function of the mushroom, making the game rules unclear. | Help and documentation | 4 | 3 | 4 | (4 + 3 + 4) / 3 = 3.67 |
| Controls | Using the space key for jumping does not align with user habits and is not easy to operate. | Flexibility and efficiency of use | 3 | 2 | 3 | (3 + 2 + 3) / 3 = 2.67 |
| Game Objects | Players believe that brown is a toxic color and touching it will cause them to lose health. | Consistency and standards | 2 | 2 | 3 | (2 + 2 + 3) / 3 = 2.33 |
| Exit Option | There is no clear exit option. | User control and freedom | 3 | 3 | 4 | (3 + 3 + 4) / 3 = 3.33 |
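
The severity formula in Table 2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `severity` and the dictionary layout are assumptions, while the ratings are copied from the table. Sorting by the result gives a simple priority order for fixes.

```python
def severity(frequency: int, impact: int, persistence: int) -> float:
    """Mean of the three 0-4 ratings, rounded to two decimals as in Table 2."""
    for rating in (frequency, impact, persistence):
        if not 0 <= rating <= 4:
            raise ValueError("ratings must be between 0 and 4")
    return round((frequency + impact + persistence) / 3, 2)


# (Frequency, Impact, Persistence) ratings copied from Table 2.
issues = {
    "Game UI": (3, 2, 3),
    "Game Rules": (4, 3, 4),
    "Controls": (3, 2, 3),
    "Game Objects": (2, 2, 3),
    "Exit Option": (3, 3, 4),
}

# Highest severity first; ties keep their original table order.
ranked = sorted(issues, key=lambda name: severity(*issues[name]), reverse=True)
```

Under this ordering the unclear game rules and the missing exit option come first, which matches the two highest severity values in the table.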

5.2 Quantitative Evaluation: NASA TLX & SUS

Objective

This report aims to evaluate the user experience of the platformer puzzle game “Capoo” at low difficulty (L1) and high difficulty (L2) using quantitative methods, comparing workload and usability differences.

Background

NASA TLX is a tool for measuring subjective workload across six dimensions (Hart & Staveland, 1988). SUS is a reliable usability assessment tool (Brooke, 1986). This study combines both methods to analyze the impact of Capoo’s difficulty on player experience.

Goals

  • Quantify the workload (NASA TLX) and usability (SUS) of Capoo at L1 and L2.
  • Use statistical tests to determine the significance of differences.

Methodology

Participants

  • Number: 10 volunteers.
  • Characteristics: Classmates with no specific gaming experience requirements.
  • Selection Method: Random recruitment.

Experimental Design

  • Difficulty Levels: Capoo includes L1 (low difficulty) and L2 (high difficulty).
  • Testing Order:
    • 5 users played L1 first, then L2.
    • 5 users played L2 first, then L1.
    • Counterbalancing the order in this way reduces the influence of learning effects on the comparison.
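
A counterbalanced order like the one above can be assigned as in the following sketch. The source does not say how participants were allocated to the two orders, so the shuffled split, the function name `assign_orders`, and the fixed seed are assumptions made for illustration.

```python
import random


def assign_orders(participants, seed=0):
    """Split participants into two equally sized play-order groups.

    Assumption: allocation is random; the report only states the 5/5 split.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for pid in shuffled[:half]:
        orders[pid] = ("L1", "L2")  # low difficulty first
    for pid in shuffled[half:]:
        orders[pid] = ("L2", "L1")  # high difficulty first
    return orders


orders = assign_orders([f"V{i}" for i in range(1, 11)])
```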

Data Collection

  • Tools:
    • NASA TLX: 6 dimensions (raw scores).
    • SUS: 10-question survey.
  • Procedure: After playing each difficulty level, the user filled out the NASA TLX and SUS forms, resulting in four scores per participant (one NASA TLX score and one SUS score per level).

Scoring Method

  • NASA TLX: Dimension score = (Rating - 1) × 25, Total score = (∑ Dimension scores) / 6.
  • SUS: Odd-numbered questions = Rating - 1, Even-numbered questions = 5 - Rating, Total score = (∑ Score contributions) × 2.5.
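
The two scoring rules can be written as small Python functions. This is a sketch under stated assumptions: the function names are hypothetical, and the 1–5 rating scale for NASA TLX is inferred from the formula (a rating of 1 maps to 0 and a rating of 5 maps to 100), not stated in the report. The example ratings are made up.

```python
def tlx_score(ratings):
    """Raw NASA TLX: six ratings on an assumed 1-5 scale -> mean on 0-100."""
    if len(ratings) != 6:
        raise ValueError("NASA TLX needs exactly six dimension ratings")
    return sum((r - 1) * 25 for r in ratings) / 6


def sus_score(ratings):
    """SUS: ten ratings on a 1-5 scale -> total score on 0-100."""
    if len(ratings) != 10:
        raise ValueError("SUS needs exactly ten item ratings")
    total = 0
    for item, rating in enumerate(ratings, start=1):
        # Odd items contribute (rating - 1); even items contribute (5 - rating).
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


example_tlx = tlx_score([1, 1, 1, 2, 1, 2])   # hypothetical ratings
example_sus = sus_score([3] * 10)             # all-neutral answers
```

Answering every SUS item with the neutral rating 3 yields a score of 50, which is a convenient sanity check for the scoring sheet.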

Data Analysis

  • Tool: Wilcoxon signed-rank test.
  • Significance Level: α = 0.05.

Results

Table 3 - Data Overview
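| User ID | L1 NASA TLX | L2 NASA TLX | L1 SUS | L2 SUS |
| --- | --- | --- | --- | --- |
| V1 | 12.5 | 16.67 | 55 | 50 |
| V2 | 20.83 | 20.83 | 45 | 35 |
| V3 | 29.17 | 33.33 | 55 | 55 |
| V4 | 20.83 | 25 | 45 | 42.5 |
| V5 | 29.17 | 29.17 | 52.5 | 50 |
| V6 | 37.5 | 41.67 | 37.5 | 40 |
| V7 | 33.33 | 37.5 | 42.5 | 45 |
| V8 | 8.33 | 16.67 | 37.5 | 40 |
| V9 | 37.5 | 45.83 | 35 | 35 |
| V10 | 16.67 | 20.83 | 55 | 42.5 |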
  • Averages:
    • L1 NASA TLX: 24.58, L2 NASA TLX: 28.75.
    • L1 SUS: 45.5, L2 SUS: 43.5.

Graphical Representation

Figure 8 - NASA TLX Dimension Comparison
Figure 9 - SUS Score Trends
Figure 10 - Correlation Between SUS and NASA TLX

Statistical Analysis

  • NASA TLX:
    • Wilcoxon test result: W = 36 (n=8, excluding zero values).
    • Critical value (n=8, α=0.05): 3.
    • Conclusion: W > 3, no significant difference.
  • SUS:
    • Wilcoxon test result: W = 24 (n=8, excluding zero values).
    • Critical value (n=8, α=0.05): 3.
    • Conclusion: W > 3, no significant difference.
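
The rank-sum computation behind a Wilcoxon signed-rank test can be sketched as follows. This is an illustration, not the analysis script used for the report: the function name `signed_rank_sums` is hypothetical, the data are the columns of the data overview table, and differences are rounded to two decimals to absorb floating-point noise. The function returns both rank sums; critical-value tables are conventionally compared against the smaller of the two.

```python
def signed_rank_sums(first, second):
    """Return (n, W_plus, W_minus) for paired samples.

    Differences are second - first; zero differences are dropped and tied
    absolute differences share their average rank.
    """
    diffs = [round(b - a, 2) for a, b in zip(first, second)]
    nonzero = [d for d in diffs if d != 0]
    ordered = sorted(nonzero, key=abs)

    ranks = {}  # absolute difference -> (average) rank
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        ranks[abs(ordered[i])] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w_plus = sum((ranks[abs(d)] for d in nonzero if d > 0), 0.0)
    w_minus = sum((ranks[abs(d)] for d in nonzero if d < 0), 0.0)
    return len(nonzero), w_plus, w_minus


# Columns copied from the data overview table (V1 to V10).
L1_TLX = [12.5, 20.83, 29.17, 20.83, 29.17, 37.5, 33.33, 8.33, 37.5, 16.67]
L2_TLX = [16.67, 20.83, 33.33, 25, 29.17, 41.67, 37.5, 16.67, 45.83, 20.83]
L1_SUS = [55, 45, 55, 45, 52.5, 37.5, 42.5, 37.5, 35, 55]
L2_SUS = [50, 35, 55, 42.5, 50, 40, 45, 40, 35, 42.5]

tlx_result = signed_rank_sums(L1_TLX, L2_TLX)
sus_result = signed_rank_sums(L1_SUS, L2_SUS)
```

With these columns the sketch gives rank sums of 36 and 0 for NASA TLX and 9 and 27 for SUS, each with n = 8. Because critical values are conventionally compared against the smaller rank sum, it is worth confirming which sum the reported W denotes before drawing the conclusion.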

Discussion

Interpretation of Results

  • Workload: L2 NASA TLX (28.75) is slightly higher than L1 (24.58), mainly due to increased physical demands (e.g., jumping) and mental effort (e.g., puzzle complexity), but the difference is not significant.
  • Usability: L1 SUS (45.5) is slightly higher than L2 (43.5), suggesting that the increased difficulty of L2 slightly reduced perceived usability, but not significantly.

Comparison with Expectations

  • It was expected that L2 would have a higher workload and lower usability. The observed trend aligns with expectations but does not reach statistical significance, possibly due to insufficient difficulty differences.

Design Insights

  • Increase the difficulty of L2 by making jumps and puzzles more challenging to amplify workload differences.
  • Optimize L2’s control smoothness (SUS Q1 and Q6 had lower scores) to reduce inconsistencies.

Limitations

  • Small sample size (10 participants) limits statistical power.
  • The difficulty difference between L1 and L2 may not be large enough to fully reflect puzzle and platforming challenges.

Conclusion

Capoo’s L2 workload is slightly higher than L1 (28.75 vs 24.58), and its SUS score is lower than L1 (43.5 vs 45.5), but neither difference is statistically significant (NASA TLX W = 36, SUS W = 24, p > 0.05). It is recommended to enhance L2’s difficulty and optimize control experience to improve player immersion.


Appendix

Game Design Updates

  1. Enhance L2 Jumping Difficulty: Increase platform height and introduce moving obstacles to heighten physical demand.
  2. Optimize Puzzle Consistency: Standardize puzzle hint styles to improve SUS Q4 (consistency) scores.
  3. Implement Dynamic Difficulty Adjustment: Adjust jumping and puzzle complexity based on player performance.
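
One possible shape for the dynamic difficulty adjustment in item 3 is sketched below. Everything in it is hypothetical: the report does not specify which performance signals or thresholds would be used, so the death count, clear time, target time, and step size are placeholder assumptions.

```python
def adjust_difficulty(level, deaths, clear_time_s, target_time_s=60.0, step=0.1):
    """Return a new difficulty in [0.0, 1.0] based on the last attempt.

    Assumed rule: ease off after many deaths or a slow clear, and ramp up
    after a fast, deathless clear. All thresholds are illustrative.
    """
    if deaths >= 3 or clear_time_s > 1.5 * target_time_s:
        level -= step
    elif deaths == 0 and clear_time_s < 0.75 * target_time_s:
        level += step
    # Clamp so the value can drive jump gaps or puzzle size safely.
    return min(1.0, max(0.0, round(level, 2)))
```

The returned value could then scale parameters such as platform height or obstacle speed, which ties the adjustment back to items 1 and 2.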

5.3 Code Testing and Debugging

We used black-box testing with equivalence partitioning to validate the game. Test cases were designed based on input types and game states, without looking at the internal code. We focused on transitions (e.g. main menu to level select), controls (e.g. movement, jumping), and interactions (e.g. keys, traps, water). Each feature was tested using representative inputs from different equivalence classes to ensure correct behavior.
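
As an illustration of equivalence partitioning, the sketch below picks one representative input per class for the level-selection keys. The function `move_selection` is a hypothetical stand-in for the game's handler, not the project's code; the test cases only look at inputs and outputs, in keeping with the black-box approach.

```python
def move_selection(index, key, level_count):
    """Hypothetical stand-in for the level-select key handler."""
    if key == "LEFT":
        return max(0, index - 1)
    if key == "RIGHT":
        return min(level_count - 1, index + 1)
    return index  # other keys leave the selection unchanged


# One representative input per equivalence class:
# (class name, starting index, key, expected index)
CASES = [
    ("LEFT in the middle", 2, "LEFT", 1),
    ("LEFT at first level", 0, "LEFT", 0),
    ("RIGHT in the middle", 2, "RIGHT", 3),
    ("RIGHT at last level", 4, "RIGHT", 4),
    ("unrelated key", 2, "A", 2),
]


def run_cases(level_count=5):
    """Return the names of the classes whose representative input failed."""
    failures = []
    for name, index, key, expected in CASES:
        if move_selection(index, key, level_count) != expected:
            failures.append(name)
    return failures
```

An empty list from `run_cases()` means every class behaved as expected; a boundary class such as "LEFT at first level" is where off-by-one errors usually show up.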

Table 3 - Game State and Scene Testing
| ID | Description | Precondition | Test Steps | Expected Result |
| --- | --- | --- | --- | --- |
| TC-01 | Game Launch and Main Menu Test | Launch the game | 1. Start the game; 2. Observe the main menu | Game resources load correctly; “Capoo” title appears on the main menu; “Press ENTER To Start” prompt is shown; Background and cloud animation display properly |
| TC-02 | Transition from Main to Level Menu | Game is at the main menu | 1. Press ENTER on the main menu | Game state changes from START to LEVEL_SELECT; Level selection screen displays correctly; “Use LEFT/RIGHT To Choose Press SPACE To Start” prompt appears; All level options are visible, with the current one highlighted |
| TC-03 | Level Selection Functionality | Game is at level selection screen | 1. Press LEFT arrow key; 2. Press RIGHT arrow key; 3. Observe changes | LEFT arrow decreases selected level index (unless at first level); RIGHT arrow increases selected level index (unless at last level); Selected level is highlighted |
| TC-04 | Level Loading Function | At level selection, one level selected | 1. Press SPACE | Game state changes from LEVEL_SELECT to PLAYING; Selected level loads and displays correctly; Character appears at initial position; Level background, terrain, and items are displayed correctly |

Table 4 - Character Movement and Control Testing
| ID | Description | Precondition | Test Steps | Expected Result |
| --- | --- | --- | --- | --- |
| TC-05 | Basic Character Movement | Game is in PLAYING state, character on ground | 1. Press LEFT; 2. Release LEFT; 3. Press RIGHT; 4. Release RIGHT | Character moves left and faces left when LEFT key is pressed; Character moves right and faces right when RIGHT key is pressed; Character stops moving when key is released |
| TC-06 | Flying When Merged with Potion | Game is in PLAYING state, character merged with potion | 1. Press SPACE; 2. Hold SPACE; 3. Release SPACE | Character flies upward when SPACE is pressed; Keeps flying while SPACE is held; Falls when SPACE is released |
| TC-07 | Separation from Potion | Game is in PLAYING state, character merged with potion | 1. Press S | Character separates from potion; Potion pops upward; Merge wall disappears (merge layer invisible) |
| TC-08 | Potion Ejection After Separation | Game is in PLAYING state, potion and character separated | 1. Ensure potion is landed or stuck to wall; 2. Press A to test left ejection; 3. Press D for right | Potion can be ejected from ground/wall; A triggers left-upward ejection; D triggers right-upward ejection; Potion has proper initial velocity and gravity scaling |
| TC-09 | Climbing Feature | Game is in PLAYING state, near climbable wall | 1. Move character near wall; 2. Press UP; 3. Press DOWN | Wall is detected as climbable; UP moves character upward; DOWN moves character downward; Gravity doesn’t apply while climbing |
| TC-10 | Automatic Remerge with Potion | Game is in PLAYING state, potion and character separated | 1. Move character close to potion | Potion automatically merges when close; Potion appears on back; Merge wall becomes visible again |

Table 5 - Game Mechanics and Interaction Testing
| ID | Description | Precondition | Test Steps | Expected Result |
| --- | --- | --- | --- | --- |
| TC-11 | Spring Mechanism | Game is in PLAYING state, level contains spring bed | 1. Move character onto spring bed | Character gains upward speed; Bounce height exceeds regular jump; Spring sound plays |
| TC-12 | Collecting Keys | Game is in PLAYING state, level contains visible key | 1. Move character to contact key | Key disappears on contact; Key count increases; UI updates; Key collection sound plays |
| TC-13 | Level Completion | Game is in PLAYING state, all keys collected | 1. Move character to touch the flag | State changes to LEVEL_COMPLETE; Completion screen shows; “Level Complete!” and continuation options appear; Completion sound plays |
| TC-14 | Switch and Mechanism Wall Interaction | Game is in PLAYING state, level contains switch and wall | 1. Touch switch; 2. Observe wall; 3. Touch switch again | Switch changes state; Wall moves to target; Pressing again returns wall to original; Switch sound plays |
| TC-15 | Water Hazard | Game is in PLAYING state, level contains water area | 1. Lead character into water | Character dies; Message “Cats dissolve easily in water!” shown; Respawn at starting point; Death sound plays |
| TC-16 | Trap Hazard | Game is in PLAYING state, level contains traps | 1. Lead character into trap | Character dies; Message “You are trapped!” shown; Respawn at starting point; Death sound plays |
| TC-17 | Potion Falling into Water | Game is in PLAYING state, potion and character separated, level has water | 1. Make potion fall into water | Game resets; Message “No water with my pot!” shown; Character and potion respawn and remerge; Death sound plays |
| TC-18 | Ice Surface Limitation | Game is in PLAYING state, level contains ice surface | 1. Move character onto ice; 2. Attempt to jump | Jumping is disabled; Message “Ice! You can’t jump!” shown |

Table 6 - UI and Accessibility Testing
| ID | Description | Precondition | Test Steps | Expected Result |
| --- | --- | --- | --- | --- |
| TC-19 | Help Interface | Game is in PLAYING state | 1. Press H to show help; 2. Press any key to close | H key shows help overlay with game controls; Any key hides help and returns to game |
| TC-20 | Level Reset | Game is in PLAYING state | 1. Press R | Character resets to start; All keys reset; Switches and walls reset; Message “Restarting level…” shown; Death sound plays |
| TC-21 | Return to Level Selection | Game is in PLAYING state | 1. Press ESC | State changes to LEVEL_SELECT; Character and level state saved; Level selection screen shown |
| TC-22 | Post-Level Completion Options | Game is in LEVEL_COMPLETE state | 1. Press any key (except ESC) to continue; 2. Press ESC after completion | If not last level, any key starts next level; If last level, any key returns to selection; ESC returns to selection in all cases |
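
The state transitions exercised by TC-02, TC-04, TC-13, TC-21, and TC-22 can be summarized in a small transition function. This is a sketch for reading the test tables, not the game's implementation: the state names come from the expected results, while the function name `next_state` and the event label `FLAG_TOUCHED` are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    START = auto()
    LEVEL_SELECT = auto()
    PLAYING = auto()
    LEVEL_COMPLETE = auto()


def next_state(state, event, is_last_level=False):
    """Return the state that follows `event`, per the expected results."""
    if state is State.START and event == "ENTER":           # TC-02
        return State.LEVEL_SELECT
    if state is State.LEVEL_SELECT and event == "SPACE":    # TC-04
        return State.PLAYING
    if state is State.PLAYING:
        if event == "ESC":                                  # TC-21
            return State.LEVEL_SELECT
        if event == "FLAG_TOUCHED":                         # TC-13, keys collected
            return State.LEVEL_COMPLETE
    if state is State.LEVEL_COMPLETE:                       # TC-22
        if event == "ESC" or is_last_level:
            return State.LEVEL_SELECT
        return State.PLAYING  # any other key starts the next level
    return state  # unhandled events leave the state unchanged
```

Writing the transitions out this way makes it easy to see which state and key combinations the test cases cover and which ones, such as ESC on the main menu, they do not.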

5.4 Summary

The evaluation indicates that Capoo remains cognitively manageable for casual players as difficulty increases: mean raw NASA TLX scores stayed below 30 on a 0–100 scale at both difficulty levels. Mean SUS scores in the mid-40s, however, leave room for usability improvement. While the workload and usability metrics did not show significant shifts between L1 and L2, qualitative feedback provided actionable insights for polishing level design and improving onboarding. Heuristic evaluation revealed critical UI flaws, most notably unclear game rules and the missing exit option, that were successfully addressed. Ongoing testing ensured stability, performance, and a better user experience across devices.