AIF-C01 - Applications of foundation models - Section 3.4
Describe methods to evaluate foundation model performance.
Use human evaluation, benchmark datasets, and task-specific metrics to judge a model, and understand the limits of any single metric. Recognise why evaluation must reflect the real task, not a proxy.
Benchmarks, Human evaluation, Task metrics
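To make the point about the limits of any single metric concrete, here is a minimal sketch in Python. It scores three made-up answers against one reference using two common task metrics for question answering: exact match and token-overlap F1. The function names, the reference, and the example answers are all hypothetical and chosen only for illustration; they are not part of the exam blueprint or of any AWS tool.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall on whitespace-split tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical evaluation item: one reference and three candidate answers.
reference = "Amazon Bedrock"
answers = {
    "correct, terse": "amazon bedrock",
    "correct, verbose": "the service is Amazon Bedrock",
    "wrong": "Amazon SageMaker",
}

for label, answer in answers.items():
    em = exact_match(answer, reference)
    f1 = token_f1(answer, reference)
    print(f"{label:18s} exact_match={em:.2f} token_f1={f1:.2f}")
```

Under these assumptions, exact match gives the verbose but correct answer no credit at all, while token F1 gives the wrong answer 0.50, not far below the 0.57 it gives the verbose correct one. Neither number alone separates right from wrong here, which is why a human review of sampled outputs, or a metric chosen to fit the real task, is usually paired with automated scores.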
Examworthy is not affiliated with or endorsed by Amazon Web Services. Original, blueprint-aligned practice material only.