AI directory search

Search across educators, skills, and resources.

Use this when you know the topic you need: Claude Code, MCP, evals, RAG, agents, product, coding, prompting, foundations, or model internals.

13 matches for "Evaluation"

Video matches

Watch first when you want a fast feel for the topic before opening courses, docs, or profiles.

LLM evaluation with W&B

Weights & Biases · evals, llm apps, observability, mlops

AI Evals for Engineers & PMs

Hamel Husain and Shreya Shankar · evals, product, llm reliability

Educators

Ben Lorica

The Data Exchange · Intermediate

Good practitioner interviews across data, ML, and AI engineering.

Skills

Data systems, ML engineering, AI trends

Greg Kamradt

Data Independent AI tutorials · Beginner to intermediate

Practical walkthroughs for retrieval, LLM application patterns, and common developer questions.

Skills

RAG, LLM apps, Prompting, Evaluation

Providers and platforms

Weights & Biases

W&B Courses · Intermediate

Good for builders who need to measure, debug, and improve LLM apps rather than just demo them.

Topics

LLM apps, Evals, Experiment tracking, MLOps

Useful for teams building repeatable AI product processes around prompts, datasets, and evaluations.

Topics

Prompt management, Evals, LLM workflows

A strong foundation for people who need the math and modeling basics under applied AI.

Topics

ML foundations, Supervised learning, Unsupervised learning, Model evaluation

OpenRouter

OpenRouter docs · Beginner to intermediate

Useful for learning model comparison, routing, fallback behavior, and API-compatible experimentation across proprietary and open model families.

Topics

Model routing, Model comparison, Auto Router, GPT models, Claude models, Gemini, Llama, Mistral, DeepSeek, Qwen, API examples, Evaluation

Resources

Building and Evaluating Advanced RAG Applications

Short course · DeepLearning.AI · Intermediate

You already know basic RAG and need better retrieval, evaluation, and production-quality patterns.

rag, evals, retrieval, llm apps, ai engineering

Hugging Face smol-course

Free course · Hugging Face · Intermediate

You want a current structured course on instruction tuning, fine-tuning, and evaluation around compact open models.

fine-tuning, post-training, open models, evaluation, smollm

W&B LLM Evaluation Course

Free course · Weights & Biases · Intermediate

You need to debug and measure LLM app quality.

evals, llm apps, observability

AI Evals for Engineers & PMs

Cohort course · Hamel Husain and Shreya Shankar · Intermediate

You are shipping AI features and need a serious evaluation workflow.

evals, product, llm reliability

Data Independent AI tutorials

YouTube tutorials · Greg Kamradt · Beginner to intermediate

Use this when you want Greg Kamradt's material on RAG and related AI skills.

RAG, LLM apps, Prompting, Evaluation