AI directory search

Search across educators, skills, and resources.

Use this when you know the topic you need: Claude Code, MCP, evals, RAG, agents, product, coding, prompting, foundations, or model internals.

13 matches for "Evaluation"

Video matches

Watch first when you want a fast feel for the topic before opening courses, docs, or profiles.

LLM evaluation with W&B

Weights & Biases · evals, llm apps, observability, mlops

AI Evals for Engineers & PMs

Hamel Husain and Shreya Shankar · evals, product, llm reliability

Educators

Ben Lorica

The Data Exchange · Intermediate

Good practitioner interviews across data, ML, and AI engineering.

Skills

Data systems, ML engineering, AI trends

Greg Kamradt

Data Independent AI tutorials · Beginner to intermediate

Practical walkthroughs for retrieval, LLM application patterns, and common developer questions.

Skills

RAG, LLM apps, Prompting, Evaluation

Providers and platforms

Weights & Biases

W&B Courses · Intermediate

Good for builders who need to measure, debug, and improve LLM apps rather than just demo them.

Topics

LLM apps, Evals, Experiment tracking, MLOps

Humanloop

Humanloop Blog and Docs · Intermediate

Useful for teams building repeatable AI product processes around prompts, datasets, and evaluations.

Topics

Prompt management, Evals, LLM workflows

Stanford CS229

Stanford CS229 Machine Learning · Intermediate

A strong foundation for people who need the math and modeling basics under applied AI.

Topics

ML foundations, Supervised learning, Unsupervised learning, Model evaluation

OpenRouter

OpenRouter docs · Beginner to intermediate

Useful for learning model comparison, routing, fallback behavior, and API-compatible experimentation across proprietary and open model families.

Topics

Model routing, Model comparison, Auto Router, GPT models, Claude models, Gemini, Llama, Mistral, DeepSeek, Qwen, API examples, Evaluation

Resources

Building and Evaluating Advanced RAG Applications

Short course · DeepLearning.AI · Intermediate

You already know basic RAG and need better retrieval, evaluation, and production-quality patterns.

rag, evals, retrieval, llm apps, ai engineering

Hugging Face smol-course

Free course · Hugging Face · Intermediate

You want a current structured course on instruction tuning, fine-tuning, and evaluation around compact open models.

fine-tuning, post-training, open models, evaluation, smollm

W&B LLM Evaluation Course

Free course · Weights & Biases · Intermediate

You need to debug and measure LLM app quality.

evals, llm apps, observability

AI Evals for Engineers & PMs

Cohort course · Hamel Husain and Shreya Shankar · Intermediate

You are shipping AI features and need a serious evaluation workflow.

evals, product, llm reliability

Data Independent AI tutorials

YouTube tutorials · Greg Kamradt · Beginner to intermediate

Use this when you want Greg Kamradt's material on RAG and related AI skills.

RAG, LLM apps, Prompting, Evaluation