The programme
A reconstruction-first plan run as two stages: design and demonstrate, then rigorous evaluation. Each item is something to understand, rebuild from scratch, and apply. A representative selection is shown here; the full seventeen-week plan is imported from the workbook.
Phase 1: Days 1-30
Foundations and the agent atom
Week 1 — Information retrieval, the evaluation spine, dense versus lexical
Foundations: IR theory and term weighting
Derive TF-IDF and BM25 by hand and explain why each term exists.
Introduction to Information Retrieval (Manning)
Self-test: Why does BM25 have a length-normalisation term, and what breaks without it?
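As a starting point for the hand derivation, here is a minimal sketch of BM25 scoring in plain Python. The function name `bm25_scores`, the toy documents, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration, and the IDF shown is one common smoothed variant that stays non-negative, not the only formulation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: documents longer than average get a larger
        # denominator, so each term occurrence counts for less.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms (low df) receive a higher weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation: repeated occurrences add less and less.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


# Toy corpus (made up): a short on-topic document, a long padded one, an unrelated one.
docs = [
    "bm25 ranks documents".split(),
    ("bm25 " + "filler " * 50).split(),
    "unrelated text here".split(),
]
with_norm = bm25_scores(["bm25"], docs)
without_norm = bm25_scores(["bm25"], docs, b=0.0)  # b=0 switches length normalisation off
```

With `b=0.75` the short document outranks the padded one; with `b=0.0` the two tie, even though one of them is mostly filler. Varying `b` and `k1` on your own data is one way to test your answer to the self-test question.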
Eval: The evaluation spine
Stand up a repeatable retrieval eval harness on your own data.
RAGAS (reference)
Self-test: How will you separate retrieval failure from generation failure?
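One possible shape for such a harness is sketched below. It scores the retriever on its own, before any generation step runs, which is one way to keep retrieval failures visible separately. The names `recall_at_k`, `reciprocal_rank`, `evaluate`, and `toy_retrieve`, along with the labelled queries and document ids, are hypothetical placeholders, and this is not the RAGAS API.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def evaluate(retrieve, labelled_queries, k=5):
    """Run `retrieve` over every labelled query and average the metrics."""
    recalls, rrs = [], []
    for query, relevant_ids in labelled_queries:
        ranked_ids = retrieve(query)
        recalls.append(recall_at_k(ranked_ids, relevant_ids, k))
        rrs.append(reciprocal_rank(ranked_ids, relevant_ids))
    n = len(labelled_queries)
    return {"recall@k": sum(recalls) / n, "mrr": sum(rrs) / n}


# Toy stand-in for a real retriever: a fixed ranking per query (made-up ids).
FAKE_RESULTS = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d2", "d3", "d4"],
}


def toy_retrieve(query):
    return FAKE_RESULTS[query]


labelled_queries = [("q1", {"d1"}), ("q2", {"d4"})]
report = evaluate(toy_retrieve, labelled_queries, k=2)
```

Because the labelled set and the metric code are fixed, swapping `toy_retrieve` for a real retriever gives numbers that are comparable from run to run.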
RAG: Dense retrieval
Explain how dense retrieval encodes meaning and where it fails.
DPR (Karpukhin 2020)
Self-test: Why can two passages with no shared words sit close in embedding space?
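The self-test question can be made concrete with a toy example. The three-dimensional vectors below are invented by hand to stand in for encoder output; a real dense retriever such as DPR would produce them with a trained model. The helper names `cosine` and `token_overlap` are also assumptions for this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def token_overlap(a, b):
    """Number of distinct lowercase tokens two passages share."""
    return len(set(a.lower().split()) & set(b.lower().split()))


# Hand-made vectors (NOT real encoder output): the first two passages are
# placed close together because they mean roughly the same thing.
passages = {
    "The physician prescribed medication": [0.9, 0.1, 0.2],
    "A doctor gave her some pills": [0.8, 0.2, 0.1],
    "Quarterly revenue rose sharply": [0.1, 0.9, 0.3],
}
medical_a, medical_b, finance = list(passages)

shared_words = token_overlap(medical_a, medical_b)
sim_related = cosine(passages[medical_a], passages[medical_b])
sim_unrelated = cosine(passages[medical_a], passages[finance])
```

Here the two medical passages share no tokens, so a purely lexical scorer sees nothing in common, yet their vectors point in nearly the same direction. Whether a trained encoder actually places a given pair this close is an empirical question, and checking it is part of finding where dense retrieval fails.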
Week 4 — Vector search internals, parsing, the agent atom
Agents: ReAct loop, the agent atom
Rebuild the reason-act-observe loop from scratch, no framework.
ReAct (Yao 2022)
Self-test: What are the failure modes of a naive ReAct loop?
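A framework-free loop can be quite small. The sketch below is one possible skeleton, not the implementation from the paper: `scripted_policy` is a hard-coded stand-in for a language-model call, and the `Action: tool[argument]` text format, the `lookup` tool, and the `FACTS` table are all assumptions made for illustration.

```python
import re

# Toy knowledge source and tool registry (both hypothetical).
FACTS = {"capital_of_france": "Paris"}
TOOLS = {"lookup": lambda key: FACTS.get(key, "not found")}

ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")


def scripted_policy(transcript):
    """Stand-in for an LLM call: returns the next reasoning step as text."""
    observations = [s for s in transcript if s.startswith("Observation: ")]
    if not observations:
        return "Thought: I should look this up.\nAction: lookup[capital_of_france]"
    answer = observations[-1][len("Observation: "):]
    return f"Thought: I have what I need.\nFinal Answer: {answer}"


def react(question, policy, tools, max_steps=5):
    """Alternate reasoning, tool calls, and observations until an answer appears."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = policy(transcript)  # reason
        transcript.append(step)
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip(), transcript
        match = ACTION_RE.search(step)  # act
        if match is None:
            observation = "error: no action found"
        else:
            name, arg = match.groups()
            tool = tools.get(name)
            observation = tool(arg) if tool else f"error: unknown tool {name}"
        transcript.append(f"Observation: {observation}")  # observe
    # Step budget exhausted without a final answer.
    return None, transcript


answer, transcript = react("What is the capital of France?", scripted_policy, TOOLS)
```

The `max_steps` cap and the explicit error observations are there because a policy that never emits a final answer, or names a tool that does not exist, would otherwise loop forever or crash. Replacing `scripted_policy` with a real model call is a quick way to see which of those cases show up in practice.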
Phase 3: Days 61-90 and the rigorous-evaluation stage
The FinanceBench capstone
Week 9 — Capstone spec and the dismal baseline
Capstone: Reproduce the baseline
Show why basic RAG fails on filings, with honest numbers.
FinanceBench (Islam 2023)
Self-test: Where exactly does basic RAG break on a 10-K: tables, footnotes, or numbers?