Final Project

You may use Generative AI for this assignment, but be careful. There are lots of little decisions to make that can significantly affect the results. Make sure you fully understand any LLM-generated code that you use and don’t let the model make decisions for you.

Proposal

The goal of the final project is to build an LLM-powered agent that does something new. You should work in a team of 2-4 people, though larger teams will be expected to take on something more ambitious.

You should submit a proposal that answers the following questions:

What is the goal of your project?
What is the technical approach that you will take? You should propose some subset of the following: scraping a novel dataset, generating synthetic data, building a retrieval-augmented generation system, fine-tuning a model, or designing an agent architecture that goes beyond the homework assignments. A project that repeats a homework assignment with different prompts is insufficiently ambitious.
How will you evaluate your project?
How will you know if you’ve been successful?

Project Checkpoint: Preliminary Evaluations

Design a preliminary evaluation suite that includes at least 10 problem instances with solutions. Although you may not be able to perform an end-to-end test at this stage, you should have some sense of the difficulty of these problems by manually trying them out with an LLM. Aim to cover a range of problem difficulties, including some trivial problems and some that may be unsolvable with the models currently available.

Specifically, think of a trivial problem as one that the available model can solve without any system design on your part (i.e., without an agent, retrieval, or few-shot prompting).
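
As an illustration only, one way to organize such a suite is sketched below in Python. Everything here is an assumption rather than a required format: the `EvalProblem` fields, the difficulty labels, the `check_suite` and `accuracy_by_difficulty` helpers, and the exact-match scoring rule are all hypothetical, and many projects will need a different notion of correctness.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable

# Hypothetical difficulty labels. The assignment only asks for a range
# that runs from trivial to possibly unsolvable.
DIFFICULTIES = ("trivial", "moderate", "hard", "possibly_unsolvable")


@dataclass(frozen=True)
class EvalProblem:
    problem_id: str
    prompt: str      # what is given to the model or agent
    solution: str    # reference answer written by the team
    difficulty: str  # one of DIFFICULTIES, judged by manual trials


def check_suite(problems: list[EvalProblem], min_size: int = 10) -> list[str]:
    """Return a list of warnings; an empty list means the basic checks pass."""
    warnings = []
    if len(problems) < min_size:
        warnings.append(f"only {len(problems)} problems; need at least {min_size}")
    counts = Counter(p.difficulty for p in problems)
    for label in counts:
        if label not in DIFFICULTIES:
            warnings.append(f"unknown difficulty label {label!r}")
    # The suite should span both ends of the difficulty range.
    for required in ("trivial", "possibly_unsolvable"):
        if counts[required] == 0:
            warnings.append(f"no problems labeled {required!r}")
    return warnings


def accuracy_by_difficulty(
    problems: list[EvalProblem], solver: Callable[[str], str]
) -> dict[str, float]:
    """Score a solver with exact string match (an assumed metric)."""
    correct: Counter = Counter()
    total: Counter = Counter()
    for p in problems:
        total[p.difficulty] += 1
        if solver(p.prompt).strip() == p.solution.strip():
            correct[p.difficulty] += 1
    return {label: correct[label] / total[label] for label in total}
```

Here `solver` can be any function that maps a prompt to an answer string, such as a thin wrapper around a bare model call for the "trivial" baseline or around your full system later on. Reporting accuracy per difficulty label makes it easier to see whether your initial difficulty judgments held up.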

We expect you will need to refine and expand your evaluation suite as you develop your system. As you do, reflect on what your preliminary evaluation suite got wrong.

Presentations

TBD

Final Submission

TBD