Instructor: Robin Jia
Large language models (LLMs) are modern engineering marvels that have revolutionized natural language processing. Despite this success, there are still many open questions surrounding how and why LLMs work. This class will cover current research that considers LLMs as scientific objects of study. We will consider three complementary perspectives on understanding LLMs. First, we will analyze the internal operations of LLMs to shed light on how their predictions are computed. Second, we will study LLMs as black boxes and aim to discover principles that govern their behavior. Finally, we will survey external data-related factors that shape the general tendencies of LLMs. By understanding these different perspectives, students will develop a fuller understanding of modern research on LLMs.
Course Staff
Robin Jia
Instructor
Office hours: Thursday 11am-12pm
Location: SAL 236
Johnny Wei
Teaching Assistant
Office hours: Monday and Friday 3-4pm
Location: MCB Lobby
Logistics
- Assignments: Submit all written assignments, including written reports for roles and project-related write-ups, on Brightspace. Grades and feedback will also be provided on Brightspace.
- General discussion: Please use the official course Slack channel for general questions.
- Other discussion: Email Robin and Johnny (robinjia@ and jtwei@) or come to office hours to discuss individual matters, such as project ideas or grading.
Prerequisites
This course is designed for students currently pursuing research on large language models. Students are expected to be comfortable reading and presenting NLP research papers. In terms of coursework, familiarity with natural language processing at the level of CSCI 544 (Applied Natural Language Processing) is expected.
The course’s recommended NLP textbook is Jurafsky and Martin’s Speech and Language Processing, whose third edition is available online and is very current.
Schedule
Format of Classes
After the first four classes, classes will revolve around student presentations of papers. For each paper, students will be assigned different roles. We will go one by one through each role, and the corresponding student will give a short presentation in this role.
Main paper roles
For each “main” paper, multiple students will play different roles. Below is the complete list of roles, in presentation order.
Proposer: Proposes the research in the paper to a funding agency.
- Write-up: A 1-2 page report that answers each of the following questions.
These questions are a subset of the questions from the Heilmeier Catechism,
an often-used set of questions for evaluating research proposals, such as grant proposals or fellowship applications.
- What are you trying to do? Articulate your objectives using absolutely no jargon.
- How is it done today, and what are the limits of current practice?
- What is new in your approach and why do you think it will be successful? (Keep this brief to the “pitch”–no need to describe the method in detail because that will be done in the main presentation.)
- Who cares? If you are successful, what difference will it make?
- What are the risks? (I interpret this to mean, how might this proposed work not succeed? What will you do about it if that happens?)
- What are the mid-term and final “exams” to check for success?
- Presentation: ~3 minute presentation covering the first four points. To save time, we will skip over the “risks” and “exams” sections.
Main Presenter: Presents the paper's methods and results.
- Write-up: Submit your slides, no separate write-up.
- Presentation: 10 minute slide-based presentation of the paper’s methods and results. Answer the following questions:
- What is the main method proposed by this paper?
- What are the baselines (if applicable)?
- What are the main experiments and results? You do not need to cover all experiments in the paper, just choose the most important ones.
- What conclusions can be drawn from the results?
Archaeologist: Compares and contrasts the current paper with relevant prior work.
- Write-up: 1-2 page report that discusses at least 3 related papers that were published before the current paper (excluding all papers that are scheduled to be discussed in class).
At least one prior paper must not be cited by the current paper (indicate which one this is in your write-up).
For each prior paper, describe:
- What does the prior work do? Give a brief summary.
- In what ways is this prior work similar to the current paper?
- What are the key differences?
- In what ways is the current paper “novel” compared to this previous paper?
- Do the prior work and the current work come to similar or different conclusions?
- Presentation: Choose one related paper and give a ~3 minute oral presentation about how it is related to the main paper.
Reviewer: Writes a review of the paper, identifying both its strengths and weaknesses.
- Write-up: A 1-2 page review in the format of ACL Rolling Review (somewhat abridged for the purposes of the class).
The review should answer each of the following questions in separate sections.
Please refer to the ARR Review Form page for details about each section, and the ARR Reviewer Tutorial for more advice on how to write good reviews.
- Paper Summary
- Summary of Strengths
- Summary of Weaknesses
- Comments/Suggestions (no need to flag typos, though this would also be part of the normal ARR review form)
- Soundness score (1-5)
- Overall assessment (1-5)
- Presentation: ~3 minute presentation of your review. You may optionally share your screen to show your review and/or parts of the paper related to your review.
Visionary: Brainstorms follow-up research and products based on the paper.
- Write-up: 1 page report detailing at least one idea for each of the following two types of future work:
- Research: What is a set of natural next research questions to ask? How could the authors of this paper answer these questions?
- Product: How could this research be made into the basis for a new product? You could imagine it being useful in a corporate setting, in a non-profit, for government use, etc. You should envision a specific use case for this product, and describe how the research paper would help or enable that specific application.
- Presentation: ~3 minute oral presentation of your future work ideas (visual aids are optional but encouraged).
Non-main paper roles
On some days, we will have papers that serve as background material or bonus material. For these special papers, there will only be two special roles:
Summarizer: Presents a summary of a background or bonus paper.
- Write-up: Submit your slides, no separate write-up.
- Presentation: 5-10 minute slide-based summary of the paper. The summary should answer the following questions:
- What is the goal of this paper?
- At a high level, what is the paper’s methodology?
- What are the main experiments and results of the paper?
- What conclusions can be drawn from the results?
Connector: Draws connections between the background/bonus paper and the main papers.
- Write-up: 1 page description of the connections between this paper and all main papers for that day (note that the Connector is expected to read both the background/bonus paper they are covering and all main papers for that day). Have a separate paragraph for each main paper. Within each paragraph, discuss:
- What themes does this paper share with the main papers?
- In what ways is this paper different from the main papers?
- (For background papers) How does this paper provide context to understand the main papers?
- (For bonus papers) How does this paper enhance our understanding of the main papers?
- Presentation: ~3 minute oral presentation summarizing your written report.
We will also revisit parts of the Llama 3 paper on a couple of occasions. When we do this, there will be another special role:
Re-examiner: Investigates how Llama 3 handles the challenges discussed in that day's readings.
- Write-up: 1 page summary of the key challenges discussed in the day’s main papers (note that this requires reading all main papers). Then, discuss how Llama 3 deals with those challenges, and whether this seems to be a good choice given what is discussed in the main papers.
- Presentation: ~3 minute oral presentation of the written document.
Concluder Role
Finally, at the end of each class, the Concluder is responsible for reading all main papers from that day and summarizing the connections between them.
Concluder: Summarizes the relationships between all main papers.
- Write-up: 1-2 page report describing how the main papers are connected. For each paper, answer the following:
- What themes does this paper share with the other papers?
- In what ways does this paper support the other papers?
- In what ways does this paper disagree or present a different narrative than some of the other papers? Are these narratives mutually incompatible, and why?
- If there are disagreements, which side do you find more convincing, and why?
- Presentation: ~3 minute oral presentation of these connections. If there are disagreements between papers, stake out a clear position to seed further discussion.
Grading
Grades will be based on role-based written reports (25%) and presentations (25%), in-class discussion (10%), and a final project (40% total).
Roles (25% written reports, 25% presentations; 50% total). Students will play different roles (as described above) on different days. Over the course of the semester, each student will play six unique roles on six different days. One of these roles must be the Main Presenter role. All written reports are due by the time class starts (4:00pm). Each role’s grade will contribute equally to the overall grade.
Class discussion participation (10%). Students are expected to participate in class discussions even when they have no assigned role. This includes asking questions during presentations as well as voicing opinions on discussion topics.
Final project (40% total). Students must complete a final research project on a topic related to the class. Projects may be conducted individually or in groups of up to three.
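As a rough sanity check on the percentages above, the following sketch adds up the component weights and computes each role's share of the overall grade. The component names, the 0-100 score scale, and the simple weighted average are assumptions made for illustration; the project weights are the ones listed in the Final project section.

```python
# Weights (in percent) taken from the grading breakdown.
# The key names are illustrative, not official.
WEIGHTS = {
    "role_reports": 25,
    "role_presentations": 25,
    "discussion": 10,
    "project_proposal": 5,
    "project_midterm_report": 10,
    "project_final_presentation": 10,
    "project_final_report": 15,
}
NUM_ROLES = 6  # each student plays six roles over the semester

# Each role contributes equally to the 50% allotted to roles.
PER_ROLE_SHARE = (
    WEIGHTS["role_reports"] + WEIGHTS["role_presentations"]
) / NUM_ROLES


def course_grade(scores):
    """Weighted average of component scores (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

Under these assumptions, each of the six roles accounts for roughly 8.3% of the overall grade, and the four project components add up to the stated 40%.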
Final project
Students must complete a final research project on a topic related to the class, either individually or in groups of up to three. This project is expected to include novel research that studies a scientific question about language models (which may or may not be “large,” depending on resource constraints). While projects may involve querying closed-source models like ChatGPT, all projects must also study some open-weight language models (i.e., weights are released, but full training details may not be). Please come to office hours or email me if you have questions related to choosing a project direction.
The final project is worth 40% of the total grade. Points will be allocated as follows:
Project proposal (5%). Students should submit a ~2-page proposal for their project by the end of Week 5 (September 27). The proposal should describe the goal of the project and include a survey of related work. When reading these proposals, I will be looking for the following:
- Clearly and precisely state your problem statement or goal. Use mathematical notation when appropriate.
- Summarize what is known in the literature about this problem and previous approaches.
- Describe an idea for your method. It does not need to be guaranteed to work, but it should come with a clear plan of how you would carry this out.
- Describe what resources you will need (compute, data, models) and whether you have access to these.
- State why this project is relevant to the course themes (broadly construed), if not obvious.
Project midterm report (10%). Students should submit a 3-4 page progress report for their project by the end of Week 10 (November 1). This should describe the project’s goals (which may have changed since the proposal), initial results, and a concrete plan of what will be done for the final report. While the initial results need not be positive, students are expected to have made non-trivial implementation progress by this point. For parts of the report describing project goals and plans, the expectations are largely the same as for the proposal. In addition, I will be looking for the following:
- Why did you choose to do the experiments you did? What hypotheses are you testing?
- Technical detail about what experiments were conducted. The level of description should be sufficient for someone else to be able to reproduce your experiments.
- Analysis of results. What conclusions can you draw from these results? Or if they are inconclusive, what further experiments are needed so that you can draw some conclusions?
Project final presentation (10%). This will be an 11-minute presentation during one of the final few class periods. Students should describe the motivation for their work, relevant background material, and results. I encourage students to present both positive and negative results. There will also be some time for audience questions.
Project final report (15%). Students should submit a 5-6 page final report detailing all aspects of their project by the end of the last week of class (December 6). The report should be structured like a conference paper, including an abstract, introduction, related work, and experiments. Parts of the proposal and progress report may be reused for the final report. Negative results will not be penalized, but should be accompanied by detailed analysis of why the expected results did not materialize.
- Regarding structure: If you’re not sure what to do, I recommend looking back at some of the papers we read this semester from ACL/EMNLP and using that paper’s structure as a template. Broadly speaking, every paper should have an abstract followed by sections for Introduction, Problem Statement, Approach, Experiments, and Discussion. Related work can go either after the introduction or mixed with Discussion. (My rough rule of thumb: put Related Work after the Introduction if there are significant prerequisites to understanding the context of your paper that cannot be adequately summarized in the introduction. Otherwise, mix Related Work with Discussion, as the paper will flow better if it goes directly from Introduction to Problem Statement.) You should end with some sort of conclusion–it can be its own section or just the end of the Discussion, but it should wrap up and provide some forward looking thoughts.
- Use proper LaTeX formatting. Use the ACL LaTeX template linked below as your guide. One common thing I see is confusion between \citet{} and \citep{}. Use \citet{} whenever the work you are citing is playing the role of a noun in your sentence. If you’re saying something like, “Li et al. (2021) argue that language models linearly represent properties of entities,” that should be written with \citet{}.
- It is of course important to report experimental results, but it is equally important to analyze them. What conclusions can be drawn from them? What have we learned by doing these experiments? Don’t expect the reader to infer everything you want them to from your results table—it’s your job to tell the reader what your results mean.
- Negative results will not be penalized, but should be accompanied by detailed analysis of why the proposed method did not work as anticipated. For example, did you have an underlying hypothesis about why your method would work? If your method did not work, was it because that hypothesis was not true? What do the negative results teach us about NLP models that you did not anticipate?
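To illustrate the \citet{} versus \citep{} distinction mentioned above, here is a minimal sketch using the standard natbib-style commands. The citation key li2021implicit is a hypothetical placeholder for whatever key appears in your own .bib file, and the exact rendered form depends on the bibliography style.

```latex
% Textual citation: the cited work acts as a noun in the sentence.
\citet{li2021implicit} argue that language models linearly represent
properties of entities.
% Renders roughly as: Li et al. (2021) argue that ...

% Parenthetical citation: the sentence reads fine without the citation.
Language models linearly represent properties of entities
\citep{li2021implicit}.
% Renders roughly as: ... properties of entities (Li et al., 2021).
```

A quick test: if deleting the citation would leave the sentence without a subject or object, it should be \citet{}; otherwise \citep{} is usually the right choice.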
All written project-related assignments should use the standard *ACL paper submission template. All project due dates are 11:59pm Pacific Time on Friday.
Late days
You are given 3 free late days that may be used in integer amounts on any role-related written report, the project proposal, and the project midterm report. Each late day extends your deadline by exactly 24 hours. You do not need to contact the course staff before using these late days. For role-related reports, you still must present in class on the scheduled day even if you submit your report late. If you are working in a group and want to submit the project proposal or midterm report late, every group member must spend a late day. No late days are allowed for the project final report (we need to grade them quickly to assign final grades).
Additional late days will result in a deduction of 10% of the grade on the corresponding assignment per day. For the project proposal and midterm report, no late submissions will be accepted more than 3 days after the stated due date, so that we can provide feedback in a timely manner.
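The late-day rules above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function and label names are hypothetical, free late days are assumed to be spent before any penalty applies, and the project final report is left out because no late days are allowed for it.

```python
import math

FREE_LATE_DAYS = 3       # free late days per student
PENALTY_PER_DAY = 0.10   # deduction per additional late day
HARD_CUTOFF_DAYS = 3     # applies to the proposal and midterm report only

ELIGIBLE = {"role_report", "proposal", "midterm"}  # hypothetical labels


def late_days_needed(hours_late):
    """Late days come in whole 24-hour blocks, so round up."""
    return max(0, math.ceil(hours_late / 24))


def settle_late_submission(hours_late, free_days_left, kind):
    """Return (free_days_spent, grade_multiplier), or None if not accepted.

    Assumes free late days are used first and any remaining late days
    each cost 10% of the assignment grade.
    """
    if kind not in ELIGIBLE:
        raise ValueError(f"late days do not apply to {kind!r}")
    days = late_days_needed(hours_late)
    # Proposal and midterm report: nothing accepted more than 3 days late.
    if kind in {"proposal", "midterm"} and days > HARD_CUTOFF_DAYS:
        return None
    spent = min(days, free_days_left)
    penalized_days = days - spent
    multiplier = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
    return spent, multiplier
```

For example, under these assumptions a role report submitted 30 hours late by a student with all 3 free days remaining uses 2 late days and incurs no penalty. For group submissions, the same calculation would apply to every group member's late-day balance.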
Project resources
- USC CARC Discovery Cluster is a computing cluster run by USC. Follow the link to learn more about how to use this resource. Note that both CPU- and GPU-based machines are available; GPUs will be useful for running deep learning models. We have requested and received quota to run jobs specifically for class final projects. If you are interested in using CARC, contact Robin and he will add you to the class’s allocation so you can use this quota.
- Google Colab provides free computational resources, though there are limits (e.g., jobs can only run for 12 hours at a time). See their FAQ for details.