BIODS220: Final Project

The goal of the project is to provide an opportunity to gain in-depth experience developing an AI-based approach to a healthcare problem. Since a large part of the course is focused on deep learning approaches for biomedical data, we require that your project involve training at least one deep learning model on biomedical or, more broadly, healthcare-related data. Simply re-running the training of a model from an existing code repository does not satisfy this requirement; at minimum, you must demonstrate the ability to perform thoughtful monitoring and analysis of the training process and hyperparameter selection. Beyond this requirement, the precise topic of your project is flexible to allow exploration of diverse interests.
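As a rough illustration of what monitoring the training process and hyperparameter selection can look like, the sketch below logs training and validation loss every epoch, applies early stopping, and compares several learning rates. It is not a required workflow. The synthetic data, the function names (make_data, train, sweep), and the one-feature logistic regression standing in for a deep learning model are all assumptions made to keep the example self-contained.

```python
import math
import random

def make_data(n, seed):
    """Synthetic 1-D binary classification data (hypothetical stand-in for a real dataset)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        y = 1 if x + rng.gauss(0, 1) > 0 else 0
        data.append((x, y))
    return data

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def mean_log_loss(w, b, data):
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def train(lr, train_set, val_set, max_epochs=50, patience=5):
    """Full-batch gradient descent with per-epoch logging and early stopping."""
    w, b = 0.0, 0.0
    history = []  # (epoch, train_loss, val_loss), kept for later plotting/analysis
    best_val, best_epoch = float("inf"), -1
    for epoch in range(max_epochs):
        grad_w = grad_b = 0.0
        for x, y in train_set:
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(train_set)
        b -= lr * grad_b / len(train_set)
        train_loss = mean_log_loss(w, b, train_set)
        val_loss = mean_log_loss(w, b, val_set)
        history.append((epoch, train_loss, val_loss))
        if val_loss < best_val - 1e-6:
            best_val, best_epoch = val_loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss has stopped improving
    return {"lr": lr, "best_val": best_val, "best_epoch": best_epoch, "history": history}

def sweep(learning_rates, train_set, val_set):
    """Train once per learning rate and return all run summaries."""
    return [train(lr, train_set, val_set) for lr in learning_rates]

if __name__ == "__main__":
    train_set, val_set = make_data(200, seed=1), make_data(100, seed=2)
    for run in sweep([0.001, 0.1, 1.0], train_set, val_set):
        print(f"lr={run['lr']}: best val loss {run['best_val']:.4f} at epoch {run['best_epoch']}")
```

In a real project the logged history would typically be plotted (training versus validation curves per configuration) and discussed, and hyperparameters would be selected on validation data only, with a held-out test set reserved for the final evaluation.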

You may work in groups of 1-3 on the project, and grades will be calibrated by group size. Your project may be related to research in another class project as long as permission is granted by the instructors of both classes; however, you must clearly indicate in the project proposal, milestone, and final reports the exact portion of the project that is being counted for this course. In this case, you must prepare a separate report for each course, and you must also submit to us the final report you prepared for the other course.

The following describes the project components that you will need to turn in, the grading rubric, and some suggestions for project ideas.

Project Proposal

Your project proposal is due on Friday, Oct 21. It should be a 1-1.5 page document excluding references and figures, using the NeurIPS template. The proposal should include the following (with grade breakdown):

Title, Author(s)
(25%) What is the problem that you will be investigating? Explain the task thoroughly. Why is it interesting?
(25%) What is the data you will be using? Please include relevant characteristics such as the source of the data, the size of the dataset, and a sample of the data. Explain clearly which parts of the data will be used as well as potential obstacles.
(25%) What methods do you plan to experiment with? Please thoroughly describe them. If there are existing related implementations, will you use them and how? How do you plan to improve or modify such implementations? This can be subject to change, but you should have a general sense of how you plan to approach the problem you are working on.
(25%) How will you evaluate your results? Qualitatively, what kind of results do you expect (e.g. plots or figures)? Quantitatively, what kind of analysis will you use to evaluate and/or compare your results (e.g. what performance metrics or statistical tests)? What is your hypothesis regarding your results compared to baselines?
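
As one illustration of the kind of quantitative comparison a proposal might describe, the sketch below estimates a paired bootstrap confidence interval for the difference in accuracy between a model and a baseline evaluated on the same test examples. It is a minimal example under stated assumptions, not a required method: the function name, the toy labels, and the choice of a percentile bootstrap are ours, and other metrics or statistical tests may suit your task better.

```python
import random

def paired_bootstrap_accuracy_diff(y_true, pred_model, pred_baseline,
                                   n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for accuracy(model) - accuracy(baseline).

    Both prediction lists must refer to the same test examples, in the same order.
    Returns (observed_difference, ci_low, ci_high).
    """
    n = len(y_true)
    if n == 0 or len(pred_model) != n or len(pred_baseline) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    # Per-example difference in correctness: +1, 0, or -1.
    deltas = [int(m == t) - int(b == t)
              for t, m, b in zip(y_true, pred_model, pred_baseline)]
    observed = sum(deltas) / n
    diffs = []
    for _ in range(n_resamples):
        resample = rng.choices(deltas, k=n)  # sample examples with replacement
        diffs.append(sum(resample) / n)
    diffs.sort()
    low = diffs[int((alpha / 2) * n_resamples)]
    high = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return observed, low, high

if __name__ == "__main__":
    # Toy labels and predictions, for illustration only.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1] * 10
    pred_model = list(y_true)            # hypothetical model: always correct
    pred_baseline = [1] * len(y_true)    # hypothetical baseline: always predicts 1
    diff, low, high = paired_bootstrap_accuracy_diff(y_true, pred_model, pred_baseline)
    print(f"accuracy difference {diff:.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the data are consistent with a real difference between the two models on this test set; the same resampling idea applies to other metrics such as AUROC, although the per-example shortcut used here is specific to accuracy.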

Submission: Please submit your proposal as a PDF on Gradescope. Only one person on your team should submit. Please have this person add the rest of your team as collaborators as a “Group Submission”.

Project Milestone

Your milestone is due on Friday, Nov 18. It should be a 3-4 page document using the NeurIPS template. It should include preliminary sections towards the final report, in standard formats of NeurIPS research papers. The milestone should include the following (with grade breakdown):

Title, Author(s)
(20%) Introduction. Introduce your problem, and the landscape for why the problem is interesting and what has been done before in this space. Describe your overall plan for approaching the problem, what contributions you expect to make, and why this is interesting in the context of the described landscape.
(25%) Related Work. Describe in detail existing work related to your problem, how these works relate to one another, and how your work relates to them. We expect this section to be comprehensive and thorough, with at least 10 references cited and discussed.
(15%) Data. Describe in detail the data that you are using, including the source(s) of the data, relevant statistics, and qualitative examples if appropriate.
(20%) Approach. Thoroughly describe the methods that you intend to use in your approach, and the baselines you plan to compare against.
(20%) Preliminary Results. Describe any preliminary results up to the time of the milestone. You should show the results of training at least one deep learning model. (It is fine if the model has not yet attained good performance.) You should also show preliminary analysis on the model(s) that you have trained. You should also describe anticipated next steps and any obstacles that have come up.

Each of these will be graded on a scale of 0-3 based on how well each component of each section is addressed. You can look at past NeurIPS papers for examples of how to compose each of these sections.

Submission: Please submit your milestone as a PDF on Gradescope. Only one person on your team should submit. Please have this person add the rest of your team as collaborators as a “Group Submission”.

Project Advising Session with TAs

Between the project milestone and the project final report / presentation, we will schedule an advising session for each project group with the TAs, to discuss and receive additional feedback. More details will be provided closer to the date.

Project Final Presentation

We will be holding a poster session to share the work that you’ve done in your projects. The final poster session will be held on Wednesday, Dec 14, from 3:30 PM to 6:30 PM; the location is TBA.

Project Report

Your project report is due on Friday, Dec 16. It should be a 7-9 page document using the NeurIPS template. The report should include the following sections (with grade breakdown):

Title, Author(s)
Abstract. A paragraph overview of the problem, approach, contribution, and key results.
(15%) Introduction. Introduce your problem, and the landscape for why the problem is interesting and what has been done before in this space. Describe your overall plan for approaching the problem, why this is an interesting contribution in the context of the described landscape, and a summary of your results.
(15%) Related Work. Describe in detail existing work related to your problem, how these works relate to one another, and how your work relates to them. We expect this section to be comprehensive and thorough, with at least 10 references cited and discussed.
(15%) Data. Describe in detail the data that you are using, including the source(s) of the data, relevant statistics, and qualitative examples if appropriate.
(20%) Approach. Describe in detail the methods that you use in your approach. Importantly, through this section and the experiments section (which may include implementation details), there should be sufficient information for others to reproduce your results.
(25%) Experiments. Describe experiments that you performed to support your approach and contribution. The exact experiments may vary depending on the project, but we will be looking for the thoughtfulness of your analysis. Examples may include performing comparison of your main approach with other baselines or methods, error analyses to investigate the performance of the model, ablation studies to determine the impact of various components of the approach, analyses to provide insight into the effect of different hyperparameter choices, techniques to interpret how the model is working, etc. You should include graphs, tables, or other figures to illustrate the experimental results.
(5%) Conclusion. Summarize the key results, what has been learned, and avenues for future work.
(5%) Writing/Formatting. Your paper should be clearly written and nicely formatted, comparable to published NeurIPS papers.
Contributions. Specify the contributions of each author on the paper. This includes discussion, implementation, and writing for each part of the paper. You should also describe the contributions of any contributors not enrolled in the course (see Additional Submission Requirements below). For an example of appropriate format, please see the author contributions for AlphaGo (Nature, 2016). However, we expect your description to include the more detailed breakdown specified here.
Supplementary Material. This should be submitted as a separate file from your paper and is not counted in the 7-9 page requirement. At minimum, this should include the relevant code for your project. You may also put additional visualizations, demos, videos, etc. that you wish to share with the teaching team.
What you should not put in your supplementary material:
- The entire TensorFlow (or PyTorch, etc.) GitHub source code. Only include the code you have written for the project.
- Any code that is larger than 10 MB.
- Model checkpoints.

Submission: You will submit your final report as a PDF and your supplementary material as a separate PDF or ZIP file. We will provide detailed submission instructions as the deadline nears.

Additional Submission Requirements: We also ask you do the following when you submit your project report:

Your report PDF should list all authors who have contributed enough to your work to warrant a co-authorship position. This includes people not enrolled in the course, such as faculty/advisors if they sponsored your work with funding or data, and significant mentors (including PhD students or postdocs who coded with you, participated in data collection, or helped draft your model on a whiteboard). All authors should be listed directly underneath the title on your PDF. Include a footnote on the first page indicating which authors are not enrolled in the course. All co-authors should have their institutional/organizational affiliation specified below the title, and their role should be described in the Contributions section of the paper.
Any code that was used as a base for your project must be referenced and cited in the body of the paper. This includes course assignment code, fine-tuning example code, and open-source or GitHub implementations. You can use a footnote or a full reference/bibliography entry.
If you are using this project for multiple classes, submit the other class PDF as well. Remember, it is an honor code violation to use the same final report PDF for multiple classes.

Grading Breakdown

(10%) Project Proposal
(25%) Project Milestone
(5%) Project Advising Session with TAs
(10%) Project Final Presentation
(50%) Project Report

Honor Code

You may consult any papers, books, online references, or publicly available implementations for ideas and code that you may want to incorporate into your strategy or algorithm, so long as you clearly cite your sources in your code and your writeup. However, under no circumstances may you look at another group’s code or incorporate their code into your project. If you are combining your course project with the project from another class, you must also follow all guidelines for this as specified above.

Project Ideas

While you have significant flexibility with regard to the precise project topic, examples of types of projects you may wish to pursue include:

Novel problem

Develop a deep learning model for a new problem that has not previously been experimented on.
Utilize a new dataset or tackle new tasks on existing data.

Novel analysis

Reimplement an existing paper that you find interesting.
Try different algorithmic variants to see if they improve performance.
Analyze and try different types of input data.
Interpret the model in different ways.
Predict different outputs than the current state of the art.

Novel methods

New algorithms or model architectures for an existing problem.
Improve upon the current measure of success for an existing problem.

Regardless of the nature of your project, grading will be based on the thoughtfulness, thoroughness, and depth with which you address the required components in each of the project deliverables.

Possible sources of project inspiration include published papers, Kaggle competitions, and public datasets and repositories including the following:
Biomedical images: Grand Challenges in Biomedical Image Analysis
Electronic health records: MIMIC Critical Care Database
Genomics: ENCODE (Encyclopedia of DNA Elements)

We will also be sharing a list of project ideas contributed by members of the Stanford community through Ed.
