Long Luong
Git Workflow for Collaborative Economics Research Projects
A practical Git workflow for starting a cookiecutter research repository, deciding what to track, and syncing with coauthors when local work is still unfinished.

Git is especially useful in economics research because a project usually changes in several places at once: data cleaning scripts, analysis code, tables, figures, and the paper itself.
This post is about a practical Git workflow for collaborative research projects. I will focus on three questions:
- how to start a new research repository when you use a cookiecutter project template
- how and when to add files to tracking and commit them
- what to do each morning if you still have unfinished local work but also want to pull your collaborator’s latest changes
I assume a cookiecutter-style project with folders such as data/, src/, paper/, output/, README.md, and .gitignore. The exact folder names may differ, but the Git logic is the same.
A simple mental model
Before the commands, separate these three things:
- your working directory: the files on your computer
- your local repository: the Git history stored in .git/
- your remote repository: the copy on GitHub
Most Git confusion comes from mixing these three together.
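To make the three places concrete, here is a minimal sketch you can run in a throwaway folder. It assumes Git 2.28 or newer (for git init -b), and it uses a local bare repository as a stand-in for GitHub; the file name notes.md and the demo identity are made up for the illustration.

```shell
set -eu
# Throwaway sandbox: a bare repository stands in for GitHub (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"
git init -q -b main "$sandbox/project"
cd "$sandbox/project"
git config user.name "Demo Author"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"

# One commit that exists locally and on the remote
echo "* sample notes" > notes.md
git add notes.md
git commit -q -m "Add sample notes"
git push -q -u origin main

# One edit that exists only in the working directory
echo "* unfinished idea" >> notes.md

git status --short              # working directory: prints " M notes.md"
git log --oneline               # local repository: history stored in .git/
git log --oneline origin/main   # remote repository, as of the last push or fetch
```

The unfinished edit shows up only in git status. Neither log mentions it, because nothing has been committed or pushed yet.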
1. Start a new research project
If you use a cookiecutter template, local-first is usually the cleanest workflow. The template already creates the research folder structure, so it is better to generate the project locally and then connect it to GitHub.
Recommended setup
- Create an empty repository on GitHub.
- Generate the project locally from your cookiecutter template.
- Enter the new project folder.
- Check whether the template already initialized Git.
- If not, initialize Git, add the remote, and push the first commit.

    cookiecutter path/to/your-research-template
    cd project-name
    git status

If the template did not initialize Git, do this:
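
    git init
    git branch -M main
    git add .
    git commit -m "Initialize research project from cookiecutter template"
    git remote add origin <repository-url>
    git push -u origin main

Replace <repository-url> with the address of the empty GitHub repository you created in the first step. This is one of the few times git add . is usually fine. At the beginning, the template files belong to one logical unit: the initial project structure.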
Important detail about folders
Git does not track empty folders by themselves. This matters in research projects because cookiecutter templates often create directories such as:

    data/raw/
    data/derived/
    output/tables/
    output/figures/

If you want these folders to remain in the repository even when empty, keep a placeholder file inside, such as .gitkeep or a short README.md.
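A minimal sketch of the placeholder approach, run in a scratch folder so it does not touch a real project. The folder names are the ones listed above and may differ in your template; .gitkeep is only a naming convention, not a Git feature.

```shell
set -eu
project=$(mktemp -d)   # stands in for your project root (assumption)
cd "$project"
git init -q

# Create each folder and drop an empty placeholder file inside it
for dir in data/raw data/derived output/tables output/figures; do
    mkdir -p "$dir"
    touch "$dir/.gitkeep"
done

git add data output
git status --short     # the four .gitkeep files are now staged
```

Once the placeholders are committed, a coauthor who clones the repository gets the same folder layout even before any data or output exists.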
2. What should be tracked in an economics research repository?
As a general rule, track files that another researcher would need in order to understand, reproduce, review, or continue the project.
Usually track these:
- data cleaning and analysis scripts such as .do, .R, .py, or carefully maintained notebooks
- paper source files such as .tex, .bib, markdown notes, slides, and presentations
- README.md, codebooks, variable definitions, and workflow notes
- project configuration such as Makefile, environment.yml, renv.lock, or package lists
- small hand-built input files that are part of the project itself
Usually do not track these:
- licensed or confidential raw data from sources such as WRDS, CRSP, Compustat, Orbis, or administrative records
- large derived datasets that can be rebuilt from scripts
- temporary logs, cache files, autosave files, and editor junk
- secrets such as credentials, API keys, or .env files
- compiled paper outputs such as .aux, .log, and often .pdf, unless your team explicitly wants them in Git
In economics, the important thing is usually to track the code that rebuilds the data and results, not every large file produced along the way.
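These rules usually end up in .gitignore. The sketch below writes one possible version into a scratch repository and checks it with git check-ignore. The patterns are an assumption about a typical layout, and crsp_monthly.dta is a hypothetical file name; adapt both to your project.

```shell
set -eu
project=$(mktemp -d)   # scratch folder standing in for the project root
cd "$project"
git init -q

cat > .gitignore <<'EOF'
# Licensed or confidential raw data, and large rebuilt datasets
data/raw/*
data/derived/*
# ...but keep the placeholder files so the folders survive
!data/raw/.gitkeep
!data/derived/.gitkeep

# Secrets
.env

# LaTeX build files and logs
*.aux
*.log
EOF

mkdir -p data/raw
touch data/raw/crsp_monthly.dta data/raw/.gitkeep .env

# Show which rule matches each ignored path
git check-ignore -v data/raw/crsp_monthly.dta .env
git status --short     # the data file and .env should not be listed
```

Note that .gitignore only affects untracked files. If a file was committed earlier, adding a pattern for it later does not remove it from history.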
3. How and when to use git add
git add does not mean “save my work.” It means “stage this change for the next commit.”
That distinction matters. A good research workflow is:
- edit files
- run git status
- stage only the files that belong to one research task
- review the staged changes
- commit that one logical unit
Stage by research task
In a research project, one good commit often corresponds to one clear research step.
Good examples:
- add one sample restriction and document it in the paper
- revise one regression specification and update the matching table note
- add one new data cleaning step and update the codebook
Bad examples:
- mix literature review edits with sample construction changes
- commit several unrelated robustness checks together
- stage every modified file just because git add . was quick
Example:

    git status
    git add src/02_build_sample.do
    git add paper/data-section.tex
    git add notes/sample-notes.md
    git diff --staged
    git commit -m "Add sample screen and document it"

If the same file contains several unrelated edits, use:

    git add -p

That command walks through each changed hunk and asks whether to stage it, which is very useful when one do file or one paper section contains both a real change and some unrelated cleanup.
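To see what staging by task leaves behind, here is a small sketch in a scratch repository. It assumes Git 2.28 or newer; the file contents and the lit-review.tex file name are invented for the demonstration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

mkdir -p src paper
echo "use data/derived/sample" > src/02_build_sample.do
echo "Literature review draft." > paper/lit-review.tex
git add .
git commit -q -m "Initial files"

# Two unrelated edits sit in the working directory at the same time
echo "drop if total_assets <= 0" >> src/02_build_sample.do
echo "One more citation." >> paper/lit-review.tex

# Stage and commit only the sample-construction change
git add src/02_build_sample.do
git diff --staged --name-only    # what the next commit will contain
git diff --name-only             # what stays behind, unstaged
git commit -q -m "Add positive-assets sample screen"

git status --short               # prints " M paper/lit-review.tex"
```

The literature review edit is still in the working directory, untouched, and can get its own commit later.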
4. When should you commit?
A commit should represent one research decision or one coherent task that you can describe in one sentence.
Good commit messages:

    Add winsorization step for firm characteristics
    Revise baseline leverage specification
    Document WRDS download process
    Update Table 3 note after sample change

Weak commit messages:

    update
    changes
    fix stuff

Practical rule
Commit when:
- the change has one clear purpose
- the files staged belong to the same research step
- a coauthor could understand the change from the commit message
Do not wait until the end of the day if one clean unit of work is already done. Small commits are easier to review and much easier to revisit later when you are trying to remember why a result changed.
Are WIP commits okay?
Yes, especially on your own branch.
In research, a WIP commit is often safer than leaving important changes only in your working directory.

    git add -A
    git commit -m "WIP: continue event-study appendix"

You can later clean up the branch history before merging if your team prefers that.
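One possible way to do that cleanup without an interactive editor is a soft reset to the point where the branch left main, followed by a single new commit. This is a sketch of one option, not the only one (git rebase -i is the more common interactive tool), and it rewrites history, so it is best kept to a branch that nobody else has pulled. The branch and file names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "baseline" > analysis.do
git add analysis.do
git commit -q -m "Add baseline analysis"

# Two WIP checkpoints on a personal branch
git switch -q -c feature/event-study
echo "event window" >> analysis.do
git commit -q -am "WIP: continue event-study appendix"
echo "placebo test" >> analysis.do
git commit -q -am "WIP: more event-study work"

# Move the branch pointer back to where it left main, keeping every
# change staged, then record the work as one descriptive commit.
git reset --soft "$(git merge-base main HEAD)"
git commit -q -m "Add event-study appendix code"

git log --oneline main..HEAD     # one commit ahead of main
```

The file contents are unchanged by the reset. Only the way the work is divided into commits is different.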
5. A branch workflow that fits research collaboration
If you have collaborators, do not do daily research work directly on main.
A safer pattern is:
- keep main as the stable, reproducible version of the project
- create one branch for one task
- merge back only when the task is in good shape
Example branch names:

    feature/sample-construction
    feature/table-4-revision
    feature/new-robustness-check
    feature/lit-review-update

This structure is useful because research tasks are often naturally separated. One branch may be about data cleaning, another about a new identification test, and another about paper writing.
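The life cycle of one task branch can be sketched as follows, in a scratch repository. This assumes a purely local merge; many teams merge through a GitHub pull request instead, in which case only the branch creation and commit steps happen on your machine. File names and contents are made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "Table 4: baseline" > notes.md
git add notes.md
git commit -q -m "Add table notes"

# One branch for one task
git switch -q -c feature/table-4-revision
echo "Table 4: add firm fixed effects" >> notes.md
git commit -q -am "Revise Table 4 specification"

# Merge back once the task is in good shape, then remove the branch
git switch -q main
git merge -q feature/table-4-revision
git branch -d feature/table-4-revision

git log --oneline    # main now contains the Table 4 revision
```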
6. What to do each morning if you still have unfinished local work
This is the situation that causes most Git stress:
- you still have local unfinished work
- your collaborator may have pushed new changes
- you want the latest remote updates without losing your own work
My practical workflow is below.
Step 1: inspect your current state

    git status -sb
    git branch

You need to know:
- which branch you are on
- whether you have uncommitted changes
Step 2: protect your unfinished work
If your changes are already somewhat coherent, make a WIP commit on your branch:

    git add src/03_analysis.do
    git add paper/results.tex
    git commit -m "WIP: continue baseline specification revision"

If the changes are too messy for even a temporary commit, stash them:

    git stash push -u -m "wip before morning sync"

The -u flag also stashes untracked files, such as a new script you have not added yet. I would use stash as short-term parking, not as long-term storage.
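Here is a small sketch of what the stash does, in a scratch repository. The script names and contents are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

mkdir -p src
echo "reg y x" > src/03_analysis.do
git add src
git commit -q -m "Add analysis script"

echo "reg y x, robust" >> src/03_analysis.do    # edit to a tracked file
echo "scratch work" > src/scratch_test.do       # new, untracked file

git stash push -q -u -m "wip before morning sync"

git status --short    # prints nothing: the working tree is clean
git stash list        # the parked work, labelled with your message
git stash show        # summary of the tracked changes in the stash
```

With the working tree clean, switching branches and pulling cannot collide with the unfinished edits.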
Step 3: fetch remote changes without touching your files yet

    git fetch origin

This downloads the latest remote history but does not merge anything into your working tree.
If you want to see whether main changed:

    git log --oneline main..origin/main

Step 4: update your local main

    git switch main
    git pull --rebase origin main

Now your local main is aligned with the latest shared version.
Step 5: bring the new main into your research branch

    git switch feature/table-4-revision
    git rebase main

If your team prefers merge commits, use:

    git merge main

For a personal research branch, I usually prefer rebase because it keeps the branch history easier to read.
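If the rebase stops on a conflict, nothing is lost. The sketch below stages a deliberate conflict in a scratch repository and backs out with git rebase --abort. The file name clean.do and its contents are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "winsorize at 1%" > clean.do
git add clean.do
git commit -q -m "Add cleaning script"

# My branch and main both change the same line
git switch -q -c feature/table-4-revision
echo "winsorize at 5%" > clean.do
git commit -q -am "Try 5% winsorization"
git switch -q main
echo "winsorize at 2%" > clean.do
git commit -q -am "Coauthor: winsorize at 2%"

git switch -q feature/table-4-revision
before=$(git rev-parse HEAD)

git rebase main || true    # stops and reports a conflict in clean.do
git status --short         # "UU clean.do" marks the conflicted file

# Option A (not run here): edit clean.do, then
#   git add clean.do && git rebase --continue
# Option B: back out and return to the pre-rebase state
git rebase --abort
```

After the abort, the branch points at the same commit as before the rebase, so you can retry later or fall back to git merge main.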
Step 6: restore your local work if you used stash

    git stash pop

Then resolve any conflicts.
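One detail worth knowing: when git stash pop hits a conflict, Git keeps the stash entry instead of deleting it, so the parked work is still recoverable. A minimal sketch in a scratch repository, with made-up file contents and a local commit standing in for the coauthor's pulled change:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

mkdir -p paper
echo "coef = 0.10" > paper/results.tex
git add paper
git commit -q -m "Add results"

echo "coef = 0.12" > paper/results.tex     # my unfinished edit
git stash push -q -m "wip before morning sync"

echo "coef = 0.15" > paper/results.tex     # stand-in for a coauthor's change
git commit -q -am "Update coefficient after sample change"

git stash pop || true     # reports a conflict; the stash entry is kept
git status --short        # "UU paper/results.tex" marks the conflicted file

# Resolve by editing the file (here: keep my version), then mark it resolved
echo "coef = 0.12" > paper/results.tex
git add paper/results.tex
git stash drop -q         # discard the stash once the work is recovered
```

In a real conflict you would open the file, choose between the marked versions by hand, and only then run git add.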
Step 7: rerun the relevant research pipeline
This last step matters a lot in research. Pulling your collaborator’s changes is not enough. You also need to check whether the updated code changes your results or paper text.
After syncing, rerun the relevant pieces of the project:
- rebuild the affected dataset
- rerun the relevant regression or estimation script
- regenerate the affected tables or figures
- check whether the manuscript still matches the outputs
In research, reproducibility is part of the Git workflow.
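One way to decide what to rerun is to note the commit you were on before syncing and then list the files that changed since. The sketch below simulates this in a scratch repository; a local commit stands in for the changes that would arrive with a pull or rebase, and the script names are illustrative.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo Author"
git config user.email "demo@example.com"

mkdir -p src paper
echo "use sample" > src/02_build_sample.do
echo "reg y x" > src/03_analysis.do
echo "Results." > paper/results.tex
git add .
git commit -q -m "Initial pipeline"

before=$(git rev-parse HEAD)    # note where you were before syncing

# Stand-in for the commits that arrive with the morning sync
echo "drop if missing(x)" >> src/02_build_sample.do
git commit -q -am "Coauthor: drop observations with missing x"

# Which scripts changed since then?
git diff --name-only "$before" HEAD -- src/    # prints src/02_build_sample.do
```

Here only the sample-building script changed, so the dataset and everything downstream of it would need to be rebuilt.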
7. The short version of the daily collaborator workflow
If I have unfinished local work but want the latest collaborator changes, my default sequence is:

    git status -sb
    git add src/03_analysis.do paper/results.tex
    git commit -m "WIP: checkpoint before sync"
    git fetch origin
    git switch main
    git pull --rebase origin main
    git switch feature/table-4-revision
    git rebase main

If I do not want a WIP commit, I replace the add and commit steps with:

    git stash push -u -m "wip before sync"

and later, once the branch is rebased:

    git stash pop

Then I rerun the relevant scripts and check the outputs.
8. Habits that prevent Git pain in research projects
- Keep main reproducible.
- Pull or fetch early in the day before doing several more hours of work.
- Track scripts, documentation, and paper source files more carefully than generated outputs.
- If a code change affects a table or figure, update the manuscript notes close to the same commit when possible.
- Avoid leaving large uncommitted changes in your working directory for too long.
- Write commit messages that explain the research step, not just the file change.
Final thought
For collaborative economics research, the goal of Git is not just version control. It is to keep your project understandable and reproducible while several people are editing scripts, data notes, and paper files at the same time.
If you build the habit of checking git status, committing small research tasks, and syncing main carefully before continuing your own branch, Git becomes much easier to manage.


