Git Workflow for Collaborative Economics Research Projects

Git is especially useful in economics research because a project usually changes in several places at once: data cleaning scripts, analysis code, tables, figures, and the paper itself.

This post is about a practical Git workflow for collaborative research projects. I will focus on three questions:

how to start a new research repository when you use a cookiecutter project template
how and when to add files to tracking and commit them
what to do each morning if you still have unfinished local work but also want to pull your collaborator’s latest changes

I assume a cookiecutter-style project with folders such as data/, src/, paper/, output/, README.md, and .gitignore. The exact folder names may differ, but the Git logic is the same.

A simple mental model

Before the commands, separate these three things:

your working directory: the files on your computer
your local repository: the Git history stored in .git/
your remote repository: the copy on GitHub

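Most Git confusion comes from mixing these three together. For example, git commit only updates your local repository; nothing reaches GitHub until you run git push.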

1. Start a new research project

If you use a cookiecutter template, local-first is usually the cleanest workflow. The template already creates the research folder structure, so it is better to generate the project locally and then connect it to GitHub.

Recommended setup

Create an empty repository on GitHub.
Generate the project locally from your cookiecutter template.
Enter the new project folder.
Check whether the template already initialized Git.
If not, initialize Git, add the remote, and push the first commit.

cookiecutter path/to/your-research-template
cd project-name
git status
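A minimal sketch of the check in the fourth setup step, assuming you run it from the new project folder. The helper name has_own_git_dir is hypothetical. It only looks for a .git entry in the folder itself, which is useful because git status also succeeds when a parent folder happens to be a repository.

```shell
# Hypothetical helper: succeeds only if the given folder itself holds a .git entry.
has_own_git_dir() {
  [ -e "${1:-.}/.git" ]
}

if has_own_git_dir .; then
  echo "The template already initialized Git in this folder."
else
  echo "No repository in this folder yet: run git init."
fi
```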

If the template did not initialize Git, do this:

git init
git branch -M main
git remote add origin git@github.com:your-name/project-name.git
git add .
git commit -m "Initialize research project from cookiecutter template"
git push -u origin main

This is one of the few times git add . is usually fine, provided the template's .gitignore already excludes raw data and secrets. At the beginning, the template files belong to one logical unit: the initial project structure.

Important detail about folders

Git does not track empty folders by themselves. This matters in research projects because cookiecutter templates often create directories such as:

data/raw/
data/derived/
output/tables/
output/figures/

If you want these folders to remain in the repository even when empty, keep a placeholder file inside, such as .gitkeep or a short README.md.
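The placeholder idea can be sketched as a short loop, assuming you run it from the project root and your folders match the example names above. Note that .gitkeep is only a naming convention, not a Git feature; any tracked file in the folder has the same effect.

```shell
# Run from the project root. Folder names follow the example above;
# adjust them to whatever your template actually creates.
for dir in data/raw data/derived output/tables output/figures; do
  mkdir -p "$dir"          # create the folder if the template did not
  touch "$dir/.gitkeep"    # empty placeholder so Git has a file to track
done
```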

2. What should be tracked in an economics research repository?

As a general rule, track files that another researcher would need in order to understand, reproduce, review, or continue the project.

Usually track these:

data cleaning and analysis scripts such as .do, .R, .py, or carefully maintained notebooks
paper source files such as .tex, .bib, markdown notes, slides, and presentations
README.md, codebooks, variable definitions, and workflow notes
project configuration such as Makefile, environment.yml, renv.lock, or package lists
small hand-built input files that are part of the project itself

Usually do not track these:

licensed or confidential raw data from sources such as WRDS, CRSP, Compustat, Orbis, or administrative records
large derived datasets that can be rebuilt from scripts
temporary logs, cache files, autosave files, and editor junk
secrets such as credentials, API keys, or .env files
compiled paper outputs such as .aux, .log, and often .pdf, unless your team explicitly wants them in Git

In economics, the important thing is usually to track the code that rebuilds the data and results, not every large file produced along the way.
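As a rough starting point, the "do not track" list could translate into ignore rules like the ones below. The patterns are assumptions based on the example folder names in this post, and the output file name gitignore.sketch is hypothetical: the snippet writes there so it cannot overwrite the .gitignore your template already ships. Merge the lines you want by hand.

```shell
# Write a sketch of ignore rules to a separate file (hypothetical name).
cat > gitignore.sketch <<'EOF'
# Licensed or confidential raw data: ignore the contents, keep the placeholder
data/raw/*
!data/raw/.gitkeep

# Large derived datasets that scripts can rebuild
data/derived/*
!data/derived/.gitkeep

# Secrets
.env

# Logs and LaTeX build debris
*.log
*.aux

# Editor and OS junk
*~
.DS_Store

# Uncomment if your team does not want compiled papers in Git
# paper/*.pdf
EOF
```

Ignoring data/raw/* rather than data/raw/ is deliberate: Git cannot re-include a file whose parent directory is itself excluded, so the !data/raw/.gitkeep exception only works with the first form.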

3. How and when to use `git add`

git add does not mean “save my work.” It means “stage this change for the next commit.”

That distinction matters. A good research workflow is:

edit files
run git status
stage only the files that belong to one research task
review the staged changes
commit that one logical unit

Stage by research task

In a research project, one good commit often corresponds to one clear research step.

Good examples:

add one sample restriction and document it in the paper
revise one regression specification and update the matching table note
add one new data cleaning step and update the codebook

Bad examples:

mix literature review edits with sample construction changes
commit several unrelated robustness checks together
stage every modified file just because git add . was quick

Example:

git status
git add src/02_build_sample.do
git add paper/data-section.tex
git add notes/sample-notes.md
git diff --staged
git commit -m "Add sample screen and document it"

If the same file contains several unrelated edits, use:

git add -p

That command lets you stage individual hunks interactively, which is very useful when one .do file or one paper section contains both a real change and some unrelated cleanup.
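To see staging by task in action without touching a real project, here is a sketch that builds a throwaway repository in a temporary folder. The file contents and the paper/lit-review.tex file are made up for illustration. It makes three edits in one session but commits only the two that belong to the sample-screen task.

```shell
sandbox="$(mktemp -d)"            # throwaway folder, not your real project
cd "$sandbox"
git init -q
git config user.name "Demo Author"          # identity for this sandbox only
git config user.email "demo@example.com"

mkdir -p src paper
echo "* baseline sample" > src/02_build_sample.do
echo "Data section draft." > paper/data-section.tex
echo "Literature review draft." > paper/lit-review.tex
git add .
git commit -q -m "Initial files"

# Three edits in the same work session; only two belong to the sample screen
echo "drop if assets < 0" >> src/02_build_sample.do
echo "We drop firms with negative assets." >> paper/data-section.tex
echo "Add three new citations." >> paper/lit-review.tex

# Stage only the files that belong to the sample-screen task
git add src/02_build_sample.do paper/data-section.tex
git diff --staged --name-only     # the files that will enter the next commit
git commit -q -m "Add sample screen and document it"
git status --short                # lit-review.tex is still modified and uncommitted
```

The literature review edit stays in the working directory and can go into its own commit later.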

4. When should you commit?

A commit should represent one research decision or one coherent task that you can describe in one sentence.

Good commit messages:

Add winsorization step for firm characteristics
Revise baseline leverage specification
Document WRDS download process
Update Table 3 note after sample change

Weak commit messages:

update
changes
fix stuff

Practical rule

Commit when:

the change has one clear purpose
the files staged belong to the same research step
a coauthor could understand the change from the commit message

Do not wait until the end of the day if one clean unit of work is already done. Small commits are easier to review and much easier to revisit later when you are trying to remember why a result changed.

Are WIP commits okay?

Yes, especially on your own branch.

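In research, a WIP commit is often safer than leaving important changes only in your working directory. Note that git add -A stages every change, including untracked files, so run git status first and make sure .gitignore covers data and secrets.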

git add -A
git commit -m "WIP: continue event-study appendix"

You can later clean up the branch history before merging if your team prefers that.
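One way to do that cleanup, sketched in a throwaway repository: a soft reset to the point where the branch left main, followed by a single descriptive commit. This is only one option (git rebase -i main is the interactive alternative), and the file name analysis.do is a placeholder.

```shell
sandbox="$(mktemp -d)"            # throwaway repo for illustration only
cd "$sandbox"
git init -q
git config user.name "Demo Author"
git config user.email "demo@example.com"
echo "baseline" > analysis.do
git add analysis.do
git commit -q -m "Initial analysis script"
git branch -M main                # make sure the base branch is called main

git switch -q -c feature/event-study-appendix
echo "step 1" >> analysis.do
git commit -q -am "WIP: continue event-study appendix"
echo "step 2" >> analysis.do
git commit -q -am "WIP: more event-study work"

# Move the branch pointer back to where it left main, keeping every change
# staged, then record one descriptive commit in place of the WIP commits.
git reset --soft "$(git merge-base main HEAD)"
git commit -q -m "Add event-study appendix"
git log --oneline main..HEAD      # lists the commits on this branch that main lacks
```

Because this rewrites the branch's history, it is safest on a branch nobody else builds on. If the WIP commits were already pushed, the next push needs git push --force-with-lease.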

5. A branch workflow that fits research collaboration

If you have collaborators, do not do daily research work directly on main.

A safer pattern is:

keep main as the stable, reproducible version of the project
create one branch for one task
merge back only when the task is in good shape

Example branch names:

feature/sample-construction
feature/table-4-revision
feature/new-robustness-check
feature/lit-review-update

This structure is useful because research tasks are often naturally separated. One branch may be about data cleaning, another about a new identification test, and another about paper writing.
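The one-branch-per-task pattern might look like this in a throwaway repository. The file build_sample.do and its contents are placeholders, and the local git merge stands in for whatever your team actually uses to merge (many teams go through a GitHub pull request instead).

```shell
sandbox="$(mktemp -d)"            # throwaway repo for illustration only
cd "$sandbox"
git init -q
git config user.name "Demo Author"
git config user.email "demo@example.com"
echo "use raw_panel" > build_sample.do
git add build_sample.do
git commit -q -m "Initial sample script"
git branch -M main

# One branch for one task, started from main
git switch -q -c feature/sample-construction main
echo "keep if year >= 1990" >> build_sample.do
git commit -q -am "Add sample period restriction"

# Merge back only when the task is in good shape, then delete the branch
git switch -q main
git merge -q feature/sample-construction
git branch -q -d feature/sample-construction
git log --oneline -2
```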

6. What to do each morning if you still have unfinished local work

This is the situation that causes most Git stress:

you still have local unfinished work
your collaborator may have pushed new changes
you want the latest remote updates without losing your own work

My practical workflow is below.

Step 1: inspect your current state

git status -sb
git branch

You need to know:

which branch you are on
whether you have uncommitted changes

Step 2: protect your unfinished work

If your changes are already somewhat coherent, make a WIP commit on your branch:

git add src/03_analysis.do
git add paper/results.tex
git commit -m "WIP: continue baseline specification revision"

If the changes are too messy for even a temporary commit, stash them:

git stash push -u -m "wip before morning sync"

I would use stash as short-term parking, not as long-term storage.

Step 3: fetch remote changes without touching your files yet

git fetch origin

This downloads the latest remote history but does not merge anything into your working tree.

If you want to see whether main changed:

git log --oneline main..origin/main
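The sketch below simulates this inspection step end to end, with a local bare repository standing in for GitHub and a second clone playing the coauthor. All paths, file names, and commit messages are made up. It shows that after git fetch you can list the incoming commits, see which files they touch, and count how far behind you are, all before your own files change.

```shell
# Sandbox identity so commits work without touching your global Git config
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/remote.git"          # stands in for GitHub

# Your clone: create main and publish it
git clone -q "$sandbox/remote.git" "$sandbox/me" 2>/dev/null
cd "$sandbox/me"
echo "v1" > analysis.do
git add analysis.do
git commit -q -m "Initial analysis script"
git branch -M main
git push -q -u origin main

# A coauthor clones, commits twice, and pushes
git clone -q --branch main "$sandbox/remote.git" "$sandbox/coauthor"
cd "$sandbox/coauthor"
echo "v2" >> analysis.do
git commit -q -am "Revise baseline leverage specification"
echo "v3" >> analysis.do
git commit -q -am "Update Table 3 note after sample change"
git push -q origin main

# Back in your clone: fetch, then inspect before changing anything
cd "$sandbox/me"
git fetch -q origin
git log --oneline main..origin/main                    # incoming commits
git diff --stat main origin/main                       # files they touch
git rev-list --left-right --count main...origin/main   # ahead and behind counts
```

The last command prints two numbers: commits only on your main, then commits only on origin/main.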

Step 4: update your local `main`

git switch main
git pull --rebase origin main

Now your local main is aligned with the latest shared version.

Step 5: bring the new `main` into your research branch

git switch feature/table-4-revision
git rebase main

If your team prefers merge commits, use:

git merge main

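For a personal research branch, I usually prefer rebase because it keeps the branch history easier to read. Rebasing rewrites the branch's commits, so if you have already pushed the branch, the next push needs git push --force-with-lease, and you should avoid rebasing a branch that a collaborator is also working on.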

Step 6: restore your local work if you used `stash`

git stash pop

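Then resolve any conflicts. If git stash pop reports a conflict, Git keeps the stash entry; after you resolve and stage the files, remove it with git stash drop.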

Step 7: rerun the relevant research pipeline

This last step matters a lot in research. Pulling your collaborator’s changes is not enough. You also need to check whether the updated code changes your results or paper text.

After syncing, rerun the relevant pieces of the project:

rebuild the affected dataset
rerun the relevant regression or estimation script
regenerate the affected tables or figures
check whether the manuscript still matches the outputs
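One way to decide which pieces are "relevant" is to map the files that changed during the sync to pipeline stages. The sketch below is purely illustrative: the function names, path patterns, and stage names are all hypothetical and would need to match your own project layout.

```shell
# Hypothetical mapping from a changed path to the pipeline stage to rerun.
stage_for() {
  case "$1" in
    src/01_*|src/02_*) echo "rebuild-data" ;;      # data construction scripts
    src/*)             echo "rerun-analysis" ;;    # estimation scripts
    paper/*)           echo "recompile-paper" ;;   # manuscript sources
    *)                 ;;                          # anything else: no stage
  esac
}

# Read changed paths (one per line) from stdin and print each stage once.
stages_to_rerun() {
  while IFS= read -r changed; do
    stage_for "$changed"
  done | sort -u
}

# Example input; in practice you might pipe in the output of
# git diff --name-only between the commit you had before syncing and HEAD.
printf '%s\n' src/02_build_sample.do src/03_analysis.do README.md | stages_to_rerun
```

A Makefile with proper dependencies does this job more reliably if your project already has one; the point here is only that the list of changed files tells you where to start.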

In research, reproducibility is part of the Git workflow.

7. The short version of the daily collaborator workflow

If I have unfinished local work but want the latest collaborator changes, my default sequence is:

git status -sb
git add src/03_analysis.do paper/results.tex
git commit -m "WIP: checkpoint before sync"
git fetch origin
git switch main
git pull --rebase origin main
git switch feature/table-4-revision
git rebase main

If I do not want a WIP commit, I replace the commit step with:

git stash push -u -m "wip before sync"

and later:

git stash pop

Then I rerun the relevant scripts and check the outputs.
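If you want to rehearse the whole sequence safely, the following sketch replays it in a sandbox: a local bare repository stands in for GitHub, one clone is "you" with unfinished work on a feature branch, and a second clone is a coauthor who pushes to main. Every path, file, and commit message is invented for the rehearsal, and the coauthor edits a different file, so no conflict arises.

```shell
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/remote.git"          # stands in for GitHub

# Your clone: publish main, then start a task branch with unfinished work
git clone -q "$sandbox/remote.git" "$sandbox/me" 2>/dev/null
cd "$sandbox/me"
mkdir -p src paper
echo "regress y x" > src/03_analysis.do
echo "Results draft." > paper/results.tex
git add .
git commit -q -m "Initial project files"
git branch -M main
git push -q -u origin main
git switch -q -c feature/table-4-revision
echo "Table 4 discussion, unfinished." >> paper/results.tex

# Meanwhile a coauthor pushes a change to main
git clone -q --branch main "$sandbox/remote.git" "$sandbox/coauthor"
cd "$sandbox/coauthor"
echo "* coauthor: cluster standard errors by firm" >> src/03_analysis.do
git commit -q -am "Cluster standard errors by firm"
git push -q origin main

# The morning sync, as in the sequence above
cd "$sandbox/me"
git status -sb
git add paper/results.tex
git commit -q -m "WIP: checkpoint before sync"
git fetch -q origin
git switch -q main
git pull -q --rebase origin main
git switch -q feature/table-4-revision
git rebase -q main
git log --oneline -3              # WIP commit now sits on top of the coauthor's commit
```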

8. Habits that prevent Git pain in research projects

Keep main reproducible.
Pull or fetch early in the day before doing several more hours of work.
Track scripts, documentation, and paper source files more carefully than generated outputs.
If a code change affects a table or figure, update the manuscript notes close to the same commit when possible.
Avoid leaving large uncommitted changes in your working directory for too long.
Write commit messages that explain the research step, not just the file change.

Final thought

For collaborative economics research, the goal of Git is not just version control. It is to keep your project understandable and reproducible while several people are editing scripts, data notes, and paper files at the same time.

If you build the habit of checking git status, committing small research tasks, and syncing main carefully before continuing your own branch, Git becomes much easier to manage.
