Git

The agony and the ecstasy

David Turner

Structure of this talk

  • Repository structure
  • Rebase
  • The index
  • Whining
  • Interrupts

Git commits

Hg commits

Parents

DAG

Git Object Database

  • Everything has a SHA
  • .git/objects
  • .git/objects/de/adbeef8b6c07df513be12bcdb10979ada0142ace
  • .git/objects/f1/d2d2f924e986ac86fdf7b36c94bcdf32beec15
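A minimal sketch of how a SHA turns into a path under .git/objects, assuming the usual loose-object layout (first two hex digits as the directory, the rest as the filename). The function names `git_blob_sha` and `loose_object_path` are made up for this example and are not part of git.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Hash content the way git hashes a blob: header, NUL byte, then the data."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(sha: str) -> str:
    """Split a hex SHA into the directory and filename used for loose objects."""
    return f".git/objects/{sha[:2]}/{sha[2:]}"


if __name__ == "__main__":
    sha = git_blob_sha(b"")  # the empty blob
    print(sha)
    print(loose_object_path(sha))
```

The file on disk holds the zlib-compressed object; the sketch only covers naming, not storage.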

git log

  • git log walks the commit graph
  • git log <filename> walks the commit graph too
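A toy sketch of why both forms of git log do the same walk: with no per-file history, a file's log is found by visiting every commit and comparing the file against each parent. The history below, the names `PARENTS`, `FILES`, `walk_history` and `log_file`, and the simple "differs from every parent" rule are all assumptions for illustration; real git orders commits differently and simplifies history with more care.

```python
from collections import deque

# Hypothetical toy history: commit id -> list of parent ids.
PARENTS = {
    "d": ["b", "c"],  # merge commit with two parents
    "c": ["a"],
    "b": ["a"],
    "a": [],          # root commit
}

# Hypothetical snapshots: commit id -> {filename: blob id}.
FILES = {
    "a": {"README": "r1"},
    "b": {"README": "r2"},
    "c": {"README": "r1", "main.c": "m1"},
    "d": {"README": "r2", "main.c": "m1"},
}


def walk_history(start, parents):
    """Visit every commit reachable from start exactly once, breadth-first."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        commit = queue.popleft()
        order.append(commit)
        for parent in parents[commit]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def log_file(start, filename, parents, files):
    """Walk the whole graph, keeping commits where the file differs from every parent."""
    result = []
    for commit in walk_history(start, parents):
        blob = files[commit].get(filename)
        parent_blobs = [files[p].get(filename) for p in parents[commit]]
        if parent_blobs:
            changed = all(blob != pb for pb in parent_blobs)
        else:
            changed = blob is not None  # root commit: file was added here
        if changed:
            result.append(commit)
    return result


if __name__ == "__main__":
    print(walk_history("d", PARENTS))
    print(log_file("d", "README", PARENTS, FILES))
```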

Git packs

  • .git/objects/pack
  • Hold many objects
  • Better compression
  • Fewer syscalls
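A rough illustration of the "better compression" point, assuming 20 made-up revisions of one file. It only compares zlib on each revision separately against zlib on all of them as one stream. Real packs store deltas against similar objects and then compress them, which this sketch does not attempt.

```python
import zlib

# Hypothetical data: 20 revisions of one file that differ only in the last line.
base = "\n".join(f"line {i}: mostly unchanged content" for i in range(200))
revisions = [f"{base}\nrevision {n}\n".encode("utf-8") for n in range(20)]

# Loose-object style: every revision compressed on its own.
separate_total = sum(len(zlib.compress(rev)) for rev in revisions)

# Pack-like style: similar revisions compressed together as one stream.
combined_total = len(zlib.compress(b"".join(revisions)))

print("separate:", separate_total, "bytes")
print("combined:", combined_total, "bytes")
```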

Hg Revlogs

  • Each file has a revlog
  • The manifest has a revlog
  • The commit log is a revlog

Revlog structure

  • (Filename)
  • Compressed/diffed file contents
  • (contents include rename info)
  • Index
  • Index entry:
    • SHA1 ("nodeid")
    • offset in contents
    • length
    • base for diffs
    • parents
    • commit ("changeset")
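The index entry above can be pictured as a small record. This is a sketch only: the field names are invented, it is not Mercurial's on-disk layout, and `revs_to_read` assumes diffs are chained one after another from the base revision, which may not hold for every revlog format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexEntry:
    node_id: str              # SHA1 of this revision ("nodeid")
    offset: int               # where this revision's data starts in the contents
    length: int               # how many bytes of stored data it occupies
    base_rev: int             # revision holding the full text the diffs build on
    parents: Tuple[int, int]  # parent revisions; -1 stands for "no parent" here
    changeset: int            # commit ("changeset") that introduced this revision


def revs_to_read(index: List[IndexEntry], rev: int) -> List[int]:
    """Revisions whose data is needed to rebuild rev: the base, then each diff."""
    return list(range(index[rev].base_rev, rev + 1))


if __name__ == "__main__":
    index = [
        IndexEntry("n0", 0, 100, 0, (-1, -1), 0),  # full text
        IndexEntry("n1", 100, 20, 0, (0, -1), 1),  # diff against rev 0
        IndexEntry("n2", 120, 15, 0, (1, -1), 2),  # diff against rev 1
    ]
    print(revs_to_read(index, 2))
```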

Git branches

  • A branch is a name for a commit
  • .git/refs/heads/[branch name]
  • .git/refs/remotes/[remote]/[branch name]
  • .git/refs/tags/[tag name]
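Since a branch is a name for a commit, resolving one can be as simple as reading a small file. A minimal sketch, assuming the branch is stored as a loose file under refs/heads; refs that git has packed into a single file are ignored here, and `read_branch` is a made-up helper.

```python
import os
import tempfile


def read_branch(git_dir, branch):
    """Return the commit SHA stored in the loose ref file for a branch."""
    path = os.path.join(git_dir, "refs", "heads", *branch.split("/"))
    with open(path, encoding="ascii") as f:
        return f.read().strip()


if __name__ == "__main__":
    # Build a throwaway directory shaped like .git/refs/heads for the demo.
    with tempfile.TemporaryDirectory() as git_dir:
        heads = os.path.join(git_dir, "refs", "heads")
        os.makedirs(heads)
        with open(os.path.join(heads, "master"), "w", encoding="ascii") as f:
            f.write("f1d2d2f924e986ac86fdf7b36c94bcdf32beec15\n")
        print(read_branch(git_dir, "master"))
```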

Git HEAD

  • git checkout switches branches
  • HEAD is usually a "symref"
  • detached HEAD is instead a SHA
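The two HEAD states can be told apart from the file's contents. A small sketch, assuming HEAD holds either a line of the form "ref: refs/heads/..." or a bare 40-digit hex SHA; `parse_head` is a hypothetical helper, not a git API.

```python
import re


def parse_head(text):
    """Classify the contents of a HEAD file as a symref or a detached SHA."""
    text = text.strip()
    if text.startswith("ref: "):
        return ("symref", text[len("ref: "):])
    if re.fullmatch(r"[0-9a-f]{40}", text):
        return ("detached", text)
    raise ValueError(f"unrecognized HEAD contents: {text!r}")


if __name__ == "__main__":
    print(parse_head("ref: refs/heads/master\n"))
    print(parse_head("f1d2d2f924e986ac86fdf7b36c94bcdf32beec15\n"))
```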

Hg clones, branches, and bookmarks

  • Clones: just another copy of the repo
  • Bookmarks: like git branches without namespaces
  • Branches: metadata on changesets

Hg oops

  • Oops, I wish that work had been on a branch
  • Oops, I wish I hadn't pushed

Git branch silliness

  • Branches are files
  • On Mac, filenames are case-insensitive by default
  • What if you create a branch called MaStEr?
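A sketch of the collision: if branches are files and the filesystem folds case, two distinct branch names can land on the same file. `str.casefold` is used here as a stand-in for the filesystem's rules, which is an approximation, and `case_collisions` is a made-up helper.

```python
from collections import defaultdict


def case_collisions(branch_names):
    """Group branch names that would map to the same file when case is ignored."""
    groups = defaultdict(list)
    for name in branch_names:
        groups[name.casefold()].append(name)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    print(case_collisions(["master", "MaStEr", "dev"]))
```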

Rebase

Rebase properties

  • Git: Nothing is lost (yet)
  • Hg: Data is backed up, but second-class
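A toy model of the git side of this slide: rebasing writes new commits on top of the new base, and the old commits stay in the object store until something cleans them up. Everything here is an assumption for illustration: the store is a dict, ids hash only the parent and the message, and `rebase` copies messages instead of applying patches.

```python
import hashlib


def make_commit(store, parent, message):
    """Add a commit to the store; its id depends on its parent and message."""
    commit_id = hashlib.sha1(f"{parent}\n{message}".encode("utf-8")).hexdigest()
    store[commit_id] = {"parent": parent, "message": message}
    return commit_id


def rebase(store, commits, new_base):
    """Replay commits (oldest first) onto new_base and return the new tip."""
    tip = new_base
    for old_id in commits:
        tip = make_commit(store, tip, store[old_id]["message"])
    return tip


if __name__ == "__main__":
    store = {}
    root = make_commit(store, None, "initial")
    main1 = make_commit(store, root, "main work")
    f1 = make_commit(store, root, "feature 1")
    f2 = make_commit(store, f1, "feature 2")

    new_tip = rebase(store, [f1, f2], main1)
    print("old tip still stored:", f2 in store)
    print("new tip differs from old tip:", new_tip != f2)
```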

JBR's SF Chronophysics

  • Type One Plots (Deterministic)
  • Type Two Plots (Elastic)
  • Type Three Plots (Overwriting)
  • Type Four Plots (Quantum-Forking)

Interactive rebase (histedit)


The Index

  • Git only
  • Where commit objects are built
  • Also used for merges

Whining

Hg sucks: repo structure

  • File-based repo design is complex
  • Data can be lost
  • No packs
  • Two-parent limit

Hg sucks: UX

  • Python is slow
  • Hidden commands
  • Missing branch namespacing

Git sucks: UX

  • git checkout does too much
  • git reset does too much
  • git pull is wrong
  • Renames
  • Requires deep understanding

Git sucks: Breakage

  • Crufty codebase
  • Cache-tree extension broken
  • Pruning refs is O(N^2)
  • Case-insensitive FS handling

Everything sucks

  • Wrong-side merges
  • Large histories are expensive
  • Large working copies are slow

Thank you

Diagrams are based on diagrams in GitHub's training kit (CC-BY-4.0).