- BLOB: A "blob" object is nothing but a binary blob of data, and
- doesn't refer to anything else. There is no signature or any
- other verification of the data, so while the object is
- consistent (it _is_ indexed by its sha1 hash, so the data itself
- is certainly correct), it has absolutely no other attributes.
- No name associations, no permissions. It is purely a blob of
- data (ie normally "file contents").
-
- In particular, since the blob is entirely defined by its data,
- if two files in a directory tree (or in multiple different
- versions of the repository) have the same contents, they will
- share the same blob object. The object is toally independent
- of it's location in the directory tree, and renaming a file does
- not change the object that file is associated with in any way.
-
- TREE: The next hierarchical object type is the "tree" object. A tree
- object is a list of mode/name/blob data, sorted by name.
- Alternatively, the mode data may specify a directory mode, in
- which case instead of naming a blob, that name is associated
- with another TREE object.
-
- Like the "blob" object, a tree object is uniquely determined by
- the set contents, and so two separate but identical trees will
- always share the exact same object. This is true at all levels,
- ie it's true for a "leaf" tree (which does not refer to any
- other trees, only blobs) as well as for a whole subdirectory.
-
- For that reason a "tree" object is just a pure data abstraction:
- it has no history, no signatures, no verification of validity,
- except that since the contents are again protected by the hash
- itself, we can trust that the tree is immutable and its contents
- never change.
-
- So you can trust the contents of a tree to be valid, the same
- way you can trust the contents of a blob, but you don't know
- where those contents _came_ from.
-
- Side note on trees: since a "tree" object is a sorted list of
- "filename+content", you can create a diff between two trees
- without actually having to unpack two trees. Just ignore all
- common parts, and your diff will look right. In other words,
- you can effectively (and efficiently) tell the difference
- between any two random trees by O(n) where "n" is the size of
- the difference, rather than the size of the tree.
-
- Side note 2 on trees: since the name of a "blob" depends
- entirely and exclusively on its contents (ie there are no names
- or permissions involved), you can see trivial renames or
- permission changes by noticing that the blob stayed the same.
- However, renames with data changes need a smarter "diff" implementation.
-
-CHANGESET: The "changeset" object is an object that introduces the
- notion of history into the picture. In contrast to the other
- objects, it doesn't just describe the physical state of a tree,
- it describes how we got there, and why.
-
- A "changeset" is defined by the tree-object that it results in,
- the parent changesets (zero, one or more) that led up to that
- point, and a comment on what happened. Again, a changeset is
- not trusted per se: the contents are well-defined and "safe" due
- to the cryptographically strong signatures at all levels, but
- there is no reason to believe that the tree is "good" or that
- the merge information makes sense. The parents do not have to
- actually have any relationship with the result, for example.
-
- Note on changesets: unlike real SCM's, changesets do not contain
- rename information or file mode chane information. All of that
- is implicit in the trees involved (the result tree, and the
- result trees of the parents), and describing that makes no sense
- in this idiotic file manager.
-
-TRUST: The notion of "trust" is really outside the scope of "git", but
- it's worth noting a few things. First off, since everything is
- hashed with SHA1, you _can_ trust that an object is intact and
- has not been messed with by external sources. So the name of an
- object uniquely identifies a known state - just not a state that
- you may want to trust.
-
- Furthermore, since the SHA1 signature of a changeset refers to
- the SHA1 signatures of the tree it is associated with and the
- signatures of the parent, a single named changeset specifies
- uniquely a whole set of history, with full contents. You can't
- later fake any step of the way once you have the name of a
- changeset.
-
- So to introduce some real trust in the system, the only thing
- you need to do is to digitally sign just _one_ special note,
- which includes the name of a top-level changeset. Your digital
- signature shows others that you trust that changeset, and the
- immutability of the history of changesets tells others that they
- can trust the whole history.
-
- In other words, you can easily validate a whole archive by just
- sending out a single email that tells the people the name (SHA1
- hash) of the top changeset, and digitally sign that email using
- something like GPG/PGP.
-
- In particular, you can also have a separate archive of "trust
- points" or tags, which document your (and other peoples) trust.
- You may, of course, archive these "certificates of trust" using
- "git" itself, but it's not something "git" does for you.
-
-Another way of saying the last point: "git" itself only handles content
-integrity, the trust has to come from outside.
-
-
-
- The "index" aka "Current Directory Cache" (".git/index")
-
-
+Blob Object
+~~~~~~~~~~~
+A "blob" object is nothing but a binary blob of data, and doesn't
+refer to anything else. There is no signature or any other
+verification of the data, so while the object is consistent (it 'is'
+indexed by its sha1 hash, so the data itself is certainly correct), it
+has absolutely no other attributes. No name associations, no
+permissions. It is purely a blob of data (i.e. normally "file
+contents").
+
+In particular, since the blob is entirely defined by its data, if two
+files in a directory tree (or in multiple different versions of the
+repository) have the same contents, they will share the same blob
+object. The object is totally independent of its location in the
+directory tree, and renaming a file does not change the object that
+file is associated with in any way.
+
+A blob is typically created when gitlink:git-update-index[1]
+is run, and its data can be accessed by gitlink:git-cat-file[1].
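Editor's sketch (not part of the patch): the claim that a blob is named purely by its contents can be illustrated in a few lines of Python. The header layout used here (`blob <size>` followed by a NUL byte, hashed together with the data) is how current Git computes blob names; it is not spelled out in the text above, so treat it as an assumption about the implementation. The helper name `blob_id` is hypothetical.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Return the SHA1 name of a blob holding `data`.

    Assumes the current Git convention: hash "blob <size>\\0" + data.
    No file name or mode takes part in the hash.
    """
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# Two "files" with the same contents map to the same blob name,
# whatever they are called and wherever they live in the tree.
readme_copy_1 = b"hello\n"
readme_copy_2 = b"hello\n"
same_object = blob_id(readme_copy_1) == blob_id(readme_copy_2)
```

Because the name is derived only from the bytes, renaming a file cannot change the blob it refers to; only the tree entry that points at the blob changes.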
+
+Tree Object
+~~~~~~~~~~~
+The next hierarchical object type is the "tree" object. A tree object
+is a list of mode/name/blob data, sorted by name. Alternatively, the
+mode data may specify a directory mode, in which case instead of
+naming a blob, that name is associated with another tree object.
+
+Like the "blob" object, a tree object is uniquely determined by its
+contents, and so two separate but identical trees will always
+share the exact same object. This is true at all levels, i.e. it's
+true for a "leaf" tree (which does not refer to any other trees, only
+blobs) as well as for a whole subdirectory.
+
+For that reason a "tree" object is just a pure data abstraction: it
+has no history, no signatures, no verification of validity, except
+that since the contents are again protected by the hash itself, we can
+trust that the tree is immutable and its contents never change.
+
+So you can trust the contents of a tree to be valid, the same way you
+can trust the contents of a blob, but you don't know where those
+contents 'came' from.
+
+Side note on trees: since a "tree" object is a sorted list of
+"filename+content", you can create a diff between two trees without
+actually having to unpack two trees. Just ignore all common parts,
+and your diff will look right. In other words, you can effectively
+(and efficiently) tell the difference between any two random trees in
+O(n) time, where "n" is the size of the difference rather than the
+size of the tree.
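Editor's sketch (not part of the patch): the "ignore all common parts" idea can be shown with a toy in-memory object store. Everything here is hypothetical and simplified: the store is a plain dict from tree id to `{name: (mode, object id)}`, the ids are made-up strings, and `DIR_MODE` is just a marker for "this entry is another tree". It is not Git's on-disk format or its actual diff code.

```python
DIR_MODE = "40000"  # assumed marker for entries that point at another tree


def diff_trees(store, old_id, new_id, prefix=""):
    """List (change, path) pairs between two trees in a toy object store.

    Identical ids mean identical contents, so a shared subtree is
    skipped without ever being opened.
    """
    if old_id == new_id:
        return []
    old, new = store[old_id], store[new_id]
    changes = []
    # A real implementation would merge the two sorted entry lists;
    # sorting the union of names keeps this sketch short.
    for name in sorted(old.keys() | new.keys()):
        path = prefix + name
        a, b = old.get(name), new.get(name)
        if a == b:
            continue  # same mode and same object id: nothing to report
        if a and b and a[0] == DIR_MODE and b[0] == DIR_MODE:
            changes += diff_trees(store, a[1], b[1], path + "/")
        elif a is None:
            changes.append(("added", path))
        elif b is None:
            changes.append(("removed", path))
        else:
            changes.append(("modified", path))
    return changes
```

The work done is proportional to the entries visited on the way to each difference; subtrees whose ids match cost a single comparison.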
+
+Side note 2 on trees: since the name of a "blob" depends entirely and
+exclusively on its contents (i.e. there are no names or permissions
+involved), you can see trivial renames or permission changes by
+noticing that the blob stayed the same. However, renames with data
+changes need a smarter "diff" implementation.
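Editor's sketch (not part of the patch): a trivial rename shows up as a path that disappeared and a path that appeared, both pointing at the same blob. The function below assumes two flattened listings of the form `{path: blob id}`; the name `find_exact_renames` and the listing format are hypothetical, and renames combined with content changes are deliberately out of scope.

```python
def find_exact_renames(old, new):
    """Pair removed paths with added paths that carry the same blob id."""
    removed = {p: b for p, b in old.items() if p not in new}
    added = {p: b for p, b in new.items() if p not in old}

    # Index the added paths by blob id so each lookup is a dict access.
    added_by_blob = {}
    for path, blob in sorted(added.items()):
        added_by_blob.setdefault(blob, []).append(path)

    renames = []
    for path, blob in sorted(removed.items()):
        candidates = added_by_blob.get(blob)
        if candidates:
            renames.append((path, candidates.pop(0)))
    return renames
```

A path that exists on both sides with a different blob id is an ordinary modification and is not reported here.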
+
+A tree is created with gitlink:git-write-tree[1] and
+its data can be accessed by gitlink:git-ls-tree[1].
+Two trees can be compared with gitlink:git-diff-tree[1].
+
+Commit Object
+~~~~~~~~~~~~~
+The "commit" object is an object that introduces the notion of
+history into the picture. In contrast to the other objects, it
+doesn't just describe the physical state of a tree, it describes how
+we got there, and why.
+
+A "commit" is defined by the tree-object that it results in, the
+parent commits (zero, one or more) that led up to that point, and a
+comment on what happened. Again, a commit is not trusted per se:
+the contents are well-defined and "safe" due to the cryptographically
+strong signatures at all levels, but there is no reason to believe
+that the tree is "good" or that the merge information makes sense.
+The parents do not have to actually have any relationship with the
+result, for example.
+
+Note on commits: unlike real SCMs, commits do not contain
+rename information or file mode change information. All of that is
+implicit in the trees involved (the result tree, and the result trees
+of the parents), and describing that makes no sense in this idiotic
+file manager.
+
+A commit is created with gitlink:git-commit-tree[1] and
+its data can be accessed by gitlink:git-cat-file[1].
+
+Trust
+~~~~~
+An aside on the notion of "trust". Trust is really outside the scope
+of "git", but it's worth noting a few things. First off, since
+everything is hashed with SHA1, you 'can' trust that an object is
+intact and has not been messed with by external sources. So the name
+of an object uniquely identifies a known state - just not a state that
+you may want to trust.
+
+Furthermore, since the SHA1 signature of a commit refers to the
+SHA1 signatures of the tree it is associated with and the signatures
+of its parents, a single named commit specifies uniquely a whole set
+of history, with full contents. You can't later fake any step of the
+way once you have the name of a commit.
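Editor's sketch (not part of the patch): the reason one commit name pins down the whole history is that each commit's hash covers the names of its tree and parents. The toy below uses a simplified, hypothetical serialization (`tree` and `parent` lines plus a message), not Git's real commit format, and the tree ids are placeholder strings.

```python
import hashlib


def toy_commit_id(tree_id, parent_ids, message):
    """Hash a simplified commit record that names its tree and parents."""
    lines = ["tree " + tree_id]
    lines += ["parent " + p for p in parent_ids]
    lines += ["", message]
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()


def tip_of_history(tree_ids):
    """Build a linear chain of toy commits and return the newest id."""
    parent_ids = []
    tip = None
    for i, tree_id in enumerate(tree_ids):
        tip = toy_commit_id(tree_id, parent_ids, "commit %d" % i)
        parent_ids = [tip]  # the next commit names this one as its parent
    return tip


honest = tip_of_history(["tree-a", "tree-b", "tree-c"])
tampered = tip_of_history(["tree-X", "tree-b", "tree-c"])
```

Changing the oldest tree changes the first commit's id, which changes every descendant's id in turn, so `honest` and `tampered` cannot collide short of a SHA1 collision.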
+
+So to introduce some real trust in the system, the only thing you need
+to do is to digitally sign just 'one' special note, which includes the
+name of a top-level commit. Your digital signature shows others
+that you trust that commit, and the immutability of the history of
+commits tells others that they can trust the whole history.
+
+In other words, you can easily validate a whole archive by just
+sending out a single email that tells people the name (SHA1 hash)
+of the top commit, and digitally signing that email using something
+like GPG/PGP.
+
+To assist in this, git also provides the tag object...
+
+Tag Object
+~~~~~~~~~~
+Git provides the "tag" object to simplify creating, managing and
+exchanging symbolic and signed tokens. At its simplest, a "tag"
+object symbolically identifies another object by containing its
+sha1, type and symbolic name.
+
+However, it can optionally contain additional signature information
+(which git doesn't care about as long as there's less than 8k of
+it). This can then be verified externally to git.
+
+Note that despite the tag features, "git" itself only handles content
+integrity; the trust framework (and signature provision and
+verification) has to come from outside.
+
+A tag is created with gitlink:git-mktag[1],
+its data can be accessed by gitlink:git-cat-file[1],
+and the signature can be verified by
+gitlink:git-verify-tag[1].
+
+
+The "index" aka "Current Directory Cache"
+-----------------------------------------