X-Git-Url: https://git.octo.it/?a=blobdiff_plain;f=Documentation%2Fdiffcore.txt;h=cb4e562004e58439a0055d9ed6a6bdab249dfcdc;hb=162f41292167a800432fc6bbacfcd9f93a90b0c8;hp=6c474d1c0c33748777a1ee28d5cd251489eb7c21;hpb=85c1f337be49eaa9a22e42a1c9958deef5ab57c3;p=git.git
diff --git a/Documentation/diffcore.txt b/Documentation/diffcore.txt
index 6c474d1c..cb4e5620 100644
--- a/Documentation/diffcore.txt
+++ b/Documentation/diffcore.txt
@@ -6,13 +6,12 @@ June 2005
 Introduction
 ------------
 
-The diff commands git-diff-cache, git-diff-files, and
-git-diff-tree can be told to manipulate differences they find
-in unconventional ways before showing diff(1) output. The
-manipulation is collectively called "diffcore transformation".
-This short note describes what they are and how to use them to
-produce diff outputs that are easier to understand than the
-conventional kind.
+The diff commands git-diff-index, git-diff-files, git-diff-tree, and
+git-diff-stages can be told to manipulate differences they find in
+unconventional ways before showing diff(1) output. The manipulation
+is collectively called "diffcore transformation". This short note
+describes what they are and how to use them to produce diff outputs
+that are easier to understand than the conventional kind.
 
 
 The chain of operation
@@ -21,15 +20,18 @@ The chain of operation
 The git-diff-* family works by first comparing two sets of
 files:
 
- - git-diff-cache compares contents of a "tree" object and the
-   working directory (when --cached flag is not used) or a
-   "tree" object and the index file (when --cached flag is
+ - git-diff-index compares contents of a "tree" object and the
+   working directory (when '\--cached' flag is not used) or a
+   "tree" object and the index file (when '\--cached' flag is
    used);
 
  - git-diff-files compares contents of the index file and the
    working directory;
 
- - git-diff-tree compares contents of two "tree" objects.
+ - git-diff-tree compares contents of two "tree" objects;
+
+ - git-diff-stages compares contents of blobs at two stages in an
+   unmerged index file.
 
 In all of these cases, the commands themselves compare
 corresponding paths in the two sets of files. The result of
@@ -37,40 +39,51 @@ comparison is passed from these commands to what is internally
 called "diffcore", in a format similar to what is output when
 the -p option is not used. E.g.
 
-    in-place edit  :100644 100644 bcd1234... 0123456... M file0
-    create         :000000 100644 0000000... 1234567... N file4
-    delete         :100644 000000 1234567... 0000000... D file5
-    unmerged       :000000 000000 0000000... 0000000... U file6
+------------------------------------------------
+in-place edit  :100644 100644 bcd1234... 0123456... M file0
+create         :000000 100644 0000000... 1234567... A file4
+delete         :100644 000000 1234567... 0000000... D file5
+unmerged       :000000 000000 0000000... 0000000... U file6
+------------------------------------------------
 
 The diffcore mechanism is fed a list of such comparison results
 (each of which is called "filepair", although at this point each
 of them talks about a single file), and transforms such a list
 into another list. There are currently 6 such transformations:
 
- - diffcore-pathspec
- - diffcore-break
- - diffcore-rename
- - diffcore-merge-broken
- - diffcore-pickaxe
- - diffcore-order
+- diffcore-pathspec
+- diffcore-break
+- diffcore-rename
+- diffcore-merge-broken
+- diffcore-pickaxe
+- diffcore-order
 
-These are applied in sequence. The set of filepairs git-diff-*
+These are applied in sequence. The set of filepairs git-diff-\*
 commands find are used as the input to diffcore-pathspec, and
 the output from diffcore-pathspec is used as the input to the
 next transformation. The final result is then passed to the
 output routine and generates either diff-raw format (see Output
-format sections of the manual for git-diff-* commands) or
+format sections of the manual for git-diff-\* commands) or
 diff-patch format.
 
 
-diffcore-pathspec
------------------
+diffcore-pathspec: For Ignoring Files Outside Our Consideration
+---------------------------------------------------------------
 
 The first transformation in the chain is diffcore-pathspec, and
 is controlled by giving the pathname parameters to the
 git-diff-* commands on the command line. The pathspec is used
 to limit the world diff operates in. It removes the filepairs
-outside the specified set of pathnames.
+outside the specified set of pathnames. E.g. If the input set
+of filepairs included:
+
+------------------------------------------------
+:100644 100644 bcd1234... 0123456... M junkfile
+------------------------------------------------
+
+but the command invocation was "git-diff-files myfile", then the
+junkfile entry would be removed from the list because only "myfile"
+is under consideration.
 
 Implementation note. For performance reasons, git-diff-tree
 uses the pathname parameters on the command line to cull set of
@@ -78,8 +91,8 @@ filepairs it feeds the diffcore mechanism itself, and does not
 use diffcore-pathspec, but the end result is the same.
 
 
-diffcore-break
---------------
+diffcore-break: For Splitting Up "Complete Rewrites"
+----------------------------------------------------
 
 The second transformation in the chain is diffcore-break, and is
 controlled by the -B option to the git-diff-* commands. This is
@@ -87,13 +100,17 @@ used to detect a filepair that represents "complete rewrite" and
 break such filepair into two filepairs that represent delete and
 create. E.g. If the input contained this filepair:
 
-    :100644 100644 bcd1234... 0123456... M file0
+------------------------------------------------
+:100644 100644 bcd1234... 0123456... M file0
+------------------------------------------------
 
 and if it detects that the file "file0" is completely rewritten,
 it changes it to:
 
-    :100644 000000 bcd1234... 0000000... D file0
-    :000000 100644 0000000... 0123456... N file0
+------------------------------------------------
+:100644 000000 bcd1234... 0000000... D file0
+:000000 100644 0000000... 0123456... A file0
+------------------------------------------------
 
 For the purpose of breaking a filepair, diffcore-break examines
 the extent of changes between the contents of the files before
@@ -109,61 +126,69 @@ the original is used), and can be customized by giving a number
 after "-B" option (e.g. "-B75" to tell it to use 75%).
 
 
-diffcore-rename
----------------
+diffcore-rename: For Detection Renames and Copies
+-------------------------------------------------
 
 This transformation is used to detect renames and copies, and is
 controlled by the -M option (to detect renames) and the -C option
 (to detect copies as well) to the git-diff-* commands. If the
 input contained these filepairs:
 
-    :100644 000000 0123456... 0000000... D fileX
-    :000000 100644 0000000... 0123456... N file0
+------------------------------------------------
+:100644 000000 0123456... 0000000... D fileX
+:000000 100644 0000000... 0123456... A file0
+------------------------------------------------
 
 and the contents of the deleted file fileX is similar enough to
 the contents of the created file file0, then rename detection
 merges these filepairs and creates:
 
-    :100644 100644 0123456... 0123456... R100 fileX file0
+------------------------------------------------
+:100644 100644 0123456... 0123456... R100 fileX file0
+------------------------------------------------
 
-When the "-C" option is used, the original contents of modified
-files and contents of unchanged files are considered as
-candidates of the source files in rename/copy operation, in
-addition to the deleted files. If the input were like these
-filepairs, that talk about a modified file fileY and a newly
+When the "-C" option is used, the original contents of modified files,
+and deleted files (and also unmodified files, if the
+"\--find-copies-harder" option is used) are considered as candidates
+of the source files in rename/copy operation. If the input were like
+these filepairs, that talk about a modified file fileY and a newly
 created file file0:
 
-    :100644 100644 0123456... 1234567... M fileY
-    :000000 100644 0000000... 0123456... N file0
+------------------------------------------------
+:100644 100644 0123456... 1234567... M fileY
+:000000 100644 0000000... bcd3456... A file0
+------------------------------------------------
 
 the original contents of fileY and the resulting contents of
 file0 are compared, and if they are similar enough, they are
 changed to:
 
-    :100644 100644 0123456... 1234567... M fileY
-    :100644 100644 0123456... 0123456... C100 fileY file0
+------------------------------------------------
+:100644 100644 0123456... 1234567... M fileY
+:100644 100644 0123456... bcd3456... C100 fileY file0
+------------------------------------------------
 
 In both rename and copy detection, the same "extent of changes"
 algorithm used in diffcore-break is used to determine if two
 files are "similar enough", and can be customized to use
-similarity score different from the default 50% by giving a
-number after "-M" or "-C" option (e.g. "-M8" to tell it to use
+a similarity score different from the default of 50% by giving a
+number after the "-M" or "-C" option (e.g. "-M8" to tell it to use
 8/10 = 80%).
 
-Note. When the "-C" option is used with --find-copies-harder
-option, git-diff-* commands feed unmodified filepairs to
+Note. When the "-C" option is used with `\--find-copies-harder`
+option, git-diff-\* commands feed unmodified filepairs to
 diffcore mechanism as well as modified ones. This lets the copy
 detector consider unmodified files as copy source candidates at
-the expense of making it slower. Without --find-copies-harder,
-git-diff-* commands can detect copies only if the file that was
+the expense of making it slower. Without `\--find-copies-harder`,
+git-diff-\* commands can detect copies only if the file that was
 copied happened to have been modified in the same changeset.
 
 
-diffcore-merge-broken
----------------------
+diffcore-merge-broken: For Putting "Complete Rewrites" Back Together
+--------------------------------------------------------------------
 
 This transformation is used to merge filepairs broken by
-diffcore-break, and were not transformed into rename/copy by
+diffcore-break, and not transformed into rename/copy by
 diffcore-rename, back into a single modification. This always
 runs when diffcore-break is used.
 
@@ -186,27 +211,27 @@ material is deleted, the broken pairs are merged back into a
 single modification) by giving a second number to -B option,
 like these:
 
- -B50/60 (give 50% "break score" to diffcore-break, use
-          60% for diffcore-merge-broken).
- -B/60   (the same as above, since diffcore-break defautls to
-          50%).
+* -B50/60 (give 50% "break score" to diffcore-break, use 60%
+  for diffcore-merge-broken).
+
+* -B/60 (the same as above, since diffcore-break defaults to 50%).
 
 Note that earlier implementation left a broken pair as a separate
-creation and deletion patches. This was unnecessary hack and
+creation and deletion patches. This was an unnecessary hack and
 the latest implementation always merges all the broken pairs
 back into modifications, but the resulting patch output is
-formatted differently to still let the reviewing easier for such
+formatted differently for easier review in case of such
 a complete rewrite by showing the entire contents of old version
 prefixed with '-', followed by the entire contents of new
 version prefixed with '+'.
 
 
-diffcore-pickaxe
-----------------
+diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
+---------------------------------------------------------------------
 
 This transformation is used to find filepairs that represent
 changes that touch a specified string, and is controlled by the
--S option and the --pickaxe-all option to the git-diff-*
+-S option and the `\--pickaxe-all` option to the git-diff-*
 commands.
 
 When diffcore-pickaxe is in use, it checks if there are
@@ -215,34 +240,36 @@ whose "result" side does not. Such a filepair represents "the
 string appeared in this changeset". It also checks for the
 opposite case that loses the specified string.
 
-When --pickaxe-all is not in effect, diffcore-pickaxe leaves
-only such filepairs that touches the specified string in its
-output. When --pickaxe-all is used, diffcore-pickaxe leaves all
+When `\--pickaxe-all` is not in effect, diffcore-pickaxe leaves
+only such filepairs that touch the specified string in its
+output. When `\--pickaxe-all` is used, diffcore-pickaxe leaves all
 filepairs intact if there is such a filepair, or makes the
 output empty otherwise. The latter behaviour is designed to
 make reviewing of the changes in the context of the whole
 changeset easier.
 
 
-diffcore-order
---------------
+diffcore-order: For Sorting the Output Based on Filenames
+---------------------------------------------------------
 
 This is used to reorder the filepairs according to the user's
 (or project's) taste, and is controlled by the -O option to the
 git-diff-* commands.
 
-This takes a text file each of whose line is a shell glob
+This takes a text file each of whose lines is a shell glob
 pattern. Filepairs that match a glob pattern on an earlier line
 in the file are output before ones that match a later line, and
 filepairs that do not match any glob pattern are output last.
 
-As an example, typical orderfile for the core GIT probably
-should look like this:
-
-    README
-    Makefile
-    Documentation
-    *.h
-    *.c
-    t
+As an example, a typical orderfile for the core git probably
+would look like this:
+
+------------------------------------------------
+README
+Makefile
+Documentation
+*.h
+*.c
+t
+------------------------------------------------
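
The ordering rule described in the diffcore-order section of this patch can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not git's implementation: `order_paths` is a hypothetical helper, each path is matched as a whole string with `fnmatch.fnmatchcase` (how git treats directory prefixes is not modeled), paths with the same rank keep their input order, and the sample path list is invented for the example.

```python
from fnmatch import fnmatchcase


def order_paths(paths, patterns):
    """Order paths by the first orderfile pattern each one matches.

    Hypothetical helper for illustration only; git's own matching rules
    (for example, for paths below a directory) are not modeled here.
    """
    def rank(path):
        # The rank is the index of the earliest matching pattern line.
        for index, pattern in enumerate(patterns):
            if fnmatchcase(path, pattern):
                return index
        # Paths that match no pattern are output last.
        return len(patterns)

    # sorted() is stable, so paths with equal rank keep their input order.
    return sorted(paths, key=rank)


# Pattern lines taken from the orderfile example in the patch.
orderfile = ["README", "Makefile", "Documentation", "*.h", "*.c", "t"]
# Invented sample paths, assumed for the example.
paths = ["diff.c", "cache.h", "COPYING", "Makefile", "README"]
print(order_paths(paths, orderfile))
```

With that orderfile, `README` and `Makefile` come first, the header precedes the C source, and `COPYING`, which matches no pattern, is output last.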