Merge branch 'np/delta' into next

author Junio C Hamano <junkio@cox.net>

Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)

committer Junio C Hamano <junkio@cox.net>

Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)
author Junio C Hamano <junkio@cox.net>
Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)
committer Junio C Hamano <junkio@cox.net>
Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)
diff --git a/.gitignore b/.gitignore

index d7e8d2a..94f66d5 100644 (file)
--- a/.gitignore
+++ b/.gitignore
@@ -84,6 +84,7 @@ git-resolve
  git-rev-list
  git-rev-parse
  git-revert
+git-rm
  git-send-email
  git-send-pack
  git-sh-setup
diff --git a/Documentation/git-cvsserver.txt b/Documentation/git-cvsserver.txt

new file mode 100644 (file)

index 0000000..88f07ff
--- /dev/null
+++ b/Documentation/git-cvsserver.txt
@@ -0,0 +1,89 @@
+git-cvsserver(1)
+================
+
+NAME
+----
+git-cvsserver - A CVS server emulator for git
+
+
+SYNOPSIS
+--------
+[verse]
+export CVS_SERVER=git-cvsserver
+'cvs' -d :ext:user@server/path/repo.git co <HEAD_name>
+
+
+DESCRIPTION
+-----------
+
+This application is a CVS emulation layer for git.
+
+It is highly functional. However, not all methods are implemented,
+and for those methods that are implemented,
+not all switches are implemented.
+
+Testing has been done using both the CLI CVS client, and the Eclipse CVS
+plugin. Most functionality works fine with both of these clients.
+
+LIMITATIONS
+-----------
+Currently gitcvs only works over ssh connections.
+
+
+INSTALLATION
+------------
+1. Put server.pl somewhere useful on the same machine that is hosting your git repos
+
+2. For each repo that you want accessible from CVS you need to edit config in
+   the repo and add the following section.
+
+   [gitcvs]
+        enabled=1
+        logfile=/path/to/logfile
+
+   n.b. you need to ensure each user that is going to invoke server.pl has
+   write access to the log file.
+
+5. On each client machine you need to set the following variables.
+     CVSROOT should be set as per normal, but the directory should point at the
+             appropriate git repo.
+     CVS_SERVER should be set to the server.pl script that has been put on the
+                remote machine.
+
+6. Clients should now be able to check out modules (where modules are the names
+   of branches in git).
+     $ cvs co -d mylocaldir master
+
+Operations supported
+--------------------
+
+All the operations required for normal use are supported, including
+checkout, diff, status, update, log, add, remove, commit.
+Legacy monitoring operations are not supported (edit, watch and related).
+Exports and tagging (tags and branches) are not supported at this stage.
+
+The server will set the -k mode to binary when relevant. In proper GIT
+tradition, the contents of the files are always respected.
+No keyword expansion or newline munging is supported.
+
+Dependencies
+------------
+
+git-cvsserver depends on DBD::SQLite.
+
+Copyright and Authors
+---------------------
+
+This program is copyright The Open University UK - 2006.
+
+Authors: Martyn Smith    <martyn@catalyst.net.nz>
+         Martin Langhoff <martin@catalyst.net.nz>
+         with ideas and patches from participants of the git-list <git@vger.kernel.org>.
+
+Documentation
+--------------
+Documentation by Martyn Smith <martyn@catalyst.net.nz> and Martin Langhoff <martin@catalyst.net.nz>Matthias Urlichs <smurf@smurf.noris.de>.
+
+GIT
+---
+Part of the gitlink:git[7] suite
diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt

new file mode 100644 (file)

index 0000000..401bfb2
--- /dev/null
+++ b/Documentation/git-rm.txt
@@ -0,0 +1,89 @@
+git-rm(1)
+=========
+
+NAME
+----
+git-rm - Remove files from the working tree and from the index.
+
+SYNOPSIS
+--------
+'git-rm' [-f] [-n] [-v] [--] <file>...
+
+DESCRIPTION
+-----------
+A convenience wrapper for git-update-index --remove. For those coming
+from cvs, git-rm provides an operation similar to "cvs rm" or "cvs
+remove".
+
+
+OPTIONS
+-------
+<file>...::
+       Files to remove from the index and optionally, from the
+       working tree as well.
+
+-f::
+       Remove files from the working tree as well as from the index.
+
+-n::
+        Don't actually remove the file(s), just show if they exist in
+        the index.
+
+-v::
+        Be verbose.
+
+--::
+       This option can be used to separate command-line options from
+       the list of files, (useful when filenames might be mistaken
+       for command-line options).
+
+
+DISCUSSION
+----------
+
+The list of <file> given to the command is fed to `git-ls-files`
+command to list files that are registered in the index and
+are not ignored/excluded by `$GIT_DIR/info/exclude` file or
+`.gitignore` file in each directory.  This means two things:
+
+. You can put the name of a directory on the command line, and the
+  command will remove all files in it and its subdirectories (the
+  directories themselves are never removed from the working tree);
+
+. Giving the name of a file that is not in the index does not
+  remove that file.
+
+
+EXAMPLES
+--------
+git-rm Documentation/\\*.txt::
+
+       Removes all `\*.txt` files from the index that are under the
+       `Documentation` directory and any of its subdirectories. The
+       files are not removed from the working tree.
++
+Note that the asterisk `\*` is quoted from the shell in this
+example; this lets the command include the files from
+subdirectories of `Documentation/` directory.
+
+git-rm -f git-*.sh::
+
+       Remove all git-*.sh scripts that are in the index. The files
+       are removed from the index, and (because of the -f option),
+       from the working tree as well. Because this example lets the
+       shell expand the asterisk (i.e. you are listing the files
+       explicitly), it does not remove `subdir/git-foo.sh`.
+
+
+Author
+------
+Written by Linus Torvalds <torvalds@osdl.org>
+
+Documentation
+--------------
+Documentation by Junio C Hamano and the git-list <git@vger.kernel.org>.
+
+GIT
+---
+Part of the gitlink:git[7] suite
+
diff --git a/Makefile b/Makefile

index 0c04882..8e6bbce 100644 (file)
--- a/Makefile
+++ b/Makefile
@@ -120,7 +120,7 @@ SCRIPT_SH = \
         git-merge-one-file.sh git-parse-remote.sh \
         git-prune.sh git-pull.sh git-push.sh git-rebase.sh \
         git-repack.sh git-request-pull.sh git-reset.sh \
-       git-resolve.sh git-revert.sh git-sh-setup.sh \
+       git-resolve.sh git-revert.sh git-rm.sh git-sh-setup.sh \
         git-tag.sh git-verify-tag.sh git-whatchanged.sh \
         git-applymbox.sh git-applypatch.sh git-am.sh \
         git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
@@ -130,6 +130,7 @@ SCRIPT_SH = \
  SCRIPT_PERL = \
         git-archimport.perl git-cvsimport.perl git-relink.perl \
         git-shortlog.perl git-fmt-merge-msg.perl git-rerere.perl \
+       git-annotate.perl git-cvsserver.perl \
         git-svnimport.perl git-mv.perl git-cvsexportcommit.perl
  
  SCRIPT_PYTHON = \
@@ -164,7 +165,7 @@ PROGRAMS = \
         git-upload-pack$X git-verify-pack$X git-write-tree$X \
         git-update-ref$X git-symbolic-ref$X git-check-ref-format$X \
         git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-       git-describe$X git-merge-tree$X
+       git-describe$X git-merge-tree$X git-blame$X
  
  # what 'all' will build and 'install' will install, in gitexecdir
  ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
diff --git a/blame.c b/blame.c

new file mode 100644 (file)

index 0000000..1e65546
--- /dev/null
+++ b/blame.c
@@ -0,0 +1,443 @@
+#include <assert.h>
+
+#include "cache.h"
+#include "refs.h"
+#include "tag.h"
+#include "commit.h"
+#include "tree.h"
+#include "blob.h"
+#include "epoch.h"
+#include "diff.h"
+
+#define DEBUG 0
+
+struct commit** blame_lines;
+int num_blame_lines;
+
+struct util_info
+{
+    int* line_map;
+    int num_lines;
+    unsigned char sha1[20]; /* blob sha, not commit! */
+    char* buf;
+    unsigned long size;
+//    const char* path;
+};
+
+struct chunk
+{
+    int off1, len1; // ---
+    int off2, len2; // +++
+};
+
+struct patch
+{
+    struct chunk* chunks;
+    int num;
+};
+
+static void get_blob(struct commit* commit);
+
+int num_get_patch = 0;
+int num_commits = 0;
+
+struct patch* get_patch(struct commit* commit, struct commit* other)
+{
+    struct patch* ret = xmalloc(sizeof(struct patch));
+    ret->chunks = NULL;
+    ret->num = 0;
+
+    struct util_info* info_c = (struct util_info*) commit->object.util;
+    struct util_info* info_o = (struct util_info*) other->object.util;
+
+    if(!memcmp(info_c->sha1, info_o->sha1, 20))
+        return ret;
+
+    get_blob(commit);
+    get_blob(other);
+
+    FILE* fout = fopen("/tmp/git-blame-tmp1", "w");
+    if(!fout)
+        die("fopen tmp1 failed: %s", strerror(errno));
+
+    if(fwrite(info_c->buf, info_c->size, 1, fout) != 1)
+        die("fwrite 1 failed: %s", strerror(errno));
+    fclose(fout);
+
+    fout = fopen("/tmp/git-blame-tmp2", "w");
+    if(!fout)
+        die("fopen tmp2 failed: %s", strerror(errno));
+
+    if(fwrite(info_o->buf, info_o->size, 1, fout) != 1)
+        die("fwrite 2 failed: %s", strerror(errno));
+    fclose(fout);
+
+    FILE* fin = popen("diff -u0 /tmp/git-blame-tmp1 /tmp/git-blame-tmp2", "r");
+    if(!fin)
+        die("popen failed: %s", strerror(errno));
+
+    char buf[1024];
+    while(fgets(buf, sizeof(buf), fin)) {
+        if(buf[0] != '@' || buf[1] != '@')
+            continue;
+
+        if(DEBUG)
+            printf("chunk line: %s", buf);
+        ret->num++;
+        ret->chunks = xrealloc(ret->chunks, sizeof(struct chunk)*ret->num);
+        struct chunk* chunk = &ret->chunks[ret->num-1];
+
+        assert(!strncmp(buf, "@@ -", 4));
+
+        char* start = buf+4;
+        char* sp = index(start, ' ');
+        *sp = '\0';
+        if(index(start, ',')) {
+            int ret = sscanf(start, "%d,%d", &chunk->off1, &chunk->len1);
+            assert(ret == 2);
+        } else {
+            int ret = sscanf(start, "%d", &chunk->off1);
+            assert(ret == 1);
+            chunk->len1 = 1;
+        }
+        *sp = ' ';
+
+        start = sp+1;
+        sp = index(start, ' ');
+        *sp = '\0';
+        if(index(start, ',')) {
+            int ret = sscanf(start, "%d,%d", &chunk->off2, &chunk->len2);
+            assert(ret == 2);
+        } else {
+            int ret = sscanf(start, "%d", &chunk->off2);
+            assert(ret == 1);
+            chunk->len2 = 1;
+        }
+        *sp = ' ';
+
+        if(chunk->off1 > 0)
+            chunk->off1 -= 1;
+        if(chunk->off2 > 0)
+            chunk->off2 -= 1;
+
+        assert(chunk->off1 >= 0);
+        assert(chunk->off2 >= 0);
+    }
+    fclose(fin);
+
+    num_get_patch++;
+    return ret;
+}
+
+void free_patch(struct patch* p)
+{
+    free(p->chunks);
+    free(p);
+}
+
+static int get_blob_sha1_internal(unsigned char *sha1, const char *base, int baselen,
+                                  const char *pathname, unsigned mode, int stage);
+
+
+static unsigned char blob_sha1[20];
+static int get_blob_sha1(struct tree* t, const char* pathname, unsigned char* sha1)
+{
+    const char *pathspec[2];
+    pathspec[0] = pathname;
+    pathspec[1] = NULL;
+    memset(blob_sha1, 0, sizeof(blob_sha1));
+    read_tree_recursive(t, "", 0, 0, pathspec, get_blob_sha1_internal);
+
+    int i;
+    for(i = 0; i < 20; i++) {
+        if(blob_sha1[i] != 0)
+            break;
+    }
+
+    if(i == 20)
+        return -1;
+
+    memcpy(sha1, blob_sha1, 20);
+    return 0;
+}
+
+static int get_blob_sha1_internal(unsigned char *sha1, const char *base, int baselen,
+                                  const char *pathname, unsigned mode, int stage)
+{
+//    printf("Got blob: %s base: '%s' baselen: %d pathname: '%s' mode: %o stage: %d\n",
+//           sha1_to_hex(sha1), base, baselen, pathname, mode, stage);
+
+    if(S_ISDIR(mode))
+        return READ_TREE_RECURSIVE;
+
+    memcpy(blob_sha1, sha1, 20);
+    return -1;
+}
+
+static void get_blob(struct commit* commit)
+{
+    struct util_info* info = commit->object.util;
+    char type[20];
+
+    if(info->buf)
+        return;
+
+    info->buf = read_sha1_file(info->sha1, type, &info->size);
+    assert(!strcmp(type, "blob"));
+}
+
+void print_patch(struct patch* p)
+{
+    printf("Num chunks: %d\n", p->num);
+    int i;
+    for(i = 0; i < p->num; i++) {
+        printf("%d,%d %d,%d\n", p->chunks[i].off1, p->chunks[i].len1, p->chunks[i].off2, p->chunks[i].len2);
+    }
+}
+
+
+// p is a patch from commit to other.
+void fill_line_map(struct commit* commit, struct commit* other, struct patch* p)
+{
+    int num_lines = ((struct util_info*) commit->object.util)->num_lines;
+    int* line_map = ((struct util_info*) commit->object.util)->line_map;
+    int num_lines2 = ((struct util_info*) other->object.util)->num_lines;
+    int* line_map2 = ((struct util_info*) other->object.util)->line_map;
+    int cur_chunk = 0;
+    int i1, i2;
+
+    if(p->num && DEBUG)
+        print_patch(p);
+
+    for(i1 = 0; i1 < num_lines; i1++)
+        line_map[i1] = -1;
+
+    if(DEBUG)
+        printf("num lines 1: %d num lines 2: %d\n", num_lines, num_lines2);
+
+    for(i1 = 0, i2 = 0; i1 < num_lines; i1++, i2++) {
+        if(DEBUG > 1)
+            printf("%d %d\n", i1, i2);
+
+        if(i2 >= num_lines2)
+            break;
+
+        line_map[i1] = line_map2[i2];
+
+        struct chunk* chunk = NULL;
+        if(cur_chunk < p->num)
+            chunk = &p->chunks[cur_chunk];
+
+        if(chunk && chunk->off1 == i1) {
+            i2 = chunk->off2;
+
+            if(chunk->len1 > 0)
+                i1 += chunk->len1-1;
+            if(chunk->len2 > 0)
+                i2 += chunk->len2-1;
+            cur_chunk++;
+        }
+    }
+}
+
+int map_line(struct commit* commit, int line)
+{
+    struct util_info* info = commit->object.util;
+    assert(line >= 0 && line < info->num_lines);
+    return info->line_map[line];
+}
+
+int fill_util_info(struct commit* commit, const char* path)
+{
+    if(commit->object.util)
+        return 0;
+
+    struct util_info* util = xmalloc(sizeof(struct util_info));
+    util->buf = NULL;
+    util->size = 0;
+    util->num_lines = -1;
+    util->line_map = NULL;
+
+    commit->object.util = util;
+
+    if(get_blob_sha1(commit->tree, path, util->sha1))
+        return -1;
+
+    return 0;
+}
+
+void alloc_line_map(struct commit* commit)
+{
+    struct util_info* util = commit->object.util;
+
+    if(util->line_map)
+        return;
+
+    get_blob(commit);
+
+    int i;
+    util->num_lines = 0;
+    for(i = 0; i < util->size; i++) {
+        if(util->buf[i] == '\n')
+            util->num_lines++;
+    }
+    util->line_map = xmalloc(sizeof(int)*util->num_lines);
+}
+
+void copy_line_map(struct commit* dst, struct commit* src)
+{
+    struct util_info* u_dst = dst->object.util;
+    struct util_info* u_src = src->object.util;
+
+    u_dst->line_map = u_src->line_map;
+    u_dst->num_lines = u_src->num_lines;
+    u_dst->buf = u_src->buf;
+    u_dst->size = u_src->size;
+}
+
+void process_commits(struct commit_list* list, const char* path)
+{
+    int i;
+
+    while(list) {
+        struct commit* commit = pop_commit(&list);
+        struct commit_list* parents;
+        struct util_info* info;
+
+        info = commit->object.util;
+        num_commits++;
+        if(DEBUG)
+            printf("\nProcessing commit: %d %s\n", num_commits, sha1_to_hex(commit->object.sha1));
+        for(parents = commit->parents;
+            parents != NULL; parents = parents->next) {
+            struct commit* parent = parents->item;
+
+            if(parse_commit(parent) < 0)
+                die("parse_commit error");
+
+            if(DEBUG)
+                printf("parent: %s\n", sha1_to_hex(parent->object.sha1));
+
+            if(fill_util_info(parent, path))
+                continue;
+
+            // Temporarily assign everything to the parent.
+            int num_blame = 0;
+            for(i = 0; i < num_blame_lines; i++) {
+                if(blame_lines[i] == commit) {
+                    num_blame++;
+                    blame_lines[i] = parent;
+                }
+            }
+
+            if(num_blame == 0)
+                continue;
+
+            struct patch* patch = get_patch(parent, commit);
+            if(patch->num == 0) {
+                copy_line_map(parent, commit);
+            } else {
+                alloc_line_map(parent);
+                fill_line_map(parent, commit, patch);
+            }
+
+            for(i = 0; i < patch->num; i++) {
+                int l;
+                for(l = 0; l < patch->chunks[i].len2; l++) {
+                    int mapped_line = map_line(commit, patch->chunks[i].off2 + l);
+                    if(mapped_line != -1 && blame_lines[mapped_line] == parent)
+                        blame_lines[mapped_line] = commit;
+                }
+            }
+            free_patch(patch);
+        }
+    }
+}
+
+#define SEEN 1
+struct commit_list* get_commit_list(struct commit* commit, const char* pathname)
+{
+    struct commit_list* ret = NULL;
+    struct commit_list* process = NULL;
+    unsigned char sha1[20];
+
+    commit_list_insert(commit, &process);
+
+    while(process) {
+        struct commit* com = pop_commit(&process);
+        if(com->object.flags & SEEN)
+            continue;
+
+        com->object.flags |= SEEN;
+        commit_list_insert(com, &ret);
+        struct commit_list* parents;
+
+        parse_commit(com);
+
+        for(parents = com->parents;
+            parents != NULL; parents = parents->next) {
+            struct commit* parent = parents->item;
+
+            parse_commit(parent);
+
+            if(!get_blob_sha1(parent->tree, pathname, sha1))
+                commit_list_insert(parent, &process);
+        }
+    }
+
+    return ret;
+}
+
+int main(int argc, const char **argv)
+{
+    unsigned char sha1[20];
+    struct commit *commit;
+    const char* filename;
+    int i;
+
+    setup_git_directory();
+
+    if (argc != 3)
+        die("Usage: blame commit-ish file");
+
+    if (get_sha1(argv[1], sha1))
+        die("get_sha1 failed");
+
+    commit = lookup_commit_reference(sha1);
+
+    filename = argv[2];
+
+    struct commit_list* list = get_commit_list(commit, filename);
+    sort_in_topological_order(&list, 1);
+
+    if(fill_util_info(commit, filename)) {
+        printf("%s not found in %s\n", filename, argv[1]);
+        return 0;
+    }
+    alloc_line_map(commit);
+
+    struct util_info* util = commit->object.util;
+    num_blame_lines = util->num_lines;
+    blame_lines = xmalloc(sizeof(struct commit*)*num_blame_lines);
+
+
+    for(i = 0; i < num_blame_lines; i++) {
+        blame_lines[i] = commit;
+
+        ((struct util_info*) commit->object.util)->line_map[i] = i;
+    }
+
+    process_commits(list, filename);
+
+    for(i = 0; i < num_blame_lines; i++) {
+        printf("%d %s\n", i+1-1, sha1_to_hex(blame_lines[i]->object.sha1));
+//        printf("%d %s\n", i+1-1, find_unique_abbrev(blame_lines[i]->object.sha1, 6));
+    }
+
+    if(DEBUG) {
+        printf("num get patch: %d\n", num_get_patch);
+        printf("num commits: %d\n", num_commits);
+    }
+
+    return 0;
+}
diff --git a/commit.c b/commit.c

index c550a00..06d5439 100644 (file)
--- a/commit.c
+++ b/commit.c
@@ -212,7 +212,8 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
         if (memcmp(bufptr, "tree ", 5))
                 return error("bogus commit object %s", sha1_to_hex(item->object.sha1));
         if (get_sha1_hex(bufptr + 5, parent) < 0)
-               return error("bad tree pointer in commit %s\n", sha1_to_hex(item->object.sha1));
+               return error("bad tree pointer in commit %s",
+                            sha1_to_hex(item->object.sha1));
         item->tree = lookup_tree(parent);
         if (item->tree)
                 n_refs++;
diff --git a/contrib/gitview/gitview b/contrib/gitview/gitview

index 5c338c0..4b52eb7 100755 (executable)
--- a/contrib/gitview/gitview
+++ b/contrib/gitview/gitview
@@ -454,11 +454,7 @@ class GitView:
  
                 self.bt_sha1 = { }
                 ls_remote = re.compile('^(.{40})\trefs/([^^]+)(?:\\^(..))?$');
-               git_dir = os.getenv("GIT_DIR")
-               if (git_dir == None):
-                       git_dir = ".git"
-
-               fp = os.popen('git ls-remote ' + git_dir)
+               fp = os.popen('git ls-remote "${GIT_DIR-.git}"')
                 while 1:
                         line = string.strip(fp.readline())
                         if line == '':
diff --git a/diffcore-rename.c b/diffcore-rename.c

index 39d9126..ffd126a 100644 (file)
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -176,8 +176,10 @@ static int estimate_similarity(struct diff_filespec *src,
         /* A delta that has a lot of literal additions would have
          * big delta_size no matter what else it does.
          */
-       if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
+       if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE) {
+               free(delta);
                 return 0;
+       }
  
         /* Estimate the edit size by interpreting delta. */
         if (count_delta(delta, delta_size, &src_copied, &literal_added)) {
diff --git a/fetch-pack.c b/fetch-pack.c

index aa6f42a..09738fe 100644 (file)
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -8,7 +8,7 @@ static int keep_pack;
  static int quiet;
  static int verbose;
  static const char fetch_pack_usage[] =
-"git-fetch-pack [-q] [-v] [-k] [--exec=upload-pack] [host:]directory <refs>...";
+"git-fetch-pack [-q] [-v] [-k] [--thin] [--exec=upload-pack] [host:]directory <refs>...";
  static const char *exec = "git-upload-pack";
  
  #define COMPLETE       (1U << 0)
@@ -18,7 +18,7 @@ static const char *exec = "git-upload-pack";
  #define POPPED         (1U << 4)
  
  static struct commit_list *rev_list = NULL;
-static int non_common_revs = 0, multi_ack = 0;
+static int non_common_revs = 0, multi_ack = 0, use_thin_pack = 0;
  
  static void rev_list_push(struct commit *commit, int mark)
  {
@@ -156,8 +156,9 @@ static int find_common(int fd[2], unsigned char *result_sha1,
                         continue;
                 }
  
-               packet_write(fd[1], "want %s%s\n", sha1_to_hex(remote),
-                       multi_ack ? " multi_ack" : "");
+               packet_write(fd[1], "want %s%s%s\n", sha1_to_hex(remote),
+                            (multi_ack ? " multi_ack" : ""),
+                            (use_thin_pack ? " thin-pack" : ""));
                 fetching++;
         }
         packet_flush(fd[1]);
@@ -421,6 +422,10 @@ int main(int argc, char **argv)
                                 keep_pack = 1;
                                 continue;
                         }
+                       if (!strcmp("--thin", arg)) {
+                               use_thin_pack = 1;
+                               continue;
+                       }
                         if (!strcmp("-v", arg)) {
                                 verbose = 1;
                                 continue;
@@ -434,6 +439,8 @@ int main(int argc, char **argv)
         }
         if (!dest)
                 usage(fetch_pack_usage);
+       if (keep_pack)
+               use_thin_pack = 0;
         pid = git_connect(fd, dest, exec);
         if (pid < 0)
                 return 1;
diff --git a/git-annotate.perl b/git-annotate.perl

new file mode 100755 (executable)

index 0000000..3800c46
--- /dev/null
+++ b/git-annotate.perl
@@ -0,0 +1,356 @@
+#!/usr/bin/perl
+# Copyright 2006, Ryan Anderson <ryan@michonline.com>
+#
+# GPL v2 (See COPYING)
+#
+# This file is licensed under the GPL v2, or a later version
+# at the discretion of Linus Torvalds.
+
+use warnings;
+use strict;
+use Getopt::Std;
+use POSIX qw(strftime gmtime);
+
+sub usage() {
+       print STDERR 'Usage: ${\basename $0} [-s] [-S revs-file] file
+
+       -l              show long rev
+       -r              follow renames
+       -S commit       use revs from revs-file instead of calling git-rev-list
+';
+
+       exit(1);
+}
+
+our ($opt_h, $opt_l, $opt_r, $opt_S);
+getopts("hlrS:") or usage();
+$opt_h && usage();
+
+my $filename = shift @ARGV;
+
+my @stack = (
+       {
+               'rev' => "HEAD",
+               'filename' => $filename,
+       },
+);
+
+our (@lineoffsets, @pendinglineoffsets);
+our @filelines = ();
+open(F,"<",$filename)
+       or die "Failed to open filename: $!";
+
+while(<F>) {
+       chomp;
+       push @filelines, $_;
+}
+close(F);
+our $leftover_lines = @filelines;
+our %revs;
+our @revqueue;
+our $head;
+
+my $revsprocessed = 0;
+while (my $bound = pop @stack) {
+       my @revisions = git_rev_list($bound->{'rev'}, $bound->{'filename'});
+       foreach my $revinst (@revisions) {
+               my ($rev, @parents) = @$revinst;
+               $head ||= $rev;
+
+               if (!defined($rev)) {
+                       $rev = "";
+               }
+               $revs{$rev}{'filename'} = $bound->{'filename'};
+               if (scalar @parents > 0) {
+                       $revs{$rev}{'parents'} = \@parents;
+                       next;
+               }
+
+               if (!$opt_r) {
+                       next;
+               }
+
+               my $newbound = find_parent_renames($rev, $bound->{'filename'});
+               if ( exists $newbound->{'filename'} && $newbound->{'filename'} ne $bound->{'filename'}) {
+                       push @stack, $newbound;
+                       $revs{$rev}{'parents'} = [$newbound->{'rev'}];
+               }
+       }
+}
+push @revqueue, $head;
+init_claim($head);
+$revs{$head}{'lineoffsets'} = {};
+handle_rev();
+
+
+my $i = 0;
+foreach my $l (@filelines) {
+       my ($output, $rev, $committer, $date);
+       if (ref $l eq 'ARRAY') {
+               ($output, $rev, $committer, $date) = @$l;
+               if (!$opt_l && length($rev) > 8) {
+                       $rev = substr($rev,0,8);
+               }
+       } else {
+               $output = $l;
+               ($rev, $committer, $date) = ('unknown', 'unknown', 'unknown');
+       }
+
+       printf("%s\t(%10s\t%10s\t%d)%s\n", $rev, $committer,
+               format_date($date), $i++, $output);
+}
+
+sub init_claim {
+       my ($rev) = @_;
+       my %revinfo = git_commit_info($rev);
+       for (my $i = 0; $i < @filelines; $i++) {
+               $filelines[$i] = [ $filelines[$i], '', '', '', 1];
+                       # line,
+                       # rev,
+                       # author,
+                       # date,
+                       # 1 <-- belongs to the original file.
+       }
+       $revs{$rev}{'lines'} = \@filelines;
+}
+
+
+sub handle_rev {
+       my $i = 0;
+       while (my $rev = shift @revqueue) {
+
+               my %revinfo = git_commit_info($rev);
+
+               foreach my $p (@{$revs{$rev}{'parents'}}) {
+
+                       git_diff_parse($p, $rev, %revinfo);
+                       push @revqueue, $p;
+               }
+
+
+               if (scalar @{$revs{$rev}{parents}} == 0) {
+                       # We must be at the initial rev here, so claim everything that is left.
+                       for (my $i = 0; $i < @{$revs{$rev}{lines}}; $i++) {
+                               if (ref ${$revs{$rev}{lines}}[$i] eq '' || ${$revs{$rev}{lines}}[$i][1] eq '') {
+                                       claim_line($i, $rev, $revs{$rev}{lines}, %revinfo);
+                               }
+                       }
+               }
+       }
+}
+
+
+sub git_rev_list {
+       my ($rev, $file) = @_;
+
+       if ($opt_S) {
+               open(P, '<' . $opt_S);
+       } else {
+               open(P,"-|","git-rev-list","--parents","--remove-empty",$rev,"--",$file)
+                       or die "Failed to exec git-rev-list: $!";
+       }
+
+       my @revs;
+       while(my $line = <P>) {
+               chomp $line;
+               my ($rev, @parents) = split /\s+/, $line;
+               push @revs, [ $rev, @parents ];
+       }
+       close(P);
+
+       printf("0 revs found for rev %s (%s)\n", $rev, $file) if (@revs == 0);
+       return @revs;
+}
+
+sub find_parent_renames {
+       my ($rev, $file) = @_;
+
+       open(P,"-|","git-diff-tree", "-M50", "-r","--name-status", "-z","$rev")
+               or die "Failed to exec git-diff: $!";
+
+       local $/ = "\0";
+       my %bound;
+       my $junk = <P>;
+       while (my $change = <P>) {
+               chomp $change;
+               my $filename = <P>;
+               chomp $filename;
+
+               if ($change =~ m/^[AMD]$/ ) {
+                       next;
+               } elsif ($change =~ m/^R/ ) {
+                       my $oldfilename = $filename;
+                       $filename = <P>;
+                       chomp $filename;
+                       if ( $file eq $filename ) {
+                               my $parent = git_find_parent($rev, $oldfilename);
+                               @bound{'rev','filename'} = ($parent, $oldfilename);
+                               last;
+                       }
+               }
+       }
+       close(P);
+
+       return \%bound;
+}
+
+
+sub git_find_parent {
+       my ($rev, $filename) = @_;
+
+       open(REVPARENT,"-|","git-rev-list","--remove-empty", "--parents","--max-count=1","$rev","--",$filename)
+               or die "Failed to open git-rev-list to find a single parent: $!";
+
+       my $parentline = <REVPARENT>;
+       chomp $parentline;
+       my ($revfound,$parent) = split m/\s+/, $parentline;
+
+       close(REVPARENT);
+
+       return $parent;
+}
+
+
+# Get a diff between the current revision and a parent.
+# Record the commit information that results.
+sub git_diff_parse {
+       my ($parent, $rev, %revinfo) = @_;
+
+       my ($ri, $pi) = (0,0);
+       open(DIFF,"-|","git-diff-tree","-M","-p",$rev,$parent,"--",
+                       $revs{$rev}{'filename'}, $revs{$parent}{'filename'})
+               or die "Failed to call git-diff for annotation: $!";
+
+       my $slines = $revs{$rev}{'lines'};
+       my @plines;
+
+       my $gotheader = 0;
+       my ($remstart, $remlength, $addstart, $addlength);
+       my ($hunk_start, $hunk_index, $hunk_adds);
+       while(<DIFF>) {
+               chomp;
+               if (m/^@@ -(\d+),(\d+) \+(\d+),(\d+)/) {
+                       ($remstart, $remlength, $addstart, $addlength) = ($1, $2, $3, $4);
+                       # Adjust for 0-based arrays
+                       $remstart--;
+                       $addstart--;
+                       # Reinit hunk tracking.
+                       $hunk_start = $remstart;
+                       $hunk_index = 0;
+                       $gotheader = 1;
+
+                       for (my $i = $ri; $i < $remstart; $i++) {
+                               $plines[$pi++] = $slines->[$i];
+                               $ri++;
+                       }
+                       next;
+               } elsif (!$gotheader) {
+                       next;
+               }
+
+               if (m/^\+(.*)$/) {
+                       my $line = $1;
+                       $plines[$pi++] = [ $line, '', '', '', 0 ];
+                       next;
+
+               } elsif (m/^-(.*)$/) {
+                       my $line = $1;
+                       if (get_line($slines, $ri) eq $line) {
+                               # Found a match, claim
+                               claim_line($ri, $rev, $slines, %revinfo);
+                       } else {
+                               die sprintf("Sync error: %d/%d\n|%s\n|%s\n%s => %s\n",
+                                               $ri, $hunk_start + $hunk_index,
+                                               $line,
+                                               get_line($slines, $ri),
+                                               $rev, $parent);
+                       }
+                       $ri++;
+
+               } else {
+                       if (substr($_,1) ne get_line($slines,$ri) ) {
+                               die sprintf("Line %d (%d) does not match:\n|%s\n|%s\n%s => %s\n",
+                                               $hunk_start + $hunk_index, $ri,
+                                               substr($_,1),
+                                               get_line($slines,$ri),
+                                               $rev, $parent);
+                       }
+                       $plines[$pi++] = $slines->[$ri++];
+               }
+               $hunk_index++;
+       }
+       close(DIFF);
+       for (my $i = $ri; $i < @{$slines} ; $i++) {
+               push @plines, $slines->[$ri++];
+       }
+
+       $revs{$parent}{lines} = \@plines;
+       return;
+}
+
+sub get_line {
+       my ($lines, $index) = @_;
+
+       return ref $lines->[$index] ne '' ? $lines->[$index][0] : $lines->[$index];
+}
+
+sub git_cat_file {
+       my ($parent, $filename) = @_;
+       return () unless defined $parent && defined $filename;
+       my $blobline = `git-ls-tree $parent $filename`;
+       my ($mode, $type, $blob, $tfilename) = split(/\s+/, $blobline, 4);
+
+       open(C,"-|","git-cat-file", "blob", $blob)
+               or die "Failed to git-cat-file blob $blob (rev $parent, file $filename): " . $!;
+
+       my @lines;
+       while(<C>) {
+               chomp;
+               push @lines, $_;
+       }
+       close(C);
+
+       return @lines;
+}
+
+
+sub claim_line {
+       my ($floffset, $rev, $lines, %revinfo) = @_;
+       my $oline = get_line($lines, $floffset);
+       @{$lines->[$floffset]} = ( $oline, $rev,
+               $revinfo{'author'}, $revinfo{'author_date'} );
+       #printf("Claiming line %d with rev %s: '%s'\n",
+       #               $floffset, $rev, $oline) if 1;
+}
+
+sub git_commit_info {
+       my ($rev) = @_;
+       open(COMMIT, "-|","git-cat-file", "commit", $rev)
+               or die "Failed to call git-cat-file: $!";
+
+       my %info;
+       while(<COMMIT>) {
+               chomp;
+               last if (length $_ == 0);
+
+               if (m/^author (.*) <(.*)> (.*)$/) {
+                       $info{'author'} = $1;
+                       $info{'author_email'} = $2;
+                       $info{'author_date'} = $3;
+               } elsif (m/^committer (.*) <(.*)> (.*)$/) {
+                       $info{'committer'} = $1;
+                       $info{'committer_email'} = $2;
+                       $info{'committer_date'} = $3;
+               }
+       }
+       close(COMMIT);
+
+       return %info;
+}
+
+sub format_date {
+       my ($timestamp, $timezone) = split(' ', $_[0]);
+
+       return strftime("%Y-%m-%d %H:%M:%S " . $timezone, gmtime($timestamp));
+}
+
diff --git a/git-clone.sh b/git-clone.sh

index dc0ad55..a059088 100755 (executable)
--- a/git-clone.sh
+++ b/git-clone.sh
@@ -253,7 +253,7 @@ Pull: $head_points_at:$origin" &&
  
         case "$no_checkout" in
         '')
-               git checkout
+               git-read-tree -m -u -v HEAD HEAD
         esac
  fi
  
diff --git a/git-cvsserver.perl b/git-cvsserver.perl

new file mode 100755 (executable)

index 0000000..d20d1a8
--- /dev/null
+++ b/git-cvsserver.perl
@@ -0,0 +1,2449 @@
+#!/usr/bin/perl
+
+####
+#### This application is a CVS emulation layer for git.
+#### It is intended for clients to connect over SSH.
+#### See the documentation for more details.
+####
+#### Copyright The Open University UK - 2006.
+####
+#### Authors: Martyn Smith    <martyn@catalyst.net.nz>
+####          Martin Langhoff <martin@catalyst.net.nz>
+####
+####
+#### Released under the GNU Public License, version 2.
+####
+####
+
+use strict;
+use warnings;
+
+use Fcntl;
+use File::Temp qw/tempdir tempfile/;
+use File::Basename;
+
+my $log = GITCVS::log->new();
+my $cfg;
+
+my $DATE_LIST = {
+    Jan => "01",
+    Feb => "02",
+    Mar => "03",
+    Apr => "04",
+    May => "05",
+    Jun => "06",
+    Jul => "07",
+    Aug => "08",
+    Sep => "09",
+    Oct => "10",
+    Nov => "11",
+    Dec => "12",
+};
+
+# Enable autoflush for STDOUT (otherwise the whole thing falls apart)
+$| = 1;
+
+#### Definition and mappings of functions ####
+
+my $methods = {
+    'Root'            => \&req_Root,
+    'Valid-responses' => \&req_Validresponses,
+    'valid-requests'  => \&req_validrequests,
+    'Directory'       => \&req_Directory,
+    'Entry'           => \&req_Entry,
+    'Modified'        => \&req_Modified,
+    'Unchanged'       => \&req_Unchanged,
+    'Argument'        => \&req_Argument,
+    'Argumentx'       => \&req_Argument,
+    'expand-modules'  => \&req_expandmodules,
+    'add'             => \&req_add,
+    'remove'          => \&req_remove,
+    'co'              => \&req_co,
+    'update'          => \&req_update,
+    'ci'              => \&req_ci,
+    'diff'            => \&req_diff,
+    'log'             => \&req_log,
+    'tag'             => \&req_CATCHALL,
+    'status'          => \&req_status,
+    'admin'           => \&req_CATCHALL,
+    'history'         => \&req_CATCHALL,
+    'watchers'        => \&req_CATCHALL,
+    'editors'         => \&req_CATCHALL,
+    'annotate'        => \&req_annotate,
+    'Global_option'   => \&req_Globaloption,
+    #'annotate'        => \&req_CATCHALL,
+};
+
+##############################################
+
+
+# $state holds all the bits of information the clients sends us that could
+# potentially be useful when it comes to actually _doing_ something.
+my $state = {};
+$log->info("--------------- STARTING -----------------");
+
+my $TEMP_DIR = tempdir( CLEANUP => 1 );
+$log->debug("Temporary directory is '$TEMP_DIR'");
+
+# Keep going until the client closes the connection
+while (<STDIN>)
+{
+    chomp;
+
+    # Check to see if we've seen this method, and call appropiate function.
+    if ( /^([\w-]+)(?:\s+(.*))?$/ and defined($methods->{$1}) )
+    {
+        # use the $methods hash to call the appropriate sub for this command
+        #$log->info("Method : $1");
+        &{$methods->{$1}}($1,$2);
+    } else {
+        # log fatal because we don't understand this function. If this happens
+        # we're fairly screwed because we don't know if the client is expecting
+        # a response. If it is, the client will hang, we'll hang, and the whole
+        # thing will be custard.
+        $log->fatal("Don't understand command $_\n");
+        die("Unknown command $_");
+    }
+}
+
+$log->debug("Processing time : user=" . (times)[0] . " system=" . (times)[1]);
+$log->info("--------------- FINISH -----------------");
+
+# Magic catchall method.
+#    This is the method that will handle all commands we haven't yet
+#    implemented. It simply sends a warning to the log file indicating a
+#    command that hasn't been implemented has been invoked.
+sub req_CATCHALL
+{
+    my ( $cmd, $data ) = @_;
+    $log->warn("Unhandled command : req_$cmd : $data");
+}
+
+
+# Root pathname \n
+#     Response expected: no. Tell the server which CVSROOT to use. Note that
+#     pathname is a local directory and not a fully qualified CVSROOT variable.
+#     pathname must already exist; if creating a new root, use the init
+#     request, not Root. pathname does not include the hostname of the server,
+#     how to access the server, etc.; by the time the CVS protocol is in use,
+#     connection, authentication, etc., are already taken care of. The Root
+#     request must be sent only once, and it must be sent before any requests
+#     other than Valid-responses, valid-requests, UseUnchanged, Set or init.
+sub req_Root
+{
+    my ( $cmd, $data ) = @_;
+    $log->debug("req_Root : $data");
+
+    $state->{CVSROOT} = $data;
+
+    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
+
+    foreach my $line ( `git-var -l` )
+    {
+        next unless ( $line =~ /^(.*?)\.(.*?)=(.*)$/ );
+        $cfg->{$1}{$2} = $3;
+    }
+
+    unless ( defined ( $cfg->{gitcvs}{enabled} ) and $cfg->{gitcvs}{enabled} =~ /^\s*(1|true|yes)\s*$/i )
+    {
+        print "E GITCVS emulation needs to be enabled on this repo\n";
+        print "E the repo config file needs a [gitcvs] section added, and the parameter 'enabled' set to 1\n";
+        print "E \n";
+        print "error 1 GITCVS emulation disabled\n";
+    }
+
+    if ( defined ( $cfg->{gitcvs}{logfile} ) )
+    {
+        $log->setfile($cfg->{gitcvs}{logfile});
+    } else {
+        $log->nofile();
+    }
+}
+
+# Global_option option \n
+#     Response expected: no. Transmit one of the global options `-q', `-Q',
+#     `-l', `-t', `-r', or `-n'. option must be one of those strings, no
+#     variations (such as combining of options) are allowed. For graceful
+#     handling of valid-requests, it is probably better to make new global
+#     options separate requests, rather than trying to add them to this
+#     request.
+sub req_Globaloption
+{
+    my ( $cmd, $data ) = @_;
+    $log->debug("req_Globaloption : $data");
+
+    # TODO : is this data useful ???
+}
+
+# Valid-responses request-list \n
+#     Response expected: no. Tell the server what responses the client will
+#     accept. request-list is a space separated list of tokens.
+sub req_Validresponses
+{
+    my ( $cmd, $data ) = @_;
+    $log->debug("req_Validrepsonses : $data");
+
+    # TODO : re-enable this, currently it's not particularly useful
+    #$state->{validresponses} = [ split /\s+/, $data ];
+}
+
+# valid-requests \n
+#     Response expected: yes. Ask the server to send back a Valid-requests
+#     response.
+sub req_validrequests
+{
+    my ( $cmd, $data ) = @_;
+
+    $log->debug("req_validrequests");
+
+    $log->debug("SEND : Valid-requests " . join(" ",keys %$methods));
+    $log->debug("SEND : ok");
+
+    print "Valid-requests " . join(" ",keys %$methods) . "\n";
+    print "ok\n";
+}
+
+# Directory local-directory \n
+#     Additional data: repository \n. Response expected: no. Tell the server
+#     what directory to use. The repository should be a directory name from a
+#     previous server response. Note that this both gives a default for Entry
+#     and Modified and also for ci and the other commands; normal usage is to
+#     send Directory for each directory in which there will be an Entry or
+#     Modified, and then a final Directory for the original directory, then the
+#     command. The local-directory is relative to the top level at which the
+#     command is occurring (i.e. the last Directory which is sent before the
+#     command); to indicate that top level, `.' should be sent for
+#     local-directory.
+sub req_Directory
+{
+    my ( $cmd, $data ) = @_;
+
+    my $repository = <STDIN>;
+    chomp $repository;
+
+
+    $state->{localdir} = $data;
+    $state->{repository} = $repository;
+    $state->{directory} = $repository;
+    $state->{directory} =~ s/^$state->{CVSROOT}\///;
+    $state->{module} = $1 if ($state->{directory} =~ s/^(.*?)(\/|$)//);
+    $state->{directory} .= "/" if ( $state->{directory} =~ /\S/ );
+
+    $log->debug("req_Directory : localdir=$data repository=$repository directory=$state->{directory} module=$state->{module}");
+}
+
+# Entry entry-line \n
+#     Response expected: no. Tell the server what version of a file is on the
+#     local machine. The name in entry-line is a name relative to the directory
+#     most recently specified with Directory. If the user is operating on only
+#     some files in a directory, Entry requests for only those files need be
+#     included. If an Entry request is sent without Modified, Is-modified, or
+#     Unchanged, it means the file is lost (does not exist in the working
+#     directory). If both Entry and one of Modified, Is-modified, or Unchanged
+#     are sent for the same file, Entry must be sent first. For a given file,
+#     one can send Modified, Is-modified, or Unchanged, but not more than one
+#     of these three.
+sub req_Entry
+{
+    my ( $cmd, $data ) = @_;
+
+    $log->debug("req_Entry : $data");
+
+    my @data = split(/\//, $data);
+
+    $state->{entries}{$state->{directory}.$data[1]} = {
+        revision    => $data[2],
+        conflict    => $data[3],
+        options     => $data[4],
+        tag_or_date => $data[5],
+    };
+}
+
+# add \n
+#     Response expected: yes. Add a file or directory. This uses any previous
+#     Argument, Directory, Entry, or Modified requests, if they have been sent.
+#     The last Directory sent specifies the working directory at the time of
+#     the operation. To add a directory, send the directory to be added using
+#     Directory and Argument requests.
+sub req_add
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("add");
+
+    my $addcount = 0;
+
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        unless ( defined ( $state->{entries}{$filename}{modified_filename} ) )
+        {
+            print "E cvs add: nothing known about `$filename'\n";
+            next;
+        }
+        # TODO : check we're not squashing an already existing file
+        if ( defined ( $state->{entries}{$filename}{revision} ) )
+        {
+            print "E cvs add: `$filename' has already been entered\n";
+            next;
+        }
+
+
+        my ( $filepart, $dirpart ) = filenamesplit($filename);
+
+        print "E cvs add: scheduling file `$filename' for addition\n";
+
+        print "Checked-in $dirpart\n";
+        print "$filename\n";
+        print "/$filepart/0///\n";
+
+        $addcount++;
+    }
+
+    if ( $addcount == 1 )
+    {
+        print "E cvs add: use `cvs commit' to add this file permanently\n";
+    }
+    elsif ( $addcount > 1 )
+    {
+        print "E cvs add: use `cvs commit' to add these files permanently\n";
+    }
+
+    print "ok\n";
+}
+
+# remove \n
+#     Response expected: yes. Remove a file. This uses any previous Argument,
+#     Directory, Entry, or Modified requests, if they have been sent. The last
+#     Directory sent specifies the working directory at the time of the
+#     operation. Note that this request does not actually do anything to the
+#     repository; the only effect of a successful remove request is to supply
+#     the client with a new entries line containing `-' to indicate a removed
+#     file. In fact, the client probably could perform this operation without
+#     contacting the server, although using remove may cause the server to
+#     perform a few more checks. The client sends a subsequent ci request to
+#     actually record the removal in the repository.
+sub req_remove
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("remove");
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    #$log->debug("add state : " . Dumper($state));
+
+    my $rmcount = 0;
+
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        if ( defined ( $state->{entries}{$filename}{unchanged} ) or defined ( $state->{entries}{$filename}{modified_filename} ) )
+        {
+            print "E cvs remove: file `$filename' still in working directory\n";
+            next;
+        }
+
+        my $meta = $updater->getmeta($filename);
+        my $wrev = revparse($filename);
+
+        unless ( defined ( $wrev ) )
+        {
+            print "E cvs remove: nothing known about `$filename'\n";
+            next;
+        }
+
+        if ( defined($wrev) and $wrev < 0 )
+        {
+            print "E cvs remove: file `$filename' already scheduled for removal\n";
+            next;
+        }
+
+        unless ( $wrev == $meta->{revision} )
+        {
+            # TODO : not sure if the format of this message is quite correct.
+            print "E cvs remove: Up to date check failed for `$filename'\n";
+            next;
+        }
+
+
+        my ( $filepart, $dirpart ) = filenamesplit($filename);
+
+        print "E cvs remove: scheduling `$filename' for removal\n";
+
+        print "Checked-in $dirpart\n";
+        print "$filename\n";
+        print "/$filepart/-1.$wrev///\n";
+
+        $rmcount++;
+    }
+
+    if ( $rmcount == 1 )
+    {
+        print "E cvs remove: use `cvs commit' to remove this file permanently\n";
+    }
+    elsif ( $rmcount > 1 )
+    {
+        print "E cvs remove: use `cvs commit' to remove these files permanently\n";
+    }
+
+    print "ok\n";
+}
+
+# Modified filename \n
+#     Response expected: no. Additional data: mode, \n, file transmission. Send
+#     the server a copy of one locally modified file. filename is a file within
+#     the most recent directory sent with Directory; it must not contain `/'.
+#     If the user is operating on only some files in a directory, only those
+#     files need to be included. This can also be sent without Entry, if there
+#     is no entry for the file.
+sub req_Modified
+{
+    my ( $cmd, $data ) = @_;
+
+    my $mode = <STDIN>;
+    chomp $mode;
+    my $size = <STDIN>;
+    chomp $size;
+
+    # Grab config information
+    my $blocksize = 8192;
+    my $bytesleft = $size;
+    my $tmp;
+
+    # Get a filehandle/name to write it to
+    my ( $fh, $filename ) = tempfile( DIR => $TEMP_DIR );
+
+    # Loop over file data writing out to temporary file.
+    while ( $bytesleft )
+    {
+        $blocksize = $bytesleft if ( $bytesleft < $blocksize );
+        read STDIN, $tmp, $blocksize;
+        print $fh $tmp;
+        $bytesleft -= $blocksize;
+    }
+
+    close $fh;
+
+    # Ensure we have something sensible for the file mode
+    if ( $mode =~ /u=(\w+)/ )
+    {
+        $mode = $1;
+    } else {
+        $mode = "rw";
+    }
+
+    # Save the file data in $state
+    $state->{entries}{$state->{directory}.$data}{modified_filename} = $filename;
+    $state->{entries}{$state->{directory}.$data}{modified_mode} = $mode;
+    $state->{entries}{$state->{directory}.$data}{modified_hash} = `git-hash-object $filename`;
+    $state->{entries}{$state->{directory}.$data}{modified_hash} =~ s/\s.*$//s;
+
+    #$log->debug("req_Modified : file=$data mode=$mode size=$size");
+}
+
+# Unchanged filename \n
+#     Response expected: no. Tell the server that filename has not been
+#     modified in the checked out directory. The filename is a file within the
+#     most recent directory sent with Directory; it must not contain `/'.
+sub req_Unchanged
+{
+    my ( $cmd, $data ) = @_;
+
+    $state->{entries}{$state->{directory}.$data}{unchanged} = 1;
+
+    #$log->debug("req_Unchanged : $data");
+}
+
+# Argument text \n
+#     Response expected: no. Save argument for use in a subsequent command.
+#     Arguments accumulate until an argument-using command is given, at which
+#     point they are forgotten.
+# Argumentx text \n
+#     Response expected: no. Append \n followed by text to the current argument
+#     being saved.
+sub req_Argument
+{
+    my ( $cmd, $data ) = @_;
+
+    # TODO :  Not quite sure how Argument and Argumentx differ, but I assume
+    # it's for multi-line arguments ... somehow ...
+
+    $log->debug("$cmd : $data");
+
+    push @{$state->{arguments}}, $data;
+}
+
+# expand-modules \n
+#     Response expected: yes. Expand the modules which are specified in the
+#     arguments. Returns the data in Module-expansion responses. Note that the
+#     server can assume that this is checkout or export, not rtag or rdiff; the
+#     latter do not access the working directory and thus have no need to
+#     expand modules on the client side. Expand may not be the best word for
+#     what this request does. It does not necessarily tell you all the files
+#     contained in a module, for example. Basically it is a way of telling you
+#     which working directories the server needs to know about in order to
+#     handle a checkout of the specified modules. For example, suppose that the
+#     server has a module defined by
+#   aliasmodule -a 1dir
+#     That is, one can check out aliasmodule and it will take 1dir in the
+#     repository and check it out to 1dir in the working directory. Now suppose
+#     the client already has this module checked out and is planning on using
+#     the co request to update it. Without using expand-modules, the client
+#     would have two bad choices: it could either send information about all
+#     working directories under the current directory, which could be
+#     unnecessarily slow, or it could be ignorant of the fact that aliasmodule
+#     stands for 1dir, and neglect to send information for 1dir, which would
+#     lead to incorrect operation. With expand-modules, the client would first
+#     ask for the module to be expanded:
+sub req_expandmodules
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit();
+
+    $log->debug("req_expandmodules : " . ( defined($data) ? $data : "[NULL]" ) );
+
+    unless ( ref $state->{arguments} eq "ARRAY" )
+    {
+        print "ok\n";
+        return;
+    }
+
+    foreach my $module ( @{$state->{arguments}} )
+    {
+        $log->debug("SEND : Module-expansion $module");
+        print "Module-expansion $module\n";
+    }
+
+    print "ok\n";
+    statecleanup();
+}
+
+# co \n
+#     Response expected: yes. Get files from the repository. This uses any
+#     previous Argument, Directory, Entry, or Modified requests, if they have
+#     been sent. Arguments to this command are module names; the client cannot
+#     know what directories they correspond to except by (1) just sending the
+#     co request, and then seeing what directory names the server sends back in
+#     its responses, and (2) the expand-modules request.
+sub req_co
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("co");
+
+    my $module = $state->{args}[0];
+    my $checkout_path = $module;
+
+    # use the user specified directory if we're given it
+    $checkout_path = $state->{opt}{d} if ( exists ( $state->{opt}{d} ) );
+
+    $log->debug("req_co : " . ( defined($data) ? $data : "[NULL]" ) );
+
+    $log->info("Checking out module '$module' ($state->{CVSROOT}) to '$checkout_path'");
+
+    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $module, $log);
+    $updater->update();
+
+    # instruct the client that we're checking out to $checkout_path
+    print "E cvs server: updating $checkout_path\n";
+
+    foreach my $git ( @{$updater->gethead} )
+    {
+        # Don't want to check out deleted files
+        next if ( $git->{filehash} eq "deleted" );
+
+        ( $git->{name}, $git->{dir} ) = filenamesplit($git->{name});
+
+        # modification time of this file
+        print "Mod-time $git->{modified}\n";
+
+        # print some information to the client
+        print "MT +updated\n";
+        print "MT text U\n";
+        if ( defined ( $git->{dir} ) and $git->{dir} ne "./" )
+        {
+            print "MT fname $checkout_path/$git->{dir}$git->{name}\n";
+        } else {
+            print "MT fname $checkout_path/$git->{name}\n";
+        }
+        print "MT newline\n";
+        print "MT -updated\n";
+
+        # instruct client we're sending a file to put in this path
+        print "Created $checkout_path/" . ( defined ( $git->{dir} ) ? $git->{dir} . "/" : "" ) . "\n";
+
+        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
+
+        # this is an "entries" line
+        print "/$git->{name}/1.$git->{revision}///\n";
+        # permissions
+        print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
+
+        # transmit file
+        transmitfile($git->{filehash});
+    }
+
+    print "ok\n";
+
+    statecleanup();
+}
+
+# update \n
+#     Response expected: yes. Actually do a cvs update command. This uses any
+#     previous Argument, Directory, Entry, or Modified requests, if they have
+#     been sent. The last Directory sent specifies the working directory at the
+#     time of the operation. The -I option is not used--files which the client
+#     can decide whether to ignore are not mentioned and the client sends the
+#     Questionable request for others.
+sub req_update
+{
+    my ( $cmd, $data ) = @_;
+
+    $log->debug("req_update : " . ( defined($data) ? $data : "[NULL]" ));
+
+    argsplit("update");
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+
+    $updater->update();
+
+    # if no files were specified, we need to work out what files we should be providing status on ...
+    argsfromdir($updater) if ( scalar ( @{$state->{args}} ) == 0 );
+
+    #$log->debug("update state : " . Dumper($state));
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        # if we have a -C we should pretend we never saw modified stuff
+        if ( exists ( $state->{opt}{C} ) )
+        {
+            delete $state->{entries}{$filename}{modified_hash};
+            delete $state->{entries}{$filename}{modified_filename};
+            $state->{entries}{$filename}{unchanged} = 1;
+        }
+
+        my $meta;
+        if ( defined($state->{opt}{r}) and $state->{opt}{r} =~ /^1\.(\d+)/ )
+        {
+            $meta = $updater->getmeta($filename, $1);
+        } else {
+            $meta = $updater->getmeta($filename);
+        }
+
+        next unless ( $meta->{revision} );
+
+        my $oldmeta = $meta;
+
+        my $wrev = revparse($filename);
+
+        # If the working copy is an old revision, lets get that version too for comparison.
+        if ( defined($wrev) and $wrev != $meta->{revision} )
+        {
+            $oldmeta = $updater->getmeta($filename, $wrev);
+        }
+
+        #$log->debug("Target revision is $meta->{revision}, current working revision is $wrev");
+
+        # Files are up to date if the working copy and repo copy have the same revision, and the working copy is unmodified _and_ the user hasn't specified -C
+        next if ( defined ( $wrev ) and defined($meta->{revision}) and $wrev == $meta->{revision} and $state->{entries}{$filename}{unchanged} and not exists ( $state->{opt}{C} ) );
+
+        if ( $meta->{filehash} eq "deleted" )
+        {
+            my ( $filepart, $dirpart ) = filenamesplit($filename);
+
+            $log->info("Removing '$filename' from working copy (no longer in the repo)");
+
+            print "E cvs update: `$filename' is no longer in the repository\n";
+            print "Removed $dirpart\n";
+            print "$filepart\n";
+        }
+        elsif ( not defined ( $state->{entries}{$filename}{modified_hash} ) or $state->{entries}{$filename}{modified_hash} eq $oldmeta->{filehash} )
+        {
+            $log->info("Updating '$filename'");
+            # normal update, just send the new revision (either U=Update, or A=Add, or R=Remove)
+            print "MT +updated\n";
+            print "MT text U\n";
+            print "MT fname $filename\n";
+            print "MT newline\n";
+            print "MT -updated\n";
+
+            my ( $filepart, $dirpart ) = filenamesplit($filename);
+            $dirpart =~ s/^$state->{directory}//;
+
+            if ( defined ( $wrev ) )
+            {
+                # instruct client we're sending a file to put in this path as a replacement
+                print "Update-existing $dirpart\n";
+                $log->debug("Updating existing file 'Update-existing $dirpart'");
+            } else {
+                # instruct client we're sending a file to put in this path as a new file
+                print "Created $dirpart\n";
+                $log->debug("Creating new file 'Created $dirpart'");
+            }
+            print $state->{CVSROOT} . "/$state->{module}/$filename\n";
+
+            # this is an "entries" line
+            $log->debug("/$filepart/1.$meta->{revision}///");
+            print "/$filepart/1.$meta->{revision}///\n";
+
+            # permissions
+            $log->debug("SEND : u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}");
+            print "u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}\n";
+
+            # transmit file
+            transmitfile($meta->{filehash});
+        } else {
+            my ( $filepart, $dirpart ) = filenamesplit($meta->{name});
+
+            my $dir = tempdir( DIR => $TEMP_DIR, CLEANUP => 1 ) . "/";
+
+            chdir $dir;
+            my $file_local = $filepart . ".mine";
+            system("ln","-s",$state->{entries}{$filename}{modified_filename}, $file_local);
+            my $file_old = $filepart . "." . $oldmeta->{revision};
+            transmitfile($oldmeta->{filehash}, $file_old);
+            my $file_new = $filepart . "." . $meta->{revision};
+            transmitfile($meta->{filehash}, $file_new);
+
+            # we need to merge with the local changes ( M=successful merge, C=conflict merge )
+            $log->info("Merging $file_local, $file_old, $file_new");
+
+            $log->debug("Temporary directory for merge is $dir");
+
+            my $return = system("merge", $file_local, $file_old, $file_new);
+            $return >>= 8;
+
+            if ( $return == 0 )
+            {
+                $log->info("Merged successfully");
+                print "M M $filename\n";
+                $log->debug("Update-existing $dirpart");
+                print "Update-existing $dirpart\n";
+                $log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
+                print $state->{CVSROOT} . "/$state->{module}/$filename\n";
+                $log->debug("/$filepart/1.$meta->{revision}///");
+                print "/$filepart/1.$meta->{revision}///\n";
+            }
+            elsif ( $return == 1 )
+            {
+                $log->info("Merged with conflicts");
+                print "M C $filename\n";
+                print "Update-existing $dirpart\n";
+                print $state->{CVSROOT} . "/$state->{module}/$filename\n";
+                print "/$filepart/1.$meta->{revision}/+//\n";
+            }
+            else
+            {
+                $log->warn("Merge failed");
+                next;
+            }
+
+            # permissions
+            $log->debug("SEND : u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}");
+            print "u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}\n";
+
+            # transmit file, format is single integer on a line by itself (file
+            # size) followed by the file contents
+            # TODO : we should copy files in blocks
+            my $data = `cat $file_local`;
+            $log->debug("File size : " . length($data));
+            print length($data) . "\n";
+            print $data;
+
+            chdir "/";
+        }
+
+    }
+
+    print "ok\n";
+}
+
+sub req_ci
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("ci");
+
+    #$log->debug("State : " . Dumper($state));
+
+    $log->info("req_ci : " . ( defined($data) ? $data : "[NULL]" ));
+
+    if ( -e $state->{CVSROOT} . "/index" )
+    {
+        print "error 1 Index already exists in git repo\n";
+        exit;
+    }
+
+    my $lockfile = "$state->{CVSROOT}/refs/heads/$state->{module}.lock";
+    unless ( sysopen(LOCKFILE,$lockfile,O_EXCL|O_CREAT|O_WRONLY) )
+    {
+        print "error 1 Lock file '$lockfile' already exists, please try again\n";
+        exit;
+    }
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    my $tmpdir = tempdir ( DIR => $TEMP_DIR );
+    my ( undef, $file_index ) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
+    $log->info("Lock successful, basing commit on '$tmpdir', index file is '$file_index'");
+
+    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
+    $ENV{GIT_INDEX_FILE} = $file_index;
+
+    chdir $tmpdir;
+
+    # populate the temporary index based
+    system("git-read-tree", $state->{module});
+    unless ($? == 0)
+    {
+       die "Error running git-read-tree $state->{module} $file_index $!";
+    }
+    $log->info("Created index '$file_index' with for head $state->{module} - exit status $?");
+
+
+    my @committedfiles = ();
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        next unless ( exists $state->{entries}{$filename}{modified_filename} or not $state->{entries}{$filename}{unchanged} );
+
+        my $meta = $updater->getmeta($filename);
+
+        my $wrev = revparse($filename);
+
+        my ( $filepart, $dirpart ) = filenamesplit($filename);
+
+        # do a checkout of the file if it part of this tree
+        if ($wrev) {
+            system('git-checkout-index', '-f', '-u', $filename);
+            unless ($? == 0) {
+                die "Error running git-checkout-index -f -u $filename : $!";
+            }
+        }
+
+        my $addflag = 0;
+        my $rmflag = 0;
+        $rmflag = 1 if ( defined($wrev) and $wrev < 0 );
+        $addflag = 1 unless ( -e $filename );
+
+        # Do up to date checking
+        unless ( $addflag or $wrev == $meta->{revision} or ( $rmflag and -$wrev == $meta->{revision} ) )
+        {
+            # fail everything if an up to date check fails
+            print "error 1 Up to date check failed for $filename\n";
+            close LOCKFILE;
+            unlink($lockfile);
+            chdir "/";
+            exit;
+        }
+
+        push @committedfiles, $filename;
+        $log->info("Committing $filename");
+
+        system("mkdir","-p",$dirpart) unless ( -d $dirpart );
+
+        unless ( $rmflag )
+        {
+            $log->debug("rename $state->{entries}{$filename}{modified_filename} $filename");
+            rename $state->{entries}{$filename}{modified_filename},$filename;
+
+            # Calculate modes to remove
+            my $invmode = "";
+            foreach ( qw (r w x) ) { $invmode .= $_ unless ( $state->{entries}{$filename}{modified_mode} =~ /$_/ ); }
+
+            $log->debug("chmod u+" . $state->{entries}{$filename}{modified_mode} . "-" . $invmode . " $filename");
+            system("chmod","u+" .  $state->{entries}{$filename}{modified_mode} . "-" . $invmode, $filename);
+        }
+
+        if ( $rmflag )
+        {
+            $log->info("Removing file '$filename'");
+            unlink($filename);
+            system("git-update-index", "--remove", $filename);
+        }
+        elsif ( $addflag )
+        {
+            $log->info("Adding file '$filename'");
+            system("git-update-index", "--add", $filename);
+        } else {
+            $log->info("Updating file '$filename'");
+            system("git-update-index", $filename);
+        }
+    }
+
+    unless ( scalar(@committedfiles) > 0 )
+    {
+        print "E No files to commit\n";
+        print "ok\n";
+        close LOCKFILE;
+        unlink($lockfile);
+        chdir "/";
+        return;
+    }
+
+    my $treehash = `git-write-tree`;
+    my $parenthash = `cat $ENV{GIT_DIR}refs/heads/$state->{module}`;
+    chomp $treehash;
+    chomp $parenthash;
+
+    $log->debug("Treehash : $treehash, Parenthash : $parenthash");
+
+    # write our commit message out if we have one ...
+    my ( $msg_fh, $msg_filename ) = tempfile( DIR => $TEMP_DIR );
+    print $msg_fh $state->{opt}{m};# if ( exists ( $state->{opt}{m} ) );
+    print $msg_fh "\n\nvia git-CVS emulator\n";
+    close $msg_fh;
+
+    my $commithash = `git-commit-tree $treehash -p $parenthash < $msg_filename`;
+    $log->info("Commit hash : $commithash");
+
+    unless ( $commithash =~ /[a-zA-Z0-9]{40}/ )
+    {
+        $log->warn("Commit failed (Invalid commit hash)");
+        print "error 1 Commit failed (unknown reason)\n";
+        close LOCKFILE;
+        unlink($lockfile);
+        chdir "/";
+        exit;
+    }
+
+    open FILE, ">", "$ENV{GIT_DIR}refs/heads/$state->{module}";
+    print FILE $commithash;
+    close FILE;
+
+    $updater->update();
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @committedfiles )
+    {
+        $filename = filecleanup($filename);
+
+        my $meta = $updater->getmeta($filename);
+
+        my ( $filepart, $dirpart ) = filenamesplit($filename);
+
+        $log->debug("Checked-in $dirpart : $filename");
+
+        if ( $meta->{filehash} eq "deleted" )
+        {
+            print "Remove-entry $dirpart\n";
+            print "$filename\n";
+        } else {
+            print "Checked-in $dirpart\n";
+            print "$filename\n";
+            print "/$filepart/1.$meta->{revision}///\n";
+        }
+    }
+
+    close LOCKFILE;
+    unlink($lockfile);
+    chdir "/";
+
+    print "ok\n";
+}
+
+sub req_status
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("status");
+
+    $log->info("req_status : " . ( defined($data) ? $data : "[NULL]" ));
+    #$log->debug("status state : " . Dumper($state));
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    # if no files were specified, we need to work out what files we should be providing status on ...
+    argsfromdir($updater) if ( scalar ( @{$state->{args}} ) == 0 );
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        my $meta = $updater->getmeta($filename);
+        my $oldmeta = $meta;
+
+        my $wrev = revparse($filename);
+
+        # If the working copy is an old revision, lets get that version too for comparison.
+        if ( defined($wrev) and $wrev != $meta->{revision} )
+        {
+            $oldmeta = $updater->getmeta($filename, $wrev);
+        }
+
+        # TODO : All possible statuses aren't yet implemented
+        my $status;
+        # Files are up to date if the working copy and repo copy have the same revision, and the working copy is unmodified
+        $status = "Up-to-date" if ( defined ( $wrev ) and defined($meta->{revision}) and $wrev == $meta->{revision}
+                                    and
+                                    ( ( $state->{entries}{$filename}{unchanged} and ( not defined ( $state->{entries}{$filename}{conflict} ) or $state->{entries}{$filename}{conflict} !~ /^\+=/ ) )
+                                      or ( defined($state->{entries}{$filename}{modified_hash}) and $state->{entries}{$filename}{modified_hash} eq $meta->{filehash} ) )
+                                   );
+
+        # Need checkout if the working copy has an older revision than the repo copy, and the working copy is unmodified
+        $status ||= "Needs Checkout" if ( defined ( $wrev ) and defined ( $meta->{revision} ) and $meta->{revision} > $wrev
+                                          and
+                                          ( $state->{entries}{$filename}{unchanged}
+                                            or ( defined($state->{entries}{$filename}{modified_hash}) and $state->{entries}{$filename}{modified_hash} eq $oldmeta->{filehash} ) )
+                                        );
+
+        # Need checkout if it exists in the repo but doesn't have a working copy
+        $status ||= "Needs Checkout" if ( not defined ( $wrev ) and defined ( $meta->{revision} ) );
+
+        # Locally modified if working copy and repo copy have the same revision but there are local changes
+        $status ||= "Locally Modified" if ( defined ( $wrev ) and defined($meta->{revision}) and $wrev == $meta->{revision} and $state->{entries}{$filename}{modified_filename} );
+
+        # Needs Merge if working copy revision is less than repo copy and there are local changes
+        $status ||= "Needs Merge" if ( defined ( $wrev ) and defined ( $meta->{revision} ) and $meta->{revision} > $wrev and $state->{entries}{$filename}{modified_filename} );
+
+        $status ||= "Locally Added" if ( defined ( $state->{entries}{$filename}{revision} ) and not defined ( $meta->{revision} ) );
+        $status ||= "Locally Removed" if ( defined ( $wrev ) and defined ( $meta->{revision} ) and -$wrev == $meta->{revision} );
+        $status ||= "Unresolved Conflict" if ( defined ( $state->{entries}{$filename}{conflict} ) and $state->{entries}{$filename}{conflict} =~ /^\+=/ );
+        $status ||= "File had conflicts on merge" if ( 0 );
+
+        $status ||= "Unknown";
+
+        print "M ===================================================================\n";
+        print "M File: $filename\tStatus: $status\n";
+        if ( defined($state->{entries}{$filename}{revision}) )
+        {
+            print "M Working revision:\t" . $state->{entries}{$filename}{revision} . "\n";
+        } else {
+            print "M Working revision:\tNo entry for $filename\n";
+        }
+        if ( defined($meta->{revision}) )
+        {
+            print "M Repository revision:\t1." . $meta->{revision} . "\t$state->{repository}/$filename,v\n";
+            print "M Sticky Tag:\t\t(none)\n";
+            print "M Sticky Date:\t\t(none)\n";
+            print "M Sticky Options:\t\t(none)\n";
+        } else {
+            print "M Repository revision:\tNo revision control file\n";
+        }
+        print "M\n";
+    }
+
+    print "ok\n";
+}
+
+sub req_diff
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("diff");
+
+    $log->debug("req_diff : " . ( defined($data) ? $data : "[NULL]" ));
+    #$log->debug("status state : " . Dumper($state));
+
+    my ($revision1, $revision2);
+    if ( defined ( $state->{opt}{r} ) and ref $state->{opt}{r} eq "ARRAY" )
+    {
+        $revision1 = $state->{opt}{r}[0];
+        $revision2 = $state->{opt}{r}[1];
+    } else {
+        $revision1 = $state->{opt}{r};
+    }
+
+    $revision1 =~ s/^1\.// if ( defined ( $revision1 ) );
+    $revision2 =~ s/^1\.// if ( defined ( $revision2 ) );
+
+    $log->debug("Diffing revisions " . ( defined($revision1) ? $revision1 : "[NULL]" ) . " and " . ( defined($revision2) ? $revision2 : "[NULL]" ) );
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    # if no files were specified, we need to work out what files we should be providing status on ...
+    argsfromdir($updater) if ( scalar ( @{$state->{args}} ) == 0 );
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        my ( $fh, $file1, $file2, $meta1, $meta2, $filediff );
+
+        my $wrev = revparse($filename);
+
+        # We need _something_ to diff against
+        next unless ( defined ( $wrev ) );
+
+        # if we have a -r switch, use it
+        if ( defined ( $revision1 ) )
+        {
+            ( undef, $file1 ) = tempfile( DIR => $TEMP_DIR, OPEN => 0 );
+            $meta1 = $updater->getmeta($filename, $revision1);
+            unless ( defined ( $meta1 ) and $meta1->{filehash} ne "deleted" )
+            {
+                print "E File $filename at revision 1.$revision1 doesn't exist\n";
+                next;
+            }
+            transmitfile($meta1->{filehash}, $file1);
+        }
+        # otherwise we just use the working copy revision
+        else
+        {
+            ( undef, $file1 ) = tempfile( DIR => $TEMP_DIR, OPEN => 0 );
+            $meta1 = $updater->getmeta($filename, $wrev);
+            transmitfile($meta1->{filehash}, $file1);
+        }
+
+        # if we have a second -r switch, use it too
+        if ( defined ( $revision2 ) )
+        {
+            ( undef, $file2 ) = tempfile( DIR => $TEMP_DIR, OPEN => 0 );
+            $meta2 = $updater->getmeta($filename, $revision2);
+
+            unless ( defined ( $meta2 ) and $meta2->{filehash} ne "deleted" )
+            {
+                print "E File $filename at revision 1.$revision2 doesn't exist\n";
+                next;
+            }
+
+            transmitfile($meta2->{filehash}, $file2);
+        }
+        # otherwise we just use the working copy
+        else
+        {
+            $file2 = $state->{entries}{$filename}{modified_filename};
+        }
+
+        # if we have been given -r, and we don't have a $file2 yet, lets get one
+        if ( defined ( $revision1 ) and not defined ( $file2 ) )
+        {
+            ( undef, $file2 ) = tempfile( DIR => $TEMP_DIR, OPEN => 0 );
+            $meta2 = $updater->getmeta($filename, $wrev);
+            transmitfile($meta2->{filehash}, $file2);
+        }
+
+        # We need to have retrieved something useful
+        next unless ( defined ( $meta1 ) );
+
+        # Files to date if the working copy and repo copy have the same revision, and the working copy is unmodified
+        next if ( not defined ( $meta2 ) and $wrev == $meta1->{revision}
+                  and
+                   ( ( $state->{entries}{$filename}{unchanged} and ( not defined ( $state->{entries}{$filename}{conflict} ) or $state->{entries}{$filename}{conflict} !~ /^\+=/ ) )
+                     or ( defined($state->{entries}{$filename}{modified_hash}) and $state->{entries}{$filename}{modified_hash} eq $meta1->{filehash} ) )
+                  );
+
+        # Apparently we only show diffs for locally modified files
+        next unless ( defined($meta2) or defined ( $state->{entries}{$filename}{modified_filename} ) );
+
+        print "M Index: $filename\n";
+        print "M ===================================================================\n";
+        print "M RCS file: $state->{CVSROOT}/$state->{module}/$filename,v\n";
+        print "M retrieving revision 1.$meta1->{revision}\n" if ( defined ( $meta1 ) );
+        print "M retrieving revision 1.$meta2->{revision}\n" if ( defined ( $meta2 ) );
+        print "M diff ";
+        foreach my $opt ( keys %{$state->{opt}} )
+        {
+            if ( ref $state->{opt}{$opt} eq "ARRAY" )
+            {
+                foreach my $value ( @{$state->{opt}{$opt}} )
+                {
+                    print "-$opt $value ";
+                }
+            } else {
+                print "-$opt ";
+                print "$state->{opt}{$opt} " if ( defined ( $state->{opt}{$opt} ) );
+            }
+        }
+        print "$filename\n";
+
+        $log->info("Diffing $filename -r $meta1->{revision} -r " . ( $meta2->{revision} or "workingcopy" ));
+
+        ( $fh, $filediff ) = tempfile ( DIR => $TEMP_DIR );
+
+        if ( exists $state->{opt}{u} )
+        {
+            system("diff -u -L '$filename revision 1.$meta1->{revision}' -L '$filename " . ( defined($meta2->{revision}) ? "revision 1.$meta2->{revision}" : "working copy" ) . "' $file1 $file2 > $filediff");
+        } else {
+            system("diff $file1 $file2 > $filediff");
+        }
+
+        while ( <$fh> )
+        {
+            print "M $_";
+        }
+        close $fh;
+    }
+
+    print "ok\n";
+}
+
+sub req_log
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("log");
+
+    $log->debug("req_log : " . ( defined($data) ? $data : "[NULL]" ));
+    #$log->debug("log state : " . Dumper($state));
+
+    my ( $minrev, $maxrev );
+    if ( defined ( $state->{opt}{r} ) and $state->{opt}{r} =~ /([\d.]+)?(::?)([\d.]+)?/ )
+    {
+        my $control = $2;
+        $minrev = $1;
+        $maxrev = $3;
+        $minrev =~ s/^1\.// if ( defined ( $minrev ) );
+        $maxrev =~ s/^1\.// if ( defined ( $maxrev ) );
+        $minrev++ if ( defined($minrev) and $control eq "::" );
+    }
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    # if no files were specified, we need to work out what files we should be providing status on ...
+    argsfromdir($updater) if ( scalar ( @{$state->{args}} ) == 0 );
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        my $headmeta = $updater->getmeta($filename);
+
+        my $revisions = $updater->getlog($filename);
+        my $totalrevisions = scalar(@$revisions);
+
+        if ( defined ( $minrev ) )
+        {
+            $log->debug("Removing revisions less than $minrev");
+            while ( scalar(@$revisions) > 0 and $revisions->[-1]{revision} < $minrev )
+            {
+                pop @$revisions;
+            }
+        }
+        if ( defined ( $maxrev ) )
+        {
+            $log->debug("Removing revisions greater than $maxrev");
+            while ( scalar(@$revisions) > 0 and $revisions->[0]{revision} > $maxrev )
+            {
+                shift @$revisions;
+            }
+        }
+
+        next unless ( scalar(@$revisions) );
+
+        print "M \n";
+        print "M RCS file: $state->{CVSROOT}/$state->{module}/$filename,v\n";
+        print "M Working file: $filename\n";
+        print "M head: 1.$headmeta->{revision}\n";
+        print "M branch:\n";
+        print "M locks: strict\n";
+        print "M access list:\n";
+        print "M symbolic names:\n";
+        print "M keyword substitution: kv\n";
+        print "M total revisions: $totalrevisions;\tselected revisions: " . scalar(@$revisions) . "\n";
+        print "M description:\n";
+
+        foreach my $revision ( @$revisions )
+        {
+            print "M ----------------------------\n";
+            print "M revision 1.$revision->{revision}\n";
+            # reformat the date for log output
+            $revision->{modified} = sprintf('%04d/%02d/%02d %s', $3, $DATE_LIST->{$2}, $1, $4 ) if ( $revision->{modified} =~ /(\d+)\s+(\w+)\s+(\d+)\s+(\S+)/ and defined($DATE_LIST->{$2}) );
+            $revision->{author} =~ s/\s+.*//;
+            $revision->{author} =~ s/^(.{8}).*/$1/;
+            print "M date: $revision->{modified};  author: $revision->{author};  state: " . ( $revision->{filehash} eq "deleted" ? "dead" : "Exp" ) . ";  lines: +2 -3\n";
+            my $commitmessage = $updater->commitmessage($revision->{commithash});
+            $commitmessage =~ s/^/M /mg;
+            print $commitmessage . "\n";
+        }
+        print "M =============================================================================\n";
+    }
+
+    print "ok\n";
+}
+
+sub req_annotate
+{
+    my ( $cmd, $data ) = @_;
+
+    argsplit("annotate");
+
+    $log->info("req_annotate : " . ( defined($data) ? $data : "[NULL]" ));
+    #$log->debug("status state : " . Dumper($state));
+
+    # Grab a handle to the SQLite db and do any necessary updates
+    my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
+    $updater->update();
+
+    # if no files were specified, we need to work out what files we should be providing annotate on ...
+    argsfromdir($updater) if ( scalar ( @{$state->{args}} ) == 0 );
+
+    # we'll need a temporary checkout dir
+    my $tmpdir = tempdir ( DIR => $TEMP_DIR );
+    my ( undef, $file_index ) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
+    $log->info("Temp checkoutdir creation successful, basing annotate session work on '$tmpdir', index file is '$file_index'");
+
+    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
+    $ENV{GIT_INDEX_FILE} = $file_index;
+
+    chdir $tmpdir;
+
+    # foreach file specified on the commandline ...
+    foreach my $filename ( @{$state->{args}} )
+    {
+        $filename = filecleanup($filename);
+
+        my $meta = $updater->getmeta($filename);
+
+        next unless ( $meta->{revision} );
+
+       # get all the commits that this file was in
+       # in dense format -- aka skip dead revisions
+        my $revisions   = $updater->gethistorydense($filename);
+       my $lastseenin  = $revisions->[0][2];
+
+       # populate the temporary index based on the latest commit were we saw
+       # the file -- but do it cheaply without checking out any files
+       # TODO: if we got a revision from the client, use that instead
+       # to look up the commithash in sqlite (still good to default to
+       # the current head as we do now)
+       system("git-read-tree", $lastseenin);
+       unless ($? == 0)
+       {
+           die "Error running git-read-tree $lastseenin $file_index $!";
+       }
+       $log->info("Created index '$file_index' with commit $lastseenin - exit status $?");
+
+        # do a checkout of the file
+        system('git-checkout-index', '-f', '-u', $filename);
+        unless ($? == 0) {
+            die "Error running git-checkout-index -f -u $filename : $!";
+        }
+
+        $log->info("Annotate $filename");
+
+        # Prepare a file with the commits from the linearized
+        # history that annotate should know about. This prevents
+        # git-jsannotate telling us about commits we are hiding
+        # from the client.
+
+        open(ANNOTATEHINTS, ">$tmpdir/.annotate_hints") or die "Error opening > $tmpdir/.annotate_hints $!";
+        for (my $i=0; $i < @$revisions; $i++)
+        {
+            print ANNOTATEHINTS $revisions->[$i][2];
+            if ($i+1 < @$revisions) { # have we got a parent?
+                print ANNOTATEHINTS ' ' . $revisions->[$i+1][2];
+            }
+            print ANNOTATEHINTS "\n";
+        }
+
+        print ANNOTATEHINTS "\n";
+        close ANNOTATEHINTS;
+
+        my $annotatecmd = 'git-annotate';
+        open(ANNOTATE, "-|", $annotatecmd, '-l', '-S', "$tmpdir/.annotate_hints", $filename)
+           or die "Error invoking $annotatecmd -l -S $tmpdir/.annotate_hints $filename : $!";
+        my $metadata = {};
+        print "E Annotations for $filename\n";
+        print "E ***************\n";
+        while ( <ANNOTATE> )
+        {
+            if (m/^([a-zA-Z0-9]{40})\t\([^\)]*\)(.*)$/i)
+            {
+                my $commithash = $1;
+                my $data = $2;
+                unless ( defined ( $metadata->{$commithash} ) )
+                {
+                    $metadata->{$commithash} = $updater->getmeta($filename, $commithash);
+                    $metadata->{$commithash}{author} =~ s/\s+.*//;
+                    $metadata->{$commithash}{author} =~ s/^(.{8}).*/$1/;
+                    $metadata->{$commithash}{modified} = sprintf("%02d-%s-%02d", $1, $2, $3) if ( $metadata->{$commithash}{modified} =~ /^(\d+)\s(\w+)\s\d\d(\d\d)/ );
+                }
+                printf("M 1.%-5d      (%-8s %10s): %s\n",
+                    $metadata->{$commithash}{revision},
+                    $metadata->{$commithash}{author},
+                    $metadata->{$commithash}{modified},
+                    $data
+                );
+            } else {
+                $log->warn("Error in annotate output! LINE: $_");
+                print "E Annotate error \n";
+                next;
+            }
+        }
+        close ANNOTATE;
+    }
+
+    # done; get out of the tempdir
+    chdir "/";
+
+    print "ok\n";
+
+}
+
+# This method takes the state->{arguments} array and produces two new arrays.
+# The first is $state->{args} which is everything before the '--' argument, and
+# the second is $state->{files} which is everything after it.
+sub argsplit
+{
+    return unless( defined($state->{arguments}) and ref $state->{arguments} eq "ARRAY" );
+
+    my $type = shift;
+
+    $state->{args} = [];
+    $state->{files} = [];
+    $state->{opt} = {};
+
+    if ( defined($type) )
+    {
+        my $opt = {};
+        $opt = { A => 0, N => 0, P => 0, R => 0, c => 0, f => 0, l => 0, n => 0, p => 0, s => 0, r => 1, D => 1, d => 1, k => 1, j => 1, } if ( $type eq "co" );
+        $opt = { v => 0, l => 0, R => 0 } if ( $type eq "status" );
+        $opt = { A => 0, P => 0, C => 0, d => 0, f => 0, l => 0, R => 0, p => 0, k => 1, r => 1, D => 1, j => 1, I => 1, W => 1 } if ( $type eq "update" );
+        $opt = { l => 0, R => 0, k => 1, D => 1, D => 1, r => 2 } if ( $type eq "diff" );
+        $opt = { c => 0, R => 0, l => 0, f => 0, F => 1, m => 1, r => 1 } if ( $type eq "ci" );
+        $opt = { k => 1, m => 1 } if ( $type eq "add" );
+        $opt = { f => 0, l => 0, R => 0 } if ( $type eq "remove" );
+        $opt = { l => 0, b => 0, h => 0, R => 0, t => 0, N => 0, S => 0, r => 1, d => 1, s => 1, w => 1 } if ( $type eq "log" );
+
+
+        while ( scalar ( @{$state->{arguments}} ) > 0 )
+        {
+            my $arg = shift @{$state->{arguments}};
+
+            next if ( $arg eq "--" );
+            next unless ( $arg =~ /\S/ );
+
+            # if the argument looks like a switch
+            if ( $arg =~ /^-(\w)(.*)/ )
+            {
+                # if it's a switch that takes an argument
+                if ( $opt->{$1} )
+                {
+                    # If this switch has already been provided
+                    if ( $opt->{$1} > 1 and exists ( $state->{opt}{$1} ) )
+                    {
+                        $state->{opt}{$1} = [ $state->{opt}{$1} ];
+                        if ( length($2) > 0 )
+                        {
+                            push @{$state->{opt}{$1}},$2;
+                        } else {
+                            push @{$state->{opt}{$1}}, shift @{$state->{arguments}};
+                        }
+                    } else {
+                        # if there's extra data in the arg, use that as the argument for the switch
+                        if ( length($2) > 0 )
+                        {
+                            $state->{opt}{$1} = $2;
+                        } else {
+                            $state->{opt}{$1} = shift @{$state->{arguments}};
+                        }
+                    }
+                } else {
+                    $state->{opt}{$1} = undef;
+                }
+            }
+            else
+            {
+                push @{$state->{args}}, $arg;
+            }
+        }
+    }
+    else
+    {
+        my $mode = 0;
+
+        foreach my $value ( @{$state->{arguments}} )
+        {
+            if ( $value eq "--" )
+            {
+                $mode++;
+                next;
+            }
+            push @{$state->{args}}, $value if ( $mode == 0 );
+            push @{$state->{files}}, $value if ( $mode == 1 );
+        }
+    }
+}
+
+# This method uses $state->{directory} to populate $state->{args} with a list of filenames
+sub argsfromdir
+{
+    my $updater = shift;
+
+    $state->{args} = [];
+
+    foreach my $file ( @{$updater->gethead} )
+    {
+        next if ( $file->{filehash} eq "deleted" and not defined ( $state->{entries}{$file->{name}} ) );
+        next unless ( $file->{name} =~ s/^$state->{directory}// );
+        push @{$state->{args}}, $file->{name};
+    }
+}
+
+# This method cleans up the $state variable after a command that uses arguments has run
+sub statecleanup
+{
+    $state->{files} = [];
+    $state->{args} = [];
+    $state->{arguments} = [];
+    $state->{entries} = {};
+}
+
+sub revparse
+{
+    my $filename = shift;
+
+    return undef unless ( defined ( $state->{entries}{$filename}{revision} ) );
+
+    return $1 if ( $state->{entries}{$filename}{revision} =~ /^1\.(\d+)/ );
+    return -$1 if ( $state->{entries}{$filename}{revision} =~ /^-1\.(\d+)/ );
+
+    return undef;
+}
+
+# This method takes a file hash and does a CVS "file transfer" which transmits the
+# size of the file, and then the file contents.
+# If a second argument $targetfile is given, the file is instead written out to
+# a file by the name of $targetfile
+sub transmitfile
+{
+    my $filehash = shift;
+    my $targetfile = shift;
+
+    if ( defined ( $filehash ) and $filehash eq "deleted" )
+    {
+        $log->warn("filehash is 'deleted'");
+        return;
+    }
+
+    die "Need filehash" unless ( defined ( $filehash ) and $filehash =~ /^[a-zA-Z0-9]{40}$/ );
+
+    my $type = `git-cat-file -t $filehash`;
+    chomp $type;
+
+    die ( "Invalid type '$type' (expected 'blob')" ) unless ( defined ( $type ) and $type eq "blob" );
+
+    my $size = `git-cat-file -s $filehash`;
+    chomp $size;
+
+    $log->debug("transmitfile($filehash) size=$size, type=$type");
+
+    if ( open my $fh, '-|', "git-cat-file", "blob", $filehash )
+    {
+        if ( defined ( $targetfile ) )
+        {
+            open NEWFILE, ">", $targetfile or die("Couldn't open '$targetfile' for writing : $!");
+            print NEWFILE $_ while ( <$fh> );
+            close NEWFILE;
+        } else {
+            print "$size\n";
+            print while ( <$fh> );
+        }
+        close $fh or die ("Couldn't close filehandle for transmitfile()");
+    } else {
+        die("Couldn't execute git-cat-file");
+    }
+}
+
+# This method takes a file name, and returns ( $dirpart, $filepart ) which
+# refers to the directory porition and the file portion of the filename
+# respectively
+sub filenamesplit
+{
+    my $filename = shift;
+
+    my ( $filepart, $dirpart ) = ( $filename, "." );
+    ( $filepart, $dirpart ) = ( $2, $1 ) if ( $filename =~ /(.*)\/(.*)/ );
+    $dirpart .= "/";
+
+    return ( $filepart, $dirpart );
+}
+
+sub filecleanup
+{
+    my $filename = shift;
+
+    return undef unless(defined($filename));
+    if ( $filename =~ /^\// )
+    {
+        print "E absolute filenames '$filename' not supported by server\n";
+        return undef;
+    }
+
+    $filename =~ s/^\.\///g;
+    $filename = $state->{directory} . $filename;
+
+    return $filename;
+}
+
+package GITCVS::log;
+
+####
+#### Copyright The Open University UK - 2006.
+####
+#### Authors: Martyn Smith    <martyn@catalyst.net.nz>
+####          Martin Langhoff <martin@catalyst.net.nz>
+####
+####
+
+use strict;
+use warnings;
+
+=head1 NAME
+
+GITCVS::log
+
+=head1 DESCRIPTION
+
+This module provides very crude logging with a similar interface to
+Log::Log4perl
+
+=head1 METHODS
+
+=cut
+
+=head2 new
+
+Creates a new log object, optionally you can specify a filename here to
+indicate the file to log to. If no log file is specified, you can specifiy one
+later with method setfile, or indicate you no longer want logging with method
+nofile.
+
+Until one of these methods is called, all log calls will buffer messages ready
+to write out.
+
+=cut
+sub new
+{
+    my $class = shift;
+    my $filename = shift;
+
+    my $self = {};
+
+    bless $self, $class;
+
+    if ( defined ( $filename ) )
+    {
+        open $self->{fh}, ">>", $filename or die("Couldn't open '$filename' for writing : $!");
+    }
+
+    return $self;
+}
+
+=head2 setfile
+
+This methods takes a filename, and attempts to open that file as the log file.
+If successful, all buffered data is written out to the file, and any further
+logging is written directly to the file.
+
+=cut
+sub setfile
+{
+    my $self = shift;
+    my $filename = shift;
+
+    if ( defined ( $filename ) )
+    {
+        open $self->{fh}, ">>", $filename or die("Couldn't open '$filename' for writing : $!");
+    }
+
+    return unless ( defined ( $self->{buffer} ) and ref $self->{buffer} eq "ARRAY" );
+
+    while ( my $line = shift @{$self->{buffer}} )
+    {
+        print {$self->{fh}} $line;
+    }
+}
+
+=head2 nofile
+
+This method indicates no logging is going to be used. It flushes any entries in
+the internal buffer, and sets a flag to ensure no further data is put there.
+
+=cut
+sub nofile
+{
+    my $self = shift;
+
+    $self->{nolog} = 1;
+
+    return unless ( defined ( $self->{buffer} ) and ref $self->{buffer} eq "ARRAY" );
+
+    $self->{buffer} = [];
+}
+
+=head2 _logopen
+
+Internal method. Returns true if the log file is open, false otherwise.
+
+=cut
+sub _logopen
+{
+    my $self = shift;
+
+    return 1 if ( defined ( $self->{fh} ) and ref $self->{fh} eq "GLOB" );
+    return 0;
+}
+
+=head2 debug info warn fatal
+
+These four methods are wrappers to _log. They provide the actual interface for
+logging data.
+
+=cut
+sub debug { my $self = shift; $self->_log("debug", @_); }
+sub info  { my $self = shift; $self->_log("info" , @_); }
+sub warn  { my $self = shift; $self->_log("warn" , @_); }
+sub fatal { my $self = shift; $self->_log("fatal", @_); }
+
+=head2 _log
+
+This is an internal method called by the logging functions. It generates a
+timestamp and pushes the logged line either to file, or internal buffer.
+
+=cut
+sub _log
+{
+    my $self = shift;
+    my $level = shift;
+
+    return if ( $self->{nolog} );
+
+    my @time = localtime;
+    my $timestring = sprintf("%4d-%02d-%02d %02d:%02d:%02d : %-5s",
+        $time[5] + 1900,
+        $time[4] + 1,
+        $time[3],
+        $time[2],
+        $time[1],
+        $time[0],
+        uc $level,
+    );
+
+    if ( $self->_logopen )
+    {
+        print {$self->{fh}} $timestring . " - " . join(" ",@_) . "\n";
+    } else {
+        push @{$self->{buffer}}, $timestring . " - " . join(" ",@_) . "\n";
+    }
+}
+
+=head2 DESTROY
+
+This method simply closes the file handle if one is open
+
+=cut
+sub DESTROY
+{
+    my $self = shift;
+
+    if ( $self->_logopen )
+    {
+        close $self->{fh};
+    }
+}
+
+package GITCVS::updater;
+
+####
+#### Copyright The Open University UK - 2006.
+####
+#### Authors: Martyn Smith    <martyn@catalyst.net.nz>
+####          Martin Langhoff <martin@catalyst.net.nz>
+####
+####
+
+use strict;
+use warnings;
+use DBI;
+
+=head1 METHODS
+
+=cut
+
+=head2 new
+
+=cut
+sub new
+{
+    my $class = shift;
+    my $config = shift;
+    my $module = shift;
+    my $log = shift;
+
+    die "Need to specify a git repository" unless ( defined($config) and -d $config );
+    die "Need to specify a module" unless ( defined($module) );
+
+    $class = ref($class) || $class;
+
+    my $self = {};
+
+    bless $self, $class;
+
+    $self->{dbdir} = $config . "/";
+    die "Database dir '$self->{dbdir}' isn't a directory" unless ( defined($self->{dbdir}) and -d $self->{dbdir} );
+
+    $self->{module} = $module;
+    $self->{file} = $self->{dbdir} . "/gitcvs.$module.sqlite";
+
+    $self->{git_path} = $config . "/";
+
+    $self->{log} = $log;
+
+    die "Git repo '$self->{git_path}' doesn't exist" unless ( -d $self->{git_path} );
+
+    $self->{dbh} = DBI->connect("dbi:SQLite:dbname=" . $self->{file},"","");
+
+    $self->{tables} = {};
+    foreach my $table ( $self->{dbh}->tables )
+    {
+        $table =~ s/^"//;
+        $table =~ s/"$//;
+        $self->{tables}{$table} = 1;
+    }
+
+    # Construct the revision table if required
+    unless ( $self->{tables}{revision} )
+    {
+        $self->{dbh}->do("
+            CREATE TABLE revision (
+                name       TEXT NOT NULL,
+                revision   INTEGER NOT NULL,
+                filehash   TEXT NOT NULL,
+                commithash TEXT NOT NULL,
+                author     TEXT NOT NULL,
+                modified   TEXT NOT NULL,
+                mode       TEXT NOT NULL
+            )
+        ");
+    }
+
+    # Construct the revision table if required
+    unless ( $self->{tables}{head} )
+    {
+        $self->{dbh}->do("
+            CREATE TABLE head (
+                name       TEXT NOT NULL,
+                revision   INTEGER NOT NULL,
+                filehash   TEXT NOT NULL,
+                commithash TEXT NOT NULL,
+                author     TEXT NOT NULL,
+                modified   TEXT NOT NULL,
+                mode       TEXT NOT NULL
+            )
+        ");
+    }
+
+    # Construct the properties table if required
+    unless ( $self->{tables}{properties} )
+    {
+        $self->{dbh}->do("
+            CREATE TABLE properties (
+                key        TEXT NOT NULL PRIMARY KEY,
+                value      TEXT
+            )
+        ");
+    }
+
+    # Construct the commitmsgs table if required
+    unless ( $self->{tables}{commitmsgs} )
+    {
+        $self->{dbh}->do("
+            CREATE TABLE commitmsgs (
+                key        TEXT NOT NULL PRIMARY KEY,
+                value      TEXT
+            )
+        ");
+    }
+
+    return $self;
+}
+
+=head2 update
+
+=cut
+sub update
+{
+    my $self = shift;
+
+    # first lets get the commit list
+    $ENV{GIT_DIR} = $self->{git_path};
+
+    # prepare database queries
+    my $db_insert_rev = $self->{dbh}->prepare_cached("INSERT INTO revision (name, revision, filehash, commithash, modified, author, mode) VALUES (?,?,?,?,?,?,?)",{},1);
+    my $db_insert_mergelog = $self->{dbh}->prepare_cached("INSERT INTO commitmsgs (key, value) VALUES (?,?)",{},1);
+    my $db_delete_head = $self->{dbh}->prepare_cached("DELETE FROM head",{},1);
+    my $db_insert_head = $self->{dbh}->prepare_cached("INSERT INTO head (name, revision, filehash, commithash, modified, author, mode) VALUES (?,?,?,?,?,?,?)",{},1);
+
+    my $commitinfo = `git-cat-file commit $self->{module} 2>&1`;
+    unless ( $commitinfo =~ /tree\s+[a-zA-Z0-9]{40}/ )
+    {
+        die("Invalid module '$self->{module}'");
+    }
+
+
+    my $git_log;
+    my $lastcommit = $self->_get_prop("last_commit");
+
+    # Start exclusive lock here...
+    $self->{dbh}->begin_work() or die "Cannot lock database for BEGIN";
+
+    # TODO: log processing is memory bound
+    # if we can parse into a 2nd file that is in reverse order
+    # we can probably do something really efficient
+    my @git_log_params = ('--parents', '--topo-order');
+
+    if (defined $lastcommit) {
+        push @git_log_params, "$lastcommit..$self->{module}";
+    } else {
+        push @git_log_params, $self->{module};
+    }
+    open(GITLOG, '-|', 'git-log', @git_log_params) or die "Cannot call git-log: $!";
+
+    my @commits;
+
+    my %commit = ();
+
+    while ( <GITLOG> )
+    {
+        chomp;
+        if (m/^commit\s+(.*)$/) {
+            # on ^commit lines put the just seen commit in the stack
+            # and prime things for the next one
+            if (keys %commit) {
+                my %copy = %commit;
+                unshift @commits, \%copy;
+                %commit = ();
+            }
+            my @parents = split(m/\s+/, $1);
+            $commit{hash} = shift @parents;
+            $commit{parents} = \@parents;
+        } elsif (m/^(\w+?):\s+(.*)$/ && !exists($commit{message})) {
+            # on rfc822-like lines seen before we see any message,
+            # lowercase the entry and put it in the hash as key-value
+            $commit{lc($1)} = $2;
+        } else {
+            # message lines - skip initial empty line
+            # and trim whitespace
+            if (!exists($commit{message}) && m/^\s*$/) {
+                # define it to mark the end of headers
+                $commit{message} = '';
+                next;
+            }
+            s/^\s+//; s/\s+$//; # trim ws
+            $commit{message} .= $_ . "\n";
+        }
+    }
+    close GITLOG;
+
+    unshift @commits, \%commit if ( keys %commit );
+
+    # Now all the commits are in the @commits bucket
+    # ordered by time DESC. for each commit that needs processing,
+    # determine whether it's following the last head we've seen or if
+    # it's on its own branch, grab a file list, and add whatever's changed
+    # NOTE: $lastcommit refers to the last commit from previous run
+    #       $lastpicked is the last commit we picked in this run
+    my $lastpicked;
+    my $head = {};
+    if (defined $lastcommit) {
+        $lastpicked = $lastcommit;
+    }
+
+    my $committotal = scalar(@commits);
+    my $commitcount = 0;
+
+    # Load the head table into $head (for cached lookups during the update process)
+    foreach my $file ( @{$self->gethead()} )
+    {
+        $head->{$file->{name}} = $file;
+    }
+
+    foreach my $commit ( @commits )
+    {
+        $self->{log}->debug("GITCVS::updater - Processing commit $commit->{hash} (" . (++$commitcount) . " of $committotal)");
+        if (defined $lastpicked)
+        {
+            if (!in_array($lastpicked, @{$commit->{parents}}))
+            {
+                # skip, we'll see this delta
+                # as part of a merge later
+                # warn "skipping off-track  $commit->{hash}\n";
+                next;
+            } elsif (@{$commit->{parents}} > 1) {
+                # it is a merge commit, for each parent that is
+                # not $lastpicked, see if we can get a log
+                # from the merge-base to that parent to put it
+                # in the message as a merge summary.
+                my @parents = @{$commit->{parents}};
+                foreach my $parent (@parents) {
+                    # git-merge-base can potentially (but rarely) throw
+                    # several candidate merge bases. let's assume
+                    # that the first one is the best one.
+                    if ($parent eq $lastpicked) {
+                        next;
+                    }
+                    open my $p, 'git-merge-base '. $lastpicked . ' '
+                    . $parent . '|';
+                    my @output = (<$p>);
+                    close $p;
+                    my $base = join('', @output);
+                    chomp $base;
+                    if ($base) {
+                        my @merged;
+                        # print "want to log between  $base $parent \n";
+                        open(GITLOG, '-|', 'git-log', "$base..$parent")
+                        or die "Cannot call git-log: $!";
+                        my $mergedhash;
+                        while (<GITLOG>) {
+                            chomp;
+                            if (!defined $mergedhash) {
+                                if (m/^commit\s+(.+)$/) {
+                                    $mergedhash = $1;
+                                } else {
+                                    next;
+                                }
+                            } else {
+                                # grab the first line that looks non-rfc822
+                                # aka has content after leading space
+                                if (m/^\s+(\S.*)$/) {
+                                    my $title = $1;
+                                    $title = substr($title,0,100); # truncate
+                                    unshift @merged, "$mergedhash $title";
+                                    undef $mergedhash;
+                                }
+                            }
+                        }
+                        close GITLOG;
+                        if (@merged) {
+                            $commit->{mergemsg} = $commit->{message};
+                            $commit->{mergemsg} .= "\nSummary of merged commits:\n\n";
+                            foreach my $summary (@merged) {
+                                $commit->{mergemsg} .= "\t$summary\n";
+                            }
+                            $commit->{mergemsg} .= "\n\n";
+                            # print "Message for $commit->{hash} \n$commit->{mergemsg}";
+                        }
+                    }
+                }
+            }
+        }
+
+        # convert the date to CVS-happy format
+        $commit->{date} = "$2 $1 $4 $3 $5" if ( $commit->{date} =~ /^\w+\s+(\w+)\s+(\d+)\s+(\d+:\d+:\d+)\s+(\d+)\s+([+-]\d+)$/ );
+
+        if ( defined ( $lastpicked ) )
+        {
+            my $filepipe = open(FILELIST, '-|', 'git-diff-tree', '-r', $lastpicked, $commit->{hash}) or die("Cannot call git-diff-tree : $!");
+            while ( <FILELIST> )
+            {
+                unless ( /^:\d{6}\s+\d{3}(\d)\d{2}\s+[a-zA-Z0-9]{40}\s+([a-zA-Z0-9]{40})\s+(\w)\s+(.*)$/o )
+                {
+                    die("Couldn't process git-diff-tree line : $_");
+                }
+
+                # $log->debug("File mode=$1, hash=$2, change=$3, name=$4");
+
+                my $git_perms = "";
+                $git_perms .= "r" if ( $1 & 4 );
+                $git_perms .= "w" if ( $1 & 2 );
+                $git_perms .= "x" if ( $1 & 1 );
+                $git_perms = "rw" if ( $git_perms eq "" );
+
+                if ( $3 eq "D" )
+                {
+                    #$log->debug("DELETE   $4");
+                    $head->{$4} = {
+                        name => $4,
+                        revision => $head->{$4}{revision} + 1,
+                        filehash => "deleted",
+                        commithash => $commit->{hash},
+                        modified => $commit->{date},
+                        author => $commit->{author},
+                        mode => $git_perms,
+                    };
+                    $db_insert_rev->execute($4, $head->{$4}{revision}, $2, $commit->{hash}, $commit->{date}, $commit->{author}, $git_perms);
+                }
+                elsif ( $3 eq "M" )
+                {
+                    #$log->debug("MODIFIED $4");
+                    $head->{$4} = {
+                        name => $4,
+                        revision => $head->{$4}{revision} + 1,
+                        filehash => $2,
+                        commithash => $commit->{hash},
+                        modified => $commit->{date},
+                        author => $commit->{author},
+                        mode => $git_perms,
+                    };
+                    $db_insert_rev->execute($4, $head->{$4}{revision}, $2, $commit->{hash}, $commit->{date}, $commit->{author}, $git_perms);
+                }
+                elsif ( $3 eq "A" )
+                {
+                    #$log->debug("ADDED    $4");
+                    $head->{$4} = {
+                        name => $4,
+                        revision => 1,
+                        filehash => $2,
+                        commithash => $commit->{hash},
+                        modified => $commit->{date},
+                        author => $commit->{author},
+                        mode => $git_perms,
+                    };
+                    $db_insert_rev->execute($4, $head->{$4}{revision}, $2, $commit->{hash}, $commit->{date}, $commit->{author}, $git_perms);
+                }
+                else
+                {
+                    $log->warn("UNKNOWN FILE CHANGE mode=$1, hash=$2, change=$3, name=$4");
+                    die;
+                }
+            }
+            close FILELIST;
+        } else {
+            # this is used to detect files removed from the repo
+            my $seen_files = {};
+
+            my $filepipe = open(FILELIST, '-|', 'git-ls-tree', '-r', $commit->{hash}) or die("Cannot call git-ls-tree : $!");
+            while ( <FILELIST> )
+            {
+                unless ( /^(\d+)\s+(\w+)\s+([a-zA-Z0-9]+)\s+(.*)$/o )
+                {
+                    die("Couldn't process git-ls-tree line : $_");
+                }
+
+                my ( $git_perms, $git_type, $git_hash, $git_filename ) = ( $1, $2, $3, $4 );
+
+                $seen_files->{$git_filename} = 1;
+
+                my ( $oldhash, $oldrevision, $oldmode ) = (
+                    $head->{$git_filename}{filehash},
+                    $head->{$git_filename}{revision},
+                    $head->{$git_filename}{mode}
+                );
+
+                if ( $git_perms =~ /^\d\d\d(\d)\d\d/o )
+                {
+                    $git_perms = "";
+                    $git_perms .= "r" if ( $1 & 4 );
+                    $git_perms .= "w" if ( $1 & 2 );
+                    $git_perms .= "x" if ( $1 & 1 );
+                } else {
+                    $git_perms = "rw";
+                }
+
+                # unless the file exists with the same hash, we need to update it ...
+                unless ( defined($oldhash) and $oldhash eq $git_hash and defined($oldmode) and $oldmode eq $git_perms )
+                {
+                    my $newrevision = ( $oldrevision or 0 ) + 1;
+
+                    $head->{$git_filename} = {
+                        name => $git_filename,
+                        revision => $newrevision,
+                        filehash => $git_hash,
+                        commithash => $commit->{hash},
+                        modified => $commit->{date},
+                        author => $commit->{author},
+                        mode => $git_perms,
+                    };
+
+
+                    $db_insert_rev->execute($git_filename, $newrevision, $git_hash, $commit->{hash}, $commit->{date}, $commit->{author}, $git_perms);
+                }
+            }
+            close FILELIST;
+
+            # Detect deleted files
+            foreach my $file ( keys %$head )
+            {
+                unless ( exists $seen_files->{$file} or $head->{$file}{filehash} eq "deleted" )
+                {
+                    $head->{$file}{revision}++;
+                    $head->{$file}{filehash} = "deleted";
+                    $head->{$file}{commithash} = $commit->{hash};
+                    $head->{$file}{modified} = $commit->{date};
+                    $head->{$file}{author} = $commit->{author};
+
+                    $db_insert_rev->execute($file, $head->{$file}{revision}, $head->{$file}{filehash}, $commit->{hash}, $commit->{date}, $commit->{author}, $head->{$file}{mode});
+                }
+            }
+            # END : "Detect deleted files"
+        }
+
+
+        if (exists $commit->{mergemsg})
+        {
+            $db_insert_mergelog->execute($commit->{hash}, $commit->{mergemsg});
+        }
+
+        $lastpicked = $commit->{hash};
+
+        $self->_set_prop("last_commit", $commit->{hash});
+    }
+
+    $db_delete_head->execute();
+    foreach my $file ( keys %$head )
+    {
+        $db_insert_head->execute(
+            $file,
+            $head->{$file}{revision},
+            $head->{$file}{filehash},
+            $head->{$file}{commithash},
+            $head->{$file}{modified},
+            $head->{$file}{author},
+            $head->{$file}{mode},
+        );
+    }
+    # invalidate the gethead cache
+    $self->{gethead_cache} = undef;
+
+
+    # Ending exclusive lock here
+    $self->{dbh}->commit() or die "Failed to commit changes to SQLite";
+}
+
+sub _headrev
+{
+    my $self = shift;
+    my $filename = shift;
+
+    my $db_query = $self->{dbh}->prepare_cached("SELECT filehash, revision, mode FROM head WHERE name=?",{},1);
+    $db_query->execute($filename);
+    my ( $hash, $revision, $mode ) = $db_query->fetchrow_array;
+
+    return ( $hash, $revision, $mode );
+}
+
+sub _get_prop
+{
+    my $self = shift;
+    my $key = shift;
+
+    my $db_query = $self->{dbh}->prepare_cached("SELECT value FROM properties WHERE key=?",{},1);
+    $db_query->execute($key);
+    my ( $value ) = $db_query->fetchrow_array;
+
+    return $value;
+}
+
+sub _set_prop
+{
+    my $self = shift;
+    my $key = shift;
+    my $value = shift;
+
+    my $db_query = $self->{dbh}->prepare_cached("UPDATE properties SET value=? WHERE key=?",{},1);
+    $db_query->execute($value, $key);
+
+    unless ( $db_query->rows )
+    {
+        $db_query = $self->{dbh}->prepare_cached("INSERT INTO properties (key, value) VALUES (?,?)",{},1);
+        $db_query->execute($key, $value);
+    }
+
+    return $value;
+}
+
+=head2 gethead
+
+=cut
+
+sub gethead
+{
+    my $self = shift;
+
+    return $self->{gethead_cache} if ( defined ( $self->{gethead_cache} ) );
+
+    my $db_query = $self->{dbh}->prepare_cached("SELECT name, filehash, mode, revision, modified, commithash, author FROM head",{},1);
+    $db_query->execute();
+
+    my $tree = [];
+    while ( my $file = $db_query->fetchrow_hashref )
+    {
+        push @$tree, $file;
+    }
+
+    $self->{gethead_cache} = $tree;
+
+    return $tree;
+}
+
+=head2 getlog
+
+=cut
+
+sub getlog
+{
+    my $self = shift;
+    my $filename = shift;
+
+    my $db_query = $self->{dbh}->prepare_cached("SELECT name, filehash, author, mode, revision, modified, commithash FROM revision WHERE name=? ORDER BY revision DESC",{},1);
+    $db_query->execute($filename);
+
+    my $tree = [];
+    while ( my $file = $db_query->fetchrow_hashref )
+    {
+        push @$tree, $file;
+    }
+
+    return $tree;
+}
+
+=head2 getmeta
+
+This function takes a filename (with path) argument and returns a hashref of
+metadata for that file.
+
+=cut
+
+sub getmeta
+{
+    my $self = shift;
+    my $filename = shift;
+    my $revision = shift;
+
+    my $db_query;
+    if ( defined($revision) and $revision =~ /^\d+$/ )
+    {
+        $db_query = $self->{dbh}->prepare_cached("SELECT * FROM revision WHERE name=? AND revision=?",{},1);
+        $db_query->execute($filename, $revision);
+    }
+    elsif ( defined($revision) and $revision =~ /^[a-zA-Z0-9]{40}$/ )
+    {
+        $db_query = $self->{dbh}->prepare_cached("SELECT * FROM revision WHERE name=? AND commithash=?",{},1);
+        $db_query->execute($filename, $revision);
+    } else {
+        $db_query = $self->{dbh}->prepare_cached("SELECT * FROM head WHERE name=?",{},1);
+        $db_query->execute($filename);
+    }
+
+    return $db_query->fetchrow_hashref;
+}
+
+=head2 commitmessage
+
+this function takes a commithash and returns the commit message for that commit
+
+=cut
+sub commitmessage
+{
+    my $self = shift;
+    my $commithash = shift;
+
+    die("Need commithash") unless ( defined($commithash) and $commithash =~ /^[a-zA-Z0-9]{40}$/ );
+
+    my $db_query;
+    $db_query = $self->{dbh}->prepare_cached("SELECT value FROM commitmsgs WHERE key=?",{},1);
+    $db_query->execute($commithash);
+
+    my ( $message ) = $db_query->fetchrow_array;
+
+    if ( defined ( $message ) )
+    {
+        $message .= " " if ( $message =~ /\n$/ );
+        return $message;
+    }
+
+    my @lines = safe_pipe_capture("git-cat-file", "commit", $commithash);
+    shift @lines while ( $lines[0] =~ /\S/ );
+    $message = join("",@lines);
+    $message .= " " if ( $message =~ /\n$/ );
+    return $message;
+}
+
+=head2 gethistory
+
+This function takes a filename (with path) argument and returns an arrayofarrays
+containing revision,filehash,commithash ordered by revision descending
+
+=cut
+sub gethistory
+{
+    my $self = shift;
+    my $filename = shift;
+
+    my $db_query;
+    $db_query = $self->{dbh}->prepare_cached("SELECT revision, filehash, commithash FROM revision WHERE name=? ORDER BY revision DESC",{},1);
+    $db_query->execute($filename);
+
+    return $db_query->fetchall_arrayref;
+}
+
+=head2 gethistorydense
+
+This function takes a filename (with path) argument and returns an arrayofarrays
+containing revision,filehash,commithash ordered by revision descending.
+
+This version of gethistory skips deleted entries -- so it is useful for annotate.
+The 'dense' part is a reference to a '--dense' option available for git-rev-list
+and other git tools that depend on it.
+
+=cut
+sub gethistorydense
+{
+    my $self = shift;
+    my $filename = shift;
+
+    my $db_query;
+    $db_query = $self->{dbh}->prepare_cached("SELECT revision, filehash, commithash FROM revision WHERE name=? AND filehash!='deleted' ORDER BY revision DESC",{},1);
+    $db_query->execute($filename);
+
+    return $db_query->fetchall_arrayref;
+}
+
+=head2 in_array()
+
+from Array::PAT - mimics the in_array() function
+found in PHP. Yuck but works for small arrays.
+
+=cut
+sub in_array
+{
+    my ($check, @array) = @_;
+    my $retval = 0;
+    foreach my $test (@array){
+        if($check eq $test){
+            $retval =  1;
+        }
+    }
+    return $retval;
+}
+
+=head2 safe_pipe_capture
+
+an alterative to `command` that allows input to be passed as an array
+to work around shell problems with weird characters in arguments
+
+=cut
+sub safe_pipe_capture {
+
+    my @output;
+
+    if (my $pid = open my $child, '-|') {
+        @output = (<$child>);
+        close $child or die join(' ',@_).": $! $?";
+    } else {
+        exec(@_) or die "$! $?"; # exec() can fail the executable can't be found
+    }
+    return wantarray ? @output : join('',@output);
+}
+
+
+1;
diff --git a/git-fetch.sh b/git-fetch.sh

index b4325d9..de4f011 100755 (executable)
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -320,7 +320,7 @@ fetch_main () {
      ( : subshell because we muck with IFS
        IFS="    $LF"
        (
-         git-fetch-pack $exec $keep "$remote" $rref || echo failed "$remote"
+         git-fetch-pack $exec $keep --thin "$remote" $rref || echo failed "$remote"
        ) |
        while read sha1 remote_name
        do
@@ -368,20 +368,25 @@ fetch_main "$reflist"
  # automated tag following
  case "$no_tags$tags" in
  '')
-       taglist=$(IFS=" " &&
-       git-ls-remote $upload_pack --tags "$remote" |
-       sed -ne 's|^\([0-9a-f]*\)[      ]\(refs/tags/.*\)^{}$|\1 \2|p' |
-       while read sha1 name
-       do
-               test -f "$GIT_DIR/$name" && continue
-               git-check-ref-format "$name" || {
-                       echo >&2 "warning: tag ${name} ignored"
-                       continue
-               }
-               git-cat-file -t "$sha1" >/dev/null 2>&1 || continue
-               echo >&2 "Auto-following $name"
-               echo ".${name}:${name}"
-       done)
+       case "$reflist" in
+       *:refs/*)
+               # effective only when we are following remote branch
+               # using local tracking branch.
+               taglist=$(IFS=" " &&
+               git-ls-remote $upload_pack --tags "$remote" |
+               sed -ne 's|^\([0-9a-f]*\)[      ]\(refs/tags/.*\)^{}$|\1 \2|p' |
+               while read sha1 name
+               do
+                       test -f "$GIT_DIR/$name" && continue
+                       git-check-ref-format "$name" || {
+                               echo >&2 "warning: tag ${name} ignored"
+                               continue
+                       }
+                       git-cat-file -t "$sha1" >/dev/null 2>&1 || continue
+                       echo >&2 "Auto-following $name"
+                       echo ".${name}:${name}"
+               done)
+       esac
         case "$taglist" in
         '') ;;
         ?*)
diff --git a/git-merge.sh b/git-merge.sh

index 4609fe5..7be9e81 100755 (executable)
--- a/git-merge.sh
+++ b/git-merge.sh
@@ -134,7 +134,7 @@ case "$#,$common,$no_commit" in
         echo "Updating from $head to $1."
         git-update-index --refresh 2>/dev/null
         new_head=$(git-rev-parse --verify "$1^0") &&
-       git-read-tree -u -m $head "$new_head" &&
+       git-read-tree -u -v -m $head "$new_head" &&
         finish "$new_head" "Fast forward"
         dropsave
         exit 0
@@ -150,7 +150,7 @@ case "$#,$common,$no_commit" in
  
         echo "Trying really trivial in-index merge..."
         git-update-index --refresh 2>/dev/null
-       if git-read-tree --trivial -m -u $common $head "$1" &&
+       if git-read-tree --trivial -m -u -v $common $head "$1" &&
            result_tree=$(git-write-tree)
         then
             echo "Wonderful."
diff --git a/git-push.sh b/git-push.sh

index 706db99..73dcf06 100755 (executable)
--- a/git-push.sh
+++ b/git-push.sh
@@ -8,6 +8,7 @@ USAGE='[--all] [--tags] [--force] <repository> [<refspec>...]'
  has_all=
  has_force=
  has_exec=
+has_thin=
  remote=
  do_tags=
  
@@ -22,6 +23,8 @@ do
                 has_force=--force ;;
         --exec=*)
                 has_exec="$1" ;;
+       --thin)
+               has_thin="$1" ;;
         -*)
                  usage ;;
          *)
@@ -72,6 +75,7 @@ set x "$remote" "$@"; shift
  test "$has_all" && set x "$has_all" "$@" && shift
  test "$has_force" && set x "$has_force" "$@" && shift
  test "$has_exec" && set x "$has_exec" "$@" && shift
+test "$has_thin" && set x "$has_thin" "$@" && shift
  
  case "$remote" in
  http://* | https://*)
diff --git a/git-rm.sh b/git-rm.sh

new file mode 100755 (executable)

index 0000000..fda4541
--- /dev/null
+++ b/git-rm.sh
@@ -0,0 +1,70 @@
+#!/bin/sh
+
+USAGE='[-f] [-n] [-v] [--] <file>...'
+SUBDIRECTORY_OK='Yes'
+. git-sh-setup
+
+remove_files=
+show_only=
+verbose=
+while : ; do
+  case "$1" in
+    -f)
+       remove_files=true
+       ;;
+    -n)
+       show_only=true
+       ;;
+    -v)
+       verbose=--verbose
+       ;;
+    --)
+       shift; break
+       ;;
+    -*)
+       usage
+       ;;
+    *)
+       break
+       ;;
+  esac
+  shift
+done
+
+# This is typo-proofing. If some paths match and some do not, we want
+# to do nothing.
+case "$#" in
+0)     ;;
+*)
+       git-ls-files --error-unmatch -- "$@" >/dev/null || {
+               echo >&2 "Maybe you misspelled it?"
+               exit 1
+       }
+       ;;
+esac
+
+if test -f "$GIT_DIR/info/exclude"
+then
+       git-ls-files -z \
+       --exclude-from="$GIT_DIR/info/exclude" \
+       --exclude-per-directory=.gitignore -- "$@"
+else
+       git-ls-files -z \
+       --exclude-per-directory=.gitignore -- "$@"
+fi |
+case "$show_only,$remove_files" in
+true,*)
+       xargs -0 echo
+       ;;
+*,true)
+       xargs -0 sh -c "
+               while [ \$# -gt 0 ]; do
+                       file=\$1; shift
+                       rm -- \"\$file\" && git-update-index --remove $verbose \"\$file\"
+               done
+       " inline
+       ;;
+*)
+       git-update-index --force-remove $verbose -z --stdin
+       ;;
+esac
diff --git a/http-fetch.c b/http-fetch.c

index ce3df5f..8fd9de0 100644 (file)
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -130,7 +130,7 @@ static void start_object_request(struct object_request *obj_req)
  
         if (obj_req->local < 0) {
                 obj_req->state = ABORTED;
-               error("Couldn't create temporary file %s for %s: %s\n",
+               error("Couldn't create temporary file %s for %s: %s",
                       obj_req->tmpfile, obj_req->filename, strerror(errno));
                 return;
         }
@@ -830,9 +830,9 @@ static int fetch_object(struct alt_base *repo, unsigned char *sha1)
                                     obj_req->errorstr, obj_req->curl_result,
                                     obj_req->http_code, hex);
         } else if (obj_req->zret != Z_STREAM_END) {
-               ret = error("File %s (%s) corrupt\n", hex, obj_req->url);
+               ret = error("File %s (%s) corrupt", hex, obj_req->url);
         } else if (memcmp(obj_req->sha1, obj_req->real_sha1, 20)) {
-               ret = error("File %s has bad hash\n", hex);
+               ret = error("File %s has bad hash", hex);
         } else if (obj_req->rename < 0) {
                 ret = error("unable to write sha1 filename %s",
                             obj_req->filename);
@@ -854,7 +854,7 @@ int fetch(unsigned char *sha1)
                 fetch_alternates(alt->base);
                 altbase = altbase->next;
         }
-       return error("Unable to find %s under %s\n", sha1_to_hex(sha1),
+       return error("Unable to find %s under %s", sha1_to_hex(sha1),
                      alt->base);
  }
  
diff --git a/pack-objects.c b/pack-objects.c

index e3946db..32befa0 100644 (file)
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,9 @@
  #include "delta.h"
  #include "pack.h"
  #include "csum-file.h"
+#include "diff.h"
  #include <sys/time.h>
+#include <signal.h>
  
  static const char pack_usage[] = "git-pack-objects [-q] [--no-reuse-delta] [--non-empty] [--local] [--incremental] [--window=N] [--depth=N] {--stdout | base-name} < object-list";
  
@@ -26,6 +28,13 @@ struct object_entry {
         struct object_entry *delta_sibling; /* other deltified objects who
                                              * uses the same base as me
                                              */
+       int preferred_base;     /* we do not pack this, but is encouraged to
+                                * be used as the base objectto delta huge
+                                * objects against.
+                                */
+       int based_on_preferred; /* current delta candidate is a preferred
+                                * one, or delta against a preferred one.
+                                */
  };
  
  /*
@@ -48,10 +57,11 @@ static int local = 0;
  static int incremental = 0;
  static struct object_entry **sorted_by_sha, **sorted_by_type;
  static struct object_entry *objects = NULL;
-static int nr_objects = 0, nr_alloc = 0;
+static int nr_objects = 0, nr_alloc = 0, nr_result = 0;
  static const char *base_name;
  static unsigned char pack_file_sha1[20];
  static int progress = 1;
+static volatile int progress_update = 0;
  
  /*
   * The object names in objects array are hashed with this hashtable,
@@ -229,7 +239,8 @@ static int encode_header(enum object_type type, unsigned long size, unsigned cha
         return n;
  }
  
-static unsigned long write_object(struct sha1file *f, struct object_entry *entry)
+static unsigned long write_object(struct sha1file *f,
+                                 struct object_entry *entry)
  {
         unsigned long size;
         char type[10];
@@ -239,6 +250,9 @@ static unsigned long write_object(struct sha1file *f, struct object_entry *entry
         enum object_type obj_type;
         int to_reuse = 0;
  
+       if (entry->preferred_base)
+               return 0;
+
         obj_type = entry->type;
         if (! entry->in_pack)
                 to_reuse = 0;   /* can't reuse what we don't have */
@@ -322,28 +336,51 @@ static void write_pack_file(void)
         struct sha1file *f;
         unsigned long offset;
         struct pack_header hdr;
+       unsigned last_percent = 999;
+       int do_progress = 0;
  
         if (!base_name)
                 f = sha1fd(1, "<stdout>");
-       else
-               f = sha1create("%s-%s.%s", base_name, sha1_to_hex(object_list_sha1), "pack");
+       else {
+               f = sha1create("%s-%s.%s", base_name,
+                              sha1_to_hex(object_list_sha1), "pack");
+               do_progress = progress;
+       }
+       if (do_progress)
+               fprintf(stderr, "Writing %d objects.\n", nr_result);
+
         hdr.hdr_signature = htonl(PACK_SIGNATURE);
         hdr.hdr_version = htonl(PACK_VERSION);
-       hdr.hdr_entries = htonl(nr_objects);
+       hdr.hdr_entries = htonl(nr_result);
         sha1write(f, &hdr, sizeof(hdr));
         offset = sizeof(hdr);
-       for (i = 0; i < nr_objects; i++)
+       if (!nr_result)
+               goto done;
+       for (i = 0; i < nr_objects; i++) {
                 offset = write_one(f, objects + i, offset);
-
+               if (do_progress) {
+                       unsigned percent = written * 100 / nr_result;
+                       if (progress_update || percent != last_percent) {
+                               fprintf(stderr, "%4u%% (%u/%u) done\r",
+                                       percent, written, nr_result);
+                               progress_update = 0;
+                               last_percent = percent;
+                       }
+               }
+       }
+       if (do_progress)
+               fputc('\n', stderr);
+ done:
         sha1close(f, pack_file_sha1, 1);
  }
  
  static void write_index_file(void)
  {
         int i;
-       struct sha1file *f = sha1create("%s-%s.%s", base_name, sha1_to_hex(object_list_sha1), "idx");
+       struct sha1file *f = sha1create("%s-%s.%s", base_name,
+                                       sha1_to_hex(object_list_sha1), "idx");
         struct object_entry **list = sorted_by_sha;
-       struct object_entry **last = list + nr_objects;
+       struct object_entry **last = list + nr_result;
         unsigned int array[256];
  
         /*
@@ -368,7 +405,7 @@ static void write_index_file(void)
          * Write the actual SHA1 entries..
          */
         list = sorted_by_sha;
-       for (i = 0; i < nr_objects; i++) {
+       for (i = 0; i < nr_result; i++) {
                 struct object_entry *entry = *list++;
                 unsigned int offset = htonl(entry->offset);
                 sha1write(f, &offset, 4);
@@ -378,27 +415,87 @@ static void write_index_file(void)
         sha1close(f, NULL, 1);
  }
  
-static int add_object_entry(unsigned char *sha1, unsigned int hash)
+static int locate_object_entry_hash(const unsigned char *sha1)
  {
+       int i;
+       unsigned int ui;
+       memcpy(&ui, sha1, sizeof(unsigned int));
+       i = ui % object_ix_hashsz;
+       while (0 < object_ix[i]) {
+               if (!memcmp(sha1, objects[object_ix[i]-1].sha1, 20))
+                       return i;
+               if (++i == object_ix_hashsz)
+                       i = 0;
+       }
+       return -1 - i;
+}
+
+static struct object_entry *locate_object_entry(const unsigned char *sha1)
+{
+       int i;
+
+       if (!object_ix_hashsz)
+               return NULL;
+
+       i = locate_object_entry_hash(sha1);
+       if (0 <= i)
+               return &objects[object_ix[i]-1];
+       return NULL;
+}
+
+static void rehash_objects(void)
+{
+       int i;
+       struct object_entry *oe;
+
+       object_ix_hashsz = nr_objects * 3;
+       if (object_ix_hashsz < 1024)
+               object_ix_hashsz = 1024;
+       object_ix = xrealloc(object_ix, sizeof(int) * object_ix_hashsz);
+       object_ix = memset(object_ix, 0, sizeof(int) * object_ix_hashsz);
+       for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
+               int ix = locate_object_entry_hash(oe->sha1);
+               if (0 <= ix)
+                       continue;
+               ix = -1 - ix;
+               object_ix[ix] = i + 1;
+       }
+}
+
+static int add_object_entry(const unsigned char *sha1, const char *name, int exclude)
+{
+       unsigned int hash = 0;
         unsigned int idx = nr_objects;
         struct object_entry *entry;
         struct packed_git *p;
         unsigned int found_offset = 0;
         struct packed_git *found_pack = NULL;
-
-       for (p = packed_git; p; p = p->next) {
-               struct pack_entry e;
-               if (find_pack_entry_one(sha1, &e, p)) {
-                       if (incremental)
-                               return 0;
-                       if (local && !p->pack_local)
-                               return 0;
-                       if (!found_pack) {
-                               found_offset = e.offset;
-                               found_pack = e.p;
+       int ix;
+
+       if (!exclude) {
+               for (p = packed_git; p; p = p->next) {
+                       struct pack_entry e;
+                       if (find_pack_entry_one(sha1, &e, p)) {
+                               if (incremental)
+                                       return 0;
+                               if (local && !p->pack_local)
+                                       return 0;
+                               if (!found_pack) {
+                                       found_offset = e.offset;
+                                       found_pack = e.p;
+                               }
                         }
                 }
         }
+       if ((entry = locate_object_entry(sha1)) != NULL)
+               goto already_added;
+
+       while (*name) {
+               unsigned char c = *name++;
+               if (isspace(c))
+                       continue;
+               hash = hash * 11 + c;
+       }
  
         if (idx >= nr_alloc) {
                 unsigned int needed = (idx + 1024) * 3 / 2;
@@ -406,45 +503,83 @@ static int add_object_entry(unsigned char *sha1, unsigned int hash)
                 nr_alloc = needed;
         }
         entry = objects + idx;
+       nr_objects = idx + 1;
         memset(entry, 0, sizeof(*entry));
         memcpy(entry->sha1, sha1, 20);
         entry->hash = hash;
-       if (found_pack) {
-               entry->in_pack = found_pack;
-               entry->in_pack_offset = found_offset;
+
+       if (object_ix_hashsz * 3 <= nr_objects * 4)
+               rehash_objects();
+       else {
+               ix = locate_object_entry_hash(entry->sha1);
+               if (0 <= ix)
+                       die("internal error in object hashing.");
+               object_ix[-1 - ix] = idx + 1;
+       }
+
+ already_added:
+       if (progress_update) {
+               fprintf(stderr, "Counting objects...%d\r", nr_objects);
+               progress_update = 0;
+       }
+       if (exclude)
+               entry->preferred_base = 1;
+       else {
+               if (found_pack) {
+                       entry->in_pack = found_pack;
+                       entry->in_pack_offset = found_offset;
+               }
         }
-       nr_objects = idx+1;
         return 1;
  }
  
-static int locate_object_entry_hash(unsigned char *sha1)
+static void add_pbase_tree(struct tree_desc *tree)
  {
-       int i;
-       unsigned int ui;
-       memcpy(&ui, sha1, sizeof(unsigned int));
-       i = ui % object_ix_hashsz;
-       while (0 < object_ix[i]) {
-               if (!memcmp(sha1, objects[object_ix[i]-1].sha1, 20))
-                       return i;
-               if (++i == object_ix_hashsz)
-                       i = 0;
+       while (tree->size) {
+               const unsigned char *sha1;
+               const char *name;
+               unsigned mode;
+               unsigned long size;
+               char type[20];
+
+               sha1 = tree_entry_extract(tree, &name, &mode);
+               update_tree_entry(tree);
+               if (!has_sha1_file(sha1))
+                       continue;
+               if (sha1_object_info(sha1, type, &size))
+                       continue;
+               add_object_entry(sha1, name, 1);
+               if (!strcmp(type, "tree")) {
+                       struct tree_desc sub;
+                       void *elem;
+                       elem = read_sha1_file(sha1, type, &sub.size);
+                       sub.buf = elem;
+                       if (sub.buf) {
+                               add_pbase_tree(&sub);
+                               free(elem);
+                       }
+               }
         }
-       return -1 - i;
  }
  
-static struct object_entry *locate_object_entry(unsigned char *sha1)
+static void add_preferred_base(unsigned char *sha1)
  {
-       int i = locate_object_entry_hash(sha1);
-       if (0 <= i)
-               return &objects[object_ix[i]-1];
-       return NULL;
+       struct tree_desc tree;
+       void *elem;
+       elem = read_object_with_reference(sha1, "tree", &tree.size, NULL);
+       tree.buf = elem;
+       if (!tree.buf)
+               return;
+       add_object_entry(sha1, "", 1);
+       add_pbase_tree(&tree);
+       free(elem);
  }
  
  static void check_object(struct object_entry *entry)
  {
         char type[20];
  
-       if (entry->in_pack) {
+       if (entry->in_pack && !entry->preferred_base) {
                 unsigned char base[20];
                 unsigned long size;
                 struct object_entry *base_entry;
@@ -463,7 +598,8 @@ static void check_object(struct object_entry *entry)
                  */
                 if (!no_reuse_delta &&
                     entry->in_pack_type == OBJ_DELTA &&
-                   (base_entry = locate_object_entry(base))) {
+                   (base_entry = locate_object_entry(base)) &&
+                   (!base_entry->preferred_base)) {
  
                         /* Depth value does not matter - find_deltas()
                          * will never consider reused delta as the
@@ -501,25 +637,6 @@ static void check_object(struct object_entry *entry)
                     sha1_to_hex(entry->sha1), type);
  }
  
-static void hash_objects(void)
-{
-       int i;
-       struct object_entry *oe;
-
-       object_ix_hashsz = nr_objects * 2;
-       object_ix = xcalloc(sizeof(int), object_ix_hashsz);
-       for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
-               int ix = locate_object_entry_hash(oe->sha1);
-               if (0 <= ix) {
-                       error("the same object '%s' added twice",
-                             sha1_to_hex(oe->sha1));
-                       continue;
-               }
-               ix = -1 - ix;
-               object_ix[ix] = i + 1;
-       }
-}
-
  static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
  {
         struct object_entry *child = me->delta_child;
@@ -538,7 +655,6 @@ static void get_object_details(void)
         int i;
         struct object_entry *entry;
  
-       hash_objects();
         prepare_pack_ix();
         for (i = 0, entry = objects; i < nr_objects; i++, entry++)
                 check_object(entry);
@@ -576,6 +692,24 @@ static int sha1_sort(const struct object_entry *a, const struct object_entry *b)
         return memcmp(a->sha1, b->sha1, 20);
  }
  
+static struct object_entry **create_final_object_list()
+{
+       struct object_entry **list;
+       int i, j;
+
+       for (i = nr_result = 0; i < nr_objects; i++)
+               if (!objects[i].preferred_base)
+                       nr_result++;
+       list = xmalloc(nr_result * sizeof(struct object_entry *));
+       for (i = j = 0; i < nr_objects; i++) {
+               if (!objects[i].preferred_base)
+                       list[j++] = objects + i;
+       }
+       current_sort = sha1_sort;
+       qsort(list, nr_result, sizeof(struct object_entry *), sort_comparator);
+       return list;
+}
+
  static int type_size_sort(const struct object_entry *a, const struct object_entry *b)
  {
         if (a->type < b->type)
@@ -586,6 +720,10 @@ static int type_size_sort(const struct object_entry *a, const struct object_entr
                 return -1;
         if (a->hash > b->hash)
                 return 1;
+       if (a->preferred_base < b->preferred_base)
+               return -1;
+       if (a->preferred_base > b->preferred_base)
+               return 1;
         if (a->size < b->size)
                 return -1;
         if (a->size > b->size)
@@ -610,6 +748,8 @@ static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_de
  {
         struct object_entry *cur_entry = cur->entry;
         struct object_entry *old_entry = old->entry;
+       int old_preferred = (old_entry->preferred_base ||
+                            old_entry->based_on_preferred);
         unsigned long size, oldsize, delta_size, sizediff;
         long max_size;
         void *delta_buf;
@@ -618,9 +758,15 @@ static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_de
         if (cur_entry->type != old_entry->type)
                 return -1;
  
-       /* If the current object is at edge, take the depth the objects
-        * that depend on the current object into account -- otherwise
-        * they would become too deep.
+       /* We do not compute delta to *create* objects we are not
+        * going to pack.
+        */
+       if (cur_entry->preferred_base)
+               return -1;
+
+       /* If the current object is at pack edge, take the depth the
+        * objects that depend on the current object into account --
+        * otherwise they would become too deep.
          */
         if (cur_entry->delta_child) {
                 if (max_depth <= cur_entry->delta_limit)
@@ -645,8 +791,27 @@ static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_de
          * delete).
          */
         max_size = size / 2 - 20;
-       if (cur_entry->delta)
-               max_size = cur_entry->delta_size-1;
+       if (cur_entry->delta) {
+               if (cur_entry->based_on_preferred) {
+                       if (old_preferred)
+                               max_size = cur_entry->delta_size-1;
+                       else
+                               /* trying with non-preferred one when we
+                                * already have a delta based on preferred
+                                * one is pointless.
+                                */
+                               return 0;
+               }
+               else if (!old_preferred)
+                       max_size = cur_entry->delta_size-1;
+               else
+                       /* otherwise...  even if delta with a
+                        * preferred one produces a bigger result than
+                        * what we currently have, which is based on a
+                        * non-preferred one, it is OK.
+                        */
+                       ;
+       }
         if (sizediff >= max_size)
                 return -1;
         delta_buf = diff_delta(old->data, oldsize,
@@ -656,21 +821,30 @@ static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_de
         cur_entry->delta = old_entry;
         cur_entry->delta_size = delta_size;
         cur_entry->depth = old_entry->depth + 1;
+       cur_entry->based_on_preferred = old_preferred;
         free(delta_buf);
         return 0;
  }
  
+static void progress_interval(int signum)
+{
+       signal(SIGALRM, progress_interval);
+       progress_update = 1;
+}
+
  static void find_deltas(struct object_entry **list, int window, int depth)
  {
         int i, idx;
         unsigned int array_size = window * sizeof(struct unpacked);
         struct unpacked *array = xmalloc(array_size);
-       int eye_candy;
+       unsigned processed = 0;
+       unsigned last_percent = 999;
  
         memset(array, 0, array_size);
         i = nr_objects;
         idx = 0;
-       eye_candy = i - (nr_objects / 20);
+       if (progress)
+               fprintf(stderr, "Deltifying %d objects.\n", nr_result);
  
         while (--i >= 0) {
                 struct object_entry *entry = list[i];
@@ -679,9 +853,17 @@ static void find_deltas(struct object_entry **list, int window, int depth)
                 char type[10];
                 int j;
  
-               if (progress && i <= eye_candy) {
-                       eye_candy -= nr_objects / 20;
-                       fputc('.', stderr);
+               if (!entry->preferred_base)
+                       processed++;
+
+               if (progress) {
+                       unsigned percent = processed * 100 / nr_result;
+                       if (percent != last_percent || progress_update) {
+                               fprintf(stderr, "%4u%% (%u/%u) done\r",
+                                       percent, processed, nr_result);
+                               progress_update = 0;
+                               last_percent = percent;
+                       }
                 }
  
                 if (entry->delta)
@@ -713,6 +895,9 @@ static void find_deltas(struct object_entry **list, int window, int depth)
                         idx = 0;
         }
  
+       if (progress)
+               fputc('\n', stderr);
+
         for (i = 0; i < window; ++i)
                 free(array[i].data);
         free(array);
@@ -720,18 +905,10 @@ static void find_deltas(struct object_entry **list, int window, int depth)
  
  static void prepare_pack(int window, int depth)
  {
-       if (progress)
-               fprintf(stderr, "Packing %d objects", nr_objects);
         get_object_details();
-       if (progress)
-               fputc('.', stderr);
-
         sorted_by_type = create_sorted_list(type_size_sort);
         if (window && depth)
                 find_deltas(sorted_by_type, window+1, depth);
-       if (progress)
-               fputc('\n', stderr);
-       write_pack_file();
  }
  
  static int reuse_cached_pack(unsigned char *sha1, int pack_to_stdout)
@@ -795,10 +972,6 @@ int main(int argc, char **argv)
         int window = 10, depth = 10, pack_to_stdout = 0;
         struct object_entry **list;
         int i;
-       struct timeval prev_tv;
-       int eye_candy = 0;
-       int eye_candy_incr = 500;
-
  
         setup_git_directory();
  
@@ -855,61 +1028,60 @@ int main(int argc, char **argv)
                 usage(pack_usage);
  
         prepare_packed_git();
+
         if (progress) {
+               struct itimerval v;
+               v.it_interval.tv_sec = 1;
+               v.it_interval.tv_usec = 0;
+               v.it_value = v.it_interval;
+               signal(SIGALRM, progress_interval);
+               setitimer(ITIMER_REAL, &v, NULL);
                 fprintf(stderr, "Generating pack...\n");
-               gettimeofday(&prev_tv, NULL);
         }
+
         while (fgets(line, sizeof(line), stdin) != NULL) {
-               unsigned int hash;
-               char *p;
                 unsigned char sha1[20];
  
-               if (progress && (eye_candy <= nr_objects)) {
-                       fprintf(stderr, "Counting objects...%d\r", nr_objects);
-                       if (eye_candy && (50 <= eye_candy_incr)) {
-                               struct timeval tv;
-                               int time_diff;
-                               gettimeofday(&tv, NULL);
-                               time_diff = (tv.tv_sec - prev_tv.tv_sec);
-                               time_diff <<= 10;
-                               time_diff += (tv.tv_usec - prev_tv.tv_usec);
-                               if ((1 << 9) < time_diff)
-                                       eye_candy_incr += 50;
-                               else if (50 < eye_candy_incr)
-                                       eye_candy_incr -= 50;
-                       }
-                       eye_candy += eye_candy_incr;
+               if (line[0] == '-') {
+                       if (get_sha1_hex(line+1, sha1))
+                               die("expected edge sha1, got garbage:\n %s",
+                                   line+1);
+                       add_preferred_base(sha1);
+                       continue;
                 }
                 if (get_sha1_hex(line, sha1))
                         die("expected sha1, got garbage:\n %s", line);
-               hash = 0;
-               p = line+40;
-               while (*p) {
-                       unsigned char c = *p++;
-                       if (isspace(c))
-                               continue;
-                       hash = hash * 11 + c;
-               }
-               add_object_entry(sha1, hash);
+               add_object_entry(sha1, line+40, 0);
         }
         if (progress)
                 fprintf(stderr, "Done counting %d objects.\n", nr_objects);
-       if (non_empty && !nr_objects)
+       sorted_by_sha = create_final_object_list();
+       if (non_empty && !nr_result)
                 return 0;
  
-       sorted_by_sha = create_sorted_list(sha1_sort);
         SHA1_Init(&ctx);
         list = sorted_by_sha;
-       for (i = 0; i < nr_objects; i++) {
+       for (i = 0; i < nr_result; i++) {
                 struct object_entry *entry = *list++;
                 SHA1_Update(&ctx, entry->sha1, 20);
         }
         SHA1_Final(object_list_sha1, &ctx);
+       if (progress && (nr_objects != nr_result))
+               fprintf(stderr, "Result has %d objects.\n", nr_result);
  
         if (reuse_cached_pack(object_list_sha1, pack_to_stdout))
                 ;
         else {
-               prepare_pack(window, depth);
+               if (nr_result)
+                       prepare_pack(window, depth);
+               if (progress && pack_to_stdout) {
+                       /* the other end usually displays progress itself */
+                       struct itimerval v = {{0,},};
+                       setitimer(ITIMER_REAL, &v, NULL);
+                       signal(SIGALRM, SIG_IGN );
+                       progress_update = 0;
+               }
+               write_pack_file();
                 if (!pack_to_stdout) {
                         write_index_file();
                         puts(sha1_to_hex(object_list_sha1));
@@ -917,6 +1089,6 @@ int main(int argc, char **argv)
         }
         if (progress)
                 fprintf(stderr, "Total %d, written %d (delta %d), reused %d (delta %d)\n",
-                       nr_objects, written, written_delta, reused, reused_delta);
+                       nr_result, written, written_delta, reused, reused_delta);
         return 0;
  }
diff --git a/read-tree.c b/read-tree.c

index 52f06e3..f39fe5c 100644 (file)
--- a/read-tree.c
+++ b/read-tree.c
@@ -9,6 +9,8 @@
  
  #include "object.h"
  #include "tree.h"
+#include <sys/time.h>
+#include <signal.h>
  
  static int merge = 0;
  static int update = 0;
@@ -16,6 +18,8 @@ static int index_only = 0;
  static int nontrivial_merge = 0;
  static int trivial_merges_only = 0;
  static int aggressive = 0;
+static int verbose_update = 0;
+static volatile int progress_update = 0;
  
  static int head_idx = -1;
  static int merge_size = 0;
@@ -267,6 +271,12 @@ static void unlink_entry(char *name)
         }
  }
  
+static void progress_interval(int signum)
+{
+       signal(SIGALRM, progress_interval);
+       progress_update = 1;
+}
+
  static void check_updates(struct cache_entry **src, int nr)
  {
         static struct checkout state = {
@@ -276,8 +286,49 @@ static void check_updates(struct cache_entry **src, int nr)
                 .refresh_cache = 1,
         };
         unsigned short mask = htons(CE_UPDATE);
+       unsigned last_percent = 200, cnt = 0, total = 0;
+
+       if (update && verbose_update) {
+               struct itimerval v;
+
+               for (total = cnt = 0; cnt < nr; cnt++) {
+                       struct cache_entry *ce = src[cnt];
+                       if (!ce->ce_mode || ce->ce_flags & mask)
+                               total++;
+               }
+
+               /* Don't bother doing this for very small updates */
+               if (total < 250)
+                       total = 0;
+
+               if (total) {
+                       v.it_interval.tv_sec = 1;
+                       v.it_interval.tv_usec = 0;
+                       v.it_value = v.it_interval;
+                       signal(SIGALRM, progress_interval);
+                       setitimer(ITIMER_REAL, &v, NULL);
+                       fprintf(stderr, "Checking files out...\n");
+                       progress_update = 1;
+               }
+               cnt = 0;
+       }
+
         while (nr--) {
                 struct cache_entry *ce = *src++;
+
+               if (total) {
+                       if (!ce->ce_mode || ce->ce_flags & mask) {
+                               unsigned percent;
+                               cnt++;
+                               percent = (cnt * 100) / total;
+                               if (percent != last_percent ||
+                                   progress_update) {
+                                       fprintf(stderr, "%4u%% (%u/%u) done\r",
+                                               percent, cnt, total);
+                                       last_percent = percent;
+                               }
+                       }
+               }
                 if (!ce->ce_mode) {
                         if (update)
                                 unlink_entry(ce->name);
@@ -289,6 +340,10 @@ static void check_updates(struct cache_entry **src, int nr)
                                 checkout_entry(ce, &state);
                 }
         }
+       if (total) {
+               fputc('\n', stderr);
+               signal(SIGALRM, SIG_IGN);
+       }
  }
  
  static int unpack_trees(merge_fn_t fn)
@@ -564,7 +619,7 @@ static int twoway_merge(struct cache_entry **src)
         struct cache_entry *oldtree = src[1], *newtree = src[2];
  
         if (merge_size != 2)
-               return error("Cannot do a twoway merge of %d trees\n",
+               return error("Cannot do a twoway merge of %d trees",
                              merge_size);
  
         if (current) {
@@ -616,7 +671,7 @@ static int oneway_merge(struct cache_entry **src)
         struct cache_entry *a = src[1];
  
         if (merge_size != 1)
-               return error("Cannot do a oneway merge of %d trees\n",
+               return error("Cannot do a oneway merge of %d trees",
                              merge_size);
  
         if (!a)
@@ -680,6 +735,11 @@ int main(int argc, char **argv)
                         continue;
                 }
  
+               if (!strcmp(arg, "-v")) {
+                       verbose_update = 1;
+                       continue;
+               }
+
                 /* "-i" means "index only", meaning that a merge will
                  * not even look at the working tree.
                  */
diff --git a/receive-pack.c b/receive-pack.c

index eae31e3..2a3db16 100644 (file)
--- a/receive-pack.c
+++ b/receive-pack.c
@@ -92,7 +92,7 @@ static int run_update_hook(const char *refname,
         case -ERR_RUN_COMMAND_WAITPID_WRONG_PID:
                 return error("waitpid is confused");
         case -ERR_RUN_COMMAND_WAITPID_SIGNAL:
-               return error("%s died of signal\n", update_hook);
+               return error("%s died of signal", update_hook);
         case -ERR_RUN_COMMAND_WAITPID_NOEXIT:
                 return error("%s died strangely", update_hook);
         default:
@@ -158,7 +158,7 @@ static int update(struct command *cmd)
         if (run_update_hook(name, old_hex, new_hex)) {
                 unlink(lock_name);
                 cmd->error_string = "hook declined";
-               return error("hook declined to update %s\n", name);
+               return error("hook declined to update %s", name);
         }
         else if (rename(lock_name, name) < 0) {
                 unlink(lock_name);
diff --git a/refs.c b/refs.c

index d01fc39..826ae7a 100644 (file)
--- a/refs.c
+++ b/refs.c
@@ -268,7 +268,7 @@ static int write_ref_file(const char *filename,
         char term = '\n';
         if (write(fd, hex, 40) < 40 ||
             write(fd, &term, 1) < 1) {
-               error("Couldn't write %s\n", filename);
+               error("Couldn't write %s", filename);
                 close(fd);
                 return -1;
         }
diff --git a/rev-list.c b/rev-list.c

index f2d1105..1589400 100644 (file)
--- a/rev-list.c
+++ b/rev-list.c
@@ -30,7 +30,7 @@ static const char rev_list_usage[] =
  "    --date-order\n"
  "  formatting output:\n"
  "    --parents\n"
-"    --objects\n"
+"    --objects | --objects-edge\n"
  "    --unpacked\n"
  "    --header | --pretty\n"
  "    --abbrev=nr | --no-abbrev\n"
@@ -44,6 +44,7 @@ static int bisect_list = 0;
  static int tag_objects = 0;
  static int tree_objects = 0;
  static int blob_objects = 0;
+static int edge_hint = 0;
  static int verbose_header = 0;
  static int abbrev = DEFAULT_ABBREV;
  static int show_parents = 0;
@@ -255,8 +256,8 @@ static void show_commit_list(struct commit_list *list)
                 die("unknown pending object %s (%s)", sha1_to_hex(obj->sha1), name);
         }
         while (objects) {
-               /* An object with name "foo\n0000000000000000000000000000000000000000"
-                * can be used confuse downstream git-pack-objects very badly.
+               /* An object with name "foo\n0000000..." can be used to
+                * confuse downstream git-pack-objects very badly.
                  */
                 const char *ep = strchr(objects->name, '\n');
                 if (ep) {
@@ -430,16 +431,30 @@ static struct commit_list *find_bisection(struct commit_list *list)
         return best;
  }
  
+static void mark_edge_parents_uninteresting(struct commit *commit)
+{
+       struct commit_list *parents;
+
+       for (parents = commit->parents; parents; parents = parents->next) {
+               struct commit *parent = parents->item;
+               if (!(parent->object.flags & UNINTERESTING))
+                       continue;
+               mark_tree_uninteresting(parent->tree);
+               if (edge_hint)
+                       printf("-%s\n", sha1_to_hex(parent->object.sha1));
+       }
+}
+
  static void mark_edges_uninteresting(struct commit_list *list)
  {
         for ( ; list; list = list->next) {
-               struct commit_list *parents = list->item->parents;
+               struct commit *commit = list->item;
  
-               for ( ; parents; parents = parents->next) {
-                       struct commit *commit = parents->item;
-                       if (commit->object.flags & UNINTERESTING)
-                               mark_tree_uninteresting(commit->tree);
+               if (commit->object.flags & UNINTERESTING) {
+                       mark_tree_uninteresting(commit->tree);
+                       continue;
                 }
+               mark_edge_parents_uninteresting(commit);
         }
  }
  
@@ -843,6 +858,13 @@ int main(int argc, const char **argv)
                         blob_objects = 1;
                         continue;
                 }
+               if (!strcmp(arg, "--objects-edge")) {
+                       tag_objects = 1;
+                       tree_objects = 1;
+                       blob_objects = 1;
+                       edge_hint = 1;
+                       continue;
+               }
                 if (!strcmp(arg, "--unpacked")) {
                         unpacked = 1;
                         limited = 1;
diff --git a/rev-parse.c b/rev-parse.c

index a5fb93c..610eacb 100644 (file)
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -43,6 +43,7 @@ static int is_rev_argument(const char *arg)
                 "--min-age=",
                 "--no-merges",
                 "--objects",
+               "--objects-edge",
                 "--parents",
                 "--pretty",
                 "--show-breaks",
diff --git a/send-pack.c b/send-pack.c

index 990be3f..f558386 100644 (file)
--- a/send-pack.c
+++ b/send-pack.c
@@ -12,6 +12,7 @@ static const char *exec = "git-receive-pack";
  static int verbose = 0;
  static int send_all = 0;
  static int force_update = 0;
+static int use_thin_pack = 0;
  
  static int is_zero_sha1(const unsigned char *sha1)
  {
@@ -37,26 +38,47 @@ static void exec_pack_objects(void)
  
  static void exec_rev_list(struct ref *refs)
  {
+       struct ref *ref;
         static char *args[1000];
-       int i = 0;
+       int i = 0, j;
  
         args[i++] = "rev-list"; /* 0 */
-       args[i++] = "--objects";        /* 1 */
-       while (refs) {
-               char *buf = malloc(100);
-               if (i > 900)
+       if (use_thin_pack)      /* 1 */
+               args[i++] = "--objects-edge";
+       else
+               args[i++] = "--objects";
+
+       /* First send the ones we care about most */
+       for (ref = refs; ref; ref = ref->next) {
+               if (900 < i)
                         die("git-rev-list environment overflow");
-               if (!is_zero_sha1(refs->old_sha1) &&
-                   has_sha1_file(refs->old_sha1)) {
+               if (!is_zero_sha1(ref->new_sha1)) {
+                       char *buf = malloc(100);
                         args[i++] = buf;
-                       snprintf(buf, 50, "^%s", sha1_to_hex(refs->old_sha1));
+                       snprintf(buf, 50, "%s", sha1_to_hex(ref->new_sha1));
                         buf += 50;
+                       if (!is_zero_sha1(ref->old_sha1) &&
+                           has_sha1_file(ref->old_sha1)) {
+                               args[i++] = buf;
+                               snprintf(buf, 50, "^%s",
+                                        sha1_to_hex(ref->old_sha1));
+                       }
                 }
-               if (!is_zero_sha1(refs->new_sha1)) {
+       }
+
+       /* Then a handful of the remainder
+        * NEEDSWORK: we would be better off if used the newer ones first.
+        */
+       for (ref = refs, j = i + 16;
+            i < 900 && i < j && ref;
+            ref = ref->next) {
+               if (is_zero_sha1(ref->new_sha1) &&
+                   !is_zero_sha1(ref->old_sha1) &&
+                   has_sha1_file(ref->old_sha1)) {
+                       char *buf = malloc(42);
                         args[i++] = buf;
-                       snprintf(buf, 50, "%s", sha1_to_hex(refs->new_sha1));
+                       snprintf(buf, 42, "^%s", sha1_to_hex(ref->old_sha1));
                 }
-               refs = refs->next;
         }
         args[i] = NULL;
         execv_git_cmd(args);
@@ -361,6 +383,10 @@ int main(int argc, char **argv)
                                 verbose = 1;
                                 continue;
                         }
+                       if (!strcmp(arg, "--thin")) {
+                               use_thin_pack = 1;
+                               continue;
+                       }
                         usage(send_pack_usage);
                 }
                 if (!dest) {
diff --git a/sha1_file.c b/sha1_file.c

index 9cab99a..a80d849 100644 (file)
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -247,6 +247,7 @@ static void link_alt_odb_entries(const char *alt, const char *ep, int sep,
                 for ( ; cp < ep && *cp != sep; cp++)
                         ;
                 if (last != cp) {
+                       struct stat st;
                         struct alternate_object_database *alt;
                         /* 43 = 40-byte + 2 '/' + terminating NUL */
                         int pfxlen = cp - last;
@@ -269,9 +270,19 @@ static void link_alt_odb_entries(const char *alt, const char *ep, int sep,
                         }
                         else
                                 memcpy(ent->base, last, pfxlen);
+
                         ent->name = ent->base + pfxlen + 1;
-                       ent->base[pfxlen] = ent->base[pfxlen + 3] = '/';
-                       ent->base[entlen-1] = 0;
+                       ent->base[pfxlen + 3] = '/';
+                       ent->base[pfxlen] = ent->base[entlen-1] = 0;
+
+                       /* Detect cases where alternate disappeared */
+                       if (stat(ent->base, &st) || !S_ISDIR(st.st_mode)) {
+                               error("object directory %s does not exist; "
+                                     "check .git/objects/info/alternates.",
+                                     ent->base);
+                               goto bad;
+                       }
+                       ent->base[pfxlen] = '/';
  
                         /* Prevent the common mistake of listing the same
                          * thing twice, or object directory itself.
@@ -552,7 +563,9 @@ static void prepare_packed_git_one(char *objdir, int local)
         len = strlen(path);
         dir = opendir(path);
         if (!dir) {
-               fprintf(stderr, "unable to open object pack directory: %s: %s\n", path, strerror(errno));
+               if (errno != ENOENT)
+                       error("unable to open object pack directory: %s: %s",
+                             path, strerror(errno));
                 return;
         }
         path[len++] = '/';
@@ -1500,7 +1513,8 @@ int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
  
         local = mkstemp(tmpfile);
         if (local < 0)
-               return error("Couldn't open %s for %s\n", tmpfile, sha1_to_hex(sha1));
+               return error("Couldn't open %s for %s",
+                            tmpfile, sha1_to_hex(sha1));
  
         memset(&stream, 0, sizeof(stream));
  
@@ -1548,7 +1562,7 @@ int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
         }
         if (memcmp(sha1, real_sha1, 20)) {
                 unlink(tmpfile);
-               return error("File %s has bad hash\n", sha1_to_hex(sha1));
+               return error("File %s has bad hash", sha1_to_hex(sha1));
         }
  
         return move_temp_to_file(tmpfile, sha1_file_name(sha1));
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh

new file mode 100755 (executable)

index 0000000..cabfadd
--- /dev/null
+++ b/t/t3600-rm.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Carl D. Worth
+#
+
+test_description='Test of the various options to git-rm.'
+
+. ./test-lib.sh
+
+# Setup some files to be removed, some with funny characters
+touch -- foo bar baz 'space embedded' 'tab     embedded' 'newline
+embedded' -q
+git-add -- foo bar baz 'space embedded' 'tab   embedded' 'newline
+embedded' -q
+git-commit -m "add files"
+
+test_expect_success \
+    'Pre-check that foo exists and is in index before git-rm foo' \
+    '[ -f foo ] && git-ls-files --error-unmatch foo'
+
+test_expect_success \
+    'Test that git-rm foo succeeds' \
+    'git-rm foo'
+
+test_expect_success \
+    'Post-check that foo exists but is not in index after git-rm foo' \
+    '[ -f foo ] && ! git-ls-files --error-unmatch foo'
+
+test_expect_success \
+    'Pre-check that bar exists and is in index before "git-rm -f bar"' \
+    '[ -f bar ] && git-ls-files --error-unmatch bar'
+
+test_expect_success \
+    'Test that "git-rm -f bar" succeeds' \
+    'git-rm -f bar'
+
+test_expect_success \
+    'Post-check that bar does not exist and is not in index after "git-rm -f bar"' \
+    '! [ -f bar ] && ! git-ls-files --error-unmatch bar'
+
+test_expect_success \
+    'Test that "git-rm -- -q" succeeds (remove a file that looks like an option)' \
+    'git-rm -- -q'
+
+test_expect_success \
+    "Test that \"git-rm -f\" succeeds with embedded space, tab, or newline characters." \
+    "git-rm -f 'space embedded' 'tab   embedded' 'newline
+embedded'"
+
+chmod u-w .
+test_expect_failure \
+    'Test that "git-rm -f" fails if its rm fails' \
+    'git-rm -f baz'
+chmod u+w .
+
+test_expect_success \
+    'When the rm in "git-rm -f" fails, it should not remove the file from the index' \
+    'git-ls-files --error-unmatch baz'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c

index 3606529..635abb3 100644 (file)
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -14,6 +14,7 @@ static const char upload_pack_usage[] = "git-upload-pack [--strict] [--timeout=n
  #define MAX_HAS 256
  #define MAX_NEEDS 256
  static int nr_has = 0, nr_needs = 0, multi_ack = 0, nr_our_refs = 0;
+static int use_thin_pack = 0;
  static unsigned char has_sha1[MAX_HAS][20];
  static unsigned char needs_sha1[MAX_NEEDS][20];
  static unsigned int timeout = 0;
@@ -49,8 +50,10 @@ static void create_pack_file(void)
                 char *buf;
                 char **p;
  
-               if (create_full_pack)
+               if (create_full_pack) {
                         args = 10;
+                       use_thin_pack = 0; /* no point doing it */
+               }
                 else
                         args = nr_has + nr_needs + 5;
                 argv = xmalloc(args * sizeof(char *));
@@ -62,7 +65,7 @@ static void create_pack_file(void)
                 close(fd[0]);
                 close(fd[1]);
                 *p++ = "rev-list";
-               *p++ = "--objects";
+               *p++ = use_thin_pack ? "--objects-edge" : "--objects";
                 if (create_full_pack || MAX_NEEDS <= nr_needs)
                         *p++ = "--all";
                 else {
@@ -192,6 +195,8 @@ static int receive_needs(void)
                             "expected to get sha, not '%s'", line);
                 if (strstr(line+45, "multi_ack"))
                         multi_ack = 1;
+               if (strstr(line+45, "thin-pack"))
+                       use_thin_pack = 1;
  
                 /* We have sent all our refs already, and the other end
                  * should have chosen out of them; otherwise they are
@@ -213,7 +218,7 @@ static int receive_needs(void)
  
  static int send_ref(const char *refname, const unsigned char *sha1)
  {
-       static char *capabilities = "multi_ack";
+       static char *capabilities = "multi_ack thin-pack";
         struct object *o = parse_object(sha1);
  
         if (!o)
author	Junio C Hamano <junkio@cox.net>
	Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)
committer	Junio C Hamano <junkio@cox.net>
	Thu, 23 Feb 2006 10:59:24 +0000 (02:59 -0800)
.gitignore		patch \| blob \| history
Documentation/git-cvsserver.txt	[new file with mode: 0644]	patch \| blob
Documentation/git-rm.txt	[new file with mode: 0644]	patch \| blob
Makefile		patch \| blob \| history
blame.c	[new file with mode: 0644]	patch \| blob
commit.c		patch \| blob \| history
contrib/gitview/gitview		patch \| blob \| history
diffcore-rename.c		patch \| blob \| history
fetch-pack.c		patch \| blob \| history
git-annotate.perl	[new file with mode: 0755]	patch \| blob
git-clone.sh		patch \| blob \| history
git-cvsserver.perl	[new file with mode: 0755]	patch \| blob
git-fetch.sh		patch \| blob \| history
git-merge.sh		patch \| blob \| history
git-push.sh		patch \| blob \| history
git-rm.sh	[new file with mode: 0755]	patch \| blob
http-fetch.c		patch \| blob \| history
pack-objects.c		patch \| blob \| history
read-tree.c		patch \| blob \| history
receive-pack.c		patch \| blob \| history
refs.c		patch \| blob \| history
rev-list.c		patch \| blob \| history
rev-parse.c		patch \| blob \| history
send-pack.c		patch \| blob \| history
sha1_file.c		patch \| blob \| history
t/t3600-rm.sh	[new file with mode: 0755]	patch \| blob
upload-pack.c		patch \| blob \| history