Commit graph

86 commits

Author SHA1 Message Date
Ori Bernstein c2c397422f git/log: fix log count
saved wrong version when rebasing, oops.
2022-07-03 06:42:17 +00:00
Ori Bernstein 126cc163e2 git/compat: expand to cover go bootstrap
go bootstrap uses more of git than we supported, so
stub in enough that we can bootstrap go.
2022-07-03 04:25:08 +00:00
Ori Bernstein 21aac62c1f git/log: support -n option to restrict log counts
this is useful for scripting, and convenient for
interactive use.
2022-07-03 04:38:13 +00:00
Ori Bernstein 3e176bd975 git/pack: add support for skipping ssh signatures
ssh signatures confused our commit parsing; teach our
commit parsing to skip them.
2022-06-11 17:48:20 +00:00
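A rough sketch of the idea behind 3e176bd975, not the git9 code itself. It assumes the signature is stored the way upstream git stores it: a `gpgsig` header whose continuation lines each start with a single space. The function name `parse_commit` is hypothetical.

```python
def parse_commit(text):
    """Split a commit object into (header fields, message), dropping signatures."""
    head, _, msg = text.partition("\n\n")
    fields = []
    for line in head.split("\n"):
        if line.startswith(" "):
            # Continuation line of the previous header (e.g. signature body).
            continue
        key, _, val = line.partition(" ")
        if key == "gpgsig":
            # First line of the signature header; skip it as well.
            continue
        fields.append((key, val))
    return fields, msg
```

Note that this simplification drops continuation lines of every header, not only of the signature.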
Ori Bernstein bb33663b40 git/get: keep sending what we have until we get an ack
Git9 was sloppy about telling git what commits we have.

We would list the commits at the tip of the branch, but not
walk down it, which means we would request too much data if
our local branches were ahead of the remote.

This patch changes that, sending the tips *and* the first
256 commits after them, so that git can produce a better
pack for us, with fewer redundant commits.
2022-06-11 16:36:45 +00:00
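The walk described in bb33663b40 can be sketched as a bounded breadth-first traversal. This is an illustration only: the name `haves`, the `parents` mapping, and treating 256 as a total budget across all tips (rather than per tip) are assumptions.

```python
from collections import deque

def haves(tips, parents, limit=256):
    """Return the tips plus up to `limit` further commits reachable from them."""
    seen = set(tips)
    out = list(tips)
    queue = deque(tips)
    extra = 0  # commits added beyond the tips
    while queue and extra < limit:
        commit = queue.popleft()
        for p in parents.get(commit, ()):
            if p in seen:
                continue
            if extra == limit:
                break
            seen.add(p)
            out.append(p)
            queue.append(p)
            extra += 1
    return out
```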
Ori Bernstein 926be5e34e git/import: use patch(1)
we have a new, pretty patch(1), let's use it.
2022-06-04 23:35:49 +00:00
Ori Bernstein 01a6de812c git: performance enhancements
Inspired by some changes made in game of trees, I've
implemented a number of speedups in git9.

First, hashing the chunks during deltification with
murmurhash instead of sha1 speeds up the delta search
significantly.

The stretch function was micro-optimized a bit as well,
since that was taking a large portion of the time when
chunking.

Finally, the full path is not stored. We only care about
grouping files with the same name and path. We don't care
about the ordering. Therefore, only the hash of the path
xored with the hash of the directory is kept, which saves
a bunch of mallocs and string munging.

This reduces the time spent repacking some test repos
significantly.

9front:
	% time git/repack
	deltifying 97473 objects: 100%
	writing 97473 objects: 100%
	indexing 97473 objects: 100%
	58.85u 1.39s 61.82r 	 git/repack

	% time /sys/src/cmd/git/6.repack
	deltifying 97473 objects: 100%
	writing 97473 objects: 100%
	indexing 97473 objects: 100%
	43.86u 1.29s 47.51r 	 /sys/src/cmd/git/6.repack

openbsd:

	% time git/repack
	deltifying 2092325 objects: 100%
	writing 2092325 objects: 100%
	indexing 2092325 objects: 100%
	1589.48u 45.03s 1729.18r 	 git/repack

	% time /sys/src/cmd/git/6.repack
	deltifying 2092325 objects: 100%
	writing 2092325 objects: 100%
	indexing 2092325 objects: 100%
	1238.68u 41.49s 1373.15r 	 /sys/src/cmd/git/6.repack

go:
	% time git/repack
	deltifying 529507 objects: 100%
	writing 529507 objects: 100%
	indexing 529507 objects: 100%
	345.32u 7.71s 369.25r     git/repack

	% time /sys/src/cmd/git/6.repack
	deltifying 529507 objects: 100%
	writing 529507 objects: 100%
	indexing 529507 objects: 100%
	248.07u 4.47s 257.59r 	 /sys/src/cmd/git/6.repack
2022-05-28 16:38:07 +00:00
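The path-grouping trick from 01a6de812c, sketched under assumptions: `zlib.crc32` stands in for whatever hash git9 actually uses, and `pathkey` is a hypothetical name. The point is only that a single integer per object is enough to group files by name and location, with no path strings kept around.

```python
import posixpath
import zlib

def pathkey(path):
    """Grouping key: hash of the path XORed with the hash of its directory."""
    directory = posixpath.dirname(path)
    return zlib.crc32(path.encode()) ^ zlib.crc32(directory.encode())
```

Objects with equal keys would be treated as candidates for the same delta group; ordering between groups does not matter.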
Ori Bernstein 408242edcf git: improve error on short read
we don't recover from an invalid packet, so just
sysfatal with a useful message.
2022-05-20 17:16:41 +00:00
Ori Bernstein a271f62bf2 git/pull: remove '-b' and '-a' option
we do the right thing by default now, let's not
add knobs that nobody cares about.
2022-04-28 03:35:54 +00:00
Ori Bernstein 929b0ff087 git: rename internal 'git/fetch' plumbing to 'git/get'
This caused some confusion, so to make it clear that
it's plumbing and has nothing to do with 'git fetch',
rename it.
2022-04-17 17:03:47 +00:00
Ori Bernstein 08447e5d64 git/send: fill in 'theirs' object, even if we miss it
When pushing, git/send would sometimes decide we had all the
objects that we'd need to update the remote, and would try
to pack and send the entire history of the repository. This
is because we only set the 'theirs' ref when we had the object.

If we didn't have the object, we would set a zero hash,
then when deciding if we needed to force, we would think
that we were updating a new branch and send everything,
which would fail to update the remote.
2022-04-17 01:19:10 +00:00
Ori Bernstein 8319b750ea git/serve: log correct error message
Sending the packet on failure could junk the errstr,
so set it after we send the message.
2022-04-17 00:22:43 +00:00
Ori Bernstein 03e5d9e9e2 git/merge: preserve exec bit correctly
A while ago, qwx noticed that we clobbered the exec
bit when merging files. This is not what we want, so
we changed the operator precedence to avoid merging
dirty files implicitly.

But we do want to merge, because it's convenient for
maintaining permissions. So, instead, we should do a
3 way merge of the exec bit.

This patch does that, as well as reverting the rollback
of that change.

While we're here, we adjust the timestamps correctly
in git/branch.

This requires changes to git/fs, because without an open
handler, lib9p allows opening any file with any mode,
which confuses 'test -x'.
2022-04-16 23:53:19 +00:00
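A three-way merge of a single bit, as described in 03e5d9e9e2, reduces to "take whichever side changed it". A minimal sketch, with `merge_exec` as a hypothetical name:

```python
def merge_exec(base, ours, theirs):
    """Three-way merge of a boolean exec bit."""
    if ours == base:
        # We did not touch the bit, so their value (changed or not) wins.
        return theirs
    # We changed it; for a boolean, a change on both sides must agree.
    return ours
```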
Ori Bernstein 261d1ac0e3 git/pull: fetch all branches (please test)
there was a diff that went in a while ago to improve
this, but it got backed out because it encounters a
bug in upstream git -- the spec says that a single
ACK should be sent when not using multi-ack modes,
but they send back multiple ones.

This commit brings back the functionality, and works
around the upstream git bug in two different ways.

First, it skips the packets up until it finds the
start of a pack header.

Second, it deduplicates the want messages, which
is what seems to trigger the duplicate ACKs that
cause us trouble.
2022-04-16 23:52:10 +00:00
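Both workarounds from 261d1ac0e3 can be illustrated in a few lines. This is a simplified sketch: it searches a byte buffer for the `PACK` signature instead of reading pkt-lines off a connection, and the names `dedup_wants` and `skip_to_pack` are hypothetical.

```python
def dedup_wants(wants):
    """Remove duplicate want hashes while preserving their order."""
    seen = set()
    out = []
    for w in wants:
        if w not in seen:
            seen.add(w)
            out.append(w)
    return out

def skip_to_pack(data):
    """Discard everything before the pack signature."""
    i = data.find(b"PACK")
    if i < 0:
        raise ValueError("no pack header found")
    return data[i:]
```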
Michael Forney 798375ad45 git/import: squash leading/trailing/consecutive blanks and strip trailing space
This fixes importing patches with multiline commit messages generated
by git-format-patch.  It also matches commit message sanitation done
by git-am.
2022-04-26 19:06:53 +00:00
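The sanitation rules named in 798375ad45 (strip trailing space, drop leading and trailing blank lines, squash runs of blank lines) can be sketched as follows. `clean_message` is a hypothetical name, and the real change lives in the git/import script rather than in Python.

```python
def clean_message(text):
    """Normalize blank lines and trailing whitespace in a commit message."""
    out = []
    for line in text.split("\n"):
        line = line.rstrip()
        if line == "" and (not out or out[-1] == ""):
            # Skips leading blanks and collapses consecutive blanks.
            continue
        out.append(line)
    while out and out[-1] == "":
        out.pop()  # drop trailing blank lines
    return "\n".join(out) + "\n" if out else ""
```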
Michael Forney 909205036d git/branch: remove duplicate assignment of dirtypaths
2022-04-04 23:09:49 +00:00
Michael Forney 331f19ef21 git/branch: fix typo in error message
2022-04-04 22:54:09 +00:00
Michael Forney 638b82129e git/fetch: use read for reading packfiles instead of readn
2022-03-18 23:45:43 +00:00
Michael Forney d55a64c905 git: use commit date as traversal hint instead of author date
Although git9 always uses the same commit date and author date, other
implementations do make a distinction.  Since commit date is more
representative of the commit graph order, use this as a traversal hint
instead of author date.
2022-03-17 01:41:44 +00:00
Michael Forney 8bd5be7c70 git/fetch: improve detection of dumb http protocol
If the server only supports the dumb protocol, the first 4 bytes of
response will be the initial part of the hash of the first ref.

The http-protocol documentation says that we should fall back to the
dumb protocol when we don't see a content-type of
application/x-$servicename-advertisement.  Check this before
attempting to read a smart git packet.
2022-03-17 01:41:09 +00:00
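The content-type check from 8bd5be7c70 amounts to a string comparison against the advertised service name. A minimal sketch, where `is_smart` is a hypothetical helper and tolerating a trailing `;` parameter is an assumption:

```python
def is_smart(content_type, service):
    """True if the response content type advertises the smart protocol."""
    expected = "application/x-%s-advertisement" % service
    return content_type.split(";")[0].strip().lower() == expected
```

A caller would fall back to the dumb protocol whenever this returns false, before trying to read a smart git packet.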
Michael Forney 2e47badb88 git/query: refactor graph painting algorithm (findtwixt, lca)
We now keep track of 3 sets during traversal:
- keep: commits we've reached from head commits
- drop: commits we've reached from tail commits
- skip: ancestors of commits in both 'keep' and 'drop'

Commits in 'keep' and/or 'drop' may be added later to the 'skip' set
if we discover later that they are part of a common subgraph of the
head and tail commits.

From these sets we can calculate the commits we are interested in:
lca commits are those in 'keep' and 'drop', but not in 'skip'.
findtwixt commits are those in 'keep', but not in 'drop' or 'skip'.

The "LCA" commit returned is a common ancestor such that there are no
other common ancestors that can reach that commit.  Although there can
be multiple commits that meet these criteria, where one is technically
lower on the commit-graph than the other, these cases only happen in
complex merge arrangements and any choice is likely a decent merge
base.

Repainting is now done in paint() directly.  When we find a boundary
commit, we switch our paint color to 'skip'.  'skip' painting does
not stop when it hits another color; we continue until we are left
with only 'skip' commits on the queue.

This fixes several mishandled cases in the current algorithm:
1. If we hit the common subgraph from tail commits first (if the tail
   commit was newer than the head commit), we ended up traversing the
   entire commit graph.  This is because we couldn't distinguish
   between 'drop' commits that were part of the common subgraph, and
   those that were still looking for it.
2. If we traversed through an initial part of the common subgraph from
   head commits before reaching it from tail commits, these commits
   were returned from findtwixt even though they were also reachable
   from tail commits.
3. In the same case as 2, we might end up choosing an incorrect
   commit as the LCA, which is an ancestor of the real LCA.
2022-03-16 21:41:59 +00:00
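The set definitions in 2e47badb88 can be checked on small graphs with a brute-force version. This sketch computes 'keep', 'drop' and 'skip' by full reachability rather than by the incremental, queue-based painting the commit describes, so it shows the result and not the traversal. The names `reach` and `paint_sets` and the `parents` mapping are assumptions.

```python
def reach(parents, starts):
    """All commits reachable from `starts`, including the starts themselves."""
    seen = set()
    stack = list(starts)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def paint_sets(parents, heads, tails):
    """Return (lca, findtwixt) computed from the keep/drop/skip sets."""
    keep = reach(parents, heads)
    drop = reach(parents, tails)
    both = keep & drop
    # skip: proper ancestors of commits that are in both keep and drop
    skip = reach(parents, [p for c in both for p in parents.get(c, ())])
    lca = both - skip
    twixt = keep - drop - skip
    return lca, twixt
```

For a history a, b, c, d with a side commit e on top of b, `paint_sets(parents, ["d"], ["e"])` gives b as the LCA and c, d as the findtwixt range.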
Ori Bernstein 840d16912a git/revert: update modification time on revert
when reverting files, 'cp -x' updates the mtime
to the time the file was committed. this prevents
'mk' from rebuilding the file, leading to stale
builds.

this change touches the file on revert, so that
we rebuild the file.
2022-02-27 04:27:56 +00:00
Ori Bernstein 2367a2aeae git/branch: fix order of operations (thanks qwx)
2022-02-10 01:33:36 +00:00
Michael Forney a5a8a92adf git/query: leave range commits in topological order
This prevents commits from getting reordered incorrectly during rebase
or export.
2022-01-23 00:39:21 +00:00
Ori Bernstein 9e79aaceba git/commit: squelch error when run outside repository
when running outside of a repository, we would try to
remove '$msgfile.tmp', but we had never actually set
'$msgfile'.

the error is harmless, but annoying.
2022-01-09 17:37:29 +00:00
Ori Bernstein 70edb7fbae git/fs: remove trailing null bytes from parent file (thanks mcf)
due to the way the size of buf was calculated, the parent
file had one trailing null byte for each parent. also, while
we're here, replace the sprint with seprint, and compute the
size from how much we printed in.
2022-01-07 01:43:52 +00:00
Ori Bernstein 370bfd26ce git: fix typo in git/log output
Commiter => Committer
2022-01-06 06:38:56 +00:00
Ori Bernstein f63d1d3ced git: size cache in bytes, not objects
git used to track cache size in object
count, rather than bytes. This had the
unfortunate effect of making memory use
depend on the size of objects -- repos
with lots of large objects could cause
out of memory deaths.

now, we track sizes in bytes, which should
keep our memory usage flatter.
2022-01-02 03:37:23 +00:00
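The difference f63d1d3ced makes is easiest to see in a cache whose budget is a byte count. The following is a sketch of that idea only: the class name `ByteCache`, the LRU eviction order, and keeping at least one object resident are assumptions, not details taken from git9.

```python
from collections import OrderedDict

class ByteCache:
    """Object cache bounded by total bytes rather than object count."""

    def __init__(self, maxbytes):
        self.maxbytes = maxbytes
        self.used = 0
        self.objs = OrderedDict()  # key -> bytes, oldest first

    def put(self, key, data):
        if key in self.objs:
            self.used -= len(self.objs.pop(key))
        self.objs[key] = data
        self.used += len(data)
        # Evict oldest entries until we fit, but keep the newest one.
        while self.used > self.maxbytes and len(self.objs) > 1:
            _, old = self.objs.popitem(last=False)
            self.used -= len(old)

    def get(self, key):
        data = self.objs.get(key)
        if data is not None:
            self.objs.move_to_end(key)  # mark as recently used
        return data
```

With a count-based limit, two large objects cost the same as two tiny ones; here a single large object can push out several small ones.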
Ori Bernstein facb0e757a git: revert c947bf808 -- it triggers a bug.
We seem to have a botch in the protocol negotiation, where
we leak some protocol packets into the packfile; this will
need to be fixed before we put this change in.
2021-12-22 00:48:09 +00:00
Ori Bernstein c947bf8087 git: fetch all branches by default.
when the remote side creates a new branch, it is
desirable to have it show up in the repo.
2021-12-20 15:16:29 +00:00
Ori Bernstein 3710ed60fd git: fully init objq
we were leaving objq.best uninitialized, and
would therefore read garbage if we didn't
find a best match.
2021-12-08 00:20:32 +00:00
Ori Bernstein f0adfb4ded git: improve pack cache heuristics
the pack cache was very stupid: it would close packs
as early as possible, which would prevent packs from
getting reused effectively. It would also select a
bad pack to close.

This picks the oldest pack, refcounts correctly, and
keeps up to Npackcache open at once (though it will
go over if more are in use).
2021-12-05 00:13:54 +00:00
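The policy in f0adfb4ded (evict the oldest pack, respect refcounts, allow going over the limit when everything is in use) might be modeled like this. `PackCache`, its methods, and the logical clock are hypothetical; only the policy itself comes from the commit message.

```python
class PackCache:
    """Keeps up to `npackcache` packs open, never closing one that is in use."""

    def __init__(self, npackcache):
        self.npackcache = npackcache
        self.packs = {}  # name -> [refcount, last use]
        self.clock = 0

    def open(self, name):
        self.clock += 1
        ent = self.packs.setdefault(name, [0, 0])
        ent[0] += 1
        ent[1] = self.clock
        self._evict()

    def release(self, name):
        self.packs[name][0] -= 1
        self._evict()

    def _evict(self):
        while len(self.packs) > self.npackcache:
            idle = [n for n, (ref, _) in self.packs.items() if ref == 0]
            if not idle:
                return  # everything is in use: stay over the limit
            oldest = min(idle, key=lambda n: self.packs[n][1])
            del self.packs[oldest]
```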
Kyle Milz e2e4a46f26 git/revert: fix empty invocation
git/revert requires a file name argument, but when none is given
it fails in a strange way:

	% git/revert
	usage: cleanname [-d pwd] name...
	/bin/git/revert:15: null list in concatenation
2021-11-04 19:08:02 +00:00
Sigrid Solveig Haflínudóttir 35a8152ebc git/pack: check pf->pack for nil before Bterming it
2021-10-28 15:26:57 +00:00
Ori Bernstein c2661f86fc git/serve: one more silencing of non-interactive prints
2021-10-24 14:37:36 +00:00
Ori Bernstein a7f6b58d0d git/serve: don't show progress when not interactive
this prevents console spam
2021-10-24 01:36:46 +00:00
Ori Bernstein 8f4842d346 git: when stealing from the old packs list, keep what we stole.
we were missing a return after stealing, which killed the point
of doing the theft.
2021-09-14 16:13:58 +00:00
Ori Bernstein c7dcc82b0b git/query: fix spurious merge requests
Due to the way LCA is defined, using a strict LCA
on a graph like this:

 <--a--b--c--d--e--f--g
     \               /
       +-----h-------

can lead to spurious requests to merge. This happens
because 'lca(b, g)' would return 'a', since it can be
reached in one step from 'b', and 2 steps from 'g', while
reaching 'b' from 'a' would be a longer path.

As a result, we need to implement an lca variant that
returns the starting node if one is reachable from the
other, even if it's already found the technically correct
least common ancestor.

This replaces our LCA algorithm with one based on the
painting we do while finding a twixt, making it give
the results we want.
2021-09-11 17:46:26 +00:00
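The behavior c7dcc82b0b asks for, returning the starting node when one commit is reachable from the other, can be sketched as a short-circuit in front of a general search. This is not the painting-based implementation the commit adds; `is_ancestor` and `merge_base` are hypothetical names, and the general case is left out.

```python
def is_ancestor(parents, anc, commit):
    """True if `anc` is reachable from `commit` (or is `commit` itself)."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c == anc:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return False

def merge_base(parents, a, b):
    """Return the starting node if one commit reaches the other, else None."""
    if is_ancestor(parents, a, b):
        return a
    if is_ancestor(parents, b, a):
        return b
    return None  # a general common-ancestor search would go here
```

On the graph drawn in the commit message, `merge_base(parents, "b", "g")` yields `"b"`, so no merge is requested.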
Ori Bernstein d9564c0642 git: separate author and committer
Git has the ability to track the person who
creates a commit separately from the person
who wrote the commit. For git9, we ignored
this feature.

However, as we start using git/import more,
it will be useful to figure out who imported
a commit, as well as who wrote it.

This change adds support for seeing this
information in git, as well as setting the
author and committer separately in git/import.
2021-09-03 02:47:18 +00:00
Ori Bernstein 0741147eab git/serve: add a '\n' after HEAD
Per the docs:

	the sender SHOULD include a LF, but the
	receiver MUST NOT complain if it is not
	present.

I typoed away the SHOULD, and missed the
MUST NOT.

thanks qbit.
2021-08-25 22:15:34 +00:00
Ori Bernstein 9ca6ca345f git/compat: add support for ls-remote [-d]
This is used by 'go get' sometimes, so add it.
2021-08-25 02:24:15 +00:00
Ori Bernstein fb2e0a1987 git/diff: clean up diffs
We were overzealous about showing the changed
header, as well as setting a junk variable for
files that didn't exist; fix both.
2021-08-23 01:22:04 +00:00
Ori Bernstein 9a69a2bf2a git/commit: remove trailing 'subst -g'
the subst utility no longer supports a '-g'
flag, but this was left behind in commit;

this means that the lines listing modified
files were not correctly commented in the
commit header.

This is mostly harmless, but when using an
editor like sam to edit the commit message,
the modified lines would have to be removed
manually.
2021-08-23 00:22:04 +00:00
ori@eigenstate.org abe0534492 git/{diff,import}: make it easier to handle manually-assembled patch emails
Often, people (including myself) will write emails that
can almost be applied with git/import.  This changes
git/diff and git/import so that things will generally
work even when assembling diffs by hand:

1.	git/import becomes slightly more lax:

		^diff ...
		^--- ...

	will both be detected as the start of a patch.

2.	git/diff produces the same format of diff
	as git/export, starting with paths:

		--- a/path/to/file
		+++ b/path/to/file

	which means that the 'ape/patch -p1' used
	within git/import will just work.

So with this, if you send an email to the mailing list,
write up a committable description, and append the
output of git/diff to the end of the email, git/import
should just work.

[this patch was sent through the mailing list using the
above procedure, and will be committed with git/import
to verify that it works as advertised]
2021-08-22 17:18:35 +00:00
Ori Bernstein cfebf83947 git: better handling of absolute paths, regex metachars
Git currently gets a bit confused if you try to
manipulate files by absolute path.  There were also a
number of places where user-controlled file paths ended
up getting passed to regex interpretation, which could
confuse things.

This change mainly does 2 things:

	- Adds a 'drop' function which drops
	  a non-regex prefix from a string, and uses
	  that to manipulate paths, simplifies 'subst',
	  and removes 'subst -g', which was only used
	  with fixed regexes; sed does this job fine.
	- When getting a path from a user, we
	  make it absolute and then strip out the head

Along the way it cleans up a couple of stupids:

	- 'for(f in $list) if(! ~ $#f 0) use $f':
	  $f can't be a nil list because of
	  list flattening.
	- removes a useless substitution here:

	 	all=`$nl{{git/query -c $1 $2; git/query -c $2 $3} | sed 's/^..//' | \
			gsubst '^('$ourbr'|'$basebr'|'$theirbr')/*' | sort | uniq}

	  where git/query -c doesn't produce
	  paths prefixed with the query.
2021-08-17 04:31:15 +00:00
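Why a literal prefix drop is safer than a regex substitution for paths, as argued in cfebf83947, shows up as soon as a path contains a metacharacter. The real `drop` is an rc helper; this Python stand-in with the same name is only an illustration.

```python
def drop(prefix, s):
    """Remove a literal (non-regex) prefix from s, if present."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s
```

Because the prefix is compared byte for byte, a '.' or '*' in a user-supplied path cannot match anything other than itself.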
Ori Bernstein da085a2d4c git/branch: make '-n' use HEAD when '-b' unspecified
This brings the behavior in line with the manual page,
and makes things less surprising for users.
2021-08-13 05:16:50 +00:00
Ori Bernstein 758067ee56 git/export: use 'date -f' instead of 'date -m'
The '-m' flag was added to date largely
to support git scripts. It predates the
tmdate code, which is why it exists, but
it's a recent enough addition that nothing
I'm aware of uses it, other than git.

As a result, it would be good to remove
it, so let's do that.
2021-08-12 14:42:47 +00:00
Ori Bernstein 54993a1f5b git: fix non-interruptible temporary warning
harmless, but annoying.
2021-08-11 15:00:48 +00:00
Ori Bernstein 3909b83a90 git/save: leave submodules unmangled
When modifying a submodule, we would garble the
mode, leading to an apparently dangling object.

This fixes the issue.
2021-08-07 18:01:22 +00:00
Ori Bernstein 70d173bfa4 git/fetch: be more robust
currently, git/fetch prints the refs
to update before it fully fetches the
pack files; this can lead to updates
to the refs before we're 100% certain
that the objects are present.

This change prints the updates after
the packfile has been successfully
indexed.
2021-07-27 15:05:45 +00:00