Message ID | CA+55aFz3=cbciRfTYodNhdEetXYxTARGTfpP9GL9RZK222XmKQ@mail.gmail.com |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Linus Torvalds <torvalds@linux-foundation.org> writes: > For the people who use "git request-pull", I'm attaching a trivial > patch to make it add this kind of signature if you give it the "-s" > flag. It basically just adds a hunk like the appended crazy example to > the pull request, and it's small enough and simple enough that it > makes verification simple too with just the above kind of trivial > cut-and-paste thing. > > (Junio cc'd, I think he had something more complicated in mind) You have misread me this time. I think the minimalistic "paste this line to your 'git pull' command line and expect to get history leading to this commit" like you did in your patch would be the solution that is the least painful and still useful, which is an important criterion for wide adoption. > Now, admittedly it would be *even nicer* if this gpg-signed block was > instead uploaded as a signed tag automatically, and "git pull" would > notice such a signed tag (tagname the same as the branch name + date > or something) and would download and verify the tag as I pull. Then I > wouldn't even need to actually do the cut-and-paste at all. But this > is the *really* simple approach that gets up 95% of the way there. I however have a small trouble with "lieutenants use signed tags in order to prove who they are to Linus", depending on the details. It certainly lets you run "git tag --verify" after you pulled and will give you assurance that you pulled the right thing from the right person, but what do you plan to do to the tag from your lieutenants after you fetched and verified? I count 379 merges by you between 3.0 (2011-07-21) and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on average. Will these tags be pushed out to your public history? On one hand, we (not just you but the consumers of "Linus kernel") can consider these tags are of ephemeral nature. Once they are used for _you_ to verify the authenticity, they are not needed anymore. 
The consumers of "Linus kernel" by definition trusts what you publish, so as long as they have a way to verify the tip commit you push out, they _should_ be happy. If you take this stance, you would not push these tags out so that you do not have to contaminate the tags namespace with them, and you might even choose to discard them once you pulled and verified the lieutenants' tips to avoid contamination of your own refs namespace. On the other hand, the consumers of "Linus kernel" may want to say that they trust your tree and your tags because they can verify them with your GPG signature, but also they can independently verify the lieutenants' trees you pulled from are genuine. Keeping signed tags and publishing them is one way to make it possible, but 400 extra tags in 3 months feels like an approach with too much downside (i.e. noise) for that benefit. On Git mailing list, we have been toying with a couple of ideas. The simplest one (cooking in next) is to allow committers to add gpg signature in an additional header of the commit objects. "git show" and friends are taught how to verify these signatures when asked. This might have a potential downside on the lieutenants' workflow; after integrating the work by their sub-lieutenants and by themselves, they would test and review the result to convince themselves that it is worth asking you to pull, and then they have to either (1) "commit --amend --gpg-signature" the tip; or (2) "commit --allow-empty --gpg-signature" to add an empty commit whose sole purpose is to hold the signature (and avoid amending the tip) before pushing it out, asking you to pull. 
An alternative we have discussed was to store the gpg signature for the commit ("push certificate") somewhere in the notes tree and push that out, certifying that the commit indeed came from the pusher, but that would:

 (1) require upstreams to fetch the push certificate (and possibly suffer from merge conflicts in the notes tree) whenever they pull from their lieutenants; and

 (2) require downstreams to also fetch the notes tree for "push certificates" (especially when the central repository is shared among multiple people) before adding their own signature and then pushing it back (and possibly suffer from "non-fast-forward" in the notes tree),

both of which are downsides coming from "notes" not being a very good match for what these signatures are trying to achieve.

Namely, the current "notes" mechanism is designed to keep track of the history of changes made to notes attached to commits, but for the signature application, we do not care about the order in which signatures came to two separate commits. "Non-fast-forward" conflicts while pushing, or having to fetch and merge before adding one's own signature, are an unwanted burden imposed only by choosing to use "notes" for storing and conveying the signature.

Also the "notes" approach would end up mixing "push certificates" for different branches (this won't be an issue in your repository where there is only one branch) into a single "notes" tree.

We would want to use something that behaves more like the "auto-following" semantics of tag objects. You would want to fetch only signatures that are attached to the commits you are fetching. Signed tags, or commit objects that can be signed in-place, have this property, but storing signatures in a notes tree does not give it to us.

I think further discussions on this should continue on the git mailing list.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
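The per-day estimate earlier in this message can be checked with a little arithmetic. This is only a sketch: the merge count of 379 is taken from the message as given rather than recomputed, and GNU `date` is assumed for the `-d` option.

```shell
#!/bin/sh
# Rough check of the "4-5 tags per day" estimate.
# The figure 379 is quoted from the message above, not recomputed here.
merges=379
start=$(date -u -d 2011-07-21 +%s)   # 3.0 release date, per the message
end=$(date -u -d 2011-10-24 +%s)     # 3.1 release date, per the message
days=$(( (end - start) / 86400 ))
per_day=$(LC_ALL=C awk -v m="$merges" -v d="$days" 'BEGIN { printf "%.1f", m / d }')
echo "$days days, $per_day merges per day"
```

Counting only calendar days puts the average at the low end of the quoted range; a busier merge window would push individual days well above it.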
On Mon, Oct 31, 2011 at 11:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> It certainly lets you run "git tag --verify" after you pulled and will
> give you assurance that you pulled the right thing from the right person,
> but what do you plan to do to the tag from your lieutenants after you
> fetched and verified? I count 379 merges by you between 3.0 (2011-07-21)
> and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on
> average. Will these tags be pushed out to your public history?

No, you misunderstand.

I can do that kind of "crazy manual check of a tag" today. And it's too painful to be useful in the long run (or even the short run - I'd much prefer the pgp signature in the email which is easier to check and more visible anyway). Fetching a tag by name and saving it as a tag is indeed pointless.

But what would be nice is that "git pull" would fetch the tag (based on name) *automatically*, and not actually create a tag in my repository at all. Instead, it would use the tag to check the signature, and - if we do this right - also use the tag contents to populate the merge commit message.

In other words, no actual tag would ever be left around as a turd, it would simply be used as an automatic communication channel between the "git push -s" of the submitter and my subsequent "git pull". Neither side would have to do anything special, and the tag would never show up in any relevant tree (it could even be in a totally separate namespace like "refs/pullmarker/<branchname>" or something).

Linus
On 10/31/2011 03:18 PM, Linus Torvalds wrote: > On Mon, Oct 31, 2011 at 11:23 AM, Junio C Hamano <gitster@pobox.com> wrote: >> >> It certainly lets you run "git tag --verify" after you pulled and will >> give you assurance that you pulled the right thing from the right person, >> but what do you plan to do to the tag from your lieutenants after you >> fetched and verified? I count 379 merges by you between 3.0 (2011-07-21) >> and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on >> average. Will these tags be pushed out to your public history? > > No, you misunderstand. > > I can do that kind of "crazy manual check of a tag" today. And it's > too painful to be useful in the long run (or even the short run - I'd > much prefer the pgp signature in the email which is easier to check > and more visible anyway). Fetching a tag by name and saving it as a > tag is indeed pointless. > > But what would be nice is that "git pull" would fetch the tag (based > on name) *automatically*, and not actually create a tag in my > repository at all. Instead, if would use the tag to check the > signature, and - if we do this right - also use the tag contents to > populate the merge commit message. > > In other words, no actual tag would ever be left around as a turd, it > would simply be used as an automatic communication channel between the > "git push -s" of the submitter and my subsequent "git pull". Neither > side would have to do anything special, and the tag would never show > up in any relevant tree (it could even be in a totally separate > namespace like "refs/pullmarker/<branchname>" or something). > Perhaps we should introduce the notion of a "private tag" or something along those lines? (I guess that would still have to be possible to push it, but not pull it by default...) 
-hpa
On Mon, Oct 31, 2011 at 3:20 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Perhaps we should introduce the notion of a "private tag" or something
> along those lines? (I guess that would still have to be possible to
> push it, but not pull it by default...)

All tags are private by default. We actually *only* fetch tags if somebody explicitly asks for them (--tags), or when fetching from a named remote (and even then it will only fetch tags that point to objects you fetched by default iirc - you have to mark the remote specially to get *all* tags).

But if you do the normal "git pull git://git.kernel.org/name/of/repo" - which is how things happen as a result of a pull request - you won't get tags at all - you have to ask for them by name or use "--tags" to get them all.

Linus
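The behavior described in this message can be tried out in a throwaway repository. The following is a minimal sketch, assuming a reasonably recent git (for `git -C`); the repository layout, identity, and tag name are made up for illustration, and a local path stands in for the git:// URL.

```shell
#!/bin/sh
# Sketch: fetching a branch from a bare URL/path does not bring tags along.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" commit -q --allow-empty -m "initial"
git -C "$work/upstream" tag -a -m "release" v1
branch=$(git -C "$work/upstream" symbolic-ref --short HEAD)

git init -q "$work/puller"
# Same shape as "git pull <url> <branch>": no named remote, nothing stored
# locally except FETCH_HEAD.
git -C "$work/puller" fetch -q "$work/upstream" "$branch"
tags_plain=$(git -C "$work/puller" tag -l)

# Asking for the tag by name does fetch it.
git -C "$work/puller" fetch -q "$work/upstream" tag v1
tags_named=$(git -C "$work/puller" tag -l)

echo "after plain fetch: '$tags_plain'"
echo "after 'tag v1':    '$tags_named'"
```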
On Mon, 31 Oct 2011, H. Peter Anvin wrote: > Perhaps we should introduce the notion of a "private tag" or something > along those lines? (I guess that would still have to be possible to > push it, but not pull it by default...) That's exactly what git does now, right? (unless you pull from a very specific remote).
On 10/31/2011 03:30 PM, Linus Torvalds wrote:
>
> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
> - which is how things happen as a result of a pull request - you won't
> get tags at all - you have to ask for them by name or use "--tags" to
> get them all.
>

Didn't realize that... I guess I'm too used to named remotes.

If so, just using a tag should be fine, no?

-hpa
On Mon, Oct 31, 2011 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Didn't realize that... I guess I'm too used to named remotes.
>
> If so, just using a tag should be fine, no?

Yes, that's what I think. But the argument for using a separate namespace is that

 (a) you never get confused

 (b) it would make it easier to make the 1:1 relationship between branch names and these "pull request signature tags" without limiting the naming of *normal* tags in any way

 (c) they do have separate lifetimes from "real" tags.

But seriously, I don't care about the *implementation* all that much. If people want to use the crazy git "notes" capability, you can do that too, although quite frankly, I don't see the point. What actually matters is that "git push" and "git pull" would JustWork(tm), and check the signature if one exists, without having to cut-and-paste data that simply shouldn't be visible to the user.

I abhor the interface Ingo suggested, for example. Why would we have stupid command line options that we should cut-and-paste? Automation is for computers, not for people.

Linus
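To make the separate-namespace idea concrete, here is a small sketch of a tag object that is reachable through a ref outside refs/tags/. The "refs/pullmarker/" prefix is the hypothetical name floated in the thread and has no special meaning to git; the tag here is unsigned so the example runs without a GPG key, whereas the proposal would use a signed one (e.g. created with `git tag -s`).

```shell
#!/bin/sh
# Sketch: keep a tag object under refs/pullmarker/ so it never appears as a tag.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" commit -q --allow-empty -m "work worth pulling"
tip=$(git -C "$repo" rev-parse HEAD)
branch=$(git -C "$repo" symbolic-ref --short HEAD)

# Create an annotated tag, point a ref in the hypothetical namespace at the
# tag object, then drop the ordinary refs/tags/ entry.
git -C "$repo" tag -a -m "notes for the puller" tmp-marker
git -C "$repo" update-ref "refs/pullmarker/$branch" "$(git -C "$repo" rev-parse tmp-marker)"
git -C "$repo" tag -d tmp-marker >/dev/null

visible_tags=$(git -C "$repo" tag -l)
marker_type=$(git -C "$repo" cat-file -t "refs/pullmarker/$branch")
marker_target=$(git -C "$repo" rev-parse "refs/pullmarker/$branch^{commit}")
echo "tags: '$visible_tags', marker is a $marker_type pointing at $marker_target"
```

The marker keeps a 1:1 relationship with the branch name without occupying any name in the normal tag namespace, which is points (a) and (b) above.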
"H. Peter Anvin" <hpa@zytor.com> writes: > On 10/31/2011 03:30 PM, Linus Torvalds wrote: >> >> But if you do the normal "git pull git://git.kernel.org/name/of/repo" >> - which is how things happen as a result of a pull request - you won't >> get tags at all - you have to ask for them by name or use "--tags" to >> get them all. >> > > Didn't realize that... I guess I'm too used to named remotes. > > If so, just using a tag should be fine, no? So nobody is worried about this (quoting from my earlier message)? On the other hand, the consumers of "Linus kernel" may want to say that they trust your tree and your tags because they can verify them with your GPG signature, but also they can independently verify the lieutenants' trees you pulled from are genuine. A signed emphemeral tag is usable as means to verify authenticity in a hop-by-hop fashion, but that does not leave a permanent trail that can be used for auditing. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/31/2011 03:44 PM, Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>> On 10/31/2011 03:30 PM, Linus Torvalds wrote:
>>>
>>> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
>>> - which is how things happen as a result of a pull request - you won't
>>> get tags at all - you have to ask for them by name or use "--tags" to
>>> get them all.
>>>
>>
>> Didn't realize that... I guess I'm too used to named remotes.
>>
>> If so, just using a tag should be fine, no?
>
> So nobody is worried about this (quoting from my earlier message)?
>
> On the other hand, the consumers of "Linus kernel" may want to say that
> they trust your tree and your tags because they can verify them with your
> GPG signature, but also they can independently verify the lieutenants'
> trees you pulled from are genuine.
>
> A signed ephemeral tag is usable as means to verify authenticity in a
> hop-by-hop fashion, but that does not leave a permanent trail that can be
> used for auditing.
>

Well, the permanent trail is in the maintainer's tree, but that might still be suboptimal. The problem with Linus pulling those tags, I assume, is that it makes the tree too noisy?

-hpa
On Mon, Oct 31, 2011 at 03:44:25PM -0700, Junio C Hamano wrote:
> So nobody is worried about this (quoting from my earlier message)?
>
> On the other hand, the consumers of "Linus kernel" may want to say that
> they trust your tree and your tags because they can verify them with your
> GPG signature, but also they can independently verify the lieutenants'
> trees you pulled from are genuine.
>
> A signed ephemeral tag is usable as means to verify authenticity in a
> hop-by-hop fashion, but that does not leave a permanent trail that can be
> used for auditing.

Oh, there are definitely people who worry about this. They tend to be security people, though, so the goal is how do we leave the permanent trail in a way that doesn't generate too much noise or otherwise make life difficult for developers who don't care.

- Ted
Linus Torvalds <torvalds@linux-foundation.org> writes:

> But seriously, I don't care about the *implementation* all that much.
> If people want to use the crazy git "notes" capability, you can do
> that too, although quite frankly, I don't see the point.

As I already said, I do not think notes is a good match as a tool to do this.

> What actually matters is that "git push" and "git pull" would
> JustWork(tm), and check the signature if one exists, without having
> to cut-and-paste data that simply shouldn't be visible to the user.
>
> I abhor the interface Ingo suggested, for example....

Some cut-and-paste (or piping the e-mail to a command) would be a necessary evil, though, as you would have GPG keys from more than one trusted person in your keyring, and when you are responding to a pull-request from person A, finding a valid commit signed by person B should not be a success, but at least should raise a warning.
On 10/31/2011 03:49 PM, Ted Ts'o wrote:
> On Mon, Oct 31, 2011 at 03:44:25PM -0700, Junio C Hamano wrote:
>> So nobody is worried about this (quoting from my earlier message)?
>>
>> On the other hand, the consumers of "Linus kernel" may want to say that
>> they trust your tree and your tags because they can verify them with your
>> GPG signature, but also they can independently verify the lieutenants'
>> trees you pulled from are genuine.
>>
>> A signed ephemeral tag is usable as means to verify authenticity in a
>> hop-by-hop fashion, but that does not leave a permanent trail that can be
>> used for auditing.
>
> Oh, there are definitely people who worry about this. They tend to be
> security people, though, so the goal is how do we leave the permanent
> trail in a way that doesn't generate too much noise or otherwise make
> life difficult for developers who don't care.
>

Could we introduce a tag namespace that doesn't show up in gitweb by default, and perhaps doesn't resolve in abbreviated form?

This is basically what Linus suggested, as far as I understand: something like refs/pulls/hpa/tip-123-456 which is otherwise a normal tag object?

-hpa
On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> So nobody is worried about this (quoting from my earlier message)?

No, because you haven't been reading what we write.

The tag is useless.

The information *in* the tag is not. But it shouldn't be saved in the tag (or note, or whatever). Because that's just an annoying place for it to be, with no upside.

Save it in the commit we generate. BAM! Useful, readable, permanent, and independently verifiable.

And the advantage is that we can make that same mechanism add "maintainer notes" to the merge message too. Right now some maintainers write good notes about what the merge will bring in, but they are basically lost, because git is so good at merging and doesn't even stop to ask people to edit the merge message.

Linus
On 10/31/2011 03:52 PM, Linus Torvalds wrote:
>
> Save it in the commit we generate. BAM! Useful, readable, permanent,
> and independently verifiable.
>

Note: this means creating a commit even for a fast-forward merge. Not that there is any technical problem with that, of course.

-hpa
On Mon, Oct 31, 2011 at 3:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Some cut-and-paste (or piping the e-mail to a command) would be necessary
> evil, though, as you would have GPG keys from more than one trusted person
> in your keyring, and when you are responding to a pull-request from person
> A, finding a valid commit signed by person B should not be a success, but
> at least should raise a warning.

Why?

The signer of the message needs to be printed out *anyway*. I can match that up with the pull request, the same way I already match up diffstat information.

So any extra cut-and-paste is (a) stupid, (b) unnecessary and (c) annoying. It's also "bad user interface". The whole point is that we should make the user interface *good*.

Which means that the pushing side should only need to add a "-s" to ask for signing, have to type his passphrase (and even that would go away when using gpg-agent or something), and perhaps a message (which would not be about the signing, but about something that could be added to the merge commit).

And the receiving side would just do the "git pull" and automatically just get notified that "Yes, this push has been signed by key Xyz Abcdef"

Linus
On Mon, Oct 31, 2011 at 3:54 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 10/31/2011 03:52 PM, Linus Torvalds wrote:
>>
>> Save it in the commit we generate. BAM! Useful, readable, permanent,
>> and independently verifiable.
>>
>
> Note: this means creating a commit even for a fast-forward merge. Not
> that there is any technical problem with that, of course.

Well, only for the signed case, but yes. And for that case it's likely a good thing.

In fact, even without signing, some projects always use --no-ff, because they want the merge messages with the nice summary in them. I've played around with it too, but haven't generally found it to be worth it, and tend to think that it aggrandizes the merger too much. It generates nice merge summaries, and it can look nice, but if the *only* upside is the merge summary I think it's borderline worth it.

But with a signature, it would suddenly actually contain real information, and I think that changes the equation.

Linus
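The fast-forward point can be illustrated with a throwaway repository. This is a minimal sketch under made-up names (the "topic" branch and the message text are placeholders); it shows only that `--no-ff` records a merge commit, with room for a message, in a case where a plain merge would just move the branch pointer.

```shell
#!/bin/sh
# Sketch: force a merge commit where a fast-forward would otherwise happen.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
topic_tip=$(git rev-parse HEAD)
git checkout -q "$main"

# "topic" is strictly ahead of $main, so a plain merge would fast-forward.
git merge -q --no-ff -m "Merge topic

Maintainer notes (and, in the proposal, the signature text) would go here." topic

# A merge commit lists two parents; the second is the merged branch tip.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
second_parent=$(git rev-parse HEAD^2)
echo "HEAD has $parents parents; second parent is $second_parent"
```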
On 10/31/2011 06:44 PM, Junio C Hamano wrote: > "H. Peter Anvin"<hpa@zytor.com> writes: > >> On 10/31/2011 03:30 PM, Linus Torvalds wrote: >>> >>> But if you do the normal "git pull git://git.kernel.org/name/of/repo" >>> - which is how things happen as a result of a pull request - you won't >>> get tags at all - you have to ask for them by name or use "--tags" to >>> get them all. >>> >> >> Didn't realize that... I guess I'm too used to named remotes. >> >> If so, just using a tag should be fine, no? > > So nobody is worried about this (quoting from my earlier message)? > > On the other hand, the consumers of "Linus kernel" may want to say that > they trust your tree and your tags because they can verify them with your > GPG signature, but also they can independently verify the lieutenants' > trees you pulled from are genuine. > > A signed emphemeral tag is usable as means to verify authenticity in a > hop-by-hop fashion, but that does not leave a permanent trail that can be > used for auditing. The main worry is Linus ($human_who_pulls) gets cryptographically-verified data at the time he pulls. Once Linus republishes his tree (git push), there will be few, if any, wanting to verify Jeff Garzik's signature. So no, I don't see that as a _driving_ need in the kernel's case. And IMO the kernel will be a mix of signed and unsigned content for a while, possibly forever. 
And Linus wrote:
> [ Example gpg-signed small block that the attached patch adds to the
> pull request: ]
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Commit be3fa9125e708348c7baf04ebe9507a72a9d1800
>  from git.kernel.org/pub/git
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.18 (GNU/Linux)
>
> iQEcBAEBAgAGBQJOrsILAAoJEHm+PkMAQRiGxZcH/31e0RrBitXUPKxHJajD58yh
> SIEe/7i6E2RUSFva3KybEuFslcR8p8DYzDQTPLejStvnkO8v0lXu9s9R53tvjLMF
> aaQXLOgrOC2RqvzP4F27O972h32YpLBkwIdWQGAhYcUOdKYDZ9RfgEgtdJwSYuL+
> oJ7TjLrtkcILaFmr9nYZC+0Fh7z+84R8kR53v0iBHJQOFfssuMjUWCoj9aEY12t+
> pywXuVk2FsuYvhniCAcyU6Y1K9aXaf6w5iOY2hx/ysXtUBnv92F7lcathxQkvgjO
> fA7/TXEcummOv5KQFc9vckd5Z1gN2ync5jhfnmlT2uiobE6mNdCbOVlCOpsKQkU=
> =l5PG
> -----END PGP SIGNATURE-----

This is my preference for kernel pull requests at the moment. That has one advantage over Junio's "git pull --require-signature" and signed commits, notably, the URL is signed.

But in general signed commits would be nice, too. Pull-generated merge requests would need to be signed, potentially introducing an additional interactive step (GPG passphrase request) into an automated process.

Jeff
> > The main worry is Linus ($human_who_pulls) gets > cryptographically-verified data at the time he pulls. Once Linus > republishes his tree (git push), there will be few, if any, wanting to > verify Jeff Garzik's signature. > > So no, I don't see that as a _driving_ need in the kernel's case. > > And IMO the kernel will be a mix of signed and unsigned content for a > while, possibly forever. > I think the desire is to be able to deconstruct things if things were to go wrong. -hpa
On Mon, 2011-10-31 at 15:52 -0700, Linus Torvalds wrote:
> On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >
> > So nobody is worried about this (quoting from my earlier message)?
>
> No, because you haven't been reading what we write.
>
> The tag is useless.

It's not useless to people who want to verify the tree after it's been released by you (say for forensics or something). As Peter said, we can put it in a normally invisible namespace, but having a flag to make it visible allows tools like git describe --contains to tell me which signed tag was used to send a particular commit.

> The information *in* the tag is not. But it shouldn't be saved in the
> tag (or note, or whatever). Because that's just an annoying place for
> it to be, with no upside.
>
> Save it in the commit we generate. BAM! Useful, readable, permanent,
> and independently verifiable.
>
> And the advantage is that we can make that same mechanism add
> "maintainer notes" to the merge message too. Right now some
> maintainers write good notes about what the merge will bring in, but
> they are basically lost, because git is so good at merging and doesn't
> even stop to ask people to edit the merge message.

A signed empty commit containing the merge message as a comment also looks fine to me. We'd need extra tooling to say which signed merge corresponds to this patch, but I'd say it's workable. The only slightly counterintuitive thing is that for a non-trivial merge, my signed merge description will have to be the next commit below rather than in the actual merge you do (because we can't alter a cryptographically signed commit).

James
Linus Torvalds <torvalds@linux-foundation.org> writes:

> But what would be nice is that "git pull" would fetch the tag (based on
> name) *automatically*, and not actually create a tag in my repository at
> all. Instead, it would use the tag to check the signature, and - if we
> do this right - also use the tag contents to populate the merge commit
> message.
>
> In other words, no actual tag would ever be left around as a turd, it
> would simply be used as an automatic communication channel between the
> "git push -s" of the submitter and my subsequent "git pull". Neither
> side would have to do anything special, and the tag would never show
> up in any relevant tree (it could even be in a totally separate
> namespace like "refs/pullmarker/<branchname>" or something).

While I like the "an ephemeral tag is used only for hop-to-hop communication to carry information to be recorded in the resulting history" approach, I see a few downsides.

 * The ephemeral tag needs to stay somewhere under the refs/ hierarchy of the lieutenant's tree until you pick it up, even if it is out of the way in refs/pullmarker/$branchname. The next time the same lieutenant makes a pull request, either it will be overwritten or multiple versions of it (refs/pullmarker/$branchname/$serial) need to be kept.

   - If the former, this makes forking of the project harder. Suppose a pull request is made, you fetch and reject it. The lieutenant reworks and makes another pull request. At this point the earlier signature is gone. If somebody disagreed with your rejection and wanted to run his tree with the initial version you rejected, his tree will not carry the signature from the lieutenant.

   - If the latter, then there needs to be a way to expire these pull markers when they no longer are useful (i.e. the signature in it is transcribed to a merge commit you create) [*1*]. But the party who has power to clean them (i.e. the lieutenant who owns the repository) is different from the party whose action determines when they no longer are necessary (i.e. you). In practice this would lead to these pull markers not being cleaned at all [*2*].

 * To verify the commit C that was taken from the tip of the lieutenant's tree some time ago, one has to find the merge commit that has C as a parent, and look at the merge commit. For example "git log --show-signature" would either show or not show the authenticity of C depending on where the traversal comes from. You certainly can implement it that way, but "some child describes an aspect of its parent, but not necessarily all children do so" feels philosophically less correct than "the commit has data to describe itself".

In your "ephemeral tag", the workflow for a developer (D) and his integrator (U) would look like this, I think.

    D$ until have something worth sending; do work; done
    D$ git push -s
    Enter passphrase: ...
    - "push" internally creates a pull marker that signs the commit
      object name this is pushing, among other things, and sends it
      along the primary payload
    D$ git pull-request; mail linus

    U$ git pull
    - "pull" notices the pull marker and fetches it as well;
    - "pull" GPG validates the pull marker;
    - When preparing a merge commit message, the contents of the
      pull marker are included in .git/MERGE_MSG

The "in-commit signature" would give you 100% and your contributors 98% of that, I think.

    D$ until have something worth sending; do work; done
    - The final round of reworking is concluded with "commit -S",
      which would GPG sign the tip commit itself
    D$ git push
    - Nothing needs to change in the protocol nor "push" itself
    D$ git pull-request; mail linus

    U$ git pull
    - "pull" GPG validates the tip commit
    - Nothing unusual needs to happen to the resulting "merge" commit

And as a bonus, the code is already there ;-).

[Footnote]

*1* The common ancestor discovery in fetch uses as many refs as it can to reduce the amount of data that needs to be transferred, and it is known to hurt performance of the initial advertisement exchange when there are too many useless refs.

*2* Do casual git users even know how to remove refs in a remote/publishing repository?
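The second downside, that verifying commit C means first locating the merge that has C as a parent, can be sketched as a lookup over the merge commits reachable from HEAD. This is an illustration only: the branch name "lieutenant" and the commit messages are made up, and no signature is involved, just the search that a verifier would have to perform.

```shell
#!/bin/sh
# Sketch: find the merge commit(s) that list a given commit C as a parent.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b lieutenant
git commit -q --allow-empty -m "lieutenant tip"
C=$(git rev-parse HEAD)

git checkout -q "$main"
git merge -q --no-ff -m "Merge lieutenant (signature would be recorded here)" lieutenant
merge=$(git rev-parse HEAD)
git commit -q --allow-empty -m "later work"

# Each output line of rev-list is "<merge> <parent1> <parent2> ...";
# keep the merges whose parent list contains C.
found=$(git rev-list --merges --parents HEAD |
        awk -v c="$C" '{ for (i = 2; i <= NF; i++) if ($i == c) print $1 }')
echo "commit $C was merged by $found"
```

With an in-commit signature no such search is needed, since the data describing C lives in C itself.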
On Tue, Nov 1, 2011 at 12:47 PM, Junio C Hamano <gitster@pobox.com> wrote: > > While I like the "an ephemeral tag is used only for hop-to-hop > communication to carry information to be recorded in the resulting > history" approach, I see a few downsides. So I do agree. I'd actually be *happier* with a generic multi-line "branch description" thing that involves no git objects at all, just a nice description of what the branch is. The fact that you could also hide a signed version of the top-of-branch there would be kind of a side effect, and wouldn't be a requirement. I hate how anonymous our branches are. Sure, we can use good names for them, but it was a mistake to think we should describe the repository (for gitweb), rather than the branch. Ok, "hate" is a strong word. I don't "hate" it. I don't even think it's a major design issue. But I do think that it would have been nicer if we had had some branch description model. The only reason I suggest a tag is really because it would fit with existing tooling - especially the git transport protocol. So it's not that I actually think that a tag is the right way to describe (and sign) the branch, it's just that it's the way that wouldn't require any changes other than in "git push -s" and "git pull". > * To verify the commit C that was taken from the tip of lieutenant's tree > some time ago, one has to find the merge commit that has C as a parent, > and look at the merge commit. For example "git log --show-signature" > would either show or not show the authenticity of C depending on where > the traversal comes from. You certainly can implement it that way, but > "some child describes an aspect of its parent, but not necessarily all > children do so" feels philosophically less correct than "the commit has > data to describe itself". Yeah. Having thought about it, I'm also not convinced I really want to pollute the "git log" output with information that realistically almost nobody cares about. 
The primary use is just for the person who pulls things to verify it, after that the information is largely stale and almost certain to never be interesting to anybody ever again. It's *theoretically* useful if somebody wants to go back and re-verify, but at the same time that really isn't expected to be the common case.

So I'm wondering if we want to save it at all. It's quite possible that realistically speaking "google the mailing list archives" is the *right* way to look up the signature if it is ever needed later.

Maybe just verifying the email message (with the suggested kind of change to "git request-pull") is actually the right approach. And what I should do is to just wrap my "git pull" in some script that I can just cut-and-paste the gpg-signed thing into, and which just does the "gpg --verify" on it, and then does the "git pull" after that.

Because in many ways, "git request-pull" is when you do want to sign stuff. A developer might well want to push out his stuff for some random internal testing (linux-next, for example), and then only later decide "Ok, it was all good, now I want to make it 'official' and ask Linus to pull it", and sign it at *that* time, rather than when actually pushing it out.

And I suspect signing the pull request fits better into people's existing workflow anyway - sending out the email to ask the maintainer to pull really is the "special event", rather than pushing out the code itself.

Linus
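The wrapper described above (paste the signed text, run "gpg --verify", then "git pull") could look roughly like the following Python sketch. It assumes the pull request carries an ordinary cleartext-signed OpenPGP block; the function names and the remote/branch arguments are made up for illustration and are not part of git or of any patch in this thread.

```python
import re
import subprocess

# Matches one cleartext-signed OpenPGP block, markers included.
SIGNED_BLOCK = re.compile(
    r"-----BEGIN PGP SIGNED MESSAGE-----\n.*?-----END PGP SIGNATURE-----",
    re.DOTALL,
)


def extract_signed_block(pasted: str) -> str:
    """Return the first clearsigned block found in the pasted text."""
    # Text pasted from a webmail client often has CRLF line endings.
    text = pasted.replace("\r\n", "\n")
    match = SIGNED_BLOCK.search(text)
    if match is None:
        raise ValueError("no PGP signed block found in input")
    return match.group(0) + "\n"


def verify_then_pull(pasted: str, remote: str, branch: str) -> None:
    """Hypothetical wrapper: verify the pasted block, pull only on success."""
    block = extract_signed_block(pasted)
    # check=True raises CalledProcessError when gpg rejects the signature,
    # so the pull below is never reached for a bad or missing signature.
    subprocess.run(["gpg", "--verify"], input=block.encode(), check=True)
    subprocess.run(["git", "pull", remote, branch], check=True)
```

Note that this only proves the pasted text was signed. A real wrapper would also have to compare the commit named inside the signed block with what "git pull" actually fetched.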
Linus Torvalds <torvalds@linux-foundation.org> writes: > Having thought about it, I'm also not convinced I really want to > pollute the "git log" output with information that realistically > almost nobody cares about. The primary use is just for the person who > pulls things to verify it, after that the information is largely stale > and almost certain to never be interesting to anybody ever again. It's > *theoretically* useful if somebody wants to go back and re-verify, but > at the same time that really isn't expected to be the common case. > ... > So I'm wondering if we want to save it at all. it's quite possible > that realistically speaking "google the mailing list archives" is the > *right* way to look up the signature if it is ever needed later. I'd rather want to hear opinions from people who base their work on public kernels (e.g. distros, and companies who roll their own prod kernels), on that. But my gut feeling is that "usually hidden not to disturb normal users, but is cast in stone in the history and cannot be lost" strikes the right balance. Both your "next merge commit records the signature together with the largely useless merge summary cruft but everybody learned to ignore it with 'log --no-merges' anyway so it does not hurt to have it there" and the commit signature topic from the next branch [*1*] that puts the signature in the object header and teaches '--show-signature' option to the log family to show it share this property. > Maybe just verifying the email message (with the suggested kind of > change to "git request-pull") is actually the right approach. And what > I should do is to just wrap my "git pull" in some script that I can > just cut-and-paste the gpg-signed thing into, and which just does the > "gpg --verify" on it, and then does the "git pull" after that. > > Because in many ways, "git request-pull" is when you do want to sign > stuff. 
A developer might well want to push out his stuff for some > random internal testing (linux-next, for example), and then only later > decide "Ok, it was all good, now I want to make it 'official' and ask > Linus to pull it", and sign it at *that* time, rather than when > actually pushing it out. > > And I suspect signing the pull request fits better into peoples > existing workflow anyway - sending out the email to ask the maintainer > to pull really is the "special event", rather than pushing out the > code itself. "I can silently push and re-push or even rewind-and-then-push until I officially send pull-request out" fits well with the "defer the decision as much as possible" model Git takes in general, and I find certain attractiveness in it. But on the other hand, in many ways, publishing your commit to the outside world, not necessarily for getting pulled into the final destination (i.e. your tree) but merely for other people to try it out, is the point of no return (aka "don't rewind or rebase once you publish"). "pushing out" might be less special than "please pull", but it still is special. Also there is nothing lost if you sign commits whenever you push them out. [Footnote] *1* Here are three examples on the same commit that is signed for illustration. 
------------------------------------------------
$ git show -s pu
commit c9d870fceac787fdb1c1c43b136c1a94ab2ab005
Merge: 8367c51 71f45ee
Author: Junio C Hamano <gitster@pobox.com>
Date:   Mon Oct 31 20:06:58 2011 -0700

    Merge branch 'jc/stream-to-pack' into pu

    * jc/stream-to-pack:
      Bulk check-in
      finish_tmp_packfile(): a helper function
      create_tmp_packfile(): a helper function
      write_pack_header(): a helper function
------------------------------------------------
$ git show -s --show-signature pu
commit c9d870fceac787fdb1c1c43b136c1a94ab2ab005
gpg: Signature made Mon 31 Oct 2011 08:07:04 PM PDT using RSA key ID 96AFE6CB
gpg: Good signature from "Junio C Hamano <gitster@pobox.com>"
gpg:                 aka "Junio C Hamano <junio@pobox.com>"
gpg:                 aka "Junio C Hamano <jch@google.com>"
Merge: 8367c51 71f45ee
Author: Junio C Hamano <gitster@pobox.com>
Date:   Mon Oct 31 20:06:58 2011 -0700

    Merge branch 'jc/stream-to-pack' into pu

    * jc/stream-to-pack:
      Bulk check-in
      finish_tmp_packfile(): a helper function
      create_tmp_packfile(): a helper function
      write_pack_header(): a helper function
------------------------------------------------
$ git cat-file commit pu
tree 9add290d468800c3c51ff68fedfb3d16427872ff
parent 8367c51becc5a225b9a192348b7d7c615fb6d250
parent 71f45eeb8278670257bea83620f7d3eac174eee7
author Junio C Hamano <gitster@pobox.com> 1320116818 -0700
committer Junio C Hamano <gitster@pobox.com> 1320116824 -0700
gpgsig -----BEGIN PGP SIGNATURE-----
gpgsig Version: GnuPG v1.4.10 (GNU/Linux)
gpgsig
gpgsig ...
gpgsig =c62U
gpgsig -----END PGP SIGNATURE-----

Merge branch 'jc/stream-to-pack' into pu

* jc/stream-to-pack:
  Bulk check-in
  finish_tmp_packfile(): a helper function
  create_tmp_packfile(): a helper function
  write_pack_header(): a helper function
------------------------------------------------
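To make the third example concrete, here is a small Python sketch that splits a raw commit object, in the layout printed by "git cat-file commit" above, into ordinary headers, the signature, and the message. It assumes the layout shown in this mail, where every signature line carries its own "gpgsig" prefix; released versions of Git instead use a single gpgsig header with space-indented continuation lines, so this is illustrative only. The name split_commit is made up.

```python
def split_commit(raw: str):
    """Split raw commit text into (headers, signature, message).

    Assumes the layout from this mail: each signature line is a
    separate "gpgsig ..." header line (an assumption of this sketch).
    """
    # Headers end at the first empty line; the rest is the log message.
    head, _, message = raw.partition("\n\n")
    headers = []
    sig_lines = []
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        if key == "gpgsig":
            # A bare "gpgsig" line stands for an empty line in the armor.
            sig_lines.append(value)
        else:
            headers.append((key, value))
    signature = "\n".join(sig_lines) if sig_lines else None
    return headers, signature, message
```

Because the signature lives in the header, the message part comes back untouched, which is why the plain "git show -s" output above looks the same as for an unsigned commit.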
On Tue, Nov 01, 2011 at 02:21:59PM -0700, Linus Torvalds wrote:
> So I'm wondering if we want to save it at all. it's quite possible
> that realistically speaking "google the mailing list archives" is the
> *right* way to look up the signature if it is ever needed later.

Given the number of trees that you merge in every merge window (never mind over an entire release), I don't think "google the mailing list archives" is going to scale. Finding some way to keep it along with the merge window seems the right thing. I agree that it should be hidden normally, but that's a UI display issue. Heck, we could just hide it after the terminating NULL in the commit description, per a discussion on the git list 2-3 weeks ago. :-)

> Because in many ways, "git request-pull" is when you do want to sign
> stuff. A developer might well want to push out his stuff for some
> random internal testing (linux-next, for example), and then only later
> decide "Ok, it was all good, now I want to make it 'official' and ask
> Linus to pull it", and sign it at *that* time, rather than when
> actually pushing it out.

Sure, the signed content should be buried in the commit that it describes. Whether we carry it in an ephemeral tag or in the git request-pull is not really important from a security perspective. The tag is nicer simply because the person doing the pull won't need to cut and paste the signature information.

One approach which might work is if git request-pull sends the e-mail message with the git shortlog and diffstat, *and* a MIME attachment that contains all of the necessary information. The maintainer would then save the attachment, and feed it to git, which will display the git shortlog and diffstat, ask for confirmation, and then embed the digital signature into the merge commit.
The only problem with that is (a) you'd have to get over your hatred of attachments (but if you're using Gmail hopefully that's relatively convenient :-), and (b) the LKML list filter would have to be taught to tolerate git-generated attachments.

- Ted
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> And the receiving side would just do the "git pull" and
> automatically just get notified that "Yes, this push has been
> signed by key Xyz Abcdef"

If this approach is used then it would be nice to have a .gitconfig switch to require trusted pulls by default: to not allow doing non-signed or untrusted pulls accidentally, or for Git to warn in a visible, hard to miss way if there's a non-signed pull.

This adds social uncertainty (and an element of a silent alarm) to a realistic attack: the attacker wouldn't know exactly how the puller checks signed pull requests, it's kept private.

Thanks, Ingo
Junio C Hamano venit, vidit, dixit 01.11.2011 20:47: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> But what would be nice is that "git pull" would fetch the tag (based on >> name) *automatically*, and not actually create a tag in my repository at >> all. Instead, if would use the tag to check the signature, and - if we >> do this right - also use the tag contents to populate the merge commit >> message. >> >> In other words, no actual tag would ever be left around as a turd, it >> would simply be used as an automatic communication channel between the >> "git push -s" of the submitter and my subsequent "git pull". Neither >> side would have to do anything special, and the tag would never show >> up in any relevant tree (it could even be in a totally separate >> namespace like "refs/pullmarker/<branchname>" or something). > > While I like the "an ephemeral tag is used only for hop-to-hop > communication to carry information to be recorded in the resulting > history" approach, I see a few downsides. > > * The ephemeral tag needs to stay somewhere under refs/ hierarchy of the > lieutenant's tree until you pick it up, even if they are out of the way > in refs/pullmarker/$branchname. The next time the same lieutenant makes > a pull request, either it will be overwritten or multiple versions of > them refs/pullmarker/$branchname/$serial need to be kept. If we are interested in commit sigs, the easiest tag-based approach is to name the sig carrying tag by the commit's sha1. Just like the sig is tied (in)to a commit in Junio's approach, it would be indexed by it. We can do that now: git config --global alias.sign '!f() { c=$(git rev-parse "$1") || exit; shift; git tag -s $@ sigs/$c $c; }; f' But a different place rather than refs/tags/sigs/<sha1> will be more appropriate, so that we don't pollute the tag namespace. (Yes, this is similar to storing them in notes.) tags have a message etc. 
With an appropriate refspec, these sigs can be pushed out automatically (by the lieutenant). pull-request as in next will list the expected <sha1> at tip. git pull needs to learn to (fetch and) use refs/<whatever>/<sha1> to verify that the tip is signed. git log --show-signature can do the same tricks as with in-commit sigs.

Some things to decide in this approach:

- Should git-pull (pull sigs and) verify by default?
- Should we worry about overwriting existing sigs? We have union-merge for notes already, and that would be appropriate for sigs. (Yes, our tags code does verify multiple concatenated sigs.)

The advantage of tags is that they can be added without rewriting the commit, of course.

Michael
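The lookup that "git pull" would need under this scheme is small. The Python sketch below shows the idea: derive the signature ref name from the tip's SHA-1 and look it up among the refs the remote advertises. The "refs/sigs/" prefix is only a placeholder for the refs/<whatever>/ namespace left open above, and both function names are hypothetical.

```python
from typing import Dict, Optional

# Placeholder namespace; the mail deliberately leaves the real one open.
SIG_PREFIX = "refs/sigs/"


def sig_ref_for(commit_sha1: str) -> str:
    """Name of the ref that would carry the signature tag for a commit."""
    sha = commit_sha1.lower()
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError("expected a full 40-hex-digit SHA-1")
    return SIG_PREFIX + sha


def find_tip_signature(advertised_refs: Dict[str, str],
                       tip_sha1: str) -> Optional[str]:
    """Return the object name of the tip's signature tag, or None.

    advertised_refs maps ref names to object names, as a remote's
    ref advertisement would.
    """
    return advertised_refs.get(sig_ref_for(tip_sha1))
```

Since the ref name is derived from the commit, a second pull request for a new tip never collides with the first, which addresses the overwrite-or-serial-number problem quoted above, at the cost of one ref per signed commit.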
Hi,

On Wed, Nov 02, 2011 at 10:11:26AM +0100, Ingo Molnar wrote:
> If this approach is used then it would be nice to have a .gitconfig
> switch to require trusted pulls by default: to not allow doing
> non-signed or untrusted pulls accidentally, or for Git to warn in a
> visible, hard to miss way if there's a non-signed pull.
>
> This adds social uncertainty (and an element of a silent alarm) to a
> realistic attack: the attacker wouldnt know exactly how the puller
> checks signed pull requests, it's kept private.

But that way you get a false alarm when someone sent a perfectly trustable pull request, e.g. by signed email.

Another question: If we store the actual pgp/gpg signatures in the git tree, how do you handle signatures by keys which were valid at the time the signature was made but have expired by the time they are checked later? AFAICT, gpg will only tell you the key is expired _now_, and will make no statement regarding the time the actual signature was made.

Thanks, Jochen.
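The check being asked for here can be written down in a few lines. The Python sketch below compares the signature's timestamp against the key's validity window rather than against the current time. It is not something gpg is claimed to offer; the function name and parameters are made up, and the caller would have to obtain the three timestamps by some other means.

```python
from datetime import datetime, timezone
from typing import Optional


def key_valid_at_signing(sig_time: datetime,
                         key_created: datetime,
                         key_expires: Optional[datetime]) -> bool:
    """True if the signature timestamp falls inside the key's lifetime.

    key_expires is None for a key without an expiry date.
    """
    if sig_time < key_created:
        return False
    return key_expires is None or sig_time < key_expires


# Example: signed on 2011-10-31 with a key that expired two weeks later.
signed = datetime(2011, 10, 31, tzinfo=timezone.utc)
created = datetime(2009, 1, 1, tzinfo=timezone.utc)
expired = datetime(2011, 11, 15, tzinfo=timezone.utc)
still_fine = key_valid_at_signing(signed, created, expired)
```

One caveat: the signature timestamp is asserted by the signer, so whoever holds an expired key can backdate a signature. A check like this is only meaningful when something else (for example the date the merge entered a widely mirrored history) bounds when the signature can have been made.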
Michael J Gruber <git@drmicha.warpmail.net> writes: > The advantage of tags is that they can be added without rewriting the > commit, of course. And you did neither think about the downsides of tags, nor read what others already explained for you?
On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> But on the other hand, in many ways, publishing your commit to the outside
> world, not necessarily for getting pulled into the final destination
> (i.e. your tree) but merely for other people to try it out, is the point
> of no return (aka "don't rewind or rebase once you publish"). "pushing
> out" might be less special than "please pull", but it still is special.

So I really think that signing the top commit itself is fundamentally wrong. That commit may not even be *yours*. You may have pulled it from a sub-lieutenant as a fast-forward, or similar. Amending it later would be actively very very *wrong*.

So quite frankly, I think the stuff in pu (or next?) is completely mis-designed. Doing it in the commit is wrong for fundamental reasons, which all boil down to a simple issue:

- you absolutely *need* to add the signature later. You *cannot* do it at "git commit" time.

That's a fundamental issue both from a "workflow model" issue (ie you want to sign stuff after it has passed testing etc, but you may need to commit it in order to *get* testing), as well as from a "fundamental git datastructures" issue (ie you would want to sign commits that aren't yours).

"git commit --amend" is not the answer - that destroys the fundamental concept of history being immutable, and while it works for your local commits, it doesn't work for anybody else's commits, or for stuff you already pushed out.

And "add a fake empty commit just for the signature" is not the answer either - because that is clearly inferior to the tags we already had.

I dunno. Did I miss something? As far as I can tell, the signed tags that we've had since day one are *clearly* much better in very fundamental ways.

Linus
Junio C Hamano venit, vidit, dixit 02.11.2011 19:58: > Michael J Gruber <git@drmicha.warpmail.net> writes: > >> The advantage of tags is that they can be added without rewriting the >> commit, of course. > > And you did neither think about the downsides of tags, nor read what > others already explained for you? We're just weighing things differently here, and no accusations of "misinformation" or "not thinking" will change this.
Linus Torvalds <torvalds@linux-foundation.org> writes: > And "add a fake empty commit just for the signature" is not the answer > either - because that is clearly inferior to the tags we already had. > > I dunno. Did I miss something? As far as I can tell, the signed tags > that we've had since day one are *clearly* much better in very > fundamental ways. Ok, back to the drawing board (which is not a loss as I wasn't expecting this to be in the official release in upcoming 1.7.8 anyway).
Linus Torvalds <torvalds@linux-foundation.org> writes:
> I hate how anonymous our branches are. Sure, we can use good names for
> them, but it was a mistake to think we should describe the repository
> (for gitweb), rather than the branch.
>
> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
> it's a major design issue. But I do think that it would have been
> nicer if we had had some branch description model.
> ...
> Maybe just verifying the email message (with the suggested kind of
> change to "git request-pull") is actually the right approach. And what
> I should do is to just wrap my "git pull" in some script that I can
> just cut-and-paste the gpg-signed thing into, and which just does the
> "gpg --verify" on it, and then does the "git pull" after that.
>
> Because in many ways, "git request-pull" is when you do want to sign
> stuff. A developer might well want to push out his stuff for some
> random internal testing (linux-next, for example), and then only later
> decide "Ok, it was all good, now I want to make it 'official' and ask
> Linus to pull it", and sign it at *that* time, rather than when
> actually pushing it out.

You keep saying cut-and-paste, but do you mind feeding the e-mail text itself to a tool, instead of cut-and-paste?

The reason I am wondering about this is because in another topic (also in 'next') cooking there is an extended support for topic description for the branch that states what the purpose of the topic is and why the requestor wants you to have it (this information can be set and updated with "git branch --edit-description").

A respond-to-request-pull wrapper you would use could be:

- Get the e-mail from the standard input;
- Pick up the signed bits and validate the signature;
- Perform the requested fetch; and
- Record the merge (or prepare .git/MERGE_MSG) with both the signed bits.

and the "signed bits" could include:

- the repository and the branch you were expected to pull;
- the topic description.
among other things the requestor can edit when request-pull message is prepared. That would get us back to your "the lieutenant tip is not so special, but the merge commit the integrator makes using that tip has the signature for this particular pull" model.
On Wed, 2 Nov 2011, Junio C Hamano wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> I hate how anonymous our branches are. Sure, we can use good names for >> them, but it was a mistake to think we should describe the repository >> (for gitweb), rather than the branch. >> >> Ok, "hate" is a strong word. I don't "hate" it. I don't even think >> it's a major design issue. But I do think that it would have been >> nicer if we had had some branch description model. >> ... >> Maybe just verifying the email message (with the suggested kind of >> change to "git request-pull") is actually the right approach. And what >> I should do is to just wrap my "git pull" in some script that I can >> just cut-and-paste the gpg-signed thing into, and which just does the >> "gpg --verify" on it, and then does the "git pull" after that. >> >> Because in many ways, "git request-pull" is when you do want to sign >> stuff. A developer might well want to push out his stuff for some >> random internal testing (linux-next, for example), and then only later >> decide "Ok, it was all good, now I want to make it 'official' and ask >> Linus to pull it", and sign it at *that* time, rather than when >> actually pushing it out. > > You keep saying cut-and-paste, but do you mind feeding the e-mail text > itself to a tool, instead of cut-and-paste? think webmail (i.e. gmail), to feed the e-mail itself to a tool you either need to cut-n-paste the entire e-mail or you have to first save the mail to a text file. both of which are significantly harder than doing a cut-n-past of a portion of the message. David Lang > The reason I am wondering about this is because in another topic (also in > 'next') cooking there is an extended support for topic description for the > branch that states what the purpose of the topic is why the requestor > wants you to have it (this information can be set and updated with "git > branch --edit-description"). 
> > A respond-to-request-pull wrapper you would use could be: > > - Get the e-mail from the standard input; > - Pick up the signed bits and validate the signature; > - Perform the requested fetch; and > - Record the merge (or prepare .git/MERGE_MSG) with both the signed bits. > > and the "signed bits" could include: > > - the repository and the branch you were expected to pull; > - the topic description. > > among other things the requestor can edit when request-pull message is > prepared. > > That would get us back to your "the lieutenant tip is not so special, but > the merge commit the integrator makes using that tip has the signature for > this particular pull" model. > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 2, 2011 at 4:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> You keep saying cut-and-paste, but do you mind feeding the e-mail text
> itself to a tool, instead of cut-and-paste?

Feeding the email to a tool is actually a fair amount of extra work. It would have worked well in the days when I used text-based email clients that just had a "pipe email to command" model, but that's long gone.

In contrast, cut-and-paste to another program is easy - but then you really can't depend on whitespace or headers or other subtle things.

> A respond-to-request-pull wrapper you would use could be:
>
> - Get the e-mail from the standard input;
> - Pick up the signed bits and validate the signature;
> - Perform the requested fetch; and
> - Record the merge (or prepare .git/MERGE_MSG) with both the signed bits.

So is there any reason this couldn't be cut-and-paste? Make the signed part small (*not* including diffstat and shortlog), and make it whitespace-safe, and I wouldn't mind a tool at all. If it *can* take the whole email, that would probably be a good design (so that a "pipe email to command" model would still work), but it would be much better if it doesn't require it.

> and the "signed bits" could include:
>
> - the repository and the branch you were expected to pull;
> - the topic description.
>
> among other things the requestor can edit when request-pull message is
> prepared.

One thing I'd like is that it would also fire up an editor for the merge, even if it gets the topic description from the email or cut-and-paste. I often want to fix up people's grammar etc.

That's a separate argument for trying to keep the signed part minimal - because I really don't want to have to maintain spelin errors just because they are part of what was signed..

Linus
On Wed, Nov 2, 2011 at 13:04, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> But on the other hand, in many ways, publishing your commit to the outside
>> world, not necessarily for getting pulled into the final destination
>> (i.e. your tree) but merely for other people to try it out, is the point
>> of no return (aka "don't rewind or rebase once you publish"). "pushing
>> out" might be less special than "please pull", but it still is special.
>
> So I really think that signing the top commit itself is fundamentally wrong.

I really disagree. I like the signed commit approach. It allows for a lot more workflows than just providing a way for you to validate a pull from a trusted lieutenant. Debian/Gentoo folks want a way to sign every commit in their workflow. Just because you don't want that and think it's crazy doesn't mean it's not a valid workflow for that community and is something Git shouldn't support.

I never use `git stash`. I hate the damn command. Yet it's still there. I just choose not to use it. Junio's gpgsig header on each commit is also optional, and communities/contributors can choose to use (or ignore) the feature as they need to.

> That commit may not even be *yours*. You may have pulled it from a
> sub-lieutenant as a fast-forward, or similar. Amending it later would
> be actively very very *wrong*.

Obviously you shouldn't amend a commit that would otherwise be a fast-forward. But why not write a new empty signed commit on top, and teach `git log` without the verify signatures flag to skip over commits that have a gpgsig header line, have exactly one parent, and whose parent tree matches the commit's own tree? This removes these commits from the normal `git log` revision output, but yet the flow of changes is still very visible within the history.
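The skip rule proposed in the paragraph above has three conditions, and a short Python sketch makes them explicit. The Commit record and both function names are invented for illustration; this models the rule on an in-memory list and is not how `git log` is implemented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    sha: str
    tree: str
    parents: Tuple[str, ...]
    gpgsig: Optional[str] = None  # None means the commit is unsigned


def is_signature_only(commit: Commit, by_sha: dict) -> bool:
    """True for an empty commit that exists only to carry a signature."""
    # Condition 1: it has a gpgsig header. Condition 2: exactly one parent.
    if commit.gpgsig is None or len(commit.parents) != 1:
        return False
    parent = by_sha.get(commit.parents[0])
    # Condition 3: its tree is identical to its parent's tree.
    return parent is not None and parent.tree == commit.tree


def visible_commits(commits: List[Commit],
                    show_signatures: bool = False) -> List[Commit]:
    """Filter a log listing, hiding signature-only commits by default."""
    if show_signatures:
        return list(commits)
    by_sha = {c.sha: c for c in commits}
    return [c for c in commits if not is_signature_only(c, by_sha)]
```

A signed commit that also changes the tree, or a signed merge, stays visible under this rule; only the empty single-parent signature commits drop out of the default listing.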
As I understand it, the point of multiple Signed-off-by lines in commit message bodies is to show the flow of a change, who reviewed and applied a given commit, until it finally lands in a tree where its commit SHA-1 is frozen in stone and you can later pull it. The empty signed commit on top of a fast-forward provides that same flow of a change, readily visible with standard `git log` tools, but doesn't have to clutter up history if we teach log how to skip this particular type. Similar to the --no-merges way to skip merges. :-) > So quite frankly, I think the stuff in pu (or next?) is completely > mis-designed. Doing it in the commit is wrong for fundamental reasons, > which all boil down to a simple issue: Totally disagree. I'm really in favor of embedding these into the commit headers the way Junio has done. > - you absolutely *need* to add the signature later. You *cannot* do > it at "git commit" time. Why can't you add it at commit time? What is stopping me from running `git commit -S` every time I make a commit? Is it that my fingers will wear out more quickly because I have to type my pass-phrase too often? What is wrong with making a signed commit on a commit I have a high level of confidence in, but not signing the others? In my own workflow I make a lot of commit --amends / rebases until I am pretty confident in the code being written and organized the way I think it should be for distribution to others. But at some point in that workflow I'm doing an --amend or a rebase to make that last final touch, and during that commit I can add -S to make it signed, because I'm pretty certain its ready to go. At that point, barring some horrific bug or reviewer comments, I am unlikely to change the commit. I know at the time I make that commit that I am pretty confident in the commit, so I take the extra few key strokes to sign it. 
> That's a fundamental issue both from a "workflow model" issue (ie you > want to sign stuff after it has passed testing etc, Why do I have to wait until its tested to sign it? The gpgsig signature isn't any more special than the Signed-off-by line I put into my commit message to agree to the developer's certificate of origin, nor is it any more special than the committer line in the commit header. Its just a statement on the commit that I have a reasonable enough confidence in the value of this particular commit and its ancestors that I should take the time to unlock my GPG key and sign the content in case I do distribute this to others. If you are going to spend time testing a commit, its probably going to take longer to perform that testing than it is to perform the GPG key unlock and signature. So why are you complaining about the time it takes to sign something you think is worthy of testing? If the tests fail, you'll need to rewind/amend/whatever to address the breakage. If the tests pass, the commit is already signed and ready for distribution. If you are spending a lot of time signing commits that are highly likely to fail tests, well, maybe you should look at other ways to improve your workflow so that you have a higher level of confidence in the code you record and assume will be a permanent part of the project's history. > but you may need > to commit it in order to *get* testing), Maybe consider allowing a ".dirty" suffix like git-core does on builds? Or if you are submitting the code to a remote test cluster that auto-compiles the code for you (and that is why you need a commit), it sounds like the time it takes for that to push, compile, test, and report back is way higher than the time it takes to make the signature. So you probably should only be submitting something that you had a reasonable level of confidence in. So you should go ahead and sign it before sending it for testing, in case the tests do pass and you want to publish that commit. 
> as well as from a > "fundamental git datastructures" issue (ie you would want to sign > commits that aren't yours. Sure. But this is why you can make an empty commit and sign that. > "git commit --amend" is not the answer - that destroys the fundamental > concept of history being immutable, and while it works for your local > commits, it doesn't work for anybody elses commits, or for stuff you > already pushed out. Nobody said you had to amend everything. You can add an empty commit. > And "add a fake empty commit just for the signature" is not the answer > either - because that is clearly inferior to the tags we already had. Really? I disagree. The commit DAG scales quite well. The tag namespace does not. A refs/signatures/$COMMIT_SHA1 namespace also does not scale well. An empty commit with a gpgsig header has about the same object cost as an annotated tag once packed. But it has the advantage that the damn thing doesn't clog up the reference space, the reference handling code, or the advertisements in the native protocol. As history goes on, older signatures are less relevant, and automatically are avoided/skipped/bypassed by the normal DAG walking code. Tags don't do this well because they have no relationship to the project history. The only downside to an empty commit with the gpgsig header is I cannot grab an arbitrarily deep ancestor and say "Who has signed a commit that depends on this"? Today we already have this with git describe --contains (aka git name-rev) for annotated tags. Its a new feature we have to teach to some part of the log machinery, but the algorithm will be easier because it doesn't have to mess with the mapping table of tag objects. It just has to start digging from roots, remembering each commit that has a gpgsig on any given branch path, and then outputting the matches when it finds the commit in question. 
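The "who has signed a commit that depends on this?" query described above can be sketched on a toy commit graph. This is only an illustration, not how git's log machinery works or would work: the dictionary-based graph, the `signed` set and the function name are all assumptions made for the example, and the walk runs from the target towards the tips (by inverting parent links) rather than digging from the tips as the message suggests.

```python
from collections import deque


def signed_descendants(parents, signed, target):
    """Return the signed commits whose history contains `target`.

    parents: dict mapping commit id -> list of parent ids (toy model)
    signed:  set of commit ids assumed to carry a gpgsig header
    """
    # Invert the parent links so we can walk from the target to the tips.
    children = {}
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children.setdefault(parent, []).append(commit)

    found = []
    seen = {target}
    queue = deque([target])
    while queue:
        commit = queue.popleft()
        if commit in signed:
            found.append(commit)
        for child in children.get(commit, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return found


# Toy history: a <- b <- c, b <- d, and e merges c and d.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["c", "d"]}
print(signed_descendants(history, {"c", "e"}, "b"))
```

A real implementation would have to stream commits instead of holding the whole graph in memory; the sketch only shows that no tag-to-commit mapping table is needed for the query.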
The commit approach also has the advantage that your tree automatically carries any lieutenant's signatures, by virtue of them already being frozen in the commits. This allows anyone downstream of you to verify the same signatures, and check them against their own keyring contents. If the signatures are all detached in some transient annotated tag space, its impossible for anyone other than you to verify pull requests. I would hate to say we have this nice distributed version control system, but only Linus can prove the pull requests in his repository are what they claim, and we have to then implicitly trust you to resign that data without the original signatures being present. $DAY_JOB would feel a lot better about the integrity of the Linux kernel repository if _ANYONE_ can validate pull requests offline after they have happened. > I dunno. Did I miss something? As far as I can tell, the signed tags > that we've had since day one are *clearly* much better in very > fundamental ways. Completely disagree. :-) -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 2, 2011 at 6:02 PM, Shawn Pearce <spearce@spearce.org> wrote:
>>
>> So I really think that signing the top commit itself is fundamentally wrong.
>
> I really disagree. I like the signed commit approach.

If you like it so much, go ahead and use them. But stop with the crazy excuses for the downsides.

I explained exactly why amending is stupid and wrong, and why empty commits are f*cking moronic. But even apart from the *technical* problems with the stupid mis-designed feature, I explained why it was fundamentally broken from a workflow standpoint too.

I'm not saying that you shouldn't use them - go ahead and use the feature if you like it. But please spare me your excuses for stupid workarounds that come from the fact that they aren't a good match for sane workflows.

Linus
On Wed, Nov 2, 2011 at 6:19 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> I'm not saying that you shouldn't use them - go ahead and use the
> feature if you like it. But please spare me your excuses for stupid
> workarounds that come from the fact that they aren't a good match for
> sane workflows.

Btw, having now done odd things with signed tags (because we've used them as a side-band verification mechanism), I can certainly also say that the signed tags have their set of problems too.

So signed tags aren't perfect. They were designed for making releases, and that shows very clearly in how git works with them. The default choices that git makes are very awkward indeed when you use signed tags as "security tokens". But unlike the "sign the commit" approach, those are implementation and UI issues, not "fundamentally broken design" issues.

For example, fetching a single signed tag with git is surprisingly hard. It *shouldn't* be hard - and there's no underlying technical or design reason why it would be hard, but it is. Why? Because all the git actions when it comes to tags are geared towards one particular use, that is *not* about the signature checking aspect of them.

Here's an example: Rusty Russell now makes nice signed tags for the things he asks me to pull, and then states them in the pull message. So he will mention that he has a tag named

    rusty@rustcorp.com.au-v3.1-8068-g5087a50

in his git repository at

    git://github.com/rustyrussell/linux.git

and while I don't think his tag names are all that wonderful, it makes sense from an automated script kind of standpoint.

Now, let's try to get that tag:

    [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git rusty@rustcorp.com.au-v3.1-8068-g5087a50
    fatal: Couldn't find remote ref rusty@rustcorp.com.au-v3.1-8068-g5087a50

oops. Ok, so his tag naming is *really* awkward. Whatever.
Let's try again:

    [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50
    From git://github.com/rustyrussell/linux
     * tag               rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> FETCH_HEAD

Ahh, success!

Oops. Nope. It turns out that git will *peel* the tag when you fetch it, so FETCH_HEAD actually doesn't contain the tag object at all, but the commit object that the tag pointed to. MAJOR FAIL.

Quite frankly, I think that's a git bug, but it's a git bug because "git fetch" was designed to get the commit to merge. Fair enough. Let's work around it, and rename the tag at the same time:

    [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50:refs/tags/rusty
    From git://github.com/rustyrussell/linux
     * [new tag]         rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty
     * [new tag]         rusty@rustcorp.com.au-v3.1-2-gb1e4d20 -> rusty@rustcorp.com.au-v3.1-2-gb1e4d20
     * [new tag]         rusty@rustcorp.com.au-v3.1-4896-g0acf000 -> rusty@rustcorp.com.au-v3.1-4896-g0acf000
     * [new tag]         rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty@rustcorp.com.au-v3.1-8068-g5087a50

WTF? Now we finally *did* get the tag, and we can do

    git verify-tag rusty

and that will work. But what the hell happened? We got three other tags too that we didn't even ask for!

So we have actual git bugs here, that relate to the fact that we've treated signed tags specially, and have magic code to basically say "if there's a signed tag that is reachable from the thing you pull, and you're not just doing a temporary pull into FETCH_HEAD, we'll fetch that signed tag too".

Again - not a fundamental design mistake in the data structures, and it actually made sense from a "signed tags are important release points" standpoint, but it makes it *really* inconvenient to use signed tags for signature verification.
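Why peeling loses the signature can be shown with a toy object store. This is a sketch under stated assumptions, not git's real object model or API: the dictionary layout, the object ids `t1`/`c1` and the `signature` field are made up for the illustration.

```python
def peel(objects, oid):
    """Follow tag objects until a non-tag object is reached."""
    while objects[oid]["type"] == "tag":
        oid = objects[oid]["object"]
    return oid


# Hypothetical store: a signed tag "t1" pointing at commit "c1".
objects = {
    "t1": {"type": "tag", "object": "c1", "signature": "placeholder PGP block"},
    "c1": {"type": "commit"},
}

fetched = peel(objects, "t1")
# The peeled id names the commit, which carries no signature field at all.
print(fetched, "signature" in objects[fetched])
```

Once only the peeled id is recorded, nothing downstream can reach the tag object again, which is why the merge has no signature to save.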
Also, the fact that the signed tag gets peeled when we fetch into FETCH_HEAD means that we can't actually save the signature in the resulting merge commit. The merge, instead of being able to perhaps save the information that we merged a nice trusted signed point, only has the commit.

But practically, all of these issues should be pretty easily solvable. So it should be quite easy to make

    git pull <repo> <tag-name>

just do the right thing - including verifying the tag, and adding the information in the tag into the merge commit message.

So signed tags are not mis-designed from a conceptual standpoint - they just work really really awkwardly right now for what the kernel would like to do with them.

With a few UI fixes, I think the signed tag thing would "just work".

That said, I do think that the "signature in the pull request" should also "just work", and I'm not entirely sure which one is better. It might be more convenient to get the signature data from the pull request. So I'm not at all married to the notion of using signed tags for this.

Linus
On Wed, Nov 2, 2011 at 18:45, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Nov 2, 2011 at 6:19 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> I'm not saying that you shouldn't use them - go ahead and use the >> feature if you like it. But please spare me your excuses for stupid >> workarounds that come from the fact that they aren't a good match for >> sane workflows. We often disagree. :-) > Btw, having now done odd things with signed tags (because we've used > them as a side-band verification mechanism), I can certainly also say > that the signed tags have their set of problems too. ... > But practically, all of these issues should be pretty easily solvable. > So it should be quite easy to make > > git pull <repo> <tag-name> > > just do the right thing - including verifying the tag, and adding the > information in the tag into the merge commit message. Uhm, sure. Quoting you 2 days ago: On Mon, Oct 31, 2011 at 15:52, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote: >> >> So nobody is worried about this (quoting from my earlier message)? > > No, because you haven't been reading what we write. > > The tag is useless. > > The information *in* the tag is not. But it shouldn't be saved in the > tag (or note, or whatever). Because that's just an annoying place for > it to be, with no upside. > > Save it in the commit we generate. BAM! Useful, readable, permanent, > and independently verifiable. So you propose we put the tag contents into the merge commit message so it can be verified after the fact? So merges are now going to be something much more horrific to read, because it will end with Git object tag cruft, the tag message, and the PGP signature spew that no human can decode in the head? Oh, right, tags are almost good enough. 
Elsewhere in this thread you also stated we have to redo the way tags are signed so that the tag message body itself is not part of the signature, allowing you to fix spelin errors so you are not stuck with them in your commit history. But I assume we will have to keep the more typical headers of object / type / tag / tagger fields, as that is the key information the signature needs to be over to be of any value. So now there will be two different ways in which a Git annotated tag object will have its signature created, as certainly you don't mean to remove the tag message body from the PGP signature content for release tags. I fail to see how shoving Git object data fields and a complete PGP signature block into a merge commit message body, which will show by default in all git log type tools, and exist in cherry-picks or rebases that might make that data less valuable, is somehow better than the gpgsig header that neatly tucks it away until requested. I also fail to see how scraping the message body for the proper fields in order to implement automated verification of the signature (because no human can do it themselves and copy-paste sucks) is a good idea. Everywhere else in Git that we have machine readable formats its very well structured so that no guessing is required. > So signed tags are not mis-designed from a conceptual standpoint - > they just work really really awkwardly right now for what the kernel > would like to do with them. > > With a few UI fixes, I think the signed tag thing would "just work". Well, UI fixes, protocol changes, improvements to manage a large reference space which we have previously said is an insane and stupid workflow, etc. One reason you picked up all of those extra tags was the include-tag capability kicking on and picking up older tag history. We now have to disable it in certain cases. Its not just a few UI fixes. 
And there is a lot more work to write a verify for the tag contents+signature that appears in the body of a merge commit message. Not to mention we now have to do that verify logic twice, once in the signed pull request tag like but not quite a tag but uses a tag thing you are advocating, and again for the merge commit message body that contains the tag object data that we don't normally show to an end user, but will now be in every merge commit you make. Go ahead and call me stupid, but this already is a bigger amount of surgery to the git-core code, not to mention worse user experience for the average `git log` reading human, than having a hidden by default gpgsig header that might ask a contributor to take 2 extra seconds before making a commit to consider the useful lifespan of that commit. Or $DEITY forbid, write a new empty commit to record the equivalent of their Signed-off-by. Oh, and while I am on that subject... <rant> I have never grasped why sometimes a Signed-off-by is added to a patch, and why sometimes its not. It seems to be this weird function of "If the commit SHA-1 is already stable DON'T FUCKING TOUCH IT BY ADDING SIGNED-OFF-BY IT RUINS THE HISTORY", but if you are too far down the food chain to be fortunate enough for your commit SHA-1 to remain frozen, the Signed-off-by has to be added to assert that the code can be contributed. It sounds like the workflow developed around where it wasn't acceptable to force history rewriting, you suffer by not having the SOB, but whenever possible you force a history rewrite on the contributor just so you can add a SOB and feel good about the fact that the SOB is added to the commit message. Get over it. Add the fucking empty commit to show the flow of a change. Stop forcing every fucking contributor to rebase/rewrite his commits just so someone higher up in the food chain can wank with their SOB line. 
Everyone I talk to that contributes code to the kernel who isn't Linus or Ted Tso complains about this, and then asks me to fucking fix it. They want stable SHA-1s so they know their change arrived into Linus' tree unmolested. Unfortunately, despite their volume of changes, they aren't high enough in the food chain to be this lucky. Nope, someone has to wank their SOB in first. And maybe fix a spelin error. </rant> -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> So you propose we put the tag contents into the merge commit message
> so it can be verified after the fact? So merges are now going to be
> something much more horrific to read, because it will end with Git
> object tag cruft, the tag message, and the PGP signature spew that no
> human can decode in the head?

Actually, I wanted to just drop the damn thing. To me, the point of the tag is so that the person doing the merge can verify that he merges something trusted.

However, everybody else seems to disagree, and wants that stupid signature to live along in the repository.

And I can live with that, although I do agree with you that it's not exactly pretty. I can live with "ugly signature that I don't care for" way more than "stupid design". Because unlike your crazy empty commit, it at least fits the workflow, and it certainly isn't any uglier than an extraneous pointless commit.

You can disagree. You obviously do. I simply don't care. Because I'm right.

(And your claim that it's big UI fixes and protocol changes is pure and utter garbage. I just sent a patch that cleans the code up, removes a line that improperly drops information and gets rid of the biggest problem with our current handling of tags. No protocol changes involved, no big UI fixup).

Linus
On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> <rant>

I'm answering this separately, because it's a separate rant. It's also totally bogus, but whatever.

> Get over it. Add the fucking empty commit to show the flow of a
> change. Stop forcing every fucking contributor to rebase/rewrite his
> commits just so someone higher up in the food chain can wank with
> their SOB line.

Shawn, stop using whatever drugs you are using.

NOBODY EVER REBASES ANYTHING FOR SIGNED-OFF-BY.

If they do, they are doing things very very wrong.

Signed-off-by: is *purely* for sending patches by email. No git operations involved. None. Nada. Zilch. No rebasing involved, because there's not even a git repository involved, for chissake!

Once something is in git, it's not signed off on - there should be a sign-off-chain from the author to the committer, and that's it. Anything else would be crazy.

So stop the crazy rants. Stop with the bad drugs. Seriously. You're acting crazy.

Linus
On Wed, Nov 02, 2011 at 06:02:37PM -0700, Shawn O. Pearce wrote: > > So I really think that signing the top commit itself is fundamentally wrong. > > I really disagree. I like the signed commit approach. It allows for a > lot more workflows than just providing a way for you to validate a > pull from a trusted lieutenant. Debian/Gentoo folks want a way to sign > every commit in their workflow. Just because you don't want that and > think its crazy doesn't mean its not a valid workflow for that > community and is something Git shouldn't support. I never use `git > stash`. I hate the damn command. Yet its still there. I just choose > not to use it. Junio's gpgsig header on each commit is also optional, > and communities/contributors can choose to use (or ignore) the feature > as they need to. Stop for a minute and think about what it _means_ to sign a commit. Is it saying "I wrote this commit?" Or "I think this commit is good?" Or "I think all of the history leading to this is good?" It's obviously going to be a per-project thing, but it's very constricting. Leaving aside all of the workflow issues Linus brought up (but which I do agree with), think about what it would mean for Linus to fetch a commit from a lieutenant and then sign it. Whatever it means, it can really only be _one_ thing. But big projects that are interested in signatures probably want to say more. They want to say "this developer really wrote this commit". They want to say "QA passed this commit". They want to say "the history up to here looks good". And so on. But they can't say those things without binding some data to the commit (i.e., making a certificate saying "this commit passed QA"). Data which might only make sense to assert much later than the commit is written. So you're going to need to support detached commit signatures in some form anyway to make everybody happy. Which isn't to say in-commit signatures are wrong, but they are just one tool in a toolbox. 
Personally, I think the only thing that makes sense to assert inside a commit itself is that you are the author, and the author line of the key should match the email UID of the signing key. And then anything you want to say about _other_ people's commits (or even your own commits, but later) should come in the form of detached signatures with some content. That's how signed tags work. It's not just Linus signing a commit. It's Linus signing a binding between a commit and the statement "this is v2.6.28". The only thing wrong with the signed tag model for more general use is that you need some way of naming and organizing large numbers of tags (e.g., several per commit if you have things like QA signatures). -Peff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 02, 2011 at 10:55:32PM -0400, Jeff King wrote: > But big projects that are interested in signatures probably want to say > more. They want to say "this developer really wrote this commit". They > want to say "QA passed this commit". They want to say "the history up to > here looks good". And so on. On the Gentoo side, we've also pondered the question of: author != committer != pusher And how to preserve many signatures from sources. We're on a central repo model, with some ~250 committers. I was originally primarily after the push certificates/signed-push, and recording that data in the notes, but that still has the problems of third-party verification as mentioned in the thread. If we require that the tip of every push is a signed commit via a hook, we get knowledge of the pushers. Either your real commit itself is signed, or you have a signed merge commit on top, or you have a signed empty commit. In all of the cases, I can verify your signature at the recv hook. Having signed push in this case has a benefit that you could ship the data as a bundle, or async from the signing. The QA value of multiple signatures per commit is also valuable, to assert SOB WITHOUT altering the commit. I see spearce's rant and the retort, and really think there needs to be a middle ground - some of commits that are coming from pulls, and not getting additional SOB, could really benefit from them being recorded (I see them on mailing lists, but not introduced since that would break 'stable' IDs). > But they can't say those things without binding some data to the commit > (i.e., making a certificate saying "this commit passed QA"). Data which > might only make sense to assert much later than the commit is written. > > So you're going to need to support detached commit signatures in some > form anyway to make everybody happy. Which isn't to say in-commit > signatures are wrong, but they are just one tool in a toolbox. 
I was proposing that Git supports _all_ of these models:

- signed commits
- signed pushes (via certs)
- whatever signed lightweight tag idea happens
- existing annotated tags

Choices. Each with their own costs and advantages.
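The "tip of every push must be a signed commit" policy mentioned earlier in this message could start from a check like the one below. It is a sketch only: it looks for a `gpgsig` header in the raw text of a commit object (headers run up to the first blank line, and continuation lines start with a space), and it does not verify the signature itself. The function names and the sample commit text are assumptions for the example; a real hook would still have to pass the signature to gpg.

```python
def commit_headers(raw):
    """Parse the header section of a raw commit object into a dict of lists."""
    headers = {}
    key = None
    for line in raw.split("\n"):
        if line == "":
            break  # the first blank line separates headers from the message
        if line.startswith(" ") and key is not None:
            # Continuation of a multi-line header value such as gpgsig.
            headers[key][-1] += "\n" + line[1:]
        else:
            key, _, value = line.partition(" ")
            headers.setdefault(key, []).append(value)
    return headers


def tip_has_signature(raw):
    """True if the commit carries a gpgsig header (presence only, no gpg run)."""
    return "gpgsig" in commit_headers(raw)


# Made-up commit text for illustration; ids and signature data are placeholders.
example = (
    "tree 1111\n"
    "parent 2222\n"
    "author A U Thor <author@example.org> 0 +0000\n"
    "committer A U Thor <author@example.org> 0 +0000\n"
    "gpgsig -----BEGIN PGP SIGNATURE-----\n"
    " \n"
    " placeholder\n"
    " -----END PGP SIGNATURE-----\n"
    "\n"
    "Empty commit recording a sign-off\n"
)
print(tip_has_signature(example))
```

Because the check stops at the first blank line, the word "gpgsig" appearing in a commit message body is not mistaken for a header.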
Hi,

On Wed, Nov 02, 2011 at 07:25:17PM -0700, Linus Torvalds wrote:
> To me, the point of the tag is so that the person doing the merge can
> verify that he merges something trusted.
>
> However, everybody else seems to disagree, and wants that stupid
> signature to live along in the repository.

Keeping it seems quite useless, and it leads to false conclusions in the cases where the merger's gpg output differs from that of someone checking later on, e.g. when

- the signing key has been revoked in the meantime (for whatever reasons)
- the signing key has expired
- the public part of the signing key is not available to the general public.

AFAIK gpg just gives you an error code and a message like "Key has expired", without stating whether the key was valid _when signing the commit_.

How do you plan to handle this when keeping the signature in the repository? Or am I overlooking something?

Thanks,
Jochen.
On Wed, Nov 2, 2011 at 8:22 PM, Jochen Striepe <jochen@tolot.escape.de> wrote: > > It seems quite useless and leading to false conclusions in several cases > where the merger's gpg output differs from someone's checking later on, > e.g. when > > - the signing key has been revoked in the mean time (for whatever > reasons) > - the signing key has expired > - the public part of the signing key is not available for the general > public. So I don't think those are *big* issues. Sure, you'd want the public key to be public for it to make any real sense to save, but on the other hand, they *are* generally public. Yes, yes, you might have keys that are only used - and only made public - within some particular organization, but in that case the source code that gets signed with those keys would tend to be private to that organization too, so.. And yes, keys get revoked or they expire, but that's still a pretty rare event, so it doesn't really invalidate the argument that making the original signed content available can quite often be useful - even if it's not guaranteed to *always* be useful. No, my main objection to saving the data is that it's ugly and it's redundant. Sure, in practice you can check the signatures later fine (with the rare exceptions you mention), but even when you can do it, what's the big upside? And there are much bigger real downsides, imho. For example, let's say that we do eventually end up switching from SHA1 to SHA256 in git, and we do a full re-import of the tree. Guess what? All those signatures are now just so much garbage. Sure, you can recreate them (create some trusted script that you agree does a 1:1 transform, and re-sign everything), but in practice you can't ever really do that - because all those things are tied to the tree, so you need to have *everybodys* private keys in one place to do so. And the people who signed things initially would have to be insane to allow that. 
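The re-import argument can be made concrete with a small sketch. Everything here is a stand-in chosen for the illustration: an HMAC with a demo key plays the role of a GPG signature (it is symmetric, so it is not a real substitute), the commit bodies are toy strings, and only the `"<type> <size>\0"` prefix hashed in front of the body follows git's actual object-id scheme.

```python
import hashlib
import hmac


def object_id(kind, body, algo):
    """Hash an object the way git does: '<type> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.new(algo, header + body).hexdigest()


def sign(key, payload):
    """Stand-in for a GPG signature over the payload (assumption: HMAC demo)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key, payload, signature):
    return hmac.compare_digest(sign(key, payload), signature)


def child_body(parent_id):
    # The child's content names its parent by id, so the id is part of
    # whatever an embedded signature covers.
    return f"parent {parent_id}\n\nsecond commit\n".encode()


key = b"demo-key"
parent = b"first commit\n"

old_child = child_body(object_id("commit", parent, "sha1"))
signature = sign(key, old_child)

# Re-import with a different hash: the parent id changes, so the signed
# bytes change even though the project content is identical.
new_child = child_body(object_id("commit", parent, "sha256"))

print(verify(key, old_child, signature), verify(key, new_child, signature))
```

A detached signature does not avoid the re-signing, but each signer can redo it alone against the converted tree, which is the independence the next paragraphs argue for.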
So I'm actually of the opinion that "internal signatures" are bad design at a rather fundamental level. In contrast, the "external signed tags" are fine: it's not just that there are much fewer of them, it's that they are *independent*. So you can easily re-generate the signed tags, because each signer can *individually* decide to validate the newly converted tree, and sign off on the fact that the conversion was done identically using new external tags with signatures. This was one of the reasons I made the signed tags work the way they do. And it wasn't because I was extremely far-sighted and thought of all the problems that internal signatures have - it's because monotone had their internal signatures, and every other email on the monotone list was about all the problems it caused. > AFAIK gpg just gives you an error code and a message like e.g. "Key has > expired" without stating if the key was valid _when signing the commit_. > > How do you plan to handle this when keeping the signature in the > repository? Or am I overlooking something? So see above - I just wouldn't worry about it. The possible few cases where it would occur are dwarfed by the cases where it *doesn't* occur, and those are the ones I'd concentrate on. They are the ones that need to be important enough that it's even worth carrying the random noise around. Are they? So I do think that there are real upsides at the *process* level where you can use the signatures to verify that what is pulled is pulled from the person you thought it was. I don't think anybody disputes those advantages. But outside of that I think it gets very gray, and there real disadvantages. That said, I don't care *that* much. I don't mind polluting the merge commits with information that I don't think is really worth it. So I'd be willing to carry the signature information around, although I'd hope to minimize it and have some sane way to hide it. 
Linus
Linus Torvalds <torvalds@linux-foundation.org> writes: > [torvalds@i5 linux]$ git fetch > git://github.com/rustyrussell/linux.git > rusty@rustcorp.com.au-v3.1-8068-g5087a50 > fatal: Couldn't find remote ref rusty@rustcorp.com.au-v3.1-8068-g5087a50 > > oops. Ok, so his tag naming is *really* akward. Whatever. It is not "Whatever". $ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v3.0 fatal: Couldn't find remote ref v3.0 I do not think we ever DWIMmed fetch refspecs to prefix refs/tags/, so it is not the naming but fetching tags without saying "git fetch tag v3.0" (which IIRC was your invention long time ago). If we changed this "git fetch $there v3.0" to fetch tag, it would help the final step in your illustration, and I do not think it would be a huge regression---the only case it becomes fuzzy is when they have v3.0 branch at the same time, but the owner of such a repository is already playing with fire. > [torvalds@i5 linux]$ git fetch > git://github.com/rustyrussell/linux.git > refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50 > From git://github.com/rustyrussell/linux > * tag > rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> FETCH_HEAD > > Ahh, success! > > Oops. Nope. It turns out that git will *peel* the tag when you fetch > it, so FETCH_HEAD actually doesn't contain the tag object at all, but > the commit object that the tag pointed to. MAJOR FAIL. > > Quite frankly, I think that's a git bug, but it's a git bug because > "git fetch" was designed to get the commit to merge. Fair enough. And because FETCH_HEAD started as (and probably still is) an internal implementation detail of communication between fetch and merge inside pull. So I do not have any issue in changing it to store tags unpeeled there. 
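The DWIM change floated above for "git fetch $there v3.0" might be pictured as the lookup below. This is a hypothetical sketch, not git's refspec code: the function name, the set of advertised refs and in particular the branch-before-tag preference for the fuzzy case are assumptions made for the example.

```python
def dwim_fetch_ref(name, remote_refs):
    """Map a short name to a full ref the remote advertises, or None.

    Assumed policy: full ref names are taken literally; short names try
    refs/heads/ first and fall back to refs/tags/.
    """
    if name.startswith("refs/"):
        return name if name in remote_refs else None
    for prefix in ("refs/heads/", "refs/tags/"):
        candidate = prefix + name
        if candidate in remote_refs:
            return candidate
    return None


advertised = {"refs/heads/master", "refs/tags/v3.0"}
print(dwim_fetch_ref("v3.0", advertised))
```

With this ordering a repository that has both a `v3.0` branch and a `v3.0` tag silently resolves to the branch; reporting the ambiguity instead would be an equally defensible choice.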
> [torvalds@i5 linux]$ git fetch > git://github.com/rustyrussell/linux.git > refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50:refs/tags/rusty > From git://github.com/rustyrussell/linux > * [new tag] > rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty > * [new tag] > rusty@rustcorp.com.au-v3.1-2-gb1e4d20 -> > rusty@rustcorp.com.au-v3.1-2-gb1e4d20 > * [new tag] > rusty@rustcorp.com.au-v3.1-4896-g0acf000 -> > rusty@rustcorp.com.au-v3.1-4896-g0acf000 > * [new tag] > rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> > rusty@rustcorp.com.au-v3.1-8068-g5087a50 > > WTF? This is not WTF but "fetching a history to store the tip of it in your refs/ namespace causes tags pointing into the history line followed automatically", and it exactly is what you want to happen if rusty asked you to fetch his for-linus branch (which the tag may point at) instead. > We got three other > tags too that we didn't even ask for! We could change the rule to read "fetching a history to store the tip of it in your refs/heads namespace causes autofollow". I am not sure if that is what we really want, though. > Again - not a fundamental design mistake in the data structures, and > it actually made sense from a "signed tags are important release > points" standpoint, but it makes it *really* inconvenient to use > signed tags for signature verification. We could update three things: - DWIM $name in "git fetch $there $name" to refs/tags/$name when it makes sense; - FETCH_HEAD stores unpeeled object names; and - "git pull" learns --verify option. Then $ git pull --verify rusty rusty@rustcorp.com.au-v3.1-8068-g5087a50 could integrate the history leading to that tag to your current branch while running verify-tag on it. For this, disabling the tag-auto-following is not necessary, as you are not storing the retrieved tag anywhere. That is a longwinded way to say I agree what you said below. 
> So signed tags are not mis-designed from a conceptual standpoint - > they just work really really awkwardly right now for what the kernel > would like to do with them. > > With a few UI fixes, I think the signed tag thing would "just work". > > That said, I do think that the "signature in the pull request" should > also "just work", and I'm not entirely sure which one is better. I do not think it is necessarily either/or choice. Either way does not solve anything other than validating the last hop between the last lieutenant to the integrator without having a way to give the verification material to third parties. Your earlier "pull request signature could be copied into the message of the merge that integrates the pulled history" solves 90% of the "third party validation" issue. With the signed tags approach, you could push out these signed tags you get from lieutenants, but there are quite a few things that need to happen for it to be usable: - You or your lieutenants do not want to keep these tags in your working repository, to be listed in "git tag -l". They are ephemeral to you and your lieutenant, even though they have to be permanent for third party auditors. - Normal users of your project do not want to see them in "git tag -l" either. - Responses to "git fetch" and "git ls-remote" produced by "git upload-pack" do need to (optionally) include them to allow third party auditors to ask for them. I wonder if an approach like the following, in addition to the three things I listed above, may give us a workable solution: * "git fetch linus v3.0" called by "git pull --verify linus v3.0" fetches the v3.0 unpeeled into FETCH_HEAD, GPG verifies it, creates refs/audit/$u, before running "git merge". $u is derived from v3.0 (given tag), the identity of the GPG signer, and perhaps timestamp to make it both identifiable and unique under refs/audit/ hierarchy. * You "git push origin". 
This causes refs/audit/* refs that point at commits in the transferred history to auto-follow, just like the current "git fetch $there $src:$dst" causes refs/tags/* auto-follow. The refs/audit/* hierarchy in your public repository will be populated by lieutenant signatures. * (Optional) You may have signed "git tag -s 'Linux v3.2' v3.2 master" before you push origin out, or you may have not. Currently, you do have to "git push origin v3.2" separately if you did. The above auto-follow could be extended to push refs/tags/* hierarchy to eliminate this step as well. Note that because of the way "upload-pack" protocol is structured, the first response from "upload-pack" after it gets connection is the advertisement of refs, and there is no way for "fetch-pack" to ask for customized refs advertisement to it. So for this to work without incurring undue overhead for normal users, we would need to exclude refs/audit/* from the normal ref advertisement (i.e. "ls-remote" does not see it) so that "git fetch" by casual users will not have to wait for megabytes of ref advertisements before issuing its first "want" request. Probably we can change "upload-pack" to advertise only refs/heads/*, refs/tags/*, and HEAD by default, and a protocol extension could be added to ask for other hierarchies for specialized needs like third party auditors. BUT. This does not allow third party auditors to audit how sub-subsystem histories came into your lieutenants' history unless you also fetch from your lieutenants in "auditor" mode to retrieve their refs/audit/* refs to be propagated to your public repository, which all of us involved in this thread know you wouldn't bother if it is an additional manual step (and I personally do not think I would bother if I were you). So the audit trail will end at one level unless we have even more complex arrangements. 
The auditors know the history up to some point in the past came from you (your last signed tag at release time, which some people may feel a bit too sparse for auditing purposes when a security incident like that one happens in between releases), and they know subhistories of what you merged came from your direct lieutenants (the refs/audit/* tags the above change allowed you to forward automatically when you published), but they have to take the word of your direct lieutenants at face value. I do not know if that is acceptable for $DAYJOB types, though. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
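The default-advertisement rule proposed above (advertise only refs/heads/*, refs/tags/* and HEAD unless more is asked for) can be sketched as a simple filter. This is only a toy model of the idea, not how "upload-pack" works; the names `DEFAULT_PREFIXES` and `advertise` and the sample ref names are assumptions made up for illustration.

```python
# Toy model of the proposed rule: advertise HEAD, refs/heads/* and
# refs/tags/* by default; other hierarchies only when explicitly requested.
DEFAULT_PREFIXES = ("refs/heads/", "refs/tags/")


def advertise(refs, extra_prefixes=()):
    """Return the subset of ref names a server would advertise."""
    prefixes = DEFAULT_PREFIXES + tuple(extra_prefixes)
    return [r for r in refs if r == "HEAD" or r.startswith(prefixes)]


refs = [
    "HEAD",
    "refs/heads/master",
    "refs/tags/v3.1",
    "refs/audit/example-signature",  # hypothetical audit ref name
]

# A casual fetch never sees refs/audit/*.
print(advertise(refs))
# An auditor asks for the extra hierarchy explicitly.
print(advertise(refs, ["refs/audit/"]))
```

In this model the cost of a large refs/audit/ hierarchy is paid only by clients that ask for it, which is the property the proposal is after.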
Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote: >> >> But on the other hand, in many ways, publishing your commit to the outside >> world, not necessarily for getting pulled into the final destination >> (i.e. your tree) but merely for other people to try it out, is the point >> of no return (aka "don't rewind or rebase once you publish"). "pushing >> out" might be less special than "please pull", but it still is special. > > So I really think that signing the top commit itself is fundamentally wrong. It merely is a stronger form of the "committer" line in the commit object. A random repository at Github anybody can create repositories at can serve you a random commit with any random name on "committer" line, and the new gpgsig header is a way to let the committer certify it genuinely is from the committer. I do not think for that purpose, in-commit signature is fundamentally wrong. I was hoping it would be more useful than it turned out to be, but I agree that it just is not suitable as a vehicle to convey "I made that commit some time ago, and now I want you to pull it for such and such reasons" in a larger workflow. The "now I want you to pull it for such and such reasons" part is the pull request, and if we are to protect them with GPG signatures, and perhaps copy the signed part in the resulting merge, don't we have a reasonable solution, without all the downsides the signed tag approach would cause if we wanted to allow third party auditors to have access to the signatures for independent auditing purposes (described in a separate message)? Perhaps what is causing the problem is the desire to allow third party auditors finer grained audit trail, but after having heard that $DAYJOB folks went through each and every commit after known release points with fine-toothed comb, I am not brave/rude/blunt enough to dismiss it as unimportant. 
Junio C Hamano <gitster@pobox.com> writes:
> BUT.
Ahh, sorry for the noise. I realize that we already have a winner, namely,
the proposal outlined in your message I was responding to.
It just didn't click to me that you were replacing "signed material from
pull request copied into the merge" with "contents of signed tag copied
into the merge".
So forget everything I said in the later parts of my response that talks
about refs/audit/*, and the other message except for gpgsig header being a
stronger form of existing committer line.
On Thu, Nov 3, 2011 at 11:16 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> It is not "Whatever".
>
>    $ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v3.0
>    fatal: Couldn't find remote ref v3.0
>
> I do not think we ever DWIMmed fetch refspecs to prefix refs/tags/, so it
> is not the naming but fetching tags without saying "git fetch tag v3.0"
> (which IIRC was your invention a long time ago).

Ahh. Yeah, and not DWIM'ing tags is probably ok. I'd completely
forgotten about the special "tag" shortcut. Which probably means it
was a bad ui decision to begin with.

But once more, the UI is clearly designed for fetching the tags into
your own tag-space (ie it does "refs/tags/<tag>:refs/tags/<tag>")
rather than fetching the tag just for verification.

> If we changed this "git fetch $there v3.0" to fetch tag, it would help the
> final step in your illustration, and I do not think it would be a huge
> regression---the only case it becomes fuzzy is when they have v3.0 branch
> at the same time, but the owner of such a repository is already playing
> with fire.

Yeah, extending DWIM for remote repos to do the same thing it does for
local repositories is probably the right thing regardless of any other
issues. We already have the "tag and branch with the same name" issue
for local repositories, and we have perfectly good disambiguation
rules for when disambiguation is necessary. Making the DWIM rules be
the same for a remote case sounds sane.

That said, I don't think it's a big deal either. I was just confused
by the expansion being different, but having to have the refs/tags/
there isn't a dealbreaker by any means.

>> Quite frankly, I think that's a git bug, but it's a git bug because
>> "git fetch" was designed to get the commit to merge. Fair enough.
>
> And because FETCH_HEAD started as (and probably still is) an internal
> implementation detail of communication between fetch and merge inside
> pull.

Well, I certainly don't consider it to be just "an implementation
detail" personally. I use FETCH_HEAD all the time (the same way I use
ORIG_HEAD and just plain HEAD). It's very useful for "fetch and check
what they have", when you want to look at something but you don't want
all the remote tags and crud.

So I consider it an honest-to-goodness real user feature.

> So I do not have any issue in changing it to store tags unpeeled there.

In fact, storing the peeled was really surprising to me, especially
since it actually *says* "tag" in the .git/FETCH_HEAD file. So the
.git/FETCH_HEAD file really currently ends up being actively wrong and
misleading for tags we fetch: it looks something like

   <sha-of-commit> tag '<tagname>' of <reponame>

and says it is a tag, but the SHA1 is of the peeled commit. That's
just crazy, and actually made me think the other end (Rusty, in this
case) had done something wrong initially (ie I quite reasonably - I
thought - blamed it on Rusty using a non-signed tag).

>> WTF?
>
> This is not WTF but "fetching a history to store the tip of it in your
> refs/ namespace causes tags pointing into the history line followed
> automatically", and it exactly is what you want to happen if rusty asked
> you to fetch his for-linus branch (which the tag may point at) instead.

Well, yes and no. But mostly no.

If I just fetch his for-linus branch, I don't get (and I don't want)
his tags. It's only because I fetched it into my ref-space. And I only
fetched it into my ref-space, because otherwise the crazy git peeling
happened if I don't do that.

So I didn't want those other tags, and I really normally wouldn't have
gotten them. Only because I had to do that odd work-around to avoid
the peeling did I get it, because then the totally unrelated logic of
"ok, get the tags too" triggered.

So it's a WTF, because this work-around ends up having the special
side effects - and they make sense when you *really* fetch his branch
and make it part of your name-space, but not when you only did the
"part of my namespace" as a workaround for another git issue.

Obviously, you can use "-n" (--no-tags) to fetch the tag, and that
actually fixes the issue, but that is its own kind of WTF too: in
order to fetch just *one* tag, you have to specify that you don't want
tags? Not exactly a greatly intuitive use case ;)

Anyway, the one-line patch I sent basically avoids all these WTF
moments, by just making "git fetch <repo> <tagname>" work (apart from
the DWIMmery on the tag-name, but that's a totally independent small
detail that doesn't really matter).

>> We got three other
>> tags too that we didn't even ask for!
>
> We could change the rule to read "fetching a history to store the tip of it
> in your refs/heads namespace causes autofollow". I am not sure if that is
> what we really want, though.

No, I think the current "follow tags" rule is fine. It's just that it
didn't really mesh well with "damn, I have to work around this other
git issue".

> We could update three things:
>
>  - DWIM $name in "git fetch $there $name" to refs/tags/$name when it makes
>    sense;
>  - FETCH_HEAD stores unpeeled object names; and
>  - "git pull" learns --verify option.

Yes. I think that would indeed solve everything.

> Then
>
>   $ git pull --verify rusty rusty@rustcorp.com.au-v3.1-8068-g5087a50
>
> could integrate the history leading to that tag to your current branch
> while running verify-tag on it.

Agreed. The only remaining issue then would be how that "yes, I
verified the tag" part would be actually saved for posterity.

My suggestion would be to just punt that question, and let the user
decide, by simply:

 - start the editor by default with "--verify"

 - output the "gpg --verify" result into the end of the commit file,
   along with the tag content (which has the original pgp signature, of
   course).

 - let the user decide what part of it he wants to use.

In particular, the "gpg --verify" result may well be something that
the user wants to *act* on - maybe the key didn't exist in the key
ring, or maybe it does exist but doesn't have quite enough trust and
gpg complains about that etc etc. But that's all something that "start
the editor and show the user what is up" would let the user decide on.

> For this, disabling the tag-auto-following is not necessary, as you are
> not storing the retrieved tag anywhere.

Exactly.

>> That said, I do think that the "signature in the pull request" should
>> also "just work", and I'm not entirely sure which one is better.
>
> I do not think it is necessarily either/or choice.

No, I think we can do both, and it actually ends up being just a
matter of convenience which one a particular project ends up using (or
even use both depending on preferences of particular sub-lieutenants
within the project).

> I wonder if an approach like the following, in addition to the three
> things I listed above, may give us a workable solution:
>
>  * "git fetch linus v3.0" called by "git pull --verify linus v3.0" fetches
>    the v3.0 unpeeled into FETCH_HEAD, GPG verifies it, creates
>    refs/audit/$u, before running "git merge". $u is derived from v3.0
>    (given tag), the identity of the GPG signer, and perhaps timestamp to
>    make it both identifiable and unique under refs/audit/ hierarchy.

So far so good, but see above: it may turn out that the user will
*re-verify* the key after having done some gpg action. So..

>  * You "git push origin". This causes refs/audit/* refs that point at
>    commits in the transferred history to auto-follow, just like the
>    current "git fetch $there $src:$dst" causes refs/tags/* auto-follow.
>    The refs/audit/* hierarchy in your public repository will be populated
>    by lieutenant signatures.

So I don't think auto-follow is good here. I could *easily* see
various companies using this for their own internal audit, without
really wanting to expose things outside of the company. So
auto-following sounds like the wrong approach. Make it an explicit
"expose audit checks" thing.

>  * (Optional) You may have signed "git tag -s 'Linux v3.2' v3.2 master"
>    before you push origin out, or you may have not. Currently, you do have
>    to "git push origin v3.2" separately if you did. The above auto-follow
>    could be extended to push refs/tags/* hierarchy to eliminate this step
>    as well.

So far I haven't really had any issues with having to do a "git push
--tags" to push things out. That said, maybe the auto-push could just
be a per-repo option, and then you can have it both ways.

> Note that because of the way "upload-pack" protocol is structured, the
> first response from "upload-pack" after it gets connection is the
> advertisement of refs, and there is no way for "fetch-pack" to ask for
> customized refs advertisement to it. So for this to work without incurring
> undue overhead for normal users, we would need to exclude refs/audit/*
> from the normal ref advertisement (i.e. "ls-remote" does not see it) so
> that "git fetch" by casual users will not have to wait for megabytes of
> ref advertisements before issuing its first "want" request.

I think that would be a good thing, and make it much more palatable.
After all, the likelihood is that *nobody* will ever care about the
audit cases at all. They are very much a "..but what if xyz happens"
kind of safety net for the extreme badness, not anything you'd expect
to use.

Linus
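The complaint about FETCH_HEAD storing the peeled commit rather than the tag can be illustrated with a toy object store. This is a sketch under stated assumptions, not git's data model: the object ids, the `OBJECTS` table and the `peel` helper are all made up for illustration.

```python
# Toy object store: a tag object points at a commit. All ids are made up.
OBJECTS = {
    "tag-1111": {"type": "tag", "object": "commit-2222", "signed": True},
    "commit-2222": {"type": "commit"},
}


def peel(object_id, objects=OBJECTS):
    """Follow tag objects until a non-tag object is reached."""
    while objects[object_id]["type"] == "tag":
        object_id = objects[object_id]["object"]
    return object_id


fetched = "tag-1111"
peeled = peel(fetched)

# Recording only the peeled id loses the tag object, and with it the
# signature that verification needs.
print(peeled, OBJECTS[peeled].get("signed", False))
print(fetched, OBJECTS[fetched].get("signed", False))
```

Keeping the unpeeled id costs nothing for merging, since the commit can always be recovered by peeling, while the reverse direction is not possible.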
On Thu, Nov 3, 2011 at 11:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Ahh, sorry for the noise. I realize that we already have a winner, namely,
> the proposal outlined in your message I was responding to.

No, no, don't consider my "put in the merge message" a winner at all.

I personally dislike it, and don't really think it's a wonderful thing
at all. It really does have real downsides:

 - internal signatures really *are* a disaster for maintenance. You
   can never fix them if they need fixing (and "need fixing" may well be
   "you want to re-sign things after a repository format change")

 - they are ugly as heck, and you really don't want to see them in
   99.999% of all cases.

So putting those things in the merge commit message may have some
upsides, but it has tons of downsides too.

I think your refs/audit/ idea should be given real thought, because
maybe that's the right idea.

Linus
On Thu, Nov 03, 2011 at 12:09:55PM -0700, Linus Torvalds wrote:
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
>
>  - internal signatures really *are* a disaster for maintenance. You
>    can never fix them if they need fixing (and "need fixing" may well be
>    "you want to re-sign things after a repository format change")

Note that a repository format change will break a bunch of other
things as well, including references in commit descriptions ("This
fixes a regression introduced in commit 42DEADBEEF").

So if SHA-1 is in danger of failing in a way that would threaten git's
use of it (highly unlikely), we'd probably be well advised to find a
way to add a new crypto checksum (i.e., SHA-256) in parallel, but keep
the original SHA-1 checksum for UI purposes.

>  - they are ugly as heck, and you really don't want to see them in
>    99.999% of all cases.

So we can make them be hidden from "git log" and "gitk" by default.
That bit is a bit gross, I agree, but 3rd party verification really is
a good thing, which I'm hoping can be added in a relatively clean
fashion.

- Ted
On Fri, Nov 4, 2011 at 7:59 AM, Ted Ts'o <tytso@mit.edu> wrote:
>
> Note that a repository format change will break a bunch of other
> things as well, including references in commit descriptions ("This
> fixes a regression introduced in commit 42DEADBEEF")

No they won't. Not if you do it right. It's easy enough to
automatically replace the SHA1's in the description, the same way we
replace everything else.

Really. It's *trivial*. Maybe some current tools don't do it, but if I
were to convert the kernel tree, I'd absolutely *require* the
conversion to be done right. And "right" means "don't just get the
parent SHA1's right, but the ones hiding in the description too".

Any conversion tool has to keep track of the translation from "old
SHA1 to new SHA1" *anyway* because of all the other issues (ie exactly
things like parent pointers etc), so conversion tools by definition
have the information to do things like this right.

But "internal cryptographic signatures" are fundamentally different. A
conversion tool *cannot* convert them, since it won't have access to
the private keys in question, and thus cannot fix up the signature.

Sure, if I do the conversion, I could make *my* signatures match. And
that is true for every signer out there - individually. But only
individually, never collectively. Sure, we could all meet in one place
and synchronously re-sign things on our private machines with some
"distributed conversion tool", but realistically that really really
doesn't work.

It's a fundamental problem. And it really isn't a theoretical one -
it's one we know will happen *some* day. I haven't worried about SHA1,
exactly because I know it's not a real problem - we can always
convert. But internal signatures very fundamentally change that.

And it really is about *internal* signatures. The kinds of signed tags
we have now are not a problem. Those can trivially be converted in a
distributed manner, exactly because they are "detached" from what they
sign. We carry them along with the git repo, but they don't mess up
history, and they can be re-created individually without changing
anything else.

And yes, this was actually a design issue for me, which is why I feel
so strongly about it. I actually *thought* about issues like this
five+ years ago: I wanted to have cryptographic security, but I very
much on purpose wanted it to be "outside" the repo.

(Ok, so the git tag objects can sign other git tag objects
recursively, and in that case you have an ordering issue where a
conversion would first have to get somebody to re-sign their "inner"
tag before the "outer" signature can be re-created, but even if that
were to happen - and I don't think anybody does it - it's a trivial
problem with no real complexity issues).

>>  - they are ugly as heck, and you really don't want to see them in
>>    99.999% of all cases.
>
> So we can make them be hidden from "git log" and "gitk" by default.
> That bit is a bit gross, I agree, but 3rd party verification really is
> a good thing, which I'm hoping can be added in a relatively clean
> fashion.

I agree that we can hide them - that's after all what the gpgsig thing
does in the "internal commit signature" that git has in pu/next. That
one hides it even more specifically, by putting it in the headers of
the commit, but that's just a random implementation detail.

But I really think that "internal signatures" that actually affect the
SHA1 of the object and its history have fundamental design problems.
They may not be "insurmountably bad", but they are definitely real.

Linus
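The claim that a conversion tool already has what it needs to fix commit ids in descriptions can be sketched as follows. This is a minimal illustration, assuming the tool holds a complete old-id to new-id table; the function name `rewrite_ids`, the 7-character minimum and the choice to substitute the full new id are assumptions, not anything an existing tool is known to do.

```python
import re

# Hex runs of 7 to 40 characters are candidates for (abbreviated) old ids.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{7,40}\b")


def rewrite_ids(message, old_to_new):
    """Replace hex strings that uniquely abbreviate a known old id.

    Strings that match no old id, or more than one, are left untouched.
    """
    def replace(match):
        text = match.group(0)
        prefix = text.lower()
        hits = [old for old in old_to_new if old.startswith(prefix)]
        if len(hits) != 1:
            return text
        # Assumption: substitute the full new id rather than an abbreviation.
        return old_to_new[hits[0]]

    return HEX_RUN.sub(replace, message)


# Made-up ids, for illustration only.
OLD = "1234567890abcdef1234567890abcdef12345678"
NEW = "ab" * 32
msg = "This fixes a regression introduced in commit 1234567; oops at 00000000deadbeef"
print(rewrite_ids(msg, {OLD: NEW}))
```

The lookup is keyed only on ids the converter actually translated, so unrelated hex strings are changed only if they happen to abbreviate a real commit id.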
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Nov 3, 2011 at 11:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Ahh, sorry for the noise. I realize that we already have a winner, namely,
>> the proposal outlined in your message I was responding to.
>
> No, no, don't consider my "put in the merge message" a winner at all.
>
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
>
>  - internal signatures really *are* a disaster for maintenance. You
>    can never fix them if they need fixing (and "need fixing" may well be
>    "you want to re-sign things after a repository format change")
>
>  - they are ugly as heck, and you really don't want to see them in
>    99.999% of all cases.
>
> So putting those things in the merge commit message may have some
> upsides, but it has tons of downsides too.
>
> I think your refs/audit/ idea should be given real thought, because
> maybe that's the right idea.

While I agree that re-signing is a problem, I do not see it as a huge
issue. In your "SHA-1 to SHA-256 transition" scenario, the conversion is a
flag day event in the hopefully fairly distant (in the git timescale)
future, and I am reasonably sure that by that time we would already have
infrastructure updates necessary to support a huge number of refs,
including the "lazily scan only the refs necessary" and the "some refs are
optional in advertisement" topics that are useful for other purposes.

In the worst case, even if we used your "merge commit records the merged
tag as the record of requested pull" design today, we could choose not to
rewrite these in-merge-commit signatures when the conversion becomes
necessary. Instead, the conversion procedure can prepare a mapping table
between the old SHA-1 and the rewritten SHA-256, and contributors can
prepare detached signatures for the mappings of their own commits after
verifying that the conversion produced what they are happy with. And then
we store concatenation of these detached signatures in a blob to help
future third party auditors to audit these (by-then) historical commits.

About the ugliness of the merge commit log messages, you have already
learned to ignore them with "log --no-merges" ;-) and the material the
patch series I sent out adds are at the end, so "/^commit.*$" in less
would hopefully work well enough in "log --no-merges" as well.

Because the refs/audit/ approach requires too much infrastructure we still
do not have today, and workflow elements are not fully worked out
(e.g. propagating audit trails fully from sub-sub-sub-...-lieutenants
upwards is tricky as I outlined in the other message), I think we should
start from a design that we can see how it would work now.

With the posted series, the workflow would become something like this:

    contributor$ work work work
    contributor$ git tag -s -m 'Signed pull

    This series is to allow the integrator to pull from contributors
    by specifying a signed tag, not the tip of the branch, and verify
    the authenticity of the series while merging' for-linus
    contributor$ git push public for-linus
    contributor$ git request-pull origin \
            $(git config remote.public.url) for-linus >msg
    contributor$ edit msg
    contributor$ mail torvalds@...

    integrator$ mail ;# read the pull request
    integrator$ git pull git://github.com/contributor/linux.git for-linus
    ... editor opens with the usual merge message, but with
    ... the contents of the tag and the "GPG verify" result at
    ... the end.

It might make sense to also teach the "git tag" part to somehow use the
branch description of the tip of the branch being tagged to prime the tag
message.
On Fri, Nov 4, 2011 at 11:36 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> About the ugliness of the merge commit log messages, you have already
> learned to ignore them with "log --no-merges" ;-)

Absolutely not.

I look at merges all the time. I never use "--no-merges" except when
I'm doing certain statistics (ie "How many real changes do we have")
or when I do release files.

But I actually think it's important that people write *good* merge
messages. I've berated some people for it when they just have

    Merge branch 'origin'

in their commit message, because I think a merge commit should say why
it happened or what it brought in.

> and the material the
> patch series I sent out adds are at the end, so "/^commit.*$" in less
> would hopefully work well enough in "log --no-merges" as well.

I agree that being at the end helps, but I do a lot of "git log
ORIG_HEAD.." etc, and I don't do a lot of "/^commit" searching. The
"/commit" thing I do tends to be because I do "git log -p" to see
patches, but at the same time am not going to read through
everything..

So I'd really like some way to not see it.

Ted suggested a NUL character in the commit message in front of the
"hidden content". What do you think?

Linus
Linus Torvalds <torvalds@linux-foundation.org> writes:

> So I'd really like some way to not see it.
>
> Ted suggested a NUL character in the commit message in front of the
> "hidden content". What do you think?

You do not have to resort to NUL; we could just stuff whatever you do not
need to see but needs to be left *intact* in the new header fields just
like the embedded GPG signatures are stored in signed commits.

By the time the integrator is presented the merge commit template, we
would have:

 1. The merge title (e.g. "Merge tag for-linus of git://.../rusty.git/");

 2. Payload of the signed tag (or just "annotated tag"), which is used to
    convey meaningful topic description from the lieutenant;

 3. The signature in the tag, if the tag is not just merely annotated, but
    is signed;

 4. The output from GPG verification of the above (only when 3. is
    available); and

 5. The traditional "merge summary", if merge.log is enabled.

The 10-patch series I sent earlier appends 2 and 3 with "tag:" prefix and
4 with "# " prefix in the commit log template, but it does not have to be
that way.

We could arrange things so that we put only 1, 2, 4 (still with "# "
prefix because this is meant to help you verify the authenticity, not for
later third-party audit, and to be stripped away with stripspace before
the commit is made) and 5 in the commit log template, and the original
signed tag contents (only when the tag is signed, not merely annotated) in
a separate file MERGE_SIG in $GIT_DIR/ next to MERGE_MSG, and teach "git
commit" to pick it up and stuff it in a new header field.

That way, the integrator can use the message 2 for the commit log message
and is free to typofix it, without breaking later third-party audit which
would use what is taken literally from the signed tag and stored in the
new header field, because the integrator's editor would never touch the
latter.
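The "# " lines in the template are shown to the integrator but removed before the commit is made. A simplified stand-in for that stripping step is sketched below; it is not git's implementation, and the function name `strip_template` and the sample template text (including the placeholder verification lines) are assumptions for illustration.

```python
def strip_template(text, comment_char="#"):
    """Drop comment lines, trailing whitespace and surplus blank lines."""
    kept = []
    for line in text.splitlines():
        line = line.rstrip()
        if line.startswith(comment_char):
            continue  # verification output is for the integrator's eyes only
        if line == "" and (not kept or kept[-1] == ""):
            continue  # collapse runs of blank lines, drop leading ones
        kept.append(line)
    while kept and kept[-1] == "":
        kept.pop()  # drop trailing blank lines
    return "\n".join(kept) + "\n" if kept else ""


# Hypothetical template contents; the "# " lines stand in for item 4.
template = """Merge tag 'for-linus' of git://example.invalid/linux

Topic description taken from the tag message.

# verification output would be shown here
# (placeholder text, not real gpg output)
"""

print(strip_template(template), end="")
```

Because the signed tag contents would travel in a separate header field in this design, editing or stripping the visible message cannot invalidate what an auditor later checks.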
On Sat, Nov 5, 2011 at 4:49 PM, Junio C Hamano <junio@pobox.com> wrote:
>
> You do not have to resort to NUL; we could just stuff whatever you do not
> need to see but needs to be left *intact* in the new header fields just
> like the embedded GPG signatures are stored in signed commits.

Agreed,

 [ details removed ]

that sounds perfect. And makes it easy to get at if you want to with
just "git cat-file commit" - without ever really being visible to
people who don't care.

And having it visible in the editor with '#' means that the user who
does the merge gets to see what actually ended up being put in there,
along with the fact that yes, it verified correctly.

So I think I really like that approach - it seems to solve all problems.

Linus
On Fri, 04 Nov 2011 08:14:52 PDT, Linus Torvalds said: > On Fri, Nov 4, 2011 at 7:59 AM, Ted Ts'o <tytso@mit.edu> wrote: > > Note that a repository format change will break a bunch of other > > things as well, including references in commit descriptions ("This > > fixes a regression introduced in commit 42DEADBEEF") > No they won't. Not if you do it right. It's easy enough to > automatically replace the SHA1's in the description, the same way we > replace everything else. OK.. I'll bite. How do you disambiguate a '42deadbeef' in the changelog part of a commit as being a commit ID, as opposed to being an address in a traceback or something similar? Yes, I know you only change the ones that actually map to a commit ID, but I'd not be surprised if by now we've got enough commits and stack tracebacks in the git history that we'll birthday-paradox ourselves into a false-positive in an automatic replacement. (And it's OK to say "the 3 stack tracebacks in changelogs we just mangled can just go jump", but it does need at least a few seconds consideration..)
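The birthday-style worry above can be put into rough numbers. The sketch below assumes commit ids behave like uniformly random hex strings and that each unrelated hex string in a changelog is an independent draw; the function name and the sample inputs are placeholders, not measurements of any real repository.

```python
def expected_false_matches(n_commits, n_candidates, hex_len):
    """Expected number of unrelated hex strings of length hex_len that
    happen to abbreviate at least one of n_commits commit ids."""
    space = 16 ** hex_len
    # Probability that one random string matches some commit's prefix.
    p_single = 1.0 - (1.0 - 1.0 / space) ** n_commits
    return n_candidates * p_single


# Placeholder inputs, chosen only to show how the estimate scales
# with the length of the hex string.
for hex_len in (7, 8, 12):
    print(hex_len, expected_false_matches(250_000, 1_000, hex_len))
```

Each extra hex digit shrinks the estimate by roughly a factor of 16, so short strings are where an automatic replacement would most plausibly misfire.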
On Sun, Nov 6, 2011 at 11:52 PM, <Valdis.Kletnieks@vt.edu> wrote:
>
> OK.. I'll bite. How do you disambiguate a '42deadbeef' in the changelog part
> of a commit as being a commit ID, as opposed to being an address in a traceback
> or something similar? Yes, I know you only change the ones that actually map to
> a commit ID, but I'd not be surprised if by now we've got enough commits and
> stack tracebacks in the git history that we'll birthday-paradox ourselves into
> a false-positive in an automatic replacement.

I don't think we are quite there yet. And (sadly) most of the commit
ID's in the history are 7 hex characters, because that used to be the
default git abbreviation. So there is unlikely to be any real
conflicts. If we do miss one or two, that will be sad and
embarrassing, but is not a real problem in practice.

We probably could add various heuristics (the SHA1 values are *often*
preceded by the string "commit"), and a really good import would also
have somebody at least visually inspecting ones that other heuristics
say might be debatable (for example - because they have 8 hex digits
and there are other numbers around them that were *not* converted),
but in the end perfection is the enemy of good. It's not really worth
the headache to worry about *all* the cases, if you can cheaply and
simply get 99+% right.

And I think the 99% is almost trivial. While the last 1% may or may
not be worth worrying about.

Linus
Linus Torvalds <torvalds@linux-foundation.org> writes: > No, no, don't consider my "put in the merge message" a winner at all. > > I personally dislike it, and don't really think it's a wonderful thing > at all. It really does have real downsides: > > - internal signatures really *are* a disaster for maintenance. You > can never fix them if they need fixing (and "need fixing" may well be > "you want to re-sign things after a repository format change") > > - they are ugly as heck, and you really don't want to see them in > 99.999% of all cases. > > So putting those things in the merge commit message may have some > upsides, but it has tons of downsides too. > > I think your refs/audit/ idea should be given real thought, because > maybe that's the right idea. With the latest round of touch-ups, modulo a few bugs I will be fixing before the 1.7.8 final, I think what we have is more or less OK in the shorter term and should be ready for general consumption. The ugliness is gone, but the issue around internal signatures may remain to be solved in the longer term. At least, by storing the full contents of the tag today in an extended header, when we figure out how a detached signature should really work, we could convert by extracting them from the history. In a separate message earlier in the thread, you raised another issue. > I hate how anonymous our branches are. Sure, we can use good names for > them, but it was a mistake to think we should describe the repository > (for gitweb), rather than the branch. > > Ok, "hate" is a strong word. I don't "hate" it. I don't even think > it's a major design issue. But I do think that it would have been > nicer if we had had some branch description model. At first glance, our branch model is indeed peculiar in that a branch does not have a global identity. The scope of its name is local to the repository, and it is just a pointer into the history. 
A "note" [*1*] that can annotate a commit long after the commit is made is not a good way to describe what a branch is about, because the tip of the branch can advance beyond the commit that is annotated by such a note. A commit on a branch does not serve as a good anchoring point to describe the branch. However, a commit that merges the history of a branch, whether the merged branch is from a local repository or from a remote one, does serve as a good anchoring point. The work on a branch is finished as complete as possible at the time of the merge, and the committer who merges the branch agrees with both the objective and the implementation of the work done on the branch, and that is why the merge is made [*2*]. Describing what the history of the side branch was about in the resulting merge is a perfectly sensible way to explain the branch. So in that sense, I am very happy with the way the merge message template uses the pull request tag to let the lieutenant explain and defend the history behind the tag used for the pull request. Such an explanation does not have to be keyed with anybody's local branch name (e.g. "for-linus" would mean different things for different pull requests even from the same person), but keying it with the resulting merge commit is a sensible way to leave the record in the history. After justifying with the above two paragraphs that it is perfectly sensible to record the annotations on commits and not on "branch names", I do agree that we would eventually want to be able to have such annotations on commits after the fact. Neither "tags" nor "notes" is necessarily a very good mechanism, however, for the purpose of "signed pull requests" and "signed commits" [*3*]. Here are some pros and cons: - tags must be named, but the only thing we need is to be able to look the contents (with signature if signed) up given a commit object. 
Unlike the usual "I want to check out v3.0 release" look-up that goes from tag names to the commits, annotation look-ups go the other way, do not have to have a tagname, and having a tagname does not help our look-up in any way. If we want to use tags to annotate various commits by various people and keep them around, we would need a global namespace that would not cause them to clash (we can work this around by using the object name of the tag, e.g. renaming 'for-linus' tag to $(git rev-parse tags/for-linus), but that is merely a workaround of having to name things that do not have to be named in the first place). As a local storage machinery for annotations, tags hanging below refs/tags/ (or refs/audit for that matter) hierarchy with their own names is an inappropriate model. + tags can auto-follow the commits when object transfer happens (at least in the fetch direction), and for the purpose of "signed pull requests" and "signed commits", this is a desirable property. When a repository gains a commit, the annotations attached to the commit that are missing from the receiving repository are automatically transferred from the place the commit comes from. Annotations given to other commits that are not transferred into the repository do not come to the repository. - "git notes" is represented as a commit that records a tree that holds the entire mapping from commit to its annotations, and the only way to transfer it is to send it together with its history as a whole. It does not have the nice auto-following property that transfers only the relevant annotations. + "git notes" maps commits to their annotations in the right direction; the object name of an annotated object to its annotation. 
In the longer term, I think we would need to extend the system in the following way: - Introduce a mapping mechanism that can be locally used to map names of the objects being annotated to names of other objects (most likely blobs but there is nothing that fundamentally prevents you from annotating a commit with a tree). The current "git notes" might be a perfectly suitable representation of this, or it may turn out to be lacking (I haven't thought things through), but the important point is that this "mapping store" is _local_. fsck, repack and prune need to be told that objects that store the annotation are reachable from the annotated objects. - Introduce a protocol extension to transfer this mapping information for objects being transferred in an efficient way. When "rev-list --objects have..want" tells us that the receiving end (in either fetch/push direction) would have an object at the end of the primary transfer (note that I did not say "an object will be sent in this transfer transaction"; "have" does not come into the picture), we make sure that missing annotations attached to the object are also transferred, and the new mapping is registered at the receiving end. The detailed design for the latter needs more thought. The auto-following of tags works even if nothing is being fetched in the primary transfer (i.e. "git fetch" && "git fetch" back to back to update our origin/master with the master at the origin) when a new tag is added to an ancient part of the history that leads to the master at the origin, but this is exactly because the sending end advertises all the available tags and the objects they point at so that we can tell what new tags added to an old object are missing from the receiving end. This obviously would not scale well when we have tens of thousands of objects to annotate. Perhaps an entry in the "mapping store" would record: - The object name of the object being annotated; - The object name of the annotation; - The "timestamp", i.e. 
when the association between the above two was made--this can be local to the repository and a simple counter would do. and also maintain the last "timestamp" this repository sent annotations to the remote (one timestamp per remote repository). When we push, we would send annotations pertaining to the objects reachable from what we are pushing (not limited by what they already have, as the whole point of this exercise is to allow us to transfer annotations added to an object long after the object was created and sent to the remote) that are newer than that "timestamp". Similarly, when fetching, we would send the "timestamp" this repository last fetched annotations from the other end (which means we would need one such "timestamp" per remote repository) and let the remote side decide the set of new annotations they added since we last synced that are on objects reachable from what we "want". Or something like that. [Footnote] *1* By this word, I do not necessarily mean what the "git notes" command manipulates. A tag that points at a commit is also equally a good vehicle to annotate a commit after the fact. *2* For this reason, it may make sense to "commit -S" such a merge commit. The "mergetag" asserts the authenticity of the pull request from the lieutenant whose history is being integrated, and the "gpgsig" asserts the authenticity of the merge itself--the fact that it was made by the integrator. *3* I do not mean what "git commit -S" parked in 'pu' produces, which is to store the signature in the commit. Adding "Signed-off-by:" after the fact to an existing commit by many people is a more appropriate example.
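[Editorial sketch] The bookkeeping proposed in the message above (a repository-local counter used as the "timestamp", plus one last-sent value per remote) can be modelled in a few lines. Everything here is hypothetical and exists only to illustrate the idea: the class `MappingStore`, its method names, and the use of plain strings for object names are assumptions, and the push side is the only direction shown.

```python
from dataclasses import dataclass, field


@dataclass
class MappingStore:
    """Toy model of the local "mapping store" with counter timestamps."""
    counter: int = 0
    # Each entry is (stamp, annotated object name, annotation object name).
    entries: list = field(default_factory=list)
    # Last stamp whose annotations were sent, keyed by remote name.
    last_sent: dict = field(default_factory=dict)

    def annotate(self, annotated, annotation):
        """Record a new association and stamp it with the next counter value."""
        self.counter += 1
        self.entries.append((self.counter, annotated, annotation))

    def to_push(self, remote, reachable):
        """Return annotations newer than the last push to `remote` that sit
        on objects reachable from what is being pushed, then advance the
        per-remote stamp."""
        since = self.last_sent.get(remote, 0)
        batch = [(obj, note) for stamp, obj, note in self.entries
                 if stamp > since and obj in reachable]
        self.last_sent[remote] = self.counter
        return batch
```

Note that advancing the per-remote stamp unconditionally means an annotation on an object outside `reachable` would be skipped by later pushes to the same remote. The sketch does not try to solve that; it is one of the details a real design would have to settle.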
On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote: > - "git notes" is represented as a commit that records a tree that holds > the entire mapping from commit to its annotations, and the only way to > transfer it is to send it together with its history as a whole. It > does not have the nice auto-following property that transfers only the > relevant annotations. True. However, consider these mitigating factors: - The annotations in question (the "signing" of commits) are all intended to be merged eventually (i.e. there is no reason for a developer to (after the fact) sign a commit that will never end up in the public record). Therefore, most or all of the notes in the notes tree are already relevant, or will become relevant in the near future (when the associated commits are merged). - Additionally, you could organize these notes into two (or more) notes trees, one for merged/official annotations, and one for unmerged/pending annotations. Then make the relevant tools (e.g. "git merge") transfer notes from one tree to the other, thereby making sure that the "official" record only contains notes that are relevant to the merged history. - Finally, there's always "git notes prune" to purge annotations for commits that ended up never being merged. My point is that although "notes" might end up transferring more annotations than strictly necessary, I believe that in practice all the notes being transferred are already (or will soon become) relevant. > + "git notes" maps the commits to its annotations in the right direction; > the object name of an annotated object to its annotation. > > In the longer term, I think we would need to extend the system in the > following way: > > - Introduce a mapping mechanism that can be locally used to map names of > the objects being annotated to names of other objects (most likely > blobs but there is nothing that fundamentally prevents you from > annotating a commit with a tree). 
The current "git notes" might be a > perfectly suitable representation of this, or it may turn out to be > lacking (I haven't thought things through), but the important point is > that this "mapping store" is _local_. fsck, repack and prune need to be > told that objects that store the annotation are reachable from the > annotated objects. IMHO this is precisely what "git notes" does today. > - Introduce a protocol extension to transfer this mapping information for > objects being transferred in an efficient way. When "rev-list --objects > have..want" tells us that the receiving end (in either fetch/push > direction) would have an object at the end of the primary transfer > (note that I did not say "an object will be sent in this transfer > transaction"; "have" does not come into the picture), we make sure that > missing annotations attached to the object is also transferred, and new > mapping is registered at the receiving end. > > The detailed design for the latter needs more thought. The auto-following > of tags works even if nothing is being fetched in the primary transfer > (i.e. "git fetch" && "git fetch" back to back to update our origin/master > with the master at the origin) when a new tag is added to ancient part of > the history that leads to the master at the origin, but this is exactly > because the sending end advertises all the available tags and the objects > they point at so that we can tell what new tags added to an old object is > missing from the receiving end. This obviously would not scale well when > we have tens of thousands of objects to annotate. Perhaps an entry in the > "mapping store" would record: > > - The object name of the object being annotated; > > - The object name of the annotation; > > - The "timestamp", i.e. when the association between the above two was > made--this can be local to the repository and a simple counter would > do. 
> > and also maintain the last "timestamp" this repository sent annotations to > the remote (one timestamp per remote repository). When we push, we would > send annotations pertaining to the object reachable from what we are > pushing (not limited by what they already have, as the whole point of this > exercise is to allow us to transfer annotations added to an object long > after the object was created and sent to the remote) that is newer than > that "timestamp". Similarly, when fetching, we would send the "timestamp" > this repository last fetched annotations from the other end (which means > we would need one such "timestamp" per remote repository) and let the > remote side decide the set of new annotations they added since we last > synched that are on objects reachable from what we "want". > > Or something like that. You would also have to keep track of deleted annotations, to enable the local side to delete an annotation corresponding to an already-deleted annotation on the remote side. Pretty soon, you end up having to record something similar to a DAG, describing the history of manipulating these annotations. At that point, your "timestamp" calculation starts to look very similar to the "have..want" calculation already done when transferring "regular" refs. At which point you have a system that is very similar to what "git notes" does today... ...Johan
On Wed, 2011-11-02 at 21:13 -0700, Linus Torvalds wrote: > No, my main objection to saving the data is that it's ugly and it's > redundant. Sure, in practice you can check the signatures later fine > (with the rare exceptions you mention), but even when you can do it, > what's the big upside? Another objection (although it may not be insurmountable) is that it's not necessarily *entirely* clear what's being signed. In the simple case where I clone your tree, make a few commits with my Signed-off-by:, sign a tag and then ask you to pull, that's easy enough. I'm vouching for what I committed, and not for everything that was in your tree beforehand. But what if I'm working on top of someone else's published git tree? Does a signed tag at the top of *my* work imply that I'm vouching for all of theirs too? In the case where the signature is ephemeral and only used for you to trust my pull request, the answer is simple: If that other work wasn't in your tree yet at the time I send my pull request, I'd damn well better be vouching for it when I ask you to pull it. Nothing new there. But if we're keeping signatures around for auditing purposes, we'd better have a coherent answer to that question. One that isn't "a signature covers everything since the last commit with torvalds@ as the committer", if we want it to be useful for the general case.
On Tue, 2011-11-01 at 14:21 -0700, Linus Torvalds wrote: > I hate how anonymous our branches are. Sure, we can use good names for > them, but it was a mistake to think we should describe the repository > (for gitweb), rather than the branch. > > Ok, "hate" is a strong word. I don't "hate" it. I don't even think > it's a major design issue. But I do think that it would have been > nicer if we had had some branch description model. I actually quite like it. I take it as a hint: if the contents of a branch are *so* wildly different from the main repository that they need a different description, perhaps I should be using a separate repository instead of just a branch.
Johan Herland <johan@herland.net> writes: > On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote: >> - "git notes" is represented as a commit that records a tree that holds >> the entire mapping from commit to its annotations, and the only way to >> transfer it is to send it together with its history as a whole. It >> does not have the nice auto-following property that transfers only the >> relevant annotations. > > True. However, consider these mitigating factors: > ... > > My point is that although "notes" might end up transferring more > annotations than strictly necessary, I believe that in practice all the > notes being transferred are already (or will soon become) relevant. Sorry, but I do not think you are considering what would happen when you have many branches with different purposes, whose commits near tips will never get merged with each other. "automatic following" semantics like what "git fetch" does for signed tags is absolutely necessary in such a case, and the above are not mitigating factors at all in that context.
On 11-11-10 08:51 AM, David Woodhouse wrote: > On Wed, 2011-11-02 at 21:13 -0700, Linus Torvalds wrote: >> No, my main objection to saving the data is that it's ugly and it's >> redundant. Sure, in practice you can check the signatures later fine >> (with the rare exceptions you mention), but even when you can do it, >> what's the big upside? > > Another objection (although it may not be insurmountable) is that it's > not necessarily *entirely* clear what's being signed. I think this is a non-issue as far as the implementation is concerned. That is, the question exists regardless of what actual bits get (hashed and) encrypted by a private key. Furthermore, the answer will depend on who's using the signatures and in what context, and it's not appropriate for the git tool to make assumptions about those things. > In the simple case where I clone your tree, make a few commits with my > Signed-off-by:, sign a tag and then ask you to pull, that's easy enough. > I'm vouching for what I committed, and not for everything that was in > your tree beforehand. > > But what if I'm working on top of someone else's published git tree? > Does a signed tag at the top of *my* work imply that I'm vouching for > all of theirs too? <philosophy> It all depends on what you mean by "vouch for". You obviously thought that the 3rd-party repo was good for something, otherwise why did you base your work on it in the first place? So maybe you're just vouching for the 3rd-party repo being good enough for what you're trying to do. Or, maybe you've done a thorough analysis of the 3rd-party code and are ready to certify it as completely memory-leak-free or something. Or or, maybe you're only making a statement about the commits that you've authored yourself. (You probably want to individually sign each of those commits in this case.) These sorts of issues have been debated on PKI mailing lists ad nauseam. 
I think the best approach is that if you want your signature to have a particular meaning, then put that into some text that's part of what's being signed. Let other humans read that text and make their own decisions. </philosophy> And whatever the case, the software that makes and validates the signatures shouldn't make any assertions about how to interpret good or bad signatures. (Yes, other software could interpret meanings according to some criteria, and that software could exist alongside or be incorporated into the basic digital signature software, but the interpretation software is doing a different job.) M.
On Thu, Nov 10, 2011 at 16:15, Junio C Hamano <junio@pobox.com> wrote: > Johan Herland <johan@herland.net> writes: >> On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote: >>> - "git notes" is represented as a commit that records a tree that holds >>> the entire mapping from commit to its annotations, and the only way to >>> transfer it is to send it together with its history as a whole. It >>> does not have the nice auto-following property that transfers only the >>> relevant annotations. >> >> True. However, consider these mitigating factors: >> ... >> >> My point is that although "notes" might end up transferring more >> annotations than strictly necessary, I believe that in practice all the >> notes being transferred are already (or will soon become) relevant. > > Sorry, but I do not think you are considering what would happen when you > have many branches with different purposes, whose commits near tips will > never get merged with each other. "automatic following" semantics like > what "git fetch" does for signed tags is absolutely necessary in such a > case, and the above are not mitigating factors at all in that context. What about having one notes ref per branch? If/when the branch is merged, the associated notes ref containing the annotations for the commits on that branch would be merged as well (using "git notes merge"). Sure, using one notes ref per branch is more expensive than a single notes ref, but it's still cheaper than one ref per signed commit (which is what we get when using annotated tags). And it prevents the added code and complexity of the timestamped mapping approach. ...Johan
Johan Herland <johan@herland.net> writes: > What about having one notes ref per branch? If/when the branch is merged, > the associated notes ref containing the annotations for the commits on that > branch would be merged as well (using "git notes merge"). That is a crude workaround that you could (with help from users) make work, but it does not change the fact that the current mechanism to transfer and integrate notes across repositories is a bad match for what the "signed commit" type annotations want to achieve. In fact, the need for such a workaround is an illustration of how bad a match the mechanism is. When you merge a history that has commit A into another history that did not have that commit, the act of creating a merge commit itself should be enough to make the resulting history contain that commit. The commit DAG already expresses it, and if a parallel "notes" mechanism needs to be futzed with to match that DAG, and a command like "merge" needs to be told to help that process, that is a shortcoming of the "notes" mechanism.
On Thu, Nov 10, 2011 at 18:18, Junio C Hamano <junio@pobox.com> wrote: > Johan Herland <johan@herland.net> writes: > >> What about having one notes ref per branch? If/when the branch is merged, >> the associated notes ref containing the annotations for the commits on that >> branch would be merged as well (using "git notes merge"). > > That is a crude workaround that you could (with help from users) make it > work, but it does not change the fact that the current mechanism to > transfer and integrate notes across repositories is a bad match for what > the "signed commit" type annotations wants to achieve. In fact, the need > for such a workaround is an illustration of how bad a match the mechanism > is. > > When you merge a history that has commit A into another history that did > not have that commit, the act of creating a merge commit itself should be > enough to make the resulting history to contain that commit. The commit > DAG already expresses it, and if a parallel "notes" mechanism needs to be > futzed with to match that DAG, and command like "merge" needs to be told > to help that process, that is a shortcoming of the "notes" mechanism. [ ...and from elsewhere in this thread: ] > Note that in this thread, I am not saying that "git notes" mechanism is > not good for anything. A tree whose node names encode an object name is a > valid way to store the mapping from that object to a set of other objects, > and we already agreed that as the "local" storage mechanism, "git notes" > may be used as-is for the purpose of this thread. > > But the transfer and merge semantics "git notes" mechanism offers treats > the entire "notes" that appear in _one_ repository and merging that set to > the entire "notes" in another repository and it is not a good match for > the purpose of this thread. Ok. Point taken. 
Given that we need an alternative way to transfer annotations between repos (using auto-follow to select the relevant set of annotations, and then transferring only those annotations): Can we leverage existing functionality in "notes" where useful (e.g. using existing notes merge strategies to deal with colliding annotations), while at the same time extending the current "notes" feature with this alternative transfer mechanism? FWIW, I expect there are other "notes" use cases that would also prefer the auto-follow only-relevant transfer behavior. So, how can we use "notes" to better support the transfer semantics you suggest? The mapping from the object being annotated to the annotation object is already contained in the notes tree, but the "timestamp" you describe (needed to efficiently calculate the set of annotations to auto-follow) is not [1]. However, we could easily enough add a sorted list of (timestamp, annotated object name) pairs, to allow fast lookup of annotations created after a given timestamp. We could even store this list in a blob or tree object referenced directly from the notes tree [2]. Have fun! :) ...Johan [1]: Although I did at some point experiment with using timestamps in the internal organization of the notes tree (see for example http://article.gmane.org/gmane.comp.version-control.git/127966 ), I ended up using only the annotated object name (with flexible fanout). I don't think that reintroducing timestamps in the notes tree organization will pay off, because we need both lookup by annotated SHA1 and lookup by newer-than-given-timestamp to be fast, and there's AFAIK no way to get both from a single notes tree organization. [2]: E.g. accessible with "git cat-file -p refs/notes/foo:timestamps". When a notes tree contains an entry that is obviously not an object name (SHA1), the notes code will leave it alone/untouched in the tree (see "struct non_note" and associated code in notes.c for further details).
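[Editorial sketch] The "sorted list of (timestamp, annotated object name) pairs" suggested above lends itself to a binary search. The following is only an illustration of that lookup, assuming integer counter timestamps as floated earlier in the thread; the function names `add_entry` and `annotated_since` are made up for the example and say nothing about how the list would be stored inside a notes tree.

```python
import bisect


def add_entry(index, stamp, name):
    """Insert a (timestamp, annotated object name) pair, keeping `index`
    sorted by timestamp and then by name."""
    bisect.insort(index, (stamp, name))


def annotated_since(index, stamp):
    """Return the names of objects annotated strictly after `stamp`.

    Assumes integer timestamps: the one-element tuple (stamp + 1,) sorts
    before every pair whose timestamp is stamp + 1, so bisect_left lands
    on the first entry newer than `stamp` in O(log n) comparisons.
    """
    start = bisect.bisect_left(index, (stamp + 1,))
    return [name for _, name in index[start:]]
```

The lookup by annotated object name would still go through the existing notes tree; this list only answers the newer-than-a-given-timestamp question.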
Johan Herland <johan@herland.net> writes: > Given that we need an alternative way to transfer annotations between > repos (using auto-follow to select the relevant set of annotations, and > then transferring only those annotations): Can we leverage existing > functionality in "notes" where useful (e.g. using existing notes merge > strategies to deal with colliding annotations), while at the same time > extending the current "notes" feature with this alternative transfer > mechanism? FWIW, I expect there are other "notes" use cases that > would also prefer the auto-follow only-relevant transfer behavior. > > So, how can we use "notes" to better support the transfer semantics you > suggest? The mapping from the object being annotated to the annotation > object is already contained in the notes tree, but the "timestamp" you > describe (needed to efficiently calculate the set of annotations to > auto-follow) is not [1]. Please do not take the "timestamp" part too seriously. I am starting to think that what we want in this context actually is very close to annotated tags. I said we want a mapping from an annotated object to "a set of other objects" that annotate it, but it was an unnecessary and premature generalization. There is no reason that these annotations have to be structured "Git" objects such as blobs and trees. A set of annotated tags that have the same value on their "object" field is a perfect match for "a set of annotations attached to a given object". We already know that using the real tags has its own problems coming from having to give each and every one of them unique names somewhere in the refs hierarchy (be it refs/tags/ or refs/audit/), but imagine if we somehow had a way to: - keep these annotated tags in the object store; - keep them from getting pruned even if they are not referenced from anywhere in refs/ hierarchy; - given an object, efficiently enumerate such annotated tags that refer to the object. 
And then imagine that we are pushing history leading to a commit from one repository to another. Both repositories store these "anonymous" (that is what they are---they do not have a name in the refs/ hierarchy) tags. The two repositories can individually enumerate all these "anonymous" tags that annotate commits in the history that is being exchanged, and run a set reconciliation algorithm (e.g. [*1*]) to find out the anonymous tags that are missing from the recipient repository. Such an approach does not require any timestamp. My point is _not_ that the alternative in this message is superior to the handwaving in my other message, but that I think it may not be the best approach to think about what needs to be added to "notes" to make it applicable for the problem we are solving. Rather, I think we should design what the overall system should look like (i.e. what property the resulting system should have) and then find out what is necessary in each part of the resulting solution (i.e. the list of "somehow had a way to..." above, plus "efficient set reconciliation"). [Footnote] *1* What's the Difference? Efficient Set Reconciliation without Prior Context http://cseweb.ucsd.edu/~fuyeda/papers/sigcomm2011.pdf
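[Editorial sketch] The exchange described above boils down to: each side enumerates the anonymous tags whose target lies in the history being exchanged, and the two sets are reconciled. The sketch below uses a plain set difference in place of the efficient reconciliation algorithm from the cited paper, so it shows what is computed rather than how to compute it cheaply over a network. The names `tags_for_history` and `missing_at_recipient`, and the dict from tag object name to annotated commit, are assumptions for illustration.

```python
def tags_for_history(tag_targets, history):
    """Return the anonymous tags that annotate a commit in `history`.

    tag_targets is a hypothetical dict mapping a tag's object name to the
    name of the commit in its "object" field; history is the set of commit
    names in the history being exchanged.
    """
    return {tag for tag, commit in tag_targets.items() if commit in history}


def missing_at_recipient(sender_tags, recipient_tags, history):
    """Naive reconciliation: tags relevant to `history` that the sender has
    and the recipient lacks. No timestamps are involved."""
    wanted = tags_for_history(sender_tags, history)
    present = tags_for_history(recipient_tags, history)
    return wanted - present
```

In this toy form both sides would have to see each other's full set; the point of a real set reconciliation protocol is to find the same difference while exchanging far less data.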
 git-request-pull.sh |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/git-request-pull.sh b/git-request-pull.sh
index fc080cc5e45d..22b51930959f 100755
--- a/git-request-pull.sh
+++ b/git-request-pull.sh
@@ -20,11 +20,14 @@ GIT_PAGER=
 export GIT_PAGER
 
 patch=
+sign=
 while	case "$#" in 0) break ;; esac
 do
 	case "$1" in
 	-p)
 		patch=-p ;;
+	-s)
+		sign=-s ;;
 	--)
 		shift; break ;;
 	-*)
@@ -73,6 +76,12 @@ are available in the git repository at:' $baserev &&
 echo "  $url $branch" &&
 echo &&
 
+if test -n "$sign"
+then
+	printf "Commit $headrev\nfrom $url\n" | gpg --clearsign
+	echo
+fi &&
+
 git shortlog ^$baserev $headrev &&
 git diff -M --stat --summary $patch $merge_base..$headrev || exit
 exit $status