IRC Logs:oot yeliaB Changesets IRC Log
From rPath Wiki
This has been slightly edited -- there was another conversation woven into this, which has been removed. oot also got fileid/pathid confused in one place, and his correction has been replaced by getting it right to begin with. Finally, mkj had some confusion over what was marked as config files by default which was removed.
(10:16:02) yeliaB: Are changesets the internal format for the repo, like diffs are the internal format for CVS?
(10:16:05) oot: no
(10:16:10) oot: not at all
(10:16:15) yeliaB: So they really are generated dynamically
(10:16:19) oot: yes
(10:16:27) oot: which is what cvs has to do in many cases anyway
(10:16:36) yeliaB: ok
(10:16:46) oot: the repository is basically a sql db
(10:16:51) oot: though file contents are stored somewhere else
(10:17:43) yeliaB: Does the fileid itself give you any leverage so that you can easily tell whether the client's copy of file foo has been changed from the server's copy?
(10:18:03) oot: yes, loads
(10:18:16) yeliaB: Seems you'd have to do a giant diff for each trove and all its contents otherwise
(10:18:28) oot: the fileid could optimize that, but we don't use it
(10:18:39) oot: remember we store sha1's for the files
(10:18:47) oot: and we compare those hashes to see if files changed
(10:19:11) ***oot thinks about that
(10:19:15) oot: we do both really
(10:19:18) yeliaB: ok, so the client just passes its sha1s up to the server, who can then figure things out from there...
(10:19:23) oot: no
(10:19:28) oot: there are two things going on
(10:19:36) oot: one is a changeset generated from repository(s)
(10:19:50) oot: those are generated between different versions of a trove
(10:19:58) yeliaB: right...
(10:19:59) oot: and when we do that, we do use the fileid as an optimization for that
(10:20:30) oot: the other case is generating a changeset between a trove and what's actually on the system -- a "local changeset"
(10:20:35) oot: that's a purely client side operation
(10:20:49) yeliaB: hmmm.
(10:21:12) oot: after all, all the original data for the trove is stored in the local database
(10:21:20) oot: and we assume that sha1 is a perfect hash
(10:21:44) yeliaB: ah, I'd forgotten about the local db
(10:21:48) oot: and we have piles of optimizations for that so we don't actually generate sha1's for the installed files unless it looks like they've changed
(10:21:54) oot: local changesets are important for two reasons:
(10:21:56) oot: 1. rollbacks
(10:22:08) oot: 2. users can create them by hand and share them or commit them to a repository
(10:22:27) yeliaB: right
(10:22:47) oot: (all the code to create these changesets is common fwiw; it's just used slightly differently in the two cases)
(10:22:51) ***oot thinks about that
(10:22:58) oot: well, it's not *entirely* common, but it's close
(10:23:51) yeliaB: I'm still having a bit of trouble grokking how, when I update trove foo, and it has only one file changed (call it bar), how does it work that only bar's changes (or bar itself, if binary) are pushed over the wire?
(10:24:23) oot: the client asks the repository for the changeset between trove foo version 1 and foo version 2
(10:24:24) yeliaB: seems like both sides would need complete knowledge of the other to make it work...
(10:24:32) oot: the repository looks in its database, sees which file changed, and sends that change
(10:24:41) oot: yeliaB: that would be true if we didn't preserve local changes
(10:24:55) oot: one of the features of conary is that if the sysadmin changes a file, we try to preserve that change
(10:25:15) oot: so if /etc/unchanged didn't change from version 1 to version 2, conary leaves it alone
(10:25:22) oot: if the sysadmin changed it, he probably meant to
(10:25:36) yeliaB: ok...
(10:25:51) oot: that's a *big* difference between conary and rpm btw
(10:26:08) yeliaB: yeah, that's why I'm trying to make sure I really understand it... :-)
(10:26:22) oot: when you update a trove we quite literally apply the diff
(10:27:00) oot: if there are no conflicts between local changes and that diff, both get to stay
(10:27:18) yeliaB: So in the case where bar has been changed by the sysadmin, *and* version 2 also changes bar, the client gets the changeset from the server, and then the client gets to deal with merging the changes, right?
(10:27:28) oot: right
(10:27:37) oot: which it does with diff/patch for config files
(10:27:44) oot: for non-config files it just complains loudly and stops.
(10:27:45) yeliaB: I think I was making this out to be more complex than it really is...
(10:27:51) ***oot thinks it's complex enough already!
(10:27:58) yeliaB: heh
(10:28:47) yeliaB: So you do a regular diff/patch for config files; how about non-config text files (shell scripts, for example)?
(10:28:55) oot: nope
(10:29:19) oot: that would be nice, but we don't do it because for the diff/patch stuff to work we have to store the original version of the file so we can generate the local diff
(10:29:34) yeliaB: ah -- good point...
(10:29:34) oot: we don't want to do that for loads of files
(10:29:41) yeliaB: makes sense...
(10:29:52) oot: (that's what's stored in /var/lib/conarydb/contents)
(10:30:06) oot: (those are gzipped, original config files indexed by their sha1s)
(10:30:09) ***yeliaB makes a note to dig in there a bit
(10:30:20) oot: I also doubt that doing diff/merge on those types of files is terribly useful
(10:30:59) yeliaB: well, perhaps in terms of keeping traffic to/from the repo to a minimum, but yeah, I get your point
(10:31:00) mkj: note: any shell script that it makes sense to do this on should simply be marked as a config file
(10:31:04) mkj: we do that already for many of them
(10:31:07) mkj: including tag handlers
(10:31:18) oot: yeliaB: for conarydb, try this
(10:31:18) mkj: and init scripts
(10:31:20) yeliaB: mkj, cool -- good to know...
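Editor's sketch: the update behaviour oot describes above (leave a file alone if the new version did not change it, merge config files with diff/patch, complain and stop for non-config files) can be written as a small decision function over three sha1s. This is only an illustration of the rules stated in the log, not Conary's code; the names `Action`, `sha1_of` and `decide` are hypothetical.

```python
import hashlib
from enum import Enum


class Action(Enum):
    KEEP_LOCAL = "leave the installed file alone"
    TAKE_NEW = "install the new version's file"
    MERGE = "merge local and upstream changes with diff/patch"
    CONFLICT = "complain loudly and stop"


def sha1_of(data: bytes) -> str:
    """Hex sha1 of some file contents."""
    return hashlib.sha1(data).hexdigest()


def decide(old_sha1: str, new_sha1: str, local_sha1: str, is_config: bool) -> Action:
    """Pick an action for one file when updating a trove from old to new.

    old_sha1   -- contents shipped in the installed trove version
    new_sha1   -- contents shipped in the version being updated to
    local_sha1 -- contents actually on disk (possibly edited by the sysadmin)
    """
    if new_sha1 == old_sha1:
        # The update does not touch this file, so any local edit survives.
        return Action.KEEP_LOCAL
    if local_sha1 == old_sha1:
        # No local edit, so the upstream change applies cleanly.
        return Action.TAKE_NEW
    # Both sides changed the file.
    return Action.MERGE if is_config else Action.CONFLICT
```

For example, `decide(old, new, local, is_config=False)` with three different hashes returns `Action.CONFLICT`, matching "for non-config files it just complains loudly and stops".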
(10:31:26) oot: conary q --sha1s setup:runtime
(10:31:55) yeliaB: ok
(10:31:58) oot: pick one of those sha1s -- say, 531f01224bbf41c545061947747ef62d12698686 for /etc/passwd (on my old system at least)
(10:32:07) yeliaB: yup
(10:32:17) oot: zcat /var/lib/conarydb/contents/53/1f/01224bbf41c545061947747ef62d12698686
(10:32:44) oot: if I got that right, you'll see the original passwd file
(10:32:55) oot: cool, huh?
(10:33:14) merlin262: Nifty (tm)
(10:33:15) oot: silly it's not there really
(10:33:16) oot: :-)
(10:33:18) oot: morning merlin262
(10:33:26) merlin262: Good Morning oot
(10:33:38) ***oot didn't know anyone was lurking
(10:33:52) merlin262: I'm always lurking. ;)
(10:33:59) yeliaB: pretty neat!
(10:34:17) mkj: yeliaB: actually, I lied
(10:34:17) oot: if we didn't support absolute change sets we probably could get away w/o that, but absolute changesets are basically an absolute requirement
(10:34:27) mkj: yeliaB: we don't automatically mark initscripts as config files
(10:34:42) yeliaB: mkj, thx for the clarification
(10:34:51) mkj: by default, everything in /etc/ is a config file
(10:35:01) mkj: it's taghandlers that we don't automatically mark
(10:37:16) yeliaB: So are there any tools I can use to poke around in a changeset (assuming I have a .ccs file)?
(10:37:36) oot: conary scs
(10:37:38) oot: (scs == showchangeset)
(10:37:54) oot: conary/scripts/showchangeset works as well, sometimes, but it's harder to use
(10:38:02) oot: conary scs works a lot like the query commands
(10:38:03) mkj: hmm, I wonder if we should make all taghandlers config files by default
(10:38:10) yeliaB: yeah I've done that, but it doesn't seem to have much in the way of details...
(10:38:13) mkj: that was something that we proposed before and I forgot wasn't implemented
(10:38:16) oot: things like --ls --full-versions
(10:38:20) oot: yeliaB: try that script
(10:38:25) yeliaB: oot, will do
(10:38:46) oot: yeliaB: though it is very buggy. if I need to fix something for you, holler
(10:38:52) oot: oh, and we don't ship it. you'll need a source tarball
(10:38:53) yeliaB: thx
(10:38:58) ***oot is the only one who runs that script
(10:39:12) yeliaB: one last question, and then I'll let y'all go
(10:39:16) oot: oh, and there is conary/dumpcontainer
(10:39:37) oot: if you run that on a changeset you'll see the file list for that change set, made up of a header and a bunch of [[Conary:path Ids|path Ids]] (which are md5sums)
(10:40:03) oot: it also has a number in () telling you if it's a config file, and a type which says it's a "file" or a "diff"
(10:40:31) yeliaB: oot, ah -- *that's* the level of detail I was looking for. Thx!
(10:40:36) oot: if you pass a pathid as the second parameter to dumpcontainer, it'll send the contents of that to stdout
(10:40:44) oot: but if you try and use that to display the header it won't be helpful
(10:41:03) yeliaB: ok
(10:41:40) ***oot waits for the "last" question
(10:41:55) yeliaB: So committing a changeset into a repo -- does this mean that, in addition to files and troves, I should say that a repo can also include changesets?
(10:42:01) oot: no
(10:42:11) oot: you're using the changeset to get new files and troves into the repository
(10:42:17) oot: the changeset isn't stored there
(10:42:33) oot: if you ask for that changeset back, it generates a new one
(10:43:11) yeliaB: ah -- bingo! That makes complete sense now...
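Editor's sketch: the zcat example earlier in the log implies that the contents store splits a sha1 into two 2-character directory levels followed by the remaining 36 characters. A rough Python equivalent of that lookup follows. The layout is inferred only from oot's single example path, and the helper names `contents_path` and `read_original` are hypothetical.

```python
import gzip
from pathlib import Path

# Location mentioned in the log; the directory layout below is inferred
# from the one example path oot gives, not from Conary's source.
CONTENTS_ROOT = Path("/var/lib/conarydb/contents")


def contents_path(sha1: str, root: Path = CONTENTS_ROOT) -> Path:
    """Map a 40-character hex sha1 to root/aa/bb/<remaining 36 chars>."""
    if len(sha1) != 40:
        raise ValueError("expected a 40-character hex sha1")
    return root / sha1[:2] / sha1[2:4] / sha1[4:]


def read_original(sha1: str, root: Path = CONTENTS_ROOT) -> bytes:
    """Return the gunzipped original config file, like zcat on that path."""
    with gzip.open(contents_path(sha1, root), "rb") as f:
        return f.read()
```

With the sha1 from the log, `contents_path("531f01224bbf41c545061947747ef62d12698686")` yields the same path that oot passes to zcat.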
(10:43:23) oot: :-)
(10:43:36) oot: changesets are the *only* way of moving troves and files around in conary
(10:43:52) oot: moving files and troves around is also the only thing changesets do
(10:43:58) oot: when you cook something, a changeset is generated and committed
(10:44:07) oot: when you run "verify", a changeset is created and displayed
(10:44:32) oot: "Verify" is just like "localchangeset" followed by "showchangeset" in fact (though a bit faster since it doesn't bother with gzipping file contents and such)
(10:44:43) yeliaB: So looking at the repo as a black box, it "speaks" changesets, and contains files and troves
(10:44:49) oot: right
(10:45:05) oot: changesets are an integral part of the protocol for talking with a repository
(10:45:10) yeliaB: and so the light dawns on a foggy mind... :-)
(10:45:18) oot: when you query a trove, a changeset is actually created and sent
(10:45:35) oot: (that changeset excludes stuff like file contents for performance reasons, but the file format is exactly the same)
(10:45:51) yeliaB: slick
(10:45:55) oot: :-)
(10:46:45) oot: change sets were one of the first bits of conary to get implemented, and they've gone through at least 2 major revisions
(10:48:41) yeliaB: Well, I'm all questioned out atm -- thanks for clearing this up for me!
(10:48:42) kenvandine: just read back, very interesting conversation :)
(10:48:55) kenvandine: good stuff
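Editor's sketch: oot mentions that a local changeset is built purely on the client from the sha1s recorded in the local database, and that sha1s of installed files are only computed when a file "looks like" it has changed. The log does not say which checks Conary actually uses, so the size and mtime comparison below is an assumption made for illustration, and `Recorded`, `record` and `changed_files` are hypothetical names.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Recorded:
    """What a local database might remember about an installed file."""
    size: int
    mtime: int
    sha1: str


def file_sha1(path: str) -> str:
    """Hex sha1 of a file's contents, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def record(path: str) -> Recorded:
    """Snapshot a file as it is right now (stands in for install time)."""
    st = os.stat(path)
    return Recorded(st.st_size, int(st.st_mtime), file_sha1(path))


def changed_files(recorded: Dict[str, Recorded]) -> List[str]:
    """Return paths whose contents differ from what was recorded."""
    changed = []
    for path, rec in recorded.items():
        try:
            st = os.stat(path)
        except FileNotFoundError:
            changed.append(path)  # a removed file counts as a local change
            continue
        # Assumed cheap check: same size and mtime means "looks unchanged",
        # so the sha1 is never computed for this file.
        if st.st_size == rec.size and int(st.st_mtime) == rec.mtime:
            continue
        if file_sha1(path) != rec.sha1:
            changed.append(path)
    return changed
```

A file that was merely touched fails the cheap check but still hashes to the recorded sha1, so it is not reported as changed.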
