Doing the BK Thing, Penguin-Style This set of notes is intended mainly for kernel developers, occasional or full-time, but sysadmins and power users may find parts of it useful as well. It assumes at least a basic familiarity with CVS, both at a user level (use on the cmd line) and at a higher level (client-server model). Due to the author's background, an operation may be described in terms of CVS, or in terms of how that operation differs from CVS. This is -not- intended to be BitKeeper documentation. Always run "bk help " or in X "bk helptool " for reference documentation. BitKeeper Concepts ------------------ In the true nature of the Internet itself, BitKeeper is a distributed system. When applied to revision control, this means doing away with client-server, and changing to a parent-child model... essentially peer-to-peer. On the developer's end, this also represents a fundamental disruption in the standard workflow of changes, commits, and merges. You will need to take a few minutes to think about how to best work under BitKeeper, and re-optimize things a bit. In some sense it is a bit radical, because it might described as tossing changes out into a maelstrom and having them magically land at the right destination... but I'm getting ahead of myself. Let's start with this progression: Each BitKeeper source tree on disk is a repository unto itself. Each repository has a parent. Each repository contains a set of a changesets ("csets"). Each cset is one or more changed files, bundled together. Each tree is a repository, so all changes are checked into the local tree. When a change is checked in, all modified files are grouped into a logical unit, the changeset. Internally, BK links these changesets in a tree, representing various converging and diverging lines of development. These changesets are the bread and butter of the BK system. After the concept of changesets, the next thing you need to get used to is having multiple copies of source trees lying around. This -really- takes some getting used to, for some people. Separate source trees are the means in BitKeeper by which you delineate parallel lines of development, both minor and major. What would be branches in CVS become separate source trees, or "clones" in BitKeeper [heh, or Star Wars] terminology. Clones and changesets are the tools from which most of the power of BitKeeper is derived. As mentioned earlier, each clone has a parent, the tree used as the source when the new clone was created. In a CVS-like setup, the parent would be a remote server on the Internet, and the child is your local clone of that tree. Once you have established a common baseline between two source trees -- a common parent -- then you can merge changesets between those two trees with ease. Merging changes into a tree is called a "pull", and is analagous to 'cvs update'. A pull downloads all the changesets in the remote tree you do not have, and merges them. Sending changes in one tree to another tree is called a "push". Push sends all changes in the local tree the remote does not yet have, and merges them. From these concepts come some initial command examples: 1) bk clone -q http://linux.bkbits.net/linux-2.5 linus-2.5 Download a 2.5 stock kernel tree, naming it "linus-2.5" in the local dir. The "-q" disables listing every single file as it is downloaded. 2) bk clone -ql linus-2.5 alpha-2.5 Create a separate source tree for the Alpha AXP architecture. The "-l" uses hard links instead of copying data, since both trees are on the local disk. You can also replace the above with "bk lclone -q ..." You only clone a tree -once-. After cloning the tree lives a long time on disk, being updating by pushes and pulls. 3) cd alpha-2.5 ; bk pull http://gkernel.bkbits.net/alpha-2.5 Download changes in "alpha-2.5" repository which are not present in the local repository, and merge them into the source tree. 4) bk -r co -q Because every tree is a repository, files must be checked out before they will be in their standard places in the source tree. 5) bk vi fs/inode.c # example change... bk citool # checkin, using X tool bk push bk://gkernel@bkbits.net/alpha-2.5 # upload change Typical example of a BK sequence that would replace the analagous CVS situation, vi fs/inode.c cvs commit As this is just supposed to be a quick BK intro, for more in-depth tutorials, live working demos, and docs, see http://www.bitkeeper.com/ BK and Kernel Development Workflow ---------------------------------- Currently the latest 2.5 tree is available via "bk clone $URL" and "bk pull $URL" at http://linux.bkbits.net/linux-2.5 This should change in a few weeks to a kernel.org URL. A big part of using BitKeeper is organizing the various trees you have on your local disk, and organizing the flow of changes among those trees, and remote trees. If one were to graph the relationships between a desired BK setup, you are likely to see a few-many-few graph, like this: linux-2.5 | merge-to-linus-2.5 / | | / | | vm-hacks bugfixes filesys personal-hacks \ | | / \ | | / \ | | / testing-and-validation Since a "bk push" sends all changes not in the target tree, and since a "bk pull" receives all changes not in the source tree, you want to make sure you are only pushing specific changes to the desired tree, not all changes from "peer parent" trees. For example, pushing a change from the testing-and-validation tree would probably be a bad idea, because it will push all changes from vm-hacks, bugfixes, filesys, and personal-hacks trees into the target tree. One would typically work on only one "theme" at a time, either vm-hacks or bugfixes or filesys, keeping those changes isolated in their own tree during development, and only merge the isolated with other changes when going upstream (to Linus or other maintainers) or downstream (to your "union" trees, like testing-and-validation above). It should be noted that some of this separation is not just recommended practice, it's actually [for now] -enforced- by BitKeeper. BitKeeper requires that changesets maintain a certain order, which is the reason that "bk push" sends all local changesets the remote doesn't have. This separation may look like a lot of wasted disk space at first, but it helps when two unrelated changes may "pollute" the same area of code, or don't follow the same pace of development, or any other of the standard reasons why one creates a development branch. Small development branches (clones) will appear and disappear: -------- A --------- B --------- C --------- D ------- \ / -----short-term devel branch----- While long-term branches will parallel a tree (or trees), with period merge points. In this first example, we pull from a tree (pulls, "\") periodically, such as what occurs when tracking changes in a vendor tree, never pushing changes back up the line: -------- A --------- B --------- C --------- D ------- \ \ \ ----long-term devel branch----------------- And then a more common case in Linux kernel development, a long term branch with periodic merges back into the tree (pushes, "/"): -------- A --------- B --------- C --------- D ------- \ \ / \ ----long-term devel branch----------------- Submitting Changes to Linus --------------------------- There's a bit of an art, or style, of submitting changes to Linus. Since Linus's tree is now (you might say) fully integrated into the distributed BitKeeper system, there are several prerequisites to properly submitting a BitKeeper change. All these prereq's are just general cleanliness of BK usage, so as people become experts at BK, feel free to optimize this process further (assuming Linus agrees, of course). 0) Make sure your tree was originally cloned from the linux-2.5 tree created by Linus. If your tree does not have this as its ancestor, it is impossible to reliably exchange changesets. 1) Pay attention to your commit text. The commit message that accompanies each changeset you submit will live on forever in history, and is used by Linus to accurately summarize the changes in each pre-patch. Remember that there is no context, so "fix for new scheduler changes" would be too vague, but "fix mips64 arch for new scheduler switch_to(), TIF_xxx semantics" would be much better. You can and should use the command "bk comment -C" to update the commit text, and improve it after the fact. This is very useful for development: poor, quick descriptions during development, which get cleaned up using "bk comment" before issuing the "bk push" to submit the changes. 2) Include an Internet-available URL for Linus to pull from, such as Pull from: http://gkernel.bkbits.net/net-drivers-2.5 3) Include a summary and "diffstat -p1" of each changeset that will be downloaded, when Linus issues a "bk pull". The author auto-generates these summaries using "bk push -nl 2>&1", to obtain a listing of all the pending-to-send changesets, and their commit messages. It is important to show Linus what he will be downloading when he issues a "bk pull", to reduce the time required to sift the changes once they are downloaded to Linus's local machine. IMPORTANT NOTE: One of the features of BK is that your repository does not have to be up to date, in order for Linus to receive your changes. It is considered a courtesy to keep your repository fairly recent, to lessen any potential merge work Linus may need to do. 4) Split up your changes. Each maintainer<->Linus situation is likely to be slightly different here, so take this just as general advice. The author splits up changes according to "themes" when merging with Linus. Simultaneous pushes from local development go to special trees which exist solely to house changes "queued" for Linus. Example of the trees: net-drivers-2.5 -- on-going net driver maintenance vm-2.5 -- VM-related changes fs-2.5 -- filesystem-related changes Linus then has much more freedom for pulling changes. He could (for example) issue a "bk pull" on vm-2.5 and fs-2.5 trees, to merge their changes, but hold off net-drivers-2.5 because of a change that needs more discussion. Other maintainers may find that a single linus-pull-from tree is adequate for passing BK changesets to him. Frequently Answered Questions ----------------------------- 1) How do I change the e-mail address shown in the changelog? A. When you run "bk citool" or "bk commit", set environment variables BK_USER and BK_HOST to the desired username and host/domain name. 2) How do I use tags / get a diff between two kernel versions? A. Pass the tags Linus uses to 'bk export'. ChangeSets are in a forward-progressing order, so it's pretty easy to get a snapshot starting and ending at any two points in time. Linus puts tags on each release and pre-release, so you could use these two examples: bk export -tpatch -hdu -rv2.5.4,v2.5.5 | less # creates patch-2.5.5 essentially bk export -tpatch -du -rv2.5.5-pre1,v2.5.5 | less # changes from pre1 to final A tag is just an alias for a specific changeset... and since changesets are ordered, a tag is thus a marker for a specific point in time (or specific state of the tree).