This work is © 2008-2012 by Yuval Levy and licensed under a Creative Commons License.

Thank you SourceForge, hello Launchpad!

A tracker is the second most critical piece of infrastructure for a software project, after a source control management (SCM) tool.

The tracker is a structured communication tool, arguably more important than unstructured communication tools such as mailing lists and chats.

While the memory of chats is short-lived and that of mailing lists is buried in the archives, a tracker’s function is to keep relevant issues at the top of the priority list.  A tracker helps developers keep an overview of what to do next.

The Old Tracker

Since inception Hugin has used SourceForge’s proprietary tracker.  But the project has grown and the SourceForge tracker is no longer enough for our needs.

Our problems with the tracker are qualitative and quantitative, and ultimately a matter of resources and time.

We could and did address qualitative issues while still using the SourceForge tracker.  Changing the settings to prevent anonymous reports enabled us to interact with the reporters and help them provide more detailed feedback and complete the reports.

This also helped reduce the quantity, although problems such as duplicate or invalid reports persisted, and the growing popularity of Hugin increased the number of incoming reports.  Not a bad thing if the project has the resources to deal with the increased traffic.

In the end it was a resourcing issue.  When it takes one hour just to work through a dozen reports without taking any action outside of the tracker, it is the tool that does not work.

Over the years I made a few attempts to bring the ticket count down and to prioritize the issues.  It was an uphill struggle doomed to fail.  The number of open tickets diminished only temporarily.  More than 200 open reports, coupled with the limitations of SourceForge’s tracker interface (which include a stupid default view listing all tickets, including closed, invalid, stale and expired ones), made the tracker difficult to use.

So I looked for alternatives.  It had to be a hosted solution.  We don’t have the resources for self-hosting such critical infrastructure.  It had to enable faster processing than a dozen tickets per hour.  It had to provide a migration path on the way in, but also on the way out: I don’t want our data to be captive.

During extensive research I considered multiple tools, including Google Project Hosting (wide range of supported/integrated SCMs) and GitHub (elegant and efficient web-based interface to the tracker).  In the end I settled on Launchpad.

Welcome to Launchpad

There is so much good to be said about Launchpad.

Best of all is the support.  The people at Canonical are responsive.  You get to communicate with real people who own the issues and take pride in solving them.  Exploring other tools I was on my own.  At Canonical I was never far away from expert help.  They set up a prototype with our real data so that we could test how life with Launchpad would be before migrating.

Next is the guarantee.  Not only are there import and export tools that are kept current and documented; as an ultimate guarantee, Launchpad is Open Source, so there will always be a way to export data if necessary.

The web interface is light and well designed.  The thread of communication related to a ticket is easily readable, and the arrangement of information and actionable widgets on the screen reflects their relevance (don’t ask me to comment on SourceForge’s layout and the consumer-hostile JavaScript spread throughout their aging site).

But the real productivity booster is the email interface.  Processing a ticket becomes as easy and fast as answering an email.  No time wasted on web page latency.  Process tickets in your tried and tested environment, with fonts and colors optimized for you.

We needed a better way to manage the life cycle of a ticket and I found it.  Plenty of other goodies, such as tagging, voting on issues, transforming tickets into answers/FAQ, automated search for duplicates, and moving tickets between projects, rounded out the excellent impression of a mature tool that is still moving forward.

Change Management

Now I needed to help our community find it.  Unlike the SCM, which is a developers-only tool, the tracker has a broader constituency with disparate stakeholders, including ordinary users.  A migration can only be successful if the users adopt the tool.  I had no doubt about the superiority and user-friendliness of Launchpad (although a few users managed to discover rough edges at sign-up and when merging the imported tickets into their profiles), but would the users sign up and use it?

Introducing the change was slightly more challenging than introducing Mercurial earlier this year.  Not everybody is comfortable with change, and if I were a conspiracy theorist I would swear that some resentful users were plotting to sabotage my plan.

Obstacles

Over time I have given them plenty of reasons to be resentful.  I have little patience for a certain kind of consumer.  Just calling them consumers probably offends them.  I make no secret of the low esteem I have for “demanding consumers”: those who demand that others do things for them (for free, of course); those who start with the assumption that they are right and everything else must be wrong.  I stick obvious facts in their faces when it would be wiser to ignore them.

I have been harsh with them and I am reaping what I have sown.  In some cases I don’t care.  In other cases it’s no longer fixable.  In some cases the fix takes more effort than I want to invest, and in some cases time will heal the rifts.

I need to learn to ignore more, and to fix or reply only to what is really an impediment to moving forward.

Overcoming Obstacles

  1. Let the first barrage of negative fire go past you.  Don’t do like me, i.e. don’t react.
  2. In a second round get the key people in the project to express publicly the need for change.  In our case, that “the current tracker is inadequate for our needs”.
  3. Invite a small representative panel of stakeholders to test.  It’s like statistics: if it works for them, it will work for the population they represent at large.
  4. Get back to the community with the supporting findings of the representative panel and pull the change through.

I privately invited six people to the panel: two developers, one builder, one admin, one triager and one user, plus myself.  I carefully selected them to be representative of the community as a whole and to include people who are hostile toward me.

In a small group it is easier to ask people to actually do something rather than talk, and indeed they followed my request to spend a few minutes doing specific tasks, and a lot of intelligent comments and critique came out.  Not all comments were positive, but enough were to make me feel confident that this would work well.

Pulling it Off

Pulling it off technically was easy.  After the test run, Gavin at Canonical knew already what to expect.  The instructions are here if you want to try.  In a nutshell:

  • Export your SourceForge tracker
  • Run it through the import converter.  You will need good pipes to the internet as the converter will scrape attachments from SourceForge and enrich the XML export file.  Make sure your SourceForge tracker is world-readable when doing this.
  • Freeze/block access to the SourceForge tracker
  • Publish the enriched XML dump  on the web
  • Start a question ticket on Launchpad pointing to the file on the web
  • Wait for Canonical to process your file into staging
  • Test that everything is well on staging
  • Repeat on go-live day, with everything going to Launchpad
  • Configure your tracker on Launchpad and you’re all set

We planned to minimize the outage to the project: I produced the dump late on Sunday night (East Coast time), a few hours later Gavin started his working day in the UK, and on Monday morning I woke up to a nice surprise.

Our new bug tracker is fully operational and slowly but surely life and activity are coming into it.  The challenge ahead: triage and sort the whole backlog of bugs.  The work has just started.

Thank you Gavin.  Thank you Canonical.

 

Hacking Hugin: Preferences

Let’s face it:  the Hugin codebase has a steep learning curve that is difficult to climb.  We’ve had this feedback for years, from the 2007 Google Summer of Code students and earlier.  I’ve been around the project for a few years but I still find it difficult to get back into it after taking a break for a while.  The solution is known: documentation.  Hence one of my first contributions to the project was to drive the effort to document the build.

Documenting the code is a much larger endeavor.  Sure we have the (incomplete) doxygen documentation, a good reference, but about as captivating as reading the complete Oxford dictionary end to end.  We need a broader description of how the codebase is organized and small tutorials to get new contributors started.  I realize that I always scribbled down my little notes when doing things, but never really polished them into proper documentation.  Time to change this bad habit.  Here is a first simple tutorial: adding a preference to Hugin.

There are four steps involved:

  1. Edit the GUI
  2. Set the defaults
  3. Make the preference widget functional
  4. Use the preference

Edit the GUI

Hugin uses the wxWidgets GUI toolkit.  Layouts can be built in two ways: on the fly by the functional code, or using a declarative language, XRC, an XML derivative.  The preferred way for Hugin is to use XRC.  Panels are defined in ./src/hugin1/hugin/xrc.

There are many tools to edit XRC files.  To be honest I have not found one of them that I really like.  I edit the XRC files manually, with a simple text editor and a little bit of discipline.

Before proceeding, back up your existing XRC file.  For this technique to work, Hugin must be set to display the GUI in English.  I start Hugin and look at the GUI layout that I want to change; for the preferences, this is defined in ./src/hugin1/hugin/xrc/pref_dialog.xrc.  I find a string in the GUI that is unique enough to appear only in this context.  Then I open the panel’s XRC file and use the editor’s search functionality to zero in on the area of interest.

In the area of interest I first separate visual/logical blocks by adding temporary white space and temporary XML comments enclosed in <!-- -->.  Indentation helps too, but alone it is not enough to keep track of where I am.  Remember to clean up after editing, i.e. make sure the indentation is correct and remove the temporary white space and comments.

Usually there is already a similar widget to the one I want to add somewhere else in the code.  I look for that widget with the same technique and I copy/adapt its definition.  Widgets can have many attributes and they are surrounded by a lot of layout definitions.  These details are very important to keep a nice, polished and uniform look.  When adding a widget, especially when copying it, it is important to take care of the name attribute.  Give it a meaningful, unique name, both to avoid duplicates and to be able to refer to it from the code.  Follow the same naming convention as the other widget names: all_lower_case_underscore_separated_words.

No need to rebuild the application to preview the changes.  Just close the application, copy the edited XRC file from ./src/hugin1/hugin/xrc/ to /usr/local/share/hugin/xrc (or wherever the files are installed on your system), restart the application and preview the result.  Start the application from a terminal to see diagnostic messages.  If the XRC file is badly damaged or malformed, Hugin may crash right after the start, or when the panel is displayed.  You can always revert to the backed-up version if the edits are beyond repair (you did back up, didn’t you?) and start over again.  Making the changes in small incremental steps helps prevent getting into situations beyond repair.
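Since a malformed file or a duplicated name attribute only shows up once Hugin is restarted, a quick check before copying the file can save a round trip.  The following is a minimal Python sketch, not part of Hugin: the function name and the sample fragment are made up for illustration, and it assumes that every name attribute in the file is meant to be unique.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_widget_names(xrc_text):
    """Return the names that occur more than once in an XRC document.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed.
    """
    root = ET.fromstring(xrc_text)
    # Collect the name attribute of every element that has one,
    # whatever its tag or XML namespace.
    names = [elem.get("name") for elem in root.iter() if elem.get("name")]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical fragment for illustration, not taken from pref_dialog.xrc.
SAMPLE = """<resource>
  <object class="wxPanel" name="prefs_panel">
    <object class="wxCheckBox" name="prefs_smart_undo"/>
    <object class="wxCheckBox" name="prefs_smart_undo"/>
  </object>
</resource>"""

if __name__ == "__main__":
    print(duplicate_widget_names(SAMPLE))
```

To check a real file, read it with open(path, encoding="utf-8").read() and pass the text to the function; a ParseError means the file is not well-formed XML.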

Set the Defaults

Choose a meaningful default for your widget and add it to ./src/hugin1/hugin/config_default.h.  Name the default value ALL_CAPS_UNDERSCORE_SEPARATED_WORDS like the other default values.  You will find inspiration in the default values around you.  It is good practice to add the default value in a place that follows the logical order of preferences grouping in pref_dialog.xrc.

Make the Preferences Widget Functional

Preferences functionality is in ./src/hugin1/hugin/PreferencesDialog.cpp.  Here preferences are read/written/reset with cfg->Read() and cfg->Write(), referring to the names you defined above.  Note that strings must be wrapped in the wxT() macro, and that here you give (yet) another name to the preference, this time in CamelCaseNoSpaces.  Preferences are referred to by this name in the code.

Now is a good time to test the new preference.  Build the code, start the application, go into the Preferences menu, and edit, change, and reset the preference.  Quit and restart the application to check that the setting is really persistent.

Use the Preference

Now you can use your newly created preference everywhere in the code with wxConfigBase::Get()->Read(wxT()).  For two practical examples, see the smart undo preference and the file naming convention preference.

The two links above demonstrate another useful technique to dive into the codebase:  read commit diffs.  Commit diffs show the difference between two revisions.  Developers comment such differences in the changelog.  By knowing from the changelog what the intended result of the change is, and looking at the (usually not too many) lines of code changed, you can understand how to make similar changes.

I hope this tutorial has helped demystify Hugin’s codebase a little bit.  If you need any help, the developers read the mailing list and will be happy to help you come on board.

Hugin 2010.2.0 is out

Hugin 2010.2.0 was released more than a week ago.  I was traveling when Bruno made the announcement officially sanctioning the second release candidate, moving new features such as mosaic mode and masking one step closer to the user.  But this is not the last step.

As usual it is a source-code-only release.  Contributors will build binaries out of it that will eventually trickle down through the official distribution channels in the coming weeks and months.  As usual the beggars are faster than the contributors.

While the project strives to make users’ lives easy with binaries, the only way to release source code and binaries for all platforms at the same time would be to introduce an artificial delay in the release process, which in my opinion is a very bad idea trying to creep in from the world of proprietary software.

There is a natural sequence of events before a new feature hits the user’s desktop:  first there is source code, then it is compiled, then it is tested.  When the tests are conclusive the source code is packaged and distributed to builders who compile user-friendly installers for distribution.  It does not happen overnight.  The source code is naturally ready before any binary.

Marketing controls the release process in the world of proprietary software.  Everything is done in secret.  The effort of keeping everything under the radar until a big-bang launch is justified by the resulting publicity and boost in sales.  Everything is delayed until the launch campaign is ready, the product is ready, the distribution channels are ready.  The release announcement is made with the sole purpose of motivating consumers to buy.

The dynamics are the opposite in Open Source.  Delaying the source code release until all installers for all platforms are ready to ship only hinders progress.  It is better to let the developers move on with further improvements rather than freeze them while waiting for a synchronized release.  Since we don’t sell the software, there is little benefit to such synchronization.  Builders and distributors will take care of the details of producing binary installers for the end user asynchronously, in due time.

Open Source development is a continuum.  Strictly speaking, every time a developer pushes a changeset into the repository he is making a “release”.  He is releasing his newly added code to the general public.  There is even an RSS feed of such releases.  It’s interesting to developers and others who are treading on the bleeding edge, but to the average user this is mostly boring overload.

The purpose of a release cycle is to motivate contributors to polish up their contributions, whether they are code, translations, manuals, tutorials, or installers.  The release announcement is the point where the focus of the release cycle shifts from code/bugfixes to preparations for distribution.  Stay tuned for binary distribution to start as soon as possible.

From Subversion to Mercurial. Part 3, Implementation Day and Beyond

If you followed the steps described in the first and second parts of this series, you should have a Mercurial (Hg) repository ready to replace your project’s Subversion (SVN) repository.  In this third and last part we’ll go over Implementation Day, with particular detail on how to implement this migration on the SourceForge infrastructure.

Test

Can’t test enough.  Your script produces an Hg repository that looks OK on superficial investigation with tools like hg log and hg view.  But does the code build?  Hugin’s build system had a couple of dependencies on SVN and they needed to be updated.  Thomas Modes and Kornel Benko stepped up to the task.  Can developers and builders use this repository?

Educate

On Implementation Day the project will transition from SVN to Hg.  While all relevant contributors were proficient in SVN, Hg was new territory for many.  While progressing on the conversion I kept the community informed and took every opportunity to encourage learning of the new tools, including public tutoring that continues after the transition.  You want to encourage people to share their experiences and learn from each other.  Conceptually the biggest difference between SVN and Hg is that with Hg the repository sits on your local client.  A check-in to SVN is the equivalent of a check-in plus a push in Hg.  Committing offline is not possible with SVN but it is with Hg.  However, both are revision control systems (RCS) and very similar to use.

Implementation Day Overview

Warn everybody one last time.  Create a new repository on SourceForge for each migrating code line.  Lock down SVN by revoking write access from everybody but a few maintainers who will clean up after the transition.  Run the whole migration once again on a green field, from scratch, to be sure that everything up to the very last SVN commit is included.  Test this new local repository one last time (compare it to previous results), then push it to the new repository on SourceForge.  Last but not least, configure the repository on SourceForge and announce the transition to the world.  Sounds easy.  The devil is in the detail.

Mercurial on SourceForge

SourceForge has been very generous with the projects it is hosting: we can have unlimited Hg repositories.  Unfortunately there are rough edges.

To activate Mercurial for your project:

  • Log in via the web as a project administrator and go to the “Develop” page for your project.
  • Select the Project Admin menu, and click on “Feature Settings”.
  • Select “Available Features”.
  • Select the checkbox to the left of the “Mercurial” heading. Your repository will be instantly enabled.

This first repository will be fine but if you want to activate more than one repository you will have to manually set them to be group writable.  To activate additional repositories:

  • Log on to SourceForge’s shell service (assuming you have set up your SSH key) with `ssh -t USER,PROJECTUNIXNAME@shell.sourceforge.net create`
  • Navigate to your project’s Mercurial space with `cd /home/scm_hg/P/PR/PROJECTUNIXNAME`, e.g. for Hugin this would be `cd /home/scm_hg/h/hu/hugin`
  • Create a new directory with the name you want for the repository.  E.g. for Hugin’s website this was `mkdir hugin-web`
  • Execute `hg init DIRNAME` (where DIRNAME is the directory you just created, e.g. `hg init hugin-web`). This will initialize the new repository.
  • Inside the new repository, edit the configuration file .hg/hgrc (see configuration section below)
  • SourceForge rough edge: group write access must be given manually `chmod -R g+w /home/scm_hg/P/PR/PROJECTUNIXNAME/DIRNAME`
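The path arithmetic in these steps is easy to get wrong by hand, so here is a minimal Python sketch that derives the directory and lists the commands for a given project.  It assumes, based only on the Hugin example above, that P and PR are the first one and two characters of the project unix name; the function names are made up for illustration, and nothing is executed.

```python
def sf_hg_path(project, dirname=""):
    """Build /home/scm_hg/P/PR/PROJECTUNIXNAME[/DIRNAME] for a project.

    Assumption: P and PR are the first one and two characters of the
    project unix name, as in the Hugin example (h/hu/hugin).
    """
    if len(project) < 2:
        raise ValueError("project unix name too short")
    base = f"/home/scm_hg/{project[0]}/{project[:2]}/{project}"
    return f"{base}/{dirname}" if dirname else base


def extra_repo_commands(project, dirname):
    """List the shell commands from the steps above, in order."""
    return [
        f"cd {sf_hg_path(project)}",
        f"mkdir {dirname}",
        f"hg init {dirname}",
        f"chmod -R g+w {sf_hg_path(project, dirname)}",
    ]


if __name__ == "__main__":
    # Print the commands for Hugin's website repository.
    for command in extra_repo_commands("hugin", "hugin-web"):
        print(command)
```

The commands are only printed, so they can be reviewed and pasted into the SourceForge shell session; editing .hg/hgrc remains a manual step.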

Configuration of the Mercurial Repository on SourceForge

SVN support on SourceForge is mature and projects are used to amenities such as email commit notifications.  Hg support is better than what the scant documentation suggests.  Most standard functionality, including email notification, works, even if it is officially unsupported.  One only has to find out how to configure it.  I played around with some trial and error already when optimizing the Enblend repository last year.  This is the hgrc file template that works for us:

[hooks]
changegroup.notify = python:hgext.notify.hook

[email]
from = NOTIFICATION_ADDRESS@lists.sourceforge.net

[smtp]
host = localhost

[web]
baseurl = http://PROJECT.hg.sourceforge.net/hgweb/PROJECT/DIRNAME

[notify]
sources = serve push pull bundle
test = False
config =
template = \ndetails:   {baseurl}{webroot}/rev/{node|short}\nchangeset: {rev}:{node|short}\nuser:      {author}\ndate:       {date|date}\ndescription:\n{desc}\n
maxdiff = -1

[usersubs]
NOTIFICATION_ADDRESS@lists.sourceforge.net = **

[trusted]
users = *

You’ll have to replace PROJECT with your own project unix name, DIRNAME with the top directory of your own Hg repository, and NOTIFICATION_ADDRESS with your own mailing list.  The configuration options are documented.
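With several repositories to configure, filling in the three placeholders by hand gets repetitive.  A minimal Python sketch of the substitution follows; the function name is made up, the excerpt is a shortened copy of two sections of the template, and the list name example-commits is a hypothetical value.

```python
def fill_placeholders(template, project, dirname, address):
    """Replace the three placeholders used in the hgrc template."""
    replacements = {
        "NOTIFICATION_ADDRESS": address,
        "PROJECT": project,
        "DIRNAME": dirname,
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


# Shortened excerpt of the template, for illustration only.
EXCERPT = """[email]
from = NOTIFICATION_ADDRESS@lists.sourceforge.net

[web]
baseurl = http://PROJECT.hg.sourceforge.net/hgweb/PROJECT/DIRNAME
"""

if __name__ == "__main__":
    # "example-commits" is a hypothetical mailing list name.
    print(fill_placeholders(EXCERPT, "hugin", "hugin-web", "example-commits"))
```

Plain string replacement is used rather than str.format because the notify template itself contains curly braces such as {baseurl} that must reach Mercurial untouched.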

Committer Write Access

With a dRCS like Mercurial, write access has a completely different meaning.  Everybody can `hg clone` an existing repository and, once it is cloned, has full write access to the clone and can publish their own repository.  The d in dRCS stands for distributed.  Technically there are no more hierarchies and no more central control.  All clones are equal.  Whoever owns a clone can decide to publish it on the web, e.g. with `hg serve`, and give write access to whomever they want.  Granting SourceForge write access only means that the committer can push to the repository hosted on SourceForge.  What makes a repository authoritative is the users’ trust, and this is given implicitly by pulling from it.

SourceForge Rough Edges, Again

I wish there was a way to group-manage access rights on SourceForge.  I have not found it.  I needed to revoke SVN access from most developers and grant them Hg access.  I had to click through each and every contributor registered with the project and manage their access rights one by one.  To make things worse, the pseudo-Ajax web interface of SourceForge is anything but asynchronous: it reloads the page after each change.  Ajax cosmetics on top of old technology from the last century.

One point projects on SourceForge will need to pay attention to is the default access rights.  I did not find a place to change those, so every new project member gets SVN access rights by default, unless you explicitly remove them.  It seems to me that the defaults on SourceForge are based on the principle of random uncoordinated historical growth.  Have they ever heard of the generally accepted principle of least privilege?  And the default file access for newly created extra Hg repositories is stricter than even least privilege requires (see above).

<rant>And don’t tell me about SourceForge’s IdeaTorrent and ways to request an enhancement.  In my experience it does not work, and some things on that site have been broken for years when the fix is simple, easy, and does not take much time.  Have you tried to use a SourceForge mailing list archive?</rant>

Push

Now that everything is set, you can simply `hg push` from your local repository to the SourceForge one.  Or if you’re really confident, you can rsync the .hg directory (but don’t forget to edit the .hg/hgrc configuration file on the SourceForge end).

CMake Build System

Our CMake build system depended on SVN and after the push it was broken.  Kornel Benko and Thomas Modes fixed it.  Bruno Postle added a break in the CMake build system in the SVN repository, to warn users of that repository that newer versions are in Hg.  Harry van der Wolf updated the OSX build system.

Conclusion

The disruption was short.  A few hours after going live, developers started committing again, using Hg.  Builders started building and distributing again, using Hg.  The Google Summer of Code students cloned their own copies of the source code and started working on the next major developments for Hugin.  Having tackled the most complex of the SVN code lines first, I migrated the remaining ones over a few hours on Sunday night.  Hugin and most of its related projects now live happily in Hg and can easily be converted to other formats, including Bazaar, git, and even SVN.  Initially I thought of mirroring the default code branch from Hg to SVN, but our project does not really need that.  Subversion has been made completely redundant by a newer, better tool.  Mercurial and its like would not exist without Subversion, and should be seen as a continuation of the lineage rather than a break from the past.  With Mercurial, Hugin is freer than ever, and you are free to take it further on a journey to the future.  For now Hugin still lives on SourceForge, where the next critical bit of infrastructure is the bug tracker.  But with Mercurial the dependency on SourceForge, and on any single central service or person, has been further reduced.  Long and Free live Hugin.

From Subversion to Mercurial. Part 2, Mapping the Road

In the first part we started a community buy-in process to support the migration and we set out the technical stage. In this part we’ll map out the road for moving the code from Subversion (SVN) to Mercurial (Hg).

Repository Layout

Source and Target layout are most likely different from one another.  You need to test if the selected conversion tool supports the source layout.  Most tools handle standard/canonical layouts, but few repositories follow such layouts strictly and consistently over time.

The Hugin SVN repository was itself the result of a migration from an even older tool, CVS.  The subdivisions of the Hugin codeline did not follow the canonical trunk/branches/tags subdivision to the letter.  We had good reason to distinguish three kinds of branches: development branches, obsolete_branches, releases.  Moreover the repository contained seven unrelated code lines because of the SourceForge limitation to one SVN repository per project.  The sensible choice was to separate each of the seven code lines into its own Hg repository.  In Hg, branches and tags are not part of the layout and they only need to be addressed in terms of history conversion.

History Clean Up

The next big question is how far back do you need to go?  And to what level of detail?  We decided to keep the SVN repository publicly accessible to document history.  This freed us from the need  for a detailed reconstruction of the past.

You will have a wide range of choices, from painstakingly reconstructing every single past changeset to pragmatically starting from scratch with a current code snapshot.  The trade-off is between effort, storage requirements, and benefits to the project.  I decided to go as far back and into as much detail as the automated tools enabled with little effort, and to step beyond that only where the benefits outweighed the extra effort.

This meant giving up on the history of past development branches.  The nature of SVN merge operations implicitly omits carrying the history of the development branch into trunk. To fully reconstruct history one must extract the development branch and transplant it into the Hg default code line.  Maybe feasible but time consuming.

Save that time.  You will need it to comb a few knots you’ll find hidden inside SVN history.  The result of less than optimal manipulations, these knots are quickly fixed in subsequent SVN revisions so that they do not affect day to day operation.  They get forgotten until somebody has to dig up history.  We had two such knots in Hugin:

Movie files that do not belong in the repository landed there by mistake.  A few revisions later they were removed and stopped affecting daily checkout operations.  But they’re still there, represent more than  75% of the weight of the Hugin SVN repository, and will affect the Hugin Hg repository if left untreated.

We also had an unorthodox switch of a branch to replace trunk completely.  It worked well while using SVN, but automated conversion tools trip over unconventional layout operations.  Luckily this only left a small cosmetic scar with the tool I retained.  I decided not to spend time on cosmetic aspects and left the scar untouched.

Tools

The advent of distributed RCS spurred development of a panoply of tools to efficiently move around bits of code.  It was difficult to discern upfront which tool would work for my specific scenario.  I’ve tried a few of them and  the one that worked best for me was Mercurial’s own convert extension. Another tool that was helpful in the process was Mercurial’s hgk extension.

Edit the following lines in your ~/.hgrc file (create it if it does not exist) to activate these extensions.  You will also need the directives in the [ui] section:

[extensions]
convert =
hgext.hgk =
[ui]
username = YOU <your@email.add>
verbose = True

Mapping Users

Changesets are committed by users.  The definition of a user in Hg differs from SVN.  We need to map SVN users to Hg users.  The syntax of the file is one user per line with a statement listing the SVN user and the corresponding Hg user, e.g.
yuv = Yuval Levy <yuv@example.com>

The following command will produce a file listing alphabetically all users that ever committed to SVN, one per line:
svn -q log | grep ^r | cut -d'|' -f 2 | sort | uniq > svn_users.txt

I used a quick script to generate SourceForge users addresses (@users.sourceforge.net) from that file, but some manual cleanup will be inevitable (and is a good opportunity to keep the buzz going and the stakeholders interested).
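That quick script is not reproduced here.  A minimal Python sketch of the idea follows, assuming the one-user-per-line file produced by the command above and the mapping syntax shown earlier; the function name, the known parameter and the user name alice are made up for illustration.

```python
def build_author_map(svn_users, known=None):
    """Return author-map lines in the 'svnuser = Name <email>' form.

    svn_users: iterable of raw lines, e.g. read from svn_users.txt.
    known: optional dict of SVN user -> 'Full Name <email>' overrides,
           filled in during the manual cleanup.
    """
    known = known or {}
    lines = []
    for raw in svn_users:
        user = raw.strip()  # the cut command leaves spaces around names
        if not user:
            continue  # skip blank lines
        default = f"{user} <{user}@users.sourceforge.net>"
        lines.append(f"{user} = {known.get(user, default)}")
    return lines


if __name__ == "__main__":
    # "alice" is a hypothetical SVN user name.
    sample = [" yuv \n", "\n", " alice \n"]
    overrides = {"yuv": "Yuval Levy <yuv@example.com>"}
    print("\n".join(build_author_map(sample, overrides)))
```

Users without an override fall back to their SourceForge address, which is where the manual cleanup comes in.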

While it is possible to enter anything in the username directive of ~/.hgrc, the best practice is to put in a name and an email address.  This is important to establish the legitimacy of the code committed.

Conversion Process

Mapping out the conversion is an iterative process:  set up the conversion command, kick it off, go for a walk while the computer churns through the repository.  When you come back, hopefully there is an Hg repository that you can analyze to determine the next step.  Usually the next step will be to refine some of the configuration files or conversion parameters.  Rinse/repeat until the resulting Hg repository fulfills your expectations.

I strongly recommend that you document every single step and minuscule change.  Even better: if I were to start such a process again, I’d keep a shell script to run everything from scratch and reconstruct the current state.  You will find yourself going back to the same operations again and again, sometimes days or weeks later.  Memory may betray you on small details.

Convert, Again, Again, and Again.

The basic command to convert a repository is
hg convert --branchsort --config convert.svn.branches=hugin/branches --config convert.svn.tags=hugin/tags --config convert.svn.trunk=hugin/trunk --authors svn_users.txt --filemap hugin_filemap.txt hugin-mirror hugin-mercurial

The paths to the branches, tags, and trunk depend on the SVN repository’s layout and the intended outcome.  You’ll tweak those many times.

When I wanted to add the 2010.0 release branch on top of the converted trunk, the command was:

hg convert --branchsort --config convert.svn.branches= --config convert.svn.tags=hugin/tags --config convert.svn.trunk=hugin/releases/2010.0 --authors svn_users.txt --filemap hugin_filemap.txt hugin-mirror hugin-mercurial

hugin_filemap.txt is used to include/exclude paths.  To filter out the heavy movies, I used the following:

exclude "GSoC 2007/Presentation 1"
exclude "GSoC 2007/Presentation 2"

Examine The Results

When you first walk into the newly converted repository with cd hugin-mercurial, it feels empty.  There is only one invisible .hg folder.  The repository.  Use hg view (provided by the hgk extension) to have a first look at the resulting revisions tree.  You need to hg checkout a revision if you want to see more.  Or delve into internals.  The file .hg/shamap lists all SVN revisions with path and revision number against Hg SHA1 changeset IDs.  These are helpful in case you need to manipulate history, e.g. to skip some revisions or to link a disconnected part of history, such as a separately extracted branch, with its parent and child changesets.  For such manipulations you will use the --splicemap and --branchmap options.  They point to files, like --filemap, but work differently.  They are described in hg help convert and can help you fix the most broken of repositories.  I was thankful I did not have to deal with this: for adding the release branches into the repository it was sufficient to simply run convert again on the same hugin-mercurial target.
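To find the Hg changeset that corresponds to a given SVN revision, the shamap can be searched with standard tools.  A minimal sketch, assuming each line of .hg/shamap has the form "svn:<repository-uuid>/<path>@<revision> <hg-changeset-id>"; check a few lines of your own file first.  The function name and the sample data are made up for illustration:

```shell
#!/bin/sh
# hg_id_for_svn_rev REV [SHAMAP]: print the Hg changeset ID(s) recorded for
# SVN revision REV.  Hypothetical helper; assumes each shamap line looks like
#   svn:<repository-uuid>/<path>@<revision> <hg-changeset-id>
hg_id_for_svn_rev() {
    rev=$1
    shamap=${2:-.hg/shamap}
    awk -v rev="$rev" '
        {
            id = $NF                     # last field: the Hg changeset ID
            src = $0
            sub(/ +[^ ]+$/, "", src)     # drop the ID, keep the SVN part
            n = split(src, parts, "@")   # the revision follows the last "@"
            if (parts[n] == rev) print id
        }' "$shamap"
}

# Example on a throwaway file with made-up entries:
sample=$(mktemp)
cat > "$sample" <<'EOF'
svn:00000000-0000-0000-0000-000000000000/hugin/trunk@41 1111111111111111111111111111111111111111
svn:00000000-0000-0000-0000-000000000000/hugin/trunk@42 2222222222222222222222222222222222222222
EOF
hg_id_for_svn_rev 42 "$sample"
rm -f "$sample"
```

Inside the converted repository the second argument can be omitted, and the printed ID can be passed to hg checkout or used when writing a splicemap.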

Test

As you proceed, you will see your repository improve iteration after iteration.  As soon as you have a result to show, pack it into a tarball and ask community contributors to download and try the repository in the tarball.  Share as much information as you can, and enable them to do the same as what you did.  Unless you have unlimited time and resources, this is the only way to go beyond basic repository integrity checks.  The tests will reveal corrupted repositories, and if the contributors go one step further and try to build the code, they will also reveal dependencies of the build system on SVN that may require committing specially crafted code to support Hg instead.  Keep trying and refining until you have on your hard disk an Hg repository that is ready to replace the old SVN repository.  Then you’ll know you’re ready for Implementation Day.

Moved 2: From Subversion to Mercurial. Part 1, Setting the Stage

It’s less than four weeks since I drove that 26′ U-Haul truck full of stuff and I’ve had enough of moving for a while.  So why move again?  This is a different kind of move: a move to more efficient infrastructure.  To a decentralized source code repository.  Thank you Subversion, you’ve served us well over the past years.  Welcome Mercurial, a distributed revision control system (RCS) of the next generation.  In this series of three articles I describe how I moved Hugin from Subversion (SVN) to Mercurial (Hg).  In the first part I’ll describe how to kick off the process in the community and set the technical stage on your machine.  The second part deals with the technical code conversion.  The third part deals with the conversion aftermath and the actual switch.  Once the road is mapped out, the process is a relatively straightforward one.  I made some mistakes while mapping the road and I hope that if you find yourself in the same situation, these articles will help you avoid such mistakes.

Why Mercurial and why Now?

It could have been git, or Bazaar.  They are all equally good.  But I found Mercurial to be the one with the most mature client support, particularly GUI clients on disparate operating systems; and it is well supported at SourceForge where Hugin is currently hosted.  Our project needs to accommodate contributors using Linux, Windows, OSX, BSD and we do not want to leave anybody behind.  To get all stakeholders to buy into the process I started a public discussion.

Spring was the right time for repository cleaning.  With a tight integration schedule the team merged most outstanding development branches into the main code line.  Migrating before branching out again for a new set of Google Summer of Code projects will avoid extra complexity.

For more than two years Hugin has been humming along on an asynchronous development and release process that has helped increase the capacity of the project to absorb changes.  Despite a diligent, disciplined and careful team we seem to have hit a scalability ceiling.  It may be lack of resources (except for the Google Summer of Code students during their three months on Google’s payroll, we’re all here in our spare-time) but I suspect that it is also the infrastructure and I expect Mercurial will further increase the capacity of the project to absorb changes.

SourceForge

One of the first questions to arise from the community was the scope of the change.  If we are already changing RCS, how about reviewing all infrastructure?  Hugin has been at SourceForge since inception.  A lot has happened in the project hosting arena since.  Sites like Google Code, Launchpad, BerliOS, GitHub offer a panoply of services – RCS; bug tracking; mailing lists; web and download hosting.  Often different implementations of the same Open Source tools.  Mostly “free” (as in beer, but beware of the alcohol) for Open Source projects like Hugin.

The RCS, while central to the project, is just part of a project’s infrastructure.  Migrating the whole infrastructure is beyond the scope of this project.  And beyond the available resources too.   Just moving the nearly 200 open bug reports (many of which are stale or duplicate – the bug tracker needs a good spring cleaning too) to a new bug tracking system can keep a spare-timer busy for months.  SourceForge may not be the most fashionable choice, but it works for us.

Server and Client

On the one side is an existing SVN repository on the SourceForge server.  On the other side is a new Hg repository on the SourceForge server.  How do I move the code from one side to the other?  The first mistake I made was to work on the SourceForge server itself.  This slowed me down and ate their precious bandwidth.  I should have known better: SVN runs on the server sitting in my office closet.  Even that was too much overhead.  The most efficient way to go about the task is to mirror the SVN repository to a local client and work from there.

These are the steps for a K/X/Ubuntu distribution:

sudo apt-get install mercurial subversion python-subversion
svnadmin create hugin-mirror
cd hugin-mirror
echo '#!/bin/sh' > hooks/pre-revprop-change
echo 'exit 0' >> hooks/pre-revprop-change
chmod +x hooks/pre-revprop-change
export FROMREPO=https://hugin.svn.sourceforge.net/svnroot/hugin/
export TOREPO=file://`pwd`
svnsync init ${TOREPO} ${FROMREPO}
svnsync --non-interactive sync ${TOREPO}

The initial sync can take hours or more.  This is a good time to take a break.  If the sync is aborted, you may need to reset the lock state and restart the sync:

svn propdelete svn:sync-lock --revprop -r 0  ${TOREPO}
svnsync --non-interactive sync ${TOREPO}

It’s a good idea to repeat the above two commands in a cron job or a startup job to keep in sync with the repository over time.
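A minimal sketch of a wrapper script for such a job.  The script name, the mirror location under $HOME, the schedule and the SVN_PREFIX switch are assumptions for illustration.  It also assumes that two runs never overlap, because it clears the lock unconditionally:

```shell
#!/bin/sh
# sync-hugin-mirror.sh (hypothetical name): refresh the local SVN mirror.
# Example crontab entry, running it every night at 03:00 (paths are assumptions):
#   0 3 * * * $HOME/bin/sync-hugin-mirror.sh >> $HOME/hugin-sync.log 2>&1
TOREPO=${TOREPO:-file://$HOME/hugin-mirror}
# Set SVN_PREFIX=echo to print the commands instead of running them.
SVN_PREFIX=${SVN_PREFIX:-}

sync_mirror() {
    # Clear a lock left behind by an aborted sync, then catch up.
    $SVN_PREFIX svn propdelete svn:sync-lock --revprop -r 0 "$TOREPO"
    $SVN_PREFIX svnsync --non-interactive sync "$TOREPO"
}

# Preview only.  In the real script, call plain "sync_mirror" instead.
SVN_PREFIX=echo sync_mirror
```

The preview prints the two commands with the repository URL filled in, which is a cheap way to check the path before the first scheduled run.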

Your local machine is set for the job.  Keep the discussion in your community going, to get all relevant stakeholders to buy into the process.  In the next installment we’ll look at how to map the road.

Here We Go Again

Hugin 2009.2.0 is still warm and fresh out of the oven and I have started the next release cycle. Hugin 2009.4.0 is coming soon to a download server in your neighborhood. The beauty of parallel development without trunk freeze and the discipline of managing the integration queue are starting to pay off: while I was managing the 2009.2.0 release, Tim added lens calibration to trunk and Thomas added a tool to clean outlying control points by a statistical method.

Three simple criteria and three increasing levels of maturity reduced the bottleneck that has plagued Hugin in the past three years as it had to learn to absorb more change than ever.

The three levels of maturity are:

  • “works for me”
  • integration in trunk
  • release to the general public

And the three tests are:

  • functionality
  • multi-platform build
  • regression

Works for me

Developers are no longer hindered by trunk freezes. They can work on their ideas any time, simply by branching out a development codeline either locally or in the central subversion repository (I know distributed revision control systems. Andrew implemented Mercurial for Enblend and I love it. Right now Hugin is undergoing enough change. In a quiet moment we’ll open a discussion about the revision control system).

When the implementation is ready to move to the next level, it is checked against the three criteria. A development codeline – whether under revision control or presented as a patch against trunk – is considered mature when:

  • the functionality it is intended to implement works on the developer’s machine (“works for me” condition)
  • it has been tested to build on the major supported platforms (“does not leave them behind” condition) by at least one contributor for each: Windows, OSX, Linux
  • it does not unintentionally break existing functionality (“no regression” condition)

When a development codeline reaches maturity, it enters the integration queue.

Integration Queue

The integration queue is the ordered list of new features / development codelines waiting to be integrated in trunk. The prioritization is a collective decision by consensus of the developers. Silence = consent. The discussion, and the latest version of the list, are on the mailing list.

The integration queue is not set in stone: a change in the maturity status of a feature in waiting is good reason to review/change the ordered list. In any case it is reviewed after every release branching.

The integration queue is all about coordination. The code matures further. Ideally developers talk and there should not be conflicts. In the real world changes in trunk may affect the new functionalities or the other way around. This is the time to resolve this kind of conflict, and to test the code on a (slightly) broader basis.

A feature from the integration queue is ready to be merged into trunk when

  • the functionality it is intended to implement works on the machines of the testers who bothered to play with the codeline (“works for them” condition)
  • it has been tested that a merge with trunk builds on the major supported platforms (“does not break trunk” condition) by at least one contributor for each: Windows, OSX, Linux
  • it does not unintentionally break existing functionality (“no regression” condition)

This limits the time during which trunk is unavailable. Integration of the features that have gone into 2009.2.0 and will go into 2009.4.0 was almost seamless. And the clean up from the 2009.2 release cycle has benefited trunk so much that the 2009.4 release cycle is expected to be even faster.

If trunk has absorbed a feature from the integration queue and we’re not yet ready for a new release (e.g. because the current release cycle is still on), integration of the next mature feature from the integration queue starts.

Release

Developers are polled once a month on whether trunk warrants a release. If there are enough new features and/or bug fixes compared to the last release, somebody starts a release cycle. Instead of releasing from trunk, forcing a freeze, we now branch out a release codeline. That release codeline absorbs all the change control that was slowing down development in trunk to a halt. And the polishing changes that it gets benefit trunk at the same time.

Again, a release branch is declared mature and released to the public when

  • the functionality it is intended to implement works for a larger number of testers (“works for the public” condition)
  • it has been tested to build on the major supported platforms (lives up to Hugin’s ideal of being “truly cross-platform”)
  • it does not unintentionally break existing functionality (“no regression” condition)

Resources

All of this would not be possible without human resources: developers, translators, builders, documenters. A big thank you goes to all of them for making this new, more dynamic pace of development possible. The recipe is relatively simple: every resource is self-directed. The less control is exercised over it, the more likely it is to contribute to the project. So the only controls are the maturity tests – and they can be self-administered. Self-responsibility and empowerment reign.

To me resources are like water. They will flow to an attractive codeline like a stream flows into a riverbed. Barrages (such as a trunk freeze) only impede the flow. Water finds alternative ways, and so do potential contributors. My only task left is to remove bottlenecks. Let them flow!