(Almost) distributed CVS with Eclipse

NOTE : unfortunately, this trick is again unavailable starting with Eclipsev3M9, the last version which made that possible was v3M8


In Eclipse 3.0, in the CVS repository properties it is possible to separate the 'read' and 'write' access locations (see picture). AFAIK, this feature was meant for developers using extssh for CVS connection. They would use an anonymous account for updates in 'clear', and their respective accounts for commits over encrypted link. The basic idea is that clear connections are considerably faster than encrypted ones. However, you may change the whole connection string, not only the login and protocol, thus enabling the usage of two completely different repositories for 'read' and 'write' actions on CVS.

This might come handy in situations like the one we've recently been confronted with, in my team. The main CVS repository is located at a certain geographical distance (in the headquarters, in France) and the VPN bandwidth is nothing to brag about. Working with CVS was decent, but things have started to got meaner lately, mainly because of three reasons:

- both teams have grown = more activity on the repository, - vast majority of developers is integrating frequently, aka every little bit of functionality is committed as soon as it's stable and working. Of course, before committing, there's the mandatory update to check for consistency against the most recent codebase. This means that at least a few synchronizations will be performed by each member of the team, each day. - Code generators. It's true that common sense dictates that nothing generated should be stored in CVS, because this may be source of frequent conflicts and loads the repository in an inefficient manner. However, when one routine generation (for each of the few modules) may take between 3 and 10 minutes and produces thousands of files, it becomes pretty obvious that it should not become part of the build process. What gives : in the days when the model has some important changes, a few different subsequent versions of 3-15MB jar files are committed on the repository. Update process starts to slow down and soon a part of the team is lagging, waiting to download the update. The irony is that usually everybody is downloading slowly and inefficiently the SAME big file. Of course, there are also other kinds of traffic via the VPN, such as Netmeeting (but you can't really have a conversation when everybody in the team is updating the codebase and slows the VPN to a crawl).

This boils down to having a 'read-only' local repository, perfect mirror of the main CVS server, which will be used only for updates. Both the server and its mirror are Linuxes. My choice for mirroring was good old rsync. While cvsup seems to be all the rage nowadays, I headed towards rsync because a) it comes pre-installed within any decent Linux distro and b) I am using Gentoo on my home desktop, so rsync is a tool that I learned to use [and abuse] almost daily, via portage.

We won't lose a lot of time explaining how to set up an rsync server, since it's very well explained in this rather old but useful tutorial. There's just a small twist : we were planning to synchronize frequently so we ran rsync daemon independently, not via xinetd. The config file on the server:

2;root@dev /etc> more rsyncd.confmotd file = /etc/rsyncd.motdlog file = /var/log/rsyncd.logpid file = /var/run/rsyncd.pidlock file = /var/run/rsync.lock[cvs] path = /opt/cvsroot comment = CVS Rsync Server uid = nobody gid = nobody read only = yes list = yes hosts allow =

is a classical read-only, anonymous (we're on VPN, right ?) mirroring setup. Next step is making a rsync service in /etc/init.d or appending the line “/usr/bin/rsync –daemon” to /etc/rc.d/rc.local, to be sure that the daemon restarts after reboot [especially when you are NOT administrating the server].

On the client, there are some small tricks to make it work. First one, setup your client CVS repository in the same location as on the server, '/opt/cvsroot' in our case (because you are going to synchro CVSROOT as well, which contains absolute paths in some if its files).

The mirroring script is something in the spirit of:

#!/bin/bashsource /etc/profilecd /opt/cvsrootRUN=ps x | grep rsync | grep -v grep | wc -lif [ "$RUN" -gt 0 ]; then #already running exit 1firsync -azv --stats /home/cvs/history >/tmp/histosum1=sum /home/cvs/historysum2=sum /opt/cvsroot/CVSROOT/historyif [ "$sum1" = "$sum2" ]; then #nothing to do date > /tmp/empty exit 0fidate > /tmp/startedcat /dev/null > /tmp/endedrsync -azv --stats --delete --force /opt/cvsroot > /tmp/statuscd /optdate > /tmp/ended

This (badly written) script is heavily inspired from a script found on a Debian user mailing list:

- exits in the case of a long running rsync process - assures that the full rsync is triggered by changes in CVSROOT/history. This is a neat trick which minimizes the server activity if you are syncing frequently, as we will do. - outputs stuff in some files in the /tmp directory. This has a double purpose. First is avoiding root mailbox pollution (because we're 'cron-ing' it, the output will be visible as a few hundreds of mails each day). Second is providing data for a web page on the client machine which tells at a glance what is the synchronization status. AFAIK rsync will not write incomplete files, but the usual sound advice is to make an update between successive synchronizations.

The small 'synchronization status' page (left as an exercise for the reader :) ) just prints the dates and the last few lines of the output files; this is a dumbed-down sample :

CVS synchronization started at Mon Feb 23 09:45:18 EET 2004

, ended at Mon Feb 23 09:46:00 EET 2004

Last empty synchronization recorded at Mon Feb 23 09:40:05 EET 2004

No knowledge about recent deleted files

Last synchronized

wrote 60767 bytes read 471522 bytes 13142.94 bytes/sec

Last added files


The only step left is to create a file in cron.d containing the line:

0-59/5 7-23 * * * cvs your_mirroring_script.sh

(meaning sychronization each 5 minutes from 07:00 to 23:00) and you may start enjoying blazingly fast CVS updates in Eclipse.