I was recently assigned a task to prepare a development process for two teams working on separate version control systems (GIT and Perforce in my case). One of the important parts of this task is to create an effective method of syncing codebases between the two stores.
Of course there is the git-p4 tool, but my requirements are a bit too complicated for it:
- Only a subset of the whole GIT repository will be stored in P4
- The GIT repository already exists with some history (the same goes for P4)
so I decided to write a small script that will handle at least the P4 -> GIT sync.
My first attempt was:
- Sync GIT with main repository: git pull
- Sync dir to latest sync point: p4 sync -f subdir/...@$CL1
- Reload local changes from GIT: git reset --hard
- Make files acceptable to P4 (read-only): find subdir -type f -print0 | xargs --null chmod u-w
- Tell P4 about local GIT changes made since $CL1: p4 diff -se subdir/... | p4 -x - edit
- Inspect local changes from P4's point of view: p4 diff -du subdir/...
- Merge latest changes from P4 up to $CL2: p4 sync subdir/...@$CL2
- Resolve potential conflicts: p4 resolve -af
- Make files acceptable to GIT (writable again): find subdir -type f -print0 | xargs --null chmod u+w
- Add missing files: git add .
- Build + tests
- Update the GIT repo: git commit -am "subdir merged up to CL $CL2"
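Collected into one place, the sequence might look like the sketch below. It is only a sketch of the steps above: it assumes the P4 client root overlaps the GIT working tree, that subdir/ is the subset shared with P4, and that the two changelist numbers are passed as arguments; the build/test step is left as a placeholder.

#!/bin/sh
# Sketch only: assumes the P4 client root overlaps the GIT working tree
# and that subdir/ is the subset shared with P4.
set -e
CL1=$1   # changelist of the previous sync point
CL2=$2   # changelist to merge up to

git pull                                              # sync GIT with main repository
p4 sync -f subdir/...@$CL1                            # restore the last sync point
git reset --hard                                      # reload local changes from GIT
find subdir -type f -print0 | xargs --null chmod u-w  # make files acceptable to P4
p4 diff -se subdir/... | p4 -x - edit                 # open files changed in GIT for edit
p4 diff -du subdir/...                                # inspect changes from P4's side
p4 sync subdir/...@$CL2                               # merge latest changes from P4
p4 resolve -af                                        # resolve potential conflicts
find subdir -type f -print0 | xargs --null chmod u+w  # make files writable for GIT
git add .                                             # add missing files
# build + tests go here
git commit -am "subdir merged up to CL $CL2"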
But I noticed that merges performed by P4 are neither fast nor accurate. A developer from my team suggested that it may be useful to import P4 history in smaller chunks, in order to allow bisection if there's a bug in the imported codebase.
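With one GIT commit per imported CL (as the script below produces), tracking such a bug down becomes a standard git bisect session; a minimal sketch, where the good commit hash and the test script name are hypothetical:

git bisect start
git bisect bad HEAD             # a commit where the bug reproduces
git bisect good abc1234         # hypothetical: last known-good imported commit
git bisect run ./run-tests.sh   # hypothetical test script; non-zero exit marks a bad commit
git bisect reset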
Then I created a simple script:
#!/bin/sh
CL1=$1
CL2=$2

# List changelists between $CL1 and $CL2 and replay them oldest-first:
# sort numerically on the CL number (field 2 of "Change NNN on ..."),
# then emit one sync + commit command sequence per CL.
p4 changes ...@$CL1,@$CL2 | sort -n -k2 | awk -v CL1=$CL1 '
BEGIN {
    # baseline: bring both subtrees to the starting changelist
    print "p4 sync mw/...@" CL1
    print "p4 sync ui/...@" CL1
}
{
    CL = $2
    print "p4 sync mw/...@" CL
    print "p4 sync ui/...@" CL
    print "git add -A ."
    # reuse the P4 changelist description as the GIT commit message
    print "p4 changes -l ...@" CL ",@" CL " | git commit -a -F -"
}
'
It creates a mirror of a subset of the P4 history, with one GIT commit per CL. After such an import, merges can be done inside GIT (fast, with good algorithms).
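Note that the script only prints the commands instead of running them, so the generated sequence can be reviewed before execution. A minimal usage sketch, assuming the script is saved as p4-import.sh (a hypothetical name) and run from the P4 client root:

sh p4-import.sh 1000 2000 > import.sh   # 1000 and 2000 are example CL numbers
less import.sh                          # review the generated sync/commit sequence
sh -e import.sh                         # -e: stop at the first failing command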