Robert Cowham's Weblog

Perforce European Conference 19 September 2006   22 Sep 06

Life has been just a touch busy recently: I have been flat out on various client projects pretty much the whole summer (managed a week away, but only just!). All grist to the mill for future blogging, so hopefully a variety of articles to come!

Meanwhile one of the things I was doing was preparing and then giving a presentation for the (first) Perforce European Conference on 19th September in central London.

I think the papers will be out pretty shortly on the Perforce site, but meanwhile a few highlights and personal notes. There were some big names present and it was good to hear about various practices and principles in operation.

Keynote

Christopher Seiwald did a variation on his slightly "aw shucks" style keynote. Some key points:

  • Perforce doing fine: 200,000 users and 4,000 companies
  • Company motto: "Aim low and hit!" (do one thing well, remain best of breed and wait for the analyst pendulum to swing back to best of breed rather than suite integration, which it seems to do on a regular basis)
  • Working on a variety of things for future world domination, but don't want to pre-announce as usual
  • Very pleased with the way things are happening in Europe, and obviously at the response to this event.
  • Next US Conference 9 - 11 May 2007, Las Vegas.
  • Sydney office now opened to give global timezone support coverage!

Symbian

Deepak Modgill did a nice presentation on the challenges faced by Symbian in their offshoring. Another in the Symbian series on how their business and various configuration management practices have evolved. Not deeply technical but interesting nevertheless.

SAP

Obviously a flagship site for Perforce. Thomas Kroll and Claudia Loff did a good presentation. Interesting how much process and tools they had wrapped around Perforce. A few key stats:

  • 4,800 users
  • 80+ Perforce servers (but all on same cluster hardware)
  • Fujitsu Siemens clusters with 32GB RAM running SunOS 9
  • SAN (mirrored) for main data

They use a very structured process (repository structure and branching scheme) and a parallel (P4SAP) system with its own database to record things like changes and migrations (they call them transports) of releases between different servers. There is also a layer P4MS (Management System) to handle users etc.

Quite impressive.

Process Automation

Obviously my talk was wonderful! I was, though, fairly pleased with how it went down and got some good comments afterwards. For anyone interested, the Ruby triggers framework and a couple of utilities are in my area of the Perforce Public Depot.

I will no doubt be blogging on various related aspects (that I haven't already touched on).

Bank of America

Good talk by Sean Cody and Kevin Breidenbach about different approaches within the bank. They have been replacing ClearCase with Perforce in various groups, mainly because of performance for shared development between US, UK and India. Their experience was of MultiSite sometimes taking hours to "sync up", vs. 10-20 minutes max in Perforce.

Another feature of the talk was the power of continuous integration.

Google

Dan Bloch discussed Google's use of Perforce and in particular how they manage issues around Perforce database locking and identifying and bumping off rogue commands.

Some more stats:

  • 3,000+ users
  • Single Perforce Repository
  • HP DL585 4-way Opteron with 128GB RAM
  • Linux 2.4 and NetApp filer

Sounds like it wins the contest for largest number of users against a single server!

The details of the lock identification were very interesting and Dan said he would be releasing the lock.pl script and some docs on the Public Depot real soon now!

Perforce 2006.1 Update

A very interesting and technical talk by Michael Shields regarding a variety of performance optimisations made between 2005.2 and 2006.1.

Summary: 2006.1 is quite a bit faster!

Read the slides for more details.

Laura Wingerd

Laura did another fairly technical talk on what has happened to the branching/merging algorithm, and more particularly the common ancestor detection algorithm used in various releases of 2006.1. In her usual inimitable style she came up with some very useful ways of explaining things like convergence and divergence of branches over time. Things got decidedly more technical with discussions on common ancestors and I was left knowing I have to go through some of this in detail in a quiet moment just to make sure I really do understand it! The changes with 2006.1 look good, but I did get the impression some edge cases could give some slightly surprising results if you don't know what's going on behind the covers (and indeed the driving intentions behind the algorithm).

Summary

Venue worked very well for location. Networking with both Perforce people and various other delegates was as ever a highlight.

Unfortunately the room booked was not huge, which meant the event sold out well ahead of time - a shame a more flexible venue wasn't chosen, but that was my only quibble. Organisation was well run.

An excellent day!

Perforce and Keyword Expansions   30 Jun 06

There are times when you receive a third party code drop which you wish to import. The classic method is documented in Tech Note 15 and its reference to working disconnected (Tech Note 2). The techniques mentioned work very well to find new files, deleted files and changed files.

There is sometimes a fly in the ointment to do with keyword expansion. For example, a CVS code drop may contain expanded keywords such as:

$Id: //depot/robertcowham.com/main/blog/data/scm/p4_handling_keywords.html#1 $
The Perforce equivalent of this might be:
$Id: //depot/robertcowham.com/main/blog/data/scm/p4_handling_keywords.html#1 $

The simple command to find differences is "p4 diff -se". If your local version has Perforce keyword expansion turned on then you will get a load of files spuriously identified as having changed where the only real change is in the keywords.

Thus we want a simple script to run through the diffs and exclude any diffs where only keywords are found (note that this includes where the keyword is embedded, such as in a static variable assignment).

The following simple script is a good base for this. It does the job, and performs pretty well, handling thousands of files in a few minutes. It relies on the unified diff format, in which each added or removed line starts with a "+" or "-" prefix character and each unchanged context line starts with a space.

# Script to import a set of changed files with existing keywords already expanded
# (either Perforce or CVS).
# Does "diff -se" and processes the output

# Args: current directory to check
  
require 'P4'

p4 = P4.new
p4.tagged
p4.connect

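# Open the file for edit only if its diff contains a changed line
# that is not just an expanded keyword.
def process_file(p4, f)
  diffs = p4.run_diff("-f", "-du", f)
  real_diffs = Array.new
  diffs.each { |line|
    case line
    when /^====/      # file header line: ignore
    when /^\@\@/      # hunk header: ignore
    when /^ /         # unchanged context line: ignore
    else
      # Anything else is an added or removed line; keep it unless it
      # mentions one of the RCS-style keywords.
      if line !~ /\$Id|\$DateTime|\$Revision|\$Date|\$Author|\$Name|\$RCSfile|\$Source/
        real_diffs << line
        # puts f, line
      end
    end
  }
  if real_diffs.size > 0
    print "Editing #{f}\n"
    p4.run_edit(f)
  end
end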

all_files = p4.run_diff("-se", ARGV[0])
print "Processing #{all_files.size}\n"
i = 0
all_files.each{|f|
  i += 1
  # print"Processing #{i}\r" if i % 10 == 0
  # print"Processing #{f['depotFile']}\n"
  process_file(p4, f)
}

It is pretty easy to run, e.g.:

diff_se.rb ...

The net result will be a list of files checked out (p4 edit) in the default changelist.
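To see the filtering rule on its own, here is a minimal sketch that applies the same keyword regexp to a few hand-written diff lines, with no Perforce connection involved. The sample lines, the file names and the helper name real_diffs are made up for illustration; only the regexp is taken from the script above.

```ruby
# Keywords treated as noise, copied from the script above.
KEYWORD_RE = /\$Id|\$DateTime|\$Revision|\$Date|\$Author|\$Name|\$RCSfile|\$Source/

# Return only the added/removed lines of a unified diff that do not
# mention one of the keywords. Header, hunk and context lines are dropped.
def real_diffs(diff_lines)
  diff_lines.reject do |line|
    line.start_with?("====", "@@", " ") || line =~ KEYWORD_RE
  end
end

# Hypothetical diff output for one file (not captured from a real server).
sample = [
  '==== //depot/main/foo.c#3 - /work/main/foo.c ====',
  '@@ -1,4 +1,4 @@',
  '-/* $Id: //depot/main/foo.c#3 $ */',
  '+/* $Id: foo.c,v 1.7 2006/06/01 10:00:00 rc Exp $ */',
  ' int main(void)',
  '-  return 0;',
  '+  return 1;',
]

puts real_diffs(sample)
```

Only the two "return" lines survive, so this file would be opened for edit. Had the keyword lines been the only changes, the result would have been empty and the file left alone.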

Note that one of the big advantages of Perforce branching and merging is that it handles merges neatly when keyword expansion is used between branches (and thus you don't get spurious conflicts).

CVS Imports

If you use the cvs2p4 scripts to import a CVS repository you can end up with a slight problem, since the conversion copies the CVS archive files (in RCS format) and Perforce uses them unchanged. The problem comes about because CVS stores the keywords already expanded in the RCS archive. Perforce stores its RCS files with the keywords not expanded, which makes it easier for it to do the merging between branches (without keyword conflicts). While Perforce can handle a CVS archive with the keywords "pre-expanded", it does lead to spurious merge conflicts. Note that this problem is only really present during the early merges after the CVS import. It will no longer be present as soon as the base file for any merge is fully in Perforce format (i.e. after at least one merge has been done).
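The difference between expanded and unexpanded keywords can be shown with a small sketch. It collapses an expanded keyword back to its bare form, so that two lines differing only in the expanded text compare as equal. This is purely an illustration of the idea, not what Perforce or cvs2p4 does internally; the helper names are made up, and the keyword list is simply the one used in the script earlier in this article.

```ruby
# Matches an expanded keyword such as "$Id: some text $".
EXPANDED_KEYWORD = /\$(Id|DateTime|Revision|Date|Author|Name|RCSfile|Source):[^$\n]*\$/

# Turn "$Id: ... $" back into "$Id$" (and likewise for the other keywords).
def collapse_keywords(line)
  line.gsub(EXPANDED_KEYWORD, '$\1$')
end

# True if the two lines are identical once keywords are collapsed.
def same_apart_from_keywords?(a, b)
  collapse_keywords(a) == collapse_keywords(b)
end

puts collapse_keywords('/* $Id: //depot/main/foo.c#3 $ */')
```

With both sides reduced to the bare keyword, a line that differs only in its expansion no longer looks like a change, which is why unexpanded storage avoids the spurious conflicts.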

Perforce Triggers   13 May 06

Writing good Perforce triggers, and, more importantly, debugging them in live use, turns out to be one of those things that seems simple but has lots of tricky issues that can lead to lots of time being wasted.

In spite of thinking that I understood lots of the issues, I still spent a couple of hours recently debugging a problem that turned out to be a combination of environment and password issues. This was particularly annoying as I had rather thought I knew about this stuff (and indeed have advised people over the years about it!), and yet was blindsided and caught out by some issues I had forgotten about or not thought through deeply enough.

I reserve the right to revisit this subject more than once in the future with further insights and news...

Assume Nothing About The Environment!

The classic approach to triggers is to write a nice script (Python or Ruby for me these days - no Perl, though just occasionally I miss it!) and debug it by running with the appropriate parameters from the command line (e.g. create a pending changelist and pass in the pending changelist number). This does indeed tend to turn up a number of issues, but the good thing is you can usually debug them with the appropriate command (<rant> why does python require you to execute pdb.py which isn't by default put in the path on Windows machines, and why does Ruby not learn from Perl and for example use -d as a parameter to debug things instead of "-rdebug" - very unobvious!</rant>).

The major problem turns out to be the fact that the trigger is executed by the Perforce server process and may have a very different environment from the one you see in your own "login" session. One sort of expects this on Unix, but on Windows it can be particularly surprising how little is in the environment, due to the user account that the Perforce process runs under when it is running as a service (default installation on Windows).

Thus the first rule of trigger writing is "assume nothing about the environment!".

It is very easy to forget this and assume very simple things, like:

  • P4PORT is always defined
  • P4USER is always defined
  • failures of individual p4 commands within the trigger will be obvious

Thus immediate recommendations are:

  • Give full pathnames to executables. For example, "/usr/bin/ruby" or "C:\ruby\bin\ruby.exe" as the initial parameter for the ruby script, rather than assuming that "ruby" or "python" or whatever will always be in the PATH of the user executing the command.
  • When in doubt (I'm generally always in doubt) give full pathnames to scripts too.
  • Pass in as parameters the p4port and any other parameters to be used rather than expecting them to be already present in the environment.
  • Within the script, explicitly add any extra directories to the search path for commands such as "import p4" in Python or "require 'P4'" in Ruby or any equivalent import-type statement, unless you are absolutely sure that the imported libraries are globally installed on the machine you are working with. Don't assume that the directory containing the trigger script is itself in the search path unless you can prove it.
  • Trap and print to stdout (or stderr which goes to the p4d server log file) any errors/stack traces including exceptions from your p4 interface to aid hunting out problems. This is much easier to say than to do!
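The recommendations above can be sketched as a small trigger skeleton. This is only a sketch under stated assumptions, not a finished framework: the script path, the trigger name "check", the argument order and the helper names are all hypothetical, and the actual checks (which would normally go through P4Ruby) are left out so that the skeleton runs on its own.

```ruby
# Hypothetical triggers table entry (one line), with full paths and the
# server port passed in explicitly rather than taken from the environment:
#   check change-submit //depot/... "/usr/bin/ruby /p4/triggers/check.rb %serverport% %user% %changelist%"

# Make libraries that sit next to this script loadable, whatever the
# working directory of the server process happens to be.
$LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__)))

# Turn the trigger arguments into named settings; fail loudly if any
# are missing instead of falling back on P4PORT/P4USER.
def parse_trigger_args(argv)
  port, user, change = argv
  if [port, user, change].any? { |a| a.nil? || a.empty? }
    raise ArgumentError, "usage: check.rb <p4port> <user> <changelist>"
  end
  { :port => port, :user => user, :change => change }
end

# Run the trigger body, reporting any exception (with backtrace) on the
# given stream. Returns the exit status to hand back to the server:
# 0 to accept, 1 to reject.
def run_trigger(argv, out = $stdout)
  settings = parse_trigger_args(argv)
  yield settings
  0
rescue StandardError => e
  out.puts "Trigger failed: #{e.class}: #{e.message}"
  out.puts e.backtrace.join("\n") if e.backtrace
  1
end

status = run_trigger(["some_server.some_company.com:1666", "robert", "1234"]) do |s|
  # The real checks would go here.
  puts "Checking change #{s[:change]} on #{s[:port]} as #{s[:user]}"
end
puts "exit status #{status}"
```

In a real trigger the last step would be to exit with that status. The point of the wrapper is that nothing fails silently: a missing argument or an exception from the body always produces a message and a non-zero status.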

Passwords Cause Problems

In the good old days, before "p4 login" was even a twinkle in Christopher's eye, you could write your trigger assuming super user privileges (says in Yorkshire accent "we had it tough - could only dream of admin privileges in those days") and everything would work.

Life became substantially more complicated with security level 3 and login being required. Commands failed due to not being logged in, and this turned out to be a bit of a bugger ('scuse my French) to work out (why it had failed that is).

Received wisdom is "run your triggers as a special trigger/admin user, put that user in a special group with timeout of some very large number, log them in manually and all will be sweetness and light".

The interesting thing about this approach is that it often works, but as I discovered recently, can flatter to deceive. The problem I had was that the super user was indeed in a special "long timeout" group, and logged in on the same box (generating a suitable ticket). However, as I discovered only after some hair was torn out, the P4PORT that the user was logged in under was different to that used by the trigger. The entry in the ticket file (P4TICKETS) was therefore also different, so the existing "login" had no effect and my trigger was unfortunately failing silently.

Thus P4PORT=localhost:1666 where localhost=some_server.some_company.com will not work if the superuser is logged in using P4PORT=some_server.some_company.com:1666, since the latter is what will be in the ticket file and the former will not be found and thus commands will fail. Be warned and expect/check for this!
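A rough sketch of why the lookup fails: the ticket is stored against the exact port string used at login, so a different spelling of the same server finds nothing. The file layout assumed here (one "port=user:ticket" entry per line) is my assumption for illustration and may not match every Perforce release; the user name, ticket value and helper name are made up.

```ruby
# Assumed ticket file layout: one "port=user:ticket" entry per line.
# Returns the ticket for an exact port/user match, or nil if none.
def ticket_for(tickets_text, port, user)
  tickets_text.each_line do |line|
    address, rest = line.strip.split("=", 2)
    next if rest.nil?
    owner, ticket = rest.split(":", 2)
    return ticket if address == port && owner == user
  end
  nil
end

# Hypothetical ticket file contents after logging in via the full host name.
tickets = "some_server.some_company.com:1666=trigger_user:0123456789ABCDEF\n"

p ticket_for(tickets, "some_server.some_company.com:1666", "trigger_user")
p ticket_for(tickets, "localhost:1666", "trigger_user")
```

The first lookup finds the ticket; the second returns nil even though both names refer to the same machine, which is the trap described above.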

When in doubt print out the environment within your script (via some sort of debug parameter).
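A minimal sketch of such an environment dump follows. The shortlist of variables and the "--debug" argument are just my assumptions about what is worth seeing; adjust to taste.

```ruby
# Variables worth seeing when a trigger misbehaves (an assumed shortlist).
TRIGGER_ENV_VARS = %w[P4PORT P4USER P4TICKETS P4CONFIG PATH HOME USERNAME]

# Build one "NAME=value" line per variable, marking the ones that are
# missing from the environment the trigger actually runs in.
def environment_report(env = ENV)
  TRIGGER_ENV_VARS.map { |name| "#{name}=#{env[name] || '(not set)'}" }
end

# Hypothetical convention: a "--debug" argument switches the dump on.
if ARGV.include?("--debug")
  $stderr.puts environment_report
end
```

Run from the command line and again from within the server, the two reports make any difference between your login session and the service account obvious at a glance.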

Belt and Braces

My current intentions on this front are to produce a trigger framework that helps detect the above problems, and helps both avoid them and, when necessary, debug them in a (relatively) painless manner. This, at the moment of writing, is a work in progress, but I hope to be able to share it with the wider Perforce community as it emerges into the glare of publicity. I do reserve the right to keep a few surprises back to add some slight spice to my upcoming presentation at the European Perforce User Conference on the 19th September (in London).

Update: hopefully will be able to share a rework/expansion of Tony Smith's P4Trigger.rb framework which addresses some of the above issues fairly shortly - seems to be working at a client - time will tell - but fairly quickly.

Future topics will include ideas on test frameworks etc.

Review of Practical Perforce (Part 1)   06 Apr 06

This is a partial review of Practical Perforce by Laura Wingerd, published by O'Reilly (ISBN 0-596-10185-6). The reason it is partial is that I intend to comment in more detail in future blog articles on some parts of the book, and wanted to post this without waiting for the whole thing!

As Laura mentions in the preface, the book is not intended for complete beginners, but more for readers with experience in other SCM (software configuration management) tools who are looking to understand how Perforce works.

To quote the introduction, there are two parts to this book:

  • Part I (Chapters 1-6) is a whirlwind technical tour of Perforce commands and concepts. It's not a tutorial, nor a reference, but helpful nonetheless.
  • Part II (Chapters 7-11) describes the big picture, using Perforce in a collaborative software development environment. It is particularly strong on branching patterns, how to structure codelines and tips and tricks in this area.

The real meat of the book for most Perforce sites is thus Part II, but there are definitely some goodies in Part I.

Chapter 1 presents some fundamentals about Perforce syntax and concepts. The diagrams on pages 6 & 7 explain the relationship between revisions and changelists very well.

In Chapter 2, Laura discusses client workspaces and things like view syntax. She also describes basic check outs (open for edit in Perforce command line parlance), and introduces branching when she refers to cloning of files. She includes concepts of renaming and replacing content in files, reconciling changes made offline, and even introduces a couple of bits of undocumented syntax such as "p4 files @=1452". Quite a chunk of information in this chapter.

Resolving and merging are the subject of Chapter 3, which includes some very useful diagrams showing various scenarios. If you have ever had any questions about 3-way merging in Perforce - read this! On pp68-69 she gives examples of reconciling changes you have made to a file someone else renamed using the undocumented merge3 command - interesting if a touch esoteric (also referred to in "How to undo a merge" on p80). The recommendation on p74 to sync and resolve one changelist at a time is certainly worth considering, although I think how necessary that is will depend on your environment.

The basics of branching are covered in Chapter 4, including initial scenarios and how to track merge requirements across branches. She makes quite a lot of use of the interchanges command (not yet exposed in the GUIs) and explains the gory details of "yours", "theirs" and "base" nicely. Her approach of using filespec integrations for the initial examples is nice and simple, but I suspect more people are likely to use branch specs in real life. On p111 she gives a useful couple of commands to show how to find which changes have been merged in (more likely to be automated in scripts for most sites I would expect). Other subjects covered include all the gory details of what integrate actually does, as well as some very useful details as to what the interchanges command can tell us, particularly with respect to cherry-picked integrations.

Chapter 5 is quite short on labels and jobs and shows all the basics. A quick note on the final section where a job is used as a reference for a changelist - as of release 2005.2 there is an undocumented "dynamic label" option where a label can have an attached revision which probably makes the job trick unnecessary.

Chapter 6 gets into the subject of remote depots and proxies and also mentions the very useful spec depot option (automatic versioning of all spec objects). There is also some good advice on using p4web in browse mode to access your repository. The section on triggers and automation is a little light, but understandable.

Part II starts with Chapter 7 "How software evolves". This chapter is perhaps the highlight of the book, and introduces concepts that are totally independent of Perforce and apply to many SCM tools. Fortunately the chapter is currently available as a free PDF document from the O'Reilly website for the book. A firm understanding of the concepts introduced here will make it much easier for you to come up with suitable branching patterns for use in your organisation, and also, perhaps more importantly, give you some incredibly useful concepts for explaining your structure to other people within the organisation. Most SCM problems are due to poor communication rather than poor tools, or poor ideas. Laura relates how problems in the real world prevent us from living in an overly simplistic ideal world, and yet how some simple concepts allow us to manage this real world complexity. The "flow of change" and the "tofu scale" are classic concepts which should be in everyone's SCM vocabulary.

Summary

I am going to stop this post here, and will get to further chapters and some detailed comments on them as I have time.

But I will finish with the recommendation - buy this book!!

Regular Expressions in Ruby and Python   27 Mar 06

A personal foible perhaps but I do find Ruby's regular expression syntax remains in my brain much more easily than the Python equivalent.

Maybe it's the Ruby inheritance from Perl that makes the difference. For simple scripts I can just write standard regexps without any recourse to documentation and they just work! For example:

some_var = "prefix interesting_match some suffix"
if some_var =~ /prefix (\w+) some suffix/
  interesting_bit = $1
  print "Match found: ", interesting_bit, "\n"
end

In order to do the same in Python I find myself faffing around with the documentation (using ActiveState Python it's great to have a proper help file, but I would really like more links between the class reference and real examples of usage to help me out) and trying to remember if I want re.search or re.match and how do I get a match group and use it, etc. I have sundry Python scripts floating around that I open up and copy relevant examples from, but it does rather take time.

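import re

some_var = "prefix interesting_match some suffix"
# Raw string so that \w reaches the regex engine untouched.
pat = re.compile(r'prefix (\w+) some suffix')
# match() anchors at the start of the string; search() would scan for it.
m = pat.match(some_var)
if m:
    print('Match found:', m.group(1))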

Now I have to admit that it's not a huge deal in terms of the resulting code, but it took me 5-10 minutes just now to code and debug the Python version as opposed to the Ruby version which I typed in and which worked first time.
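On the re.search versus re.match question above, it may help to see the same distinction from the Ruby side. Ruby's =~ and Regexp#match look for the pattern anywhere in the string (like re.search), and you get re.match-style behaviour by pinning the pattern to the start with \A. A small sketch, with made-up helper names:

```ruby
# Finds the pattern anywhere in the string, like Python's re.search.
def find_anywhere(text)
  m = /prefix (\w+) some suffix/.match(text)
  m && m[1]
end

# \A pins the pattern to the start of the string, like Python's re.match.
def find_at_start(text)
  m = /\Aprefix (\w+) some suffix/.match(text)
  m && m[1]
end

text = "some prefix interesting_match some suffix"
p find_anywhere(text)
p find_at_start(text)
```

Here the unanchored version finds the match while the anchored one returns nil, because the string no longer starts with "prefix".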

The net result is that I am noticeably more productive in Ruby for those little scripts that make life easier, or when I am under strict time pressure. Now this is not to say that I don't like Python, or indeed that when I have a little more time I don't get to use it and enjoy it. Having done some reasonably significant work in Python, e.g. reworking P4DTI for PVCS (now Serena) Tracker, I feel reasonably qualified to comment.

I also took the time to get sufficiently proficient in Python extensions to enhance and maintain P4Python. Mind you I now feel somewhat humbled by the most recent efforts at a Perforce integration (PyPerforce) - shows a depth of Python extensions knowledge before which I can only bow in admiration! (Minor aside - Ruby extensions are much easier to write than Python ones due mainly to the different garbage collection models).

Finishing up, I am definitely Perl-averse these days. There are a few Perl scripts that I maintain and can't be bothered, or can't find a convincing "business" case to rewrite, but anything new is Ruby or Python.

Copyright © 2008 Robert Cowham