Monday, December 16, 2013

Searching Subversion Logs with Groovy

There are times when I want to quickly search a Subversion repository by author, by range of revisions, and/or by commit message. Krzysztof Kotowicz's blog post Grep Subversion log messages with svn-grep introduces svn-grep, a bash script built on the command-line XML toolkit xmlstarlet (which is also available on Windows). That script is useful in its own right, but it gave me the idea for a Groovy-based script that could run on any platform with a JVM.

searchSvnLog.groovy
#!/usr/bin/env groovy
//
// searchSvnLog.groovy
//
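// CliBuilder is resolved here through Groovy's default groovy.util import.
// Newer Groovy releases (2.5 and later) also ship it as groovy.cli.commons.CliBuilder
// and groovy.cli.picocli.CliBuilder, which may need an explicit import.
def cli = new CliBuilder(
   usage: 'searchSvnLog.groovy -t <target> [-r <revision1>] [-p <revision2>] [-a <author>] [-s <stringInMessage>]')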
cli.with
{
   h(longOpt: 'help', 'Usage Information', required: false)
   r(longOpt: 'revision1', 'First SVN Revision', args: 1, required: false)
   p(longOpt: 'revision2', 'Last SVN Revision', args: 1, required: false)
   a(longOpt: 'author', 'Revision Author', args: 1, required: false)
   s(longOpt: 'search', 'Search String', args: 1, required: false)
   t(longOpt: 'target', 'SVN target directory/URL', args: 1, required: true)
}
def opt = cli.parse(args)

// parse() returns null (after printing usage) when the required -t option is missing.
if (!opt) return
if (opt.h)
{
   cli.usage()
   return
}

Integer revision1 = opt.r ? (opt.r as int) : null
Integer revision2 = opt.p ? (opt.p as int) : null
if (revision1 != null && revision2 != null && revision1 > revision2)
{
   println "It makes no sense to search for revisions ${revision1} through ${revision2}."
   System.exit(-1)
}
String author = opt.a ? (opt.a as String) : null
String search = opt.s ? (opt.s as String) : null
String logTarget = opt.t

// svn expects a revision range in the form START:END (for example, -r 1:HEAD).
String command = "svn log -r ${revision1 ?: 1}:${revision2 ?: 'HEAD'} ${logTarget} --xml"
def proc = command.execute()
StringBuilder standard = new StringBuilder()
StringBuilder error = new StringBuilder()
proc.waitForProcessOutput(standard, error)
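def returnedCode = proc.exitValue()
if (returnedCode != 0)
{
   // Without valid XML on standard output there is nothing to parse, so report and stop.
   System.err.println "ERROR: Returned code ${returnedCode}"
   System.err.println error
   System.exit(returnedCode)
}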

def xmlLogOutput = standard.toString()
def log = new XmlSlurper().parseText(xmlLogOutput)
def logEntries = new TreeMap<Integer, LogEntry>()
log.logentry.each
{ svnLogEntry ->
   Integer logRevision = Integer.valueOf(svnLogEntry.@revision as String)
   String message = svnLogEntry.msg as String
   String entryAuthor = svnLogEntry.author as String
   if (   (!revision1 || revision1 <= logRevision)
       && (!revision2 || revision2 >= logRevision)
       && (!author    || author == entryAuthor)
       && (!search    || message.toLowerCase().contains(search.toLowerCase()))
      )
   {
      def logEntry =
         new LogEntry(logRevision, entryAuthor, svnLogEntry.date as String, message)
      logEntries.put(logRevision, logEntry)
   }
}
logEntries.each
{ logEntryRevisionId, logEntry ->
   println "${logEntryRevisionId} : ${logEntry.author}/${logEntry.date} : ${logEntry.message}"
}

One thing that makes this script much easier to write is that Subversion's log command can write its output as XML when given the --xml flag. Although XML has been the subject of significant criticism in recent years, one thing I've always liked about it is the widespread tool support for writing and reading it, and Subversion's XML output is a good example. Without it, the script would have needed custom code to parse the default, human-oriented svn log output. Because the output is standard XML, any XML-aware tool can read it. In this case, I used Groovy's incredibly easy XML slurping (XML parsing) capability.
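To make the slurping step concrete, here is a small sketch that runs XmlSlurper against a hand-written sample shaped like svn log --xml output. The revisions, authors, and messages are made up for illustration. The import assumes Groovy 3 or later; on older versions XmlSlurper is picked up from the default groovy.util import and the import line should be removed.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this line on older versions

// Hand-written sample in the shape of `svn log --xml` output (made-up data).
def sampleXml = '''<log>
  <logentry revision="101">
    <author>alice</author>
    <date>2013-12-01T10:15:30.000000Z</date>
    <msg>Fix NPE in parser</msg>
  </logentry>
  <logentry revision="102">
    <author>bob</author>
    <date>2013-12-02T08:00:00.000000Z</date>
    <msg>Update build script</msg>
  </logentry>
</log>'''

def log = new XmlSlurper().parseText(sampleXml)

// Child elements are reached by name; attributes use the @ prefix.
def aliceEntries = log.logentry.findAll { it.author.text() == 'alice' }
def aliceRevisions = aliceEntries.collect { it.@revision.text().toInteger() }

println "Revisions by alice: ${aliceRevisions}"
```

The GPath expressions (log.logentry, it.author, it.@revision) mirror the XML structure directly, which is why the filtering logic in the script stays so short.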

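The script also uses Groovy's enhanced (GDK) Process class, which I briefly described in my recent post Sublime Simplicity of Scripting with Groovy. Calling execute() on the command string starts the process, and waitForProcessOutput collects its standard output and standard error into separate buffers.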
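The same process-handling pattern can be tried without a Subversion installation. The sketch below is only an illustration: it swaps svn for the JVM's own java launcher (located through the java.home system property, which is my assumption about the environment) and uses the list form of execute() rather than the string form used in the script.

```groovy
// Stand-in for `svn`: the java launcher of the JVM running this script (an assumption
// made so the example does not depend on Subversion being installed).
def javaExe = new File(System.getProperty('java.home'), 'bin/java').path

// The list form of execute() passes each element as one argument, so arguments
// containing spaces are not split apart the way they are with String.execute().
def proc = [javaExe, '-version'].execute()

def standard = new StringBuilder()
def error = new StringBuilder()
proc.waitForProcessOutput(standard, error)   // blocks until the process has finished

println "Exit code: ${proc.exitValue()}"
println "Standard output: ${standard}"
println "Standard error: ${error}"           // `java -version` reports on standard error
```

Capturing the two streams separately is what lets the script hand only standard output to the XML parser while still having the error text available for a failure message.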

Groovy's built-in command-line support (CliBuilder) is used in the script to accept parameters for narrowing the search (such as the applicable revisions, the author who committed, or a string to look for in the commit messages). The one required parameter is the "target", which can be a file, directory, or URL.

The script references a Groovy class called LogEntry and the code listing for that class is shown next.

LogEntry.groovy
@groovy.transform.Canonical
class LogEntry
{
   int revision
   String author
   String date
   String message
}

That simple-looking LogEntry class is much more powerful than it might first appear. Because it's Groovy, the four fields declared without an access modifier are properties, so getter and setter methods are generated for them automatically. Thanks to the @Canonical annotation, the class also gets a positional (tuple) constructor, which is what the script calls, along with equals, hashCode, and toString methods. In other words, this class of under ten lines has accessor and mutator methods as well as the common Object methods overridden appropriately.
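A quick way to see what @Canonical generates is to exercise it. In this sketch, CommitRecord is a hypothetical stand-in with the same shape as LogEntry (renamed so it can live in a single standalone script), and the values are made up.

```groovy
import groovy.transform.Canonical

// CommitRecord is a hypothetical stand-in with the same fields as LogEntry.
@Canonical
class CommitRecord
{
   int revision
   String author
   String date
   String message
}

// Generated tuple constructor: arguments follow the order of the property declarations.
def first  = new CommitRecord(101, 'alice', '2013-12-01T10:15:30.000000Z', 'Fix NPE in parser')
def second = new CommitRecord(101, 'alice', '2013-12-01T10:15:30.000000Z', 'Fix NPE in parser')

assert first == second                        // generated equals()
assert first.hashCode() == second.hashCode()  // generated hashCode()
assert first.author == 'alice'                // property access goes through the getter

second.message = 'Reworded'                   // property assignment goes through the setter
assert first != second

println first                                 // generated toString() lists the property values
```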

Conclusion

Groovy offers numerous features to make script writing easier. In this post, I used an example of "searching" Subversion commits via the Subversion log command (and its --xml option) to demonstrate some of these useful Groovy scripting features (command-line parameter parsing, native operating system integration, and easy XML parsing). Along the way, some of Groovy's nice syntax advantages (closures, dynamic typing, GString value placeholders) were also used.
