Thursday, July 7, 2011

Walking the File Tree with Java 7's Files and FileVisitor

In the previous post Java SE 7 Brings Better File Handling than Ever to Groovy, I discussed using Java 7's NIO.2 implementation to discover a wide set of attributes, characteristics, and metadata regarding files and file systems. I followed that up with coverage of various operations that can be performed on directories and files using Java 7's NIO.2 support in the post File and Directory Operations with Java 7's Files Class. In this post, I focus on the ridiculously useful new overloaded Files.walkFileTree methods, which give Java, Groovy, and other JVM language developers an easy way to traverse a file/directory tree and process files and directories along the way. I only scratch the surface of this powerful technique in this post with a single simple example.

Both overloaded versions of Files.walkFileTree accept a starting point for the "walk" as an instance of the newly introduced Path interface. Both also expect an implementation of the FileVisitor interface parameterized on Path (SimpleFileVisitor in this case). My example in this post uses the simpler (and less flexible) of the two overloads, which does not allow options such as following symbolic links or limiting the depth of the walk to be specified. However, it is sufficient for my example and demonstrates the technique nicely, and the other version of the method is easy enough to use when those options are needed.
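To make the difference between the two overloads concrete, here is a minimal Java sketch that counts regular files with each of them. The class name WalkOverloads and its method names are hypothetical and exist only for illustration; the Files.walkFileTree calls themselves are the standard Java 7 ones.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class, used only to contrast the two overloads.
final class WalkOverloads {

    // Builds a visitor that increments the counter for each regular file visited.
    private static FileVisitor<Path> counting(final AtomicInteger counter) {
        return new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // visitFile is also called for directories sitting at the depth limit,
                // so check the attributes before counting.
                if (attrs.isRegularFile()) {
                    counter.incrementAndGet();
                }
                return FileVisitResult.CONTINUE;
            }
        };
    }

    // Two-argument overload: no options, unlimited depth, symbolic links not followed.
    static int countFiles(Path start) throws IOException {
        AtomicInteger counter = new AtomicInteger();
        Files.walkFileTree(start, counting(counter));
        return counter.get();
    }

    // Four-argument overload: follows symbolic links and stops at maxDepth levels.
    static int countFilesFollowingLinks(Path start, int maxDepth) throws IOException {
        AtomicInteger counter = new AtomicInteger();
        Files.walkFileTree(start, EnumSet.of(FileVisitOption.FOLLOW_LINKS), maxDepth,
                counting(counter));
        return counter.get();
    }
}
```

With a maximum depth of 1, only the entries directly inside the starting directory are examined. Note that when links are followed, a cycle is reported through visitFileFailed, which SimpleFileVisitor rethrows by default, so a more robust version would probably override that method as well.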

My simple example is a script written in Groovy that renames files. For simplicity's sake, I have hard-coded the root directory that will serve as the starting point of the walk, but this could easily have been retrieved as a command-line parameter to the script. This script replaces spaces in file names and changes any uppercase letters to lowercase. The specific details of how files are renamed are documented as comments in the script itself. Here is that simple script.

#!/usr/bin/env groovy
/**
 * renameFiles.groovy.
 * For simplicity, this script assumes all files to be renamed are in the
 * directory C:\workarea. This script renames all files for which renaming is
 * allowed in the C:\workarea directory that meet certain characteristics.
 * This script replaces spaces in file names with underscores, trims spaces off
 * altogether (no underscore) that are the first or last characters of the file
 * name, and changes all letters in the file name to lowercase.
 */

import java.nio.file.attribute.BasicFileAttributes
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.nio.file.FileVisitResult
import java.nio.file.SimpleFileVisitor

// Hard-coded root directory that serves as the starting point of the walk.
Path start = Paths.get("C:\\workarea")
Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
   @Override
   public FileVisitResult visitFile(Path filePath, BasicFileAttributes attrs)
      throws IOException {
      def fileName = filePath.fileName
      def dirPath = filePath.parent
      // Trim leading/trailing spaces, lowercase, then turn remaining spaces into underscores.
      def newFileName = fileName.toString().trim().toLowerCase().replace(" ","_")
      def newFilePath = Paths.get(dirPath.toString(), newFileName)
      Files.move(filePath, newFilePath)
      println "Renamed ${fileName} to ${newFileName}."
      return FileVisitResult.CONTINUE
   }
})

That's all there is to it! The output is captured in the screen snapshot that comes next. The snapshot shows a directory listing with the original files' names followed by a directory listing after this script has been run. The snapshot also includes the terminal in which this script was run.

As the example above indicates, I can override the method FileVisitor.visitFile to prescribe exactly what actions should be taken on each file that is encountered during my traversal of the files (walking the file tree) underneath the starting directory using Files.walkFileTree. The FileVisitor interface also advertises other hooks that I can implement to control other aspects of walking the file tree: visitFileFailed, preVisitDirectory, and postVisitDirectory.
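As a rough illustration of how the four hooks fit together, the following Java sketch implements FileVisitor directly and records each callback in the order it fires. The class name TracingVisitor and the event strings are hypothetical, chosen only for this example.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor that records which hook was called for which entry.
final class TracingVisitor implements FileVisitor<Path> {

    final List<String> events = new ArrayList<String>();

    // Called before any of the directory's entries are visited.
    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        events.add("pre:" + dir.getFileName());
        return FileVisitResult.CONTINUE;
    }

    // Called for each file encountered in the walk.
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        events.add("file:" + file.getFileName());
        return FileVisitResult.CONTINUE;
    }

    // Called when an entry could not be visited; here it is noted and skipped.
    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        events.add("failed:" + file.getFileName());
        return FileVisitResult.CONTINUE;
    }

    // Called after all of the directory's entries have been visited.
    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null) {
            throw exc; // iteration of the directory ended with an error
        }
        events.add("post:" + dir.getFileName());
        return FileVisitResult.CONTINUE;
    }
}
```

Passing an instance to Files.walkFileTree and then inspecting its events list shows that each directory's "pre" event comes before its contents and its "post" event comes after them. Extending SimpleFileVisitor, as the Groovy script does, avoids having to write the hooks that are not needed.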

There are other examples of walking the file tree with Java 7 available online. The Javadoc documentation for FileVisitor includes a "Usage Example" section that provides two examples more complicated than the one I've shown here. The first example deletes the contents of a directory before deleting the directory itself, implementing visitFile and postVisitDirectory (delete the directory AFTER the contained files) in the process. The second example in that Javadoc implements preVisitDirectory to first create a directory and then implements visitFile to copy files to that target directory. The second example also shows off use of the other overloaded version of Files.walkFileTree that accepts the option to follow symbolic links.
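The delete-a-tree idea described above can be sketched in Java roughly as follows. This is my own approximation of the approach rather than a copy of the Javadoc example, and the class name TreeDeleter and method name deleteTree are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper that removes a directory and everything beneath it.
final class TreeDeleter {

    static void deleteTree(Path start) throws IOException {
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            // Files are deleted as soon as they are visited.
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            // A directory is deleted only after its contents are gone.
            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // listing the directory failed, so do not delete it
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Because the two-argument overload does not follow symbolic links, a link encountered during the walk is itself deleted rather than its target. This is destructive code, so it should only ever be pointed at a directory that is safe to lose.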

The Java Tutorials now contain a section focusing exclusively on walking the file tree. Another interesting example online that is also an idea for another application of Java 7 NIO.2 features is in the post The ZIP filesystem provider in JDK7.

Finally, this is a big week for Java 7. Today is Oracle's Java 7 party. I'm looking forward to hearing what they have to say about Java 7 (including coverage of file I/O). In addition, other recent posts of interest related to Java 7 include Using JDK 7's Fork/Join Framework and the announcement that build 147 is the first release candidate of Java 7.


In this post, I've provided a quick taste of the power of the newly available file tree walking mechanism provided by Java 7's NIO.2 support.

1 comment:

Dave Slomer said...

All well and (VERY) good, but let's say we're searching for files that match a filename pattern and don't want to wait to see intermediate results on-screen (no output to console). Then (it seems like) SwingWorker needs to get involved. And that's where I am and that's where I am STUCK because doInBackground() is where (I believe you have to) publish() from and there's no loop within which to do so, if that's where the call to walkFileTree() should be placed. And if not, then surely doInBackground() would then need visitFile(), but there's no loop to be written. Any thoughts? (Consider