Friday, November 16, 2018

JDK 12's Files.mismatch Method

JDK 12 introduces a new method to the Files class: Files.mismatch(Path,Path). The method was added via JDK-8202302 and is available in JDK 12 Early Access Build 20 (the same early access build that supports the new {@systemProperty} Javadoc tag).

JDK-8202302 ["(fs) New Files.mismatch method for comparing files"] adds the Files.mismatch(Path,Path) method "to compare the contents of two files to determine whether there is a mismatch between them." The method can be used to determine "whether two files are equal." There was talk at one time of adding a Files.isSameContent() method, but it was decided to use Files.mismatch(Path,Path) because of its consistency "with the Arrays.mismatch and Buffer.mismatch methods."

The next code listing contains a simple Java class that demonstrates the new Files.mismatch(Path,Path) and contrasts it with Files.isSameFile(Path,Path). The full source code is available on GitHub.

package dustin.examples.jdk12.files;

import java.nio.file.Files;
import java.nio.file.Path;

import static java.lang.System.out;

/**
 * Demonstrate {@code Files.mismatch(Path,Path)} introduced with JDK 12
 * and useful for determining if two files have the same content even
 * if they're not the same files.
 */
public class FilesDemo
{
   public static void main(final String[] arguments) throws Exception
   {
      if (arguments.length < 2)
      {
         out.println("USAGE: FilesDemo <file1Name> <file2Name>");
         return;
      }

      final String file1Name = arguments[0];
      final Path file1Path = Path.of(file1Name);
      final String file2Name = arguments[1];
      final Path file2Path = Path.of(file2Name);

      out.println("\nFiles '" + file1Name + "' and '" + file2Name + "' are "
         + (Files.isSameFile(file1Path, file2Path) ? "the" : "NOT the")
         + " same.\n\n");
      out.println("\nFiles '" + file1Name + "' and '" + file2Name + "' are "
         + (Files.mismatch(file1Path, file2Path) == -1 ? "the" : "NOT the")
         + " same content.\n\n");
   }
}

When the above code is executed against various combinations of files, it provides the results captured in the next table. Note that the Files.mismatch method does not itself return a boolean. It returns a long (-1 when there is no mismatch, otherwise the position of the first mismatched byte), and the code shown above adapts that response to the appropriate boolean.
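
The adaptation from the method's long return value to a boolean can be pulled into a small helper. The following is only a minimal sketch: the class name SameContentSketch and the method name haveSameContent are hypothetical names chosen for illustration and are not part of the JDK or of the demonstration class above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper (not part of the JDK) that adapts the long
 * returned by Files.mismatch(Path,Path) to a boolean.
 */
public class SameContentSketch
{
   /** Returns true when Files.mismatch reports no mismatch (-1). */
   public static boolean haveSameContent(final Path first, final Path second)
      throws IOException
   {
      return Files.mismatch(first, second) == -1L;
   }

   public static void main(final String[] arguments) throws IOException
   {
      // Temporary files are used so the sketch does not depend on existing files.
      final Path directory = Files.createTempDirectory("sameContentSketch");
      final Path original = directory.resolve("original.txt");
      final Path copy = directory.resolve("copy.txt");
      final Path different = directory.resolve("different.txt");
      try
      {
         Files.writeString(original, "same text");
         Files.writeString(copy, "same text");
         Files.writeString(different, "other text");

         System.out.println("original vs. copy: " + haveSameContent(original, copy));
         System.out.println("original vs. different: " + haveSameContent(original, different));
      }
      finally
      {
         Files.deleteIfExists(original);
         Files.deleteIfExists(copy);
         Files.deleteIfExists(different);
         Files.deleteIfExists(directory);
      }
   }
}
```

A helper like this keeps the comparison with -1 in one place, at the cost of discarding the mismatch position that Files.mismatch would otherwise provide.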

Results of Code Snippet Using Files.isSameFile() and Files.mismatch()

Files Relationship      Files.isSameFile(Path,Path)   Files.mismatch(Path,Path)
Same File               "Same" (true)                 "Same" (-1)
Copied File             "Not Same" (false)            "Same" (-1)
Different Files         "Not Same" (false)            "Not Same" (position of first mismatched byte, 0 or greater)
Soft-linked             "Same" (true)                 "Same" (-1)
Hard-linked             "Same" (true)                 "Same" (-1)
Either File Not Found   NoSuchFileException           NoSuchFileException
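
To look at what Files.mismatch(Path,Path) returns directly, rather than the "same"/"not same" adaptation in the table, a sketch along the following lines may help. The class name MismatchResultSketch, the method name describeMismatch, and the file names are assumptions made for illustration only. The sketch relies on the documented behavior that the method returns the position of the first mismatched byte, or the size of the smaller file when that file is a prefix of the larger one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/**
 * Hypothetical sketch that reports the direct result of
 * Files.mismatch(Path,Path) instead of adapting it to a boolean.
 */
public class MismatchResultSketch
{
   /** Describes the mismatch result, or the missing file if one path does not exist. */
   public static String describeMismatch(final Path first, final Path second)
      throws IOException
   {
      try
      {
         final long position = Files.mismatch(first, second);
         return position == -1L
            ? "no mismatch"
            : "first mismatch at byte " + position;
      }
      catch (NoSuchFileException missingFile)
      {
         return "file not found: " + missingFile.getFile();
      }
   }

   public static void main(final String[] arguments) throws IOException
   {
      final Path directory = Files.createTempDirectory("mismatchResultSketch");
      final Path base = Files.writeString(directory.resolve("base.txt"), "abcdef");
      final Path changed = Files.writeString(directory.resolve("changed.txt"), "abcxef");
      final Path prefix = Files.writeString(directory.resolve("prefix.txt"), "abc");
      final Path missing = directory.resolve("missing.txt"); // never created

      System.out.println(describeMismatch(base, base));     // no mismatch
      System.out.println(describeMismatch(base, changed));  // first mismatch at byte 3
      System.out.println(describeMismatch(base, prefix));   // first mismatch at byte 3 (size of smaller file)
      System.out.println(describeMismatch(base, missing));  // file not found: ...

      Files.delete(base);
      Files.delete(changed);
      Files.delete(prefix);
      Files.delete(directory);
   }
}
```

Because position 0 is a valid result when the very first bytes differ, comparing against -1 is safer than testing for a positive value.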

The addition of Files.mismatch(Path,Path) is another step in accomplishing JDK-6852033 ["Inputs/Outputs methods to make common I/O tasks easy to do"] and makes it easier to determine when two files that are not the same file are still "equal" or have the same content.

4 comments:

Unknown said...

I hope it has saner behaviour than Files.isSameFile() when one of the files does not exist, i.e., I hope it returns false rather than throwing NoSuchFileException.

@DustinMarx said...

Hello Mike,

Thanks for the comment. You bring up an interesting point, so I tried it out. It turns out that Files.mismatch(Path, Path) does throw a NoSuchFileException just like Files.isSameFile(Path, Path). I'm going to update the post's table with this information.

Dustin

Tobias Schulte said...

Shouldn't the last column values be reversed? Mismatch for the same files and copied files return false instead of true?

@DustinMarx said...

Hello Tobias,

Thanks for the comment. The table (despite its current labeling) is showing the results of the code snippet above it rather than the direct results of the mismatch() method. In other words, it's showing the results for determining "sameness" when using the mismatch() method rather than what the mismatch() method returns directly (-1 or an integer indicating the position where the files differ). You are correct that, in essence, the mismatch() method's direct return is the opposite of sameness because it's returning whether files are different or not. I plan to update that table tonight to make it clearer that the table really represents the results of the code shown above using the isSameFile() and mismatch() methods rather than necessarily what each method directly returns.

Thanks again for the comment,

Dustin