Monday, November 7, 2016

Fixed-Point and Floating-Point: Two Things That Don't Go Well Together

One of the more challenging aspects of software development can be dealing with floating-point numbers. David Goldberg's 1991 Computing Surveys paper What Every Computer Scientist Should Know About Floating-Point Arithmetic is a recognized classic treatise on this subject. This paper not only provides an in-depth look at how floating-point arithmetic is implemented in most programming languages and computer systems, but also, through its length and detail, provides evidence of the nuances and difficulties of this subject. The nuances of dealing with floating-point numbers in Java and tactics to overcome these challenges are well documented in sources such as JavaWorld's Floating-point Arithmetic, IBM DeveloperWorks's Java's new math, Part 2: Floating-point numbers and Java theory and practice: Where's your point?, Dr. Dobb's Java's Floating-Point (Im)Precision and Fixed, Floating, and Exact Computation with Java's Bigdecimal, Java Glossary's Floating Point, Java Tutorial's Primitive Data Types, and NUM04-J. Do not use floating-point numbers if precise computation is required.

Most of the issues encountered and discussed in Java related to floating-point representation and arithmetic are caused by the inability to precisely represent (usually) decimal (base ten) floating point numbers with an underlying binary (base two) representation. In this post, I focus on similar consequences that can result from mixing fixed-point numbers (as stored in a database) with floating-point numbers (as represented in Java).

The Oracle database allows numeric columns of the NUMBER data type to be expressed with two integers that represent "precision" and "scale". The PostgreSQL implementation of the numeric data type is very similar. Both Oracle's NUMBER(p,s) and PostgreSQL's numeric(p,s) allow the same datatype to represent essentially an integral value (precision specified but scale not specified), a fixed-point number (precision and scale specified), or a floating-point number (neither precision nor scale specified). Simple Java/JDBC-based examples in this post will demonstrate this.

For the examples in this post, a simple table named DOUBLES in Oracle and doubles in PostgreSQL will be created. The DDL statements for defining these simple tables in the two database are shown next.

createOracleTable.sql

CREATE TABLE doubles
(
   int NUMBER(5),
   fixed NUMBER(3,2),
   floating NUMBER
);

createPgTable.sql

CREATE TABLE doubles
(
   int numeric(5),
   fixed numeric(3,2),
   floating numeric
);

With the DOUBLES table created in Oracle database and PostgreSQL database, I'll next use a simple JDBC PreparedStatement to insert the value of java.lang.Math.PI into each table for all three columns. The following Java code snippet demonstrates this insertion.

Inserting Math.PI into DOUBLES Columns

/** SQL syntax for insertion statement with placeholders. */
private static final String INSERT_STRING =
   "INSERT INTO doubles (int, floating, fixed) VALUES (?, ?, ?)";


final Connection connection = getDatabaseConnection(databaseVendor);
try (final PreparedStatement insert = connection.prepareStatement(INSERT_STRING))
{
   insert.setDouble(1, Math.PI);
   insert.setDouble(2, Math.PI);
   insert.setDouble(3, Math.PI);
   insert.execute();
}
catch (SQLException sqlEx)
{
   err.println("Unable to insert data - " + sqlEx);
}

Querying DOUBLES Columns

/** SQL syntax for querying statement. */
private static final String QUERY_STRING =
   "SELECT int, fixed, floating FROM doubles";

final Connection connection = getDatabaseConnection(databaseVendor);
try (final Statement query = connection.createStatement();
     final ResultSet rs = query.executeQuery(QUERY_STRING))
{
   out.println("\n\nResults for Database " + databaseVendor + ":\n");
   out.println("Math.PI :        " + Math.PI);
   while (rs.next())
   {
      final double integer = rs.getDouble(1);
      final double fixed = rs.getDouble(2);
      final double floating = rs.getDouble(3);
      out.println("Integer NUMBER:  " + integer);
      out.println("Fixed NUMBER:    " + fixed);
      out.println("Floating NUMBER: " + floating);
   }
   out.println("\n");
}
catch (SQLException sqlEx)
{
   err.println("Unable to query data - " + sqlEx);
}

The output of running the above Java insertion and querying code against the Oracle and PostgreSQL databases respectively is shown in the next two screen snapshots.

Comparing Math.PI to Oracle's NUMBER Columns

Comparing Math.PI to PostgreSQL's numeric Columns

The simple examples using Java and Oracle and PostgreSQL demonstrate issues that might arise when specifying precision and scale on the Oracle NUMBER and PostgreSQL numeric column types. Although there are situations when fixed-point numbers are desirable, it is important to recognize that Java does not have a fixed-point primitive data type and use BigDecimal or a fixed-point Java library (such as decimal4j or Java Math Fixed Point Library) to appropriately deal with the fixed-point numbers retrieved from database columns expressed as fixed points. In the examples demonstrated in this post, nothing is really "wrong", but it is important to recognize the distinction between fixed-point numbers in the database and floating-point numbers in Java because arithmetic that brings the two together may not have the results one would expect.

In Java and other programming languages, one needs to not only be concerned about the effect of arithmetic operations and available precision on the "correctness" of floating-point numbers. The developer also needs to be aware of how these numbers are stored in relational database columns in the Oracle and PostgreSQL databases to understand how precision and scale designations on those columns can affect the representation of the stored floating-point number. This is especially applicable if the representations queried from the database are to be used in floating-point calculations. This is another (of many) examples where it is important for the Java developer to understand the database schema being used.

No comments: