Inspired by Actual Events: January 2010

Saturday, January 30, 2010

Groovy and SQL*Plus

Oracle's SQL*Plus is not as user-friendly as SQL Developer or JDeveloper for manipulating data in an Oracle database, but it is still commonly used. In fact, SQL*Plus is often preferred to the tools with fancier user interfaces for quick and dirty manipulation, for running SQL scripts, for being run as part of shell scripts, and for other non-interactive uses. Furthermore, Oracle's SQL*Plus page calls SQL*Plus "the primary interface to the Oracle Database server." In this blog post, I look at using SQL*Plus in conjunction with Groovy scripts.

Groovy's Process Management plays the most significant role in the process of using Groovy with SQL*Plus. This functionality is easily applied thanks to the GDK's String extension that includes an execute() method which returns a Process running the command contained within that GDK-extended String. This mechanism will be used repeatedly in the examples in this blog post.
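As a minimal sketch of this mechanism that does not require a database, the following script launches the java launcher of the JVM that is running the script and prints what it reports. The choice of command and the variable names are assumptions made purely for illustration. The List form of execute() is used here only because the path to the launcher might contain spaces, which the whitespace-tokenized String form would split.

```groovy
// Locate the java launcher of the JVM that is running this script.
def javaLauncher = new File(System.getProperty('java.home'), 'bin/java').path

// execute() is added to List (and String) by the GDK and returns a java.lang.Process.
Process process = [javaLauncher, '-version'].execute()

// Drain standard output and standard error, then wait for the process to end.
def standardOut = new StringBuilder()
def standardErr = new StringBuilder()
process.waitForProcessOutput(standardOut, standardErr)

println "Exit value: ${process.exitValue()}"
println "Version text: ${standardOut}${standardErr}"
```

Draining both streams matters because a child process can block if it fills a stream that nobody is reading.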

Notes Related Specifically to Windows/Vista

Many books, blogs, and articles on using Groovy with Windows rightly point out that some of the commonly used commands in Windows are not actually executables, but are instead built-in commands. These commands must be prepended with cmd /C to be executed appropriately. However, this step is unnecessary when invoking SQL*Plus from Groovy on Windows because SQL*Plus is a separate executable.
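The cmd /C prefix for built-in commands can be sketched as a small helper. The helper name wrapForShell and the sh -c branch for non-Windows systems are my own assumptions for illustration; only the cmd /C form comes from the discussion above.

```groovy
// Hypothetical helper: wrap a shell built-in so that it can be run as a process.
List<String> wrapForShell(String command, String osName = System.getProperty('os.name')) {
    osName.toLowerCase().contains('windows') ?
        ['cmd', '/C', command] :
        ['sh', '-c', command]
}

println wrapForShell('dir', 'Windows Vista')   // [cmd, /C, dir]
println wrapForShell('echo hello', 'Linux')    // [sh, -c, echo hello]
```

The returned list can be run with the GDK's List.execute(), as in wrapForShell('dir').execute(). An executable such as SQL*Plus needs no such wrapping.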

When using Vista, it is important to run SQL*Plus with the appropriate privileges. Because the Groovy scripts in this post run SQL*Plus, the scripts should be executed in a console window that is being run under administrative privileges. An error message including the phrase "The requested operation requires elevation" is seen when the Groovy scripts using SQL*Plus are executed without the appropriate privileges. This is demonstrated in the next screen snapshot.

Example 1: Basic Query Statement

One of the easiest ways of using SQL*Plus is to write a file containing commands to be run within SQL*Plus. These files often end with the .sql suffix and are typically executed in SQL*Plus by prefacing the name of the script file (with its path if the file is not in the current working directory) with the @ symbol.

In this first example, that SQL*Plus script file is called 01-employeeIds.sql and is a single-line script:

01-employeeIds.sql


select employee_id from employees;

The above script is expected to be executed against the HR sample schema that is supplied with most modern versions of the Oracle database. The Groovy to run this SQL*Plus script file is contained in the next code listing for sql01.groovy:

sql01.groovy


#!/usr/bin/env groovy
def sqlplusCmd = "sqlplus hr/hr @01-employeeIds.sql".execute()
println "${sqlplusCmd.class}"
sqlplusCmd.in.eachLine { line -> println line }

The above Groovy script specifies an explicit call to SQL*Plus with the 'hr' username and associated 'hr' password. It then runs the script shown above by prefacing its name with the "@" symbol. All of this is included in a GDK-extended String upon which the execute() method is invoked. The object returned from that call is a GDK-extended Process instance (as illustrated by the println call below its instantiation) from which each line of output is printed.

The initial results of running the above script are shown in the next screen snapshot:

In the above output, the first of many rows returned from the database appear after the SQL*Plus banner and after the line showing that the GDK String.execute() method returns a java.lang.ProcessImpl.

When run as shown above, this script never ends because the SQL*Plus session is never exited. This is easily addressed by adding the word "exit" to the end of the SQL script, which will be done in the next example. The next screen snapshot demonstrates how the script appears to "hang" when the SQL*Plus session has not been explicitly exited.
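When the SQL script itself cannot be edited, one possible workaround is to generate a small wrapper script that runs the original with @ and then exits. This is only a sketch: the helper name wrapWithExit is hypothetical, and it assumes that the temporary file's path contains no spaces and that SQL*Plus is launched from the directory containing the original script.

```groovy
// Hypothetical helper: create a temporary SQL*Plus script that runs the given
// script with @ and then exits, so the session ends even if the script forgot to.
File wrapWithExit(String scriptPath) {
    File wrapper = File.createTempFile('sqlplus-wrapper-', '.sql')
    wrapper.deleteOnExit()
    wrapper.text = "@${scriptPath}\nexit\n"
    return wrapper
}

def wrapper = wrapWithExit('01-employeeIds.sql')
print wrapper.text
// The wrapper would then be run in place of the original script, for example:
// "sqlplus -S hr/hr @${wrapper.absolutePath}".execute()
```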

Example 2: Cleaning Up SQL*Plus's Output

We often don't want the SQL*Plus banner to be displayed when running SQL*Plus via script. SQL*Plus provides the -S option to suppress this banner. The next version of the Groovy script will take advantage of that as shown in sql02.groovy:

sql02.groovy


#!/usr/bin/env groovy
def sqlplusCmd = "sqlplus -S hr/hr @02-employeeIds.sql".execute()
sqlplusCmd.in.eachLine { line -> println line }

The only changes to this script from the Groovy script in the first example are the addition of -S to suppress SQL*Plus's banner and the calling of a different SQL script file (02-employeeIds.sql), which will be shown next.

When running SQL*Plus with scripts, it is often convenient to suppress other portions of feedback in addition to suppressing the SQL*Plus banner. Many of these types of output are turned off or suppressed within SQL*Plus by using the SET command to set the relevant property to a desired value. The code listing for 02-employeeIds.sql demonstrates turning off query results feedback, header information, and page separation in the SQL*Plus results and also explicitly exits from SQL*Plus.

02-employeeIds.sql


-- HEADING off turns off column headings in output of query results
set HEADING off
-- FEEDBACK turns off message saying number of rows returned
set FEEDBACK off
-- PAGESIZE to 0 removes spaces between "pages" of results
set PAGESIZE 0

select employee_id from employees;

-- Return SQL*Plus settings to defaults
set HEADING on
set FEEDBACK 6
set PAGESIZE 14
exit

The next two screen snapshots demonstrate the leaner/cleaner output with the SQL*Plus banner suppressed along with no headers, no page separation, and no feedback. The second of the images proves that exit successfully exits SQL*Plus (and the invoking Groovy script).

Example 3: Passing in a Parameter

It is often the case that we want to run SQL*Plus scripts depending on parameters set dynamically when the script is executed. SQL*Plus accepts command-line parameters and this feature is leveraged in the next SQL script listing.

03-employeeFind.sql


-- HEADING off turns off column headings in output of query results
set HEADING off
-- FEEDBACK turns off message saying number of rows returned
set FEEDBACK off
-- PAGESIZE to 0 removes spaces between "pages" of results
set PAGESIZE 0

select first_name || ' ' || last_name "NAME" from employees where employee_id = &1;

-- Return SQL*Plus settings to defaults
set HEADING on
set FEEDBACK 6
set PAGESIZE 14
exit

In the above script, an employee's full name is returned based on a provided parameter (&1) representing the employee's ID. To support this expected parameter, the Groovy script must supply that ID to the SQL*Plus script. This is shown in the Groovy script sql03.groovy:

sql03.groovy


#!/usr/bin/env groovy
if (!args)
{
   println "You must supply the ID of the employee of interest."
   System.exit(-1)
}
def sqlplusCmd = "sqlplus -S hr/hr @03-employeeFind.sql ${args[0]}".execute()
sqlplusCmd.in.eachLine { line -> println line }

This Groovy script exits promptly if no parameter is provided and informs the user that an ID is required. If the ID is provided, it is passed to the SQL*Plus script by appending it to the end of the SQL*Plus invocation. The output from this is shown next.

The screen snapshot just shown demonstrates that the parameter passing from command-line to Groovy script to SQL*Plus worked fine. The only downside is that the SQL*Plus variable substitution is explicitly shown and this may not always be desirable. I can turn this off by specifying set VERIFY off in the SQL*Plus script. When I do that, the output is as shown next.
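Because SQL*Plus substitutes &1 into the statement as plain text, it can be worth checking the argument in Groovy before handing it to SQL*Plus. The following is a minimal sketch of such a check. The helper name isValidEmployeeId is hypothetical, and the limit of six digits is my assumption about the employee_id column rather than something stated above.

```groovy
// Hypothetical check: accept only a plain run of one to six digits as an employee ID.
boolean isValidEmployeeId(String candidate) {
    candidate ==~ /\d{1,6}/
}

['101', '42 or 1=1', '', 'abc'].each { candidate ->
    println "'${candidate}' valid? ${isValidEmployeeId(candidate)}"
}
```

In a script like sql03.groovy, a check of this kind could sit next to the existing !args test so that anything other than a numeric ID is rejected before SQL*Plus is started.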

Example 4: Handling SQL*Plus Return Code

Because Groovy's process management capability uses a GDK-extended Process, the Process.exitValue() method can be used to access the codes returned from invoked commands. This allows the Groovy code to analyze the codes returned from the invoked script and act appropriately.

To illustrate this, the SQL*Plus script used above is modified to return a -4. The modified version is shown next.

04-employeeFind.sql


-- HEADING off turns off column headings in output of query results
set HEADING off
-- FEEDBACK turns off message saying number of rows returned
set FEEDBACK off
-- PAGESIZE to 0 removes spaces between "pages" of results
set PAGESIZE 0
-- VERIFY off turns off the prompts showing variable substitution
set VERIFY off

select first_name || ' ' || last_name "NAME" from employees where employee_id = &1;

-- Return SQL*Plus settings to defaults
set HEADING on
set FEEDBACK 6
set PAGESIZE 14
set VERIFY on

-- Pretend there is some type of error or other condition in which the returning
-- of -4 is appropriate
exit -4

The modified Groovy script that takes advantage of Process.exitValue() to process this returned code is shown next. This Groovy script simply prints the value, but it could implement alternative logic based on the code returned.

sql04.groovy


#!/usr/bin/env groovy
if (!args)
{
   println "You must supply the ID of the employee of interest."
   System.exit(-1)
}
def sqlplusCmd = "sqlplus -S hr/hr @04-employeeFind.sql ${args[0]}".execute()
sqlplusCmd.in.eachLine { line -> println line }
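// Wait for SQL*Plus to terminate; exitValue() throws IllegalThreadStateException otherwise
sqlplusCmd.waitFor()
println "Return value: ${sqlplusCmd.exitValue()}"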

When this script is executed, its output looks like that shown in the next screen snapshot.
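One caveat when interpreting the returned value: on Unix-like systems a process exit status is limited to the range 0 to 255, so the -4 requested by the SQL*Plus script would typically be reported there as 252 rather than as -4. The arithmetic can be sketched as follows; the helper name toPosixExitStatus is hypothetical.

```groovy
// Exit status as a POSIX system would report it: only the low eight bits survive.
int toPosixExitStatus(int requestedCode) {
    requestedCode & 0xFF
}

println toPosixExitStatus(-4)   // 252
println toPosixExitStatus(0)    // 0
```

Groovy code that branches on the return code may therefore be more portable if the SQL*Plus scripts stick to small non-negative exit codes.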

Example 5: Processing Significant SQL*Plus Output

There are times when the Groovy script needs to process more than a return code. One way of doing this is to have the SQL*Plus script write out data to an external file that the Groovy script can process. It helps, of course, that Groovy provides some nice GDK File extensions.

The SQL*Plus SPOOL command can be used to spool output to an operating system file.

An example of a SQL*Plus script that writes employee last names and first names, separated by a comma, on individual lines of a file named "name.txt" is shown next.

05-employeeList.sql


-- HEADING off turns off column headings in output of query results
set HEADING off
-- FEEDBACK turns off message saying number of rows returned
set FEEDBACK off
-- PAGESIZE to 0 removes spaces between "pages" of results
set PAGESIZE 0
-- VERIFY off turns off the prompts showing variable substitution
set VERIFY off
-- COLSEP sets character to be placed between returned columns
set COLSEP ,
-- TRIMSPOOL on removes trailing blanks from each spooled line (spool file only)
set TRIMSPOOL on

spool name.txt
select last_name, first_name from employees;
spool off

-- Return SQL*Plus settings to defaults
set HEADING on
set FEEDBACK 6
set PAGESIZE 14
set VERIFY on
set COLSEP " "
set TRIMSPOOL off
exit

When a Groovy script runs this file, the expected file named name.txt is generated. A portion of it (top and bottom) is shown next.


Abel                     ,Ellen
Ande                     ,Sundar
Atkinson                 ,Mozhe
Austin                   ,David
Baer                     ,Hermann
Baida                    ,Shelli
Banda                    ,Amit
Bates                    ,Elizabeth
Bell                     ,Sarah
Bernstein                ,David
Bissot                   ,Laura
Bloom                    ,Harrison
Bull                     ,Alexis

. . .

Vollman                  ,Shanta
Walsh                    ,Alana
Weiss                    ,Matthew
Whalen                   ,Jennifer
Zlotkey                  ,Eleni

This output file can be processed in the same Groovy script that caused it to be generated. Groovy provides rich features for file handling and String manipulation, making it highly effective in this situation.

sql05.groovy


#!/usr/bin/env groovy
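def sqlplusCmd = "sqlplus -S hr/hr @05-employeeList.sql".execute()
// Discard SQL*Plus's console output and wait for it to finish so that
// name.txt is completely written before it is read
sqlplusCmd.consumeProcessOutput()
sqlplusCmd.waitFor()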
def file = new File("name.txt")
file.eachLine { println "Name (Last, First): ${it}" }

Much more sophisticated logic could be performed on the contents of the name.txt file, but this demonstrates how easy it is to process the spooled output with the Groovy script. The bottom portion of the script's execution is shown in the next screen snapshot.
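As one small step toward that more sophisticated logic, the spooled lines can be split into their two columns. The sketch below uses two sample lines copied from the name.txt excerpt rather than reading the file, and the variable names are my own.

```groovy
// Two sample lines in the format spooled to name.txt above.
def spooledLines = [
    'Abel                     ,Ellen',
    'Zlotkey                  ,Eleni'
]

// Split each line on the COLSEP comma and trim the column padding.
def names = spooledLines.collect { line ->
    def (lastName, firstName) = line.split(',').collect { it.trim() }
    [last: lastName, first: firstName]
}

names.each { println "${it.first} ${it.last}" }
```

To work on the real file, spooledLines could be replaced with new File("name.txt").readLines().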

Conclusion and Final Remarks

Groovy and SQL*Plus can be used effectively together to script data-related functionality. Groovy offers many of the advantages of traditional scripting languages when used in conjunction with SQL*Plus, but offers the added advantage of enjoying full access to the JVM and the plethora of Java libraries. Groovy can be used with existing SQL*Plus scripts or can access the database directly with its powerful and convenient JDBC abstraction GroovySql.

My Personal Blog Policies

The recent posting of The Oracle Social Media Participation Policy on Justin Kestelyn's (Oracle Technology Network Editor in Chief) blog has caused me to reevaluate my own personal and previously unwritten policies that affect how and what I write in this blog. In this post, I briefly look at the informal and unwritten policies that I have attempted to follow in writing posts for this blog and why these internally-motivated policies are important. Other developers who blog (and I think more should) may wish to similarly take an inventory of their own blogging practices.

Mixing of Personal and Technical

One of the main differences Justin calls out between Oracle's policy and Sun's policy is the mixing of personal with work-related content. In an update to the post, Justin points out that not all personal content is discouraged. In many ways, my own blog (though hosted on Google's Blogger rather than on any employer's infrastructure) follows the same advice. I do weave small personal details into my posts, but the focus of each post is always technically related.

The reason for this is that I have found that I prefer that approach as a reader of other people's blogs. I present in the way I most enjoy as an attendee at a presentation (fast-talking, content-rich presentations) and I write my blog in the way I prefer others' blogs (strongly focused on the technical subjects with just enough personal details to make the topic less dry, somewhat personable, and to provide informed opinion).

I tend to prefer it when other bloggers separate purely personal non-technical content from their technical content. The reason for this is that I often don't know these people personally and while their technical expertise and insight is valuable to me, I really don't care about their purely personal interests (especially in terms of politics). I don't want to wade through meaningless and uninteresting non-technical details to find the technical gems. I do "maintain" a personal blog with non-technical subjects, but it doesn't get the same attention from me as my technical blog.

Long-Term Blog Availability

One of the respondents to Justin's post asked about the future of his Sun blog. This is a reminder that one's blog can get removed at any time. Hosting one's own blog obviously reduces the chances of anyone else removing it, but insufficient backups can mean problems even then. Hackers, hardware problems, and even user error could wipe out a blog whether it is hosted externally or on one's own. Very few things in life are guaranteed and online resource availability is certainly not (see the demise of GeoCities as an example). It is important to back up one's blogs. It helps if a blog has been reproduced on other hosts (which is sometimes the case), but even this can be difficult to piece back together.

I was very happy when Blogger announced the ability to export a blog and import a blog. That certainly contributes to the ability to move one's blog to another host if necessary or desired. The ability to do that is one of the reasons that I've stayed with Blogger.

In addition, I have my blog posts forwarded to multiple e-mail accounts as additional backup. The export/import is certainly easier if I ever need to actually do this in bulk, but the e-mail messages are automatic and are easily stored. In other words, the export/import can be more difficult to remember to do in a timely manner, but is easier to use if the time comes to move the blog in bulk.

In the worst case, there are other options for resurrecting a "lost blog" such as using Google Cache or the Internet Archive Wayback Machine. Reconstructing a blog from these two sources would be tedious and would depend on the blog being archived/cached in these resources.

In summary on the topic of long-term blog availability, I continue to use Blogger and back it up occasionally using the export feature. I also have my posts forwarded to multiple e-mail accounts as soon as they are submitted. If Google ever stops supporting Blogger, I will hopefully have the ability to easily move my blog posts to another provider or, in the less optimal case, reconstruct them from my e-mail messages.

Watching What I Say

In my blog, I do endeavor to keep my criticisms as constructive as possible and I try not to allow emotion to override technical merit. Although I feel I do pretty well at this generally, I do better at it some times than others. In one of the blog posts that seemed to lead to Justin's post, Assimilation Begins...Oracle Censors Blogs.Sun.com, the blog author does seem to drop into an emotionally-charged state with significant use of hyperbole (though I love the use of the image of one of my favorite fictional villains [The Borg]).

The author uses words like "muck-mucks", "subjugate Sun culture" (cultural change seems to inevitably happen in mergers and acquisitions between very large companies), "Snoracle," "draconian rules," and "cultural imperialism." Normally, use of exaggeration and hyperbole undermines the credibility of a blog topic in my mind, but in this case, I attribute it more to the blog author's referenced personal connection to the creation of the blogs.sun.com site. It is easier to understand this emotion in that light, even if it seems a little overly dramatic from an outsider's perspective.

I acknowledge that I have said things and written things that are emotionally-driven and often overly critical, especially in e-mail and newsgroups. I try a little harder with blogs to not go that far and be more controlled in what I write.

I prefer blogs with opinions and facts as long as the opinions are informed opinions and are presented as opinion rather than fact. I also understand the importance of not revealing trade secrets regarding employers, clients, etc. and of not committing libel, slander, and the like. When one writes a blog post, the onus is on that person to think carefully about what he or she writes.

In general, I sometimes feel the "online frontier" is far too wild in terms of lack of civility and respect. I do try to keep my blog more positively focused and to talk about things I like or appreciate more often than my occasional posts about things I don't like or wish would be better. There is a place for constructive criticism and constant improvement, but it doesn't need to be rude and condescending.

Conclusion

I think it is important to have personal guidelines that shape what we say, do, and write. Having internal guidelines that I attempt to adhere to in writing this blog is an extension of that. It is my opinion that these guidelines benefit me and those who read one or more of my posts. It's time to back up my blog while I'm thinking about it.

Wednesday, January 27, 2010

Oracle/Sun: The Deal Has Closed

"Oracle has finalized the Sun transaction and the deal has closed."

The above statement can be found on the Sun and Oracle: Overview and Frequently Asked Questions for the Developer Community page. Besides this simple but definitive statement, the page includes some answers to some of the early questions Java developers might have about Java under Oracle. Questions asked cover subjects such as Sun's Java development network pages, Java.net, future of JavaOne, and more.

Today's Oracle+Sun Strategy presentation has been well-covered in the blogosphere (though the dominant topic in the blogosphere today was the announcement regarding the Apple iPad). There is also a web page devoted to Oracle + Sun Product Strategy Webcast Series.

If you are a Java developer and don't already have an Oracle Technology Network account, it might be a good time to consider getting one.

The official press release on Oracle's finalizing of the Sun acquisition is available here and includes the simple statement: "Oracle Corporation announced today that it had completed its acquisition of Sun Microsystems, Inc."

Monday, January 25, 2010

GroovySql and MarkupBuilder: SQL-to-XML

I blogged previously on the ease of use of GroovySql, which is Groovy's easy-to-use JDBC-based data access approach. In another blog post, I covered the utility of OracleXMLQuery, a class which easily generates XML from an Oracle database query. This class makes it very easy, almost trivial, to convert SQL query results into XML, but a major disadvantage of OracleXMLQuery is that it is a proprietary solution. In this blog post, I demonstrate combining GroovySql with Groovy's MarkupBuilder to easily transform database data to XML.

The following script demonstrates the tremendous benefit achieved from the combination of GroovySql (and specifically the groovy.sql.Sql class) and MarkupBuilder. In this script's case, the MarkupBuilder is instantiated with its no-arguments constructor, implying that its markup output is to be written out via System.out.


#!/usr/bin/env groovy
// Be sure to include appropriate JAR with OracleDataSource class on classpath
// (or use the appropriate datasource class if using different driver)

import groovy.sql.Sql
import java.sql.SQLRecoverableException
Sql sql = null;
try
{
   sql = Sql.newInstance(
      "jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",  
      "oracle.jdbc.pool.OracleDataSource")
}
catch (SQLRecoverableException recoverableEx)
{
   println "SQLRecoverableException: ${recoverableEx}"
   println "\tError Code: ${recoverableEx.getErrorCode()}"
   println "\tSQL Status Code: ${recoverableEx.getSQLState()}"
   System.exit(-1)
}

import groovy.xml.MarkupBuilder
def xml = new MarkupBuilder()
xml.Employees
{
   sql.eachRow("select employee_id, last_name, first_name, email, phone_number from employees") { row ->
      Employee
      {
         Id(row.employee_id)
         LastName(row.last_name)
         FirstName(row.first_name)
         Email(row.email)
         Telephone(row.phone_number)
      }
   }
}

When the above script is executed, XML is generated with elements featuring names provided above. A portion of the output is displayed next.


<Employees>
  <Employee>
    <Id>198</Id>
    <LastName>OConnell</LastName>
    <FirstName>Donald</FirstName>
    <Email>DOCONNEL</Email>
    <Telephone>650.507.9833</Telephone>
  </Employee>
  <Employee>
    <Id>199</Id>
    <LastName>Grant</LastName>
    <FirstName>Douglas</FirstName>
    <Email>DGRANT</Email>
    <Telephone>650.507.9844</Telephone>
  </Employee>

   . . .

  <Employee>
    <Id>196</Id>
    <LastName>Walsh</LastName>
    <FirstName>Alana</FirstName>
    <Email>AWALSH</Email>
    <Telephone>650.507.9811</Telephone>
  </Employee>
  <Employee>
    <Id>197</Id>
    <LastName>Feeney</LastName>
    <FirstName>Kevin</FirstName>
    <Email>KFEENEY</Email>
    <Telephone>650.507.9822</Telephone>
  </Employee>
</Employees>

The names of these tags (Employees root tag, Employee row tag, and the individual employee elements) are exactly as specified in the Groovy code.
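The same builder calls can be tried without a database. The sketch below is an illustration under stated assumptions: the rows list is hypothetical in-memory data standing in for the results of sql.eachRow, and the markup is captured in a StringWriter so that it can be inspected before being printed.

```groovy
import groovy.xml.MarkupBuilder

// Hypothetical in-memory rows standing in for the results of sql.eachRow(...).
def rows = [
    [employee_id: 198, last_name: 'OConnell', first_name: 'Donald'],
    [employee_id: 199, last_name: 'Grant',    first_name: 'Douglas']
]

// Passing a Writer to the constructor captures the markup instead of printing it.
def writer = new StringWriter()
def xml = new MarkupBuilder(writer)
xml.Employees {
    rows.each { row ->
        Employee {
            Id(row.employee_id)
            LastName(row.last_name)
            FirstName(row.first_name)
        }
    }
}

println writer
```

Each method call on the builder becomes an element with that name, which is why the tag names match the Groovy code exactly.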

It is not always preferable to have the generated XML written to standard output. The next Groovy script is a slightly modified version of the above that writes the generated XML to a file whose filename is specified on the command line.


#!/usr/bin/env groovy
// Be sure to include appropriate JAR with OracleDataSource class on classpath
// (or use the appropriate datasource class if using different driver)

import groovy.sql.Sql
import java.sql.SQLRecoverableException
Sql sql = null;
try
{
   sql = Sql.newInstance(
      "jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",  
      "oracle.jdbc.pool.OracleDataSource")
}
catch (SQLRecoverableException recoverableEx)
{
   println "SQLRecoverableException: ${recoverableEx}"
   println "\tError Code: ${recoverableEx.getErrorCode()}"
   println "\tSQL Status Code: ${recoverableEx.getSQLState()}"
   System.exit(-1)
}

import groovy.xml.MarkupBuilder
MarkupBuilder xml
if (args.length < 1)
{
   xml = new MarkupBuilder()  // no-arguments constructor: System.out
}
else
{
   xml = new MarkupBuilder(new PrintWriter(new File(args[0])))
}

xml.Employees
{
   sql.eachRow("select employee_id, last_name, first_name, email, phone_number from employees") { row ->
      Employee
      {
         Id(row.employee_id)
         LastName(row.last_name)
         FirstName(row.first_name)
         Email(row.email)
         Telephone(row.phone_number)
      }
   }
}

The script immediately above writes the generated XML out to a file with the name passed in as a command-line argument. If there is no command-line argument provided, the generated XML is written to System.out as was done in the first script.

The MarkupBuilder class allows the developer to specify customization beyond where the generated XML is written. Two such methods are MarkupBuilder.setOmitEmptyAttributes(boolean) and MarkupBuilder.setOmitNullAttributes(boolean). These specify whether attributes with empty or null values are omitted from the generated XML. Because they apply to attributes, they do not change element content such as that produced by the scripts above.
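A brief sketch of these two settings follows. The attribute-based Employee element is a hypothetical layout chosen only to show the effect, since the scripts above write values as element content instead of attributes.

```groovy
import groovy.xml.MarkupBuilder

def writer = new StringWriter()
def xml = new MarkupBuilder(writer)
xml.setOmitNullAttributes(true)
xml.setOmitEmptyAttributes(true)

// Hypothetical attribute-based layout; email is null and phone is empty.
xml.Employee(id: 198, email: null, phone: '')

println writer
```

With both settings enabled, only the id attribute should remain on the generated element.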

Conclusion

GroovySql and MarkupBuilder are both simple to use while at the same time being highly useful. They are especially powerful together and provide a general approach to presenting database data as XML without proprietary classes specific to any one database vendor.

Saturday, January 23, 2010

Groovy: Injecting New Methods on Existing Java Classes

One of Groovy's most significant "goodies" is the ability to intercept calls, even to standard Java classes, and either override existing methods or even add new methods dynamically. In this blog post, I will demonstrate how to add a method to Calendar that provides the appropriate tropical zodiac sign.

Groovy offers rich dynamic capabilities via its MetaClass support. In this blog post, I will take advantage of this feature to have a method called getZodiakTropicalSign() injected into the SDK-provided Calendar. The purpose of this method is to provide the applicable zodiac sign for a given instance of Calendar.

The portion of this script most relevant to this post appears at the top of the script. The syntax Calendar.metaClass.getZodiakTropicalSign = { -> defines a new method getZodiakTropicalSign on Calendar and then uses delegate to work with the underlying instance of java.util.Calendar.


#!/usr/bin/env groovy

Calendar.metaClass.getZodiakTropicalSign = { ->
   def humanMonth = delegate.get(Calendar.MONTH) + 1
   def day = delegate.get(Calendar.DATE)
   def zodiakTropicalSign = "Unknown"
   switch (humanMonth)
   {
      case 1 :
         zodiakTropicalSign = day < 20 ? "Capricorn" : "Aquarius" 
         break
      case 2 :
         zodiakTropicalSign = day < 19 ? "Aquarius" : "Pisces"
         break
      case 3 :
         zodiakTropicalSign = day < 21 ? "Pisces" : "Aries"
         break
      case 4 :
         zodiakTropicalSign = day < 20 ? "Aries" : "Taurus"
         break
      case 5 :
         zodiakTropicalSign = day < 21 ? "Taurus" : "Gemini"
         break
      case 6 :
         zodiakTropicalSign = day < 21 ? "Gemini" : "Cancer"
         break
      case 7 :
         zodiakTropicalSign = day < 23 ? "Cancer" : "Leo"
         break
      case 8 : 
         zodiakTropicalSign = day < 23 ? "Leo" : "Virgo"
         break
      case 9 :
         zodiakTropicalSign = day < 23 ? "Virgo" : "Libra"
         break
      case 10 :
         zodiakTropicalSign = day < 23 ? "Libra" : "Scorpio"
         break
      case 11 :
         zodiakTropicalSign = day < 22 ? "Scorpio" : "Sagittarius"
         break
      case 12 :
         zodiakTropicalSign = day < 22 ? "Sagittarius" : "Capricorn"
         break
   }
   return zodiakTropicalSign
}

def calendar = Calendar.instance
// First argument to script is expected to be month integer between 1 (January)
// and 12 (December).  Second argument is expected to be the date of that month.
// If fewer than two arguments are provided, will use today's date (default) instead.
if (args.length > 1)
{
   try
   {
      calendar.set(Calendar.MONTH, (args[0] as Integer) - 1)
      calendar.set(Calendar.DATE, args[1] as Integer)
   }
   catch (NumberFormatException nfe)
   {
      println "Arguments ${args[0]} and ${args[1]} are not valid respective month and date integers."
      println "Using today's date (${calendar.get(Calendar.MONTH)+1}/${calendar.get(Calendar.DATE)}) instead."
   }
}
print "A person born on ${calendar.getDisplayName(Calendar.MONTH, Calendar.LONG, Locale.US)}"
println " ${calendar.get(Calendar.DATE)} has the Zodiak sign ${calendar.getZodiakTropicalSign()}"

The final line of the above script calls the injected getZodiakTropicalSign() method on Calendar. This is demonstrated in the next screen snapshot.

The first execution of the script in the above screen snapshot is executed without parameters and so it prints out the zodiac sign for today (23 January). The remainder of the script runs are for different dates. This script could have been made more user-friendly using Groovy's built-in CLI support, but the focus here is on the dynamically injected method.
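The injection technique itself can be reduced to a few lines. The following sketch injects a hypothetical getWordCount() method into java.lang.String using the same metaClass and delegate idiom; the method name and the sample text are assumptions for illustration only.

```groovy
// Inject a hypothetical getter-style method into java.lang.String.
String.metaClass.getWordCount = { ->
    // delegate is the String instance the method was called on
    delegate.tokenize().size()
}

def motto = 'Nil sine numine'
println motto.getWordCount()   // 3
println motto.wordCount        // 3, property-style access to the same method
```

Because the injected method follows the getter naming convention, Groovy also exposes it with property syntax, which means calendar.zodiakTropicalSign should work as an alternative to calling getZodiakTropicalSign() in the script above.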

Conclusion

Groovy allows the developer to inject methods on instances of existing classes even when the developer does not own or have control over the existing class. This can be a powerful capability for bending an existing class to one's will.

Thursday, January 21, 2010

Groovy: Switch on Steroids

UPDATE: This post underwent significant updates on 17 November 2016 to correct erroneous statements and examples, to fix the underlying HTML layout (not obvious to readers unless you view HTML source in a web browser), and to fix some spelling issues.

I have blogged before regarding Groovy's support for switching on String. Groovy can switch on much more than just literal Strings (and literal integral types that Java allows switching on) and I demonstrate this briefly here.

Groovy's switch statement will use a method implemented with the name "isCase" to determine if a particular switch option is matched. This means that custom objects are "switchable" in Groovy. For the simple example in this blog post, I'll use the Java classes SimpleState and State.

SimpleState.java

package dustin.examples;

import static java.lang.System.out;

/**
 * Java class to be used in demonstrating the "switch on steroids" in Groovy.
 * The Groovy script will be able to {@code switch} on instances of this class
 * via the implicit invocation of {@code toString()} if the {@code case}
 * statements use {@code String}s as the items to match.
 */
public class SimpleState
{
   private String stateName;

   public SimpleState(final String newStateName)
   {
      this.stateName = newStateName;
   }

   @Override
   public String toString()
   {
      return this.stateName;
   }
}

The above Java class's String representation can be switched on in a Groovy script as shown in the next code listing for switchOnSimpleState.groovy:

switchOnSimpleState.groovy

#!/usr/bin/env groovy

import dustin.examples.SimpleState

SimpleState state = new SimpleState("Colorado")
print "The motto for the state of ${state.stateName} is '"
switch (state)
{
   case "Alabama":      print "Audemus jura nostra defendere"
                        break
   case "Alaska":       print "North to the future"
                        break
   case "Arizona":      print "Ditat Deus"
                        break
   case "Arkansas":     print "Regnat populus"
                        break
   case "California":   print "Eureka"
                        break
   case "Colorado":     print "Nil sine numine"
                        break
   case "Connecticut":  print "Qui transtulit sustinet"
                        break
   default : print "<<State ${state.stateName} not found.>>"
}
println "'"

When the above Groovy script is run against the above simple Java class, the code prints out the correct information because Groovy implicitly invokes the toString() method on the "state" instance of SimpleState being switched on. Similar functionality can now be achieved in Java, but one needs to explicitly call toString() on the object being switched on. It's also worth keeping in mind that when I wrote the original version of this post in early 2010, Java did not support switching on Strings. The output of running the above is shown in the screen snapshot below (the name of the script in the snapshot doesn't match the name above because the snapshot is from the original post, before it was corrected and updated).
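For comparison, here is a minimal Java sketch of the explicit toString() approach mentioned above (Java 7 or later is needed to switch on a String). The class name SwitchOnToString and the helper mottoFor are illustrative names that are not part of the original example, and SimpleState is repeated as a nested class only to keep the sketch self-contained.

```java
final class SwitchOnToString
{
   /** Stand-in for the SimpleState class shown earlier (repeated for self-containment). */
   static final class SimpleState
   {
      private final String stateName;

      SimpleState(final String newStateName)
      {
         this.stateName = newStateName;
      }

      @Override
      public String toString()
      {
         return this.stateName;
      }
   }

   /** Hypothetical helper: Java needs the explicit toString() that Groovy applies implicitly. */
   static String mottoFor(final SimpleState state)
   {
      switch (state.toString())
      {
         case "Alabama":  return "Audemus jura nostra defendere";
         case "Colorado": return "Nil sine numine";
         default:         return "<<State " + state + " not found.>>";
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println(mottoFor(new SimpleState("Colorado")));
   }
}
```

Note that this Java version throws a NullPointerException if toString() returns null, so a real implementation would want to guard against that.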

With Groovy and the isCase method, I can switch on just about any data type I like. To demonstrate this, the Java class State will be used and its code listing is shown below. It includes an isCase(State) method that Groovy will implicitly call when instances of State are used as the case choices being switched against. In this case, the isCase(State) method simply calls the State.equals(Object) method to determine if that case is true. Although this is the typical behavior for implementations of isCase(Object), we could have had it determine whether it was a match in any way we wanted.

State.java

package dustin.examples;

import static java.lang.System.out;

public class State
{
   private String stateName;

   public State(final String newStateName)
   {
      this.stateName = newStateName;
   }

   /**
    * Method called implicitly by Groovy's switch when an instance of this
    * class is used as a case value.
    *
    * @param compareState State being switched on, to be compared to me.
    */
   public boolean isCase(final State compareState)
   {
      return compareState != null ? compareState.equals(this) : false;
   }

   public boolean equals(final Object other)
   {
      if (!(other instanceof State))
      {
         return false;
      }
      final State otherState = (State) other;
      if (this.stateName == null ? otherState.stateName != null : !this.stateName.equals(otherState.stateName))
      {
         return false;
      }
      return true;
   }

   @Override
   public String toString()
   {
      return this.stateName;
   }
}

The simple standard Java class shown above implements an isCase method that will allow Groovy to switch on it. The following Groovy script uses this class and is able to successfully switch on the instance of State.

#!/usr/bin/env groovy

import dustin.examples.State
State state = new State("Arkansas")
State alabama = new State("Alabama")
State arkansas = new State("Arkansas")
State alaska = new State("Alaska")
State arizona = new State("Arizona")
State california = new State("California")
State colorado = new State("Colorado")
State connecticut = new State("Connecticut")

print "The motto for the state of ${state.stateName} is '"
switch (state)
{
   case alabama     : print "Audemus jura nostra defendere"
                      break
   case alaska      : print "North to the future"
                      break
   case arizona     : print "Ditat Deus"
                      break
   case arkansas    : print "Regnat populus"
                      break
   case california  : print "Eureka"
                      break
   case colorado    : print "Nil sine numine"
                      break
   case connecticut : print "Qui transtulit sustinet"
                      break
   default : print "<<State ${state.stateName} not found.>>"
}
println "'"

The output in the next screen snapshot indicates that the Groovy script is able to successfully switch on an instance of a State object. The first command is using the "simple" example discussed earlier and the second command is using the example that needs to invoke State's isCase(State) method.

The beauty of this ability to have classes be "switchable" based on the implementation of an isCase() method is that it allows more concise syntax in situations that otherwise might have required lengthy if/else if/else constructs. It's preferable to avoid such constructs completely, but sometimes we run into them and the Groovy switch statement makes them less tedious.

It is entirely possible with the Groovy switch to have multiple switch options match the specified conditions. Therefore, it is important to list the case statements in order of which match is preferred because the first match will be the one executed. The break keyword is used in Groovy's switch as it is in Java.
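The first-match rule can be pictured with a rough Java emulation. This is only a sketch of the dispatch idea and not Groovy's actual implementation: the IsCase interface, the firstMatch helper, and the classify example are all hypothetical names, and Java 8 lambdas stand in for case values that offer an isCase method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FirstMatchSketch
{
   /** Hypothetical stand-in for a case value that offers an isCase method. */
   interface IsCase
   {
      boolean isCase(Object switchValue);
   }

   /** Asks each case, in declaration order, whether it matches; the first match wins. */
   static String firstMatch(
      final Object switchValue, final Map<IsCase, String> cases, final String defaultResult)
   {
      for (final Map.Entry<IsCase, String> entry : cases.entrySet())
      {
         if (entry.getKey().isCase(switchValue))
         {
            return entry.getValue();
         }
      }
      return defaultResult;
   }

   /** Several cases can match the same value, so the most specific one is listed first. */
   static String classify(final Object value)
   {
      final Map<IsCase, String> cases = new LinkedHashMap<>();  // keeps insertion order
      cases.put(v -> v instanceof Integer && (Integer) v > 100, "large number");
      cases.put(v -> v instanceof Number, "number");
      cases.put(v -> v != null, "something else");
      return firstMatch(value, cases, "nothing");
   }

   public static void main(final String[] arguments)
   {
      System.out.println(classify(500));
      System.out.println(classify(5));
      System.out.println(classify("text"));
   }
}
```

The value 500 satisfies all three cases in classify, but only the first one listed is used, which is why the order of the cases matters.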

There is much more power in what the Groovy switch supports. Some posts that cover this power include Groovy Goodness: The Switch Statement, Groovy, let me count the ways in which I love thee, and the Groovy documentation.

European Commission Approves Oracle Acquisition of Sun

The European Commission issued a press release today (21 January 2010) stating:

The European Commission has approved under the EU Merger Regulation the proposed acquisition of US hardware and software vendor Sun Microsystems Inc. by Oracle Corporation, a US enterprise software company. After an in-depth examination, launched in September 2009 (see IP/09/1271), the Commission concluded that the transaction would not significantly impede effective competition in the European Economic Area (EEA) or any substantial part of it.

On MySQL

One of the sentences I found most interesting in this press release was actually not directly related to Java:

The Commission's investigation showed that another open source database, PostgreSQL, is considered by many database users to be a credible alternative to MySQL and could be expected to replace to some extent the competitive force currently exerted by MySQL on the database market.

As a developer who generally prefers PostgreSQL over MySQL, this is not news to me, but it is interesting (and appropriate) that the European Commission considered it.

The drama in the Oracle acquisition of Sun has hinged primarily on MySQL. I have blogged before on how hypocritical it has seemed for those who once sold off interest in MySQL (for a handsome profit) to try and use government to force the purchaser of that product to do with it as the original seller wants. Marc Fleury has gone even further, pointing out that these actions had the potential only to hurt Sun and its employees further and to damage the likelihood of corporations investing in open source. His words included the entirely true: "this is making OSS acquisitions look very dangerous and dicey." Charles H. Schulze also has an excellent post on this.

I do think the save MySQL crusade has been counter-productive for the open source software movement and one of the positives of this acquisition will be the end of that destructive force.

The End of an Era

I am sorry to see the end of a company that has been partially or fully responsible for many great innovations and for making my life as a developer easier in many different ways. I realized my tangible sorrow over this when I saw James Gosling's tribute to Sun. The simplicity of the post, titled "So long, old friend" with the stirring image, successfully accomplished what I believe to be its intended point. It's worth viewing for anyone with significant interest in Java. You can also see the same image in the blog post Every Good Thing Has An End.

Wednesday, January 20, 2010

First Solid Look at Oracle + Sun?

Since the announcement of Oracle's intent to buy Sun 9 months ago (20 April 2009), many of us have been only able to speculate on what this means for all things Java. During this relatively long period of time, we've had some clues such as Oracle's advertisement aimed at "Sun Customers", the closely covered events and announcements at 2009 JavaOne, the Oracle Buys Sun slides, and the Reuters interview with Oracle founder and CEO Larry Ellison.

The European Union's concerns with Oracle's acquisition of Sun seem to have been addressed, so it is not all that surprising that we're starting to hear more. The clearest view yet into Oracle's plans for Sun's assets may come from the just-announced Oracle+Sun Strategy event scheduled for 9 am to 2 pm Pacific Time on 27 January 2010 (the same day as the EU's deadline for its regulatory approval of the acquisition). According to the press release, the event will be broadcast globally.

The press release also states that "product roadmaps" and the "strategy for the combined companies" will be outlined in this event. I am certain that the blogosphere and other media formats will be abuzz with coverage, analysis, conspiracy theories, innuendos, and so forth following this event, but I think many of us hope to better understand what this all means for the future of Java. Hopefully we will know quite a bit more about all of this within a week's time.

Tuesday, January 19, 2010

Reproducing "too many constants" Problem in Java

In my previous blog post, I blogged on the "code too large" problem and reproduced that error message. In this post, I look at the very similar "too many constants" error message (not the same thing as the question too many constants?) and demonstrate reproducing it by having too many methods in a generated Java class.

With a few small adaptations, I can adjust the Groovy script that I used to generate a Java class to reproduce the "code too large" error to instead generate a Java class to reproduce the "too many constants" error. Here is the revised script.

generateJavaClassWithManyMethods.groovy


#!/usr/bin/env groovy

import javax.tools.ToolProvider

println "You're running the script ${System.getProperty('script.name')}"
if (args.length < 2)
{
   println "Usage: javaClassGenerationWithManyMethods packageName className baseDir #methods"
   System.exit(-1)
}

// No use of "def" makes the variable available to entire script including the
// defined methods ("global" variables)

packageName = args[0]
packagePieces = packageName.tokenize(".")  // Get directory names
def fileName = args[1].endsWith(".java") ? args[1] : args[1] + ".java"
def baseDirectory = args.length > 2 ? args[2] : System.getProperty("user.dir")
numberOfMethods = args.length > 3 ? Integer.valueOf(args[3]) : 10

NEW_LINE = System.getProperty("line.separator")

// The setting up of the indentations shows off Groovy's easy feature for
// multiplying Strings and Groovy's tie of an overloaded * operator for Strings
// to the 'multiply' method.  In other words, the "multiply" and "*" used here
// are really the same thing.
SINGLE_INDENT = '   '
DOUBLE_INDENT = SINGLE_INDENT.multiply(2)
TRIPLE_INDENT = SINGLE_INDENT * 3

def outputDirectoryName = createDirectories(baseDirectory)
def generatedJavaFile = generateJavaClass(outputDirectoryName, fileName)
compileJavaClass(generatedJavaFile)


/**
 * Generate the Java class and write its source code to the output directory
 * provided and with the file name provided.  The generated class's name is
 * derived from the provided file name.
 *
 * @param outDirName Name of directory to which to write Java source.
 * @param fileName Name of file to be written to output directory (should include
 *    the .java extension).
 * @return Fully qualified file name of source file.
 */
def String generateJavaClass(outDirName, fileName)
{
   def className = fileName.substring(0,fileName.size()-5)
   outputFileName = outDirName.toString() + File.separator + fileName
   outputFile = new File(outputFileName)
   outputFile.write "package ${packageName};${NEW_LINE.multiply(2)}"  
   outputFile << "public class ${className}${NEW_LINE}"  
   outputFile << "{${NEW_LINE}"
   outputFile << "${SINGLE_INDENT}public static void main(final String[] arguments)"
   outputFile << "${NEW_LINE}${SINGLE_INDENT}{${NEW_LINE}"
   outputFile << DOUBLE_INDENT << 'final String someString = "Dustin";' << NEW_LINE
   outputFile << "${SINGLE_INDENT}}${NEW_LINE}"
   outputFile << buildManyMethods()
   outputFile << "}"
   return outputFileName
}


/**
 * Compile the provided Java source code file name.
 *
 * @param fileName Name of Java file to be compiled.
 */
def void compileJavaClass(fileName)
{
   // Use the Java SE 6 Compiler API (JSR 199)
   // http://java.sun.com/mailers/techtips/corejava/2007/tt0307.html#1
   compiler = ToolProvider.getSystemJavaCompiler()
   
   // The use of nulls in the call to JavaCompiler.run indicate use of defaults
   // of System.in, System.out, and System.err. 
   int compilationResult = compiler.run(null, null, null, fileName)
   if (compilationResult == 0)
   {
      println "${fileName} compiled successfully"
   }
   else
   {
      println "${fileName} compilation failed"
   }
}


/**
 * Create directories to which generated files will be written.
 *
 * @param baseDir The base directory used in which subdirectories for Java
 *    source packages will be generated.
 */
def String createDirectories(baseDir)
{
   def outDirName = new StringBuilder(baseDir)
   for (pkgDir in packagePieces)
   {
      outDirName << File.separator << pkgDir
   }
   outputDirectory = new File(outDirName.toString())
   if (outputDirectory.exists() && outputDirectory.isDirectory())
   {
      println "Directory ${outDirName} already exists."
   }
   else
   {
      isDirectoryCreated = outputDirectory.mkdirs()  // Use mkdirs in case multiple
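      // Use the local outDirName here; the script-level outputDirectoryName is
      // declared with "def" and is therefore not visible inside this method.
      println "Directory ${outDirName} ${isDirectoryCreated ? 'is' : 'not'} created."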
   }
   return outDirName.toString()
}


/**
 * Generate the source code for the many empty methods of the generated Java
 * class.
 */
def String buildManyMethods()
{
   def str = new StringBuilder() << NEW_LINE
   for (i in 0..numberOfMethods)
   {
      str << SINGLE_INDENT << "private void doMethod${i}(){}" << NEW_LINE
   }
   return str
}

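When the above script is run with a parameter of 5 for the number of methods, the following Java code is generated. Because the script loops over the inclusive range 0..5, six methods (doMethod0 through doMethod5) are generated.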


package dustin.examples;

public class LotsOfMethods
{
   public static void main(final String[] arguments)
   {
      final String someString = "Dustin";
   }

   private void doMethod0(){}
   private void doMethod1(){}
   private void doMethod2(){}
   private void doMethod3(){}
   private void doMethod4(){}
   private void doMethod5(){}
}

When I turn up the number of generated methods to 65000 methods, I run out of heap space as shown in the next screen snapshot.

The next screen snapshot shows the output of running the script again, but this time with 512 MB maximum heap space specified for the JVM.

What happens when we try to compile a class with too many methods? That is shown in the next screen snapshot that demonstrates what happens when just such a compilation is attempted.

The "too many constants" error message is shown with a pointer at the class keyword in the class definition. The class has too many methods to compile.
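One way to see why the error is reported against the class as a whole: the class file format stores the constant pool size in an unsigned 16-bit field (constant_pool_count, which equals the number of entries plus one), and every distinctly named method needs at least its own UTF-8 name entry in that pool. The Java sketch below computes only this lower bound; the class and method names are illustrative, it is not how javac performs its check, and the real pool also holds other entries (class names, attribute names, and so on), so the practical limit is somewhat lower.

```java
final class ConstantPoolBound
{
   /** constant_pool_count is an unsigned 16-bit value equal to the entry count plus one. */
   static final int MAX_POOL_ENTRIES = 0xFFFF - 1;

   /**
    * Lower-bound check only: returns true when the distinct method names by
    * themselves would need more entries than the constant pool can hold.
    */
   static boolean namesAloneOverflowPool(final int distinctMethodNames)
   {
      return distinctMethodNames > MAX_POOL_ENTRIES;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Maximum constant pool entries: " + MAX_POOL_ENTRIES);
      System.out.println("70000 names overflow? " + namesAloneOverflowPool(70000));
   }
}
```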

When I run javap -c -private dustin.examples.LotsOfMethods (-c to disassemble the code, -private to display the many private methods, and dustin.examples.LotsOfMethods is the name of the generated Java class), I see output like the following (only the beginning and end are shown instead of displaying all 60,000+ methods).


Compiled from "LotsOfMethods.java"
public class dustin.examples.LotsOfMethods extends java.lang.Object{
public dustin.examples.LotsOfMethods();
  Code:
   0: aload_0
   1: invokespecial #1; //Method java/lang/Object."<init>":()V
   4: return

public static void main(java.lang.String[]);
  Code:
   0: return

private void doMethod0();
  Code:
   0: return

private void doMethod1();
  Code:
   0: return

private void doMethod2();
  Code:
   0: return

private void doMethod3();
  Code:
   0: return

private void doMethod4();
  Code:
   0: return

private void doMethod5();
  Code:
   0: return

private void doMethod6();
  Code:
   0: return

private void doMethod7();
  Code:
   0: return

private void doMethod8();
  Code:
   0: return

. . .

. . .

. . .

private void doMethod64992();
  Code:
   0: return

private void doMethod64993();
  Code:
   0: return

private void doMethod64994();
  Code:
   0: return

private void doMethod64995();
  Code:
   0: return

private void doMethod64996();
  Code:
   0: return

private void doMethod64997();
  Code:
   0: return

private void doMethod64998();
  Code:
   0: return

private void doMethod64999();
  Code:
   0: return

private void doMethod65000();
  Code:
   0: return

}

Conclusion

As with the last blog post, this post used Groovy and the Java Compiler API to intentionally reproduce an error that we hope to not see very often.

Additional Reference

Error Writing File: too many constants

Monday, January 18, 2010

Reproducing "code too large" Problem in Java

Code conventions and standard software development wisdom dictate that methods should not be too long because they become difficult to fully comprehend, they lose readability when they get too long, they are difficult to appropriately unit test, and they are difficult to reuse. Because most Java developers strive to write highly modular code with small, highly cohesive methods, the "code too large" error in Java is not seen very often. When this error is seen, it is often in generated code.

In this blog post, I intentionally force this "code too large" error to occur. Why in the world would one intentionally do this? In this case, it is because I always understand things better when I tinker with them rather than just reading about them and because doing so gives me a chance to demonstrate Groovy, the Java Compiler API (Java SE 6), and javap.

It turns out that the magic number for the "code too large" error is 65535 bytes (compiled byte code for a single method, not source code). Hand-writing a method large enough to compile to this much byte code would be tedious (and not worth the effort in my opinion). However, it is typically generated code that leads to this in the wild and so generation of code seems like the best approach to reproducing the problem. When I think of generic Java code generation, I think Groovy.

The Groovy script that soon follows generates a Java class that isn't very exciting. However, the class will have its main function be of an approximate size based on how many conditions I tell the script to create. This allows me to quickly try generating Java classes with different main() method sizes to ascertain when the main() becomes too large.

After the script generates the Java class, it also uses the Java Compiler API to automatically compile the newly generated Java class for me. The resultant .class file is placed in the same directory as the source .java file. The script, creatively named generateJavaClass.groovy, is shown next.

generateJavaClass.groovy


#!/usr/bin/env groovy

import javax.tools.ToolProvider

println "You're running the script ${System.getProperty('script.name')}"
if (args.length < 2)
{
   println "Usage: javaClassGeneration packageName className baseDir #loops"
   System.exit(-1)
}

// No use of "def" makes the variable available to entire script including the
// defined methods ("global" variables)

packageName = args[0]
packagePieces = packageName.tokenize(".")  // Get directory names
def fileName = args[1].endsWith(".java") ? args[1] : args[1] + ".java"
def baseDirectory = args.length > 2 ? args[2] : System.getProperty("user.dir")
numberOfConditionals = args.length > 3 ? Integer.valueOf(args[3]) : 10

NEW_LINE = System.getProperty("line.separator")

// The setting up of the indentations shows off Groovy's easy feature for
// multiplying Strings and Groovy's tie of an overloaded * operator for Strings
// to the 'multiply' method.  In other words, the "multiply" and "*" used here
// are really the same thing.
SINGLE_INDENT = '   '
DOUBLE_INDENT = SINGLE_INDENT.multiply(2)
TRIPLE_INDENT = SINGLE_INDENT * 3

def outputDirectoryName = createDirectories(baseDirectory)
def generatedJavaFile = generateJavaClass(outputDirectoryName, fileName)
compileJavaClass(generatedJavaFile)


/**
 * Generate the Java class and write its source code to the output directory
 * provided and with the file name provided.  The generated class's name is
 * derived from the provided file name.
 *
 * @param outDirName Name of directory to which to write Java source.
 * @param fileName Name of file to be written to output directory (should include
 *    the .java extension).
 * @return Fully qualified file name of source file.
 */
def String generateJavaClass(outDirName, fileName)
{
   def className = fileName.substring(0,fileName.size()-5)
   outputFileName = outDirName.toString() + File.separator + fileName
   outputFile = new File(outputFileName)
   outputFile.write "package ${packageName};${NEW_LINE.multiply(2)}"  
   outputFile << "public class ${className}${NEW_LINE}"  
   outputFile << "{${NEW_LINE}"
   outputFile << "${SINGLE_INDENT}public static void main(final String[] arguments)"
   outputFile << "${NEW_LINE}${SINGLE_INDENT}{${NEW_LINE}"
   outputFile << DOUBLE_INDENT << 'final String someString = "Dustin";' << NEW_LINE
   outputFile << buildMainBody()
   outputFile << "${SINGLE_INDENT}}${NEW_LINE}"
   outputFile << "}"
   return outputFileName
}


/**
 * Compile the provided Java source code file name.
 *
 * @param fileName Name of Java file to be compiled.
 */
def void compileJavaClass(fileName)
{
   // Use the Java SE 6 Compiler API (JSR 199)
   // http://java.sun.com/mailers/techtips/corejava/2007/tt0307.html#1
   compiler = ToolProvider.getSystemJavaCompiler()
   
   // The use of nulls in the call to JavaCompiler.run indicate use of defaults
   // of System.in, System.out, and System.err. 
   int compilationResult = compiler.run(null, null, null, fileName)
   if (compilationResult == 0)
   {
      println "${fileName} compiled successfully"
   }
   else
   {
      println "${fileName} compilation failed"
   }
}


/**
 * Create directories to which generated files will be written.
 *
 * @param baseDir The base directory used in which subdirectories for Java
 *    source packages will be generated.
 */
def String createDirectories(baseDir)
{
   def outDirName = new StringBuilder(baseDir)
   for (pkgDir in packagePieces)
   {
      outDirName << File.separator << pkgDir
   }
   outputDirectory = new File(outDirName.toString())
   if (outputDirectory.exists() && outputDirectory.isDirectory())
   {
      println "Directory ${outDirName} already exists."
   }
   else
   {
      isDirectoryCreated = outputDirectory.mkdirs()  // Use mkdirs in case multiple
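      // Use the local outDirName here; the script-level outputDirectoryName is
      // declared with "def" and is therefore not visible inside this method.
      println "Directory ${outDirName} ${isDirectoryCreated ? 'is' : 'not'} created."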
   }
   return outDirName.toString()
}


/**
 * Generate the body of generated Java class source code's main function.
 */
def String buildMainBody()
{
   def str = new StringBuilder() << NEW_LINE
   str << DOUBLE_INDENT << "if (someString == null || someString.isEmpty())" << NEW_LINE
   str << DOUBLE_INDENT << "{" << NEW_LINE
   str << TRIPLE_INDENT << 'System.out.println("The String is null or empty.");'
   str << NEW_LINE << DOUBLE_INDENT << "}" << NEW_LINE
   for (i in 0..numberOfConditionals)
   {
      str << DOUBLE_INDENT << 'else if (someString.equals("a' << i << '"))' << NEW_LINE
      str << DOUBLE_INDENT << "{" << NEW_LINE
      str << TRIPLE_INDENT << 'System.out.println("You found me!");' << NEW_LINE
      str << DOUBLE_INDENT << "}" << NEW_LINE
   }
   str << DOUBLE_INDENT << "else" << NEW_LINE
   str << DOUBLE_INDENT << "{" << NEW_LINE
   str << TRIPLE_INDENT << 'System.out.println("No matching string found.");'
   str << NEW_LINE << DOUBLE_INDENT << "}" << NEW_LINE
   return str
}

Because this script is intended primarily for generating Java code to learn more about the "code too large" error and to demonstrate a few things, I did not make it nearly as fancy as it could be. For one thing, I did not use Groovy's built-in Apache CLI support for handling command-line arguments as I have demonstrated in previous blog posts on using Groovy to check seventh grade homework.

Even though the script above does not apply Groovy's full potential, it still manages to demonstrate some Groovy niceties. I tried to add comments in the script describing some of these. These include features such as Groovy GDK's String.tokenize method and other useful Groovy String extensions.

When I run this script from the directory C:\java\examples\groovyExamples\javaClassGeneration with the arguments "dustin.examples" (package structure), "BigClass" (name of generated Java class), "." (current directory is the base directory for generated code), and "5" (number of conditionals to be in generated code), the script's output is shown here and in the following screen snapshot:


You're running the script C:\java\examples\groovyExamples\javaClassGeneration\generateJavaClass.groovy
Directory .\dustin\examples already exists.
.\dustin\examples\BigClass.java compiled successfully

This output tells us that the generated Java class with five conditionals compiled successfully. To get a taste of what this generated Java class looks like, we'll look at this newly generated version with only five conditionals.

BigClass.java (generated with 5 conditionals)


package dustin.examples;

public class BigClass
{
   public static void main(final String[] arguments)
   {
      final String someString = "Dustin";

      if (someString == null || someString.isEmpty())
      {
         System.out.println("The String is null or empty.");
      }
      else if (someString.equals("a0"))
      {
         System.out.println("You found me!");
      }
      else if (someString.equals("a1"))
      {
         System.out.println("You found me!");
      }
      else if (someString.equals("a2"))
      {
         System.out.println("You found me!");
      }
      else if (someString.equals("a3"))
      {
         System.out.println("You found me!");
      }
      else if (someString.equals("a4"))
      {
         System.out.println("You found me!");
      }
      else if (someString.equals("a5"))
      {
         System.out.println("You found me!");
      }
      else
      {
         System.out.println("No matching string found.");      
      }
   }
}

The above code includes two default conditionals every time regardless of how many conditionals are selected when the class generation script is run. In between the check for null/empty String and the else clause (reached if no other else if has been satisfied) are the else if statements requested when the class generation script was run. In this case, 5 was passed to the script; because the script's loop uses the inclusive range 0..5, there are six else if conditionals (a0 through a5) between the two default conditionals on either end. As this demonstrates, it will be easy to scale up the number of conditionals until the Java compiler just won't take it anymore.
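For readers who would rather experiment without Groovy, the same generate-then-compile idea can be sketched in plain Java with the Java Compiler API. This is a simplified approximation of the approach and not a port of the script: the class name GenerateAndCompileSketch and its methods are illustrative, the generated class has no package, and the source file is written to a temporary directory rather than to a package directory structure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class GenerateAndCompileSketch
{
   /** Builds source for a class whose main() has else-if branches a0 through a<lastIndex>. */
   static String buildSource(final String className, final int lastIndex)
   {
      final StringBuilder source = new StringBuilder();
      source.append("public class ").append(className).append("\n{\n");
      source.append("   public static void main(final String[] arguments)\n   {\n");
      source.append("      final String someString = \"Dustin\";\n");
      source.append("      if (someString == null || someString.isEmpty())\n");
      source.append("      {\n         System.out.println(\"The String is null or empty.\");\n      }\n");
      for (int i = 0; i <= lastIndex; i++)  // inclusive, like the Groovy range 0..n
      {
         source.append("      else if (someString.equals(\"a").append(i).append("\"))\n");
         source.append("      {\n         System.out.println(\"You found me!\");\n      }\n");
      }
      source.append("      else\n      {\n         System.out.println(\"No matching string found.\");\n      }\n");
      source.append("   }\n}\n");
      return source.toString();
   }

   /** The system compiler is only present when running on a JDK rather than a bare JRE. */
   static boolean compilerAvailable()
   {
      return ToolProvider.getSystemJavaCompiler() != null;
   }

   /** Writes the source to a temporary directory and returns javac's result (0 means success). */
   static int writeAndCompile(final String className, final int lastIndex)
   {
      final JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
      if (compiler == null)
      {
         throw new IllegalStateException("No system Java compiler available; run on a JDK.");
      }
      try
      {
         final Path directory = Files.createTempDirectory("generated");
         final Path sourceFile = directory.resolve(className + ".java");
         Files.write(sourceFile, buildSource(className, lastIndex).getBytes(StandardCharsets.UTF_8));
         // nulls select the defaults of System.in, System.out, and System.err
         return compiler.run(null, null, null, sourceFile.toString());
      }
      catch (IOException ioEx)
      {
         throw new UncheckedIOException(ioEx);
      }
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Result for 5: " + writeAndCompile("SmallClass", 5));
   }
}
```

Raising the second argument of writeAndCompile is then a quick way to probe where the compiler starts reporting "code too large" on a given JDK.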

I now try the Groovy script for generating the Java class again, but this time go all out and select 5000 as the number of desired conditionals. As the output shown below and in the following screen snapshot indicates, Groovy has no trouble generating the text file representing the Java class with this many conditionals in its main() function, but the Java compiler doesn't like it one bit.


You're running the script C:\java\examples\groovyExamples\javaClassGeneration\generateJavaClass.groovy
Directory .\dustin\examples already exists.
.\dustin\examples\BigClass.java:5: code too large
   public static void main(final String[] arguments)
                      ^
1 error
.\dustin\examples\BigClass.java compilation failed

Obviously, the attempt to compile the generated class with a 5000+2 conditional main was too much. Through a little iterative trial-and-error, I was able to determine that 2265 conditionals (beyond the two defaults) was the maximum compilable number for my main() function and 2266 would break it. This is demonstrated in the next screen snapshot.

Knowing our limits better, we can now "look" at the byte code using the javap tool provided with Sun's JDK to analyze the corresponding class file. Because there was a compiler error when we tried to compile the code with 2266 additional conditionals, we must run javap against the BigClass.class file generated with 2265 additional conditionals. The output of running javap with the -c option for this large class is too large (~1 MB) to bludgeon readers with here. However, I include key snippets from its output below.


Compiled from "BigClass.java"
public class dustin.examples.BigClass extends java.lang.Object{
public dustin.examples.BigClass();
  Code:
   0: aload_0
   1: invokespecial #1; //Method java/lang/Object."<init>":()V
   4: return

public static void main(java.lang.String[]);
  Code:
   0: ldc #2; //String Dustin
   2: ifnonnull 10
   5: goto_w 23
   10: ldc #2; //String Dustin
   12: invokevirtual #3; //Method java/lang/String.isEmpty:()Z
   15: ifne 23
   18: goto_w 36
   23: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   26: ldc #5; //String The String is null or empty.
   28: invokevirtual #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   31: goto_w 65512
   36: ldc #2; //String Dustin
   38: ldc #7; //String a0
   40: invokevirtual #8; //Method java/lang/String.equals:(Ljava/lang/Object;)Z
   43: ifne 51
   46: goto_w 64
   51: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   54: ldc #9; //String You found me!
   56: invokevirtual #6; //Method java/io/PrintStream.println:


. . . 

. . .

. . .

   65411: goto_w 65512
   65416: ldc #2; //String Dustin
   65418: ldc_w #2272; //String a2263
   65421: invokevirtual #8; //Method java/lang/String.equals:(Ljava/lang/Object;)Z
   65424: ifne 65432
   65427: goto_w 65445
   65432: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   65435: ldc #9; //String You found me!
   65437: invokevirtual #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   65440: goto_w 65512
   65445: ldc #2; //String Dustin
   65447: ldc_w #2273; //String a2264
   65450: invokevirtual #8; //Method java/lang/String.equals:(Ljava/lang/Object;)Z
   65453: ifne 65461
   65456: goto_w 65474
   65461: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   65464: ldc #9; //String You found me!
   65466: invokevirtual #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   65469: goto_w 65512
   65474: ldc #2; //String Dustin
   65476: ldc_w #2274; //String a2265
   65479: invokevirtual #8; //Method java/lang/String.equals:(Ljava/lang/Object;)Z
   65482: ifne 65490
   65485: goto_w 65503
   65490: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   65493: ldc #9; //String You found me!
   65495: invokevirtual #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   65498: goto_w 65512
   65503: getstatic #4; //Field java/lang/System.out:Ljava/io/PrintStream;
   65506: ldc_w #2275; //String No matching string found.
   65509: invokevirtual #6; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   65512: return
}

From the snippets of javap output shown above, we see that the highest Code offset (65512) for this function pushing the limits of the method size was getting awfully close to the magic 65535 bytes (2¹⁶-1 or Short.MAX_VALUE - Short.MIN_VALUE).
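The offsets in the javap listing can be reproduced with a little arithmetic, which also shows why one more conditional tips the method over the limit. The following Java sketch is my own reading of the listing above and rests on assumptions inferred from it: the first else if starts at offset 36, an else if branch occupies 28 bytes when its string constant is loaded with ldc and 29 bytes when ldc_w is needed (constant pool index above 255), the string "a0" sits at pool index #7 and "a<i>" at #(i + 9) otherwise, and the closing else takes 9 bytes before the final return. The class and method names are illustrative.

```java
final class CodeSizeArithmetic
{
   /** Largest allowed code length for one method: 65535 bytes. */
   static final int MAX_CODE_BYTES = Short.MAX_VALUE - Short.MIN_VALUE;

   /** Pool index of string "a<i>" as inferred from the listing (a0 is #7, a2263 is #2272). */
   static int poolIndexOf(final int i)
   {
      return i == 0 ? 7 : i + 9;
   }

   /** ldc has a one-byte index (28-byte branch); ldc_w has a two-byte index (29-byte branch). */
   static int branchBytes(final int i)
   {
      return poolIndexOf(i) <= 255 ? 28 : 29;
   }

   /** Offset of the final return instruction for else-if branches a0 through a<lastIndex>. */
   static int returnOffset(final int lastIndex)
   {
      int offset = 36;  // the first else-if begins at offset 36 in the listing
      for (int i = 0; i <= lastIndex; i++)
      {
         offset += branchBytes(i);
      }
      return offset + 9;  // closing else: getstatic, ldc_w, invokevirtual at 3 bytes each
   }

   /** Code length is the offset of the one-byte return instruction plus one. */
   static int codeLength(final int lastIndex)
   {
      return returnOffset(lastIndex) + 1;
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Limit: " + MAX_CODE_BYTES);
      System.out.println("Code length for 2265: " + codeLength(2265));
      System.out.println("Code length for 2266: " + codeLength(2266));
   }
}
```

Under these assumptions, the return instruction for 2265 lands at offset 65512, matching the listing, and the extra 29 bytes needed for one more branch would push the code length past 65535.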

Conclusion

Most Java developers don't see the "code too large" problem very often because they write methods and classes that are reasonable in size (or at least more reasonable than the limits allow). However, generated code can much more easily exceed these "limitations." So of what value is intentionally reproducing this error? Well, the next time someone tries to convince you that bigger is better, you can refer that person to this post.

Other Resources

⇒ Java Language Specification, Third Edition

⇒ Class File Format

⇒ Code Too Large for Try Statement?

⇒ Code Too Long

⇒ Is There Any Number of Lines Limit in a Java Class?

Thursday, January 14, 2010

Float/Double to BigDecimal in Groovy

In a previous blog post, I looked at the subtle handling required when using double with BigDecimal. As I discussed in that post, using the BigDecimal(double) constructor rarely does what one would expect when passed a float or double. For a double, the Java developer is typically better off using BigDecimal.valueOf(double) or converting the double to a String and then using the BigDecimal(String) constructor. For a float, the String approach is typically the only desirable approach for getting a float to a BigDecimal in Java. With Groovy's dynamic typing and automatic BigDecimal use, these subtleties can be abstracted away from the Groovy developer in many cases.
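As a quick reminder of the Java behavior being described, here is a minimal sketch of the three conversion routes. The class name and the helper method names are illustrative only.

```java
import java.math.BigDecimal;

final class BigDecimalFromFloatingPoint
{
   /** Exposes the exact binary value of the double, which is rarely what is wanted. */
   static BigDecimal viaConstructor(final double value)
   {
      return new BigDecimal(value);
   }

   /** Goes through Double.toString(double), so 0.1 comes out as 0.1. */
   static BigDecimal viaValueOf(final double value)
   {
      return BigDecimal.valueOf(value);
   }

   /** For a float, converting to a String first avoids widening the float to a double. */
   static BigDecimal viaString(final float value)
   {
      return new BigDecimal(Float.toString(value));
   }

   public static void main(final String[] arguments)
   {
      System.out.println("Constructor with 0.1: " + viaConstructor(0.1));  // long expansion, not 0.1
      System.out.println("valueOf with 0.1:     " + viaValueOf(0.1));      // prints 0.1
      System.out.println("valueOf with 0.1f:    " + viaValueOf(0.1f));     // float widened, not 0.1
      System.out.println("String with 0.1f:     " + viaString(0.1f));      // prints 0.1
   }
}
```

BigDecimal.valueOf accepts only a double, so passing a float widens it first and carries the float's representation error along; this is why the String route is the usual choice for a float.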

The following simple Groovy script is based on the Java class in the previous blog post, but it is now "Groovier" and has a few more things demonstrated.


//import java.math.BigDecimal;

/**
 * Simple example of problems associated with using BigDecimal constructor
 * accepting a double.
 *
 * http://marxsoftware.blogspot.com/
 */
NEW_LINE = System.getProperty("line.separator")

//
// Demonstrate BigDecimal from double
//
double primitiveDouble = 0.1
BigDecimal bdPrimDoubleCtor = new BigDecimal(primitiveDouble)
BigDecimal bdPrimDoubleValOf = BigDecimal.valueOf(primitiveDouble)
Double referenceDouble = Double.valueOf(0.1)
BigDecimal bdRefDoubleCtor = new BigDecimal(referenceDouble)
BigDecimal bdRefDoubleValOf = BigDecimal.valueOf(referenceDouble)

println "Primitive Double: ${primitiveDouble}"
println "Reference Double: ${referenceDouble}"
println "Primitive BigDecimal/Double via Double Ctor: ${bdPrimDoubleCtor}"
println "Reference BigDecimal/Double via Double Ctor: ${bdRefDoubleCtor}"
println "Primitive BigDecimal/Double via ValueOf: ${bdPrimDoubleValOf}"
println "Reference BigDecimal/Double via ValueOf: ${bdRefDoubleValOf}"

println NEW_LINE

//
// Demonstrate BigDecimal from float
//
float primitiveFloat = 0.1f
BigDecimal bdPrimFloatCtor = new BigDecimal(primitiveFloat)
BigDecimal bdPrimFloatValOf = BigDecimal.valueOf(primitiveFloat)
Float referenceFloat = Float.valueOf(0.1f)
BigDecimal bdRefFloatCtor = new BigDecimal(referenceFloat)
BigDecimal bdRefFloatValOf = BigDecimal.valueOf(referenceFloat)

print "Primitive Float: ${primitiveFloat}"
println " (${primitiveFloat.class})"
print "Reference Float: ${referenceFloat}"
println " (${referenceFloat.class})"
print "Primitive BigDecimal/Float via Double Ctor: ${bdPrimFloatCtor}"
println " (${bdPrimFloatCtor.class})"
print "Reference BigDecimal/Float via Double Ctor: ${bdRefFloatCtor}"
println " (${bdRefFloatCtor.class})"
print "Primitive BigDecimal/Float via ValueOf: ${bdPrimFloatValOf}"
println " (${bdPrimFloatValOf.class})"
print "Reference BigDecimal/Float via ValueOf: ${bdRefFloatValOf}"
println " (${bdRefFloatValOf.class})"

println NEW_LINE

//
// More evidence of issues casting from float to double.
//
double primitiveDoubleFromFloat = 0.1f
Double referenceDoubleFromFloat = new Double(0.1f)
double primitiveDoubleFromFloatDoubleValue = new Float(0.1f).doubleValue()

print "Primitive Double from Float: ${primitiveDoubleFromFloat}"
println " (${primitiveDoubleFromFloat.class})"
print "Reference Double from Float: ${referenceDoubleFromFloat}"
println " (${referenceDoubleFromFloat.class})"
print "Primitive Double from FloatDoubleValue: ${primitiveDoubleFromFloatDoubleValue}"
println " (${primitiveDoubleFromFloatDoubleValue.class})"

println NEW_LINE

//
// Using String to maintain precision from float to BigDecimal
//
String floatString = String.valueOf(new Float(0.1f))
BigDecimal bdFromFloatViaString = new BigDecimal(floatString)
print "BigDecimal from Float via String.valueOf(): ${bdFromFloatViaString}"
println " (${bdFromFloatViaString.class})"

println NEW_LINE

//
// Using "duck typing"
//
def decimalWithoutStaticType = 0.1
print "Decimal Without Static Type: ${decimalWithoutStaticType}"
println " (${decimalWithoutStaticType.class})"
def floatWithoutStaticType = 0.1f
print "Float Without Static Type: ${floatWithoutStaticType}"
println " (${floatWithoutStaticType.class})"
def explicitDoubleWithoutStaticType = 0.1d
print "Explicit Double Without Static Type: ${explicitDoubleWithoutStaticType}"
println " (${explicitDoubleWithoutStaticType.class})"

The Groovier script shown above demonstrates that there is no need in Groovy to explicitly import java.math.BigDecimal. The script also prints the Class of several of the variables to show how Groovy treats these types.

The following output is generated when this script is executed.

There are several observations one can make from this output. One, Groovy allows variables to be statically typed. When they are statically typed, they are essentially treated exactly as they are in Java, precision loss and all. Two, Groovy's dynamic typing automatically applies BigDecimal to decimal literals if no specific type is specified. Three, as the last few lines of the output demonstrate, Groovy prints the value 0.1 properly whether it reports the type as Float, Double, or BigDecimal. In other words, the value of 0.1 is treated consistently as 0.1 as long as the static types of float and double were not explicitly specified when declaring the variable. Now that's nice.

Conclusion

Groovy simplifies many of the nuances of floating-point arithmetic and representation. The Groovy Math documentation page calls Groovy's approach to mathematical operations a "least surprising" approach. That's something we can all appreciate.

Additional Resources

The following online resources contain further information about BigDecimal and Groovy.

⇒ Getting Around BigDecimal Pain with Groovy

⇒ The Evil BigDecimal Constructor

⇒ Make Cents with BigDecimal

⇒ The Need for BigDecimal

⇒ Why is the BigDecimal(double) Construction Still Around?
