Calliope Sounds: December 2011

A comment on "So which IDE is the best and why?"

My comment on "So which IDE is the best and why? Im a beginner and only know of a few I prefer Eclipse : java":

An IDE gives you three very useful tools: code editing, code navigation, and code debugging. Each of the commenters, I am sure, has different needs and a different style of using these tools. You will too.

All the IDEs tend to have very aggressive code editing help. This is great when you are working with new packages, but the help gets tiresome as your familiarity with the packages increases and the IDE continues to make suggestions when you know full well what you want. Code navigation in modern IDEs is a godsend: you can move easily between classes, up and down the inheritance hierarchy, and around usages. Code debugging is less about the source and more about the data and the threads. The better the tools are at showing data structures -- the plural is important! -- and visually connecting threads w/ stack traces, the easier your job becomes.

I have not answered your question because there is no answer. We shape our tools and our tools shape us. It almost does not matter which you pick -- Eclipse, IntelliJ, NetBeans, Emacs w/ JDEE, Vim w/ JDE, .... What matters is that using the IDE becomes second nature to you.

And don't fool yourself that you will fix the IDE's problems. You don't have that much free time.

Code now on GitHub

I have started to move and/or clone my code to GitHub -- like everyone else....

Calliope Sounds has an ISSN

I work for a small publisher intermediary and while working with our data recently I needed a testing ISSN. It got me thinking about the work involved with getting an ISSN for my blog -- an online serial publication. There is one form to complete and that is then mailed to the Library of Congress along with the "front page" of the publication. I did this about two weeks ago and today the Library of Congress, United States, ISSN Center delivered my new ISSN. In all its glory, here it is

ISSN 2165-0861

Using reflection for command line option parsing

Over the last few days I have needed a few command line tools -- mostly for data cleanup. I got tired of manually parsing command line arguments. I wanted something more automated. My first thought was to design a data-structure that expressed the command line arguments and then write an interpreter that would parse the arguments with its guidance. I spent an hour doing this only to realize that preparing the data-structure was almost as much work as manual parsing. Then I remembered that Ant has a simple facility that uses reflection to execute a task expressed in XML against an object. This is what I wanted. So, for example, the command line tool

java ... Sum --multiplier 25 --verbose 1 2 3 4

would be implemented as

public class Sum implements Runnable {

   public void setVerbose() {
      this.verbose = true;
   }

   public void setMultiplier( Long multiplier ) {
      this.multiplier = multiplier;
   }

   public void addPositional( Long number ) {
      this.numbers.add(number);
   }

   public void run() {
      long sum = 0;
      for ( Number number : this.numbers ) {
         if ( verbose ) {
            System.out.printf("%d + %d= %d\n", 
               sum, 
               number.longValue(), 
               sum + number.longValue() );
         }
         sum += number.longValue();
      }
      if ( this.multiplier != null ) {
         if ( verbose ) { ... }
         sum *= multiplier;
      }
      System.out.println(sum);
  }

  ...

}

The magic, of course, is reflection. The reflection-based parser sees "--verbose" and matches it against setVerbose(), which takes no value. It matches "--multiplier" against setMultiplier() and sees that it takes one numeric value. The remaining positional arguments are passed to addPositional() as numbers. Once the arguments are parsed, run() is called.

The ReflectiveCommandLineParser is used to handle the magic within main():

   public static void main( String[] args ) {
      Sum sum = new Sum();
      ReflectiveCommandLineParser.run(sum,args);
   }

See ListFiles implementation for another example.
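
The post does not show how ReflectiveCommandLineParser works inside, so what follows is only a minimal sketch of the idea, not the actual implementation. The names MiniReflectiveParser and SumDemo are hypothetical, and the sketch assumes that every "--name" option maps to a public setName() method with zero or one parameter and that everything else goes to addPositional().

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the post's parser; a sketch, not the real class.
class MiniReflectiveParser {

    /** Applies args to target through setXxx()/addPositional(), then calls run(). */
    static void run(Runnable target, String[] args) {
        try {
            for (int i = 0; i < args.length; i++) {
                String arg = args[i];
                if (arg.startsWith("--") && arg.length() > 2) {
                    // "--multiplier" becomes "setMultiplier"
                    String name = "set" + Character.toUpperCase(arg.charAt(2)) + arg.substring(3);
                    Method setter = find(target, name, 0);
                    if (setter.getParameterCount() == 0) {
                        setter.invoke(target);                 // flag, e.g. --verbose
                    } else if (i + 1 < args.length) {
                        setter.invoke(target, convert(args[++i], setter.getParameterTypes()[0]));
                    } else {
                        throw new IllegalArgumentException("missing value for " + arg);
                    }
                } else {
                    Method adder = find(target, "addPositional", 1);
                    adder.invoke(target, convert(arg, adder.getParameterTypes()[0]));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not apply arguments", e);
        }
        target.run();
    }

    /** Finds a public method by name that takes between minParams and one parameter. */
    private static Method find(Object target, String name, int minParams) {
        for (Method m : target.getClass().getMethods()) {
            int count = m.getParameterCount();
            if (m.getName().equals(name) && count >= minParams && count <= 1) {
                return m;
            }
        }
        throw new IllegalArgumentException("no usable " + name + "() on " + target.getClass().getName());
    }

    /** Converts the argument text to the parameter type; only a few types are handled. */
    private static Object convert(String text, Class<?> type) {
        if (type == String.class) return text;
        if (type == Long.class || type == long.class) return Long.valueOf(text);
        if (type == Integer.class || type == int.class) return Integer.valueOf(text);
        if (type == Double.class || type == double.class) return Double.valueOf(text);
        throw new IllegalArgumentException("unsupported parameter type: " + type);
    }
}

// Hypothetical tool modelled on Sum; the result is kept in a field so it can be inspected.
class SumDemo implements Runnable {
    private final List<Long> numbers = new ArrayList<>();
    private Long multiplier;
    private boolean verbose;
    long total;

    public void setVerbose() { this.verbose = true; }
    public void setMultiplier(Long multiplier) { this.multiplier = multiplier; }
    public void addPositional(Long number) { this.numbers.add(number); }

    @Override
    public void run() {
        long sum = 0;
        for (long n : numbers) sum += n;
        if (multiplier != null) sum *= multiplier;
        total = sum;
        if (verbose) System.out.println("numbers=" + numbers + " multiplier=" + multiplier);
        System.out.println(total);
    }

    public static void main(String[] args) {
        MiniReflectiveParser.run(new SumDemo(), args);
    }
}
```

Running SumDemo with "--multiplier 25 --verbose 1 2 3 4" leaves 250 in total. A fuller parser would also want usage messages and friendlier errors for unknown options, which this sketch leaves out.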

How MySql loads result sets

I was having a devil of a time yesterday with the simple task of using a Java program to copy records as objects from one MySql database to another [1]. I kept running out of memory. While the root problem had to do with creating 2M objects in memory, it did lead to a better understanding of how the MySql JDBC driver loads result sets. In short, by default it loads the whole result set into memory in one shot. So, if you have 1M records at 1K each in the result set you will need at least 1G of memory to hold them. If you then build 1M objects from these records you will need an additional 1M * object size of memory. In other words, a lot of memory. You can have the MySql JDBC driver "stream" the result set, however. That is, read the records row by row from the database. It is less efficient for the driver -- multiple trips back and forth to the server -- but doing so requires far less memory. You can turn on streaming at the statement level or at the datasource level.

Statement Level

To turn on streaming at the statement level you need to use a set of common JDBC settings that, when used together, tell the driver to stream. When you create or prepare a statement you must define how the result will be used and what the fetch size is. For example,

Statement statement = connection.createStatement(
 ResultSet.TYPE_FORWARD_ONLY, 
 ResultSet.CONCUR_READ_ONLY);
statement.setFetchSize(Integer.MIN_VALUE);

and

PreparedStatement statement = connection.prepareStatement(
 "select ... from ... where ...",
 ResultSet.TYPE_FORWARD_ONLY,
 ResultSet.CONCUR_READ_ONLY);
statement.setFetchSize(Integer.MIN_VALUE);

For more information see section Result Set in JDBC API Implementation Notes.

DataSource Level

To turn on streaming at the datasource level you need to add a property to the JDBC URL. Integer.MIN_VALUE is -2^31, that is -2147483648, and so use

jdbc:mysql://localhost/?defaultFetchSize=-2147483648
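
To avoid typing the constant by hand, the URL can be built from Integer.MIN_VALUE. This is only a small sketch: the class and method names are hypothetical, and it assumes the base URL does not already end in "?" or "&".

```java
// Hypothetical helper; a sketch only.
class StreamingUrlSketch {

    /** Appends defaultFetchSize=Integer.MIN_VALUE to a MySQL JDBC URL. */
    static String withStreaming(String jdbcUrl) {
        // Use "&" when the URL already carries properties, "?" otherwise.
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "defaultFetchSize=" + Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        // prints jdbc:mysql://localhost/?defaultFetchSize=-2147483648
        System.out.println(withStreaming("jdbc:mysql://localhost/"));
    }
}
```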

For more information see Driver/Datasource Class Names, URL Syntax and Configuration Properties for Connector/J.

[1] I could not use the MySql tools for dumping and loading the table data because I used an auto_increment column in one of the related tables, the target database was active with data, and so I could not reset the target's auto_increment column to an appropriate value.

Response to "EasyMX - An alternative to JMX"

A response to EasyMX - An alternative to JMX | Javalobby:

An advantage of using JMX is that the JMX client gets to the service via a different network path than the service's users. When the service is running well the path taken does not matter much. It is often the case, however, that the main service path becomes inaccessible under adverse conditions: your HTTP requests are not being serviced before timeouts kick in, for example. If your monitoring shares that path then it, too, is inaccessible. JMX clients use RMI or direct socket pathways to connect to the service and so the JMX client can continue to monitor and manage the service.

As Mr Fisher says (first comment to the posting), JMX is one of the "golden parts" of the Java ecosystem. (JBoss was built on top of it.) Current JMX coding practices are more sophisticated than in the early days. The "MBean" and "MXBean" interface suffixes continue to support quick and dirty monitoring and publishing. And those of us with lots of monitoring and management touchpoints, we too use sophisticated Java annotation processing to turn existing code into touchpoints.
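
For readers who have not used the suffix convention, here is a minimal sketch of the quick and dirty route using only the standard javax.management API. The RequestCounter class, its attribute, and the "example:type=..." object name are all hypothetical and chosen just for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.ObjectName;

class JmxSketch {

    /** The "MXBean" suffix marks this as a management interface; it must be public. */
    public interface RequestCounterMXBean {
        long getCount();   // exposed as the read-only attribute "Count"
        void reset();      // exposed as an operation
    }

    /** Hypothetical touchpoint; an existing service class could implement the interface. */
    public static class RequestCounter implements RequestCounterMXBean {
        private final AtomicLong count = new AtomicLong();
        public void increment() { count.incrementAndGet(); }
        @Override public long getCount() { return count.get(); }
        @Override public void reset() { count.set(0); }
    }

    /** Registers the counter with the platform MBean server under the given name. */
    static ObjectName register(RequestCounter counter, String name) {
        try {
            ObjectName objectName = new ObjectName(name);
            ManagementFactory.getPlatformMBeanServer().registerMBean(counter, objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("could not register " + name, e);
        }
    }

    /** Reads an attribute back through the MBean server, as a JMX client would. */
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        RequestCounter counter = new RequestCounter();
        ObjectName name = register(counter, "example:type=RequestCounter");
        counter.increment();
        counter.increment();
        System.out.println(read(name, "Count"));
    }
}
```

While the process is running, the same attribute and the reset operation should also be visible from a JMX client such as jconsole.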

Revised log levels proposal

@jbarnette: Revised log levels proposal: "fyi," "wtf," and "omg."

In praise of gnuplot's dumb terminal support

I have to say, again, that gnuplot's dumb terminal support is so useful when you are at the command line and need to see a plot of some data. The plot is very rough, but it is usually enough to tell you whether or not to continue exploring the data. The script I am using now is

#!/bin/bash

function show_usage {
 echo \
  usage: $(basename $0) \
  [-x label] \
  [-y label] \
  [-t title] \
  [-s width:height] \
  [-f time-format-strftime] \
  timeseries-input ...
}

function swap {
 echo $2 $1
}

function parse_size {
 W=$(expr $1 : "\\([0-9]*\\):[0-9]*")
 H=$(expr $1 : "[0-9]*:\\([0-9]*\\)")
 echo "$W $H"
}

TIMEFMT="%Y-%m-%dT%H:%M:%S"
XLABEL="Time"
YLABEL="Units"
TITLE="Timeseries"
SIZE=$(swap $(stty size))

while getopts "x:y:t:f:s:h" opt
do
 case $opt in
  x) XLABEL=$OPTARG ;;
  y) YLABEL=$OPTARG ;;
  t) TITLE=$OPTARG ;;
  f) TIMEFMT=$OPTARG ;;
  s) SIZE=$(parse_size $OPTARG) ;;
  h) show_usage ; exit 0 ;;
  *) show_usage ; exit 1 ;;
 esac
done
shift $(expr $OPTIND - 1)

for INPUT in "$@"
do
 if [ "$INPUT" = "-" ]
 then
  INPUT=$(mktemp /tmp/timeplot.XXXXXXXXXX)
  cat > "$INPUT"
 fi

 gnuplot <<EOH
  set terminal dumb $SIZE
  set autoscale
  set xdata time
  set timefmt "$TIMEFMT"
  set xlabel "$XLABEL"
  set ylabel "$YLABEL"
  set title "$TITLE"
  plot "$INPUT" using 1:2 with lines
EOH
done

# END

For example, if the data in /tmp/data is

then this can be quickly plotted using

timeplot -f %Y /tmp/data

and get