Tuesday, April 28, 2009

Jakarta Commons Net API

The Jakarta Commons Net (JCN) Java library is one of those open source projects I've really enjoyed working with, especially its FTP API, which is going to be the focus of this discussion.

I've implemented a component that uses this library in a Web Service environment. It connects to a remote FTP service, iterates through the list of directories, processes the files, and handles the reply codes properly. The list of files may run to several thousand.

Here are the things I found really useful from JCN.

1. It automatically issues a new PORT command to the server, so we don't have to worry about manually setting the port for different platforms (Unix, Linux, Windows, Mac, etc.) when connecting. It also validates data connections back to the client to ensure the request originated from the intended host. We don't want our application dealing with strangers, eh?

2. It allows you to page through the list of FTP files. I think this is an awesome feature! In my previous project the list of files could span several thousand entries, but through the FTPListParseEngine class I was able to page through them and process them in smaller chunks. This is a great performance boost: we only load the FTP file objects we currently need, and creating these objects is expensive. Paging keeps the application's metabolism running and saves us the evil of bottlenecks.
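A minimal paging sketch using FTPListParseEngine might look like the following; the host, credentials, directory, and page size are placeholders, not values from the original project:

```java
import java.io.IOException;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;
import org.apache.commons.net.ftp.FTPListParseEngine;

public class PagedListing {
    public static void main(String[] args) throws IOException {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");   // placeholder host
        ftp.login("user", "password");    // placeholder credentials

        // The engine reads the raw listing but only materializes
        // FTPFile objects as each page is requested.
        FTPListParseEngine engine = ftp.initiateListParsing("/some/dir");
        while (engine.hasNext()) {
            FTPFile[] page = engine.getNext(25); // process 25 files at a time
            for (FTPFile file : page) {
                System.out.println(file.getName());
            }
        }
        ftp.logout();
        ftp.disconnect();
    }
}
```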


3. It provides a very easy-to-use and efficient way of handling FTP reply codes. The codes are messages passed back by the FTP server indicating whether the request was processed successfully or not. There are about 49 codes, which you can check at this link http://www.turboftp.com/turboftp/manual/TURBOFTPFTP_Server_Reply_Codes.htm. After a command is issued, the server returns an int value representing the code. Just capture this code and apply the behavior you want in the application.

This is a great way to isolate specific FTP issues and handle them properly so the application can still function, say, even if the FTP service is down. Or if your previous request came back with a code identifying it as not processed, you can resend it up to a certain number of times until you mark it failed. There's no point sending an FTP command or request if the data connection is not established.
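For instance, a hedged sketch of checking the reply code right after connecting, before issuing anything else; the host and credentials are placeholders:

```java
import java.io.IOException;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

public class SafeConnect {
    public static void main(String[] args) throws IOException {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com"); // placeholder host

        // Inspect the reply code immediately; if the server refused the
        // connection there is no point sending any further commands.
        int reply = ftp.getReplyCode();
        if (!FTPReply.isPositiveCompletion(reply)) {
            ftp.disconnect();
            throw new IOException("FTP server refused connection, code: " + reply);
        }

        ftp.login("user", "password"); // placeholder credentials
        // ... issue commands here ...
        ftp.logout();
        ftp.disconnect();
    }
}
```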


You can also flag each reply as successful or not and attach a matching, simple behavior to each outcome.
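One way to sketch that flag-and-retry behavior; the retry limit, the storeFile upload, and the helper's shape are illustrative, not from the original component:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

public class RetryingUpload {
    private static final int MAX_ATTEMPTS = 3; // illustrative limit

    // Resend the request up to MAX_ATTEMPTS times; a positive completion
    // reply flags success, anything else after the limit flags failure.
    public static boolean upload(FTPClient ftp, String remote, String local)
            throws IOException {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try (InputStream in = new FileInputStream(local)) {
                ftp.storeFile(remote, in);
            }
            if (FTPReply.isPositiveCompletion(ftp.getReplyCode())) {
                return true;  // the server processed the request
            }
        }
        return false; // mark the request as failed
    }
}
```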


I can see no reason not to use JCN. It provides the security and performance gains we need, plus an elegant way of handling FTP reply codes that you can use to shape the behavior of an application consuming an FTP service. The documentation is well written too.




Tuesday, April 21, 2009

AJUG (SOA and BPM Meeting) - My Take

Rick Geneva from Intalio presented an interesting thought tonight on how SOA and BPM work together: SOA is IT driven while BPM is business driven. Rick's title is Process Expert. A lot of his slides pointed to technologies that instantiate an SOA architecture, like WS, ESB, JMS... you name it. But I think we missed his concrete experiences with the customers/clients he worked with to improve their business processes. Though the framework "Process Driven Development" was mentioned, there was no information as to how it is executed from inception to deployment. I agree with him that process is very important. That's why we have these different SDLC methodologies like Scrum, Waterfall, Extreme Programming, TDD, etc. But all of these methodologies are focused on building the software products. How do you tie this up with Process Driven Development (PDD)? Shall we say PDD is to the business team as SDLC is to the development team?

I'm interested to know how PDD works; I wish he had talked more about that part. To me PDD sounds like a Six Sigma methodology, but the difference is that Six Sigma focuses not only on the product but on the processes as well, from business processes/operations down to the smallest detail of assembling a product. Ten years ago I was a Process Engineer, got trained for Six Sigma Greenbelt, and was certified. A project was identified and assigned to me, and I used the Six Sigma methodology to eliminate, or at least minimize, a product defect. The rest of the Process Engineers each had their own. After my Six Sigma project was completed I had saved the company $160,000 annually, the product I was assigned to had better quality, and the ROI was very tangible. I have not yet heard of any process driven development with a comparably tangible ROI. The question is, how do you measure a business process? With Six Sigma you have a 98.5% confidence level, computed statistically. I've seen a lot of software companies using Six Sigma as their methodology for delivering products, but I have not heard any welcome news that it worked for them. I might be wrong, but I've never seen anything published.

Rick made it clear that there's no organization responsible for setting the standards for Process Driven Development and everything is still up in the air, but a lot of companies are already using software products that enable them to choreograph and orchestrate the services that will eventually define their BPM. I guess everybody will just have to wait and see. But there's definitely a need for a standard in business process management.

BPMN was also mentioned, which stands for Business Process Modeling Notation. It's another set of notations that BAs and developers will have to learn. UML is already there, and I believe it's sufficient to translate a descriptive business requirement into a more readable, simplified diagram. This is the part where I don't cast my vote. I think the reason management and the development team cannot meet in the middle is unclear requirements sitting on top of a thick bureaucrat wrapped in his own fragile ego.

For now I consider BPM a buzzword. It's basically a workflow, nothing more. BPEL and the like, on the other hand, simplify the complexity of implementing tons of workflows. But how do you measure the complexity that justifies using such a product? Maybe if there are 100 steps in a workflow? Or if a workflow consumes more than 10 integration points, do we then consider it a complex workflow?

It's hard to quantify workflow metrics and I'm not sure how to do it at this point. We can't just dive into our infrastructure and start doing workflow metric testing to establish a benchmark. You might also have to hire someone with the title of Workflow Expert, if such a thing even exists, to do the job.

In the end, all the business wants to see is whether they get any value from BPM: how much they can $ave, how many processes they can reduce and make efficient, and how they can make their IT infrastructure flexible enough to accommodate ever-changing business needs.

Oracle BRM Portal Java PCM API Design Issues


I respect Oracle. They're awesome. They're very smart, and that's why they bought Sun Microsystems. But no great company is exempt from mistakes and great mishaps. I know you know some. I'm not surprised if you'd pick Microsoft; Vista really hit them hard, and that wasn't their first. But I'm not going to talk about Vista. I want to focus on my experience working with the Oracle BRM Portal Java PCM API, specifically their FList class and Web Service interface.

I have not used every single API, but enough of them to point out what's wrong.

Here are the design issues I’ve found:

1. FList class is a Hashtable

2. FList data structure follows the anti-pattern “Yo-yo Problem”

3. Heavy use of Reflection

4. Vague documentation


FList class is a Hashtable

FList is the point of entry/gateway in the Oracle BRM Java API for initiating and creating transactions. It can contain aggregates of different types of objects that represent the data and its relationships in the BRM database. The result coming back from any BRM request is also an FList, so both the BRM request and response take the FList form.

Now here comes the bad part: FList is a Hashtable! It's thread-safe because it's synchronized, but that greatly compromises performance. FList is a very coarse-grained object that can have a deep tree structure. Put it in a multi-threaded environment and the application will drop to its knees. Their Java PCM API is simply not scalable.
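To see why a single table-wide lock hurts, here is a small self-contained sketch contrasting a Hashtable with a ConcurrentHashMap; the map contents and iteration counts are arbitrary:

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContentionDemo {

    // Hammer a map with two reader threads. On a Hashtable every get()
    // synchronizes on the whole table, so the readers serialize behind
    // one lock; on a ConcurrentHashMap reads take no lock at all and
    // proceed in parallel.
    static int hammer(Map<String, Integer> map, int iterations) {
        map.put("balance", 42);
        Runnable reader = () -> {
            for (int i = 0; i < iterations; i++) {
                map.get("balance");
            }
        };
        Thread t1 = new Thread(reader);
        Thread t2 = new Thread(reader);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return map.get("balance");
    }

    public static void main(String[] args) {
        System.out.println(hammer(new Hashtable<>(), 100_000));         // coarse table lock
        System.out.println(hammer(new ConcurrentHashMap<>(), 100_000)); // lock-free reads
    }
}
```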


FList data structure follows the anti-pattern “Yo-yo Problem”

Think about this. All of the available object types in the Java PCM API are contained in the FList. You have to dig deep into it and know what type of object you are getting. The problem comes when you keep flipping from one object to another, and nothing clearly explains how these objects are related to each other or what subject they convey. It's confusing. The PCM Java API documentation is not enough to understand what's involved in a specific BRM process. You have to reference their Portal documentation as well, which is about 312MB and contains around 10,756 files. If you need to understand something, it's like finding a needle in a haystack.
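As a schematic illustration only, and emphatically NOT the real com.portal.pcm API, the yo-yo navigation style looks something like this: a nested table where the caller must already know the shape of the tree and cast at every level.

```java
import java.util.Hashtable;

public class YoYoDemo {

    // Build a nested table shaped like a hypothetical account holding a
    // balance group. The field names are made up for illustration.
    static Hashtable<String, Object> sampleAccount() {
        Hashtable<String, Object> balanceGroup = new Hashtable<>();
        balanceGroup.put("CURRENCY", "USD");
        Hashtable<String, Object> account = new Hashtable<>();
        account.put("BAL_GRP", balanceGroup);
        return account;
    }

    // The types say nothing about how the objects relate; you flip from
    // one object to another, casting as you go. That is the yo-yo.
    static String currencyOf(Hashtable<String, Object> account) {
        Hashtable<?, ?> group = (Hashtable<?, ?>) account.get("BAL_GRP");
        return (String) group.get("CURRENCY");
    }

    public static void main(String[] args) {
        System.out.println(currencyOf(sampleAccount())); // prints USD
    }
}
```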


Heavy use of Reflection

I had never seen an application use reflection this heavily until I came across FList. The image above shows how many times reflection was used just to grab one sales order through Oracle's BRM Web Service interface; I used JProfiler to take a snapshot of the call trace. The lookup of the class being invoked incurs the most overhead compared to the invocation itself (see http://www.jguru.com/faq/view.jsp?EID=246569). Oracle BRM WS definitely has some performance issues in its Java engine.


The Field class from the package com.portal.pcm made 12,855 lookups and, of course, another 12,855 invocations, taking 20 seconds to complete just these two tasks. It's ridiculous. What in the world is it doing?
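A small self-contained sketch of why repeated lookups hurt; the method name and loop counts are arbitrary, but the lookup-then-invoke pattern mirrors what the profile showed, and caching the Method object removes the lookup from the hot path:

```java
import java.lang.reflect.Method;

public class ReflectionCost {
    public static String greet() { return "hi"; }

    // Slow path: perform the getMethod() lookup before every single
    // invocation, which is the pattern the call trace suggested.
    static String lookupEveryTime() {
        try {
            Method m = ReflectionCost.class.getMethod("greet");
            return (String) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        int n = 100_000;

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            lookupEveryTime();               // lookup + invoke each time
        }
        long repeated = System.nanoTime() - t0;

        // Cheaper path: look the Method up once and reuse it.
        Method cached = ReflectionCost.class.getMethod("greet");
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            cached.invoke(null);             // invoke only
        }
        long cachedTime = System.nanoTime() - t1;

        System.out.println("lookup every call: " + repeated + " ns");
        System.out.println("cached Method:     " + cachedTime + " ns");
    }
}
```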


Vague documentation

Oracle BRM is a beast, and so is its documentation, as mentioned earlier. It's so huge that you can easily get lost without the guidance of an Oracle consultant. It's not trivial. The goal of every complex system should be to provide sufficient, clear documentation. What really gets on my nerves is when we start customizing fields in BRM and the document doesn't tell you what data you need to pass to the FList to properly fetch what you want from BRM. Even outside of the customization discussion, you need several pieces of information scattered from one page to another, combined, just to formulate your desired FList structure.


This is just one of those tools where you definitely have to attend training. I'd say hire one good Oracle consultant and let your trained employees work with him to shape the behavior of their BRM tool. My current company is totally dependent on Oracle consultants. We have six of them and they get paid a ton. I wish the company invested in its own employees instead, training them and having them guided by just one consultant.

Thursday, April 16, 2009

Sun should treat everything in Java as Objects!

I have posted a question in LinkedIn as shown below.

Do you think Sun will remove primitive data types(int, long, float, etc.) in their Java Platform?

I got 10 answers from different people, and they were all against the idea of treating everything in Java as an object. But I did clarify why I like the idea of Sun updating the Java platform, or creating a new breed of it, that treats everything as an object. Here's the reasoning behind the question.

I have seen lots of Java code that keeps switching between primitive types and wrapper classes. The Reflection API does not return a class's primitive fields as objects, so you have to use wrapper classes to convert to the right object type and then switch back to a primitive. Autoboxing just does the same thing under the covers. All of this is basically CPU overhead, not heap overhead. RAM is not that expensive, and it keeps getting cheaper while growing in capacity. I like the idea of everything in Java being an object because it would minimize the switching between primitive types and make the code a lot more readable. I do agree with some of their points, but I'm thinking it may happen in the not-so-distant future. Heap size is easier to control than CPU time. There's a big difference between memory usage and CPU time in terms of overhead; the latter gives you a better sense of tuning performance. You would be surprised that scalability is actually affected more by CPU time than by heap size. If you have an out-of-memory issue, it's almost always runaway processes that the CPU keeps running until you run out of memory: something in the code drove the CPU nuts.
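The switching cost is easy to demonstrate. This sketch sums the same numbers with a boxed Long and a primitive long; the loop bound is arbitrary, and the difference is pure CPU work from the box/unbox churn, not heap pressure:

```java
public class BoxingCost {

    // Summing through a boxed Long unboxes, adds, and re-boxes on
    // every iteration: sum = Long.valueOf(sum.longValue() + i).
    static long boxedSum(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // The primitive version does the same arithmetic with no switching.
    static long primitiveSum(int n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(boxedSum(1_000));     // 499500
        System.out.println(primitiveSum(1_000)); // 499500
    }
}
```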

I hope that explains my curiosity behind the question.

Monday, April 13, 2009

Singleton without "synchronized" keyword in Multi-Threaded App

When you ask a designer or developer to write a multi-threaded application, the magic word 'synchronized' immediately pops up in their mind! That was my mentality as well, until I came across this wonderful website explaining the different design patterns. It hit me hard. You can check it at this link http://www.oodesign.com/singleton-pattern.html.

A typical Singleton object looks like this:

//This is bad. The whole method is synchronized. If your method has
//thousands of lines of code then you are creating a bottleneck here...
public class MySingletonObject{
    private static MySingletonObject instance;

    private MySingletonObject(){}

    public static synchronized MySingletonObject getInstance(){
        if(instance==null){
            instance = new MySingletonObject();
        }
        return instance;
    }
}

or

//This is better. Thanks to the double null check, only the first few
//callers ever enter the synchronized block. But there's always the
//null check overhead, you still have the synchronized keyword, and the
//instance field must be volatile for this to be safe. It's still an
//overhead.
public class MySingletonObject{
    private static volatile MySingletonObject instance;

    private MySingletonObject(){}

    public static MySingletonObject getInstance(){
        if(instance==null){
            synchronized(MySingletonObject.class){
                if(instance==null){
                    instance = new MySingletonObject();
                }
            }
        }
        return instance;
    }
}

//This is what you want. What? Why?... Never underestimate
//the power of the static modifier. The first time this class is used,
//say via MySingletonObject.getInstance(), the classloader loads the
//class and creates the MySingletonObject instance.
//This member variable has exactly one instance throughout the
//application since it lives at the class level. No more bottlenecks,
//no more waiting threads: they immediately get an instance. It looks
//very simple too...

public class MySingletonObject{
    private static final MySingletonObject instance = new MySingletonObject();

    private MySingletonObject(){}

    public static MySingletonObject getInstance(){
        return instance;
    }
}
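A related standard idiom worth mentioning is the initialization-on-demand holder. It keeps the laziness of the first two versions and the lock-free reads of the third, because the JVM guarantees the nested class is initialized exactly once, on first use:

```java
public class LazySingleton {
    private LazySingleton() {}

    // The holder class is not initialized until getInstance() first
    // touches it; class initialization is thread-safe by the JLS, so
    // no synchronized keyword is needed anywhere.
    private static class Holder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    public static LazySingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```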

Using Rally Java Web Service API

There are a number of developer tools available for Rally, ranging from integrating Rally with other applications to plainly extracting data from it. This link https://rally1-wiki.rallydev.com/display/Word/Developer+Tools provides the necessary Rally developer documentation.

Web Service API

Since Rally supports several implementations of their WS API, this will mainly focus on SOAP in Java. You can find the other implementations at this link https://rally1.rallydev.com/slm/doc/webservice/. The only thing I don't like about this API is that you always have to pass an object reference over the wire to get the object's values. The WS calls are so fine-grained that the number of objects queried is directly proportional to the number of round-trip calls. Sample usage of the SOAP-in-Java implementation is shown below in sequence.

Assume our target of interest is to extract a Story from Rally. A Story in Rally actually maps to a SOAP object called HierarchicalRequirement. Always remember to use the read() method of the RallyService object to grab the physical object from Rally.

  1. Grab the WSDL from the current version of your Rally application, which might have the form https://rallyx.rallydev.com/slm/webservice/x.xx/meta/34343483/rally.wsdl.
  2. Generate the Java code from the given WSDL file. Among the generated packages are com.rallydev.webservice.domain and com.rallydev.webservice.service.
    • com.rallydev.webservice.domain contains all SOAP objects that represent the data in Rally.
    • com.rallydev.webservice.service contains the web service interface.
  3.  Acquire connection from the web service endpoint and grab available Workspaces.
    URL url = new URL("https://rally1.rallydev.com/slm/webservice/1.10/RallyService");
    RallyService service = (new RallyServiceServiceLocator()).getRallyService(url);

    Stub stub = (Stub)service;
    stub.setUsername(rally_username);
    stub.setPassword(rally_password);
    stub.setMaintainSession(true);

    Subscription subscription = (Subscription)service.getCurrentSubscription();
    Workspace[] workspaces = subscription.getWorkspaces();
    if(workspaces==null || workspaces.length==0){
        errorBuf.append("The login credentials don't have any subscription or there are " +
                "no Workspaces configured in Rally.");
        writeToFile(serviceBean, errorBuf.toString());
        return null;
    }
  4. If the target workspace is "IT: the next generation", loop through the workspaces until you find the one matching that name.
    Workspace workspace = null;
    for(int i=0; i<workspaces.length; i++){
        WSObject wsObject = (WSObject)service.read(workspaces[i]);
        workspace = (Workspace)wsObject;
        String workspaceName = workspace.getName();
        if(workspaceName.equalsIgnoreCase("IT: the next generation")){
            break;
        }
    }
  5. Submit the query and get the results (DomainObject[]). The serviceBean.getQuery() value is a name/value pair expression which might take the form Release.Name = "Test Release For TWiki" AND ScheduleState = "Completed". Process each DomainObject.
    QueryResult queryResult = service.query(workspace, "HierarchicalRequirement",
            serviceBean.getQuery(), "", false, 1, 100);
    if(queryResult.getErrors().length>0){
        for(int i=0; i<queryResult.getErrors().length; i++){
            errorBuf.append("ERROR: ");
            errorBuf.append(queryResult.getErrors()[i]);
            errorBuf.append("\n");
        }
        writeToFile(serviceBean, errorBuf.toString());
        return null;
    }

    DomainObject[] domainObjects = queryResult.getResults();
    if(domainObjects!=null && domainObjects.length>0){
        List releaseNotesBeanList = new ArrayList();
        TwikiBean twikiBean = new TwikiBean();
        Map packageStoryMap = new HashMap();

        for(int i=0; i<domainObjects.length; i++){
            HierarchicalRequirement story = (HierarchicalRequirement)service.read(domainObjects[i]);
            Release release = (Release)service.read(story.getRelease());
            DateFormat dateFormat = DateFormat.getInstance();
            String releaseDate = dateFormat.format(release.getReleaseDate().getTime());

            if(story.getAttachments()==null || story.getAttachments().length==0){
                errorBuf.append(NO_ATTACHMENT).append("\n");
            }
            if(story.get_package()==null || story.get_package().equals("")){
                errorBuf.append(NO_PACKAGE_NAME).append("\n");
            }
            if(release==null || release.getName()==null
                    || release.getName().equals("")){
                errorBuf.append(NO_RELEASE_NAME).append("\n");
            }
            if(errorBuf.length()>0){
                errorBuf.insert(0, "ERROR: User Story ID: " + story.getFormattedID() + "\n");
                writeToFile(serviceBean, errorBuf.toString());
                continue;
            }

            ReleaseNotesBean releaseNotesBean = new ReleaseNotesBean();
            releaseNotesBean.setPackageName(story.get_package());
            releaseNotesBean.setUserStoryId(story.getFormattedID());
            releaseNotesBean.setUserStoryName(story.getName());
            releaseNotesBean.setReleaseDate(releaseDate);
            releaseNotesBean.setReleaseName(release.getName());

            Attachment[] attachments = story.getAttachments();
            Attachment attachment = attachments[0];
            attachment = (Attachment)service.read(attachment);
            AttachmentContent attachmentContent = (AttachmentContent)service.read(attachment.getContent());
            byte[] content = attachmentContent.getContent();

            String twikiTopic = new String(content);
            releaseNotesBean.setTwikiTopic(twikiTopic);

            releaseNotesBeanList.add(releaseNotesBean);
            prepareTopics(releaseNotesBean, packageStoryMap);
        }

        twikiBean.setPackageStoryMap(packageStoryMap);
        twikiBean.setReleaseNotesBeanList(releaseNotesBeanList);
        return twikiBean;
    }

Friday, April 10, 2009

Running JBoss On MyEclipse On Debug Mode

MyEclipse

MyEclipse is one of the standard development environments for Java applications, with a wide range of server/client plugins. This section focuses on the JBoss server plugin. As of this writing there were no references on how to glue together the JBoss server and MyEclipse in debug mode. Follow the instructions below to configure MyEclipse so your application runs in debug mode successfully.

  1. From the MyEclipse menu click on Window > Preferences.
  2. Go to MyEclipse Enterprise > Servers > JBoss.
  3. There are several versions of the JBoss server available. Pick the version that matches what is installed on your machine. Currently it's version 4.0.5.
  4. Enable the server.
  5. Provide the JBoss home directory. It's usually at /usr/local/jboss.
  6. The server name must be 'default' to match the JBoss deploy directory.
  7. The optional shutdown arguments should be '--shutdown'.
  8. Click on JDK.
  9. Provide the JDK installation and its name.
  10. Under the optional JVM arguments, populate it with the following information:

-Xms128m -Xmx1024m -Djboss.server.log.dir=/var/log/jboss -Djboss.server.base.dir=/usr/local/htapps/server -Djboss.server.base.url=file:///usr/local/htapps/server

  11. Click on Launch.
  12. Click the Debug mode.
  13. Click on Paths.
  14. Append the bin directory of the Java installation to the classpath.
  15. Do not run JBoss from the command line. Run it by clicking on the server icon in MyEclipse and choosing JBoss.

Using JProfiler for JBoss running on Solaris SPARC

JProfiler

JProfiler is a profiling tool written in pure Java that can profile applications running in any JVM implementation. There are two components: the agent, which gets deployed to the machine where the application will be profiled, and a GUI client that displays statistics on memory, CPU, and threads. The data presented in the client is real time and can be used to analyze performance bottlenecks, memory leaks, and threading issues.

Setup and Configuration for Solaris SPARC

Requirements for the Remote Agent
  1. Ensure the JProfiler installation directory has the right permissions and is not affected by any regularly scheduled cleanup process; otherwise you will have to redo the installation the following morning before you can continue testing.
  2. Identify whether the machine supports 32-bit and/or 64-bit computing, and choose one supported by JProfiler on both the server machine (where the agent runs) and the client machine (where the JProfiler GUI runs). Both client and agent must use the same computing platform. Run the command "sudo isainfo -v" to determine what the machine supports.
  3. Two files need to exist, in my example in a user's home directory: the jboss and run-jboss.profiler.sh scripts (you can name them however you want).

This jboss script is used to kickstart JBoss. It should look like the one below.

LD_LIBRARY_PATH=/opt/jprofiler/jprofiler5/bin/solaris-sparc
export LD_LIBRARY_PATH

case "$1" in
start)
    [ -x /home/staff/rgarcia/run-jboss.profiler.sh -a -d /opt/htapps ] || exit 0
    echo Starting JBoss in background...
    /home/staff/rgarcia/run-jboss.profiler.sh -d /opt/htapps &
    ;;
stop)
    /usr/local/jboss/bin/shutdown.sh -S
    ;;
esac

The run-jboss.profiler.sh script calls the JBoss run.sh script (/usr/local/jboss/bin/run.sh) and initializes the VM parameters itself instead of leaving that to run.sh. The VM arguments specific to JProfiler should look similar to the ones below.

# JVM overrides
: ${JAVA_HOME:=/usr/local/java}
: ${JAVA:=$JAVA_HOME/bin/java}
: ${JAVA_HEAP_MIN_MB:=128}
: ${JAVA_HEAP_MAX_MB:=512}

JAVA_OPTS="$JAVA_OPTS -agentlib:jprofilerti=port=2505 \
    -Xbootclasspath/a:/opt/jprofiler/jprofiler5/bin/agent.jar \
    -Xms${JAVA_HEAP_MIN_MB}m -Xmx${JAVA_HEAP_MAX_MB}m \
    -Djboss.server.base.dir=$FINAL_DEPLOY_CFG_ROOT \
    -Djboss.server.base.url=file://$FINAL_DEPLOY_CFG_ROOT"

export JAVA_HOME JAVA JAVA_OPTS

Setting up the JProfiler GUI Client
  1. Install the JProfiler client (Windows).
  2. Create a session. Enter the URL of the machine and the port number. The port number must match the one configured in the run-jboss.profiler.sh script.

Running JProfiler
  1. Run the remote application using the jboss script from the user’s home directory.
  2. The console should show that it's waiting for the JProfiler GUI client connection.
  3. Open the session created in the JProfiler GUI client.
  4. The application continues to load on the remote machine.
  5. Choose what to profile from the GUI – Memory, CPU, Threads.
  6. You can record snapshots of the real-time statistics and export them as an image or HTML, or save the current session as a .jps file, which preserves all the raw data gathered.

Please note that once profiling of the target remote application is complete, you should shut down its agent and restart the application using the normal process.