Wednesday, July 29, 2009

Micro Focus Acquired Borland Software...

While I've been busy preparing for my wedding day this Sunday, one recent acquisition captured my attention. I just received the information that Borland Software now acquired by Micro Focus. If you are developers groomed in the era of 90s, you should know or at least have heard  about this Borland Software. The company has been a major player in the Integrated Development Environment (IDE) software for popular languages at those times, such as TurboC, TurboC++, TurboPascal, TurboProlog, Borland Pascal, Delphi, and JBuilder series. For some time it also acquired the once famous dBase IV's Ashton Tate, and have its own Quattro database. The IDE business was later totally spinned off to a wholly owned subsidiary named CodeGear.

Borland also produced good quality JEE application server series: Borland Enterprise Server, which was used with some of former employer's clients.

Last 3-5 years, Borland Software had shifted its focus into Application Lifecycle Management (ALM), by acquiring companies such as Segue Software, Inc. I was sent by my employer that time for a training on their productsand sales knowledge of these ALM products (Borland SilkTest - automated testing tool, Borland SilkPerformer - load testing tool, Caliber) back to year 2006. These products compete with Mercury (now subsidiary of HP) Load Runner, IBM Rational Robot, etc.

Just recently this year we heard that Borland Software sold off CodeGear to Embarcadero Technologies. So, right now, Borland was left only with its ALM products and application servers/ORB brokers product, now sold to Micro Focus.

Wednesday, June 17, 2009

Syntax Highlighting in Wordpress.com blog

I have to admit that I am a late adopter. While I see my peers in JUG community already using syntax highlighter in their blog post, I was just discovered how to do it.

In order to do it, you need to use the

[sourcecode lang="<language>"] tag:
[sourcecode language="xml"]


burger


[/sourcecode]

Above code is using lang="xml".

For Java code snippets you could highlight it pretty easily using the lang="java" with the tag.
[sourcecode language="java"]
public class HelloWorld {
public static void main(String[] args) {
System.out.println("Hello, world!");
}
}
[/sourcecode]

I am just really excited on how it works. I think Alex Gorbatchev has done a great job on the syntax highlighter!

Tuesday, June 16, 2009

I have moved most of my blogging activity to this site:

Please go there for vibrant activities.

Monday, February 23, 2009

Batch Process in Clustered Environment

This week I have to answer some questions relating to the running of batch process in clustered environment. Our prospective client, a big corporation with the sophisticated infrastructure wants to make sure that the application to be deployed could run well in their clustered environment. Having a nice clustered server for high availability with fail over fault-tolerance is really nice. But a single point of failure in the batch process would surely spoil the fun.

It seems like they have encountered problems in the past with batch processing that could clutter the production application. That's why they need to make sure every applications to be installed are robust enough to some disturbance and provide the same level of resilience with their software components running on the JEE application servers counterpart.

When talking about clustered environment, usually we aim for high availability and fail over. The data that we are processing should be consistent and the application should not get crippled when one of those clustered machine goes down, both intentionally or unintentionally.

When talking about batch processing, the point that we are trying to achieve are:

  • restartability

  • idempotence or rerunabilty


As it is common for the batch process to be in the class of long running processes, usually it will be very expensive for them if they have to start the whole thing from the beginning again.

For example, if you have a batch process that calculate the billing for public utility company, the billing generation itself could easily take 10 hours. If after 9 hours the machine crashes, powercuts, et cetere, it would then be too expensive to restart the whole billing generation again from scratch.

In some way this kind of long running process should maintain the state at certain check points, and pick up at that certain check point and then continue.

I remember the time when we have to generate 400MB dummy data to test our system. The generation itself took more than 3 days to run (running simple SQL INSERT statements). If you put all the rows in one transaction, both commit and roll back process will take a very long time. The other way also pose the same problem. If you use auto commit and commit the rows every INSERT operation, the performance will also be very very poor.

We decided then to commit the INSERTSs every certain number of rows (we chose 10,000 rows in PostgreSQL), we sacrifice atomicity for performance. We have to maintain the state when were the last time we generate and not repeating ourself.

Same thing happens when we are running a long running batch process in a clustered deployment. We expect the component to be able to pickup where it has left in case a failure happened. We don't want the component to assume it has finished, nor we want it to restart from the beginning.