Archive for the ‘Software Development’ Category

Ganymede is Coming!

Thursday, June 12th, 2008

June 25th will mark the release of Eclipse “Ganymede”, aka Eclipse 3.4. To celebrate, the Eclipse Foundation is organizing all sorts of informational and fun activities. For example, there is a spoof movie poster contest running. Here is my favourite so far:

Day_The_Net_Stood_Still.png

We need to set aside some time to get familiar with the new features, assess the porting costs and evaluate when/if we will migrate Rule Studio for Java to Eclipse 3.4.

Deploying a Babel Server

Thursday, April 24th, 2008

I’ve been enamored of the Eclipse Babel project for a while now, so I’m very pleased to have a guest post on Babel by Benoit Nachawati. Benoit recently joined the JRules Rule Studio team. We are currently experimenting with using an internal Babel Server to streamline interactions between developers, the localization team, QA and the build team. Benoit discusses how we succeeded in installing Babel on an in-house LAMP server, uploading all the translated strings for Rule Studio and connecting Babel to our Subversion code repository.

You like Babel Fish? You’ll Love Babel!

By Benoit Nachawati

Localization has always been a challenge for software companies — creating costly overhead due to the interactions between different entities of the company: developers, configuration management, QA, documentation and translators.

On the other hand, addressing the local market with a localized version of your software will give you a great advantage, especially when your target is not tech users but business users. Those people just want to use your software in their native language. Hence our current interest in the Eclipse Babel project!

So, what is Babel?

Babel is an Eclipse project dedicated to supporting translation of Eclipse projects into different locales. Babel already offers three tools to help translation of Eclipse projects:

  • Babel Message Editor (Eclipse plugin)
    This plugin is developer oriented: it provides an editor for Java resource bundles, used to produce the language files for the program under development.

  • Babel Runtime Translation Editor (Eclipse plugin)
    This plugin is end-user oriented: it provides an editor to translate what is visible on the screen at runtime (context sensitive).

  • Babel Translation Server
    It provides a web-based interface to enter and edit translations.
    It contains all strings to be translated for a specific project and version. You choose your language and enter the translated value for each string.
    It can then generate the Eclipse language packs required to localize your software.

In this article, I’ll focus on the Babel Translation Server.

Why use Babel?

Babel Translation Server can be seen as a collaborative platform. It provides a unique entry point to allow users to make and share translations. Some of the key features which make Babel very attractive are:

  • Simple web-based interface
  • Maintains project versions
  • Maintains a history of translations
  • Makes it easy to start a new localized version of your software (you just need to add the locale)
  • Build framework to create the Eclipse plugin fragments required for localization

Here are a few screenshots of the Babel interface:

Babel Home Page

Babel Translate Page

Babel Map Files Addition Page

If you don’t have a well-defined process for localization in your company, Babel could provide a nice starting point for defining the collaboration required to build localized software. Of course, not everything is perfect: Babel is still in its incubation phase. In particular, Babel could benefit from some additional GUI features to make it easier to use and more accessible. For example:

  • Upload already existing translations for different locales
  • Upload a single properties file - whether localized or not
  • Generate NL packs for a specific project-version-language

Babel Workflow

The Babel workflow can be divided into three steps:

Scan (Inputs)

The goal of this step is to populate the database with the paths of the properties files and their key-value pairs.
Babel uses PDE Build map files to perform this task: they are uploaded into the Babel database and then parsed to locate the properties files.

The map file processor checks out each map file entry to a temporary folder and looks for properties files. It uploads (or updates) in the database all properties file paths as well as the key-value pairs for each properties file. Unfortunately, Babel doesn’t currently handle the upload of already existing translations for a specific locale.
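To make the scan step more concrete, here is a minimal sketch in Java. Babel's own processor is a PHP script, so this is an illustration of the idea rather than Babel's code. The class name ScanSketch, the in-memory STORE map standing in for the Babel database, and the example file path are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class ScanSketch {
    // Hypothetical stand-in for the Babel database: file path -> (key -> value).
    public static final Map<String, Map<String, String>> STORE = new LinkedHashMap<>();

    /** Parses the content of one properties file and uploads (or updates) its key-value pairs. */
    public static Map<String, String> scan(String path, String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Create the entry for this file on first sight, reuse it on later scans.
        Map<String, String> pairs = STORE.computeIfAbsent(path, p -> new TreeMap<>());
        for (String key : props.stringPropertyNames()) {
            pairs.put(key, props.getProperty(key));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hypothetical file path and content, as if found in a checked-out plug-in.
        Map<String, String> pairs = scan("com.example.plugin/plugin.properties",
                "pluginName = Example Plug-in\nproviderName = Example Corp\n");
        System.out.println(pairs);
    }
}
```

Scanning the same path a second time updates the stored values instead of duplicating them, which mirrors the "upload (or update)" behaviour described above.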

Translate

Users just need to log in to the Babel server and access the translation page. From this page, one can provide a translation for each string. The GUI is divided into three parts:

  • The language, project, version and file selection area
  • The properties file content area (where you select the string to translate)
  • The translation area (where you edit / add the translation)

Deploy (Output)

The goal of this step is to generate NL packs per language. Currently, Babel provides a script to generate an Eclipse Update Site: it creates one feature per language which contains all plugin fragments. Only strings that are translated will be included. Afterwards, you’ll just need to publish this Update Site on your web server or into a local directory.

Installing a Babel Server

The installation process is well described on the Babel Project wiki page at
http://wiki.eclipse.org/Babel_/_Server_Tool_Development_Process. I’ll just make three comments about the installation process:

  1. Install PHP version >= 5.2.0, as Babel uses JSON functions
  2. Install MySQL version >= 5.0.32 or version >= 5.1.14 to be able to use the IF EXISTS clause of the DROP TRIGGER statement
  3. Modify the Babel SQL script, namely babel-setup.sql, to prevent an error from occurring for a trigger that does not exist:
    Replace: DROP TRIGGER `ins_version`;
    by: DROP TRIGGER IF EXISTS `ins_version`;

My repository is under SVN, how do I integrate it with Babel?

For now, Babel only works with CVS (SVN support is a planned feature). Luckily, CVS support in the map file processor script is just a matter of calling CVS for a checkout with the right CVS repository location. Here is the corresponding line of code from the process_map_files.php file:

$command = "cvs -d " . $aStuff['cvsroot'] . " co " . $tagstring . $aElements[1];

One would naturally expect to simply replace this line with the equivalent SVN command, for example:

$command = "svn co --username " . $svnusername . " --password " . $svnpassword . "  " . $svnURL;

However, Apache does not change its user properly, so the svn command cannot be run correctly as the apache user. To overcome this issue, we use the sudo command to get more privileges. To do so, we first need to update the /etc/sudoers file by:

  • Adding the following line: apache ALL=NOPASSWD:/usr/bin/svn
  • Commenting out the following line (if present), so that it reads: #Defaults requiretty

Ultimately, the SVN checkout will be done by the following line of code:

$command = "sudo svn co --username " . $svnusername . " --password " . $svnpassword . "  " . $svnURL;

SVN support means not only replacing the cvs command with the svn command, but also providing a correct SVN location (URL) to check out from.
In the process_map_files.php file, this just means providing your own parseLocation($in_string) function, so that map file entries are handled in a way that complies with your map file entry format.
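As a rough illustration of what such a parsing function has to do, here is a sketch in Java (the real function lives in Babel's PHP script, and this is not its code). The entry layout `plugin@<id>=SVN,<repository URL>,<path>` is invented for the example and is not necessarily the layout your map files use; the class name and the checkoutCommand helper are hypothetical too. Building the command as an argument list rather than one concatenated string avoids shell quoting problems with user names, passwords or URLs.

```java
import java.util.Arrays;
import java.util.List;

public class SvnLocationSketch {
    /**
     * Extracts the checkout URL from one map file entry.
     * Assumed (hypothetical) entry layout: plugin@<id>=SVN,<repository URL>,<path>
     */
    public static String parseLocation(String entry) {
        int eq = entry.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Not a map file entry: " + entry);
        }
        String[] parts = entry.substring(eq + 1).split(",");
        if (parts.length != 3 || !parts[0].trim().equals("SVN")) {
            throw new IllegalArgumentException("Not an SVN entry: " + entry);
        }
        String base = parts[1].trim();
        String path = parts[2].trim();
        return base.endsWith("/") ? base + path : base + "/" + path;
    }

    /** Builds the same checkout command as the PHP line above, as separate arguments. */
    public static List<String> checkoutCommand(String user, String password, String url) {
        return Arrays.asList("sudo", "svn", "co", "--username", user, "--password", password, url);
    }

    public static void main(String[] args) {
        // Hypothetical entry, credentials and repository URL.
        String url = parseLocation(
                "plugin@com.example.ui=SVN,http://svn.example.com/repo,trunk/com.example.ui");
        System.out.println(checkoutCommand("builder", "secret", url));
    }
}
```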

Identity in Software Design

Friday, March 28th, 2008

Identity

Over the years I have been designing and building software, I have noticed one recurring set of problems that keeps cropping up, regardless of company, product domain or programming language. Software developers often have a naïve understanding of identity (myself included!), and this leads to all sorts of bugs, hacks and design compromises. You’d think something as fundamental as how to identify a Thing would have been settled by now! To make matters even worse, changing how you identify a Thing after you’ve already amassed a lot of data (Thing Instances) is typically very complicated and expensive.

The philosophy of identity has a long and rich history, so software developers are in good company when it comes to struggling with these issues. What I find particularly interesting is that many of the classic identity thought experiments are very concrete issues for software developers. For example, you may lie awake at night and philosophize as to whether you are identical to your clone in a parallel universe. However, programmers frequently write code to clone and move objects between two systems separated by space and time. For example, every time you synchronize your iPod, a program, and by extension a programmer, applies some quite sophisticated identity management rules.

Who’s Asking?

One of the first things we notice when we start to identify a real-life object is that the attributes we use to identify the object typically depend on the identity of the person asking, or the overall client context. For example, if someone asks me, “Who are you?” I might answer “Daniel”, “Daniel Selman”, “Rule Studio for Java Team Lead”, identify myself based on my relationship to other people, as often used in the Bible, or use biometric data such as DNA or fingerprints. This is a performance optimization, as it is not feasible to list all our identifying characteristics to all clients! It is the client that imposes their identity requirements on us: if we give the client too little information they simply ask for more; if we supply too much, they probably ignore what they don’t need or cannot interpret.

Types of Identity

Gottfried Wilhelm Von Leibniz

Gottfried Leibniz famously stated that “x is the same as y if and only if every predicate true of x is true of y as well.” If you tug gently at this little philosophical thread you quickly become entangled in the fascinating and complex questions related to identity, many of which are still actively debated today.

There are two definitions of identity: numerical identity and qualitative identity.

Objects a and b are numerically identical if a and b are one and the same thing. It is the relation an object has with itself and nothing else, which is a circular definition, as “nothing else” means “no numerically non-identical thing”. For example, I will be numerically identical to myself for as long as I exist.

Objects a and b can be said to be qualitatively identical if a and b are duplicates, that is, if a and b are exactly similar in all respects. In practice, things can be more or less qualitatively identical. Twins may be qualitatively identical even though they are numerically different.
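In Java terms, the distinction can be sketched as the difference between reference comparison (==) and an equals method. This is a minimal illustration, assuming a hypothetical Point class whose qualitative identity is defined by its two coordinates; the class and method names are invented for the example.

```java
import java.util.Objects;

public class IdentityKinds {
    /** Hypothetical value class: two points are duplicates if their coordinates match. */
    public static final class Point {
        public final int x;
        public final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;            // numerically identical, so also equal
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;           // qualitative identity
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    /** Numerical identity within one JVM: both references point at the same object. */
    public static boolean numericallyIdentical(Object a, Object b) {
        return a == b;
    }

    /** Qualitative identity, as defined by the object's own equals method. */
    public static boolean qualitativelyIdentical(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point twin = new Point(1, 2);
        System.out.println(qualitativelyIdentical(a, twin)); // the "twins" are duplicates
        System.out.println(numericallyIdentical(a, twin));   // but they are two objects
    }
}
```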

I-Predicates

I-predicates are used to express qualitative identity relationships, taking into account the richness of a given theory or application context. For example, “having the same income as” will be an I-predicate in a theory in which persons with the same income are indistinguishable, but not in a richer theory. For example, within the “Selman Family Theory” I can safely use “has the same first name as” as an I-predicate to identify people. This I-predicate would be a foolish choice for the “ILOG Employees Theory” however!

Some philosophers contend that there is no absolute identity, and that identity is always relative; this view is controversial and contested, however.

Criteria of Identity

Similar to I-predicates is the concept of criteria of identity. For example, the criterion of identity for directions is parallelism of lines. The criterion of identity for numbers is equinumerosity of concepts, that is, the number of F’s is identical with the number of G’s if and only if there are exactly as many F’s as G’s.

Identity over Time

Heraclitus, Johannes Moreelse

Identity over time is particularly controversial, because time involves change. For example, Heraclitus famously argued that one could not bathe in the same river twice – as the water continuously flowing through the river changes its identity.

Take a simple statement such as “Tabby was fat on Monday.” Endurance theorists state that persisting things endure and change through time, but extend only through space, not through time. That is, Things are different from events or processes. Perdurance theorists reject this and do not distinguish between Things and processes.

For endurance theorists, Tabby being fat on Monday is a relation between Tabby and Monday. Perdurance theorists would state instead that Tabby-on-Monday is intrinsically fat.

Personal Identity

It is very useful to also consider the questions applicable to personal identity when designing software systems. These questions are:

  • Who am I? What are the attributes that make me, me?
  • Personhood: What is required to be a person? What is the definition of person?
  • Persistence: What events can you survive? What brings your existence to an end?
  • Evidence: How do we find out who is who?
  • Population: What determines how many of us there are now?
  • What am I? What am I composed of?
  • How could I have been? Which of my properties do I have essentially, and which only accidentally? Could I have had different parents for example?

For example, take a business rule, copy it, rename it, update some of its properties and delete its history. Is it the same business rule as the original? If I now deploy the business rule from a development server to a cloned staging server how many business rules do I have? How about if I download the business rules from both the development and the staging servers into separate projects within an Eclipse workspace on my local computer? The point is that there can be fairly complex answers to some of these questions, particularly when you have multiple software systems interacting over space and time.

Metaphysical Questions

The metaphysical questions below are also very useful to consider as you design software systems:

  • What does it mean for an object to be the same as itself?
  • If x and y are identical (are the same thing), must they always be identical? Are they necessarily identical?
  • What does it mean for an object to be the same, if it changes over time? I.e. is x at time t the same as x at time t+1?
  • If an object’s parts are entirely replaced over time, in what way is it the same?

Best Practices?

Qualitative identity in Java is expressed using the equals method as well as the compare method. The equals method allows you to test for qualitative identity (which can include numerical identity), whereas the compare method is used to order a list of objects using a comparison predicate.

Determine your I-predicates and in Java perhaps code them as Comparators. Does your domain model require several I-predicates? If yes, you will need something other than a single equals method. In one software system I designed we had a dedicated object comparison service that could compare different types of objects using different criteria, based on the client as well as the objects. For example, you might compare an Integer with a Float (with or without rounding), two Doubles (with precision), or two EJBObject instances. Note that most equals methods also test that the classes for the two instances are identical. The JVM determines that two classes are identical if they have the same fully qualified class name and were loaded using the same ClassLoader.
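As a minimal sketch of coding I-predicates as Comparators, the example below reuses the two "theories" mentioned earlier. The Person class, the choice of first name plus last name for the richer theory, and the same helper are assumptions made for the illustration; a real domain model would pick its own attributes.

```java
import java.util.Comparator;

public class IPredicates {
    /** Hypothetical domain object. */
    public static final class Person {
        public final String firstName;
        public final String lastName;

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Within a single family, "has the same first name as" is enough to identify people.
    public static final Comparator<Person> SELMAN_FAMILY_THEORY =
            Comparator.comparing((Person p) -> p.firstName);

    // A richer theory needs more attributes (first and last name, assumed here).
    public static final Comparator<Person> ILOG_EMPLOYEES_THEORY =
            SELMAN_FAMILY_THEORY.thenComparing(p -> p.lastName);

    /** Two objects are indistinguishable under an I-predicate if it compares them as equal. */
    public static <T> boolean same(Comparator<? super T> iPredicate, T a, T b) {
        return iPredicate.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        Person one = new Person("Daniel", "Selman");
        Person other = new Person("Daniel", "Smith");
        System.out.println(same(SELMAN_FAMILY_THEORY, one, other));  // same first name
        System.out.println(same(ILOG_EMPLOYEES_THEORY, one, other)); // different last name
    }
}
```

Because the I-predicate is passed in by the caller, the same pair of objects can be identical for one client and distinct for another, which a single equals method cannot express.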

Think in terms of namespaces. Many identity schemes rely on namespaces; however, namespaces must be rooted and managed to prevent copying or cloning from corrupting the namespace. Internet domain names are a popular basis for namespaces precisely because they are globally managed and controlled. For example, Java, XML Schema and the Semantic Web all use variations of Internet domain name namespace identifiers.

Decide whether you need numerical identity. How will you determine numerical identity? Object references within a JVM? Generated statistically unique identifiers such as UUIDs? Automatically generated database row IDs? What about object serialization? Object cloning? Database replication?

For the reasons above, numerical identity is very difficult to apply in computer systems. It is often used as an optimization, however, where the scope of the optimization is well understood, such as within a single JVM/ClassLoader or within a single database table. It is usually hidden from end users because it is machine generated and has no inherent business sense. End users typically find opaque machine-generated identifiers difficult to work with, as they cannot understand why two artifacts that appear to be superficially qualitatively identical are numerically different. Even a UUID, which should be universally unique, is problematic because it is trivial to create exact clones of objects in computer systems, rendering the uniqueness property useless.

A common problem scenario is deciding that you are going to put a UUID in a document and identify documents by UUID. The end user then copies the document on the file system and ends up with two artifacts that have the same UUID but different file names. When the documents are loaded into your system one of four things can happen:

  1. The documents are both stored but they are retrieved non-deterministically. Your user interface makes it impossible for the user to understand which document they are editing (very bad!)
  2. The documents are both stored but the first or last document loaded is always returned. Your user interface makes it impossible for the user to understand which document they are editing (bad!)
  3. An error is detected as a duplicate UUID was loaded and the end user must intervene to fix the document they did not realize was broken — because your identifier is opaque (bad!)
  4. The second document silently overwrites the first, typically because they are being stored in a Map using the UUID as a key (very bad!)

The scenario above happened because the application developer’s identifier conflicts with the underlying storage mechanism’s identifier — typically a fully-qualified (case sensitive?) name for a file system. The file system will happily create exact clones of the developer’s supposedly numerically unique resources. Identity criteria mismatch scenarios are a very common source of identity related bugs – particularly across space and time, as in distributed systems.
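The silent-overwrite and duplicate-detection outcomes are easy to reproduce. The sketch below is a simplified illustration, assuming a hypothetical Document class that carries an embedded UUID plus the file name it was loaded from; the class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class DuplicateUuidSketch {
    /** Hypothetical document: an embedded UUID plus the name of the file it came from. */
    public static final class Document {
        public final UUID uuid;
        public final String fileName;

        public Document(UUID uuid, String fileName) {
            this.uuid = uuid;
            this.fileName = fileName;
        }
    }

    /** Outcome 4: storing by UUID lets the later document silently replace the earlier one. */
    public static Map<UUID, Document> loadNaively(Document... docs) {
        Map<UUID, Document> store = new HashMap<>();
        for (Document doc : docs) {
            store.put(doc.uuid, doc);
        }
        return store;
    }

    /** Outcome 3: the duplicate is detected and reported, naming both files involved. */
    public static Map<UUID, Document> loadStrictly(Document... docs) {
        Map<UUID, Document> store = new HashMap<>();
        for (Document doc : docs) {
            Document previous = store.putIfAbsent(doc.uuid, doc);
            if (previous != null) {
                throw new IllegalStateException("Duplicate UUID " + doc.uuid + " in '"
                        + previous.fileName + "' and '" + doc.fileName + "'");
            }
        }
        return store;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        Document original = new Document(id, "report.doc");
        Document copy = new Document(id, "Copy of report.doc"); // file system copy, same UUID
        System.out.println(loadNaively(original, copy).size());
        try {
            loadStrictly(original, copy);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Including both file names in the error message at least gives the end user something they recognize, since the UUID itself means nothing to them.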

Conclusion

I hope this entry has helped you understand how those pesky identity bugs keep cropping up in your products and code. I Am Not a Philosopher (IANAP!) so if I have piqued your interest I encourage you to look at some of the references below for far more detail.

What identity-related bug did you work around or fix this week?

References

Eclipse and VisualStudio in 2010

Saturday, March 22nd, 2008

This talk attempted to zoom out and present some of the impending challenges for IDE design, particularly around the GUI. The current MDI-style IDE interface has remained essentially unchanged since its inception, while using 2+ large monitors has become increasingly common. Many people now develop while on the road (train tracks in my case!) using a laptop, where screen space is much more constrained. Input devices are also changing, with support for multitouch and gestures already in mainstream use. Developers are also building larger systems and require more focused and efficient filtering of information. Multiple CPUs allow the IDE to be more proactive in offering developer assistance, without disrupting the developer’s chain of thought with modal operations. Apparently developers are also spending far more time exploring and reading code, rather than “just” editing source files.

Babel

Saturday, March 22nd, 2008

I had a very enjoyable couple of drinks with Gabe from the Eclipse Foundation. Amongst many other things, Gabe has been implementing the localization server for the Babel project. The server allows any Eclipse user to log in and supply translations for localized strings. These strings are then built and can be downloaded as a language pack. I took the opportunity to pick Gabe’s brain to understand how hard it would be to install a Babel server within ILOG to help us manage the localization for Rule Studio, currently a considerable challenge as we support English, French, German, Spanish, Korean, Japanese and Chinese.

Ganymede Packaging

Saturday, March 22nd, 2008

The talk on the packaging efforts for Eclipse 3.4 (Ganymede) was interesting in that it described the process the Eclipse Foundation uses to create the master Update Site from the 30+ individual project Update Sites that compose the Ganymede release. Buckminster is used to resolve project dependencies while some custom scripts are capable of creating the master Update Site. The source code is (of course) Open Source — so I will have to take a look to see if there is something we can use to improve our internal build processes.

What is new in the Eclipse 3.4 JDT

Saturday, March 22nd, 2008

This short talk showed off the enhancements to JDT coming in Eclipse 3.4: code completion for classes that have not been imported, and automatic addition of casts after “instanceof” tests, were my favourites. The breadcrumb navigation bar also looked useful, allowing you to navigate to classes from within the Java source editor.

Eclipse 4.0

Saturday, March 22nd, 2008

The talk on Eclipse 4.0 (e4) made it clear from the outset that discussions were just starting. Details were sketchy, but one theme that emerged was web-enabling the Eclipse platform. There was talk of being able to implement plug-ins using JavaScript, supporting CSS styling rules for all user interface elements and even supporting server-side deployment of the platform runtime to enable web applications. This ambitious effort will provide a Flash/HTML/AJAX port of SWT, allowing graphical Eclipse applications to run within a web browser.

The presenters stressed however that they were still in the brainstorming phase and aim to produce more concrete plans and demos for EclipseCon 2009. They also tried to dispel any compatibility fears by saying that Eclipse 3.x would be maintained and enhanced for many years to come.

Cloudsmith

Saturday, March 22nd, 2008

Stefan Daume, a fellow University of Edinburgh AI graduate, was kind enough to give me a demo of the Cloudsmith software distribution solution. Cloudsmith is currently in beta and allows you to define custom software distributions, assembled from components published in Maven repositories or from Eclipse plug-ins. The very slick Cloudsmith GUI makes managing distributions easy, while the powerful runtime, based on the Eclipse Buckminster project, performs all the heavy lifting to ensure component dependencies are resolved. Distributions can be easily “materialized” into an Eclipse workspace. I’m definitely going to take a second look at Cloudsmith. Perhaps it can help to manage the 80+ plug-ins that compose a Rule Studio distribution?

Services vs. Extensions

Saturday, March 22nd, 2008

This panel discussion delved into the use cases for Services vs. Extensions. The most memorable analogy was that Extensions are like the relationship between a parent and their children, while Services are like a peer-to-peer relationship between consenting adults. All the panelists agreed that both serve a valuable purpose; however, there is some technical work to be done to ensure that the Extension lifecycle is as rich as the Service lifecycle and that the programming model for Services is as simple as the programming model for Extensions.