Tom Petrocelli's take on technology. Tom is the author of the book "Data Protection and Information Lifecycle Management" and a natural technology curmudgeon. This blog represents only my own views and not those of my employer, Enterprise Strategy Group. Frankly, mine are more amusing.

Wednesday, October 25, 2006

First, Do No Harm Mozilla

ARRRGGGHHH! No, that is not the sound of me practicing to be a pirate for Halloween. Instead, it is a strangled cry of anguish brought on by my attempts to run the new Firefox browser. As my gentle readers are aware, I am usually a Mozilla fan. I have tossed away Microsoft Outlook in favor of Thunderbird and use Firefox, not Internet Explorer. The new version of Firefox, version 2.0, is giving me pause, however.

You would think that these folks never heard of backward compatibility. I'm not talking about the extensions that no longer work. I kind of expect that and usually they catch up over the next couple of weeks. I mean things that are supposed to work that no longer do. Worse, features that work inconsistently.

For instance, I had my browser configured to open most links in a new tab with the exception of links within a page. Bookmarks, search results, new windows all come in a tab. It is the single most useful feature of Firefox, in my opinion. The only thing that I wanted to open in the current window is a link from within the current window. This is typical application behavior. I loaded up Firefox 2 and, behold, everything works exactly the opposite. Worse yet, changing the tab settings seems to have absolutely no effect on the behavior of the browser. Tell it to open external links in the current tab, it still opens them in a new one. No matter what I tell it to do, the silly browser does what it wants to do. Frustrating as all get out.

My TagCloud system no longer works. It somehow generates a 403 error from the web server. To put it plainly, some of my AJAX applications no longer function. Perhaps it is some new security feature that is ignoring the security profile that is in my user preferences. Perhaps it's ignoring my user preferences altogether. Maybe it's just acting randomly.

The point of this particular rant is that this is what kills software. Backward compatibility is a key component of upgrading software. If business enterprises are to be expected to implement these products, if these products are to become something other than a hobbyist's toy, then upgrades have to respect current configurations. New features don't matter if the old, useful ones no longer work.

In case anyone is thinking that this is just me, look around the message boards. You will find all kinds of similar problems. This is also not the first time this has happened. Several times in the past when Mozilla came out with upgrades a lot of things broke and for good.

The extension situation also exposes the soft white underbelly of open source. Over time, certain extensions, plug-ins, or whatever you want to call them, become so widely used as to become a feature. When that happens you can no longer rely on a weekend programmer to maintain it and keep it current. It is incumbent on the main development team to make sure these critical features are delivered.

New features are nice, often important. You can't deliver those by breaking other existing features. For me, it means a return to the older version that I have, v1.5. That presents its own problems, since the update mechanism won't update to my previous version, 1.7, and a lot of extensions will no longer function below that level. All this aggravation is making me take a good look at IE 7. Exactly what the Mozilla team does not want to happen.

Sorry Mozilla, this is a lousy roll out. You won't win corporate hearts and minds this way.

Friday, October 20, 2006

Mother Nature 1, Small Business 0

I live near Buffalo, New York. Last week we had a freak October snow storm. Now, before anyone starts with the usual jokes, keep in mind this was not the normal Buffalo snow. Since the storm was widely but poorly reported, it is understandable that most people have an incomplete understanding of the scope of the disaster. Having fallen off the news cycle in a day, it's also unlikely people will ever get the whole story unless they live here or know someone who does. That's sad because it is an excellent object lesson in disaster preparedness.

The Facts.

To give some context to the situation (and hopefully stop the snickering) here are the facts of the storm:

  • The last time it snowed this early here was in 1880. We expect a small sprinkling at the end of the month but not this much this early.

  • According to NOAA, the average snowfall for Buffalo for October is 0.3 inches. This is averaged over 59 years. We got 23 inches in one night.

  • On October 13, 2006 over 400,000 homes were without power. A full week later, on the 20th, there were still over 32,000 people without power.

  • Costs for tree removal and damage for municipalities could top $135M. That doesn't include business losses and other economic factors, insurance losses, and repairs to homes.

  • Schools have been closed all week in many districts and some will still be closed into next week, nearly two weeks into the disaster situation.

Now, it's not like Buffalo and Western New York are new to snow. We are usually prepared for anything. Not this though. What it tells you is that despite your best planning for known risks, there are any number of unknown risks that you can't anticipate.

The Impact.

Most large businesses were affected by the storm because their employees couldn't get to work. Streets were blocked, there were driving bans in most municipalities, and folks had families to worry about. Small businesses, on the other hand, suffered because they had no power. My office was offline for an entire week. After six days I was able to get a generator to run my computers. Power came back after seven days and I was up and running.

As an aside, the latest joke running around Western New York goes like this:

Q. What's the best way to get your power back?
A. Get a generator

Pretty sad, eh?

The Lesson.

What this experience underlines is the need for disaster planning by even the smallest of businesses. I know a great number of lawyers, doctors, dentists, and accountants who could not operate or operate effectively because they had no electricity. Computers did not operate and cell phones ran out of juice quickly. A common problem: cordless phones that don't operate at all without power. While large businesses will have gas-fired generators that can operate nearly indefinitely, most small businesses have no alternate source of energy.

The situation also shows the dark underbelly of computer technology. Without juice, computers are inert hunks of metal and plastic, more useful as door stops than productivity tools. Even worse is having your critical data locked up in a device that you can't turn on. There were quite a few times when I found myself wishing for my old fashioned paper date book.

So what should the small business professional do? There are some actions or products that I'm considering or that saved my bacon. Here are my top tips:

  1. Get a small generator. A 2000-watt generator can be had for $250.00. A computer will need anywhere from 150 watts to 600 watts depending on the type of power supply. Laptops use even less. For the average small business professional, a 2000-watt generator will allow you to work at some level.

  2. Offline backup. My data was well protected - for the most part. However, as it got colder I began to worry that some of my disk-based on-line backup might be damaged. Thankfully it wasn't. Still, it is clear to me that I need to get my data offsite. Burning DVDs or taking tapes offsite is not practical. So, I'm looking into offsite, online backup.

  3. Extra cell phone batteries. The truth is, I mostly use my cell phone when traveling and can usually recharge it regularly. That works great when you have juice. An extra battery might make the difference between being operational and losing business. Some people used car chargers, many for the first time, to power up dead cell phone batteries.

  4. Network-based communication services. One of the best things I did was get a VoIP system. It obviously wasn't working with the power out but the network was. That meant I never lost my voice mail. I could also get into the voice mail to change it. Small, onsite PBXs or answering machines don't work once the lights go out. Also, keep an old-fashioned analog phone handy. These work off of the landline's own power, nominally 48 volts DC supplied by the phone company. In many instances, folks had viable phone connections but no analog phone to hook up to them.

  5. Duct tape. It works for everything. No, really.

  6. Give help, ask for help. Buffalo is the kind of community where everyone helps everyone else out. That's how I got a generator. Others I spoke with were operating out of other people's offices. Keep a list of your friends and colleagues who can help you. Be prepared to help them too. That's right on so many levels.

The last tip is the most important. Technology will never have the power that people do. Over the past week I saw more signs of that than I ever thought possible. I know people who had their neighbors stay with them for over a week because they somehow had heat. I saw generators loaned like library books. On my block I encountered a group of high school students - cleaning out driveways so people could get out to the street.

That's right, roving bands of teenagers committing random acts of kindness. If this is what the world is coming to then I'm all for it. And, as the Boy Scouts say, "Be prepared."

Thursday, October 12, 2006

The Evil Truth of Open Source Code Dependencies

Lurking... just beyond the shadows... lies an evil so hideous that...

Okay, too dramatic. Still, there is a big problem that is cropping up more and more with many open source projects. It seems that in an effort to leverage existing open source code (which is good) we are creating a series of dependencies that make implementation daunting. Just look at the dependency list for the Apache Software Foundation's Ant. Ant is a build tool, something programmers use to help compile and link a big program that has a lot of interdependent components. For all of you UNIX/Linux fans out there, think make on steroids. In any event, one look at the dependency list is enough to make a strong-stomached open source supporter turn green. There are over 30 libraries and other related components, from nearly as many different sources, that are required to use Ant. Use, not compile.

The core problems with this type of system are:

  • Complexity - The obvious problem. It's so difficult to get things installed and configured right that you go nuts.

  • Version Control - You now have to worry about what version of each dependent component you are dealing with. A change in a small library can break the whole system. Woe be to the programmer who uses an updated version of a component in his new application.

  • Bloat - Open source used to have the advantage of being fairly lean. Not so anymore. This is not to say it's any more bloated than proprietary systems like Windows Server 2003. It's just not very different anymore in that respect.

  • Conflicts - So, you have applications that use different versions of some core component. Have fun working that out.
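The conflict problem in particular is easy to show in miniature. The sketch below is only an illustration in Python: the application names, library names, and version numbers are all made up, and real dependency resolution (version ranges, transitive dependencies) is considerably messier. It simply reports every shared library that different applications want in different versions.

```python
from collections import defaultdict

# Hypothetical data: the exact library versions each application expects.
REQUIREMENTS = {
    "build-tool": {"xml-parser": "2.6", "logging-lib": "1.2"},
    "report-app": {"xml-parser": "2.7", "logging-lib": "1.2"},
    "web-app": {"xml-parser": "2.6"},
}


def find_conflicts(requirements):
    """Return {library: {version: [apps]}} for libraries wanted in 2+ versions."""
    wanted = defaultdict(lambda: defaultdict(list))
    for app, deps in requirements.items():
        for library, version in deps.items():
            wanted[library][version].append(app)
    # Keep only the libraries that more than one version is competing for.
    return {
        library: {version: sorted(apps) for version, apps in versions.items()}
        for library, versions in wanted.items()
        if len(versions) > 1
    }


conflicts = find_conflicts(REQUIREMENTS)
for library, versions in conflicts.items():
    print(library, versions)
```

Even with three applications and two libraries there is already one clash to sort out by hand. Scale that up to a dependency list of 30-odd components and the "have fun" part becomes clear.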

This is a good example of why people go with closed frameworks like .NET. Even though you are at the mercy of Microsoft, they at least do the heavy lifting for you. Dealing with all this complexity costs money. It costs in terms of system management and development time, as well as errors that decrease productivity.

Ultimately, these factors need to be worked into the open source cost structure. It's one thing when open source is used by hobbyists. They can get a charge out of monkeying around with code elements like that. For professionals, it's another story. They don't have time for this. What's the solution? One solution has been installers that put the whole stack plus application languages on your system. Another option is to pull these components into a coherent framework like .NET. Then you can install just one item and get the whole package. Complexity and conflicts can be managed by a central project with proper version control for the entire framework. There are commercial frameworks that do this but we need an open source framework that ties together all open source components. Otherwise, open source development will cease to penetrate the large scale enterprise software market.

Wednesday, October 11, 2006

Eating My Own Cooking

Like so many analysts, I pontificate on a number of topics. It's one of the perks of the job. You get to shoot your mouth off without actually having to do the things you write or speak about. Every once in a while though, I get the urge to do the things that I tell other people they should be doing. To eat my own cooking you might say.

Over the past few months I've been doing a fair bit of writing about open source software and new information interfaces such as tag clouds and spouting to friends and colleagues about Web 2.0 and AJAX. All this gabbing on my part inspired me to actually write an application, something I haven't done in a long while. I was intrigued with the idea of a tag cloud program that would help me catalog and categorize (tag) my most important files.

Now, you might ask, "Why bother?" With all the desktop search programs out there you can find almost anything, right? Sort of. Many desktop search products do not support OpenOffice, my office suite of choice, or don't support it well. Search engines also assume that you know something of the content. If I'm not sure what I'm looking for, the search engine is limited in its usefulness. You either get nothing back or too much. Like any search engine, desktop search can only return files based on your keyword input. I might be looking for a marketing piece I wrote but not have appropriate keywords in my head.

A tag cloud, in contrast, classifies information by a category, usually called a tag. Most tagging systems allow for multidimensional tagging wherein one piece of information is classified by multiple tags. With a tag cloud I can classify a marketing brochure as "marketing", "brochure" and "sales literature". With these tags in place, I can find my brochure no matter how I'm thinking about it today.
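A minimal sketch of multidimensional tagging, written in Python for brevity. It is not the Tag Cloud program's code, and the class, method, and file names are invented for illustration. The point is only that one file can sit under several tags and be found through any of them.

```python
from collections import defaultdict


class TagIndex:
    """In-memory index from tags to files (hypothetical, for illustration)."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Classify one file under any number of tags."""
        for name in tags:
            self._files_by_tag[name.strip().lower()].add(path)

    def find(self, name):
        """Return every file carrying the given tag, in sorted order."""
        return sorted(self._files_by_tag.get(name.strip().lower(), set()))

    def counts(self):
        """Return how many files carry each tag (the raw data for a cloud)."""
        return {name: len(files) for name, files in self._files_by_tag.items()}


index = TagIndex()
index.tag("acme-brochure.odt", "marketing", "brochure", "sales literature")
index.tag("q3-plan.odt", "marketing")
print(index.find("brochure"))
print(index.find("Marketing"))
```

Whether I think of the brochure as "marketing" or as "sales literature" on a given day, either tag leads back to the same file.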

Tag clouds are common on Web sites like Flickr and MySpace. It seemed reasonable that an open source system for files would exist. Despite extensive searching, I've not found one yet that runs on Windows XP. I ran across a couple of commercial ones but they were really extensions to search engines. They stick you with the keywords that the search engine gleans from file content but you can't assign your own tags. Some are extensions of file systems but who wants to install an entirely different file system just to tag a bunch of files?

All this is to say that I ended up building one. It's pretty primitive (this was a hobby project after all) but still useful. It also gave me a good sense of the good, the bad, and the ugly of AJAX architectures. That alone was worth it. There's a lot of rah-rah going on about AJAX, most of it well deserved, but there are some drawbacks. Still, it is the only way to go for web applications. With AJAX you can now achieve something close to a standard application interface with a web-based system. You also get a lot of services without coding, making multi-tier architectures easy. This also makes web-based applications more attractive as a replacement for standard enterprise applications, not just Internet services. Sweet!

The downsides - the infrastructure is complex and you need to write code in multiple languages. The latter creates an error prone process. Most web scripting languages have a syntax that is similar in most ways but not all ways. They share the C legacy, as do C++, C#, and Java, but each implements the semantics in its own way. This carries forward to two of the most common languages in the web scripting world, PHP and JavaScript. In this environment, it is easy to make small mistakes in coding that slow down the programming process.

Installing a WAMP stack also turned out to be a bit of a chore. WAMP stands for Windows/Apache/MySQL/PHP (or Perl), and provides an application server environment. This is the same as the LAMP stack but with Windows as the OS instead of Linux. The good part of the WAMP or LAMP stack is that once in place, you don't have to worry about basic Internet services. No need to write a process to listen for a TCP/IP connection or interpret HTTP. The Apache Web Server does it for you. It also provides for portability. Theoretically, one should be able to take the same server code and put it on any other box and have it run. I say theoretically because I discovered there are small differences in component implementations. I started on a LAMP stack and had to make changes to my PHP code for it to run under Windows XP. Still, the changes were quite small.

The big hassle was getting the WAMP stack configured. Configuration is the Achilles heel of open source. It is a pain in the neck! Despite configuration scripts, books, and decent documentation, I had no choice but to hand edit several different configuration files and download updated libraries for several components. That was just to get the basic infrastructure up and running. No application code, just a web server capable of running PHP which, in turn, could access the MySQL database. I can see now why O'Reilly and other technical book publishers can have dozens of titles on how to set up and configure these open source parts. It also makes evident how Microsoft can still make money in this space. Once the environment was properly configured and operational, writing the code was swift and pretty easy. In no time at all I had my Tag Cloud program.

The Tag Cloud program is implemented as a typical three-tier system. There is a SQL database, implemented with MySQL, for persistent storage. The second tier is the application server code written in PHP and hosted on the Apache web server. This tier provides an indirect (read: more secure) interface to the database, does parameter checking, and formats the information heading back to the client.
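To make the first two tiers concrete, here is a rough sketch in Python. It is not the actual program: SQLite stands in for MySQL so the example runs on its own, and the table layout, the function names, and the 64-character tag limit are assumptions made for illustration. What it does show is the middle tier's job as described above: check parameters, then reach the database only through parameterized queries.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical three-table schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE file_tags (
            file_id INTEGER NOT NULL REFERENCES files(id),
            tag_id INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (file_id, tag_id)
        );
    """)
    return db


def add_tag(db, path, tag):
    """Middle tier: validate the input, then store the file/tag pair."""
    tag = tag.strip().lower()
    if not path or not tag or len(tag) > 64:
        raise ValueError("a path and a tag of 1-64 characters are required")
    # Placeholders keep client-supplied text out of the SQL itself.
    db.execute("INSERT OR IGNORE INTO files (path) VALUES (?)", (path,))
    db.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
    db.execute(
        """INSERT OR IGNORE INTO file_tags (file_id, tag_id)
           SELECT f.id, t.id FROM files f, tags t
           WHERE f.path = ? AND t.name = ?""",
        (path, tag),
    )


def files_for_tag(db, tag):
    """Middle tier: return the paths of all files carrying a tag."""
    rows = db.execute(
        """SELECT f.path FROM files f
           JOIN file_tags ft ON ft.file_id = f.id
           JOIN tags t ON t.id = ft.tag_id
           WHERE t.name = ? ORDER BY f.path""",
        (tag.strip().lower(),),
    )
    return [path for (path,) in rows]


db = open_store()
add_tag(db, "acme-brochure.odt", "Marketing")
add_tag(db, "acme-brochure.odt", "brochure")
add_tag(db, "q3-plan.odt", "marketing")
print(files_for_tag(db, "marketing"))
```

The client never sees a table name or a line of SQL. It asks for a tag and gets back a list, which is what makes the interface indirect.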

As an aside, I originally thought to send XML to the client and wrote the server code that way. What I discovered was that it was quite cumbersome. Instead of simply displaying information returned from the server, I had to process XML trees and reformat them for display. This turned out to be quite slow given the amount of information returned and just tough to code right. Instead, I had the server return fragments of XHTML which were integrated into the client XHTML. The effect was the same but coding was much easier. In truth, PHP excels at text formatting and JavaScript (the client coding language in AJAX) does not.

While returning pure XML makes it easier to integrate the server responses into other client applications, such as a Yahoo Widget, it also requires double the text processing. With pure XML output you need to generate the XML on the server and then interpret and format the XML into XHTML on the client. It is possible to do that fairly easily with XSLT and XPath statements but in the interactive AJAX environment, this adds a lot of complexity. I've also discovered that XSLT doesn't always work the same way in different browsers and I was hell-bent on this being cross-browser.
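Here is roughly what "return a fragment of XHTML" means in practice, sketched in Python rather than PHP. The markup, the class names, and the 80%-200% font-size range are all assumptions made for illustration, not the real program's output. The server does the formatting once, and the client only has to drop the finished string into a panel.

```python
from html import escape


def render_tag_cloud(counts, min_pct=80, max_pct=200):
    """Return an XHTML fragment with one link per tag, sized by usage."""
    if not counts:
        return '<p class="tagcloud-empty">No tags yet.</p>'
    low, high = min(counts.values()), max(counts.values())
    spread = (high - low) or 1  # avoid dividing by zero when all counts match
    items = []
    for tag in sorted(counts):
        # Scale linearly: least-used tag gets min_pct, most-used gets max_pct.
        size = min_pct + (counts[tag] - low) * (max_pct - min_pct) // spread
        label = escape(tag, quote=True)  # keep the fragment well-formed
        items.append(
            f'<a href="#" class="tag" title="{counts[tag]} files" '
            f'style="font-size: {size}%">{label}</a>'
        )
    return '<div class="tagcloud">' + " ".join(items) + "</div>"


fragment = render_tag_cloud({"marketing": 5, "brochure": 1, "r&d": 3})
print(fragment)
```

On the client side the matching step is a one-liner that assigns the response text to a panel's contents, with no XML tree to walk.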

The JavaScript client was an exercise in easy programming once the basic AJAX framework was in place. All that was required was two pieces of code. One was Nicholas Zakas' excellent cross-browser AJAX library, zXml. Unfortunately, I discovered too late that it also included cross-browser implementations of XSLT and XPath as well. Oh well. Maybe next time.

The second element was the HTTPRequest object wrapper class. HTTPRequest (formally XMLHttpRequest) is the JavaScript object used to make requests of HTTP servers. It is implemented differently in different browsers and client application frameworks. zXml makes it much easier to have HTTPRequest work correctly in different browsers. Managing multiple connections to the web server, though, was difficult. Since I wanted the AJAX code to be asynchronous, I kept running into concurrency problems. The solution was a wrapper for the HTTPRequest object to assist in managing connections to the web server and encapsulate some of the more redundant code that popped up along the way. Easy enough to do in JavaScript and it made the code less error prone too! After that it was all SMOP (a Simple Matter of Programming). Adding new functions is also easy as pie. I have a dozen ideas for improvements but all the core functions are working well.
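One typical concurrency headache with asynchronous requests is a slow, older response arriving after a newer one and overwriting it on screen. The sketch below shows one way a wrapper can guard against that. It is an illustration under assumptions, not the wrapper described above: it is in Python rather than JavaScript, it makes no real HTTP calls, and the class and method names are invented.

```python
import itertools


class PanelRequests:
    """Tracks the newest outstanding request for each display panel."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._latest = {}   # panel name -> id of the newest request
        self.panels = {}    # panel name -> content currently displayed

    def start(self, panel):
        """Register a new request for a panel and return its id."""
        request_id = next(self._next_id)
        self._latest[panel] = request_id
        return request_id

    def finish(self, panel, request_id, body):
        """Display a response unless a newer request has superseded it."""
        if self._latest.get(panel) != request_id:
            return False  # stale response; drop it
        self.panels[panel] = body
        return True


requests = PanelRequests()
first = requests.start("cloud")
second = requests.start("cloud")
# The responses arrive out of order: the newer one first.
requests.finish("cloud", second, "<div>new cloud</div>")
requests.finish("cloud", first, "<div>old cloud</div>")
print(requests.panels["cloud"])
```

Because each panel remembers only its newest request, the late response to the first request is quietly discarded and the panel keeps the newer content.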

The basic architecture is simple. A web page provides basic structure and acts as a container for the interactive elements. It's pretty simple XHTML. In fact, if you look at the source it would look like nothing. There are three DIV sections with named identifiers. These represent the three interactive panels. Depending on user interaction, the HTTPRequest helper objects are instantiated and make a request of the server. The server runs the requested PHP code, which returns XHTML fragments that are for display (such as the TagCloud itself) or represent errors. The wrapper objects place them in the appropriate display panels. Keep in mind, it is possible to write a completely different web page with small JavaScript coding changes or even just changes to the static XHTML.

The system has all the advantages of web applications with an interactive interface. No page refreshes, no long waits, no interface acrobatics. It's easy to see why folks like Google are embracing this methodology. There's a lot I could do with this if I had more time to devote to programming but Hey! it's only a hobby.

At the very least, I have a very useful information management tool. Finding important files has become much easier. One of the nice aspects of this is that I only bother to tag important files, not everything. It's more efficient to bake bread when you have already separated the wheat from the chaff. It's also good to eat my own cooking and find that it's pretty good.

Friday, October 06, 2006

ILM - The Oliver Stone Version

I've noticed that you don't hear much about ILM anymore. A few articles in InfoStor maybe. What news you do hear is mostly of restructurings or closings at small startup companies in the space. It seems that many ILM plays are moving into other markets or selling narrow interpretations of their technology under a different guise. I've pondered this turn of events and have collected a number of theories as to what has happened to ILM. So grab your popcorn and read "Tom's ILM Conspiracies".

It Was All Hooey To Begin With
One of the most popular theories is that ILM was a load of horse hockey from the start. The story goes this way:

  1. Big companies find core technology doesn't sell as well as it used to

  2. They quickly come up with a gimmick - take a basket of old mainframe ideas and give them a three letter acronym

  3. Foist on stupid and unsuspecting customers

  4. Sit back and laugh as the rubes buy the same stuff with a new facade

This is a variation of the "lipstick on a pig" theory that says ILM was only other stuff repackaged. It was all a scam perpetrated by evil marketing and sales guys to make their quota and get their bonuses.

Unfortunately, there are several holes in this theory. First, a lot of the companies in the space are new ones. While they take the lead from the big ones, they don't get technology and products from the big companies. They actually had to go out and make new technology not just repackage old stuff. The same is true for the big companies. Many had to develop or acquire entirely new technology to create their ILM offerings.

I also don't think that IT managers are as stupid as this theory would make them out to be. Your average manager with a college degree and a budget to manage rarely buys gear just because the salesman said it was "really important". So the marketing ploy argument falls down.

Good Idea. Too Bad It Didn't Work

Also known as a "Day Late and a Dollar Short". This theory says that ILM was a great idea but just too hard to implement or build products around. There is some truth to this one. It is hard to develop process automation products that don't require huge changes in the IT infrastructure. ILM is especially susceptible to this since it deals with rather esoteric questions such as "what is information?", "how do I define context?", and "what is the value of information?". Packaging these concepts into a box or software is not trivial.

It probably didn't help that a lot of ILM products came out of the storage industry. Folks in that neck of the woods didn't have a background in business process automation, the closest relative to ILM.

Put another way - making ILM products is hard and a lot of companies found it too hard to stay in the market.

The Transformer Theory or Morphing Into Something Bigger
ILM. More than meets the eye! It's funny if you say it in a robotic voice and have seen the kids' cartoon and toys from the '80s. But seriously folks, the latest theory is that ILM hasn't disappeared, it's simply morphed into Information Management. This line of thought goes like this:

  • ILM was only a start, a part of the whole

  • It was successful to a large degree

  • Now it is ready to expand into a whole new thing

  • Get more money from customers

This is one of those sort of true and not true ideas. It is true that ILM is part of a larger category of Information Management. So are information auditing and tracking, search, information and digital asset management, document and records management, and even CAS. That doesn't mean that ILM as a product category has gone away. Not everyone needs all aspects of Information Management. Some folks only need to solve the problems that ILM addresses.

So, while ILM is now recognized as a subcategory of Information Management, it has not been consumed by it. This theory does not explain the relative quiet in the ILM world.

A twist on this theory is that ILM has simply gotten boring. No one wants to talk about it because it is pretty much done. That's baloney! Most vendors are only just starting to ship viable ILM products and IT has hardly begun to implement them. Nope. Put the Transformers toys and videos away.

Back and To The Left. Back and To The Left.
ILM was assassinated! Just as it looked like ILM would become a major force in the IT universe, evil analysts (who never have anything good to say about anything) and slow-to-market companies started to bad mouth it. It was a lie, they said. It was thin and there was nothing there. Unlike the "Hooey" theory, assassination assumes that there was something there but it was killed off prematurely by dark forces within our own industry.

That's a bit heavy, don't you think? Sure, there are naysayers and haters of any new technology - heck! I can remember someone telling me that the only Internet application that would ever matter was e-mail - but there are also champions. When you have heavy hitters like EMC and StorageTek (when they existed) promoting something, it's hard to imagine that a small group of negative leaning types could kill off the whole market.

Remember, there were a lot of people who hated iSCSI and said it would never work. Now, it's commonplace. Perhaps, like iSCSI, it will just take a while for ILM to reach critical mass. It is by no means dead. So that can't explain the dearth of noise.

It was always about the process anyway
They have seen the light! Finally, all is clear. ILM was about the process. The process of managing information according to a lifecycle. A lifecycle based on and controlled by the value of the information. No products needed, so nothing to talk about.

Sort of. Okay, many people have heard me say this a hundred times. It's The Process, Stupid. However, designing the process and policies is one thing. Actually doing them is something else entirely. ILM relies on a commitment by a business to examine their needs, create processes, translate these into information policies, and then spend money on products to automate them. Just as it's hard to build a house without tools, it's hard to do ILM without tools. Software primarily but some hardware too.

It's good that we have woken to the fact that ILM is about process and policy. That doesn't explain why there isn't more news about tools that automate them.

The "No Time To Say Hello - Goodbye! I'm Late!" Theory
Also known as the "That Ain't Workin'. That's The Way You Do It" effect. This theory postulates that IT managers simply have more important things to do. Who has time for all that process navel gazing? We have too many other things to do (sing with me "We've got to install microwave ovens.") and simply don't have the time to map out our information processes and design policies for ILM. I'm sure that's true. I'm sure that's true for almost every IT project. It's a matter of priority. However, we do know that lots of people are struggling with information management issues, especially Sarbanes-Oxley requirements. If that weren't true, the business lobby wouldn't be expending so much energy to get SOX eliminated or watered down.

There is a kernel of truth here. If IT doesn't see ILM as a priority, it will die on the vine. I just don't think that ILM is such a low priority that it explains the lack of positive news.

Everything is Fine Thanks. No, Really. It's Okay.
What are you talking about, Tom? Everything is peachy keen! We're at a trough in the product cycle and that makes us a bit quiet. No. Um. Customers are too busy implementing ILM to talk about it. That's the ticket. No wait! They are talking about it but no one reports it anymore. It's too dull. That's because it is so successful.

Nice try.

Not nearly enough IT shops are doing ILM, this is the time in the cycle when products (and press releases) should be pouring out, and the trade press will still write about established technology. Just look at all the articles about D2D backup and iSCSI, two very well established and boring technologies (and I mean that in only the most positive way). I will admit that ILM is no longer the flavor of the month but you would think that you'd hear more about it. We still hear about tape backup. Why not ILM?

My Theory
Or non-theory really. My feeling is that ILM turned out to be too big a job for small companies. They couldn't put out comprehensive products because it was too hard to make and sell these products. Complete ILM takes resources at the customer level that only large companies have. Despite trying to develop all-encompassing solutions, many have fallen back on doing small parts of the puzzle. The companies that are still in the game are either big monster companies like EMC and IBM, implementing total ILM solutions, or small companies that do one small piece of the information management process. Classification software and appliances are still pretty hot. No one calls them ILM anymore because what they do is narrower than that.

So ILM is not dead, just balkanized. It is a small part of a larger market (Information Management) that has been broken down into even smaller components such as information movement and classification. ILM is not dead or transformed. Just hiding.