Business applications of unstructured text

Interesting article in Communications of the ACM.

A widely touted IT factoid states that 80% of the information produced by and contained in most organizations is stored in the form of unstructured data. Most of it is text (such as memoranda, internal documents, email, organizational Web pages, and comments from customers and from internal service personnel), and most of the applications that reflect the value of unstructured data are able to process it. Although unstructured data takes other forms, including images and audio, here I focus on the applications, technologies, and architectures for unstructured text acquisition and analysis (UTAA).

New OpenOffice.org target.

Many of you probably know the “WONT FIX” target in the OpenOffice.org issue tracker.

What about introducing a new target: “HELPS MICROSOFT”?

But why do we need this? These days many people --- especially from the file format camps --- are extremely sensitive about anything related to compatibility 'cause they believe it helps Microsoft.

So let's give the ODF warriors an opportunity to clearly communicate with the users. Give them the “HELPS MICROSOFT” target to publicly expose the issuer of the bug and the people working on it.

Field update --- preview for Windows.

I now have a preview for Windows available at http://download.go-oo.org/preview/oodemo.zip.

Simply download it and unzip it. To start, execute soffice.exe in ooo2.3/program/.

Same features as the Linux version, so no saving at this point.

And don't forget to give feedback :-)



IBM's Symphony.

Downloaded IBM's Symphony today to follow up on some of the problems discussed at the ODF Interop Camp. (Btw, it's sad that the ODF Camp people want to treat the problems as confidential.)

So back to Symphony. Why the hell did they cripple all the cool OpenOffice.org easter eggs?

So why is =game("StarWars") crippled?

And look what they have done to the lovely picture of the Calc team:

I think that contradicts the SISSL :-)

Update on field work --- Early preview available for Linux.

In a previous post I talked about my field-proof-of-concept. I continued to work on the issue and I'm happy to give an update on that front.

You can download a preview version of my work here:


(Linux only. Just untar the archive with tar xzf oodemo.tgz, then cd oodemo/program and start ./soffice.) This is a preview version. Do not use it for productive work! The preview demo shows

  • the core enhancements (tabbing!), and

  • .DOC import.

The work for .DOC export and ODF import/export is not done and is not included in the demo.

For testing you can download the sample file formc1new.doc which is taken from issue 79720. It should look like this:

Again --- this is work in progress, so do not expect everything to work. However, if you have issues, please let me know. And remember: “saving does not work yet :-)”.

I really hope I get some feedback.


I will make the patch available ASAP. It is the result of some weekend hacking --- it really needs some polishing first.