Archive for the ‘metadata’ category

IM Trends 2 – CMIS will save us

One of the big challenges for Enterprise Content Management in the last few years has been the sharing of different content types. ECM covers records, documents, images, emails, forum posts, web content, lists, people profiles, and more recently blog posts, wiki pages, and microblogging. These content types were managed in different stores. Traditionally the only way to get single sourcing of content and sharing/reuse/blending of different content types across different stores was to buy all of the solution components from one vendor. Because of the fast-moving nature of the industry even that was problematic, as most of the players grew by acquisition, picking up different pieces of the ECM stack from companies they bought. Sometimes those pieces weren’t well integrated, and compatibility/reuse worked only at a very surface level, or was technically difficult to implement.

For organisations that couldn’t afford large integrated ECM stacks (which includes the very large majority of NZ organisations), the promise of single sourcing and content reuse seemed a far off dream.

Enter CMIS – the Content Management Interoperability Services standard. Think of it in the same light as the way major database vendors standardised on SQL in the 1980s. CMIS was formally initiated in October 2008 by OASIS, following work by EMC, IBM, Microsoft, Alfresco and others on the proposed standard. It is now governed by a multi-vendor technical committee that works to:

“standardize a Web services interface specification that will enable greater interoperability of Enterprise Content Management (ECM) systems. CMIS uses Web services and Web 2.0 interfaces to enable rich information to be shared across Internet protocols in vendor-neutral formats, among document systems, publishers and repositories, within one enterprise and between companies.”

More specifically, CMIS provides standards for a set of Web Services and RESTful APIs to allow different content repositories and systems to:

  • search for and discover what different content types (Object Type definitions in CMIS language) and capabilities exist in a repository
  • create, read, update and delete content objects
  • file and categorise content objects
  • navigate and traverse a hierarchy of folders in a repository
  • create versions of content objects and see their version history
  • query to retrieve content objects by specific criteria
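
To make the list above concrete, here is a toy in-memory model of those six capabilities. It is only a sketch: the class `ToyRepository`, its method names and the type ids `document` and `folder` are hypothetical names chosen for illustration, not the CMIS bindings or any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentObject:
    """One version of a content object in the toy repository."""
    object_id: str
    type_id: str                      # hypothetical type ids: "document", "folder"
    name: str
    parent_id: Optional[str] = None   # id of the folder the object is filed in
    version: int = 1
    properties: Dict[str, str] = field(default_factory=dict)


class ToyRepository:
    """In-memory stand-in for a content repository (not a CMIS binding)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[ContentObject]] = {}
        self._next_id = 1

    def type_definitions(self) -> List[str]:
        # Discovery: which object types does this repository offer?
        return ["document", "folder"]

    def create(self, type_id: str, name: str,
               parent_id: Optional[str] = None, **properties: str) -> ContentObject:
        if type_id not in self.type_definitions():
            raise ValueError(f"unknown type: {type_id}")
        obj = ContentObject(f"obj-{self._next_id}", type_id, name,
                            parent_id, 1, dict(properties))
        self._next_id += 1
        self._history[obj.object_id] = [obj]
        return obj

    def read(self, object_id: str) -> ContentObject:
        # Always returns the latest version.
        return self._history[object_id][-1]

    def update(self, object_id: str, **properties: str) -> ContentObject:
        # Each update appends a new version rather than overwriting.
        latest = self.read(object_id)
        new = ContentObject(latest.object_id, latest.type_id, latest.name,
                            latest.parent_id, latest.version + 1,
                            {**latest.properties, **properties})
        self._history[object_id].append(new)
        return new

    def delete(self, object_id: str) -> None:
        del self._history[object_id]

    def children(self, folder_id: str) -> List[ContentObject]:
        # Navigation: latest versions of the objects filed in a folder.
        latest = (versions[-1] for versions in self._history.values())
        return [obj for obj in latest if obj.parent_id == folder_id]

    def versions(self, object_id: str) -> List[int]:
        return [obj.version for obj in self._history[object_id]]

    def query(self, **criteria: str) -> List[ContentObject]:
        # Query: latest versions whose properties match every criterion.
        latest = (versions[-1] for versions in self._history.values())
        return [obj for obj in latest
                if all(obj.properties.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    repo = ToyRepository()
    policies = repo.create("folder", "Policies")
    doc = repo.create("document", "Travel policy",
                      parent_id=policies.object_id, status="draft")
    repo.update(doc.object_id, status="published")
    print(repo.versions(doc.object_id))
    print([obj.name for obj in repo.query(status="published")])
```

A real CMIS repository exposes the same kinds of operations over the wire; the point of the standard is that a client written against them does not need to know which vendor's store is answering.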

Currently the specification is at version 0.63 and is actively being worked on. It provides a Domain Model, a Schema, and sets of bindings for RESTful AtomPub, and Web Services. These are available here.

So what does this mean in practice? Once implemented it will be a way to break down the silos, and enable reuse of content amongst multiple systems. It should allow ECM applications, portals, and intranets to be built that aggregate content from a range of CMIS compliant repositories, and allow them to be mixed and mashed up in a ‘loosely coupled’ way. You’ll be able to have best of breed repositories/content applications, from different vendors, and join them together seamlessly.

Let’s look at some practical examples.

Scenario 1

Imagine you’re a government agency with a web site built in Drupal, and you’ve implemented Alfresco for records and document management. You’ve got a set of policy documents that you need to publish on the web. The traditional method would have been to work on the documents in the document management system, create a final version, send it to your web manager who’d upload it to the web site’s document repository, delete the old version, and make sure the new version appears in the right places on the site.

With CMIS you’d be able to have a content store for published documents in Alfresco, with appropriate metadata describing them. You’d then have a live query from Drupal to Alfresco using CMIS to retrieve those documents and display them. No going through the web manager, no uploading and deleting documents to and from the web site, just the completion of a controlled publication process, with the documents automatically displaying on the site. This example is already achievable with the CMIS Drupal-Alfresco module, and Alfresco’s draft CMIS implementation in Alfresco Community 3.1 and above.
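
As a rough illustration of what such a live query might look like, the sketch below builds a CMIS query statement for the documents filed in a "published" folder. It makes several assumptions: the property names (`cmis:name`, `cmis:objectId`, `cmis:lastModificationDate`) and the backslash escaping follow the later CMIS 1.0 query language and may differ in the draft discussed here, the folder id `folder-123` is made up, and the function names are hypothetical. It does not show how the Drupal module itself builds its queries.

```python
def cmis_string_literal(value: str) -> str:
    """Quote a Python string as a CMIS query string literal.

    Assumes CMIS 1.0 escaping, where backslash and single quote
    are each escaped with a backslash.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def published_documents_query(folder_id: str, name_contains: str = "") -> str:
    """Build a query for documents filed directly in the given folder.

    Note: '%' and '_' inside name_contains are passed through unchanged,
    so they act as LIKE wildcards.
    """
    clauses = [f"IN_FOLDER({cmis_string_literal(folder_id)})"]
    if name_contains:
        pattern = cmis_string_literal("%" + name_contains + "%")
        clauses.append(f"cmis:name LIKE {pattern}")
    where = " AND ".join(clauses)
    return (
        "SELECT cmis:objectId, cmis:name, cmis:lastModificationDate "
        "FROM cmis:document "
        f"WHERE {where} "
        "ORDER BY cmis:lastModificationDate DESC"
    )


if __name__ == "__main__":
    # 'folder-123' is a made-up folder id for illustration.
    print(published_documents_query("folder-123", "policy"))
```

The web site would send a statement like this to the repository's query service and render whatever comes back, so publishing a document becomes a matter of filing it in the right folder.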

Scenario 2

Let’s say you’re a university and you’ve implemented Microsoft Sharepoint to manage structured content including course information, news items, and staff profiles. You love Sharepoint’s handling of content workflows for news production and editing, and its ease of integration with Microsoft Office, but you want to publish the news items in multiple places including the public web site, the staff Intranet, and the learning management system. For various reasons these are built in EpiServer, Plone, and Moodle respectively. You’d also like some news items to be published to the new research collaboration system built in Sakai. Through CMIS you could have the news items stored in Sharepoint, and accessible from each of these systems, again with simple queries via REST or SOAP. Let’s say you’re also using Sharepoint for your records management solution. Documents put into Moodle and Sakai could then automatically result in copies of the correct versions being stored in Sharepoint for appropriate retention and disposal.

While this example isn’t all achievable yet, you can already use Sharepoint Server 2007 to access external content repositories using CMIS. Here’s how.

Conclusions

CMIS will open up the enterprise content management space to more innovation, remixing, and creative solutions than we’ve ever seen before. Organisations will be able to choose best of breed components, and glue them together with relatively minimal effort. Solutions won’t be restricted by vendor lock-in, but will be responsive to real business/user needs.

This is the second in a set of posts on NZ information management trends:

  1. OpenSource ECM
  2. CMIS will save us
  3. Enterprise Social Computing
  4. Doing Sharepoint wrong, and right
  5. Structured Content
  6. Toes in the mist

Next to come, Enterprise Social Networking

Australasian geospatial metadata, standards, spaghetti and disappearing spacecraft

I’ve just been to the ANZLIC metadata presentation held by Land Information New Zealand (LINZ).

ANZLIC is the Australia & New Zealand Spatial Information Council. They provide leadership in the collection, management and use of spatial information in Australasia. In Australia they are working on a national address register, including standards, schema etc., but stop short of the implementation.

They are associated with, but independent from, the [Australian] Office of Spatial Data Management (OSDM), which facilitates and coordinates spatial data management across Australian Government agencies.

ANZLIC is working on a range of initiatives, including ANZsi, a spatial marketplace, similar to GeoConnections in Canada. This will provide a marketplace for all spatial resources in Australasia. It will include integration with existing supply-side infrastructure and initiatives, and anticipates demand-side involvement.

They believe that spatial data use is becoming an everyday thing, involving off-the-shelf technology and increased user knowledge (due to Google Maps, Google Earth etc), and driven in part by the fact that at least 80% of government transactions have a ‘where’ component. They challenged us to think of what fell into the other 20%, and the audience couldn’t come up with any government transactions that don’t have a spatial component.

ACIL Tasman did a study which estimated that inefficient access to data reduces the direct productivity of some sectors by between 5% and 15%. (Summary of findings here). ANZLIC sees metadata as an important solution to this problem.

They used the metaphor of a can of spaghetti to explain what metadata is. The can’s label includes a title (product name), an abstract (product description), a statement of quality (99% fat free, no artificial preservatives or colours), instructions on use (heating/cooking directions), a detailed list of fields in the data (the ingredients), and the extent of the data (weight, nutritional information). They also illustrated the importance of using standards with this story: “Two Teams, Two Measures Equaled One Lost Spacecraft”.
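
The spaghetti-can label maps naturally onto a simple record structure. The sketch below is only an illustration of that mapping: the class `MetadataRecord` and its field names are hypothetical and are not the element names of the ANZLIC Metadata Profile.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataRecord:
    """The label on the can, expressed as a record (field names are illustrative)."""
    title: str                 # product name
    abstract: str              # product description
    quality: str               # statement of quality
    usage: str                 # instructions on use
    attributes: List[str] = field(default_factory=list)  # the "ingredients"
    extent: str = ""           # how much the data covers

    def missing(self) -> List[str]:
        """Names of label elements that have been left empty."""
        return [name for name, value in vars(self).items() if not value]


if __name__ == "__main__":
    label = MetadataRecord(
        title="Spaghetti",
        abstract="Spaghetti in tomato sauce",
        quality="99% fat free, no artificial preservatives or colours",
        usage="Heat before serving",
        attributes=["pasta", "tomato", "salt"],
    )
    print(label.missing())
```

A check like `missing()` is the kind of thing a metadata tool can automate: a can without a weight on the label, or a dataset without a stated extent, is harder for anyone else to use with confidence.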

ANZMet Lite is a tool that has been developed by the OSDM, with the help of the jurisdictions. Its target user groups are organisations with up to 30 resources requiring metadata records to be published, and contractors who collect resources on behalf of clients and are required to provide metadata records. It allows for the production of linked (connected to the resource) or unlinked metadata records. It also allows for parent/child relationships between metadata. There are a number of classes in the parent/child hierarchy, including dataset, service, model, tile, document, and many others.
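
The linked/unlinked distinction and the parent/child hierarchy can be pictured as a small tree. This is a minimal sketch under stated assumptions: `MetadataNode`, its attributes and the example file paths are hypothetical, and it does not reflect how the tool actually stores its records.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class MetadataNode:
    """A metadata record that may have child records (illustrative names)."""
    title: str
    scope: str                              # hierarchy class, e.g. "dataset" or "tile"
    linked_resource: Optional[str] = None   # None means an unlinked record
    children: List["MetadataNode"] = field(default_factory=list)

    def add_child(self, child: "MetadataNode") -> "MetadataNode":
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "MetadataNode"]]:
        """Yield (depth, node) pairs, parents before their children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


if __name__ == "__main__":
    # Made-up example: one dataset record with two tile records beneath it.
    dataset = MetadataNode("Topographic maps", "dataset")
    dataset.add_child(MetadataNode("Sheet A", "tile", linked_resource="sheet_a.tif"))
    dataset.add_child(MetadataNode("Sheet B", "tile"))
    for depth, node in dataset.walk():
        status = "linked" if node.linked_resource else "unlinked"
        print("  " * depth + f"{node.title} ({node.scope}, {status})")
```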

There is also the ANZLIC metadata profile, and the profile guidelines, which include a mapping between AGLS / NZGLS and the ANZLIC Metadata Profile.

The tool is pseudo opensource, in that its origins were in the Australian Defence Force, who won’t let it be fully opensourced. You can however get the source code, and modify it, as long as you notify OSDM of the changes, and provide them back.

LINZ is working with MoRST to create a GeoNetwork node for NZ. In the meantime, metadata created using ANZMet Lite can be emailed to nzgo@linz.govt.nz for external publishing. More information on NZ Geospatial Office activity is available at www.geospatial.govt.nz.
