He has no friends, but he gets a lot of mail.

Monday 16 November 2009

Wikis and the rule of the feature

In Alfresco Share*, teams do their work in separate shared spaces known as Sites. As well as a library for traditional documents, every Site has a selection of components which allow different types of collaboration:
  • blog
  • wiki
  • forum
So far so fair enough; but when you provide a fairly common front-end to them, these three features become very similar. In fact, I can summarize the entire set of differences pretty easily:
  1. Wiki pages can't have the same title as each other.
  2. You can't comment on wiki pages.
  3. You can't link from one blog post to another, or from one forum thread to another.
  4. You can't sort by "largest number of comments" in the blog (or, of course, the wiki.)
  5. You can have a "draft" blog post, but not a draft of the other two.
  6. You can publish a blog post to an external web site, but not a wiki page or a forum thread.
Notice something about this? All of these differences are limitations imposed on "bunch of documents with comments." When choosing to post content, I need to choose which set of limitations to work within. Alfresco Share is frustrating not because it imposes these limitations, but because it makes it so obvious that the limitations are arbitrary.**
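As a rough sketch of that idea, the three components can be written down as one common model with capabilities switched off. The class and field names here are invented for illustration and are not taken from Alfresco Share; the true/false values simply restate the six differences listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Component:
    """One Site component, seen as 'documents with comments' minus some abilities.

    All field names are hypothetical labels for the six differences in the list.
    """
    duplicate_titles: bool   # 1. may two items share a title?
    comments: bool           # 2. can readers comment?
    link_markup: bool        # 3. can items link to each other via markup?
    sort_by_comments: bool   # 4. can the list be sorted by comment count?
    drafts: bool             # 5. can an item be saved as a draft?
    external_publish: bool   # 6. can an item be published to an external site?


COMPONENTS = {
    "blog": Component(duplicate_titles=True, comments=True, link_markup=False,
                      sort_by_comments=False, drafts=True, external_publish=True),
    "wiki": Component(duplicate_titles=False, comments=False, link_markup=True,
                      sort_by_comments=False, drafts=False, external_publish=False),
    "forum": Component(duplicate_titles=True, comments=True, link_markup=False,
                       sort_by_comments=True, drafts=False, external_publish=False),
}


def missing(name: str) -> list[str]:
    """Return the capabilities that the named component withholds."""
    component = COMPONENTS[name]
    return [f.name for f in fields(component) if not getattr(component, f.name)]


for name in COMPONENTS:
    print(name, "cannot:", ", ".join(missing(name)))
```

Seen this way, choosing a component amounts to choosing which list of missing abilities you can live with.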

Wiki pages can't have the same title as each other. This rule is only necessary because of the [square bracket] notation for creating links as you type, a feature that survives from the original Wiki philosophy. Ward Cunningham's wiki was designed from the first to encourage linking by making it easy. It was also designed to be really simple to code, so he decided to use a simple text box and apply a markup rule (in his case, RunningWordsTogether) to denote links. He also invented some actual markup, which has been superseded in Share by a JavaScript rich text editor. But the link markup remains as a relic. Linking could be made more intuitive with an application of some more JavaScript, and the constraint would not then be necessary.
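To see why title-based link markup forces unique titles, here is a minimal sketch of such a resolver. It is not Share's implementation: the single-bracket pattern, the function names and the URLs are all assumptions made for illustration.

```python
import re

# Assumed markup: a page title wrapped in single square brackets, e.g. [Release Plan].
LINK = re.compile(r"\[([^\[\]]+)\]")


def add_page(pages: dict[str, str], title: str, url: str) -> None:
    """Register a page under its title, refusing duplicates.

    The title is the only key a [Title] link carries, so two pages with the
    same title would make every such link ambiguous.
    """
    if title in pages:
        raise ValueError(f"a page titled {title!r} already exists")
    pages[title] = url


def render_links(text: str, pages: dict[str, str]) -> str:
    """Replace each [Title] with the URL registered for that title."""
    def resolve(match: re.Match) -> str:
        title = match.group(1)
        # Unknown titles are left exactly as typed.
        return pages.get(title, match.group(0))

    return LINK.sub(resolve, text)


pages: dict[str, str] = {}
add_page(pages, "Release Plan", "/wiki/release-plan")  # hypothetical URL
print(render_links("See [Release Plan] and [Missing].", pages))
```

A link widget that stored a page identifier rather than the title would not need the uniqueness check at all.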

You can't comment on wiki pages. Traditionally you would edit the actual page to reflect a discussion, which would then get refactored from "Thread Mode" into "Document Mode" as the outcome of the discussion. You can still do this in Share, if you have the rights to. But if you have these rights, you can also go into a blog post or a forum and edit it. Limiting the wiki feature like this is purely to force people to use the wiki like a wiki is traditionally used. (And in many modern wiki implementations, you can comment on wiki pages.)

You can't link from one blog post to another, or from one forum thread to another. Actually, you can, as long as you're happy typing in the URL. This restriction only exists because the wiki link markup convention doesn't apply to the blog or the forum. Again, a decent "make link" widget would completely remove this problem.

You can't sort by "largest number of comments" in the blog. This is the only real difference between the forum and the blog, except for:

You can have a "draft" blog post, but not a draft of the other two. It's a useful feature, so why implement it in only one component? Many forum apps have draft functionality.

You can publish a blog post to an external web site, but not a wiki page or a forum thread. Another useful feature cruelly denied to users of 2 out of 3 components.

I don't want to appear like I am bashing Share here. In fact, we accept the arbitrary restrictions because they are a core part of the feature's identity. A wiki would not qualify as a wiki, in the minds of the Alfresco Share developers, if it completely removed the Wiki Markup. So it remains.**

The blog/forum distinction is even more subtle, because the core difference here is not even one of functionality. Simply, we are encouraged to pay more attention to the initiating post than to the comments in a blog, and to the thread as a whole in a forum. There is a whole continuum operating here. Often the first post in a forum thread is edited by the original poster to summarize discussion in the thread; likewise, some blogs are as well known for the activity in the comments section as for the original content. Political Betting is an example: the blog's authors often have to create new posts purely as a way of separating diverging threads of discussion.

Presented with a choice of site features in a system like Share, the interplay of limitations and expectations can make it hard to decide which component to use for a particular piece of information. But it seems to be important enough to us that we have a feature we can name, and assign a purpose to, that we will accept these issues anyway.***

Google Wave, to take a counterexample, has tried to approach the problem in the classic computer scientist way: by abstracting out as many of the restrictions as possible. As a result, people have struggled with working out how they are supposed to use it. Perhaps we can't win.

* Alfresco is a document management system, and Share is its "groupware" front-end. We've been trialling it as a collaborative tool as well as an improvement to the shared drive for documents. It's available under an open source license, which is important for small and cheap companies like us.

**But a wiki would not qualify as a wiki, in the mind of Ward Cunningham, if (as in Share) only a restricted set of people can edit it. One for a subsequent post.

***Look at Twitter; you could have done the same thing for years in Blogger or Livejournal just by ignoring the "post body" box. But you didn't. 

Friday 13 November 2009

Regression

We are in regression this week.

This doesn't mean we behave more childishly than usual (though maybe we do), but that the developers have merged down all their features into the trunk in source control, and now they're testing whether anything has broken.

As part of that, all my topics have also moved into trunk, and I've been building the first feature-complete documents for the upcoming release, for QA to review.

A bit of an oversight in the whole source control plan is that this leaves me working on trunk for the week. This shouldn't be allowed, but they let me because fixes to documentation bugs don't need testing.

What should happen is for me to do all my reactive work on my own development branch which stays open after the rest of the product is frozen. When the release goes out, there will always be tweaks to be made which don't go with a feature, so it's good for that too.

Indexes

Compiling indexes must be one of the least-missed aspects of traditional document writing. It's the sort of thing a professor might get a grad student to do in order to break his spirit.

Nowadays we don't need to do the actual sorting and compilation, and in an environment that promotes re-use, we can make bits of content automatically produce consistent index entries. So far so good!

But as I've been going along I've realised I missed an important aspect of index writing: index entries are not tags.

Having previously used any excuse to avoid compiling an index, and as a child (or at least cousin) of Web 2.0, I hadn't really understood what they were about, and gaily added generic index terms to my topics, such as "security".

This led to my index looking like this:

security,10
security,10
security,11
security,12
security,15
security,21
security,22
security,46

No worries, I thought, I can get the style sheet to bunch up all the duplicate entries and produce something more index-like:

security,10,11,12,15,21,22,46

I've seen this in lots of books, so it must be okay, yes?
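The bunching step itself is simple. The sketch below stands in for whatever the style sheet would do; the function name and the (term, page) tuple format are assumptions for illustration.

```python
from collections import defaultdict


def bunch(entries):
    """Collapse repeated (term, page) pairs into one line per term."""
    pages_by_term = defaultdict(set)
    for term, page in entries:
        pages_by_term[term].add(page)  # a set drops the repeated page 10
    return [
        term + "," + ",".join(str(page) for page in sorted(pages))
        for term, pages in sorted(pages_by_term.items())
    ]


raw = [("security", p) for p in (10, 10, 11, 12, 15, 21, 22, 46)]
print("\n".join(bunch(raw)))  # prints: security,10,11,12,15,21,22,46
```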

Trouble is, it's really unusable. If I look up "security" in the index, I then have to visit 8 places throughout the document to see if that page happens to mention it in the way I need. What would be better would be to provide more context:

security, 10
security, policy, 10
security, over the network, 11
and so on.

An index is used like a search engine, and since it's not interactive, such a search engine must be able to tell you in advance which queries are the "good" ones, where a query is only good if it narrows down the results to just one.

After you've disambiguated everything, then, you can start bunching up entries:

security, 10; policy, 10; over the network, 11,...
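Bunching disambiguated entries needs one more level of grouping. This is a sketch only: the (term, sub-entry, page) tuple format is an assumption, and ordering sub-entries by their first page is one possible choice among several.

```python
def group(entries):
    """Turn (term, sub_entry, page) tuples into one index line per term.

    sub_entry is None for the bare main entry.
    """
    grouped = {}  # term -> sub_entry ("" for the main entry) -> set of pages
    for term, sub, page in entries:
        grouped.setdefault(term, {}).setdefault(sub or "", set()).add(page)

    def pages_text(pages):
        return ", ".join(str(page) for page in sorted(pages))

    lines = []
    for term in sorted(grouped):
        subs = grouped[term]
        # The line always starts with the term, with pages if a bare entry exists.
        head = term + (", " + pages_text(subs[""]) if "" in subs else "")
        # Sub-entries follow, ordered by first page and then alphabetically.
        ordered = sorted((s for s in subs if s), key=lambda s: (min(subs[s]), s))
        parts = [head] + [f"{s}, {pages_text(subs[s])}" for s in ordered]
        lines.append("; ".join(parts))
    return lines


entries = [
    ("security", None, 10),
    ("security", "policy", 10),
    ("security", "over the network", 11),
]
print("\n".join(group(entries)))
```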

This approach clearly needs to be applied throughout the topic base, so now I'm considering doing an audit of the whole thing to see which index entries are duplicated and disambiguating them - since I don't know in advance which entries are going in which document.
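Such an audit could be automated along these lines. This is a sketch under assumptions: the topic ids are made up, and it assumes each topic's index entries can be extracted as (term, sub-entry) pairs, which depends on how the topic base stores them.

```python
from collections import defaultdict


def ambiguous_entries(topics):
    """Find index entries that more than one topic uses in identical form.

    topics maps a topic id to its list of (term, sub_entry) pairs,
    with sub_entry set to None for a bare term.
    """
    used_by = defaultdict(set)
    for topic_id, entries in topics.items():
        for entry in entries:
            used_by[entry].add(topic_id)
    # Any entry shared by two or more topics needs disambiguating.
    return {entry: sorted(ids) for entry, ids in used_by.items() if len(ids) > 1}


topics = {
    "firewall": [("security", None)],
    "passwords": [("security", None), ("security", "policy")],
    "backup": [("backup", None)],
}
for (term, sub), ids in ambiguous_entries(topics).items():
    print(term, sub, "is shared by", ", ".join(ids))
```

Because the check runs over the whole topic base rather than one document, it catches clashes regardless of which topics end up in which document.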


The exercise of writing good index entries is also helpful when writing topics, as it forces you to really think about what this topic is about, and how it distinguishes itself from all the other topics in the repository.

Tuesday 3 November 2009

Content independence

Today's post from Tom Johnson of I'd Rather Be Writing talks about "content independence" and how a wiki can help you achieve it.

Tom's argument starts with this observation:
  • Authors need to update documentation after release.
This is because documents have bugs in. They have more bugs in than they might, because authors don't get time to fully check their work before release. OK, so this really means:
  • The time required to create full documentation with no bugs is too much for a project manager to include in the plan.
Why's this? Because the product is ready; we need to release it or risk being behind the market. But we spent a lot of time in QA fixing the bugs in the product; why are bugs in the documentation worth any less to the project manager? I think there are two perceptions at work:
  1. Documentation is not absolutely essential. The product is "good enough", and will work even if nobody knows how to use it. In the worst case, we can send a support engineer over to the customer to make it work for them.
  2. Bugs in documentation are easy to fix. Most problems with documentation don't require a complete rewrite of large chunks of material, whilst bugs in software can sometimes be traced back to bad design assumptions and lead to large refactoring projects.
Also, and because of these two:
  • Fixes to documentation bugs don't need testing. Because of 2., the fix has already been adequately "unit tested" by the writer, which is normally enough, and because of 1., the product won't break if we incorporate it.
And because of this:
  • Documentation doesn't need release management. Project managers allow authors to provide what they can with the release, and fix later.
This is where Tom comes in: working in this environment, he naturally looks to wikis and other online tools to make fixing easier. But there are environments where the core assumptions don't hold, for example:

  • Correct documentation may be a legal or contractual requirement. In some fields, particularly safety-critical ones, the documentation must be correct on the first try. Many clients like to see full lists of changes to documentation as a condition of acceptance. This breaks assumption 1.
  • The supply channels for updates may be limited. In a world of wikis, we forget that people still naturally turn to the materials provided with the product when there is a problem, if only to find out where the wiki is. Also, some products are used in situations where it's impractical to check online for changes to documentation. (This leads to situations like "If your modem is not working, check our website for instructions".) Again, this is assumption 1 failing.
  • Documentation may be widely translated. A small change to a document may require weeks of sending round to multiple translation agencies to implement it. This breaks assumption 2. (Though if you don't mind some translations lagging behind the "native" documentation, you can split the task up so that it's not a problem.)
Ultimately, whether or not documentation should be release managed depends on the product and its position in the market as much as on the documentation itself; but authors often end up making this decision on their own.

In my company, as for many smaller software businesses, I think the assumptions hold, and I'm working at establishing an online knowledge base which is updated independently of the release process. I hope that it won't take me five years to get the idea past IT, though.
