What is he building in there?

He has no friends, but he gets a lot of mail.

Wednesday 13 January 2010

User focus - how far do we go?

Happy 2010!

I came down pretty heavily on the side of user-focused documentation in my last post, with talk of "enacting use environments" as a way of deciding what documents to provide, and what content to put in them.

Briefly, to produce a manageable documentation set, we need to identify a set of users with similar goals and work with them through some kind of dialogue so that we and they inhabit the same use environment.

This is manageable enough when the only participants in the use environment are the system and the user, but it becomes a lot harder when more than two entities are involved.

A hypothetical company (let's call it Fictive, Inc.) provides two server software products, "Base" and "Do Stuff". "Do Stuff" needs "Base" to work, so when you install "Do Stuff", "Base" gets installed on the same server automatically. "Base" provides the Web interface to "Do Stuff", one which could equally front any other product built on "Base"; "Do Stuff" itself just defines a bunch of functions with parameters.

Today, I am the technical author/content strategist/information architect/knowledge manager* for Fictive. In my drive for user-focused documentation, I have been treating the Fictive software as a single unit, because that's how it's experienced. To do some stuff, the user needs to know both the function names and parameters ("Do Stuff") and the way to encode them over the Web ("Base"), so I combine these into structured use cases with a bunch of examples of how to do each thing.
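
To make this concrete, a combined use case might read something like the sketch below. Every name in it is invented, Fictive being fictive; the point is that a single example teaches the function and its parameters ("Do Stuff") and the Web encoding ("Base") at the same time:

    # A hypothetical combined example: the function and its parameters
    # belong to "Do Stuff"; the endpoint and encoding belong to "Base".
    # All names here are invented for illustration.
    import json
    import urllib.request

    request = urllib.request.Request(
        "http://fictive-server.example/base/api",  # "Base": where and how to call
        data=json.dumps({
            "function": "do_stuff",                # "Do Stuff": what to call...
            "params": {"input": "samples.csv", "mode": "full"},  # ...and with what
        }).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        print(response.read().decode("utf-8"))

Split the documentation and an example like this has to be torn in half.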

Now Fictive makes the decision to promote "Base" and "Do Stuff" as independent products, and my user-focused documentation for "Do Stuff" suddenly balloons to include all the flavours of "Base" we want to support. But if I split "Base" from "Do Stuff" in the documentation, I force the user to learn about the interface: the functions will have to go in one document, and the way to call them in another. Concrete examples become impossible.

Within Fictive, we can spot this, and it's clear that it would also cause support headaches, so we end up recommending a particular version of "Base" for our release of "Do Stuff"; the server documentation can stay together, nicely user-focused.

Unfortunately this isn't the end of it. Fictive's server isn't actually what our customer cares about. All the data that goes into "Do Stuff" comes out of a separate system made by a sister company of ours (say, Ficsis GmbH). Ficsis has written its own UI for its product, and that UI ultimately controls what stuff is done. The customer needs to know how to work it so that the right answers come out of "Do Stuff". Is that our problem, or Ficsis's? What ends up happening is that the user gets burdened with learning about another interface, inhabiting two use environments simultaneously, with Ficsis and Fictive pulling them in two directions at once.

In fact it's worse than this: the user really cares about an entire system, which sits in a rack and contains servers from four or five different vendors. Each vendor has its own documentation team, busy enacting use environments and trying to mould the user to fit its own convenient model. This is why systems integrators have jobs. But the problems started much earlier, back when we defined an interface between "Base" and "Do Stuff".

So what's the answer? I think I made a mistake right at the beginning. The use environment of the "Base" and "Do Stuff" documentation isn't the customer's at all -- it's the systems integrator's. They are the only people who know what the final user will want to do, and how they will be able to achieve it.

Now, I made this mistake because Fictive often does the systems integration work itself, and is too small to have a dedicated documentation team for each project, so the core documentation has to serve two purposes. But that is a compromise Fictive has had to make as a company, and however much it may annoy me to move away from user focus, I have to realise that my idea of "user focus" didn't see the big picture in the first place.

* I really must pick a job title this year.

Thursday 3 December 2009

Single sourcing is a bad name for multiple sourcing

Michael Hiatt's The Myth of Single Source Authoring has caused a bit of comment in the technical communication arena. But it's not an attack on single sourcing, at least not on single sourcing as I understand it. The clue is in his second paragraph:
And who will be our emerging heroes to fill the promise of content reuse and localization savings? Knowledge mashups and applications using cloud-based linked data and the emergence of the semantic Web.
Wait, what? Single sourcing is essential for knowledge mashups! Let me explain:

Single sourcing may be a bad name, because single sourcing does not mean "a tightly-controlled, single, authoritative source for all information, presented in a canonical form which will be used regardless of the output format or the audience." It certainly doesn't mean, as he puts it, "the belief that static authoring from a single vantage point from a single author paid by a single organization is a workable system". Of course that isn't workable; wasn't it precisely what single sourcing was developed to overcome?


For me, single sourcing means "for each piece of information, having an identifiable owner, and empowering that owner to act as a single source for that information, in whatever information use environment it is presented."
 

In the old days, every document had a single author (paid by a single organization), which meant that the same information was presented in different ways in different documents. And this is the most important point Michael makes: there was nothing wrong with that, because a well-designed documentation set is broken up into documents aimed at different use environments, so each document should be written in a different way. The biggest mistake in the single-sourcing world is the idea that you can reuse authored topics effectively between use environments; even Wikipedia knows that.


The author's job, in both traditional and single-sourced contexts, is to identify an information use environment, in fact, to enact an information use environment. "Information use environment" includes audience, language, culture, expectations, anything which affects how someone uses information. There are technically as many information use environments as there are occasions a person has to use information; but readers are malleable, and willing to mould their environment to some extent to fit with the information they have access to.


Once the use environment has been enacted, and agreed between author and reader, the author can "suit" the information she presents to that environment. The difference between traditional authoring and single-source authoring is that in a single-sourced system, the "suiting" happens at the single, original source of the information, or at least at the point where it enters the author's domain, not at the point where it leaves to be assembled into a document.


For material which is authored in-house, the difference is small, since it is the author who originated the material. Likewise, for material which an organization aims at a very specific information use environment, the difference is small, since it is the author's organization that enacted that use environment.


So how can we benefit from single sourcing? The key is in that action of enacting. In a small company, the sole technical author bears responsibility for enacting the use environment. Because she enacted it, she finds she cannot reuse anyone else's information. As organizations grow, authoring teams share house standards, and the enacted use environments get codified so that authoring teams can successfully collaborate. If, as Michael says, "a writer seldom grabs a topic wholesale and places it into his or her document. Topics rarely meet all needs of the author and usually throw off the context and purpose of the document", this is a symptom of a lack of standards in the organization, so that individual authors are making their own decisions about the target audience.


Sometimes this is the right thing to do, and as an organization grows, naturally the number of use environments it is exposed to also grows; but there is always a core, the "standard documentation set", whose use environment has been fully enacted and formalized.


Which is where mashups come in. We cannot expect a mashup to be successful unless we share enacted use environments between organizations; ultimately, globally. But when this happens, it will be a revolution, because readers of any content from any organization will understand their role in the environment; it will become part of the culture of information use, not just part of the house style of an organization. (Look at the "enacted use environment" of pictograms in airports and other places: a truly global standard, which almost everyone thinks of as "intuitive" purely because it's so culturally ingrained.)


With single sourcing, once we have agreed (to whatever extent) to enact a particular use environment and write content for it, an organization will be able to re-use content from any other organization, anywhere, and it will fit in seamlessly. Without it, an organization will always have to rewrite information so that it speaks their language.


And that is why single sourcing is really multiple sourcing.

Monday 16 November 2009

Wikis and the rule of the feature

In Alfresco Share*, teams do their work in separate shared spaces known as Sites. As well as a library for traditional documents, every Site has a selection of components which allow different types of collaboration:
  • blog
  • wiki
  • forum
So far, fair enough; but when you provide a fairly uniform front-end to all three, these features become very similar. In fact, I can summarize the entire set of differences pretty easily:
  1. Wiki pages can't have the same title as each other.
  2. You can't comment on wiki pages.
  3. You can't link from one blog post to another, or from one forum thread to another.
  4. You can't sort by "largest number of comments" in the blog (or, of course, the wiki.)
  5. You can have a "draft" blog post, but not a draft of the other two.
  6. You can publish a blog post to an external web site, but not a wiki page or a forum thread.
Notice something about this? All of these differences are limitations imposed on a "bunch of documents with comments". When choosing where to post content, I need to choose which set of limitations to work within. Alfresco Share is frustrating not because it imposes these limitations, but because it makes it so obvious that the limitations are arbitrary.**

Wiki pages can't have the same title as each other. This rule is only necessary because of the [square bracket] notation for creating links as you type, a feature that survives from the original Wiki philosophy. Ward Cunningham's wiki was designed from the first to encourage linking by making it easy. It was also designed to be really simple to code, so he used a plain text box and applied a markup rule (in his case, RunningWordsTogether) to denote links. He also invented some actual markup, which has been superseded in Share by a JavaScript rich text editor; but the link markup remains as a relic. Linking could be made more intuitive with a little more JavaScript, and the constraint would not then be necessary.
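
The original trick is cheap enough to sketch in a few lines, which is presumably part of why it survives. Here's a rough illustration of a square-bracket rule; this is not Share's actual implementation, and the /wiki/ URL scheme is invented:

    # A sketch of a square-bracket link rule, in the spirit of the
    # original wiki markup; not Share's implementation. The /wiki/
    # URL scheme is invented.
    import re

    def render_links(text):
        """Turn [Page Title] into an HTML link to that page."""
        def link(match):
            title = match.group(1)
            return '<a href="/wiki/%s">%s</a>' % (title.replace(" ", "_"), title)
        return re.sub(r"\[([^\[\]]+)\]", link, text)

    print(render_links("Refactor [Thread Mode] into [Document Mode]."))

Written out like this, the uniqueness rule is obvious: the title is the whole address, so no two pages can share one.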

You can't comment on wiki pages. Traditionally you would edit the page itself to reflect a discussion, which would then get refactored from "Thread Mode" into "Document Mode" as the outcome of the discussion. You can still do this in Share, if you have the rights to do so. But if you have those rights, you can also go into a blog post or a forum thread and edit it. Limiting the wiki feature like this serves purely to force people to use the wiki the way a wiki is traditionally used. (And in many modern wiki implementations, you can comment on wiki pages.)

You can't link from one blog post to another, or from one forum thread to another. Actually, you can, as long as you're happy typing in the URL. This restriction only exists because the wiki link markup convention doesn't apply to the blog or the forum. Again, a decent "make link" widget would completely remove this problem.

You can't sort by "largest number of comments" in the blog. This is the only real difference between the forum and the blog, except for:

You can have a "draft" blog post, but not a draft of the other two. It's a useful feature, so why implement it in only one component? Many forum apps have draft functionality.

You can publish a blog post to an external web site, but not a wiki page or a forum thread. Another useful feature cruelly denied to users of 2 out of 3 components.

I don't want to appear to be bashing Share here. In fact, we accept the arbitrary restrictions because they are a core part of each feature's identity. A wiki would not qualify as a wiki, in the minds of the Alfresco Share developers, if it completely removed the Wiki Markup. So it remains.**

The blog/forum distinction is even more subtle, because the core difference here is not even one of functionality. Simply, we are encouraged to pay more attention to the initiating post than to the comments in a blog, and to the thread as a whole in a forum. There is a whole continuum operating here. Often the first post in a forum thread is edited by the original poster to summarize the discussion in the thread; likewise, some blogs are as well known for the activity in the comments section as for the original content. Political Betting is one example: its authors often have to create new posts purely as a way of separating diverging threads of discussion.

Presented with a choice of site features in a system like Share, the interplay of limitations and expectations can make it hard to decide which component to use for a particular piece of information. But having a feature we can name, and assign a purpose to, seems to matter enough to us that we accept these issues anyway.***

Google Wave, to take a counterexample, has tried to approach the problem in the classic computer scientist way: by abstracting out as many of the restrictions as possible. As a result, people have struggled with working out how they are supposed to use it. Perhaps we can't win.

* Alfresco is a document management system, and Share is its "groupware" front-end. We've been trialling it as a collaborative tool as well as an improvement to the shared drive for documents. It's available under an open source license, which is important for small and cheap companies like us.

**But a wiki would not qualify as a wiki, in the mind of Ward Cunningham, if (as in Share) only a restricted set of people can edit it. One for a subsequent post.

***Look at Twitter; you could have done the same thing for years in Blogger or Livejournal just by ignoring the "post body" box. But you didn't. 

Friday 13 November 2009

Regression

We are in regression this week.

This doesn't mean we behave more childishly than usual (though maybe we do), but that the developers have merged all their features down into the trunk in source control, and are now testing whether anything has broken.

As part of that, all my topics have also moved into trunk, and I've been building the first feature-complete documents for the upcoming release, for QA to review.

A bit of an oversight in the whole source control plan is that this leaves me working on trunk for the week. This shouldn't be allowed, but they let me because fixes to documentation bugs don't need testing.

What should happen is that I do all my reactive work on my own development branch, which stays open after the rest of the product is frozen. Such a branch would also help after the release goes out, when there are always tweaks to be made which don't belong to any feature.

Indexes

Compiling indexes must be one of the least-missed aspects of traditional document writing. It's the sort of thing a professor might get a grad student to do in order to break his spirit.

Nowadays we don't need to do the actual sorting and compilation, and in an environment that promotes re-use, we can make bits of content automatically produce consistent index entries. So far so good!

But as I've been going along I've realised I missed an important aspect of index writing: index entries are not tags.

Having previously used any excuse to avoid compiling an index, and as a child (or at least cousin) of Web 2.0, I hadn't really understood what they were about, and gaily added generic index terms to my topics, such as "security".

This led to my index looking like this:

security,10
security,10
security,11
security,12
security,15
security,21
security,22
security,46

No worries, I thought, I can get the style sheet to bunch up all the duplicate entries and produce something more index-like:

security,10,11,12,15,21,22,46

I've seen this in lots of books, so it must be okay, yes?

Trouble is, it's really unusable. If I look up "security" in the index, I then have to visit seven places throughout the document to see whether each one happens to mention it in the way I need. What would be better would be to provide more context:

security, 10
security, policy, 10
security, over the network, 11
and so on.

An index is used like a search engine, and since it's not interactive, such a search engine must be able to tell you in advance which queries are the "good" ones, where a query is only good if it narrows the results down to just one.

After you've disambiguated everything, then, you can start bunching up entries:

security, 10; policy, 10; over the network, 11,...
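
The bunching itself is the easy, mechanical part. Here's a minimal sketch, assuming entries arrive as (term, subentry, page) tuples, which is not what any particular toolchain actually emits:

    # A minimal sketch of the bunching step, assuming entries arrive as
    # (term, subentry, page) tuples; real index toolchains differ.
    entries = [
        ("security", "", 10),
        ("security", "policy", 10),
        ("security", "over the network", 11),
        ("security", "over the network", 15),
    ]

    index = {}
    for term, sub, page in entries:
        pages = index.setdefault(term, {}).setdefault(sub, [])
        if page not in pages:  # collapse duplicate page references
            pages.append(page)

    for term, subs in sorted(index.items()):
        parts = []
        for sub, pages in subs.items():
            ref = ",".join(str(p) for p in pages)
            parts.append("%s, %s" % (sub, ref) if sub else ref)
        print("%s, %s" % (term, "; ".join(parts)))
    # prints: security, 10; policy, 10; over the network, 11,15

The hard part is the disambiguating, not the bunching.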

This approach clearly needs to be applied throughout the topic base, so now I'm considering an audit of the whole thing to see which index entries are duplicated, and disambiguating them, since I don't know in advance which entries will end up in which document.
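
At least the audit can be automated. A sketch, assuming (purely for illustration) that topics are text files carrying indexterm: lines:

    # A sketch of the duplicate-entry audit, assuming (for illustration
    # only) that topics are text files containing "indexterm:" lines.
    from collections import defaultdict
    from pathlib import Path

    used_in = defaultdict(set)
    for topic in Path("topics").glob("*.txt"):
        for line in topic.read_text().splitlines():
            if line.startswith("indexterm:"):
                used_in[line.split(":", 1)[1].strip()].add(topic.name)

    for term, topics in sorted(used_in.items()):
        if len(topics) > 1:  # the same bare term used by several topics
            print("disambiguate %r: %s" % (term, ", ".join(sorted(topics))))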


The exercise of writing good index entries is also helpful when writing topics, as it forces you to think hard about what the topic is really about, and how it distinguishes itself from all the other topics in the repository.

Tuesday 3 November 2009

Content independence

Today's post from Tom Johnson of I'd Rather Be Writing talks about "content independence" and how a wiki can help you achieve it.

Tom's argument starts with this observation:
  • Authors need to update documentation after release.
This is because documents have bugs in them. They have more bugs than they might, because authors don't get time to fully check their work before release. OK, so this really means:
  • The time required to create full documentation with no bugs is too much for a project manager to include in the plan.
Why's this? Because the product is ready; we need to release it or risk being behind the market. But we spent a lot of time in QA fixing the bugs in the product; why are bugs in the documentation worth any less to the project manager? I think there are two perceptions at work:
  1. Documentation is not absolutely essential. The product is "good enough", and will work even if nobody knows how to use it. In the worst case, we can send a support engineer over to the customer to make it work for them.
  2. Bugs in documentation are easy to fix. Most problems with documentation don't require a complete rewrite of large chunks of material, whilst bugs in software can sometimes be traced back to bad design assumptions and lead to large refactoring projects.
Also, and because of these two:
  • Fixes to documentation bugs don't need testing. Because of 2., the fix has already been adequately "unit tested" by the writer, which is normally enough, and because of 1., the product won't break if we incorporate it.
And because of this:
  • Documentation doesn't need release management. Project managers allow authors to provide what they can with the release, and fix later.
This is where Tom comes in: working in this environment, he naturally looks to wikis and other online tools to make fixing easier. But there are environments where the core assumptions don't hold, for example:

  • Correct documentation may be a legal or contractual requirement. In some fields, particularly safety-critical ones, the documentation must be correct on the first try. Many clients like to see full lists of changes to documentation as a condition of acceptance. This breaks assumption 1.
  • The supply channels for updates may be limited. In a world of wikis, we forget that people still naturally turn to the materials provided with the product when there is a problem, if only to find out where the wiki is. Also, some products are used in situations where it's impractical to check online for changes to documentation. (This leads to situations like "If your modem is not working, check our website for instructions".) Again, this is assumption 1 failing.
  • Documentation may be widely translated. A small change to a document may require weeks of sending round to multiple translation agencies to implement it. This breaks assumption 2. (Though if you don't mind some translations lagging behind the "native" documentation, you can split the task up so that it's not a problem.)
Ultimately, whether or not documentation should be release managed depends on the product and its position in the market as much as on the documentation itself; but authors often end up making this decision on their own.

In my company, as for many smaller software businesses, I think the assumptions hold, and I'm working at establishing an online knowledge base which is updated independently of the release process. I hope that it won't take me five years to get the idea past IT, though.

Friday 30 October 2009

Proactive and reactive

Writing the previous post, it occurred to me that I work in two modes most of the time:

  • First, I work with architects and developers, helping a feature get designed for usability, and writing as we go. My work as part of that team is part of the feature development. This is the proactive mode.
  • Later, I step back to view the product as a whole, and reactively describe the feature in that context. Developers can deliver a feature, say, "that's enough", and move on to the next one. I don't have that luxury; people will experience the feature as part of the complete system, and they need to know how to find it and work with it in the context of everything else the system is doing.
The "structured" / "traditional" authoring debate is a struggle between these two modes. Each style pulls in favour of one mode, although we always need both.

  • Structured authoring, as exemplified by DITA, is a proactive mindset. Topics should stand alone and tell people exactly what they need right now. They are authored separately from each other, and pulling a document together is deferred until the last possible moment (and may never happen, in the case of a knowledge base); there's a toy sketch of this after the list.
  • Traditional authoring is a reactive mindset. The product is delivered fully formed and is documented as a unit. In the OpenOffice authoring project, authors work on entire chapters, which are organized according to user needs. Despite being an important part of a huge, global open source project, that work is entirely divorced from the collaborative, proactive mode which is natural for the developers of the actual product.
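
To make the structured bullet's "last possible moment" concrete, here is a toy sketch of deferred assembly, in plain Python rather than any real DITA toolchain; the file names are invented:

    # A toy sketch of deferred document assembly: topics are authored as
    # standalone files and only pulled together when a deliverable is
    # built. File names are invented; a real DITA toolchain does far more.
    from pathlib import Path

    def build_document(topic_files, out_path):
        """Assemble a deliverable from an ordered list of topic files."""
        body = "\n\n".join(Path(t).read_text() for t in topic_files)
        Path(out_path).write_text(body)

    # The same topics can feed different documents for different use
    # environments, or (in a knowledge base) never be assembled at all.
    build_document(
        ["topics/install.txt", "topics/configure.txt", "topics/secure.txt"],
        "admin_guide.txt",
    )
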
Personally, I like being able to work with developers and to promote design choices which improve usability; and the reuse abilities of structured authoring allow me to maintain, on my own, a much larger information base than I could if I had to manage documents individually.

But it's in the reactive mode that information gets consolidated and where, in the end, the user's experience of the product is decided for better or for worse.
