A shared vocabulary for content

When I sat down to talk with Jeff Eaton (who was originally slated to be my editor, but was distracted by work commitments), we had no particular subject in mind. Our ramblings took us in several directions, starting with the simple one of vocabulary.

Jeff Eaton

Digital Strategist at Lullabot

Twitter: @eaton


Jeff: I have to say, you cover a lot of interesting stuff in your book. But probably one of the most challenging things you’ve tackled there was about working on the shared vocabulary.

Rick: Kind of appropriate, given that the first book in the series was The Language of Content Strategy.

Jeff: Maybe I haven’t been involved in the larger content management world as long, but it feels like one of the biggest challenges. It’s like little pockets and islands of disciplines have their own vocabularies for stuff, but there’s no real cross-disciplinary language for so many things. We are all using the same terms to describe completely different aspects of, and at different layers within, the wider ecosystem.

Rick: Oh, yes!

Jeff: That’s not just me?

Rick: Not at all. I’ve seen – I’ve been part of – teams that don’t have a shared vocabulary.

When you get a team together, everyone starts talking. But too often, people forget to ensure they are being properly understood. They use words from previous projects, assuming everyone understands them in the same way.

Jeff: Because these terms are self-evident when you’re used to them.

Rick: Of course they are. Then over the course of the project, the clarity comes out.

It’s not unusual for this miscommunication to last half a project. The bigger and more complex it is, the more jargon will be involved.

There is even a pair of terms I use in the book, which I described as “almost-synonyms” (p10). I’ve since written an article (it will go on my company site), which basically calls me out on that. They are not the same thing. Indeed, it is the distinction between them that is so important.

Jeff: “Past me, you are wrong!”

Rick: Various parts of the web community have been arguing over this since the Stone Age… what are the correct meanings of the terms “content” and “information”? I finally figured it out.

Jeff: I have working definitions of those that I use for myself. For starters, there are three elements.

  • Data… this is like, one. The number one, floating without context.
  • Information could be one dollar, one week, one person. Information is the typing of data; giving it scale and scope.
  • Content is when you add the concept of audience; when you identify a message to be communicated. One week doesn’t necessarily convey any meaning to anyone, until you give it context, such as: “Your appointment’s in one week.” It’s the wrapping of language around information, to put it in context, which makes it content.

It’s still hand-wavy, but that’s how I tend to think about it.
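As a rough illustration of Jeff’s three layers, here is a minimal Python sketch. The class, function, and field names are assumptions made up for this example; they do not come from the book or the conversation.

```python
from dataclasses import dataclass


@dataclass
class Information:
    """Data that has been 'typed': a bare value given scale and scope."""
    value: float  # the raw datum, e.g. the number one
    unit: str     # what turns "one" into "one week" or "one dollar"


def as_content(info: Information, audience: str) -> str:
    """Wrap language around information for a specific audience.

    The sentence template is a hypothetical example of a message.
    """
    plural = "" if info.value == 1 else "s"
    return f"{audience}, your appointment is in {info.value:g} {info.unit}{plural}."


datum = 1                            # data: the number one, without context
info = Information(datum, "week")    # information: one week
message = as_content(info, "Jeff")   # content: a message for an audience
print(message)
```

The point of the sketch is only that each layer adds something the previous one lacks: a unit, then an audience and a message.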

And – this is something you talk about in the book – it captures the idea that the concept of communication is inherent to content. It isn’t necessarily there for information and data.

Rick: I used to think along those same lines: data, then information, then content.

But now, I’ve changed my definitions. And I think they make things even clearer.

I agree, we have to start with data. But instead of saying data is one, I define data as anything that is in any way identifiable or measurable. It is a value, inherent in its existence. We don’t need to record it. Basically, everything in the universe is data.

Jeff: Yeah. The universe is a giant pile of data.

It’s a bold definition; I’ll say that. I like it. It makes sense. The data is there, even if we don’t bother to capture it.

Rick: Exactly. And technically, when we record it, we actually have metadata: a recording of a pre-existing datum. But let’s not get into that recursive loop.

It’s the next part of the definition where I diverge from you. The layer that you called information, I want to name something else, because there is a better definition of the term information. It’s data given meaning, understandability. A possible term – I’m not settled on it yet – would be significance.

The collection of all that stuff we have available, I refer to as the content pool.

[Figure: Relative meaning of information and content]

From James Gleick’s book, The Information, we learn that information is contextual. It depends on the audience. The audience determines what constitutes information, and what noise.

Jeff: The same raw stuff could be noise to one person, and valuable information to another?

Rick: Exactly.

Information is based on what I want to know. As the recipient, I determine if something is information, or not. Does it inform me? That will determine what I consider to be information.

It may be information the first time you tell me, but noise the second. Because I already know it.

Jeff: Alright.

Rick: And content – derived from Noz Urbina and Rahel Bailie’s book Content Strategy – is… the stuff you actually deliver.

Jeff: Like the fully baked form of information, as targeted to a person in context.

Rick: Not quite.

You want your delivered content to be a perfect match for the information your audience wants…

Jeff: … if we lived in a world where we could create infinite permutations of all the stuff in the content pool…

Rick: Yes. But basically, you deliver content. Whatever you deliver is content, whether it qualifies as information or not.

The diagram basically sums up what is data, what is content, what is information, what’s the overlap, and what is noise.

Jeff: Oh, so the delivered content and the provided information …. That’s interesting: the idea of the overlap between what they care about and what’s delivered being the actual information. That’s why tons of stuff could be delivered, but people feel it’s all chaff. What they cared about – what they’re looking for – isn’t there.

Rick: Precisely.
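The overlap described here can be sketched with plain sets. This is only an illustration of the idea, assuming that both the delivered content and the audience’s wants can be reduced to comparable topic labels; the function name, the labels, and the extra "unmet" category are assumptions added for the example.

```python
def classify(delivered: set[str], wanted: set[str]) -> dict[str, set[str]]:
    """Split what was delivered against what the audience wants to know."""
    return {
        "information": delivered & wanted,  # delivered and wanted: the overlap
        "noise": delivered - wanted,        # delivered, but nobody asked for it
        "unmet": wanted - delivered,        # wanted, but never delivered
    }


# Hypothetical topic labels, purely for illustration.
delivered = {"opening hours", "press releases", "mission statement"}
wanted = {"opening hours", "parking"}

result = classify(delivered, wanted)
print(result)
```

Everything in `delivered` is content, whether or not it lands in the overlap. A large `noise` set with a small `information` set is the "all chaff" experience Jeff describes.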

Jeff: Interesting. That’s good; I like it. I’m going to mull on that for a while.

Rick: Let’s look at another example; one we touched on in our discussion at Congility earlier this year. We were talking about structured content, and how that term is vague to many people, even those using structured content all the time.

Jeff: It’s the term that scares them, not the reality of it.

Rick: Noz told the story of going into a university, to try to convince them to embrace structured content. The people there told him he was wasting his time; they’d been hammering away at it for months, and no one would budge. So he simply mapped it back to what the professors were doing when they wrote papers: a title, a description, a method, a discussion…

Jeff: But they just weren’t using the term “structured content.”

Rick: Exactly. Once they understood what the term meant, they were on board.

Jeff: And they’re actually really good at it.

Rick: It’s fundamental to everything you do.

Jeff: An interesting bit of terminology from the book is the concept of self-aware content (p86). What I took away from that was the idea that content with enough structure and effective metadata, combined with business rules and such, can essentially adapt itself to a lot of different environments, or be suitably repurposed.

I’ve actually gotten into arguments with designers about this very thing. They insist that content can’t be intelligent. Content is inert. It’s something that has to be combined with the intentionality of design decisions, et cetera.

Rick: To some extent, it’s a semantic argument.

Jeff: I’m curious what you think about that line.

Rick: I think… Arthur C. Clarke.

Jeff: I think I know the quote you’re going to bust out here, but go on.

Rick: “Any sufficiently advanced technology is indistinguishable from magic.”

Jeff: Yep.

Rick: Whether it’s a bunch of business rules run against the evolving content set, or something within each individual piece of content in the set, determining when some event is supposed to trigger, or how to recombine the content for a new use… Honestly, who gives…?

The result is the same. So we just call it self-aware content, because unless you are in there writing the code, you can’t tell the difference.

Jeff: Who cares if the smarts are in an XML file, or in an app that processes things, or in dynamic templates or whatever… If the system is robust enough to do these things, you can call it intelligent.

Rick: Exactly.

Of course, sometimes you have to wonder at the intelligence of the people you are communicating with. Sometimes, they don’t understand what you think are the most fundamental terms and concepts. In the book (p84), I mentioned some functionality for my own site: the ability to insert a reference to a person, and have it dynamically pull the person’s details into the content when it serves up the page.

The first supplier I went to, to get this system built, somehow read the spec as meaning that when I created the reference, it would import a static snapshot of the person detail.

Jeff: That’s the opposite of what you asked for.

Rick: Right. And also, because it was in a hidden div, when the reference was deleted, the person content stayed there, hidden. So it ended up with ever more noise.
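The difference between the two readings of the spec can be sketched in a few lines of Python. Nothing here reflects how the actual site was built: the `[[person:…]]` token syntax, the registry, and the sample person are all hypothetical.

```python
import re

# Hypothetical person registry; in a real site this would be the person records.
people = {"ada": {"name": "Ada Example", "role": "Writer"}}

# Body text holding a reference to a person, not the person's details.
body = "Interview with [[person:ada]]."


def render(text: str, registry: dict) -> str:
    """Resolve person references when the page is served."""
    def resolve(match: re.Match) -> str:
        person = registry.get(match.group(1))
        if person is None:
            return match.group(0)  # leave unknown references untouched
        return f"{person['name']} ({person['role']})"

    return re.sub(r"\[\[person:(\w+)\]\]", resolve, text)


# What the supplier built, in effect: details copied once, at authoring time.
static_copy = render(body, people)

# The person's details change later.
people["ada"]["role"] = "Editor"

# What was asked for: details pulled in on every request.
dynamic_page = render(body, people)

print(static_copy)   # still shows the old role
print(dynamic_page)  # shows the current role
```

With dynamic resolution, deleting the reference from the body also removes the person’s details from the page, because they were never stored in the body in the first place.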

Jeff: Argh.

So, to wrap up here…

If there was one commonly held view in the world of content that you could just snap your fingers and make vanish – make everyone forget it had ever existed – what would that thing be?

Rick: That’s a really tough question.

Probably, it would be their misunderstanding of the term “system.”

Jeff: They imagine it is a product, or a server, or a CMS.

Rick: Exactly. They see two platforms as distinct systems that they need to integrate. They are trying to make these talk to each other. But actually, what they need to understand is that they have one system.

The system is not the platform or the environment. The system is the language the boxes use to talk to each other; the means of communication between all the parts.

Jeff: I like that. It doesn’t imply a single canonical place that magically holds everything. It means everyone’s running off the same playbook. They speak the same language.

Rick: Just as when we are trying to design and implement our systems, we need a shared vocabulary… a system is a shared vocabulary, and grammar.

Every part can talk to every other part. Who holds what becomes irrelevant. And, if you don’t like how any particular part of your stack is implemented, you can swap it out. All you need is for the replacement part to speak the same language, to integrate into the system.

In theory, this would let you do a phenomenally simple content migration. All you need do is plug the new platform into the existing system, back your content up to it, switch it over to be the primary home, and you’re done. Because everything speaks the same language – because it is one system – it becomes seamless.
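One way to picture "the system is the language" is as an interface that every platform speaks. The sketch below is an assumption-laden toy: the `Platform` protocol, its three methods, and the in-memory platform are invented for illustration, and a real migration would involve far more than copying keys.

```python
from typing import Protocol


class Platform(Protocol):
    """The shared language: any platform that speaks it can join the system."""

    def get(self, key: str) -> str: ...
    def put(self, key: str, body: str) -> None: ...
    def keys(self) -> list[str]: ...


class InMemoryPlatform:
    """A stand-in platform; any store with the same methods would do."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def get(self, key: str) -> str:
        return self._items[key]

    def put(self, key: str, body: str) -> None:
        self._items[key] = body

    def keys(self) -> list[str]:
        return list(self._items)


class ContentSystem:
    """Knows only the shared language, not which product sits behind it."""

    def __init__(self, primary: Platform) -> None:
        self.primary = primary

    def migrate_to(self, replacement: Platform) -> None:
        # Back the content up to the new platform...
        for key in self.primary.keys():
            replacement.put(key, self.primary.get(key))
        # ...then switch it over to be the primary home.
        self.primary = replacement


old = InMemoryPlatform()
old.put("about", "Who we are")

system = ContentSystem(old)
new = InMemoryPlatform()
system.migrate_to(new)
print(system.primary.get("about"))
```

Because `ContentSystem` depends only on the protocol, swapping one platform for another requires no change to the rest of the system, which is the property Rick is describing.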

Jeff: What a magical vision that is!