Back in 2000, the patterns, principles, and best practices for building web information systems were mostly anecdotal and folkloric. Roy Fielding’s dissertation on the web’s deep architecture provided a formal definition that we’ve been digesting ever since. In his introduction he wrote that the web is “an Internet-scale distributed hypermedia system” that aims to “interconnect information networks across organizational boundaries.” His thesis helped us recognize and apply such principles as universal naming, linking, loose coupling, and disciplined resource design. These are not only engineering concerns. Nowadays they matter to everyone. Why? Because the web is a hybrid information system co-created by people and machines. Sometimes computers publish our data for us, and sometimes we publish it directly. Sometimes machines subscribe to what machines and people publish, sometimes people do.
Given the web’s hybrid nature, how to can we teach people to make best use of this distributed hypermedia system? That’s what I’ve been trying to do, in one way or another, for many years. It’s been a challenge to label and describe the principles I want people to learn and apply. I’ve used the terms computational thinking, Fourth R principles, and most recently Mark Surman’s evocative thinking like the web.
Back in October, at the Traction Software users’ conference, I led a discussion on the theme of observable work in which we brainstormed a list of some principles that people apply when they work well together online. It’s the same list that emerges when I talk about computational thinking, or Fourth R principles, or thinking like the web. Here’s an edited version of the list we put up on the easel that day:
- Be the authoritative source for your own data
- Pass by reference not by value
- Know the difference between structured and unstructured data
- Create and adopt disciplined naming conventions
- Push your data to the widest appropriate scope
- Participate in pub/sub networks as both a publisher and a subscriber
- Reuse components and services
1. Be the authoritative source for your own data
In the elmcity context, that means regarding your own website, blog, or online calendar as the authoritative source. More broadly, it means publishing facts about yourself, or your organization, to a place on the web that you control, and that is bound in some way to your identity.
Why?
To a large and growing extent, your public identity is what the web knows about your ideas, activities, and relationships. When that knowledge isn’t private, your interests are best served by publishing it to online spaces that you control and use for the purpose.
Related
Mastering your own search index, Hosted lifebits
2. Pass by reference rather than by value
In the case of calendar events, you’re passing by value when you send copies of your data to event sites in email, or when you log into an events site and recopy data that you’ve already written down for yourself and published on your own site.
You’re passing by reference when you publish the URL of your calendar feed and invite people and services subscribe to your feed at that URL.
Other examples include sending somebody a link to an article instead of a copy of the article, or uploading a file to DropBox and sharing the URL.
Why?
Nobody else cares about your data as much as you do. If other people and other systems source your data from a canonical URL that you advertise and control, then they will always get data that’s as timely and accurate as you care to make it.
Also, when you pass by reference you’re enabling reuse (see 7 below). The resources you publish can be recombined, by you and by others, with other resources published by you and by others.
Finally, a canonical URL helps you measure how the web reacts to your data. If the URL is cited elsewhere you can discover those citations, and you can evaluate the context that surrounds them.
Related
The principle of indirection, Hyperlinks matter
3. Know the difference between unstructured and structured data
When you create an events page on your website, and the calendar on that page is an HTML file or a PDF file, you’re posting unstructured data. This is information that people can read and print, and it’s fine for that purpose. But it’s not data that networked computers can process.
When you publish an iCalendar feed in addition to your HTML- or PDF-based calendar, you’re publishing data that machines can work with.
Perhaps the most familiar example is your blog, if you have one. Your blog publishing software creates an HTML page for people to read. But at the same time it creates an RSS or Atom feed that enables feedreaders, or blog aggregation services, to automatically collect your entries and merge them with entries from other blogs.
Why?
When you publish an iCalendar feed in addition to your HTML- or PDF-based calendar, you’re publishing data that machines can work with.
The web is a human/machine hybrid. If you contribute data in formats useful only to people, you sacrifice the network effects that the machines can promote. If you also contribute in formats the machines understand, they can share your stuff amongst themselves, convey it to more people than you can reach through word-of-mouth human networks, and enable hybrid human/machine intelligence to work with it.
Related
The laws of information chemistry, Developing intuitions about data
4. Create and adopt disciplined naming conventions
When people publish calendars into elmcity hubs, they can assign unique and meaningful URLs and/or tags to each event they publish. And they can collaborate with curators of hubs to use tag vocabularies that define virtual collections of events.
The same strategies work in all web contexts. Most familiar is the first order of business at every conference attended by web thinkers: “The tag for this conference is ______.” When people agree to use common names in shared data spaces, effects like aggregation, routing, and targeted search require no special software.
Why?
The web’s supply of unique names (e.g., URLs, tags) is infinite. The namespace that you can control, by choosing URLs and tags for the things you post, is smaller but still infinite. Web thinkers use thoughtful, rigorous naming conventions to manage their own personal information and, at the same time, to enable network effects in shared data spaces.
Related
Heds, deks, and ledes, The power of informal contracts, Permalinks and hashtags for city council agenda items, Scribbling in the margins of iCalendar
5. Push your data to the widest appropriate scope
When you speak in electronic spaces you can address audiences at varying scopes. An email message addresses one or several people; a blog post on a company intranet can address the whole company; a blog post on the public web can address the whole world. Web thinkers know that keystrokes invested to capture and transmit knowledge will pay the highest dividends when routed to the widest appropriate scope.
The elmcity example: a public calendar of events can be managed in what is notionally a personal calendar application, say, Google Calendar or Outlook, but one that can post data to a public URL.
For bloggers, this principle governs the choice to explain what you think, learn, and do on your public blog (when appropriate) rather than in private communication.
Why?
Unless confidentiality precludes the choice, web thinkers prefer shared data spaces to private ones because they enable directed or serendipitous discovery and ad-hoc collaboration.
Related
Too busy to blog? Count your keystrokes
6. Participate in pub/sub networks as both a publisher and a subscriber
Our everyday calendar programs are, in blog parlance, both feed publishers and feed readers. Individuals and organizations can publish their own feeds to the web of calendar data while at the same time subscribing to others’ feeds. On a larger scale, an elmcity hub subscribes to a set of feeds, and in turn publishes a feed to which other individuals (or hubs) can subscribe.
Why?
The blog ecosystem is the best example of pub/sub syndication among heterogeneous endpoints through intermediary services. Similar effects can happen in social media, and they happen in ways that people find easier to understand, but they happen within silos: Facebook, Twitter. Web thinkers know that standard protocols and formats enable syndication that crosses silos and supports the most open kinds of collaboration.
Related
Personal data stores and pub/sub networks
7. Reuse components and services
In the elmcity context, calendar programs are used in several complementary ways. They combine personal information management (e.g., keeping track of your own organization’s public calendar) with public information management (e.g., publishing the calendar).
In another sense they serve the needs of humans who read those calendars on the web while also supporting mechanical services (like elmcity) that subscribe to and syndicate the calendars.
In general, a reusable web resource is:
- Effectively named
- Properly structured
- Densely interconnected (linked) both within and beyond itself
- Appropriately scoped
Why?
The web’s “small pieces loosely joined” architecture echoes what in another era we called the Unix philosophy. Web thinkers design reusable parts, and also reuse such parts where possible, because they know that the web both embodies and rewards this strategy.
Related
How will the elmcity service scale? Like the web!, How to manage private and public calendars together
The master of online living and organisation speaks again. We would do well to listen and perhaps even adopt a point or two. It would make many things more streamlined.