SmoothSpan Blog

For Executives, Entrepreneurs, and other Digerati who need to know about SaaS and Web 2.0.

Giant Global Graph: Do You Need A Clue?

Posted by Bob Warfield on November 29, 2007

Sir Tim, who more or less invented the World Wide Web, recently wrote a blog post entitled the Giant Global Graph.  It’s a long, rambling post that touches on multiple themes.  It is a logical, reductionist discussion that only geeks are equipped to fully understand and appreciate, not because it is a superior way to organize an article, but because our minds are pitifully linear compared to those of more intuitive thinkers.  Opinions vary on how well these themes go together, but the primary insight behind the whole article is that additional structure for the web beyond mere hyperlinks, in the form of a graph, is valuable and far reaching.  Let me say it again, slightly differently:  Berners-Lee is on about the idea of additional structure and content for the web beyond hyperlinks.  Hyperlinks are navigational.  They convey some meaning beyond navigation, but not much.  Perhaps the most famous use of that meaning is Google’s PageRank, which makes the assumption that lots of links to a page indicate the page may be of more value to a searcher than a page with few links into it.  There may be other things one can intuit by examining hyperlinks, but it’s hard.  Making it easy, and especially making it easy for computers, is what Tim Berners-Lee wants to accomplish with his Semantic Web notions.  As long as we’re layering weird but related notions into this mashup, I’d like to add one I haven’t seen the other commentators write about: the use of what are essentially web hyperlinks (a bit more, but close) to allow computers to interact directly with one another, a practice that has been called RESTful Architecture.

It’s quite amazing, really, what’s possible with a clean, simple, and well designed architecture like the web.  The danger, if we extend it as Sir Tim proposes, is that we fail to do so just as elegantly.  There’s a lot out there now, and a lot of moving parts interacting with what’s out there.  Adding sand to the machinery is not helpful.  So what exactly did Sir Tim’s latest missive bring to the table?

There’s a nice historical / layered architectural view of what the Internet and World Wide Web are and how they differ.  Put simply, the Internet is the generic plumbing that lets computers talk to each other internationally through standard protocols.  The World Wide Web is a notion of documents that users interact with over the Internet.  Both are what mathematicians call graphs, which are nothing more than nodes with connections between them.  The Internet is a graph of computers.  The World Wide Web is a graph of documents.  Again, for “graph” substitute “network of nodes with connections between them.”  Pretty easy so far, no?  Okay, we’re a third of the way through the post, and we’re going to kick things up a notch.
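
For readers who like to see it in code, here is a minimal sketch of “nodes with connections between them.”  The page names are made up for illustration, and the dictionary-of-lists layout is just one common way to hold a graph, not anything TBL prescribes.

```python
from collections import deque

# Hypothetical example data: each key is a document (a node) and each
# value lists the documents it links to (its connections).
web_of_documents = {
    "home.html": ["about.html", "blog.html"],
    "about.html": ["home.html"],
    "blog.html": ["post-1.html", "home.html"],
    "post-1.html": [],
}


def reachable(graph, start):
    """Return every node reachable from start by following connections."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


print(sorted(reachable(web_of_documents, "about.html")))
```

Swap the document names for machine names and the same structure describes the Internet layer instead of the Web layer.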

The next concept TBL brings to the table is, “It’s not the documents, it is the things they are about which are important”.  He goes on to say this is obvious, but I don’t think it is as obvious as he thinks once you consider the real ramifications.  TBL wants to somehow factor out the core ideas in these documents and use those ideas to create another kind of graph, which he calls the Semantic Web or Giant Global Graph.  These core ideas become the nodes of the graph, and they link together documents and related ideas in interesting ways.

Why?  Because computers are actually pretty lousy at reading plain English (or any other language) and figuring out what those underlying ideas are.  For examples, TBL mentions things like:

– Biologists wanting proteins, drugs, or genes.  BTW, any profession or interest area will have a big list of jargon that is peculiar to that interest area and that should be factored and graphed for any web document.

– Business People want customers, products, and sales information. 

– People in general want Social Relationship information, and that is what people refer to as the Social Graph.

You see where he is coming from?  I wasn’t trying to be insulting with my post title.  When I say, “Do you need a clue?”, I’m referring to this new graph structure as providing clues to computers about what the heck is actually on a web page, so that you can use web pages in novel ways that are hard today but would be very useful with a fully annotated Semantic Web, er, GGG.  This is not the easiest thing in the world to do, as you can imagine.  There is a heck of a lot of work involved in doing all that annotation, and a lot of it may have to be done by hand.
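
To make the “clues” idea concrete, here is a rough sketch of what such annotations might look like to a program.  Semantic Web work generally expresses them as subject / predicate / object statements; the URLs, predicate names, and helper functions below are all invented for illustration and are not taken from TBL’s post or any real vocabulary.

```python
# Hypothetical annotations: (subject, predicate, object) statements saying
# what a page is about. Every URL and predicate name here is made up.
annotations = [
    ("http://example.org/paper-42", "mentions_protein", "p53"),
    ("http://example.org/paper-42", "mentions_drug", "aspirin"),
    ("http://example.org/catalog/widget", "describes_product", "SKU-1001"),
    ("http://example.org/review-7", "mentions_protein", "p53"),
]


def pages_about(triples, predicate, thing):
    """Find every page annotated as relating to a given thing."""
    return sorted({s for s, p, o in triples if p == predicate and o == thing})


def things_on(triples, page):
    """List the clues (predicate, object pairs) attached to one page."""
    return sorted((p, o) for s, p, o in triples if s == page)


print(pages_about(annotations, "mentions_protein", "p53"))
print(things_on(annotations, "http://example.org/paper-42"))
```

The point is that the biologist’s question (“which pages talk about this protein?”) becomes a simple lookup once somebody has done the annotation, instead of an exercise in reading English.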

However, if we are very very clever, some useful pieces may become automatic.  Take the Social Graph.  If we create our own Social Graph about our relationships with people, it may contain enough information that the web can meaningfully change how it interacts with us as regards those people.  Today, we look at it as happening in the context of a Social Network, but it should not be limited.  Why can’t I go into my address book, pick a person, and reach out with high certainty into the entire web to see as much as possible about that person?  Where are their blogs and home pages?  Which Social Networks do they belong to?  What articles quote them?  What company do they work for?  If I visit the company’s web site, wouldn’t it be cool to be able to tell who I know that works there?
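
Here is a small sketch of those address-book questions, assuming the Social Graph data were already available in one place.  The `Person` record, its fields, and the sample contacts are hypothetical; nothing like this exists as a standard today.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """One hypothetical entry in my personal Social Graph."""
    name: str
    employer: str = ""
    blogs: list = field(default_factory=list)
    networks: list = field(default_factory=list)


# Made-up contacts for illustration only.
my_contacts = [
    Person("Alice", "Acme Corp", ["http://example.org/alice"], ["Facebook"]),
    Person("Bob", "Acme Corp", [], ["LinkedIn", "Facebook"]),
    Person("Carol", "Initech", ["http://example.org/carol"], []),
]


def contacts_at(contacts, company):
    """Who do I know that works at this company?"""
    return [p.name for p in contacts if p.employer == company]


def contacts_on(contacts, network):
    """Which of my contacts belong to this Social Network?"""
    return [p.name for p in contacts if network in p.networks]


print(contacts_at(my_contacts, "Acme Corp"))
print(contacts_on(my_contacts, "Facebook"))
```

A browser holding this graph could run `contacts_at` whenever I land on a company’s web site, which is exactly the “who do I know that works there” experience.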

A couple of things should be becoming clearer now.  First, I hope you can see why many of us (and now TBL) recognize the term “Social Graph” as being separate from “Social Network.”  The Social Graph can be so much more than a particular web site focused on Social Networking.  It can literally impact every aspect of your web experience.  And it is a collection of data that is at once both very open and very private and personal.  There are pieces we want everyone to see, and pieces we want to keep entirely to ourselves.  It is a very tangled web we are weaving.  It grows and morphs constantly.  We would like to start building it once and never start over.  This is why I’ve said the Real Social Graph Hasn’t Shown Itself Yet.

Here is another way to think about the GGG or Semantic Web.  The web of today is manual and literal.  You create a concrete link between documents.  You traverse the links.  They are largely fixed and relatively inviolate.  This is a good thing.  You don’t want to lose track of a thing.  But that is only one form of navigation.  Sometimes you don’t know where the first bread crumb on the trail is.  For that, you need search tools of various kinds.  The Semantic Web can inform the search process much more fully than keywords and PageRank.  Beyond search, we would like a living web that restructures itself as it learns.  A change in one place can ripple through this graph structure to have far reaching and beneficial effects.  Suddenly the map of the web can be personalized around your interests, knowledge levels, relationships, and needs.  That’s pretty cool!

TBL winds up with a cautionary note about control.  Each of these layers has involved some loss of control.  First we gave up the idea of private networks to get to the Internet.  Anyone can be on it, including your worst enemies, competitors, criminals, and other evildoers relative to you.  Second, the World Wide Web involved a loss of document control.  Everything went to HTML instead of native document formats.  HTML involves a lot of loss of control.  It has gotten better, but real page layout and typography aficionados cringe.  Now we’re talking about sharing that graph data.  A graph requires two components, a lock and a key.  You hold the key.  Your Social Graph is your set of friends and relationships.  The lock is the set of pages that the key unlocks.  There is cooperation on both sides.  And, as TBL points out, this loss of control doesn’t have to mean that someone can access data they have no right to access.  It is important that you maintain control.  Even though the Internet is not a private network, you can still run HTTPS or some other protocol to encrypt the packets.  We routinely trust sensitive information to the Internet these days because there are cultural patterns for how it’s done and real technology to help protect us.  These things have yet to evolve for the next graph layer that TBL wants us to construct, but they are necessary infrastructure for this all to work.
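
A toy sketch of the lock and key idea follows.  Everything in it is an assumption made for illustration: the page names, the relationship labels, and the rule that a lock simply names an owner and the relationship required.  It also skips the hard parts, namely authentication and the page owner confirming that the relationship is real.

```python
# My key: the people I am connected to and how (hypothetical data).
my_key = {"alice": "friend", "bob": "colleague"}

# The locks: each page names its owner and the relationship a visitor
# must have with that owner. None means the page is open to anyone.
locks = {
    "alice/vacation-photos": ("alice", "friend"),
    "bob/vacation-photos": ("bob", "friend"),
    "carol/public-blog": ("carol", None),
}


def can_open(key, lock):
    """Does this key satisfy the relationship the lock demands?"""
    owner, required = lock
    if required is None:
        return True
    return key.get(owner) == required


def unlocked_pages(key, all_locks):
    """List every page this key opens."""
    return sorted(page for page, lock in all_locks.items() if can_open(key, lock))


print(unlocked_pages(my_key, locks))
```

Being Bob’s colleague does not open pages he reserves for friends, so even this crude version shows that sharing graph data need not mean exposing everything.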

What has been the reaction of others to this? 

Umair, as usual, gets it, and points out an important pitfall to avoid: the social graph is not web 3.0 (and the converse is also important:  web 3.0 is not the social graph).  I hope from my post above it is easier to see how the Social Graph is a subset of the GGG, and how it is also different from Social Networks.  GGG is a lot bigger than just Social.

Stowe Boyd, for example, had been very anti-Social Graph, but now says he “gives”.  Boyd was right to insist on more clarity before giving, but he is still suspicious that TBL is somehow trying to hitch a free ride on Social Graphs for his Semantic Web.  I think that suspicion is needless.  I hope you can see from my notes above what it’s all about, this Semantic Web.  The reason Stowe sees so much more fire around Social Networking is that this is an area where users have found sufficient value to create Social Graphs for themselves.  So far, we haven’t seen much action elsewhere, but we should.

My guess is that there are other areas of sufficient interest to generate spontaneous volunteer work of the kind we see around Social Networking.  There just needs to be the right enablement.  Perhaps it will be some form or flavor of the bookmarking trend that surrounds sites like Digg.  Perhaps it will be around online retailing.  Merchants have uniform means of referring to products in the form of standardized EDI, Bar Code, SKU, and other information.  Perhaps if shopping search got dramatically better when such information was available in the GGG for a page, it would drive many to add the information.

Anne Zelenka on GigaOM takes a dim view of all this.  She feels that computers are poorly suited to understanding relationships, and that trying to shoehorn the Social Graph into the Semantic Web sells it short.  I can’t seem to find a good concrete objection or counterexample in her article though, other than a vague sense of unease about it all.  She cites another of her articles that talks about the downsides of a distributed and open social graph.  My problem is I can’t see where TBL is advocating this.  In fact, I’m not sure I see where anyone is.  We all want control over our Social Graph.  Remember my analogy of the lock and key.  I’m the only one with my key.  Also go back to the example of how the Internet involved a loss of control but that various standards came into play so that privacy could still be preserved.  That has to be done for the GGG as well.  There will be lots of kinds of data there that we may not want out roaming freely.  Facebook’s tracking of what you purchase with Beacon is another great example of a GGG-like Graph Structure (call it the Global Purchases Graph or GPG) that some folks are upset at losing control over.  In other words, with some maturity in the standards, the tradeoffs can be extremely palatable and do not amount to putting all that data right out in the open.  That we don’t have this today is another reason why I say nobody has yet seen the Real Social Graph, or the Real GGG either for that matter.

One thing that still seems missing from many of the other commentaries I’ve read is that this GGG notion puts a lot of the value that is currently being delivered by proprietary platforms like Facebook back into the Web.  That’s a better and more open model.  It’s in the best interests of everyone to pursue that model.  In fact, GGG tries to go beyond just being a Social Graph precisely so that it becomes a general purpose means of capturing almost any kind of structural annotation and linkages around the Web’s document model.  That sort of thing can help future proof a big idea so it runs further before we find ourselves having to add a fourth and fifth layer on top of the first two that are already there.

In conclusion, I think TBL did a good job tying his vision back to some immediate realities, thereby making it more concrete and touchable.  It still has quite a ways to go, but it’s great to still be in early days and not have to worry about whether Facebook and Google have a permanent and irrevocable stranglehold on all innovation.  My own little contribution is that now that we’ve tied Social Graphs into the Semantic Web, I’d like to see REST somehow get tied in.  After all, why shouldn’t we annotate REST APIs on web pages too so that they can easily be found and connected to?  It’s like putting electrical sockets in a room for future use.
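
As a rough sketch of what those “electrical sockets” could look like, suppose a page advertised its REST API with a link tag in its header.  The `rel="rest-api"` value, the sample page, and the `find_apis` helper are my own inventions for illustration, not an existing convention; the parsing uses Python’s standard `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical link relation a page might use to advertise an API.
API_REL = "rest-api"


class ApiLinkFinder(HTMLParser):
    """Collect the href of every <link> tag carrying the API relation."""

    def __init__(self):
        super().__init__()
        self.endpoints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        rels = (attributes.get("rel") or "").split()
        if API_REL in rels and attributes.get("href"):
            self.endpoints.append(attributes["href"])


def find_apis(html):
    """Return the API endpoints a page advertises, in document order."""
    finder = ApiLinkFinder()
    finder.feed(html)
    return finder.endpoints


# Made-up page for illustration.
page = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="rest-api" href="https://example.com/api/orders">
</head><body>Hello</body></html>"""

print(find_apis(page))
```

A program visiting the page can then find the socket without a human reading the documentation first, which is the whole point of annotating it.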

Related Articles:

I guess it’s not just me seeing a connection between the GGG and REST, see Discipline and Punishment for more.

5 Responses to “Giant Global Graph: Do You Need A Clue?”

  1. […] You can read the rest of this blog post by going to the original source, here […]

  2. This post got me to subscribe to your blog. Great stuff.

  3. Very provocative. I would suggest that a most critical piece of this might be coming up with a model or analogy that is immediately accessible to the consumers of this architecture. The lock and key model might need more levels of access than just locked or unlocked, for example. I wonder if the library science people have anything to offer here?

    BTW, I am not volunteering to be your editor, but I think this sentence is not correct: “Just because the Internet is not a private network does not mean you can run HTTPS to encrypt the packets or even some other protocol.” I think you meant that you _can_ run HTTPS, etc.

    steveh
    http://www.pervasivedatarush.com

  4. smoothspan said

    Robert welcome! I’ve enjoyed your link blog since you recommended it.

    Steve, rather than go with the double negative I changed the sentence entirely.

    We will need many locks and keys. I’m working on a post that refers to what I call “Trust Shards” which tries to go into a little more detail. I’m trying to get the architectural underpinnings to make sense first, and then I’ll have a look at the user model. I’m also quite interested in thinking about which areas are like the Social Graph in that they offer enough value that people will take the time to actually do the work.

    Cheers!

    BW

  5. […] objects and links, and we let the computer navigate through them all. There’s a few others thinking along the same […]
