Archive for September, 2007 Page 2 of 3



Don’t get the Semantic Web? You will after this

Prior to 2006, I had sort of heard of the Semantic Web. To be honest, I didn’t know much – it was just another buzzword. I’ve been hearing about Microformats for years, and cool but useless initiatives like XFN. However, to me it was just another web thing being thrown around.

Then in August 2006, I came across Adrian Holovaty’s article where he argues journalism needs to move from a story-centric world to a data-centric world. And that’s when it dawned on me: the Semantic web is some serious business.

I have since done a lot of reading, listening, and thinking. I don’t profess to be a Semantic Web expert – but I know more than the average person as I have (painfully) put myself through videos and audios of academic types who confuse the crap out of me. I’ve also read through a myriad of academic papers from the W3C, which are like the times when you read a novel and keep re-reading the same page and still can’t remember what you just read.

Hell – I still don’t get things. But I get the vision, so that’s what I am going to share with you now. Hopefully, my understanding will benefit the clueless and the skeptical alike, because it’s a powerful vision which is entirely possible.

1) The current web is great for humans; useless for machines
When you search for ambiguous terms, at best, search engines can algorithmically predict some sort of answer that partially answers your query. Sometimes not even that. But the complexity of language is not something engineers can simply engineer around. After all, without the ambiguity of natural languages, poetry could not exist.

Fine.

What did you think when you read that? As in: “I’ve had it – fine!” which is like another way of saying ok or agreeing with something. Perhaps you thought about that parking ticket I just got – illegal parking gets you fined. Maybe you thought I am applauding myself by saying that was one fine piece of wordcraftship I just wrote, or said in another context, like a fine wine.

Language is ambiguous, and depending on the context with other words, we can determine what the meaning of the word is. Search start-up company Powerset, which is hoping to kill Google and rule the world, is employing exactly this technique to improve search: intelligent processing of words depending on context. So by me putting in “it’s a fine”, it understands the context that it’s a parking ticket, because you wouldn’t say “it’s a” in front of ‘fine’ when you use it to agree with something (the ‘ok’ meaning above).

But let’s use another example: “Hilton Paris” in Google – the world’s most ‘advanced’ search engine. Obviously, as a human reading that query, you understand from the context of those words that I would like to find information about the Hilton in Paris. Well, maybe.

Let’s see what Google comes up with: Of the ten search results (as of when I wrote this blog posting), one was a news item on the celebrity; six were on the celebrity describing her in some shape or form, and three results were on the actual Hotel. Google, at 30/70 – is a little unsure.

Why is Paris Hilton, that blonde haired thingy of a celebrity, coming up in the search results?

Technologies like Powerset apparently produce a better result because they understand the order of the words and the context of the search query. But the problem with these searches isn’t only the interpretation of what the searcher wants – it’s also the ability to understand the actual search results. Powerset can only interpret so much of the gazillions of words out there. There is the whole problem of the source data, not just the query. Don’t get what I mean? Keep reading. But for now, learn this lesson:

Computers have no idea about the data they are reading. In fact, Google pumping out those search results is based on people linking. Google is a machine, and reads 1s and 0s – machine language. It doesn’t get human language.

2) The Semantic web is about making what humans read, machine readable
Tim Berners-Lee, the guy who invented the World Wide Web and the visionary behind the Semantic Web, prefers to call it the ‘data web’. The current web is a web of documents; by adding extra data (metadata, which is data about data) to that content, machines will be able to understand it.

A practical outcome of having a semantic web is that when Google pulls up a web page, it will understand what the content is, regardless of the context of the words. Think of every word on the web being linked to a master dictionary.
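To make the “master dictionary” idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the `ex:` identifiers, the type names, and the example.org pages are assumptions, not a real vocabulary or a real Semantic Web API. It only shows how two pages containing the same words can be told apart once each one points at a dictionary entry.

```python
# Hypothetical "master dictionary": identifiers and types are made up.
DICTIONARY = {
    "ex:ParisHilton": {"type": "Person", "label": "Paris Hilton"},
    "ex:HiltonParis": {"type": "Hotel", "label": "Hilton Paris"},
}

# Two made-up pages that share the same words. Only the "about"
# metadata says which dictionary entry each page is really about.
PAGES = [
    {
        "url": "http://example.org/gossip",
        "text": "Paris Hilton spotted in Paris",
        "about": "ex:ParisHilton",
    },
    {
        "url": "http://example.org/rooms",
        "text": "Book a room at the Hilton Paris",
        "about": "ex:HiltonParis",
    },
]


def find_pages(wanted_type):
    """Return the URLs of pages whose subject has the wanted type."""
    return [
        page["url"]
        for page in PAGES
        if DICTIONARY[page["about"]]["type"] == wanted_type
    ]


print(find_pages("Hotel"))
```

A plain keyword match on “Hilton” and “Paris” would return both pages; filtering on the type of thing each page is about returns only the hotel.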

The benefit of the semantic web is not for humans – at least immediately. The Semantic Web is actually pretty boring with what it does – what is exciting, is what it will enable. Keep reading.

3) The Semantic web is for machines to interpret, not people
A lot of the skeptics of the semantic web don’t see the value of it. Who cares about adding all this extra metadata? I mean heck – Google still was able to get the website I needed – the Hilton in Paris. Sure, the other 70% of the results on that page were irrelevant, but I’m happy.

I once came across a Google employee and he asked “what’s the point of a semantic web; don’t we already have enough metadata?” To some extent, he’s right – there are some websites out there that have metadata. But the point of the semantic web is that machines, once they read the information, can start thinking like a human would and connecting it to other information. There needs to be across-the-board metadata.

For example, my friend Michael was recently looking to buy a car. A painful process, because there are so many variables. So many different models, different makes, different dealers, different packages. We have websites, with cars for sale neatly categorised into profile pages saying what model it is, what colour it is, and how much. (Which may I add, are hosted on multiple car sites with different types of profiles). A human painfully reads through these profiles, and computes as fast as a human can. But a machine can’t read these profiles.

Instead of wasting his (and my) weekends driving around Sydney to find his car, a machine could find it for him. So, Mike would enter his profile in – what he requires in a car, what his credit limit is, what his prior history with cars is – everything that would affect his judgement of a car. And then, the computer can query every online website with cars to match the criteria. Because the computer can interpret these websites across the board, it can evaluate and it can go back to Michael and say “this is the car for you, at this dealer – click yes to buy”.
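As a rough sketch of what that matching could look like once car listings are machine readable, here is a small Python example. The `Listing` and `BuyerProfile` fields, the dealers, and the prices are all hypothetical, and “cheapest car that fits” is just one simple way a machine might pick.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    dealer: str
    make: str
    colour: str
    price: int  # dollars


@dataclass
class BuyerProfile:
    budget: int
    makes: set
    colours: set


def best_match(profile, listings):
    """Return the cheapest listing that fits the profile, or None."""
    candidates = [
        listing
        for listing in listings
        if listing.price <= profile.budget
        and listing.make in profile.makes
        and listing.colour in profile.colours
    ]
    return min(candidates, key=lambda listing: listing.price, default=None)


# Made-up listings, as if gathered from several car sites.
listings = [
    Listing("Dealer A", "Toyota", "red", 21000),
    Listing("Dealer B", "Toyota", "blue", 18500),
    Listing("Dealer C", "Mazda", "blue", 17000),
    Listing("Dealer D", "Toyota", "blue", 26000),
]

mike = BuyerProfile(budget=20000, makes={"Toyota"}, colours={"blue", "silver"})
print(best_match(mike, listings))
```

The filtering itself is trivial. The hard part, and the part the semantic web is meant to solve, is getting every car site to describe make, colour, and price in a form a program can read the same way.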

The semantic web is about giving computers the information to be able to interpret data, so that they can do what they do really well – compute.

4) A worldwide database
What Berners-Lee essentially envisions is turning the entire world wide web into a database that can be queried. Currently, the web looks like Microsoft Word – one slab of text. However, if that slab of text was neatly categorised in an Excel spreadsheet, you could manipulate that data and do what you please – create reports, reorder them, filter, and do whatever until your heart is content.

At university, I was forced to do an Information Systems subject which was essentially about the theory of databases. Damn painful. I learned only two things from that course. The first thing was that my lecturer, tutor, and classmates spoke less intelligible English than a caterpillar. But the second thing was that I learned what information is and how it differs from data. I am now going to share with you that lesson, and save you three months of your life.

You see, data is meaningless. For example, 23 degrees is data. On its own, it’s useless. Another piece of data is Sydney. Again – useless. I mean, you can think all sorts of things when you think of Sydney, but it doesn’t have any meaning.

Now put together 23 degrees and Sydney, and you have just created information. Information is about creating relationships between data. By creating a relationship, an association, between these two different pieces of data – you can determine it’s going to be a warm day in Sydney. And that is what information is: Relationship building; connecting the dots; linking the islands of data together to generate something meaningful.
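The same distinction can be sketched in a few lines of Python. This is only an illustration: the relation name `temperature_celsius`, the Melbourne reading, and the 20-degree threshold for “warm” are assumptions added for the example.

```python
# Data on their own: two disconnected values that mean nothing yet.
data = ["Sydney", 23]

# Information: each fact relates a subject to a value through a named
# relationship. The Melbourne row is made up for contrast.
facts = [
    ("Sydney", "temperature_celsius", 23),
    ("Melbourne", "temperature_celsius", 14),
]


def query(subject, relation):
    """Return the value linked to the subject by the relation, or None."""
    for fact_subject, fact_relation, value in facts:
        if fact_subject == subject and fact_relation == relation:
            return value
    return None


def is_warm(city, threshold=20):
    """True if the city has a known temperature at or above the threshold."""
    temperature = query(city, "temperature_celsius")
    return temperature is not None and temperature >= threshold


print(is_warm("Sydney"))
```

Nothing useful can be asked of the bare `data` list. Once the relationship is recorded, a question like “is it warm in Sydney?” becomes something a machine can answer.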

The semantic web is about allowing computers to query the sum of human knowledge like one big database to generate information.

Concluding thoughts
You are probably now starting to freak out and think “Terminator” images with computers suddenly erupting from under your computer desk, and smashing you against the wall as a battle between humans and computers begins. But I don’t see it like that.

I think about the thousands of hours humans spend trying to compute things. I think of cancer research, where all the experimentation occurring in labs is trying to connect new pieces of data with old data to create new information. I think about computers being able to query the entire taxation legislation to make sure I don’t pay any tax, because they know how it all fits together (having studied tax, I can assure you – it takes a lifetime to only understand a portion of tax law). In short, I understand the vision of the Semantic web as a way of linking things together, to enable computers to compute – so that I can sit on my hammock drinking my beer, as I can delegate the duties of my life to the machines.

All the semantic web is trying to do is make sure everything is structured in a consistent manner, with a consistent dictionary behind the content, so that a machine can draw connections. As Berners-Lee said on one of the videos I saw: “it’s all about creating links”.

The process to a Semantic Web is boring. But once we have those links, we can then start talking about those hammocks. And that’s when the power of the internet - the global network - will really take off.

Facebook is doing what Google did: enabling

The Facebook platform has created a frenzy of hype - on it being a closed wall, on privacy and the right of users to control their data, and of course on the monetisation opportunities of the applications themselves (which on the whole appear futile, but that will change).

We’ve heard of applications being targeted for acquisition, with one (rumoured) at $3 million - and it has been shown that applications are an excellent way to acquire users and generate leads to your off-Facebook website and products. We’ve also seen applications desperately trying to monetise their products by putting Google Ads on the homepage of the application, which are probably just as effective as giving a steak to a vegetarian. The other day, however, was the first time I have seen a monetisation strategy by an application that genuinely looked possible.

It’s this application called Compare Friends, where you essentially compare two friends on a question (who’s nicer, who has better hair, who would you rather sleep with…). The aggregate of responses from your friends who have compared you, can indicate how a person sits in a social network. For example, I am most dateable in my network, and one of the people with prettiest eyes (oh shucks guys!).

The other day, I was given an option to access the premium service - which essentially analyses your friends’ responses.

[Image: Compare Friends premium subscription offer]

It occurred to me that monetisation strategies for the Facebook platform are possible beyond whacking Google AdSense on the application homepage. Valuable data can be collected by an application, such as what your friends think of you, and that can be turned into a useful service. Like above, they offer to tell you who is most likely to give you a good reference - that could be a useful thing. In the application’s current iteration, I have no plans to pay 10 bucks for that data - but it does make you wonder whether, with time, more sophisticated services can be offered.

Facebook as the bastion of consumer insight

On a similar theme, I did an experiment a few months ago whereby I purchased a Facebook poll, asking a certain demographic a serious question. The poll itself revealed some valuable data, as it gave me some more insight into the type of users of Facebook (following up from my original posting). However, what it also revealed was the power of tapping into the crowd for a quick response.

Watching the data come in by the minute as up to 200 people took the poll, I saw how a marketer could quickly gauge what people think about something from a statistically valid sample, in literally hours. You should read this posting discussing what I learned from the poll if you are interested.

It’s difficult to predict where the trends I am seeing will lead, and what will become of Facebook, because a lot could happen. However, one thing is certain: right now, it is a highly effective vehicle for individuals to gain insight about themselves - and generating this information is something I think people will pay for if it proves useful. Furthermore, it is an excellent way for organisations to organise quick and effective market research to test a hypothesis.

The power of Facebook, for external entities, is that it gives access to controlled populations whereby valuable data can be gained. As the WSJ notes, the platform has now started to see some clever applications that realise this. Expect a lot more to come.

Facebook is doing what Google did for the industry

When Google listed, a commentator said this could launch a new golden age that would bring optimism not seen since the bubble days to this badly shaken industry. I reflected on that point he made to see if his prophecy would come true one day. In case you hadn’t noticed, he was spot on!

When Google came, it did two big things for the industry:

1) AdSense. Companies now had a revenue model - put some Google ads on your website in minutes. It was a cheap, effective advertising network that created an ecosystem. As of 30 June 2007, Google makes about 36% of their revenue from members in the Google network - meaning, non-Google websites. That’s about $2.7 billion. Although we can’t quantify how much their partners received - which could be anything from 20% to 70% (the $2.7 billion of course is Google’s share) - it would be safe to say Google helped the web ecosystem generate an extra $1 billion. That’s a lot of money!
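To sanity-check that estimate, here is a back-of-the-envelope calculation in Python. It takes the post’s figures at face value and assumes that the $2.7 billion is what Google kept after paying partners, and that partners received somewhere between 20% and 70% of the gross ad revenue. The function name and structure are just for illustration.

```python
GOOGLE_SHARE_BN = 2.7  # Google's share of network revenue, per the post


def partner_payout_bn(partner_fraction, google_share_bn=GOOGLE_SHARE_BN):
    """Partner payout in $bn if partners keep `partner_fraction` of gross.

    Google keeps (1 - partner_fraction) of gross, so
    gross = google_share / (1 - partner_fraction).
    """
    gross = google_share_bn / (1 - partner_fraction)
    return gross * partner_fraction


for fraction in (0.2, 0.7):
    payout = partner_payout_bn(fraction)
    print(f"partners keep {fraction:.0%}: about ${payout:.2f}bn paid out")
```

Under those assumptions the payout to partners ranges from roughly $0.7 billion to $6.3 billion, and the $1 billion figure corresponds to partners keeping a little over a quarter of the gross.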

2) Acquisitions. Google’s cash meant that buyouts were an option, rather than the IPO that most start-ups aimed for in the bubble days. In fact, I would argue the whole web2.0 strategy for startups is to get acquired by Google. This has encouraged innovation, as all parties from entrepreneurs to VCs can make money from simply building features rather than actual businesses that have a positive cashflow. This innovation has a cumulative effect, as somewhere along the line, someone discovers an easy way to make money in ways others hadn’t thought possible.

Google’s starting to get stale now - but here comes Facebook to further add to the ecosystem. Their acquisition of a ‘web-operating system’ built by a guy considered to be the next Bill Gates shows that Facebook is more than a one-hit wonder. The potential for the company to shake the industry is huge - for example, in advertising alone, they could roll out an advertising network that takes it a step further than contextual advertising, as they actually have a full profile of 40 million people. This would make it the most efficient advertising system in the world. They could become the default login and identity system for people - no longer will you need to create an account for that pesky new site asking you to sign up. And as we are seeing currently, they enable a platform that helps other businesses generate business.

I’ve often heard people say that history will repeat itself - usually pointing to how 12 months ago Myspace was all the rage: Facebook is a fad, they will be replaced one day. I don’t think so - Facebook is evolving, and more importantly, it is improving the entire web ecosystem. Facebook, like Google, is a company that strengthens the web economy. I am probably going to hate them one day, just like how my once loved Google is starting to annoy me now. But thank God it exists - because it’s enabling another generation of commerce that sees the sophistication of the web.

John Hagel - What do you think is the single most important question after everything is connected?

I was recently pointed to a presentation by John Hagel, who is a renowned strategy consultant and author on the impact the Internet has on business. He recently joined Deloitte and Touche, where he will head a new Silicon Valley research institute. At the conference (Supernova 2007), John outlined critical research questions regarding the future of digital business that remain unresolved, which revolved around the following:

What happens after everything is connected? What are the most important questions?

I had to watch the video a few times because it’s not possible to capture everything he says in one hit. So I started writing notes each time, which I have reproduced below to help guide your thoughts and give a summary as you are watching the presentation (which I highly recommend).

I also have discovered (after writing these notes - damn it!) that he has written his speech (slightly different however) and posted it on his blog. I’ll try and reference my future postings on these themes here, by pinging or adding links to this posting.