Archive for August, 2007

On the future of search

Robert Scoble has put together a video presentation on how Techmeme, Facebook and Mahalo will kill Google in four years’ time. His basic premise is that SEOs who game Google’s algorithm are as bad as spam (and there are some pissed SEO experts waking up today!). People like the ideas he introduces about social filtering, but on the whole they are a bit more skeptical of his world domination theory.

There are a few good posts like Muhammad’s on why the combo won’t prevail, but on the whole, I think everyone is missing the real issue: the whole concept of relevant results.

Relevance is personal

When I search, I am looking for answers. Scoble uses the example of searching for HDTV and makes note of the top manufacturers as something he would expect at the top of the results. For him - that’s probably what he wants to see - but for me, I want to be reading about the technology behind it. What I am trying to illustrate here is that relevance is personal.

The argument for social filtering is that it makes results more relevant. For example, by having a bunch of my friends associated with me on my Facebook account, an inference engine can determine that if my friend A is also friends with person B, who is friends with person C, then something I like must also be something that person C likes. When it comes to search results, that sort of social/collaborative filtering doesn’t work because relevance is complicated. The only thing a social network can tell me is whether a piece of content is spam or not - a yes or no type of answer - and even that assumes someone in my network has already come across the content. Just because my social network can (potentially) help filter out spam doesn’t make the search results higher quality. It just means fewer spam results. There is plenty of content that is on-topic but may as well be classed as spam.
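To make the yes/no point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration (the `friends` graph, the `spam_reports` data and both function names); it is not how Facebook or any real search engine works. It walks the friend-of-a-friend chain and then asks the only question the network can really answer: has anyone I'm connected to flagged this page?

```python
from collections import deque

def network_within(friends, person, max_hops):
    """Return everyone reachable from `person` in at most `max_hops` friend links."""
    seen = {person}
    queue = deque([(person, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look beyond the hop limit
        for friend in friends.get(current, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    seen.discard(person)
    return seen

def flagged_as_spam(url, spam_reports, network):
    """True only if someone in the network has already reported this URL."""
    return any(url in spam_reports.get(p, set()) for p in network)

# Hypothetical data: me -> A -> B -> C, and only C has seen the spam page.
friends = {"me": ["A"], "A": ["me", "B"], "B": ["A", "C"], "C": ["B"]}
spam_reports = {"C": {"spam.example/page"}}

my_network = network_within(friends, "me", 3)
known_spam = flagged_as_spam("spam.example/page", spam_reports, my_network)
unseen_page = flagged_as_spam("unseen.example/page", spam_reports, my_network)
```

Notice that a page nobody in my network has seen comes back as "not spam", which says nothing about whether it is actually relevant to me.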

Google’s algorithm essentially works on the popularity of links, which is how it determines relevance. People can game this algorithm, because someone can make a website look popular and manipulate its rankings by linking to it from fake sites and through other optimisations. Google’s PageRank algorithm assumes that relevance is, at its core, purely about popularity. The innovation the Google guys brought to the world of search is something to be applauded, but the extreme lack of innovation in this area since then just shows how hard it is to come up with new ways of determining relevance. Popularity is a smart proxy for relevance (if most people like something, chances are you will too) - but since it can be gamed, it is no longer a reliable one.
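For the curious, here is a rough sketch of the idea in Python. It is a textbook-style simplification of PageRank, not Google's production algorithm, and the page names and the twenty fake sites are invented for the example. It shows how a page nobody links to can climb the rankings once a link farm points at it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then collects rank from inbound links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical web: three pages that link to each other, plus a "spam" page nobody links to.
honest = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "spam": ["a"]}

# The same web after someone sets up twenty fake sites that all link to "spam".
gamed = dict(honest)
for i in range(20):
    gamed[f"fake{i}"] = ["spam"]

honest_rank = pagerank(honest)
gamed_rank = pagerank(gamed)
```

In the honest web "spam" sits at the bottom; in the gamed one it collects rank from every fake site, which is exactly the weakness of treating popularity as relevance.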

The semantic web

I still don’t quite understand why people don’t realise the potential of the semantic web, something I go on about over and over again (maybe not on this blog - maybe it’s time I did). If anything is going to change search, it will be that, because the semantic web will structure data - moving away from the document approach that webpages represent and towards a data approach that resembles a database table. It may not be able to make results more relevant to your personal interests, but it will better understand the sources of data that make up the search results, and can match them up to whatever constructs you present to it.

Like Google’s PageRank, the semantic web will require humans to structure data, from which a machine will then make inferences - similar to how PageRank makes inferences based on the links people make. However, Scoble’s claim that humans can overtake a machine is silly - yes, humans have a much higher intellect and are better at filtering, but they in no way can match the speed and power of a machine. Once the semantic web gets into full gear a few years from now, humans will have trained the machine to think - and it can then do the filtering for us.

Human intelligence will be crucial for the future of search - but not in the way Mahalo does it, which is like manually categorising pieces of paper into a filing cabinet and is not sustainable. It is a bit like the painters of the Sydney Harbour Bridge: when they finish painting it, they have to start all over again because the other side is already starting to rust. Once we can teach a machine that, for example, a dog is an animal that has four legs and makes a sound like “woof”, the machine can then act on our behalf, like a trained animal, and go fetch what we want. How those paper documents are stored will then be irrelevant, because the machine can do the sorting for us.
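Here is a toy version of the dog example in Python, to show what "teaching the machine" might look like. The facts are stored as subject/predicate/object triples, loosely in the spirit of the semantic web; the predicate names, the extra "rex" entry and the two functions are all made up for illustration and do not follow any particular standard or library.

```python
# Facts a human has structured for the machine (hypothetical vocabulary).
triples = {
    ("dog", "is_a", "animal"),
    ("dog", "has_legs", "4"),
    ("dog", "makes_sound", "woof"),
    ("rex", "is_a", "dog"),
}

def is_a(subject, category, facts):
    """Follow chains of is_a facts to decide whether subject belongs to category."""
    seen = set()
    frontier = [subject]
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Step up to whatever the current thing is declared to be.
        frontier.extend(o for s, p, o in facts if s == current and p == "is_a")
    return False

def fetch(category, facts):
    """Go fetch everything the machine can infer belongs to the category."""
    subjects = {s for s, _, _ in facts}
    return sorted(s for s in subjects if s != category and is_a(s, category, facts))

animals = fetch("animal", triples)
```

Nobody ever told the machine that rex is an animal; it worked that out from the two facts it was given, which is the kind of filtering I expect machines to do for us.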

The Google killer of the future will be the people who can convert the knowledge on the world wide web into information readable by computers, to create this (weak) form of artificial intelligence. Now that’s where it gets interesting.

Google: the ultimate ontology

A big issue with the semantic web is ontologies - the use of consistent definitions for concepts. For those who don’t understand what I’m talking about: essentially, the next evolution of the web is about making content readable not just by humans but also by machines. However, for a machine to understand something it reads, it needs consistent definitions. Humans, for example, are intelligent - they understand that the word “friend” is related to the word “acquaintance”, but a computer would treat them as two different things. Or would it?

Just casually looking at some of my web analytics, I noticed some people landed on my site by doing a Google search for how many acquaintances do people have, which took them to a popular posting of mine about how many friends people have on Facebook. I’ve had a lot of visitors because of this posting, and it’s been an interesting case study for me on how search engines work. Today, however, was different from other times: the word acquaintance struck me as odd. I know I didn’t use that word in my posting - and when I went to the Google cache I realised something interesting: because someone had linked to me using that word, the search engine had in effect swapped the word ‘friends’ for ‘acquaintances’.


Google’s linking mechanism is one powerful ontology generator.
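A small Python sketch of how this could work: a page gets indexed not only under the words it contains, but also under the words other people use when linking to it. This is my guess at the mechanism in miniature, not Google's actual implementation, and the URLs, text and function names are invented for the example.

```python
from collections import defaultdict

def build_index(pages, links):
    """pages: url -> body text. links: (source_url, target_url, anchor_text) tuples."""
    index = defaultdict(set)
    for url, body in pages.items():
        for word in body.lower().split():
            index[word].add(url)
    # Words in the link text are credited to the page being linked TO.
    for _source, target, anchor in links:
        for word in anchor.lower().split():
            index[word].add(target)
    return index

def search(index, query):
    """Return the pages that match every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical data: my post only ever says "friends"; someone else's link says "acquaintances".
pages = {
    "my-blog/friends-post": "how many friends do people have on facebook",
    "other-blog/post": "a post about social networks",
}
links = [("other-blog/post", "my-blog/friends-post", "how many acquaintances people have")]

without_links = search(build_index(pages, []), "how many acquaintances")
with_links = search(build_index(pages, links), "how many acquaintances")
```

Without the inbound link the query finds nothing; with it, my post turns up for a word I never wrote. The person who linked to me did the ontology work for free.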

BarCampSydney2

Things I learned at this BarCamp

  • It was a very different crowd from the first one.
  • It was so easy to network - it was as difficult as breathing in, breathing out! I gave a presentation, and as a consequence, people approached me throughout the day to introduce themselves.
  • In the morning, collaboration was a bit of a hot theme. John Rotenstein from Atlassian asked how people define collaboration: “when two or more people work together on a business purpose” was my answer. We agreed. Everyone else kind of didn’t.
  • How to raise money was the afternoon’s theme. Great points were brought up by Marty Wells, Mike Cannon-Brookes and Dean McEvoy, who led the discussion.
  • Some things mentioned:
  1. Aussie VCs lead you on. “Nice idea - let’s keep in touch” is their way of not burning bridges.
  2. VCs work in cycles of five or so years - raise money at the beginning of a cycle.
  3. Rule of thumb: give 30% away on the first round, 30% on the second round
  4. Advisors who give out Comet grants work on a 2% commission on any future venture capital you raise.
  5. No one understands the advertising market - everyone in the room wanted something they could read to learn more (check back here soon - I promise!). For example, Google’s AdWords programme is largely supported by the property market. The mortgage lending market is being hit by the current credit crisis, and that is going to affect start-ups relying on AdSense as the money drops out of these ads.
  • I met Jan Devos, who randomly approached me and blew me away with what he has done in his life. Basically (and from the age of 17), he created an implementation of the MPEG4 compression technology (for non-tech readers - MP4 as opposed to the older MP3) and he licenses out the technology to major consumer appliance companies like Samsung, who incorporate the technology into their products.
  • I met Dave O’Flynn - a self-described “tall Irish red-head” developer; Matt June - a former Major in the Australian military, now pursuing a project based around social innovation; I discovered Rai of Tangler is a commitmentphobe; Mick thinks he can skip most of BarCamp because he thinks organising a wedding is so hard; Mike Cannon-Brookes revealed over beer that he is a Mark Zuckerberg wannabe; and Christy Dena, one of the lead (un)organisers of the conference, looks completely different from the person I thought she was!

I got a positive reaction to my half-hour session on five lessons I have learned about successful intrapreneurship from a large internal project I started at my employer, with people throughout the day getting into a chat with me about it. Richard Pendergast, who is starting an online parenting site, said he was going to write a blog post on one of the points, drawing on his own personal battle to create credibility. Glad I helped! I told him I was going to blog about what I talked about so we could turn it into a discussion, but I have decided that the exam I have to sit in 8 days might need to start getting my attention. Anyway, here are the five points I made. Given the great points people brought up in the discussion during the session, treat this as a very rough framework:

1) It is a lot easier to seek forgiveness than permission when doing something in an organisation. Or in other words, just do it.

2) Be proactive, never reactive. By pushing the agenda, you are framing the agenda for something that works for your project. Once you start reacting to others, your idea will die.

3) The more you let go - the bigger your idea will get. Use other people to achieve your vision. Give other people a sense of ownership in it. Let them take credit.

4) It’s all about perception. It’s amazing how much credibility you can build by simply associating your idea with other things - which, in the process, builds your own personal brand so you can push through with more later on.

5) Hype builds hype. Get people excited, and they will carry your idea forward. People get excited when you communicate the potential and help them realise it.

Thank you to all those involved - both the organisers and the contributors - and I look forward to the next one.