Tag Archive for 'search'

Advertising on the Internet needs innovation

On the weekend, I caught up with Cameron Reilly of The Podcast Network, and he was telling me about his views on monetising podcasts. It got me thinking again about one of those things I like to think about: how content can be monetised. Despite the growth in online advertising, which is tipped to reach $80 billion, I think we still have a lot more innovation to go with revenue models, especially ones that help content creators.

Advertising is a revenue stream that has traditionally enabled content creators to monetise their products, in the absence of people paying a fee or subscription. With the Internet, content has undergone a radical change in what it is - digital, abundant, easily copied - whilst the Internet has offered new opportunities for how advertising is done. However, the Internet has also exposed the fundamental weaknesses of advertising, as consumers can now control their content consumption, which allows them to ignore embedded advertising altogether. Content, on the other hand, still remains in demand, but the means of monetising it are slipping into a free economy which is not sustainable. I make that point to illustrate not that professional content creation is a sunset industry, but rather that there’s a big market opportunity, as this massive industry needs better options.


"Hey man, there’s this new thing called the Internet. Sounds pretty cool"

One of the biggest innovations in advertising (and one enabled by the Internet) is contextual search advertising. This has been popularised by Google, which now makes 98% of its $17 billion revenue from these units. This advertising dominates online advertising (40% of total) because of its pull nature, whereby the key-words a consumer enters in effect state what they are interested in or would like to purchase. Whilst this is a highly efficient form of advertising, it also has its weaknesses - for example, it is not as effective outside of the search engine environment. Google makes 35% of its revenue from the AdSense network, where these contextual ads are placed on people’s personal websites. Evidence from high traffic bloggers suggests they barely make enough money through this type of advertising. Another point to consider is that the Google network includes significant partnership agreements like the one with AOL, which accounts for 10% of Google’s revenue (this is a 2005 figure which has likely changed, but Google does state in their 2007 report: "Our agreements with a few of the largest Google Network members account for a significant portion of revenues derived from our AdSense program. If our relationship with one or more large Google Network members were terminated or renegotiated on terms less favorable to us, our business could be adversely affected." AOL most recently reported half a billion dollars for Q1 2008, largely from search advertising).

Other attempts at creating more efficient advertising, which have existed for over a decade, have come in the form of profiling or behavioural tracking. However, these forms of advertising have also highlighted the growing awareness of consumer privacy being eroded, and they are under heavy scrutiny by activist groups and government. Facebook is the company best poised to deliver new forms of advertising because of the rich profiling data it has, but it has itself faced massive backlash.

My view is that the majority of online advertising revenue, for successful individual publishers at least, has largely come from traditional approaches to advertising - a masthead blog with a sales team that sells display advertising. How effective this display advertising is remains debatable, given widespread banner blindness and consumer control over their content, but it would appear that advertisers see it as the least bad of the available options. The fact it replicates the mass media approach of counting the number of unique consumers viewing the content, and not the types of users, means this isn’t anything new other than being done in a digital environment.

Digital content is in need of a better monetisation system.
Targeted advertising is the most efficient form, yet consumer privacy is a growing force preventing this. What we need is not a new advertising technology, but a new way of thinking about advertising - in a way that can help the content economy rather than riding on it without giving benefit. Contextual advertising sounds great in theory, as it calculates the frequency of key-words on a website to match it to a key-word ad - but in practice these ads are proving not very relevant. Yet trying to think of a smarter way to advertise may be the wrong question - perhaps half the problem is advertising itself as a concept?
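
To make the key-word frequency idea concrete, here is a rough sketch of how such matching might work. It is not how Google or any real ad network does it; the stop-word list, the ad inventory and the function names are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list and ad inventory, invented for this sketch.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on", "with"}

ADS = {
    "travel": "Cheap flights to Greece",
    "cars": "Used cars in Sydney",
    "podcast": "Podcast hosting from $5 a month",
}

def keyword_counts(page_text):
    """Count how often each non-stop-word appears on the page."""
    words = re.findall(r"[a-z']+", page_text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def pick_ad(page_text, ads=ADS):
    """Return the ad whose key-word appears most often, or None if none appear."""
    counts = keyword_counts(page_text)
    best = max(ads, key=lambda keyword: counts[keyword])
    return ads[best] if counts[best] > 0 else None

# A page complaining about cars still gets matched to a car ad.
page = "I hate cars. Cars pollute, cars are noisy, and travel by train is better."
print(pick_ad(page))
```

The sample page is the interesting part: counting words says nothing about what the writer actually thinks of cars, which is one plausible reason these ads end up feeling irrelevant.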


Are we running down a tunnel, only to find there is nothing there?

Content which comes in the form of news (historical and breaking), analysis, and entertainment can be monetised via a person’s attention or through a transaction (ie, subscription, fee, etc). Both these approaches have different barriers.

- Attention: The key driver is increased dollars per unique person, over a period of time. The barrier to this approach is the challenge of identifying the individual in a way that delivers advertising that is highly relevant and will result in a conversion. In other words, privacy privacy privacy.

- Alternative payment: Requiring consumers to pay for content is a barrier due to the paid wall. What is more problematic for digital content is that the ability to replicate it freely has not just made copying easy for the masses, but has created a culture of "if it’s not free, it’s not worth purchasing unless it’s really necessary". There needs to be a strong value proposition for a consumer to purchase content, and in the absence of a brand and marketing, restricting access to the content is a barrier to consumer demand, as people don’t know what they are missing out on.

So as you see above, content creators are in a difficult position. Charging people reduces their opportunity unless they are really established, but even then, due to the digital environment they don’t have any control over subsequent distribution (with rampant piracy). Yet advertising is often irrelevant and hence not effective (so advertisers go to other forms), and any attempts to make it more relevant get held back by the concerns of privacy advocates (and rightly so). Whilst the Internet parades itself as an advertising growth machine, it’s growing in new areas but not the old areas that have traditionally been the medium for advertisers.

This advertising growth is largely being driven through utility computing products that aim to make information retrieval more efficient (ie, search). However, the growth for the content creators is not happening. As Cam was telling me, in a market like Australia, small content organisations like TPN and Bronwen Clune’s Norgs don’t have access to the big end of town for a sales team. And he didn’t have to tell me that those Google ads for the smaller guys are not enough to pay the bills. That small to middle end is not really being catered for.

But before you jump on the phone and create some mid-tier advertising network that caters for a niche, think about the real problem: content creators need a better solution to monetise their content. But advertisers also need a better way of selling, other than some slick-talking sales person who can sell ads on pageviews (a broken model with weak alternatives). They need advertising that is suited to their product, but the market now includes other products media outlets never had to compete with, like online marketplaces and utility computing products. Whilst the technology community obsesses about search, let’s also remember we have yet to see a new way to monetise content that is superior to the old world. Contextual advertising of text is the latest new thing, but that technique is nearly a decade old. As I argue above, outside of the search environment it is proving not to be that effective.

Where is the innovation going to come from? Not through technology, but through a paradigm shift in things like how content creators operate, and new ways of thinking about the way we ‘sell’, like what the VRM Project is challenging. But perhaps more fundamental is an understanding that the holy grail of targeted advertising has got a speed hump called privacy - and that may be a sign not to go faster towards better targeting, but to change the vehicle altogether.

Understand your content

I picked up a book my parents used on their recent trip to Greece, which was a guidebook of the Peloponnese. Flicking through this paper book reminded me of my thoughts of how the content business is so rife with piracy. Especially with an online world now, people can copy content - images, text, audio - and mash it up into their own creation. It seems crazy but why do people enter a business like that?

The Information Sector is not only a big money maker, but unique as well. Yes, information can be copied and ripped off - unlike a Barbie doll, whose form can’t really be manipulated into a new product. But what also sets information products apart from selling Barbies is that they do things that are unique in this world and extremely powerful. In my view there are four types of information product, which can be explained under the categories of data or culture.

Data

New data
A friend and aspiring politician once said to me that “information is the currency of politics”. Reuters, the famed news organisation that supplies breaking news to media outfits across the world, derives 90% of its revenue from selling up-to-the-minute financial information to stockbrokers and the like, who profit on getting information before others. New information, like what the weather will be tomorrow, loses value with time (not many care what the weather was eight days ago). But people are willing to pay a price, and a big one, to get access to this breaking news because it can help them make decisions.

Old data
On the flip side, old information can be very valuable because of the ability to conduct research and analysis. Search engines effectively fit into this segment of the information economy, because they can query past news and knowledge to produce answers. Extending the weather example, being able to find out the weather eight days ago, along with the weather exactly one, five and ten years ago, can help you identify trends that, for example, validate the global warming theory.

Culture

Analysis
The third category of information products I simply call analysis, because what they offer is unique insight into things. We all have access to the same news for example, but it takes a smart thinker to create a prediction, by pulling the pieces together and creating new value from them. Analytical content usually gets plagiarised by students writing essays, but it’s also the stuff that shapes people’s perceptions in world-changing ways.

Entertainment
One of the most powerful uses of content is the way it can impact people - entertainment type content is the stuff that generates emotion in people. Emotions are a key human trait that you should keep in mind in any decision - no matter how logical someone is, the emotional self can overtake. A documentary that portrays an issue negatively, and that can generate an angry response in a person, is the stuff that can topple governments and corporations.

Not all information is equal
If you are a content creator, you need to accept that other people can copy your creation. The key is to understand what type of content you are creating, and develop a content strategy that exploits its unique characteristics.

Information products need different strategies in order to effectively monetise them. Below is a brief discussion which extends on the above to help you understand.

New data
With this type of content, the value is in the time; the quicker that information can be accessed, the more useful it is. News items (like current affairs) fit into this category. As a news consumer, I don’t care how I get my news, but I care about how quickly I can get it. It’s for this reason I no longer read newspapers, yet through various technologies like RSS and my mobile phone, I probably consume more news than ever before.

You should sell this data based on access - the more you pay, the quicker the access. Likewise, the ability to enable multiple outputs is key - you need to be able to deliver your content to as many different places as possible: SMS, email, RSS etc. You should not discriminate on the output; the value is on the time.

If you break news, why are you wasting your time restricting who can access that information, because of the threat that someone can copy it? If the value is in the time, who cares who copies it, because by the time they republish it, it’s already lost value. A Flash-driven site like the Australian Financial Review is an example of a management that doesn’t realise this.

Old data
A recent example of action in this space is the New York Times, which recently removed its paid subscription wall: content that was previously only available via subscription can now be accessed by anyone for free. This is a smart business move, because if you are selling archived content, you will make more money by having more people know what exists. A paid wall limits the people using it, which decreases the opportunity for consumption: you are relying on a brand alone to create demand. If you are a website with a lot of historical content, restricting access is stupid because you are effectively asking people to pay for access to something when they have no idea what value it holds for them. It’s a bit like traveling - if you’ve never been overseas, you don’t know what you are missing out on. Give people a taste of the travel bug, and they will never be able to sit still.

Unlike new data where the value is based on time, old data finds value in accessibility. People will place value on things like search, and the ability to find relevant content among the mountains of content available. Here the multitude of outputs doesn’t matter, because researchers have all the time in the world. What matters is a good interface, and powerful tools to mine the data: the value is in being able to find information. You shouldn’t charge people for access to the content; where you will make money is on the tools to mine the data.

Analysis
This type of content is difficult to create, but easily ripped off by other people - just think of how rife plagiarism is in schools and universities, where the latter treat plagiarism as a crime just short of murder. You can distinguish this type of content because it is produced from a common set of inputs that anyone could access, yet creates a viewpoint that only a certain type of person could create. The value is in the unique insight.

Despite the higher intellect needed to produce it, it unfortunately is content that is harder to capitalise on. A lot of technology blogs feel the pressure to move into more of a news style than an analytical service, because news is what gets eyeballs. If you are a blogger looking to make money, the new data approach above should be your strategy. But if you are a blogger trying to build your brand, do analysis. The consequence with analysis is that it’s harder to do, so you shouldn’t feel pressured to produce more content. I’ve noticed a trend, for example, that if I post more blog postings, I will get more traffic. But by the same token, more postings puts more pressure on me, which means less quality content. Understand that the value of analysis isn’t dependent on time. Or better said, the value of analysis is not how quickly it gets pumped out and realised, but how thoroughly it gets incubated as an idea and later communicated.

The value for analysis is clarity and ability to offer new thoughts. To look at the relationship with advertising models, new data like news (discussed above) typically gets higher viewers - which works for the pageview model (the more people refreshing, the more CPMs). Analysis, on the other hand, works with the time spent model. Take advantage of the engagement you have with those types of readers, because you are cultivating a community of smart people - there can be a lot more loyalty with that type of readership.

Entertainment
My sister downloads the Chaser’s War on Everything as a podcast. She first came across them on the radio, but she now downloads the podcasts religiously. Even though I knew about the Chaser’s efforts for years in their various products, I didn’t realise they were still around. In the last few weeks, I have been noticing my friends bring up the shows they are doing. The value in this content is the ability to make people laugh, due to their unique stunts. Their brand is built on word of mouth recommendations.

Like analysis, entertainment can be a very hard thing to generate because it relies on unique thinking. With a strong brand, people will pay for access to that content. Although it may seem that the viral spreading of funny content for free is a nightmare for a content producer trying to collect royalties, it’s actually a good thing because it entrenches the brand: more people will find out about it. The nature of entertainment, like analysis, is that it is difficult to do repeatedly. Sure people can copy your individual tricks - but they can only do so after the fact. They can’t pre-anticipate the next thing you will do; because unlike breaking news which is on how quickly you can pump out content, entertainment content requires a unique creative process to produce it.

The key with entertainment content is to build a relationship with an audience and to sustain it. Create a predictable flow of content. Encourage people to copy it, because all it does is get more people wanting to see what you come up with next. If it wasn’t for Stephen Colbert’s clips on YouTube, I would never have realised his brilliance. When I didn’t know he existed, a DVD set of his shows meant nothing to me (but it holds a lot of value now). The value of entertainment is to generate emotions in people repeatedly. Emotions are a powerful influence on human behaviour - master that and you can be dangerous!

Concluding thoughts
This posting only touches on the issues, but what I suggest is that creators of content need to look at what type of content they are producing in order to exploit its unique aspects. Content represents human ideas, and content isn’t distinguished by a physical form. The theft of your content should be a given and can actually help you. Depending on what that content is, there may be natural safeguards that make theft irrelevant (ie, the time value of news).

Don’t get the Semantic Web? You will after this

Prior to 2006, I had sort of heard of the Semantic Web. To be honest, I didn’t know much – it was just another buzzword. I’ve been hearing about Microformats for years, and cool but useless initiatives like XFN. However to me it was simply just another web thing being thrown around.

Then in August 2006, I came across Adrian Holovaty’s article where he argues journalism needs to move from a story-centric world to a data-centric world. And that’s when it dawned on me: the Semantic web is some serious business.

I have since done a lot of reading, listening, and thinking. I don’t profess to be a Semantic Web expert – but I know more than the average person as I have (painfully) put myself through videos and audios of academic types who confuse the crap out of me. I’ve also read through a myriad of academic papers from the W3C, which are like the times when you read a novel and keep re-reading the same page and still can’t remember what you just read.

Hell – I still don’t get things. But I get the vision, so that’s what I am going to share with you now. Hopefully, my understanding will benefit the clueless and the skeptical alike, because it’s a powerful vision which is entirely possible.

1) The current web is great for humans; useless for machines
When you search for ambiguous terms, at best, search engines can algorithmically predict some sort of answer that partially answers your query. Sometimes not. But the complexity of language, is not something engineers can engineer to deal with. After all, without ambiguity of natural languages, the existence of poetry is impossible.

Fine.

What did you think when you read that? As in: “I’ve had it – fine!” which is like another way of saying ok or agreeing with something. Perhaps you thought about that parking ticket I just got – illegal parking gets you fined. Maybe you thought I am applauding myself by saying that was one fine piece of wordcraftship I just wrote, or said in another context, like a fine wine.

Language is ambiguous, and depending on the context with other words, we can determine what the meaning of the word is. Search start-up company Powerset, which is hoping to kill Google and rule the world, is employing exactly this technique to improve search: intelligent processing of words depending on context. So by me putting in “it’s a fine”, it understands the context that it’s a parking ticket, because you wouldn’t say “it’s a” in front of ‘fine’ when you use it to agree with something (the ‘ok’ meaning above).
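
To show what "working out the meaning from the neighbouring words" could look like, here is a toy sketch. It is emphatically not Powerset’s technique (I have no idea what goes on inside their engine); the cue words and the `guess_sense` function are invented for the example.

```python
# Hypothetical cue words for each sense of "fine"; a real system would
# learn these from large amounts of text rather than use a hand-made list.
SENSE_CUES = {
    "penalty": {"it's", "parking", "ticket", "paid", "got"},
    "agreement": {"that's", "ok", "okay", "i'm"},
    "quality": {"wine", "dining", "art", "piece"},
}

def guess_sense(sentence):
    """Guess the sense of 'fine' by counting cue words among its neighbours."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    if "fine" not in words:
        return None
    i = words.index("fine")
    # Look at up to two words either side of "fine".
    neighbours = set(words[max(0, i - 2):i] + words[i + 1:i + 3])
    scores = {sense: len(cues & neighbours) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(guess_sense("it's a fine"))       # penalty
print(guess_sense("like a fine wine"))  # quality
```

A sketch this crude is easily fooled (try mixing cues from two senses in one sentence), which hints at why the problem is hard.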

But let’s use another example: “Hilton Paris” in Google – the world’s most ‘advanced’ search engine. Obviously, as a human reading that query, you understand from the context of those words that I would like to find information about the Hilton in Paris. Well maybe.

Let’s see what Google comes up with: Of the ten search results (as of when I wrote this blog posting), one was a news item on the celebrity; six were on the celebrity describing her in some shape or form, and three results were on the actual Hotel. Google, at 30/70 – is a little unsure.

Why is Paris Hilton, that blonde haired thingy of a celebrity, coming up in the search results?

Technologies like Powerset apparently produce a better result because they understand the order of the words and the context of the search query. But the problem with these searches isn’t only the interpretation of what the searcher wants – it’s also the ability to understand the actual search results. Powerset can only interpret so much of the gazillions of words out there. There is the whole problem of the source data, not just the query. Don’t get what I mean? Keep reading. But for now, learn this lesson:

Computers have no idea about the data they are reading. In fact, Google pumping out those search results is based on people linking. Google is a machine, and reads 1s and 0s – machine language. It doesn’t get human language.

2) The Semantic web is about making what humans read, machine readable
Tim Berners-Lee, the guy that invented the World Wide Web and the visionary behind the Semantic Web, prefers to call it the ‘data web’. The current web is a web of documents; by adding extra data to that content, machines will be able to understand it. This extra data is metadata: data about data.

A practical outcome of having a semantic web is that when Google pulls up a web page, it will understand what the content is about, regardless of how ambiguous the words on the page are. Think of every word on the web being linked to a master dictionary.
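
Here is a small sketch of the difference that kind of metadata could make, using the Hilton example. The two pages, the `about` field and the `dict:` identifiers are all invented; real Semantic Web metadata uses standards like RDF rather than a Python dictionary.

```python
# Two hypothetical pages that a plain key-word search cannot tell apart.
# The "about" field stands in for metadata pointing at a shared dictionary.
PAGES = [
    {"url": "example.com/hotel", "text": "Hilton Paris rooms and rates",
     "about": {"type": "Hotel", "id": "dict:hilton-paris-hotel"}},
    {"url": "example.com/gossip", "text": "Paris Hilton spotted at party",
     "about": {"type": "Person", "id": "dict:paris-hilton"}},
]

def keyword_search(query, pages=PAGES):
    """Return every page containing all the query words, in any order."""
    terms = set(query.lower().split())
    return [p["url"] for p in pages if terms <= set(p["text"].lower().split())]

def typed_search(query, wanted_type, pages=PAGES):
    """Like keyword_search, but keep only pages whose metadata has the wanted type."""
    hits = set(keyword_search(query, pages))
    return [p["url"] for p in pages
            if p["url"] in hits and p["about"]["type"] == wanted_type]

print(keyword_search("Hilton Paris"))         # both pages match the words
print(typed_search("Hilton Paris", "Hotel"))  # only the hotel page
```

The words on the two pages are the same; only the metadata lets the machine tell the hotel from the celebrity.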

The benefit of the semantic web is not for humans – at least immediately. The Semantic Web is actually pretty boring with what it does – what is exciting, is what it will enable. Keep reading.

3) The Semantic web is for machines to interpret, not people
A lot of the skeptics of the semantic web usually don’t see the value of it. Who cares about adding all this extra metadata? I mean heck – Google still was able to get the website I needed – the Hilton in Paris. Sure, the other 70% of the results on that page were irrelevant, but I’m happy.

I once came across a Google employee and he asked “what’s the point of a semantic web; don’t we already have enough metadata?” To some extent, he’s right – there are some websites out there that have metadata. But the point of the semantic web is that machines, once they read the information, can start thinking like a human would and connect it to other information. There needs to be metadata across the board.

For example, my friend Michael was recently looking to buy a car. A painful process, because there are so many variables. So many different models, different makes, different dealers, different packages. We have websites, with cars for sale neatly categorised into profile pages saying what model it is, what colour it is, and how much. (Which may I add, are hosted on multiple car sites with different types of profiles). A human painfully reads through these profiles, and computes as fast as a human can. But a machine can’t read these profiles.

Instead of wasting his (and my) weekends driving around Sydney to find his car, a machine could find it for him. So, Mike would enter his profile in – what he requires in a car, what his credit limit is, what his prior history with cars is – everything that would affect his judgement of a car. And then, the computer can query every online website with cars to match the criteria. Because the computer can interpret these websites across the board, it can evaluate and it can go back to Michael and say “this is the car for you, at this dealer – click yes to buy”.
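
Once the car listings are machine readable, the matching itself is almost trivial. A minimal sketch, assuming every car site exposed its listings in one consistent structure (the `Listing` and `BuyerProfile` fields, the dealers and the prices are all made up):

```python
from dataclasses import dataclass

@dataclass
class Listing:
    dealer: str
    make: str
    colour: str
    price: int  # in dollars

@dataclass
class BuyerProfile:
    makes: set
    colours: set
    budget: int

def best_match(profile, listings):
    """Return the cheapest listing that satisfies the profile, or None."""
    candidates = [
        car for car in listings
        if car.make in profile.makes
        and car.colour in profile.colours
        and car.price <= profile.budget
    ]
    return min(candidates, key=lambda car: car.price, default=None)

# Hypothetical listings gathered from several car sites.
listings = [
    Listing("Dealer A", "Toyota", "blue", 18000),
    Listing("Dealer B", "Toyota", "blue", 16500),
    Listing("Dealer C", "Mazda", "red", 15000),
    Listing("Dealer D", "Toyota", "blue", 25000),
]
mike = BuyerProfile(makes={"Toyota"}, colours={"blue", "silver"}, budget=20000)

print(best_match(mike, listings))
```

The hard part is not this function; it is getting every car site to describe its listings in a form a machine can read in the first place.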

The semantic web is about giving computers the information to be able to interpret data, so that they can do what they do really well – compute.

4) A worldwide database
What Berners-Lee essentially envisions is turning the entire world wide web into a database that can be queried. Currently, the web looks like Microsoft Word – one slab of text. However, if that slab of text was neatly categorised in an Excel spreadsheet, you could manipulate that data and do what you please – create reports, reorder them, filter, and do whatever until your heart is content.

At university, I was forced to do an Information Systems subject which was essentially about the theory of databases. Damn painful. I learned only two things from that course. The first thing was that my lecturer, tutor, and classmates spoke less intelligible English than a caterpillar. But the second thing was that I learned what information is and how it differs from data. I am now going to share with you that lesson, and save you three months of your life.

You see, data is meaningless. For example, 23 degrees is data. On its own, it’s useless. Another piece of data is Sydney. Again – useless. I mean, you can think all sorts of things when you think of Sydney, but it doesn’t have any meaning.

Now put together 23 degrees and Sydney, and you have just created information. Information is about creating relationships between data. By creating a relationship, an association, between these two different pieces of data – you can determine it’s going to be a warm day in Sydney. And that is what information is: Relationship building; connecting the dots; linking the islands of data together to generate something meaningful.
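
The same lesson as a tiny sketch. The Hobart figure and the 20 degree threshold for "warm" are invented for illustration, and the three-part facts are only loosely modelled on how Semantic Web data is usually described.

```python
# Two isolated pieces of data: on their own they say nothing.
temperature = 23
city = "Sydney"

# Information: the relationship between them, stored as
# (subject, relationship, value) facts. The Hobart figure is made up.
facts = [
    ("Sydney", "forecast_max_celsius", 23),
    ("Hobart", "forecast_max_celsius", 12),
]

def query(facts, subject=None, relationship=None):
    """Return the facts matching the given subject and/or relationship."""
    return [
        f for f in facts
        if (subject is None or f[0] == subject)
        and (relationship is None or f[1] == relationship)
    ]

def is_warm(facts, city, threshold=20):
    """A city counts as warm if its forecast maximum reaches the threshold."""
    matches = query(facts, city, "forecast_max_celsius")
    return bool(matches) and matches[0][2] >= threshold

print(is_warm(facts, "Sydney"))  # True
print(is_warm(facts, "Hobart"))  # False
```

Neither `temperature` nor `city` can answer a question; the linked facts can.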

The semantic web is about allowing computers to query the sum of human knowledge like one big database to generate information.

Concluding thoughts
You are probably now starting to freak out and think of “Terminator” images, with computers suddenly erupting from under your computer desk and smashing you against the wall as a battle between humans and computers begins. But I don’t see it like that.

I think about the thousands of hours humans spend trying to compute things. I think of cancer research, where all the experimentation occurring in labs is trying to connect new pieces of data with old data to create new information. I think about computers being able to query the entire taxation legislation to make sure I don’t pay any tax, because they know how it all fits together (having studied tax, I can assure you – it takes a lifetime to understand only a portion of tax law). In short, I understand the vision of the Semantic web as a way of linking things together, to enable computers to compute – so that I can sit on my hammock drinking my beer, as I delegate the duties of my life to the machines.

All the semantic web is trying to do is make sure everything is structured in a consistent manner, with a consistent dictionary behind the content, so that a machine can draw connections. As Berners-Lee said in one of the videos I saw: “it’s all about creating links”.

The process to a Semantic Web is boring. But once we have those links, we can then start talking about those hammocks. And that’s when the power of the internet - the global network - will really take off.