Tag Archive for 'data'

Information overload: we need a supply side solution

About a month ago, I went to a conference filled with journalists and I couldn’t help but ask them what they thought about blogs and their impact on the profession. Predictably, they weren’t too happy about it. What was unpredictable, however, were the reasons. It wasn’t just a rant, but a genuine care for journalism as a concept, and a concern that the blogging “news industry” is digging a hole for everyone.

Bloggers and social media are replacing the newspaper industry as a source of breaking news. What they still lack is quality: there have been multiple examples of blogs breaking news that, in the rush to publish, turned out to be false. Personally, I think that as blogging evolves (as a form of journalism) the checks and balances will be developed, such as big-name blogs whose brands effectively act like a traditional masthead. And when a brand is developed, more care is put into quality.

Regardless, the infancy of blogging highlights the broader concern of “quality”. With the freedom for anyone to create, the Information Age has seen us overloaded with information despite our finite ability to take it all in. The relationship between the producer and the consumer of news is not only blurring; it is radically transforming, with dynamics that are affecting even the offline world.

Traditionally, “information overload” has been explained away with a simple analysis of lower barriers to entry for producers of content (anyone can create a blog on wordpress.com and away you go). However, what I am starting to realise is that the issue isn’t so much the technological ability for anyone to create their own media empire, but the incentive system we’ve inherited from the offline world.

Whilst numerous companies have tried to solve the problem from the demand side with “personalisation” of content (on the desktop, as an aggregator, and about another 1000 different spins), what we really need are attempts on the supply side, from the actual content creators themselves.

Too much signal can make it all look like noise

Information overload: we need a supply side solution
Marshall Kirkpatrick, along with his boss Richard MacManus, is among the best thinkers in the industry. The fact they can write makes them not journalists in the traditional sense, but analysts with the ability to clearly communicate their thoughts. Add to the mix TechCrunch don Michael Arrington and his amazing team: they are analysts that give us amazing insight into the industry. I value what they write; but when they feel the pressure of their industry to write more, they are doing a disservice not only to themselves, but also to the humble reader they write for. Quality is not something you can automate. There’s a fixed amount a writer can do, not because of their typing skills but because quality is a product of self-reflection and research.

The problem is that whilst they want to, can and do write analysis, their incentive system is biased towards numbers driven by popularity. More readers and more content (which creates more potential to get readers) mean more pageviews, and therefore money in the bank, as advertisers pay by the number of impressions. The conflict for the leading blogs churning out content is that their incentive system is based on a flawed system from the pre-digital world: it is known as circulation offline, and as pageviews online.

A newspaper primarily makes money through its circulation: the number of physical newspapers it sells, but also the audited figures of how many people read the newspaper (readership can be up to three times the physical circulation). With the latter, a newspaper can sell space based on its proven readership: the higher the readership, the higher the premium. The reason for this is that in the mass media world, advertising was about hitting as many people as possible. I liken it to the image of flying a plane over a piece of land and dropping leaflets with the blind faith that, of those 100,000 pamphlets, at least 1000 will be caught by someone.

It sounds stupid for an advertiser to blindly drop pamphlets, but they had to: it was the only way they could effectively advertise. To make sales, they need the ability to target buyers and create exposure for the product. The only mechanism available for this was the mass media, as it had a captive audience, and at best an advertiser could place ads in specialist publications hoping to get a better return on their investment (dropping pamphlets about water bottles over a desert makes more sense than over a group of people in a tropical rainforest). Nevertheless, this advertising was done en masse, because the technology limited the ability to target.

Advertising in the mass media: dropping messages, hoping the right person catches them

The Internet is a completely new way to publish. The technology enables a relationship between a consumer of content, a vendor and a producer of content unlike anything previously seen in the world. The end goal of a vendor’s advertising is sales, and they no longer need to drop pamphlets: they can now build a one-on-one relationship with that consumer. They can now knock on your door (after you’ve flagged you want them to), sit down with you, and have a meaningful conversation about buying the product.

“Pageviews” are pamphlets being dropped – a flawed system that we used purely due to technological limitations. We now have the opportunity for a new way of doing advertising, but we fail to recognise it – and so our new media content creators are being driven by an old media revenue model.

It’s not technology that holds us back, but perception
Vendor Relationship Management (VRM) is a fascinating new way of looking at advertising, where the above scenario is possible. A person can maintain a bank of personal information about themselves, as well as flagging their intention of what products they want to buy, and vendors don’t need to resort to advertising to sell their product; instead they build a relationship with these potential buyers one on one. If an advertiser knows you are a potential customer (by virtue of knowing your personal information, which, might I add, is something the consumer controls under VRM), they can focus their efforts on you rather than blindly advertising to the other 80% of people that would never buy their product. In a world like this, advertising as we know it is dead because we no longer need it.

VRM requires a cultural change in how our world understands a future like this. Key to this is companies recognising that a user controlling their personal data in fact opens up new opportunities for advertising. Companies currently believe that by accumulating data about a user, they are building a richer profile of someone and can therefore better ‘target’ advertising. But companies succeeding technologically on this front are being booed down in a big way by privacy advocates and the mainstream public. The cost of holding this rich data is too much. Privacy by obscurity is no longer possible, and people demand the right to privacy in an electronic age where disparate pieces of their life can be linked online.

One of the biggest things the DataPortability Project is doing is transforming the notion that a company somehow has a competitive advantage by controlling a user’s data. The political pressure, education, and advocacy of this group is going to allow things like VRM. When I spoke to a room of Australia’s leading technologists at BarCamp Sydney about DataPortability, what I realised is that they failed to recognise that what we are doing is not a technological transformation (we are advocating existing open standards, not new ones) but a cultural transformation of a user’s relationship with their data. We are changing perceptions, not building new technology.

To fix a problem, you need to look at the source that feeds the beast

How the content business will change with VRM
One day, when users control their data and have data portability, and we can have VRM, the content-generating business will find a way out of the hole currently being dug. Advertising on a “hits” model will no longer be relevant. The page view will be dead.

Instead, what we may see is an evolution to a subscription model. Rather than measuring success by how many people viewed their content, content producers can focus less on hits and more on quality, as their incentive system will not be driven by the pageview. Consumers can build up ‘credits’ under a VRM system for participating (my independent view, not a VRM idea), and can then use those credits to purchase access to content they come across online. Such a model allows content creators to be rewarded for quality, not numbers. They will need to focus on their brand, managing their audience’s expectations of what they create, and in return a user can subscribe with regular payments of the credits they earned in the VRM system.

Content producers can then follow whatever content strategy they want (news, analysis, entertainment) and will no longer be held captive by a legacy system that rewards the number of people reached, not the type of people reached.

Will this happen any time soon? With DataPortability, yes, but only once we all realise we need to work together towards a new future. Until we get that broad recognition, I’m just going to have to keep hitting “read all” in my feed reader because I can’t keep up with the amount of content being generated, whilst the poor content creators strain their lives working in a flawed system that doesn’t reward their brilliance.

Here’s a secret: the semantic web is the boring bit

Marshall Kirkpatrick caused a wave today when he gave a brutally honest assessment of one of the most talked-up semantic web applications, Twine. It was, as per usual, an excellent analysis by Marshall and I don’t think he needs to hide behind his words as they are fair. However, what I think is crucial, now that the semantic web is gaining traction in the mainstream, moving from an academic thesis to real-world web applications, is that we do a little bit of stakeholder management.

Ready? The semantic web is as boring as bat shit.

Essentially, the semantic web is about structuring content in a way so that computers can interpret the information. It’s a bit like linking every word on the web, to a dictionary entry so that computers understand the language that humans use.

But seriously, how is that exciting? People don’t get the semantic web, because it’s the fundamentals, and that’s boring! Take for example RDF, the semantic web building block, which is about structuring data into subject, predicate and object. This is straight from primary school grammar lessons, where we learn about the fundamentals of the English language (no coincidence I just linked to a grammar guide, not the RDF guide). And if you have heard of subject, predicate and object before in the context of the semantic web, you probably didn’t even realise it’s what the entire English language is based on. It’s because you probably did learn it, and forgot: it’s boring as bat shit. But damn, without them, we wouldn’t be communicating with each other right now.
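To make the grammar comparison concrete, here is a minimal sketch of subject, predicate and object statements in plain Python. It is not real RDF: actual RDF identifies things with URIs and has its own serialisation formats, and the `Triple` type, the `match` helper and the sample statements are all made up for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: subject, predicate, object (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


# Sample statements, invented for this illustration.
TRIPLES = [
    Triple("Marshall", "wrote", "Twine review"),
    Triple("Twine review", "is_about", "Twine"),
    Triple("Twine", "is_a", "semantic web application"),
]


def match(triples, subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None):
    """Return the triples that agree with every field that was supplied."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "What do we know about Twine as a subject?"
about_twine = match(TRIPLES, subject="Twine")
```

Boring, as promised: three-part sentences and a filter. The interesting part only starts once enough of the world is written down this way for a computer to chain the statements together.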

The point I want to make is that the building blocks are not where the excitement is: the excitement is in what you can do once we have those building blocks. In English, we have poetry, literature, and just language in general, through which we communicate as human beings. Once we get the basics of information down, we are laying the foundation of a whole new world of computational possibilities. Marshall is spot on in saying “…semantics may be best suited to the back end…” because the excitement is in what they enable, not the actual semantics, which are going to take a long time to build up.

Imagine the sum of human knowledge, accessible for a computer to query. Semantic web applications are boring and you won’t ever get them, but what they enable is a whole new world of potential which, once we can flick the switch, will mean a world we will barely recognise from today’s standpoint.

DataPortability is about user value, fool!

In a recent interview, VentureBeat asks Facebook creator and CEO Mark Zuckerberg the following:

VB: Facebook has recently joined DataPortability.org, a working group among web companies, that intends to develop common standards so users can access their data across sites. Is Facebook going to let users — and other companies — take Facebook data completely off Facebook?

MZ: I think that trend is worth watching.

It disappoints me to see that, because it seems like a quick journalist’s hit at a contentious issue. On the other hand, we have seen amazing news today, with examples of exactly the type of thing we should expect in a data-portability-enabled world: the Google contacts API, in an area we have highlighted for months now as an issue for data security, and Google Analytics allowing benchmarking, which is a clear example of a company that understands that by linking different types of data you generate more information and therefore more value for the user. The DataPortability Project is about advocating new ways of thinking, and indeed, we don’t have to formally produce a product so much as maintain the agenda in the industry.

However, the reason I write this is that it worries me a bit that we are throwing around the term “data portability” despite the fact the DataPortability Project has yet to formally define what it means. I can say this because I am a member of the policy action group and the steering action group, which are responsible for making this distinction, and we have yet to formally decide.

Today, I offer an analysis of what the industry needs to be talking about, because the term is being thrown around like buggery. Whilst it may be weeks or months before we finalise this, it’s starting to bother me that people seem to think the concept means solving the rest of the world’s problems or to disrupt the status quo. It’s time for some focus!

Value creation
First of all, we need to determine why the hell we want data portability. DataPortability (note the distinction of the term with that of ‘data portability’ - the latter represents the philosophy whilst the former is the implementation of that philosophy by DataPortability.org) is not a new utopian ideal; it’s a new way of thinking about things that will generate value in the entire Information sector. So to genuinely want to create value for consumers and businesses alike, we need to apply thinking that we use in the rest of the business world.

A company should be centered on generating value for its customers. Whilst it may have obligations to generate returns for its shareholders, and may attempt different things to meet those obligations, it also has an obligation to generate shareholder value. To generate shareholder value means to fund the growth of the business, ultimately through increased customer utility, which is the only long-term way of doing so (leaving aside acquisitions and operational efficiency, which are other ways companies generate more value but are short-term measures). Therefore an analysis of what value DataPortability creates should be done with the customer in mind.

The economic value of a user having some sort of control over their data is that they can generate more value through their transactions within the Information economy. This means better insights (i.e. greater interoperability allowing the connection of data to create more information), less redundancy (being able to reuse the same data), and more security (which includes better privacy; if not managed, a lack of it can compromise a consumer’s existence).

Secondly, what does it mean for a consumer to have data portability? Since we have realised that the purpose of such an exercise is to generate value, questions about data like “control”, “access” and “ownership” need to be reevaluated because, at face value, the way they are applied may have either beneficial or detrimental effects for new business models. The international accounting standards state that you can legally “own” an asset but not necessarily receive the economic benefits associated with that asset. The concept of ownership to achieve benefit is something we really need to clarify because, quite frankly, ownership does not translate into economic benefit, which is what we are trying to achieve.

Privacy is a concept that has legal implications, and regardless of what we discuss with DataPortability, it still needs to be considered because business operates within the frameworks of law. Specifically, the human rights of individuals (who are consumers) need to be given greater priority than any other factor. So although we should be focused on how we can generate value, we also need to be mindful that certain types of data, like personally identifiable data, need to be considered in a different light as there are social implications in addition to the economic aspects.

The use cases
The technical action group within the DataPortability project has been attempting to create a list of scenarios that constitute use cases for DataPortability enablement. This is crucial because to develop the blueprint, we also need to know what exactly the blueprint applies to.

I think it’s time, however, that we recognise this isn’t merely a technical issue, but an industry issue. So now that we have begun the research phase of the DataPortability Project, I ask you and everyone else to join me as we discuss what exactly is the economic benefit that DataPortability creates. Rather than asking if Facebook is going to give up its users’ data to other applications, we need to be thinking about the end value that we strive to achieve by having DataPortability.

Portability in context, not location
When the media discuss DataPortability, please understand that a user simply being able to export their data is quite irrelevant to the discussion, as I have outlined in my previous posting. What truly matters is “access”. The ability for a user to command the economic benefits of their data is the ability to determine who else can access their data. Companies need to be thinking that value creation comes from generating information, which is simply relationships between different data ‘objects’. If a user is to get the economic benefits of using their data from other repositories, companies simply need to let the user delegate permission for others to access that data. Such a thing does not compromise a company’s competitive advantage, as they won’t necessarily have to delete the data they hold on a user; rather, it requires them to realise that holding a user’s data (or parts of it) in custody gives them an advantage, because hosting that data gives them complete access with which to come up with innovative new information products for the user.
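As a rough illustration of “access, not export”, here is a minimal sketch in Python of a company holding a user’s data in custody while the user decides who else may read it. Everything in it is hypothetical: the `DataCustodian` class, its method names and the sample parties are invented for this example and do not describe any standard or product the DataPortability Project endorses.

```python
class DataCustodian:
    """Hypothetical service that hosts users' data and honours their grants."""

    def __init__(self, name):
        self.name = name
        self._records = {}  # user -> {key: value}
        self._grants = {}   # user -> {grantee: set of keys}

    def store(self, user, key, value):
        """The custodian keeps the data; nothing is exported or deleted."""
        self._records.setdefault(user, {})[key] = value

    def grant(self, user, grantee, keys):
        """The user delegates read access to specific items."""
        self._grants.setdefault(user, {}).setdefault(grantee, set()).update(keys)

    def revoke(self, user, grantee):
        """The user withdraws every permission given to a grantee."""
        self._grants.get(user, {}).pop(grantee, None)

    def read(self, user, requester, key):
        """Return the item if the requester is the user or holds a grant."""
        allowed = requester == user or key in self._grants.get(user, {}).get(requester, set())
        if not allowed:
            raise PermissionError(f"{requester} may not read {user}'s {key}")
        return self._records[user][key]


# Made-up scenario: a social network hosts the contacts, a mail app borrows them.
network = DataCustodian("social-network")
network.store("alice", "contacts", ["bob", "carol"])
network.grant("alice", "mail-app", ["contacts"])
shared = network.read("alice", "mail-app", "contacts")
```

The data never leaves the custodian in this sketch, which is the point: the hosting company keeps its complete access, and the user gets the economic benefit of letting other services use the same data.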

So what’s my point? When discussing DataPortability, let’s focus on the value to the user. And the next time the top tech blogs confront the companies that are supporting the movement with a simplistic “when are you going to let users take their data completely off ” I am going to burn my bra in protest.

Disclosure: I’m a heterosexual male that doesn’t cross-dress.

Update: I didn’t mean to scapegoat Eric from VentureBeat who is a brilliant writer. However I used him to give an example of the language being used in the entire community which now needs to change. With the DP research phase now officially underway for the next few months, the questions we should be asking should be more open-ended as we at the DataPortability project have realised these issues are complex, and we need to get the entire community to come to a consensus. DataPortability is no longer just about exporting your social graph - it’s an entirely new approach to how we will be doing business on the net, and as such, requires us to fundamentally reexamine a lot more than we originally thought.