What is data?

The leading voices in technology have exploded in discussion about data portability, data rights, and the future of web applications. As an active member of the DataPortability Policy group, here is my suggestion on how the debate needs to proceed: break it down. Michael Arrington seems pretty convinced you own all your data, but I don’t think that’s a fair thing to say - and that, at its core, is why he is clashing with Robert Scoble’s view. For things to proceed, I really think a deeper analysis of the issues needs to be made.

1) Define the difference between data, information and knowledge. There’s a big difference.
2) Determine what things are. (is an e-mail address data or information?)
3) Recognise the difference between ownership, rights and their implications.
4) Determine what rights (if that’s what it is) the various entities have over data (users, web apps, etc).

This is a big area and has a lot of abstract concepts - break it down and debate it there.

Some of my own thoughts to give some context

1) Data is an object and information is generated when you create linkages between different types of data - the ‘relationships’. Knowledge is the application of information.

  • 2000 is data - a symbol with no meaning. Connect it with other data, like the noun "year", and you have information because 2000 now has meaning. Connect that information with other information, like "computer bug" and "HSBC", and you now have an application of that information. That is, there was an issue with the Y2K bug that had something to do with the bank HSBC.
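
To make the distinction concrete, here is a minimal sketch in Python. The `Information` class, the relation names, and the `apply_knowledge` function are all hypothetical, invented purely to illustrate the 2000 / year / HSBC example above; they are one possible reading of these definitions, not a formal model.

```python
from dataclasses import dataclass

# Data: bare symbols with no meaning on their own.
data = ["2000", "year", "computer bug", "HSBC"]


@dataclass(frozen=True)
class Information:
    """A link between two pieces of data (the 'relationship')."""
    subject: str
    relation: str  # hypothetical relation label, chosen for illustration
    obj: str


# Information: data connected to other data.
year_2000 = Information("2000", "is a", "year")
y2k_bug = Information("computer bug", "occurred in", "2000")
hsbc_issue = Information("HSBC", "was affected by", "computer bug")


def apply_knowledge(facts):
    """Knowledge: combine several pieces of information into one statement."""
    return "; ".join(f"{f.subject} {f.relation} {f.obj}" for f in facts)


summary = apply_knowledge([year_2000, y2k_bug, hsbc_issue])
print(summary)
```

On its own, the string "2000" in `data` says nothing. It only gains meaning once an `Information` object links it to something else, and only becomes useful once several of those links are combined.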

2) Define what things are

What’s an e-mail address, a phone number, a social graph, an image, a podcast…I’m not entirely sure. I wouldn’t be blogging this if I had all the answers. Once we agree on definitions, we can then start categorising them and applying criteria to them.

3) Ownership:

Here is something Steve Greenberg explained to me

- Ownership is relevant when there is scarcity.
- Ownership is the ability to deny someone else’s use of the asset.
- So, if data is shared and publicly available, it is a practical impossibility for me to deny use
- and if data is available in a form where I can’t control others’ use of it, I cannot really claim to own it

Nitin Borwankar has a very different argument: you should have ownership based on property rights. He explained that to me here.

4) Rights over data

I personally think no one owns data (which is inspired by the definition of data being inherently meaningless); instead you own things further down the value chain when that data becomes something with value. You own your overall blog posts - but not the words.

But again, this goes back to what is data?

The value chain for information

Lately, I’ve been doing a lot of thinking about the value chain of information, based on the Porter model of doing a value chain analysis. Given there is an undeniable trend to a knowledge-based economy (that is, if we’re not already there!), it seems pretty valuable that we should at least understand the different facets in the value chain to better understand the information sector.

Below are some thoughts about what I think are the broad aspects of the value system, with some commentary under each to help you understand my thinking. I’ve used common social computing sites to help illustrate the concepts, as everyone can relate to them. See also my definitions for data, information, and knowledge.

The value chain
1) Data collection
- value is in the storage
Competitive advantage: who offers consumers the lowest price for the most storage. You should not just consider this in terms of the cost of hosting but also whether it costs the user their rights to control some of their data.
Example: MySpace is where you store all your demographic data; SmugMug is where you store all your photos (which I consider data)

2) Data processing
- value is in the ability to manipulate the data
Competitive advantage: The infrastructure to process vast amounts of data at the highest output with the lowest cost
Example: Facebook calculates how many friends you have. Calculating that information at scale requires substantial computing power, which is why Friendster fell just when it captured the imagination of the industry as the first major social networking site.

3) Information generation
- Value is in the type and diversity of information. The connection of data (objects) is what generates information. Requires unique ability to understand what data inputs to pull.
Competitive advantage: Ability to access the most data (ie, relationships with the data storage components in the chain), and be able to creatively apply the data in a unique way.
Example: LinkedIn allows me to know that I am two degrees separated from a certain individual. The ability for LinkedIn to do that is a combination of what data they can use as well as the ability to process it. Essentially, it comes down to the creativity of the company’s management in determining the feature’s value, and to its relationships with storage vendors or its methods of using its own storage. In a DataPortability enabled world, it’s not so much about how much of a user’s data you can store - but how much you can access from the storage vendors (ie, your relationships with these vendors).

4) Knowledge application
- value is in the application of information
Competitive advantage: the application of information in a unique way that has not been done before
Example: A network analysis of my social graph. So if a social networking site can tell me that 48% of my friends are male, and another piece of information that 98% of them are heterosexual, then it is likely I am a straight male. The ability to derive insight, despite the multiple pieces of information available, rests with those with the unique ability to recognise how information can be applied in certain ways. The determination that I am straight is inference, which is a higher order type of value as opposed to just information (which is grounded in hard data and more based on fact).
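
As a rough illustration of the four stages, here is a minimal Python sketch over a toy social graph. The names, the `friends` and `gender` tables, and the three functions are all made up for this example; real sites obviously work at a very different scale and with their own methods.

```python
from collections import deque

# 1) Data collection: store raw data (hypothetical friend lists and profiles).
friends = {
    "me": ["ann", "bob"],
    "ann": ["me", "cat"],
    "bob": ["me"],
    "cat": ["ann", "dan"],
    "dan": ["cat"],
}
gender = {"ann": "female", "bob": "male", "cat": "female", "dan": "male"}


# 2) Data processing: a simple computation over the stored data.
def friend_count(person):
    return len(friends.get(person, []))


# 3) Information generation: link data to find degrees of separation,
#    using a breadth-first search over the friend graph.
def degrees_of_separation(start, target):
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        if person == target:
            return distance
        for friend in friends.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, distance + 1))
    return None  # the two people are not connected


# 4) Knowledge application: derive a statistic that could feed an inference.
def share_of_friends(person, value):
    own = friends.get(person, [])
    if not own:
        return 0.0
    return sum(1 for f in own if gender.get(f) == value) / len(own)
```

For instance, `degrees_of_separation("me", "cat")` walks from "me" to "ann" to "cat". Note that the last step only produces a number: deciding what that number implies about a person is the inference, and that judgement sits outside the code.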

Implications of the value chain
It is important to note that the Internet industry, which is the backbone of the Information Sector of the economy, is still relatively immature, and that is why it may be difficult for you to conceptualise the above. Flickr for example does most of the value chain - they store my photos, they allow me to make changes to the photos and add additional data like tags; they generate information by allowing me to organise my photos into sets (hence giving more value to the photo by putting it into context). And of course, they allow for knowledge application through their community - people passing by and leaving comments is something quite unique to Flickr.

By better understanding the value chain, hopefully we can also realise that businesses can thrive by focussing on specific areas and it may not be in their interest to be in all areas. For example, the notion that locking up a person’s user data is a competitive advantage is silly, if you can offer value through knowledge application.

To put the above in context, MySpace’s recent data availability announcement is a step in the direction of DataPortability (something that will take until the end of this year to finalise at minimum), but whilst Google and Facebook race to offer similar services to ‘lock’ their data, they are in fact missing the point. The value of MySpace for example is the community, and they get value in accessing data and information from as many diverse places as possible to apply that in a unique way. Because these companies think locking in the data is what determines their business strategy, it forces them to compete in the data storage market - and that is a market I would not want to be in, given the ability for it to be commoditised and the massive compliance demands from government and from user expectations about their rights. As highlighted by Nitin, data redundancy is a big issue, so battling in the storage market puts you at risk if you are solely relying on it as your source for information and knowledge.

As always, I write my blog posts to extend on my thoughts. I’d love feedback and people to challenge the assumptions I’ve made, because I think this can be a very valuable tool in how we view businesses on the web.

Update 1 June 2008: Tim Bull made a video of this posting, which does a better job explaining the concepts presented above

It’s all still alpha in my eyes

The invention of hypertext has been the most revolutionary thing since two previous technologies: the printing press and the alphabet. Combined with computing and the Internet, we have seen a new world represented by the World Wide Web that has transformed entire industries in its mere 15 year existence.

The web caught our imagination in the nineties, which became the Dot-Com bubble. Several years after the bust, optimism reawakened when the Google machine listed on the stock exchange – heralding a new era dubbed “web2.0”. This era has now been recognised in the mainstream, elevated by the mass adoption of the social computing services, and has once again seen the web transform traditional ideas and generate excitement.

The web2.0 era is far from over – the recent global recession, however, has flagged that the pioneers of the industry are looking for something new. As the mainstream is rejuvenated by web2.0, as the Valley was not that long ago, it’s time to now look for what the next big thing will be. Innovation on the web is apparently flattening. Perhaps it has – but the seeds of the next generation of innovation on the web are already here.

Controversy over the meaning of web2.0 – and what its successor will be – should not distract us. We are seeing the web and associated technologies evolve to new heights. So the question is not when web2.0 ends, but what are we seeing now that will dominate in the future?

My view:
The mobile web. The mobile phone is now evolving into a generic entertainment device, becoming a new computing device that extends the reach of the internet. First with the desktop computer, and then with the laptop computer – new opportunities presented themselves in the way we could use computers. The use of this new computing platform will create new opportunities of which we have only scratched the surface.
The 3D web. Visit Second Life, the virtual world, and you quickly note that the main driver of activity is sex and that it’s just a game. However, porn and games have spearheaded a lot of the innovation of technology in the past. The 3D web is now emerging with four separate but related trends: virtual worlds, mirror worlds, augmented reality and lifelogging.
The data web. Data has now become a focus in the industry. The semantic web, eventually, will allow a weak form of artificial intelligence that will allow computer agents to work in an automated fashion. Vendor Relationship Management is changing the fundamental assumptions of advertising, with a new way of how we transact in our world. Those trends, when combined with the drive for portability of people’s data, are having us see the web in a new light with new potential. Not as a collection of documents, and not as a platform for computing, but as a database that can be queried.

So to get some discussion, I thought I might ping some smart people I know in the industry on what they think: Chris Saad, Daniela Barbosa, Ben Metcalfe, Ross Dawson, Mick Liubinskas, Randal Leeb-du Toit, Stewart Mader, Tim Bull, Seth Yates, Richard Giles as well as you reading this now.
What do you think is currently in the landscape that will dominate the next generation of the web?