


What is data?

The leading voices in technology have exploded in discussion about data portability, data rights, and the future of web applications. As an active member of the DataPortability Policy group, my suggestion on how the debate needs to proceed is this: break it down. Michael Arrington seems pretty convinced you own all your data, but I don’t think that’s a fair thing to say - and at its core, that is the reason he is clashing with Robert Scoble’s view. For things to proceed, I really think a deeper analysis of the issues needs to be made.

1) Define the difference between data, information and knowledge. There’s a big difference.
2) Determine what things are. (is an e-mail address data or information?)
3) Recognise the difference between ownership, rights and their implications.
4) Determine what rights (if that’s what it is) the various entities have over data (users, web apps, etc).

This is a big area and has a lot of abstract concepts - break it down and debate it there.

Some of my own thoughts to give some context

1) Data is an object and information is generated when you create linkages between different types of data - the ‘relationships’. Knowledge is the application of information.

  • 2000 is data - a symbol with no meaning. Connect it with other data, like the noun "year", and you have information because 2000 now has meaning. Connect that information with other information, like "computer bug" and "HSBC", and you now have an application of that information: there was an issue with the Y2K bug that has something to do with the bank HSBC.
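
The distinction above can be sketched in a few lines of Python. This is only an illustration of the idea, not a formal model: the class names (Datum, Information, Knowledge) and the relation labels are my own assumptions, chosen to mirror the Y2K example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Datum:
    """A bare symbol; it carries no meaning on its own (hypothetical name)."""
    value: str


@dataclass(frozen=True)
class Information:
    """Two pieces of data linked by a named relationship (hypothetical name)."""
    subject: Datum
    relation: str
    obj: Datum

    def describe(self) -> str:
        return f"{self.subject.value} {self.relation} {self.obj.value}"


@dataclass(frozen=True)
class Knowledge:
    """Several pieces of information applied together (hypothetical name)."""
    facts: Tuple[Information, ...]

    def describe(self) -> str:
        return "; ".join(fact.describe() for fact in self.facts)


# "2000" alone is data. Linking it to "year" makes it information.
year = Information(Datum("2000"), "is a", Datum("year"))

# Linking further information gives something we can apply: the Y2K story.
bug = Information(Datum("2000"), "is linked to a", Datum("computer bug"))
bank = Information(Datum("computer bug"), "involves", Datum("HSBC"))
y2k = Knowledge((year, bug, bank))

print(y2k.describe())
```

Notice that nothing in Datum can be "owned" in a useful way; the value only appears once the links are made.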

2) Define what things are

What’s an e-mail address, a phone number, a social graph, an image, a podcast…I’m not entirely sure. I wouldn’t be blogging this if I had all the answers. Once we agree on definitions, we can then start categorising them and applying criteria.

3) Ownership:

Here is something Steve Greenberg explained to me

- Ownership is relevant when there is scarcity.
- Ownership is the ability to deny someone else’s use of the asset.
- So, if data is shared and publicly available, it is a practical impossibility for me to deny use
- and if data is available in a form where I can’t control others’ use of it, I can not really claim to own it

Nitin Borwankar has a very different argument: you should have ownership based on property rights. He explained that to me here.

4) Rights over data

I personally think no one owns data (which is inspired by the definition of data being inherently meaningless); instead you own things further down the value chain when that data becomes something with value. You own your overall blog posts - but not the words.

But again, this goes back to what is data?

What is the DataPortability Project

When we created the DataPortability workgroup in November 2007, it was after discussion amongst a few of us to further explore an idea: a vision for the future of the social web. By working together, we thought we could make real change in the industry. What we didn’t realise was how quickly the attention generated by this workgroup would come, and how big it would be. A press release has been issued that details the journey to date, which highlights some interesting tidbits. What I am going to write below is how my own thoughts have evolved over the last few months, and what it is that I think DataPortability is.

1) Getting companies to adopt open, existing standards
RSS, OpenID, APML, oAuth, RDF, and the rest. These technologies exist, some of which have been around for many years. Everyone that understands what they are knows that they rock. If these standards are all so great - why hasn’t the entire technology industry adopted them yet? Now we just need awareness, education and in some cases pressure on the industry heavies to adopt them.

2) Create best practices of implementing these standards
When you are part of a community, you are in the know, and don’t realise how the outside world looks in. Let the standards communities focus their precious energies on creating and maintaining the technologies; and DataPortability can help provide resources for people to implement them. Is providing PHP4 support for oAuth really a priority? It isn’t for them - but by pooling the community with people that have diverse skillsets and are committed to the overall picture, it has a better chance of happening.

3) Synthesise these open standards to play nice with each other.
All these different communities working in isolation have been doing their own thing. An example is how Yadis-XRDS is working on service discovery and has a lacklustre catalogue. Do we just leave them to do their own thing? Does someone else in Bangalore create his own catalogue? (Which is highly likely given the under-exposure of this key aspect to groups needing it for the other standards, and the current state it’s in.) Thanks to Kaliya for mentioning that the XRDS guys have been more than proficient in working with other groups - "how do you think their spec is part of the OpenID spec?". Julian Bond goes on to say: "Yadis-XRDS is only months old and XRDS-Simple is literally days old…Having trouble thinking of a community that is working in isolation. And that isn’t likely to be hugely offended if you suggested it." So let me leave the examples here, and just say that the DataPortability Project, when defining technical and policy blueprints, can identify issues and, from the bigger picture perspective, focus attention on where it’s needed. By embracing the broader community, and focusing our attention on weaknesses, we can ensure no one is reinventing wheels.

4) Communicate all the good things the existing communities are doing, under the one brand, to the end user.
RSS is by far the most recognised open standard. Have you ever tried explaining RSS to someone who is outside of the tech industry? I have. Multiple times. It’s like I’ve just told them about the future with flying cars and settlements on Mars. I’ve done it in the corporate world, to friends, family, girls I date, guys I weight train with and anyone else. Moving onto OpenID - does anyone apart from Scoble and the technorati who try all the webservices they can, really care? Most people use Facebook, Hotmail (the cutting edge are using Gmail) and that’s it. On your next trip to Europe ask a cultured French (wo)man if they know what OpenID is; why they need it; what they can do with it. Now try adding RSS to the mix. And APML. And oAuth. Bonus if you can explain RDF to yourself.

Wouldn’t it be just easier if you explained what DataPortability is, and explained the benefits that can be achieved by using all these standards? Standards are invisible things that consumers shouldn’t need to care about; they just care about the benefits. Do consumers care about the standards behind Wi-Fi, as defined by IEEE 802.11 - or do they care about clicking "enable wireless" on their laptop and connecting to the Internet? If you are going around evangelising the technical standards, the only audience you will get are the corporates in IT departments, who couldn’t care less. The corporate IT guys respond to their customer/client facing guys, who in turn respond to consumers - and consumers couldn’t care less about how it’s done, just about what they can do. Have the consumer channel their demand, and it benefits the whole ecosystem.


The new DataPortability trustmark

It has been said the average consumer doesn’t care about DataPortability. Of course they don’t - we are still in the investigation phase of the Project, which later on will evolve to the design phases and then evangelising phases. We know people would want RSS, oAuth, and the rest of the alphabet soup - so let’s use DataPortability as a brand through which we can communicate this. Sales is about creating demand - let’s coordinate our ‘selling’ to make it overwhelming - and make it easy for consumers to channel that want in a way they can relate to. You don’t say "oAuth"; you say "preventing password theft" to them instead.

5) Make the business case that a user should get open access to their data
Why should Facebook let other applications use the data it has on its servers? Why should Google give up all this data they have about their users to a competitor? Why should a Fortune 500 adopt solutions that decentralise their control? Why should a user adopt RDF on their blog when they get no clear benefit from it? Is a self-trained PHP coder who can whack something together going to be able to articulate that to the VCs?

The tech industry has this obsession that nothing gets done unless the developers are on board. No surprises there - if we don’t have an engineer to build the bridge, we are going to have to keep jumping off the cliff hoping we make it to the other side. But at the same time, if you don’t have people persuading those who would fund this bridge, or the broader population, about how important it is for them to have this bridge - that engineer can build what he wants but the end result is that no one will ever walk on it. Funny how web2.0 companies suck at the revenue model thing: overhype on the development innovation, with under-hype on the value-proposition to the ordinary consumer who funds their business.

Developers need to be on board because they hassle their bosses and sometimes that evangelising from within works; but imagine if we get the developers’ bosses’ bosses on board because some old bear on the board of directors wants DataPortability after his daughter explained it to him (the same person that also told him about Facebook and YouTube). I can assure you, as I’ve seen it first hand with the senior leadership at my own firm, this is exactly what is happening.

Intel is one of the best selling computer-chip companies in the world. Do you really think as a consumer I care about what chip my computer works on? Logically - no. But the "Intel Inside" marketing campaign gave them a monopoly, because end consumers would ask "does it have Intel inside?" and this pressure forced Intel’s customers (IBM and the rest) to actually use Intel. Steve Greenberg corrects me by saying "The Intel Inside campaign came a decade after Intel took over the world. It wasn’t what got them there. It was in response to Microsoft signaling that they liked AMD. Looked like AMD was going to take off… but then they didn’t". So my facts were slightly wrong, but the point still remains.
At the same time, it isn’t just about political pressure; it’s also about education. I genuinely believe opening up your data is a smart business strategy that will change the potential of web services.

You make people care by giving them an incentive to do it (business opportunities; customer political pressure; peer pressure as individuals and as an industry, which later evolves into industry norms). The semantic web communities, the VRM communities, the entire open standards communities - all have a common interest in doing this. DataPortability is culture change on an industry wide level, that will improve the entire ecosystem. Apparently innovation has died - I say it’s just beginning.

The DataPortability Logo competition

As one of the founders of DataPortability that plays an active role in driving the project, I am writing this post to give recognition to some key individuals as well as transparency in line with the DataPortability philosophy. I also want to promote the fact that the social experiment that is DataPortability, something that both Chris Saad and I are actively trying to build, has had a massive evolution and validation that it works. The example set by this team on the first major deliverable external to the Project is a model for how things will occur in the future.

In February, RedHat sent a cease and desist letter to the Project, demanding that we drop our logo as it infringed on their copyright. Whilst the threat could have been debated, the decision was made after community consultation that it was not worth a fight, and we should pick a new logo. However, what was different was how we were going to pick a new logo: we decided to reach out to the broader community on this one.

So a competition was launched, soon followed by some generous prizes, for who could design the best logo. Over the course of the next few weeks, we received 403 entries that vied for the prize.

Now what?

DataPortability is a completely decentralised, non-hierarchical movement. Chris calls it participant democracy, where I prefer the simpler wikiocracy term. There is no formal management structure, and everyone is considered equal. No one is forced to do anything, but everyone involved is enthusiastic to make our vision a reality. So how do you convert those 403 submissions into a list of 15 logos that the public can easily vote for, with the pressure that the whole industry is looking and everything must be done with complete accountability? Add to that the fact that people involved in DataPortability all have full time jobs and other commitments - turning around something like this in a few short weeks is not easy.

Mary Trigiani (a founding member of San Francisco based Foldier) took the initiative, and formed a group to spearhead this mammoth task. She was joined by Phil Wolff (editor of Skype Journal, and from San Francisco), Brady Brim-DeForest (a Director and entrepreneur from Los Angeles), Navarr Barnier (a 17 year old Texan high school student on the W3C HTML committee), Triona Carey (a technical writer from sunny UK) and myself - and so the team started assembling itself. Remember, we have no authority formalised in the Project, and with such a mammoth task, the ability to self-organise and get things done should not be underestimated. Triona and Mary, who initially led the team, lived in completely different time zones, which made the team completely virtual - and it’s not an easy thing to make even simple decisions under those conditions.

What followed was an amazing team effort that did the following:

  • Discovered a technical issue where everyone seemed to be getting a different count, and therefore was not necessarily seeing all the logos submitted on the Flickr pool. This created a big problem: how do we ensure all our judges give equal consideration to all logos? Sure - you can download the logos and whack them on another server…you try doing that for 400 separate images in a semi-closed application.
  • Coordinated to get all these well recognised judges onto the same page, to vote for their favourites, and thereby create a shortlist of logos.
  • Reduced that shortlist of entries to a maximum of 15 (of 55 as picked by the judges), with all logos investigated for potential trademark issues and other factors that had a bearing on an appropriate logo.
  • Battled with timezones, evolving decision making processes, constantly changing leadership and commitments due to personal circumstances, and the dozens of hiccups along the way.
  • …as well as numerous other logistical issues which are still occurring and I don’t need to bore you with now

The technical issue, for which we experimented with God knows how many options, was eventually solved when Phil downloaded all the files with a special utility, and Navarr created an application that allowed all the logos to be seen and voted on (with some initial help from Bob Ngu). Phil also organised a logo collaboration space generously donated by conceptshare, that allowed the judges to get into discussions on logos to raise issues and generate awareness of potential problems with certain logos - a massive process given how many logos the judges had to review. These judges then placed their votes on Navarr’s application, which then had to be scrutinised quite intensely by members of the team to cut down the combined judges’ shortlist as well as research any potential trademark issues. And the people at webreakstuff rushed to build a system to enable the public voting at http://dataportability.techcrunch.com

The end result is 15 logos that have gone through a very thorough process of review that had them considered against every other logo.

Whilst I hesitate to write posts like this on my blog (I like to keep this blog primarily about analysis rather than events), I want to record this as evidence that it requires key individuals whose names are not known outside of the project to get things done - so thank you to everyone mentioned above. Combined, I don’t think it’s unfair to say the team spent 100 hours working together, and this was done in their free time - they are all busy people like the rest of us.

I also want to thank our brilliant judges, who gave the logos a very considered review and offered great insight as to what would make an effective logo.

They are:

and I cannot praise these individuals more highly after interacting with some of them and seeing their judging, which showed they obviously put a lot of consideration into their shortlist (as well as showing clear talent).

So check out the logos and don’t forget to vote (thanks to TechCrunch)! DataPortability is a community effort for a new future - your small contribution by voting, together with everyone else, helps us get one step closer to that vision.