The value chain for information

Lately, I’ve been doing a lot of thinking about the value chain of information, based on the Porter model of doing a value chain analysis. Given the undeniable trend towards a knowledge-based economy (that is, if we’re not already there!), it seems worthwhile to at least understand the different facets of the value chain so that we can better understand the information sector.

Below are some thoughts on what I think are the broad aspects of the value system, with some commentary under each to help you understand my thinking. I’ve used common social computing sites to help illustrate the concepts, as everyone can relate to them. See also my definitions for data, information, and knowledge.

The value chain
1) Data collection
- value is in the storage
Competitive advantage: whoever offers consumers the most storage for the lowest price. Consider price not just in terms of hosting cost, but also in terms of whether it costs users their right to control some of their data.
Example: MySpace is where you store all your demographic data; SmugMug is where you store all your photos (which I consider data)

2) Data processing
- value is in the ability to manipulate the data
Competitive advantage: The infrastructure to process vast amounts of data at the highest output for the lowest cost
Example: Facebook calculates how many friends you have. Calculating that information requires substantial raw computing power, which is why Friendster fell just when it had captured the imagination of the industry as the first major social networking site.

3) Information generation
- Value is in the type and diversity of information. The connection of data (objects) is what generates information. Requires unique ability to understand what data inputs to pull.
Competitive advantage: Ability to access the most data (i.e., relationships with the data storage components in the chain), and to apply that data creatively in a unique way.
Example: LinkedIn lets me know that I am two degrees separated from a certain individual. LinkedIn’s ability to do that is a combination of what data they can use and their ability to process it. Essentially, it comes down to the creativity of the company’s management in determining the feature’s value, and to the relationships with storage vendors or the methods of using their own storage. In a DataPortability-enabled world, it’s not so much how much of a user’s data you can store, but how much you can access from the storage vendors, i.e., your relationships with these vendors.
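
To make the "two degrees separated" example concrete, here is a minimal sketch of how connection data could be turned into that piece of information. It is not how LinkedIn actually does it; the `connections` data and the `degrees_of_separation` helper are hypothetical, and the technique is a plain breadth-first search over a small in-memory graph.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of hops between two people, or None if unconnected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for contact in graph.get(person, ()):
            if contact == target:
                return distance + 1
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, distance + 1))
    return None


# Hypothetical stored data: who is directly connected to whom.
connections = {
    "me": ["alice", "bob"],
    "alice": ["me", "carol"],
    "bob": ["me"],
    "carol": ["alice"],
}

print(degrees_of_separation(connections, "me", "carol"))
```

The stored connections are the data; the number of hops is the information generated by connecting those data objects. The more connection data the function can reach, the more relationships it can report.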

4) Knowledge application
- value is in the application of information
Competitive advantage: applying information in a unique way that has not been done before
Example: A network analysis of my social graph. If a social networking site can tell me that 48% of my friends are male, and another piece of information says that 98% of them are heterosexual, then it is likely I am a straight male. The ability to derive insight from the many pieces of information available belongs to those with the unique ability to recognise how information can be applied in certain ways. The determination that I am straight is inference, which is a higher-order type of value than plain information (which is grounded in hard data and based more on fact).
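
As a rough sketch of the difference between the information and the inference drawn from it, the snippet below computes the kind of percentage mentioned in the example from raw profile records. The `friends` records and the `attribute_shares` helper are made up for illustration; the output is information grounded in the data, while any conclusion drawn from it about the user is a separate judgement layered on top.

```python
from collections import Counter


def attribute_shares(records, attribute):
    """Return the percentage share of each value of `attribute` across records."""
    counts = Counter(r[attribute] for r in records if attribute in r)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100 * n / total for value, n in counts.items()}


# Hypothetical profile data for a user's friends.
friends = [
    {"name": "alice", "gender": "female"},
    {"name": "bob", "gender": "male"},
    {"name": "carol", "gender": "female"},
    {"name": "dave", "gender": "male"},
]

print(attribute_shares(friends, "gender"))
```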

Implications of the value chain
It is important to note (and this is why it may be difficult to conceptualise the above) that the Internet industry, which is the backbone of the information sector of the economy, is still relatively immature, so a single company often spans several stages. Flickr, for example, does most of the value chain: they store my photos; they allow me to make changes to the photos and add additional data like tags; and they generate information by allowing me to organise my photos into sets (hence giving more value to a photo by putting it into context). And of course, they allow for knowledge application through their community: people passing by and leaving comments is something quite unique to Flickr.

By better understanding the value chain, hopefully we can also realise that businesses can thrive by focussing on specific areas, and that it may not be in their interest to be in all areas. For example, the notion that locking up a person’s data is a competitive advantage is silly if you can offer value through knowledge application.

To put the above in context, MySpace’s recent data availability announcement is a step in the direction of DataPortability (something that will take until the end of this year to finalise, at minimum), but whilst Google and Facebook race to offer similar services to ‘lock’ their data, they are in fact missing the point. The value of MySpace, for example, is the community, and they get value from accessing data and information from as many diverse places as possible and applying that in a unique way. Because they think locking in the data is what determines their business strategy, they are forced to compete in the data storage market, and that is a market I would not want to be in given how easily it can be commoditised, and given the massive compliance demands from governments and from users’ expectations about their rights. As highlighted by Nitin, data redundancy is a big issue, so battling in the storage market puts you at risk if you are relying solely on it as your source of information and knowledge.

As always, I write my blog posts to extend on my thoughts. I’d love feedback and people to challenge the assumptions I’ve made, because I think this can be a very valuable tool in how we view businesses on the web.

Update 1 June 2008: Tim Bull made a video of this posting, which does a better job of explaining the concepts presented above.

6 Responses to “The value chain for information”


  1. David Novakovic

    Prepare for my rambling. It may seem way out, but bear with me :)
    This model of information -> knowledge reminds me of a three-level model proposed by Peter Gärdenfors, a very interesting take on the evolution and future of AI and cognitive science. His model describes what can best be described as layers of abstraction of thought. These layers map very, very well to the current trends in storing, managing, analyzing and reasoning about information. They are not meant to exactly represent what we are discussing, merely to shed light on how we can better tackle the task of managing knowledge. Without going into too many details, I’ll describe the relationship between this rather abstract concept and the different categories in your post. I also don’t have Peter’s book handy for a lot of the details… hopefully I won’t misrepresent what I consider to be a masterpiece. ( http://www.amazon.com/Conceptual-Spaces-Geometry-Thought-Bradford/dp/0262572192/ )

    The three Gärdenforsian levels are as follows:
    1. Symbolic (symbols, words… the nuts and bolts)
    2. Spatial (vector spaces, geometry, representation… measurable data)
    3. Connectionist (neural networks, perceptrons, etc… generally non-linear analysis… or our brains)

    Now I know this is already sounding way out of scope on this post, but I swear this is going somewhere :)
    Essentially it is human nature to want to understand things better, so we find ways to represent information that make a problem either simpler or easier to understand. It is a matter of economics, economics of thought. We don’t like thinking about things too hard. This means we take very large amounts of raw data (symbols) and create a way in our brain for us to understand them better. We don’t normally remember a whole document; we remember the general meaning of it: we summarize/tag/title it in our brain and store it away. This model applies to all information, not just something you can read (though that’s a good example). This process of understanding knowledge can be looked at in Peter’s framework.

    Now I think you have addressed the first two levels of Peter’s model well, but underemphasised the importance of point 4. Essentially we have information which we need to store, make available, manipulate and even generate. This relates very closely to the first layer of the model: symbols and what we can do with them, very much like traditional logic. Things we can hold onto and move around. This is the raw data of the web; we could try to understand it all, but we know better than that. ;)
    Your fourth point relates to the next level up, which is slightly more complex, but not so much that it isn’t useful. I personally feel very strongly about the Spatial layer; it is very powerful and is an enabler that joins the Connectionist layer with the Symbolic layer (our brains with the data). I’ll get back to this, because this is where I feel the hot action is happening on the web at the moment.

    The Connectionist layer in Peter’s model is probably best mapped to the human brain and its processing of information. In fact neural networks are actually models of what happens in our brain. We look at different bits and pieces of information in various contexts on the web and we are constantly reasoning about them and finding connections between them. The power to do this is enabled by the fact that we have systems that can make this information easier to understand and take in.

    Generally we can deduce a lot of information about a blog post by reading the heading and looking at tags associated with it, someone has gone to the hard work to make that information easy and available for us to see. We could have done it ourselves by reading the article, but we didn’t have to. BUT someone else did have to do it, someone took the symbols and represented them in a way that made it easy for us to understand. Someone took the information from Layer 1, considered it, modelled it in their own mind (Layer 2) and then presented it in a way that our brains can come along and put into context very easily at Layer 3.

    Now back to the Spatial Layer I skipped over earlier. This is the mapping layer, the layer that maps the free text into the tags. It needs to model our own cognitive processes in order to be able to present information that we can understand straight away. In reality a system like this presents a series of clues that we can then look at and go “ah hah!” The Spatial Layer can be mapped to applications on the web that, as you say, take information and tell you what the chances of you being gay are. :)

    In reality many applications sit across all layers of the model. Take Twine: it stores the information and uses a combination of machines and humans to reduce information to its most important pieces, so we can easily read it and understand its general meaning. Or last.fm: it looks at large amounts of data and presents neighbourhoods of music, so you can quickly get an idea of the people in a neighbourhood as well as other music. This is where the value will be in information management applications, and many companies have realised this. I know a whole bunch of semweb guys will pounce on me for this… but this is why we cannot have a pure Semantic Web style world: free information will always exist and require humans or machines at layer 2 to help us better understand the information.

    The final piece in this puzzle is how our brains actually take these clues and reason about them in a meaningful way… that’s outside the scope of this post… but I implore you to look at Abductive reasoning :) http://en.wikipedia.org/wiki/Abductive_reasoning

    In summary, we need machines to make information easier for us to understand. It is a matter of cognitive economics (and real economics too). More and more of the webapps that are appearing are providing very in depth analysis of the things we do, read, like and dislike on the web and in our lives. The Semantic web is one part of the Smart Web.

    Hope that rambling sprawl doesn’t scare anyone away…. or piss anyone off :)
    A provoking read apparently Elias, thanks for sharing :)

  2. David Novakovic

    Woah, I missed the obvious punch line. Apps that provide a better analysis, understanding and presentation of information will be the ones that provide the most compelling user experience and value in the future. :)

  3. Tim Bull

    Interesting idea…

    I don’t think that there is anything inherently wrong with your value chain at all, I just wonder if there is another layer to this. I’m going to demonstrate this by pulling a couple of examples from other domains.

    1. Mining Industry

    Data Collection — Drilling core samples
    Data Processing — Combining information from the core samples to build a view of the underlying terrain
    Information Generation — Taking the new terrain map and understanding it ie. that particular fold and type of mineral could indicate gold.
    Knowledge Application — Successfully mining the gold.

    2. Academic Research
    Data Collection — Forming a hypothesis and conducting an experiment to collect samples.
    Data Processing — Crunching the numbers to find facts and figures.
    Information Generation — Interpreting the facts and figures and producing a research paper.
    Knowledge Application — Publishing that paper in a journal for peer review.

    OK, so these are quite brief, but I think this kind of value chain has existed for a long time and is not really that new.

    What the internet brings to this is the ability to use these existing value chains in new ways:

    a) It shortens the value chain considerably — I can get my content peer reviewed in lightning speed (ok, maybe not with the same rigour as a scientific journal, but you get the point).
    b) Different chains easily cross knowledge domains; the Internet allows value chains to spawn in new ways — mashups are about merging different value chains across applications into new methods of consuming the information.

    I’m still wondering if there is a fifth element here: some kind of meta-information or knowledge (I guess part of my (b) above) that is the new link the Internet brings to the traditional elements.

  4. Elias Bizannes

    Dave: Solid post! I had to read that a few times to take it all in, thank you. I’m not really sure, though, how I can apply that to the value chain I proposed. Your argument is not so much that the fourth stage is wrong, but that it’s the most important. I agree! But I’m not sure if this means we need to add another stage?

    Tim: I am writing this more in the context of information companies like web services. So I would say your mining example is not really a good example - there is a difference between information as a process and information as a product. But you are right - the Internet has transformed it. But I don’t think looking at the Internet specifically is required - the Internet is merely the technology enabler to do more powerful things. In terms of the actual value chain, it’s not its own “stage” - it’s more like the internal organs of the chain.

  5. David Novakovic

    Hahha Elias, you’re exactly right :) The point I was getting at is that many of the points in your model are becoming commodities. Data is very much available, as is storage. Amazon S3 and related services are testament to that. There will be more and more players, and costs are dropping rapidly.

    The value moving forward will be in the more semantically aware applications that provide real value to users rather than just being the “bookshelf.” Of course this semantic layer will become a commodity eventually too; that’s why we need to move on our businesses early, while we are still leading the charge.

    What will come after that? I’m not too sure, but I suspect it will be systems that not only aid us, but actually can reason for us! Technology is encroaching on our turf fast, watch your back. :)

  1. david petar novakovic: attempted axiomatisation
