Pro metadata will lose to folksonomy

Not only does Shirky nail it, but Cory homes in on the money grafs for
us. This is clearly one of a class of problems where scaling issues
overwhelm other factors and force solutions to be somehow distributed.

This is much like the situation in the early days of long-distance
telephone service, when operators were needed to complete all calls. Analyses
at the time predicted that the service would fail because you clearly
were going to need to hire so many operators that the system would
collapse. The solution, in that case, was to effectively make everyone
an operator by inventing direct-dial long distance and area codes. Of
course, we’ve now reached the point where area codes are an anachronism
and have little predictive value about where the phone in question
exists in the physical universe.

Shirky: Pro metadata will lose to folksonomy. Cory Doctorow:
Clay Shirky continues to just totally nail the questions of metadata,
authority, and user-created content. Today’s installment: why crappy,
cheap, user-generated, uncontrolled metadata will win out over
expensive, controlled, useful, professionally generated metadata:

Furthermore, users pollute
controlled vocabularies, either because they misapply the words, or
stretch them to uses the designers never imagined, or because the
designers say “Oh, let’s throw in an ‘Other’ category, as a fail-safe”
which then balloons so far out of control that most of what gets filed
gets filed in the junk drawer. Usenet blew up in exactly this fashion,
where the 7 top-level controlled categories were extended to include an
8th, the ‘alt.’ hierarchy, which exploded and came to dwarf the entire,
sanctioned corpus of groups.

The cost of finding your way through 60K photos tagged ‘summer’,
when you can use other latent characteristics like ‘who posted it?’ and
‘when did they post it?’, is nothing compared to the cost of trying to
design a controlled vocabulary and then force users to apply it evenly
and universally.

This is something the ‘well-designed metadata’ crowd has never
understood — just because it’s better to have well-designed metadata
along one axis does not mean that it is better along all axes, and the
axis of cost, in particular, will trump any other advantage as it grows
larger. And the cost of tagging large systems rigorously is crippling,
so fantasies of using controlled metadata in environments like Flickr
are really fantasies of users suddenly deciding to become disciples of
information architecture.

If you want to trace back to some of the items that launched this most recent discussion, here are some of the key links:
