We've seen the same limitations he lists for "hand collected metadata" - metadata that's manually entered by users. If you give them too many fields (like, say, IMS LOM?) they just won't do it. Or, even worse, they'll do a crappy job. Even CanCore isn't small enough to be filled in efficiently and effectively. Heck, even Dublin Core is too big for most users to regularly enter all fields completely.
I'm guessing there will be some happy medium between hand collected metadata and automated metadata (stuff gleaned from the contents of a file, from a file system, etc... and crunched by a CPU sitting in a closet somewhere). I'd love to see some kind of connection, where a minimal set of hand collected metadata provides enough context that the automated stuff actually becomes useful (this could lead into the vision based research I talked about last week).
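To make that "happy medium" concrete, here's a minimal sketch of the idea in Python. The field names (`title`, `creator`, `format`, `extent`, `date`) are borrowed loosely from Dublin Core, and the merge strategy (hand-entered fields take precedence, automated fields fill the gaps) is my own assumption about how such a system might work, not anything from a real spec:

```python
import mimetypes
import os
from datetime import datetime, timezone

def automated_metadata(path):
    """Glean what we can from the file system alone: no human effort required."""
    stat = os.stat(path)
    mime, _ = mimetypes.guess_type(path)
    return {
        "format": mime or "application/octet-stream",  # roughly Dublin Core 'format'
        "extent": stat.st_size,                        # file size in bytes
        "date": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
    }

def combined_record(path, hand_entered):
    """Merge a minimal hand-entered dict over the automated one.

    Hand-entered values win on conflict; the machine fills in the rest.
    """
    record = automated_metadata(path)
    record.update(hand_entered)
    return record
```

So a user might only ever type `{"title": "Q3 Report", "creator": "J. Smith"}`, and the rest of the record gets crunched out of the file system for free.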