Archive of UserLand's first discussion group, started October 5, 1998.

Mark Wilcox's recommendation: generate and select

Author: Jon Udell
Posted: 9/10/1999; 11:19:14 AM
Topic: rss channels via email
Msg #: 10895 (In response to 10830)

Mark Wilcox responded over in my newsgroup; I thought it worthwhile to echo his comments here. I've thought of his "generate and select" idea at various times too. Rather than present users with a blank slate, you process their stuff through filters that try to map it into conventional schemes, and then kick back the results for users to confirm or deny, alter or extend.

RSS channel feeding doesn't currently work this way, but it conceivably could. I post my RDF, a categorizing host grabs it, does a best-guess mapping into one or more conventional schemes, and sends me back a URL. When I visit that URL I get to see where it thought the items belonged, and maybe how it thought my user-defined categories should map into (or be replaced by) categories in one or more conventional schemes. Per item, I can agree or disagree; when I submit, my choices are honored according to the channel host's policy.

Definitely more complex. Also, arguably, a lot more useful. Optional, I guess, in that if I don't respond, a policy-defined default occurs and either my choices or the site's prevail.
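
Here is a minimal sketch of how the confirm-or-deny step and the policy-defined default might fit together. Everything in it is an assumption made for illustration: the Proposal record, the resolve function, and the default_to_host flag are hypothetical names, and the rule that an explicit disagreement keeps the author's category is just one policy a channel host could choose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proposal:
    """One item the host has tried to categorize (hypothetical record)."""
    item_title: str
    author_category: str   # category the author supplied
    host_category: str     # host's best guess in a conventional scheme
    author_agrees: Optional[bool] = None  # None means the author never responded


def resolve(proposal: Proposal, default_to_host: bool = True) -> str:
    """Return the category that ends up applied to the item."""
    if proposal.author_agrees is True:
        return proposal.host_category
    if proposal.author_agrees is False:
        # Assumed policy: an explicit "no" keeps the author's own category.
        return proposal.author_category
    # No response: the policy-defined default decides whose choice prevails.
    return proposal.host_category if default_to_host else proposal.author_category


item = Proposal("Scorpion evaluation", "library stuff", "Library science")
print(resolve(item))                         # author silent, host's guess wins
print(resolve(item, default_to_host=False))  # author silent, author's choice wins
```

The only "moving part" the author has to touch is the author_agrees flag; leaving it unset is what makes the whole exchange optional.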

Too many moving parts?

Anyway, here is Mark's comment:

-------------------


Ever heard the phrase that goes something like "Those who don't know UNIX are bound to make a poor imitation of it"? The same thing happens when people who've never had any background in knowledge management (e.g. librarianship) attempt to develop their own information management system.

While I think you are on the right track, you can't let people totally pick what categories their items should be in. This was the biggest problem with HTML meta tags and a big reason why search engines for the most part ignore them.

Instead, you should take an existing organizational scheme in electronic form, such as Dewey or Library of Congress subject headings. Both of these systems cover just about any subject you want, and both are constantly updated and maintained by information professionals from around the world. These systems also have enormous amounts of cross-referencing, something that Yahoo has tried to do but is still working on.

Next, you have people submit their items, and you run their text through a process that uses your information base to present the user with a small list of possible categories.

This increases the likelihood that similar items end up together as opposed to being scattered apart.
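
The "generate" half of that process could be sketched roughly as follows. This is only an illustration under loud assumptions: SCHEME is a made-up, three-entry stand-in for a real scheme such as Dewey or Library of Congress subject headings, suggest_categories is a hypothetical name, and plain keyword counting stands in for whatever matching a real classifier would do.

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Hypothetical stand-in for a real classification scheme: each category
# is described by a handful of indicative terms.
SCHEME: Dict[str, Set[str]] = {
    "Library science": {"library", "catalog", "classification", "librarian"},
    "Computer networks": {"protocol", "tcp", "network", "server"},
    "Operating systems": {"unix", "kernel", "shell", "process"},
}


def suggest_categories(text: str,
                       scheme: Dict[str, Set[str]] = SCHEME,
                       limit: int = 3) -> List[str]:
    """Return up to `limit` candidate categories for the submitted text."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    # Score each category by how often its terms occur in the text.
    scores = {cat: sum(words[term] for term in terms)
              for cat, terms in scheme.items()}
    # Keep only categories that matched; best score first, ties by name.
    matched = [cat for cat, score in scores.items() if score > 0]
    matched.sort(key=lambda cat: (-scores[cat], cat))
    return matched[:limit]


print(suggest_categories("A protocol for library catalog access over TCP"))
```

The user then does the "select" half, picking from the short list rather than inventing a category from a blank slate.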

OCLC, a library entity that provides much of the classification information (e.g. library catalogs) to libraries in the US and around the world, has done a couple of research projects in this area. I urge you to take a look at:

http://www.oclc.org/oclc/research/publications/review96/scorpion.htm

http://www.oclc.org/oclc/research/publications/review97/shafer/eval_scorpion/eval_sc.html

And the Z39.50 protocol (which I understand can now talk over TCP/IP) is designed to provide standardized access to information systems (normally library catalogs, but they don't have to be).

The "wheel" doesn't totally exist yet so you do have to do some inventing on your part, but you don't have to start from scratch :)


This page was archived on 6/13/2001; 4:52:33 PM.

© Copyright 1998-2001 UserLand Software, Inc.