Archive of UserLand's first discussion group, started October 5, 1998.

Re: 130 RSS newsfeeds from Moreover.com

Author: Dave Winer
Posted: 8/31/1999; 4:46:50 PM
Topic: 130 RSS newsfeeds from Moreover.com
Msg #: 10353 (In response to 10352)

Ian, check out the bottom of the Scripting News home page. There's a copyright notice. I think it's pretty clear, and I also think you'll find it on many of the sites you scrape. I also think if you asked permission of the sites they would probably turn you down.

I like your business model; it happens to be the same as mine. But I prefer to let the early days be for the pioneers, the ones who are alert enough to catch on to a trend and support it. I have a DaveNet piece in the pipe about this stuff. In that piece I explain why I don't scrape. I hope you and others give this some thought:

Now, if you know much about web scripting, you have to wonder why we don't just read your HTML file and, through pattern matching, pull out the links and titles of the stories. There are several important reasons.

First, there's a small chance that this is an illegal, or at least unfair, use of your content. It's true that this is basically how search engines work, and no one has yet sued a search engine for indexing a site. But I prefer to build on a format that's solely used for syndication, so the webmaster's intent can't possibly be misunderstood. If you have an RSS file that's registered, you clearly want me to aggregate it. There's no other purpose for an RSS file.

Second, by storing the information in XML, there's no mistaking a non-news link for a news story. Have you ever used a search engine to search for a story on one subject and had it return twenty hits because they contained links to the story? With a separate syndication file, again, there's no mistaking what a link is about. Only news stories belong in an RSS file.
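
To make the second point concrete, here is a minimal Python sketch of reading story titles and links from a syndication file instead of pattern-matching HTML. It is an illustration only, not the aggregator code discussed in this thread; the sample feed, its URLs, and the function name read_stories are all hypothetical, and the feed layout assumes a simple RSS 0.91-style channel.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents, assumed to follow a simple RSS 0.91-style layout.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Hypothetical channel for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/stories/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/stories/2</link>
    </item>
  </channel>
</rss>"""


def read_stories(feed_xml):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    stories = []
    # Only <item> elements are treated as stories; the channel's own
    # <title> and <link> are never mistaken for news links.
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        stories.append((title, link))
    return stories


stories = read_stories(SAMPLE_FEED)
for title, link in stories:
    print(title, link)
```

Because every story sits in its own item element, there is no guessing about which links on a page are navigation, ads, or news.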

Third, we can extend RSS in the future to include other important information. For example, The Motley Fool, which mostly covers publicly traded companies, wants to include a set of ticker symbols with every story. This makes perfect sense for their kind of content, but would not be needed for a site that covers a set of open source development projects, for example. By starting fresh with a new format, just for syndication, we'll have more room to expand the format in the future to respond to opportunities in the market.
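
One way the ticker-symbol idea could look is sketched below in Python. This is an assumption for illustration, not a format defined in this thread: the fin:ticker element, its namespace URL, the placeholder symbols, and the function name read_items are all hypothetical. The point is only that a reader can pick up the extra data when it is present and ignore it when it is not.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the extension elements (not a real specification).
TICKER_NS = "http://example.com/ns/tickers"

# Hypothetical feed: one story carries ticker symbols, the other does not.
SAMPLE_FEED = """<rss version="0.91" xmlns:fin="http://example.com/ns/tickers">
  <channel>
    <title>Example Financial News</title>
    <item>
      <title>Chipmaker earnings preview</title>
      <link>http://example.com/stories/3</link>
      <fin:ticker>AAAA</fin:ticker>
      <fin:ticker>BBBB</fin:ticker>
    </item>
    <item>
      <title>Open source release notes</title>
      <link>http://example.com/stories/4</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return one dict per story, with an optional list of ticker symbols."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        # Extension elements are looked up by their qualified name; stories
        # without them simply get an empty list.
        tickers = [
            t.text.strip()
            for t in item.findall(f"{{{TICKER_NS}}}ticker")
            if t.text
        ]
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "tickers": tickers,
        })
    return items


items = read_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], entry["tickers"])
```

A site that has no use for ticker symbols leaves the extra elements out, and a reader that does not know about them can skip them without breaking.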

Further, I can understand why you *might* want to theorize about the wisdom of some people who are not here, but when you explain why I should like your scraping my site, that's plain disrespect. I'm right here. Please ask. Thanks for listening.


There are responses to this message:


This page was archived on 6/13/2001; 4:52:19 PM.

© Copyright 1998-2001 UserLand Software, Inc.