Archive of UserLand's first discussion group, started October 5, 1998.

Re: Define scraping (and other terms)

Author: Samuel Reynolds
Posted: 9/2/1999; 2:02:13 PM
Topic: Define scraping (and other terms)
Msg #: 10475 (In response to 10455)

Can someone define "scraping"? How is it different from "crawling"?

Here are my definitions. YMMV.

Crawling: Traversing web pages and following links to index them (for search engines) or to download them en masse for off-line reading or review. Note that crawling does not imply repackaging or public re-use of the crawled pages, other than indexing them to make them easier to find.
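To make the crawling definition concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the PAGES dictionary is a made-up, in-memory stand-in for a web site (so nothing is fetched over the network), and the names crawl and LinkAndTextParser are hypothetical. It follows links breadth-first and builds a word index, without repackaging the pages themselves.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "/index": '<a href="/a">A</a> <a href="/b">B</a> welcome home',
    "/a": '<a href="/b">B</a> apples and pears',
    "/b": '<a href="/index">home</a> bananas and pears',
}


class LinkAndTextParser(HTMLParser):
    """Collects link targets and visible words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(start, fetch=PAGES.get):
    """Breadth-first traversal that builds a word -> set-of-URLs index."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unknown page: skip it
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index
```

Calling crawl("/index") visits all three pages once each; the result maps each word to the pages it appears on, which is the "indexing" part of the definition.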

Scraping: Traversing web pages and following links in order to repackage or reformat their content for different presentation. Note the probable loss of context and possible loss of attribution of the material.
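A rough sketch of the scraping side, again in Python and again under assumptions: the PAGE markup, the choice of h2 as the "headline" tag, and the names HeadlineScraper, scrape_headlines and repackage are all invented for illustration. It covers only the extract-and-repackage step on a single page and leaves out the link-following.

```python
from html.parser import HTMLParser

# Hypothetical source page; the markup and tag choice are assumptions.
PAGE = """
<html><body>
<h1>Example News</h1>
<h2>First story</h2><p>By A. Writer. Body text.</p>
<h2>Second story</h2><p>By B. Writer. More text.</p>
</body></html>
"""


class HeadlineScraper(HTMLParser):
    """Pulls out the text of every <h2> and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_headline = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_headline = False

    def handle_data(self, data):
        if self.in_headline:
            self.headlines[-1] += data


def scrape_headlines(html):
    """Return the stripped text of each headline, in page order."""
    parser = HeadlineScraper()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headlines]


def repackage(headlines):
    """Reformat the scraped headlines as a plain-text bullet list."""
    return "\n".join("* " + h for h in headlines)
```

Note what repackage(scrape_headlines(PAGE)) leaves behind: the site name and the bylines are gone, which is the loss of context and attribution mentioned above.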

The term comes from screen-scraping, in which dumb-terminal I/O forms are "scraped" from a virtual screen (intercepted by a program pretending to be a dumb terminal) and user keystrokes are emulated to provide input to the forms. Screen-scraping is used mainly with legacy mainframe applications, to avoid having to rewrite or replace them. It allows the creation of GUI or automated interfaces into unmodified legacy applications.
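For the read side of screen-scraping, a minimal Python sketch might look like the following. Everything specific here is an assumption: the SCREEN contents, the field positions in FIELDS, and the function names are invented, and the keystroke-emulation side is not shown. The rows are built programmatically so the column positions are unambiguous.

```python
# Hypothetical 80-column terminal screen, captured as one string per row.
SCREEN = [
    "CUSTOMER INQUIRY".ljust(80),
    ("NAME: " + "JANE DOE".ljust(20) + "ACCT: " + "00123456").ljust(80),
    ("BALANCE: " + "1,250.00".rjust(13)).ljust(80),
]

# Assumed field layout: name -> (row, start column, width).
FIELDS = {
    "name": (1, 6, 20),
    "account": (1, 32, 8),
    "balance": (2, 9, 13),
}


def read_field(screen, row, col, width):
    """Cut a fixed-position field out of the screen and trim the padding."""
    return screen[row][col:col + width].strip()


def scrape_screen(screen, fields=FIELDS):
    """Turn a captured screen into a dictionary of named field values."""
    return {name: read_field(screen, *pos) for name, pos in fields.items()}
```

A GUI or batch job could then work with the dictionary returned by scrape_screen(SCREEN) instead of the raw screen, leaving the legacy application untouched.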




This page was archived on 6/13/2001; 4:52:22 PM.

© Copyright 1998-2001 UserLand Software, Inc.