Archive of UserLand's first discussion group, started October 5, 1998.
Re: Scripts to produced Channel files?
Author: Jamie Scheinblum Posted: 6/17/1999; 9:00:50 AM Topic: Scripts to produce Channel files? Msg #: 7510 (In response to 7486) Prev/Next: 7509 / 7511
Well, here's a prelim version. I'll have a better version after I see Dave's example. Then we can get in feature/syntax sync. Don't use this version; it's not feature complete, and will probably get angry about syntax... A better/real version is forthcoming.
Dave: how much of the scripting news header do I need to implement to stick to the spec?
Here's the input file: -- Hello this is a text document Hello Do you like text documents?
hello this is another text document script stuff --
Here's the output:
-- $ perl parse.pl input.txt
-- Hello this is a text document Do you like text documents? http://www.cnn.com</URL> Hello hello this is another text document stuff http://www.scripting.com</URL> script
And here's the source so far...
-- use HTML::Parser;
### Copyright 1999 Jamie Scheinblum ### Jamie@networked.org ### 6/17/99 ### Working source-code, not for re-distribution
### What do we use to mark the end of a paragraph? ### Make this a regular expression to match
my ($article_mark) = '^
$';
###
{ package Parse; @ISA = qw(HTML::Parser);
my (%link); my ($cur_url); my ($look_for_text) = 0; my ($doc_text);
### Return a reference to the url => link text hash
sub get_links { return \%link; }
sub get_doc_text { return $doc_text; }
### Reset the per-article state
sub clear { $doc_text = ""; %link = (); $look_for_text = 0; }
### Opening tag: remember the href of an anchor
sub start { my ($this) = shift; my ($tag, $attr, $attrseq, $origtext) = @_;
  if ($tag eq "a") { $cur_url = $attr->{href}; $look_for_text = 1; } }
### Text inside an anchor is link text; everything else is body text
sub text { my ($this) = shift; my ($text) = shift;
  if ($look_for_text == 1) { $link{$cur_url} .= $text; } else { $doc_text .= $text." "; } }
### Closing tag: stop collecting link text
sub end { my ($this) = shift; my ($tag,$orig) = @_;
  if ($tag eq "a") { $look_for_text = 0; } } }
my $parser = Parse->new;
print "n"; print "http://www.scripting.com/dtd/scriptingNews.dtd">n"; print "
\n"; \n";foreach my $input_file (@ARGV) { ### For each file on the commandline, process the file
open(INPUT, $input_file) || die "$! : $input_file\n";
### Read the file
while (<INPUT>) {
  ### Strip returns
  s/\n//; s/\r//;
  if (/${article_mark}/) { &item;
  } else { $parser->parse($_); } }
close(INPUT); &item; } print "
sub item { ### New article time my $hash = Parse->get_links(); ### Now print out the xml tags
print "\t
- \n"; print "\t\t
\n"; Parse->clear(); }",Parse->get_doc_text()," \n";foreach my $key (keys(%{$hash})) { print "\t\t\n"; print "\t\t\t
$key \n"; print "\t\t\t$hash->{$key} \n"; print "\t\t\n"; }print "\t
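The output step the script builds up to (one XML element per article, holding the body text plus each link's URL and link text) can be sketched on its own. This is a minimal sketch, not the code from the listing: the element names item, text, link, url, and linetext are assumptions, since the tags in the archived listing did not survive, and the helper names escape_xml and format_item are hypothetical. It also escapes &, <, and >, which is an addition for illustration.

```perl
use strict;
use warnings;

# Escape the three characters that are unsafe in XML character data.
sub escape_xml {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Build one article element from its body text and a url => link text hash.
# The element names used here are assumed, not taken from the original listing.
sub format_item {
    my ($text, $links) = @_;
    my $out = "\t<item>\n";
    $out .= "\t\t<text>" . escape_xml($text) . "</text>\n";
    foreach my $url (sort keys %{$links}) {    # sorted for stable output
        $out .= "\t\t<link>\n";
        $out .= "\t\t\t<url>" . escape_xml($url) . "</url>\n";
        $out .= "\t\t\t<linetext>" . escape_xml($links->{$url}) . "</linetext>\n";
        $out .= "\t\t</link>\n";
    }
    $out .= "\t</item>\n";
    return $out;
}

# Example call using the first article from the sample input.
print format_item(
    "Hello this is a text document Do you like text documents?",
    { "http://www.cnn.com" => "Hello" },
);
```

Sorting the keys is a choice made here so the output is reproducible; iterating over a hash directly returns the links in no particular order.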
There are responses to this message:
- Re: Scripts to produced Channel files?, Jamie Scheinblum, 6/17/1999; 9:02:10 AM
This page was archived on 6/13/2001; 4:50:53 PM.
© Copyright 1998-2001 UserLand Software, Inc.