Archive of UserLand's first discussion group, started October 5, 1998.
Re: GDBs and memory
Author: Lixian B. Chiu Posted: 2/23/1999; 9:09:22 AM Topic: GDBs and memory Msg #: 3153 (In response to 3122) Prev/Next: 3152 / 3154
I modified the search engine to index Chinese pages, and so far it works fine, but I've hit a major roadblock. I am working on a project to index the 25 official books of Chinese history, and they are huge: the total will be somewhere around 800,000,000 Chinese characters, and since each Chinese character is 2 bytes, that comes to about 1,600,000,000 bytes of ASCII-sized characters.

The problem is that I constantly run out of memory when I try to index the site. I have set the memory allocation to around 50 MB (I use a Mac), but I have never gotten it to index more than 2,000,000 Chinese characters. I checked my code very carefully and didn't find anything that would cause a memory leak. So I decided to run a test of the built-in "English" indexer on a large site (I wrote a script to translate some of the Chinese pages into meaningless English pages). It turns out that even the built-in indexer has the same problem when indexing an extremely large site.

Any help?
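To make the scale of the problem concrete, here is a quick back-of-the-envelope sketch of the numbers in the post (Python used only for the arithmetic; the 50 MB figure is the Mac memory allocation mentioned above):

```python
# Size of the full corpus, as stated in the post.
chinese_chars = 800_000_000   # total characters across the 25 books
bytes_per_char = 2            # a 2-byte CJK encoding, as the post assumes
total_bytes = chinese_chars * bytes_per_char
print(total_bytes)            # 1,600,000,000 bytes of raw text

# How far the indexer actually got before running out of memory.
allocation_bytes = 50 * 1024 * 1024   # ~50 MB allotted to the app
indexed_chars = 2_000_000             # the most that was ever indexed
indexed_bytes = indexed_chars * bytes_per_char
print(indexed_bytes)                  # only ~4 MB of source text
```

Note what the second figure implies: roughly 4 MB of source text exhausts a 50 MB allocation, so the in-memory index structures are evidently many times larger than the text they index, which would explain why the built-in indexer fails on an extremely large site even without a leak.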
There are responses to this message:
This page was archived on 6/13/2001; 4:48:04 PM.
© Copyright 1998-2001 UserLand Software, Inc.