knowmad
Date: 7/6/2008 9:08 pm · Subject: Searching external HTML files · Rating: -1
While preparing for my talk next month, I've been looking over the external indexer plugins. I notice that HTML files are simply being cat'd whole. This will cause tags, attributes, styles and other non-text elements to get picked up by the indexer. Is this intentional? I propose using w3m, lynx or a simple Perl script to strip the HTML tags so they aren't indexed.
William
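
To illustrate the kind of tag stripping proposed above, here is a minimal sketch in Python rather than Perl, using only the standard library's `html.parser`. It is not the indexer's actual code; the names `TextExtractor` and `strip_html` are hypothetical, and the choice to skip `<script>` and `<style>` contents is an assumption about what should stay out of the index.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, ignoring tags, attributes, scripts and styles."""

    # Elements whose contents should never reach the indexer (assumption).
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces so adjacent blocks don't run together,
        # then collapse all runs of whitespace to a single space.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    """Return the plain text of an HTML string, suitable for indexing."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```

Calling `strip_html()` on a file's contents before handing them to the indexer would keep markup out of the index. Joining chunks with a space favors keeping neighboring paragraphs apart, at the cost of splitting a word that has inline markup in the middle of it.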
JT
Date: 7/7/2008 8:05 pm · Subject: Re: Searching external HTML files · Rating: -1
You're incorrect about your observation. The HTML is being stripped.
JT