plainblack.com

Strip HTML in a SQL Query or Macro

User djs
Date 10/18/2007 12:24 pm
djs

Is there an existing macro or API I can call to strip HTML from a field in a template? Has anyone already written one?

Any tricks besides the API or macros?

I've already searched the WebGUI Help, the forums and Google.


Thanks much,

D. 

--- (Edited on 10/18/2007 12:24 pm [GMT-0500] by djs) ---



arjan

Yes, I have. I'll copy-paste it below and make a proper contribution this evening. I use it in the recover password template, where there's <ul><li><li></li></li></ul> and that is of course not correct. I'll make a proper bug-report of that too this evening. (US Central Daylight time.)

Kind regards,

Arjan. 

Note: the editor in which I am pasting the code might filter out or change some strings in the text below, so this might not work out of the box. If this is a problem, wait a few hours and download it from the "Get Add Ons" section.

package WebGUI::Macro::filter;

#
##-------------------------------------------------------------------
## This macro is Copyright 2007 United Knowledge
## http://www.unitedknowledge.nl/
## Author: Arjan Widlak
## Version: 0.1
## Date: 26th of September 2007
## Licence: GPL http://www.gnu.org/licenses/gpl-2.0.html
##-------------------------------------------------------------------
#

use strict;
use WebGUI::Macro;
use WebGUI::HTML;

=head1 NAME

Package WebGUI::Macro::filter;

=head1 DESCRIPTION

Macro for filtering HTML.

#-------------------------------------------------------------------

=head2 process ( html, [filter] )

I use this macro in the Password Recovery Template. There the variable
<tmpl_var recoverMessage> returns <ul><li><li>, which produces empty
bullets. The reason is that line 73 of Auth/WebGUI.pm adds list tags
($error .= '<li>'.$i18n->get(3).'</li>';) and so does line 727 of
the same module ($self->recoverPassword('<ul><li>'.$self->error.'</li></ul>');).
The real solution would be to change Auth/WebGUI.pm.

This macro is in fact an interface to WebGUI::HTML::filter.

The filter chosen is "all" from the following in WebGUI::HTML::filter:
Choose from "all", "none", "macros", "javascript", or "most". Defaults
to "most". "all" removes all HTML tags and macros; "none" removes no
HTML tags; "javascript" removes all references to javascript and macros;
"macros" removes all macros, but nothing else; and "most" removes all
but simple formatting tags like bold and italics.

=cut

sub process {
    my $session = shift;
    my $html = shift;

    # Reuse $html rather than redeclaring it with a second "my".
    $html = WebGUI::HTML::filter($html,"all");

    return $html;

}
1;
 

--- (Edited on 18-October-2007 19:54 [GMT+0200] by arjan) ---
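
On the original question about tricks besides the API or macros: a rough plain-Perl sketch of tag stripping is shown here for comparison. It is not how WebGUI::HTML::filter works internally; `strip_tags_naive` is a hypothetical name, and the regex approach is an assumption that only holds for simple markup (it will misbehave if a ">" appears inside an attribute value).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WebGUI: removes anything that
# looks like an HTML comment or tag and keeps the text in between.
sub strip_tags_naive {
    my ($html) = @_;

    # Guard against an undefined argument instead of passing it on.
    return '' unless defined $html;

    $html =~ s/<!--.*?-->//gs;   # drop comments, even across lines
    $html =~ s/<[^>]*>//g;       # drop anything shaped like a tag

    return $html;
}

print strip_tags_naive('<ul><li><li>Wrong password</li></li></ul>'), "\n";
```

The text between the tags is kept, so the doubled list markup from the recover password template is reduced to the message itself. For anything beyond simple markup, a real parser or the WebGUI filter is the safer choice.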



djs

 

Thanks!

I implemented this as a macro called "StripHTML".

I enabled it in my conf, along with "TagFilter", which it seems to use (though maybe not as a macro).

  1. When I run it like this:

    StripHTML("<b>test</b>");

    It returns nothing: either it removes the text between the tags as well, or it is not returning anything at all.
  2. I have these errors in my httpd error log:

    [Thu Oct 18 14:35:21 2007] -e: Use of uninitialized value in subroutine entry at /data6/WebGUI/lib/HTML/TagFilter.pm line 319.
  3. I am uncertain how to use it with a template variable.  Is it like this (leaving out first char to prevent parsing problems in forum):

    StripHTML(<tmpl_var row.field.content.value>);

    Or like this:

    <tmpl_var StripHTML( row.field.content.value);>
  4. Is there a way to call TagFilter directly?

Thanks for any further assistance!

D. 

--- (Edited on 10/18/2007 2:36 pm [GMT-0500] by djs) ---

--- (Edited on 10/18/2007 3:06 pm [GMT-0500] by djs) ---



arjan

Hi djs,

I've posted the macro and its documentation here.

You also asked if you can run this macro from the command line. You can run any macro from the command line with this utility that JT posted the day before yesterday:

Macro runner.

Kind regards,

Arjan. 

--- (Edited on 18-October-2007 23:56 [GMT+0200] by arjan) ---

--- (Edited on 18-October-2007 23:58 [GMT+0200] by arjan) ---



djs
Arjan,

OK, I followed your submission post - but I still got the same error in the apache error log:

 -e: Use of uninitialized value in subroutine entry at /data6/WebGUI/lib/HTML/TagFilter.pm line 319. 

I had to make the following changes to the code to get it working (not sure if that's because I'm on 6.2.11).

I had to retrieve the input variable like this:

  my ($html) = WebGUI::Macro::getParams(shift); 

which replaced: 

 # my $session = shift;
 # my $html = shift;

and I couldn't use the same variable name for the filtered html, so I did this:

  my $htmlfiltered = WebGUI::HTML::filter($html,"all");
   
  return $htmlfiltered;

This is MUCH better than before.

My only challenge is that it does not remove special codes like "&nbsp;". Does anyone know how I would do that?

--- (Edited on 10/18/2007 9:04 pm [GMT-0500] by djs) ---



djs

 

OK, I found that the html2text routine deals with these codes, so I tried it instead of filter:

    my $htmlfiltered = WebGUI::HTML::html2text($html);

But "&nbsp;" becomes a question mark.

I found lots of info on the "proper" way to replace all HTML "entities" (http://www.english.uga.edu/humcomp/perl/regexps.html), but my content only seems to include the non-breaking space, so I hacked it like this (don't look if you hate hacks):

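    my $htmlfiltered = WebGUI::HTML::filter($html,"all");
    # s///g changes $htmlfiltered in place; assigning its return value
    # back would store the replacement count instead of the text.
    $htmlfiltered =~ s/&nbsp;/ /g;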

Fyi,

D. 

 

--- (Edited on 10/18/2007 11:06 pm [GMT-0500] by djs) ---
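
If more than the non-breaking space turns up, the same substitution idea extends to a small lookup table. The following is a minimal sketch in plain Perl, not part of WebGUI: `decode_basic_entities` and the `%ENTITY` table are hypothetical names, only a handful of entities are covered, and it assumes the tags have already been stripped (for example by WebGUI::HTML::filter with "all").

```perl
use strict;
use warnings;

# Hypothetical lookup table with a few common named entities.
# The non-breaking space is mapped to a plain space on purpose.
my %ENTITY = (
    nbsp => ' ',
    amp  => '&',
    lt   => '<',
    gt   => '>',
    quot => '"',
);

# Hypothetical helper: decodes decimal numeric entities and the named
# entities listed above in a single pass, so already-decoded text is
# never decoded a second time. Unknown names are left untouched.
sub decode_basic_entities {
    my ($text) = @_;
    return '' unless defined $text;

    $text =~ s{&(?:#(\d+)|(\w+));}{
        defined $1           ? chr($1)
      : exists $ENTITY{$2}   ? $ENTITY{$2}
      :                        "&$2;"
    }ge;

    return $text;
}

print decode_basic_entities('Tom&nbsp;&amp;&nbsp;Jerry'), "\n";
```

Inside a macro this would be called on the result of the filter step. For full coverage of every named entity, a dedicated module is a better fit than a hand-written table.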





