plainblack.com

Email address validation

User frodwith
Date 6/3/2009 1:09 pm
Views 1753
frodwith

Colin has very helpfully put in some improved email address validation recently.  Seems like duplication of effort, though.


For normal email addresses, Colin's regex is about 6 times faster (from my simple, probably wrong benchmark) than Email::Valid's rfc822 method. 

On the other hand, Email::Valid's check is correct, and Colin's will reject technically valid (though extremely uncommon) addresses.

Since Email::Valid still only takes about 0.021 ms to do a check on my Macbook (on average), I would contend that's very much fast enough, and the speed difference is between a speck of dust and a slightly larger (but actually correct) speck of dust.
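
For anyone who wants to reproduce a comparison like this, a minimal sketch using the core Benchmark module might look as follows. It is not the benchmark that produced the numbers above. The regex is the Account/Inbox pattern quoted later in this thread (which pattern was actually benchmarked is an assumption), the sample addresses are made up, and Email::Valid is only timed if it happens to be installed.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Pattern quoted later in this thread (from Account/Inbox). Whether this is
# exactly what was benchmarked above is an assumption.
my $regex = qr/^([0-9a-zA-Z]+[-._+&])*\w+@([-0-9a-zA-Z]+[.])+[a-zA-Z]{2,7}$/;

# Sample addresses are invented for illustration only.
my @addresses = qw(user@example.com first.last@mail.example.org a+b@example.co.uk);

# Each candidate counts how many sample addresses it accepts.
my %tests = (
    regex => sub { my $n = grep { $_ =~ $regex } @addresses; return $n },
);

# Email::Valid is not a core module, so only time it when it can be loaded.
if ( eval { require Email::Valid; 1 } ) {
    $tests{email_valid} = sub {
        my $n = grep { Email::Valid->rfc822($_) } @addresses;
        return $n;
    };
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, \%tests );
```

The absolute rates depend heavily on the machine and the sample addresses, so only the ratio between the two rows is worth reading.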

What do you all think?  Should we be using Email::Valid instead, or Colin's regex?



JT
Speed isn't the only consideration. You're also adding a prereq to an already large set of prereqs. So how much memory does this new prereq (and any of its prereqs) use?


frodwith

After loading preload.perl in WebGUI beta, Email::Valid adds 72K of memory usage.



JT
That seems like an awful lot of memory for something so simple, because we're not talking 72k; we're talking 72k times the number of mod_perl processes you have running.

--

Plain Black, makers of WebGUI
http://plainblack.com

JT Smith  ph: 703-286-2525 x810  fx: 312-264-5382
Create like a god. Command like a king. Work like a slave.


frodwith

Hang on, I thought the entire reason we do the module preloading dance is to get shared versions of modules amongst mod_perl processes.  If we load this before we fork, it's 72k, period, right?



JT
You're right. I'm sorry. I'm more used to talking about memory consumed by a particular method than by entire modules, so I forgot that.
Regardless, I'm not sure the extra 72k of memory gets us anything useful.

perlDreamer

"Colin has very helpfully put in some improved email address validation recently.  Seems like duplication of effort, though."

A couple of notes on intent, and history:

My goal in putting the email regex into WebGUI::Utility was easier maintenance.  For a while, we've had two email validation regular expressions:

  • /^[0-9a-z._%+-]+@(?:[0-9a-z-]+\.)+[a-z]{2,9}$/i  (from Form::Email)
  • qr/^([0-9a-zA-Z]+[-._+&])*\w+@([-0-9a-zA-Z]+[.])+[a-zA-Z]{2,7}$/  (from Account/Inbox)

(There may be other individual validators out there. I haven't looked yet, but I will keep an eye out and try to unify them all.)

The first has been used for over 5 years with no known complaints on email addresses.  I went with the 2nd because it allows fewer illegal email addresses.
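
To make the difference between the two patterns concrete, here is a small sketch that runs both over a handful of addresses. The patterns are copied from the list above; the sample addresses and the `accepts` helper are hypothetical and chosen only to show where the patterns disagree with each other or with the RFC.

```perl
use strict;
use warnings;

# The two patterns from the list above, copied verbatim.
my %validators = (
    'Form::Email'   => qr/^[0-9a-z._%+-]+@(?:[0-9a-z-]+\.)+[a-z]{2,9}$/i,
    'Account/Inbox' => qr/^([0-9a-zA-Z]+[-._+&])*\w+@([-0-9a-zA-Z]+[.])+[a-zA-Z]{2,7}$/,
);

# Hypothetical sample addresses, picked to show where the patterns differ.
my @samples = (
    'user@example.com',           # ordinary address
    '.user@example.com',          # leading dot, not legal in an unquoted local part
    'first..last@example.com',    # consecutive dots, not legal unquoted
    "o'brien\@example.com",       # apostrophe is legal in a local part
    'root@localhost',             # no dot in the domain
);

# Returns 1 if the named pattern accepts the address, 0 otherwise.
sub accepts {
    my ( $name, $address ) = @_;
    return $address =~ $validators{$name} ? 1 : 0;
}

# Print one line per address with the verdict of each pattern.
for my $address (@samples) {
    printf "%-26s %s\n", $address,
        join ', ',
        map { "$_: " . ( accepts( $_, $address ) ? 'accept' : 'reject' ) }
        sort keys %validators;
}
```

In this sketch the Form::Email pattern accepts the leading-dot and double-dot addresses while the Account/Inbox pattern rejects them. Both reject the apostrophe address and root@localhost.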



preaction

"No known complaints"

I've always hated it, because I can't do something as simple as "root@localhost".

Disregarding the argument over what is and is not an e-mail address, my opinion is that I do not want to keep coming back to this validation routine when there is a problem. I want one that works perfectly for all variations of e-mail addresses, or I want to send an e-mail to validate an e-mail address. Those are the only two possibilities that would ensure a proper e-mail address is entered and put the entire issue behind us.

70k is a bit, sure. Is it bigger or smaller than the XML or HTML parsers we rely on? Even HTML is easier to parse with a regular expression than RFC822 e-mail addresses (though I don't recommend regular expressions for either purpose).

We can put this out of our hands and rely on the CPAN authors' expertise, or we can continue to have questions about our e-mail validation routine (and a small percentage of people getting frustrated over it).

 



JT
I don't have a problem with this one particular module, and its 72k. What I have a problem with is we keep adding modules for this and that and the other thing. So presuppose for a moment that we could find a module for each of the 70 form controls that come with WebGUI (right now, and you know it will grow). Let's say they average 50k apiece, because email validation is probably bigger than most types of validation. Now we've just added 3.5MB to our master process size. And that's just form control validation. And you know that each of those modules grows in each process over time, because they all have data passing through them. Sure, it's cleaned up and reused (if the module doesn't leak), but it's never released back to the OS until Apache is dead.

What I'm getting at here is that we have to use a little bit of judgement on each new module. It's not "I found a module, let's use it." It's "Does the functionality the module gives us outweigh the amount of memory the module uses?" The answer here is no. The current regex handles 99.8% of the cases. As for root@localhost, deal with it.
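
The estimate above works out as follows. This is only a sketch of the arithmetic: the 70 controls and the 50k average are the rough figures from this post, not measurements, and `total_kb` is a hypothetical helper.

```perl
use strict;
use warnings;

# Figures from the post above; both are rough estimates, not measurements.
my $form_controls = 70;    # form controls currently shipped with WebGUI
my $kb_per_module = 50;    # assumed average size of one validation module, in K

# Total added to the parent process if every control pulled in its own module.
sub total_kb {
    my ( $count, $kb_each ) = @_;
    return $count * $kb_each;
}

my $total = total_kb( $form_controls, $kb_per_module );
printf "%d modules x %dK = %dK (about %.1fMB at 1000K per MB)\n",
    $form_controls, $kb_per_module, $total, $total / 1000;
```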


koen

"I don't have a problem with this one particular module"

Which is the subject of this discussion.

"What I have a problem with is we keep adding modules for this and that and the other thing."

Which is another discussion altogether.

Do you agree that (aside from the point that every Perl module we add needs separate judgement) using this presumably perfect Perl module should take precedence over a self-made regular expression?

If this were the first time this functionality was needed, would you have considered the 72k increase in memory an issue?

Koen de Jonge - ProcoliX
http://www.procolix.com
Hosting - WebGUI - Virtualization






© 2010 Plain Black Corporation | All Rights Reserved