
Database cache causes segfaults

User cap10morgan
Date 2/11/2008 11:55 pm
Severity Minor (annoying, but not harmful)
Version WebGUI Stable 7.4.14 / 0.8.0
Views 374
Rating 6
Karma Rank 0.000000
cap10morgan

When using the database cache (WebGUI::Cache::Database), mod_perl processes routinely segfault when trying to access the cache. Clearing the cache does not help, nor does restarting mod_perl. Switching to the FileCache does help.

Running myisamchk on the cache table reports that 1 client is using or hasn't closed the table (expected, since the site is live) and that the table is usable but should be fixed. No other errors are reported, so I assume the "should be fixed" message appears only because the table is open, i.e. myisamchk expects the table to be closed and concludes that a client didn't close it properly.

I'll post a reply with sample strace output showing what happens (I saw this on many different sites on my server).

 



nuba

Running myisamchk on the cache table reports that 1 client is using or hasn't closed the table (expected, since the site is live) and that the table is usable but should be fixed. No other errors are reported, so I assume the "should be fixed" message appears only because the table is open, i.e. myisamchk expects the table to be closed and concludes that a client didn't close it properly.

Hi cap10morgan, check this:


Important

You must ensure that no other program is using the tables while you are running myisamchk. The most effective means of doing so is to shut down the MySQL server while running myisamchk, or to lock all tables that myisamchk is being used on.

Otherwise, when you run myisamchk, it may display the following error message:

warning: clients are using or haven't closed the table properly
 

from http://dev.mysql.com/doc/refman/5.1/en/myisamchk.html



cap10morgan
It's taking me longer to get that strace result than I'd hoped. I need to deploy a new test server to do it, since I don't want to re-enable segfaults on my production server. I'll try to get that up here soon.

cap10morgan

I have confirmed that this still occurs on WRE 0.8.1 / WebGUI 7.4.24. Here's the strace output for a site where the database is named "webgui_envco" and it lives on a MySQL server running on 192.168.77.164 (from the WebGUI mod_perl server's perspective):

 

28549 connect(19, {sa_family=AF_INET, sin_port=htons(3306), sin_addr=inet_addr("192.168.77.164")}, 16) = 0
28549 setsockopt(19, SOL_SOCKET, SO_RCVTIMEO, "\2003\341\1\0\0\0\0", 8) = 0
28549 setsockopt(19, SOL_SOCKET, SO_SNDTIMEO, "\2003\341\1\0\0\0\0", 8) = 0
28549 setsockopt(19, SOL_IP, IP_TOS, [8], 4) = 0
28549 setsockopt(19, SOL_TCP, TCP_NODELAY, [1], 4) = 0
28549 setsockopt(19, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
28549 read(19, "C\0\0\0\n5.0.51a-community-log\0001\2439\0u"..., 16384) = 71
28549 write(19, "O\0\0\1\217\242\2\0\0\0\0@\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 83) = 83
28549 read(19, "\1\0\0\2\376", 16384)   = 5
28549 write(19, "\t\0\0\3WKNHLMUD\0", 13) = 13
28549 read(19, "\7\0\0\4\0\0\0\2\0\0\0", 16384) = 11
28549 poll([{fd=19, events=POLLIN|POLLPRI}], 1, 0) = 0
28549 write(19, "\21\0\0\0\3set autocommit=1", 21) = 21
28549 read(19, "\7\0\0\1\0\0\0\2\0\0\0", 16384) = 11
28549 open("/dev/urandom", O_RDONLY|O_LARGEFILE) = 20
28549 read(20, "\340tZ\373", 4)         = 4
28549 close(20)                         = 0
28549 gettimeofday({1203192462, 792868}, NULL) = 0
28549 time(NULL)                        = 1203192462
28549 poll([{fd=19, events=POLLIN|POLLPRI}], 1, 0) = 0
28549 write(19, "\27\0\0\0\3select * from settings", 27) = 27
28549 read(19, "\1\0\0\1\2:\0\0\2\3def\fwebgui_envco\10setti"..., 16384) = 1448
28549 read(19, "viewLength\002309\0\0,!commercePurcha"..., 16384) = 1448
28549 read(19, "his email address exists in our "..., 16384) = 1087
28549 poll([{fd=19, events=POLLIN|POLLPRI}], 1, 0) = 0
28549 write(19, "G\0\0\0\3replace into userSession (s"..., 75) = 75
28549 read(19, "\7\0\0\1\0\1\0\2\0\1\0", 16384) = 11
28549 poll([{fd=19, events=POLLIN|POLLPRI}], 1, 0) = 0
28549 write(19, "\241\0\0\0\3update userSession set admi"..., 165) = 165
28549 read(19, "0\0\0\1\0\1\0\2\0\0\0(Rows matched: 1  Cha"..., 16384) = 52
28549 time(NULL)                        = 1203192462
28549 poll([{fd=19, events=POLLIN|POLLPRI}], 1, 0) = 0
28549 write(19, "\243\0\0\0\3select content from cache w"..., 167) = 167
28549 read(19, "\1\0\0\1\1:\0\0\2\3def\fwebgui_envco\5cache"..., 16384) = 1182
28549 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
28549 chdir("/data/wre/prereqs")        = 0
28549 rt_sigaction(SIGSEGV, {SIG_DFL}, {SIG_DFL}, 8) = 0
28549 kill(28549, SIGSEGV)              = 0
28549 sigreturn()                       = ? (mask now [])

28549 --- SIGSEGV (Segmentation fault) @ 0 (0) --- 
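For anyone reading the trace: the escaped bytes in front of each SQL string are the MySQL client/server packet header (a 3-byte little-endian payload length and a 1-byte sequence number), followed by the command byte, where 0x03 is COM_QUERY. The following is a minimal Python sketch that decodes two of the writes shown above; the function name `decode_packet_header` is my own and not part of any library.

```python
COM_QUERY = 0x03  # MySQL command byte for a plain text query


def decode_packet_header(buf: bytes):
    """Return (payload_length, sequence_id, command_byte) for a client packet."""
    payload_length = int.from_bytes(buf[:3], "little")  # 3-byte little-endian length
    sequence_id = buf[3]                                 # packet sequence number
    command = buf[4]                                     # first payload byte
    return payload_length, sequence_id, command


# strace prints bytes with C-style octal escapes, which Python bytes
# literals accept as well, so the strings can be copied from the trace.
settings_query = b"\27\0\0\0\3select * from settings"
cache_query_prefix = b"\243\0\0\0\3select content from cache w"  # truncated by strace

settings_header = decode_packet_header(settings_query)
cache_header = decode_packet_header(cache_query_prefix)

print(settings_header)
print(cache_header)
```

The decoded payload length of the cache query is 163 bytes; adding the 4-byte header gives 167, which matches the return value of that write() call. So the query appears to have been sent in full, and the SIGSEGV follows the read() of the 1182-byte reply with no system call in between, which suggests the crash happens in user-space code handling the result.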

 

I've seen many examples of this, and every time the segfault happens right after an attempt to select from the cache table. The segfaults stop if I switch to the FileCache, but I can't run on that because my iowait skyrockets.
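To check that pattern across many traces without reading them by hand, a small script can report the last write() each process issued before it received SIGSEGV. This is only a sketch in Python: it assumes the strace line format shown above (PID prefix, as produced with -f), and the function name `last_write_before_segv` is hypothetical.

```python
import re

# Matches e.g.: 28549 write(19, "\243\0\0\0\3select content ..."..., 167) = 167
WRITE_RE = re.compile(r'^(\d+)\s+write\((\d+), "(.*?)"(?:\.\.\.)?, \d+\)')
# Matches e.g.: 28549 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
SEGV_RE = re.compile(r'^(\d+)\s+--- SIGSEGV')


def last_write_before_segv(lines):
    """Return a list of (pid, escaped write payload) for each crash found."""
    last_write = {}  # pid -> payload of the most recent write() by that pid
    hits = []
    for line in lines:
        m = WRITE_RE.match(line)
        if m:
            last_write[m.group(1)] = m.group(3)
            continue
        m = SEGV_RE.match(line)
        if m:
            # pop() so the re-raised SIGSEGV for the same pid is not counted twice
            payload = last_write.pop(m.group(1), None)
            if payload is not None:
                hits.append((m.group(1), payload))
    return hits


# A few lines copied from the trace above (raw strings keep the backslashes).
sample = [
    r'28549 write(19, "\243\0\0\0\3select content from cache w"..., 167) = 167',
    r'28549 read(19, "\1\0\0\1\1:\0\0\2\3def\fwebgui_envco\5cache"..., 16384) = 1182',
    r'28549 --- SIGSEGV (Segmentation fault) @ 0 (0) ---',
    r'28549 kill(28549, SIGSEGV)              = 0',
    r'28549 --- SIGSEGV (Segmentation fault) @ 0 (0) ---',
]

hits = last_write_before_segv(sample)
for pid, payload in hits:
    print(pid, payload)
```

For a real trace file, the same function can be fed the open file object instead of the `sample` list.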



cap10morgan
I should note that this is on a RHEL 4 server (i386) using the pre-compiled WRE (0.8.1) binaries that Plain Black distributes for that platform.

Graham
This may be the result of storing binary data in a text field.  Can you test this with the latest beta release?
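As a generic illustration of why binary data in a text field can go wrong, here is a sketch in Python. It is not WebGUI's code: the cache entry, the use of `pickle` as the serializer, and the UTF-8 round trip that stands in for a character-set-aware text column are all assumptions made for the example.

```python
import pickle

# Hypothetical cache entry; the real cache payload format is not shown in this thread.
entry = {"assetId": "example", "content": "<p>hello</p>"}
blob = pickle.dumps(entry, protocol=4)  # binary serialization, first byte is 0x80

# Simulate a text column that enforces a character set: bytes that are not
# valid UTF-8 get replaced, so what is stored no longer matches the input.
stored_text = blob.decode("utf-8", errors="replace")
read_back = stored_text.encode("utf-8")

survived = read_back == blob

# Try to deserialize what came back out of the "text column".
try:
    pickle.loads(read_back)
    load_failed = False
except Exception:
    load_failed = True

print("payload survived round trip:", survived)
print("deserialization failed:", load_failed)
```

Here the deserializer rejects the altered bytes with an exception. How a particular deserializer reacts to such input varies, which is why retesting with the changed column type is the useful check.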

cap10morgan
Yes, hopefully I can do that soon. Should I specifically test with 7.5.5-beta, or just whatever the most recent 7.5 release is at the time I'm able to perform the test?

Graham
You'll want to test 7.5.6 or greater.  The change hasn't been included in a release yet.
