syncToCdn.pl misses many files  (#10617)
Issue

On a test copy of one of my sites running in EC2, syncToCdn.pl --migrate seems to miss many of my files, to the point where I cannot find a single example of the site emitting a CloudFront URL for any static file.

When I check the /data/domains/... directories, I see that many of the storage locations are missing the .cdn file.

I can let someone into this test environment to debug the issue.

Solution Summary
Comments
Mike_S
0
7/8/2009 8:14 pm
Do you have any examples?  We did one patch to address assets with leading hyphens that I know of...
cap10morgan
0
7/13/2009 12:14 pm
I'm going to see if there is any discernible rhyme or reason to it. In my experience, it seems to miss everything that is actually used somewhere on my site. But it does sync a ton of files that aren't (or else I'm just not running into them by clicking around and then searching the source for CDN URLs).

I can give you access to an EC2 instance that demonstrates the issue.
Mike_S
0
7/13/2009 12:30 pm
Can you either post your config files or send them to me directly (both the WebGUI conf and the S3 bucket config)?

mike schroeder
email: mike@donor.com



cap10morgan
0
7/13/2009 1:09 pm
Sure, here they are.
cap10morgan
0
7/13/2009 2:50 pm
Some stats after analyzing my setup: I have ~20K storage locations, and of those around half synced to the CDN and half did not. I don't see any common denominator between those that did and those that didn't. I think there is some more debugging output I can coax out of the syncToCdn.pl script, so I'm going to try that next and see if it gets me anywhere.
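A count like the one above can be reproduced with a short script. This is only a sketch: it assumes a storage location is any directory exactly three levels below the uploads root (the `depth` parameter is a guess at the layout, adjust it to match yours) and that a synced location contains a file named `.cdn`, as described earlier in this ticket. The function name is made up for illustration.

```python
import os

def split_by_cdn_marker(uploads_root, depth=3):
    """Return (synced, unsynced) lists of storage directories.

    Assumption: a storage location is any directory exactly `depth`
    levels below uploads_root; it counts as synced if it holds ".cdn".
    """
    synced, unsynced = [], []
    root = os.path.abspath(uploads_root)
    base_depth = root.count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        level = dirpath.count(os.sep) - base_depth
        if level < depth:
            continue
        dirnames[:] = []  # do not descend below the storage level
        if ".cdn" in filenames:
            synced.append(dirpath)
        else:
            unsynced.append(dirpath)
    return sorted(synced), sorted(unsynced)
```

Printing `len(synced)` and `len(unsynced)` gives the split, and the `unsynced` list is a starting point for looking for a common denominator.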
cap10morgan
0
7/13/2009 2:59 pm
Oh, hold the phone. I'm using Ubuntu's built-in s3cmd version from 8.04. That's probably way too old. And the log is telling me it doesn't recognize the --config option. So let me fix that first. If it solves this issue (which is likely), I'll close this bug.
cap10morgan
0
7/13/2009 7:36 pm
OK, so I fixed the s3cmd version issue, but it's still not working. It appears that it is only syncing empty storage locations. So I have a bunch of folders in S3 that contain nothing but .cdn and/or .wgaccess. Any storage locations that contain real files don't get uploaded.

From what I can tell, $store->syncToCdn *does* get called on them, but then the files never actually make it into S3, and nothing about them is mentioned in the logfile specified in cdn-s3.py (whereas I do see lines going by about uploading the .wgaccess and .cdn files).

Any ideas of other things to check?
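One way to confirm the "only empty locations get synced" pattern locally is to separate storage directories that hold real files from those that hold only the marker files. A minimal sketch, using the two marker names mentioned above; the function name is hypothetical and nothing here talks to S3.

```python
import os

# Marker file names taken from this thread.
MARKER_FILES = {".cdn", ".wgaccess"}

def has_real_files(storage_dir):
    """True if the directory holds any regular file besides the markers."""
    for name in os.listdir(storage_dir):
        path = os.path.join(storage_dir, name)
        if os.path.isfile(path) and name not in MARKER_FILES:
            return True
    return False
```

Running this over the local storage directories and comparing against a bucket listing would show whether every location with real files is missing from S3.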
Mike_S
0
7/17/2009 12:06 pm
Here are the latest versions of cdn-s3.py and s3del that we have. Also, here is the latest production cdn config snippet:

   "cdn" : {
      "enabled" : 1,
      "url" : "http://cdn.donor.com/assets",
      "sslAlt" : 1,
      "sslUrl" : "https://donorcdn.s3.amazonaws.com/assets",
      "queuePath" : "/data/domains/donor.com/cdn",
      "syncProgram" : "/data/WebGUI/utils/cdn-s3.py 'donorcdn' 'donorcdn/assets' '%s'",
      "deleteProgram" : "/data/dw/bin/s3del 'donorcdn' 'donorcdn/assets' '%s'",
      "extrasCdn" : "http://cdn.donor.com/extras",
      "extrasSsl" : "https://donorcdn.s3.amazonaws.com/extras",
      "extrasExclude" : ["tinymce"]
   },

Notice that we changed to passing three args - this way we can specify a common bucket but put the assets in a separate subdir - keeps the bucket root a little tidier...
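To make the three-argument form concrete, here is a rough sketch of how the syncProgram template might expand into a command line. It assumes the '%s' placeholder is filled sprintf-style with a single path; what WebGUI actually substitutes is not stated in this thread, and the function name and the path in the usage note are made up for illustration.

```python
import shlex

def build_sync_command(template, storage_path):
    """Fill the single '%s' placeholder and split the result into argv.

    Assumption: the template contains exactly one '%s' and no other
    percent signs, as in the config snippet above.
    """
    return shlex.split(template % storage_path)
```

With the template from the snippet and a hypothetical path such as `/tmp/uploads/ab/cd/abcd`, this yields four items: the script, the bucket (`donorcdn`), the bucket plus subdirectory (`donorcdn/assets`), and the path.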

And here is the production s3 config file - don't forget it needs to be named for the bucket (so our bucket of donorcdn becomes donorcdn.s3cfg):

[default]
access_key = ****SECRET******
acl_public = False
bucket_location = US
cloudfront_host = cloudfront.amazonaws.com
cloudfront_resource = /2008-06-30/distribution
default_mime_type = binary/octet-stream
delete_removed = False
dry_run = False
encoding = UTF-8
encrypt = False
force = False
get_continue = False
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase =
guess_mime_type = True
host_base = s3.amazonaws.com
host_bucket = %(bucket)s.s3.amazonaws.com
human_readable_sizes = False
list_md5 = False
preserve_attrs = True
progress_meter = True
proxy_host =
proxy_port = 0
recursive = False
recv_chunk = 4096
secret_key = ******SECRET*******
send_chunk = 4096
simpledb_host = sdb.amazonaws.com
skip_existing = False
use_https = False
verbosity = WARNING
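A quick sanity check for the naming rule and the file contents could look like the sketch below. It assumes the file is plain INI with a [default] section, as shown above; `config_dir` is a hypothetical parameter, since this thread does not say which directory cdn-s3.py reads the file from. Interpolation is switched off because values such as host_bucket contain `%(bucket)s`, which Python's configparser would otherwise try to expand.

```python
import configparser
import os

def load_bucket_config(config_dir, bucket):
    """Read <bucket>.s3cfg from config_dir; return [default] as a dict."""
    path = os.path.join(config_dir, bucket + ".s3cfg")
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        # read() returns the list of files it managed to parse
        raise FileNotFoundError(path)
    return dict(parser["default"])

def missing_credentials(cfg):
    """List credential keys that are absent or empty."""
    return [k for k in ("access_key", "secret_key") if not cfg.get(k)]
```

A FileNotFoundError here usually means the file is not named after the bucket; a non-empty result from `missing_credentials` means the keys were never filled in.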


Also, did you ever do a syncToCdn.pl --migrate?  And is your cron job running?

#Enable this line for synctocdn to work on cron
#be careful as this will generate an e-mail every minute if something is screwed up
* * * * * /data/WebGUI/sbin/run-syncToCdn.sh


I'll be gone for a few weeks on vacation, but hopefully this gives you some ideas to check on...
Mike_S
0
7/17/2009 12:10 pm
Also - we are running s3cmd version 0.9.9 on python 2.5.4 -- I'm pretty sure it will *not* work with python 2.4.x
Mike_S
0
7/17/2009 12:14 pm
Whoops - one last correction - from the s3tools.org site: "Requires Python 2.4 or newer and some pretty common Python modules."
cap10morgan
0
7/17/2009 5:43 pm
The new version exhibits the same behavior, but I think I finally tracked it down. The CDN sync code assumes that all storage location paths are in hex. All newly created ones (as of a recent WebGUI version) are, which is probably why you aren't seeing this bug.

However, our sites started in WebGUI 6.6.5, so they have some *old* storage locations out there. The pre-hex ones are silently skipped by this code.

I think I can fix the bug. I'll work on it and update this thread with my progress.
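To illustrate the failure mode described above (this is not the actual WebGUI code or the patch, just a sketch of the pattern): a filter that accepts only hex-looking IDs drops everything else without a warning. The regex, the function names, and the example IDs are all hypothetical.

```python
import re

# Hypothetical stand-in for a hex-only check on storage IDs.
HEX_ID = re.compile(r"^[0-9a-f]+$")

def select_hex_only(storage_ids):
    """Mimic the bug: non-hex IDs are dropped with no warning."""
    return [sid for sid in storage_ids if HEX_ID.match(sid)]

def select_all(storage_ids):
    """Accept every ID and report which ones are legacy (non-hex)."""
    legacy = [sid for sid in storage_ids if not HEX_ID.match(sid)]
    return list(storage_ids), legacy
```

Given a made-up list like `["7e0a3f", "Xq-legacy_Id"]`, the first function returns only the hex ID, while the second keeps both and flags the legacy one so it can be logged or handled separately.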
cap10morgan
0
7/20/2009 11:01 am
I'll be posting a patch to fix this later today.
cap10morgan
0
7/20/2009 1:42 pm
OK, I've attached a patch against 7.7.15 that fixes this bug. If a dev can review it and make sure it's OK, I can commit it to SVN.
perlDreamer
0
7/23/2009 4:52 pm
Fixed in 7.7.16.
Resolved by perlDreamer
Details
Ticket Status: Resolved
Rating: 0.0
Submitted By: cap10morgan
Date Submitted: 2009-07-08
Assigned To: Mike_S
Date Assigned: 2009-07-08
Assigned By: Graham
Severity: Minor (annoying, but not harmful)
What's the bug in?: WebGUI Beta
WebGUI / WRE Version: 7.7.13 / 0.9.3
URL: use/bugs/tracker/10617
Keywords
Related Files
Ticket History
7/23/2009 9:52 PM: Resolved by perlDreamer
7/9/2009 1:08 AM: Assigned to Mike_S by Graham
7/8/2009 6:37 PM: Assigned to mikes by perlDreamer
7/8/2009 5:48 PM: Ticket created by cap10morgan