plainblack.com

Creating a Perl utility script to delete orphaned files

User xootom
Date 9/6/2011 5:39 pm
Views 6865
Rating 4
User Message
xootom

I have sites with enormous upload directories, I think as a result of past failed gallery imports. I've been looking for ways to purge all the orphaned files, and I think the best approach may be to build a script that recurses through the uploads directory and checks whether each file exists in the site database.

However, files aren't referenced only by the storageId of FileAsset; there are storage IDs elsewhere, such as in Post. There are other similar-looking fields too, like MapPoint StorageIdPhoto.

Is there a way, using the API, to verify whether a storage location is in use that takes all of these fields into account?

Another way of accomplishing the same thing may be to export all the valid files to a new directory structure, but this still requires a definitive list of used storage locations from the database.
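
For illustration, the scan described in this post can be sketched roughly as follows. This is only a sketch, written in Python for brevity rather than as the Perl utility itself. It assumes a hypothetical layout of uploads/<part1>/<part2>/<storageId>/, and it assumes that `used_ids` (a hypothetical name) has already been filled with every storage ID found in the database. The function name `classify_storage_dirs` is likewise made up for this example.

```python
import os


def classify_storage_dirs(uploads_root, used_ids, depth=3):
    """Split storage directories into (used, orphaned) lists.

    Assumption: storage directories sit exactly `depth` levels below
    uploads_root, e.g. uploads/ab/cd/<storageId>/ for depth=3.
    `used_ids` is a set of storage IDs gathered from the database.
    """
    root = os.path.abspath(uploads_root)
    used, orphaned = [], []
    for dirpath, dirnames, _files in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if level < depth - 1:
            continue  # not deep enough yet, keep descending
        # The subdirectories of this directory are storage directories.
        for name in dirnames:
            target = used if name in used_ids else orphaned
            target.append(os.path.join(dirpath, name))
        dirnames[:] = []  # do not walk into the storage directories
    return sorted(used), sorted(orphaned)
```

A cleanup would then act on the orphaned list, ideally after a dry run that only prints it, while the export alternative would copy the used list into the new directory structure. Either way the result is only as good as the set of used storage IDs.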



perlDreamer

Currently there's nothing built into WebGUI that does that. However, if you look at resetGroupFields in lib/WebGUI/Group.pm, you'll see the beginnings of some code that is very similar to what you need.

That code introspects the definition of every asset type and checks whether any of its fields are group fields. It then checks every group field to see whether it contains the group that has just been deleted. If it does, it changes that group to Admin so that calling Group->new doesn't do bad things.

The code that you want would work very similarly, but it would work on storageIds (Form types File and Image).  In addition to every revision of every asset's properties, it would need to check:

  • DataForm fields (these are versioned, too)
  • Thingy fields
  • User profile fields (photo, avatar, and any custom ones)
  • And probably a lot more places than I can think of off the top of my head.
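
As a rough illustration of that introspection idea (a sketch in Python, not WebGUI code): every source of storage IDs is modelled as a field definition plus its rows, and only fields whose form type is File or Image are read. The dictionary shape, the `fieldType` key, and the function names are assumptions made up for this example; a real script would have to build these pairs from asset definitions, DataForm and Thingy fields, user profile fields, and so on.

```python
# Form types that hold a storage ID (File and Image, per the post above).
STORAGE_FIELD_TYPES = {"file", "image"}


def storage_fields(definition):
    """Return the names of fields whose form type stores a storage ID.

    `definition` is a hypothetical mapping of
    field name -> {"fieldType": ...}.
    """
    return [
        name
        for name, props in definition.items()
        if props.get("fieldType", "").lower() in STORAGE_FIELD_TYPES
    ]


def collect_used_storage_ids(sources):
    """Collect every non-empty storage ID from (definition, rows) pairs.

    Each source stands for one place to check: all revisions of one
    asset type, one DataForm, one Thingy, the user profile table, etc.
    `rows` is an iterable of dicts mapping field name -> value.
    """
    used = set()
    for definition, rows in sources:
        fields = storage_fields(definition)
        for row in rows:
            for field in fields:
                value = row.get(field)
                if value:  # skip NULL / empty storage IDs
                    used.add(value)
    return used
```

The resulting set is what a scan of the uploads directory would be compared against. Any source left out of `sources` would make its files look orphaned, so erring on the side of too many sources is the safer choice.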


arjan
There's no RFE for this yet, right? I'll donate some karma.

Storage IDs are drawn from a different character set than WebGUI IDs, correct? Does that change in WG8? That would make this easier. And if it does change, perhaps a character set could be chosen that lets IDs be used in MySQL without backticks and in CSS without a prefix.

Kind regards,
Arjan.


xootom

arjan wrote: There's no RFE for this yet, right? I'll donate some karma.

Thanks :-)

I've created a corresponding RFE now.



xootom

We've now built the utility script to clean up the uploads directory.

Direct link:
http://www.webgui.org/addons/uploads-file-cleanup-utility-script 



arjan
Cool. I'll try it out.


susanb

W00T!

Can't wait to try it out. Thanks guys!






© 2019 Plain Black Corporation | All Rights Reserved