Changeset 16245

Timestamp:
04/25/08 16:38:19 (2 months ago)
Author:
rambo
Message:

Avoid renaming wget's .orig files; make the base options overridable per site as well; remove two scripts that tested parts of the dump_sites functionality and are no longer maintained.

Files:

  • branches/MidCOM_2_8/fi.hut.staticdumps/bin/dump_sites.php

    r16188 r16245
         3      3  ini_set('error_reporting', E_ALL);
         4      4
         5         $wget_options = "-erobots=off -q -m -nH";
         6         $rsync_options = '-a';
         7         $http_timeout = 300; // seconds = 5minutes
         8         $lockfile_path = '/var/run';
         9         $lockfile_prefix = 'fi_hut_staticdumps_';
                5  $defaults = array
                6  (
                7      'wget_options' => '-erobots=off -q -m -nH',
                8      'rsync_options' => '-a',
                9      'http_timeout' => 300, // seconds = 5minutes
               10      'lockfile_path' => '/var/run',
               11      'lockfile_prefix' => 'fi_hut_staticdumps_',
               12  );
        10     13
        11     14  function better_die($msg)
     
        98    101  foreach ($sites_config as $k => $site_config)
        99    102  {
              103      foreach ($defaults as $key => $val)
              104      {
              105          if (isset($site_config[$key]))
              106          {
              107              $$key = $site_config[$key];
              108          }
              109          else
              110          {
              111              $$key = $val;
              112          }
              113      }
       100    114      if (!isset($site_config['url']))
       101    115      {
     
       197    211              foreach($output as $filepath)
       198    212              {
              213                  if (preg_match('/\.orig$/', $filepath))
              214                  {
              215                      // Skip wget --keep-originals .orig files from rename
              216                      continue;
              217                  }
       199    218                  list($filepart, $querypart) = explode('?', $filepath);
       200    219                  $newpath = dirname($filepart) . "/{$querypart}_" . basename($filepart);
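
The two changes to dump_sites.php can be tried in isolation with a minimal sketch like the one below. The helper names `resolve_site_options()` and `is_wget_backup()` are hypothetical and do not exist in dump_sites.php, which inlines this logic; the sketch also returns the merged options as an array instead of assigning them through variable variables (`$$key`) as the changeset does. The site URL is a placeholder.

```php
<?php
// Hypothetical helpers for illustration only; dump_sites.php inlines this logic.

// Build the effective options for one site: a key set in $site_config
// overrides the default, otherwise the default value is used.
function resolve_site_options(array $defaults, array $site_config)
{
    $options = array();
    foreach ($defaults as $key => $val)
    {
        // isset() mirrors the changeset, so a null value falls back to the default
        $options[$key] = isset($site_config[$key]) ? $site_config[$key] : $val;
    }
    return $options;
}

// True for paths ending in .orig, which the rename step must skip.
function is_wget_backup($filepath)
{
    return preg_match('/\.orig$/', $filepath) === 1;
}

$defaults = array
(
    'wget_options' => '-erobots=off -q -m -nH',
    'rsync_options' => '-a',
    'http_timeout' => 300, // seconds
);

// Example site config that overrides only the timeout
$site_config = array
(
    'url' => 'http://example.com/',
    'http_timeout' => 60,
);

$options = resolve_site_options($defaults, $site_config);
echo $options['http_timeout'], "\n";  // overridden by the site config
echo $options['wget_options'], "\n";  // falls back to the default

var_dump(is_wget_backup('index.html?page=2.orig')); // skipped by the rename loop
var_dump(is_wget_backup('index.html?page=2'));      // renamed as before
```

Only keys present in `$defaults` are resolved this way, so site-specific keys such as `url` are left for the existing checks in the loop.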
  • branches/MidCOM_2_8/fi.hut.staticdumps/documentation/USAGE

    r15931 r16245
        18     18        'post_dump_script => '', // optional, the url is passed as argument along with general status indicator exit codes some certain prior operations and dump path are passed as arguments
        19     19      ),
               20
               21  Also if you wish to override some of the more basic settings there are the following config keys and their default values:
               22
               23      'wget_options' => '-erobots=off -q -m -nH',
               24      'rsync_options' => '-a',
               25      'http_timeout' => 300, // seconds = 5minutes
               26      'lockfile_path' => '/var/run',
               27      'lockfile_prefix' => 'fi_hut_staticdumps_',
               28
               29  The command used to execute `wget` is formed from the wget_options internal default concatenated with the `wget_extra_options` (if defined), ditto for `rsync`. `http_timeout` is used when querying for protected/redirection folder list. If you have multiple nodes sharing the load of dumping a ton of sites `lockfile_path` should point to a shared directory they all can write to, `lockfile_prefix` is configurable for completeness sake.
        20     30
        21     31  In the VirtualHost directive of your static apache set the following to handle URLs with GET parameters in them nicely: