diff --git a/README.md b/README.md index 2195285..b8d3f8e 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,5 @@ # mirror-sync -A tool to mirror repostories for Linux and other similar tools. This tool is designed to help follow upstream mirror instructions, and implement the features they expect from a downstream official mirror. It also includes features to help keep you in the loop in case of situations that need manual intervention. +A tool to mirror repositories for Linux and other similar tools. This tool is designed to help follow upstream mirror instructions, and implement the features they expect from a downstream official mirror. It also includes features to help keep you in the loop in case of situations that need manual intervention. ## Configuration It is suggested that you mirror using a sub user account, this tool prevents execution as root to protect you. Once you have an user account dedicated to mirror activities, you can make the log directory, configure logrotate, and add a configuration file to define configurations. @@ -29,7 +29,7 @@ The configuration file is in `/etc/mirror-sync.conf` and is formatted in bash. ## Main configurations ### MODULES -The available modules separated by space. Each module is a separate repostory to sync, and this list allows the script to know how to find their configs. +The available modules separated by space. Each module is a separate repository to sync, and this list allows the script to know how to find their configs. ### TRACEHOST The hostname to show in trace project files, it defaults to the FQDN hostname of the server. @@ -47,7 +47,7 @@ If you wish to override where logs are stored, the default is `/var/log/mirror-s Timeout before a sync is cancelled, defaults to `timeout 1d` which should work for most mirrors. ### max_errors -How many errors before an email is sent regarding the issue. This allows you to ignore anomolies. +How many errors before an email is sent regarding the issue. This allows you to ignore anomalies. ### upstream_max_age If the upstream last modified date is older than the defined number of seconds, the upstream check will skip syncing. Default is 5 hours. @@ -98,7 +98,7 @@ How did the sync occur, cron job or manually via ssh? This is auto detected and Path to save a grand total of each disk usage sum in human readable form. ### dusum_kbytes_total_file -Path to save a grand total of each disk usage sum in killo bytes. +Path to save a grand total of each disk usage sum in kilobytes. ## Module specific configurations Each module is configured via configurations prefixed by the module name. The one configuration used by all modules is the `_sync_method` configuration which defines what sync method to use. Each sync method has different configurations available. The default sync method is rsync. @@ -114,7 +114,7 @@ Each repo has at bare minimum the following configurations: Synchronizes a git repository via git pull. To use this method, you need to have the git package installed. #### options -Extra options appended to `get pull`. +Extra options appended to `git pull`. #### Example ```bash @@ -132,7 +132,7 @@ The bucket URL to sync with. #### aws_access_key The access key for the s3 bucket. -### aws_secret_key +#### aws_secret_key The secret for the s3 bucket. #### aws_endpoint_url @@ -160,14 +160,14 @@ The bucket URL to sync with. #### aws_access_key The access key for the s3 bucket. -### aws_secret_key +#### aws_secret_key The secret for the s3 bucket. #### aws_endpoint_url If you are using a third party S3 compatible service, you can enter their endpoint URL here in format of HOSTNAME:PORT. #### options -Extra options to append to `s5cmd`. +Extra options to append to `s3cmd`. #### Example ```bash @@ -199,7 +199,7 @@ The bucket URL to sync with. You must end the bucket url with `*` for s5cmd to w #### aws_access_key The access key for the s3 bucket. -### aws_secret_key +#### aws_secret_key The secret for the s3 bucket. #### aws_endpoint_url @@ -275,9 +275,9 @@ If your repo needs a 2 stage rsync, define some options here. The most basic opt A hook to run prior to the second stage sync. #### upstream_check -An http URL to check the last modified date as a reference for if the upstream mirror was possibly modified recently. This option is mainly here to lower the impact on upstream mirrors so that mirrorning happens less often. See `upstream_timestamp_min` and `upstream_max_age` for global configuration options of this check. +An http URL to check the last modified date as a reference for if the upstream mirror was possibly modified recently. This option is mainly here to lower the impact on upstream mirrors so that mirroring happens less often. See `upstream_timestamp_min` and `upstream_max_age` for global configuration options of this check. -### time_file_check +#### time_file_check Name of a time file to check if the upstream has updated before syncing all files to reduce load on upstream mirrors. #### report_mirror @@ -334,7 +334,7 @@ example_type="iso" ### qfm Quick Fedora Mirror is a tool to help Fedora mirrors distribute changes faster and save on resources when trying to discover what needs to be synced. To use this method, you must have both the rsync and zsh package installed. This tool automatically downloads QFM if you do not already have it installed. -This tool requires that the upstream mirror has an module with sub modules designed for use with quick-fedora-mirror. You can use this tool with non-fedora mirrors, however they must follow the fedora module configurations. For fedora mirrors, you can utilize [tier 1 mirrors](https://fedoraproject.org/wiki/Infrastructure/Mirroring/Tiering#Tier_1_mirrors). +This tool requires that the upstream mirror has a module with sub modules designed for use with quick-fedora-mirror. You can use this tool with non-fedora mirrors, however they must follow the fedora module configurations. For fedora mirrors, you can utilize [tier 1 mirrors](https://fedoraproject.org/wiki/Infrastructure/Mirroring/Tiering#Tier_1_mirrors). You can list modules available on an rsync server with: ```bash @@ -405,7 +405,7 @@ example_type=rpm ## CLI Options There are not that many cli options available, usage is as follows: ``` -[--help|--update-support-utilities] {module} [--force] +[--help|--update-support-utilities|--version] {module} [--force] ``` ## Requirements list @@ -449,7 +449,7 @@ This tool utilizes the same config file as mirror-sync, and shares the following * timestamp - Used for sync time. * dusum - Used for disk usage summary. -The tool also adds the following repo +The tool also adds the following repo-specific configurations: ### section What section to associate the repo with. @@ -460,7 +460,7 @@ A title for the repo to show instead of the directory name. ### repo_icon The repo icon, will default to tux if not defined. The icon can be defined as an http(s) link, file path, a file stored in the template directory, or png image name from [Dashboard Icons](https://github.com/walkxcode/dashboard-icons/tree/main/png). The script will automatically make a copy or download the icon to the image folder. -### repo_descriotion +### repo_description A description to show at the bottom of the repo card. ### repo_skip @@ -476,6 +476,8 @@ If you do not have a timestamp file with the UNIX timestamp of the last sync, bu If you have a repo that is not synced via the mirror-sync, but want to customize its look on the generated index.html. You can define a list of custom modules with the `CUSTOM_MODULES` variable, then define any of the following configurations. * repo +* timestamp +* dusum * section * repo_title * repo_icon @@ -494,7 +496,7 @@ example_repo='/home/mirror/http/' example_section="official" example_repo_title="Test repo" example_repo_icon="terminal.png" -example_repo_descriont="Test, this is a test." +example_repo_description="Test, this is a test." example2_repo='/home/mirror/windows/' example2_repo_icon="windows.png" @@ -526,7 +528,7 @@ A name for the global footer generation. MIRRORS="mirror_example" mirror_example_path="/home/mirror/mirror_docroot" -mirror_example_name="My company" +mirror_example_title="My company" mirror_example_logo="http://example.com/logo.png" mirror_example_description="A public mirror provided by this cool company." mirror_example_provider_site="http://www.example.com/" @@ -537,11 +539,11 @@ mirror_example_provider_name="Company" You can define multiple sections for the index.html with `SECTIONS` variable, it defaults to `official unofficial`. You can then set a default section with `section_default`, which defaults to `unofficial`. A title is auto generated as `{SECTION} Mirrors`, which you can customize with a variable named `section_{SECTION}_title`. ## Templates -Where templates are stored is configured by `template_dir` which defaults to `/usr/local/share/file-generator-templates`. Default files should be stored under the `default` sub directory, and any customizations to individual mirrors should be saved under a sub directory with that mirror's name. You can add icons/logos into these template directories as well. +Where templates are stored is configured by `template_dir` which defaults to `/usr/local/share/mirror-file-generator/templates`. Default files should be stored under the `default` sub directory, and any customizations to individual mirrors should be saved under a sub directory with that mirror's name. You can add icons/logos into these template directories as well. Default templates: * header.html - The main index header. -* secion.thml - Template for a secion. +* section.html - Template for a section. * repo.html - The repo card template. * footer.html - The footer of the index. * footer.txt - Template for the global footer file. @@ -549,28 +551,28 @@ Default templates: ## Configurations of general defaults. ### index_generate -Rather or not to generate the index.html file. +Whether or not to generate the index.html file. * 1 Enabled -* 0 Disbaled +* 0 Disabled ### index_file_name If your index file name is different, you can adjust here. ### footer_generate -Rather or not to generate a footer file that can be configured as the mirror's global footer. +Whether or not to generate a footer file that can be configured as the mirror's global footer. * 1 Enabled -* 0 Disbaled +* 0 Disabled ### footer_file_name Alternative file name for the footer file. ### dir_sizes_generate -Rather or not to generate directory sizes file. +Whether or not to generate directory sizes file. * 1 Enabled -* 0 Disbaled +* 0 Disabled ### dir_sizes_file_name Alternative file name for directory sizes file. @@ -591,4 +593,13 @@ Where to store logos and icons. The default URL to pull icons from, defaults to [Dashboard Icons](https://github.com/walkxcode/dashboard-icons/tree/main/png). ### icons_default_img -A default file to use if icon or logo defined either isn't defined or isn't accessible. \ No newline at end of file +A default file to use if icon or logo defined either isn't defined or isn't accessible. + +### icons_local_repo +Local path to a cloned copy of the dashboard-icons git repository. When this directory exists, the script serves icons from it instead of fetching them over HTTP, which avoids per-icon network requests. Defaults to `$HOME/dashboard-icons`. + +### icons_repo_url +Git URL used to clone the dashboard-icons repository into `icons_local_repo` if it does not already exist. Set to an empty string to disable automatic cloning. Defaults to `https://github.com/walkxcode/dashboard-icons.git`. + +### icons_repo_refresh +How often (in seconds) the local dashboard-icons clone is pulled for updates. Defaults to `604800` (7 days). \ No newline at end of file diff --git a/mirror-file-generator.sh b/mirror-file-generator.sh index e2c89f9..eb43869 100644 --- a/mirror-file-generator.sh +++ b/mirror-file-generator.sh @@ -10,8 +10,9 @@ PATH="/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:$HOME/.local/ # Variables about this program. PROGRAM="mirror-file-generator" -VERSION="20240219" -PIDFILE="/var/run/$PROGRAM.pid" +VERSION="20260602" +PIDPATH="/tmp" +PIDFILE="${PIDPATH}/${PROGRAM}.pid" LOGFILE="/var/log/mirror-sync/$PROGRAM.log" # Default variables @@ -24,9 +25,12 @@ footer_generate=1 footer_file_name="footer.txt" dir_sizes_generate=1 dir_sizes_file_name="DIRECTORY_SIZES.TXT" -dir_sizes_unknown_path="/home/mirror/dusum/unknown_dirs" +dir_sizes_unknown_path="$HOME/dusum/unknown_dirs" dir_sizes_human_readable=1 icons_dir_name="img" +icons_local_repo="$HOME/dashboard-icons" +icons_repo_url="https://github.com/walkxcode/dashboard-icons.git" +icons_repo_refresh=604800 icons_default_source="https://raw.githubusercontent.com/walkxcode/dashboard-icons/main/png" icons_default_img="tux.png" @@ -108,14 +112,28 @@ image_copy() { local save_path="$path/$icons_dir_name/$file_name.$extension" local http_code - # If the file isn't already saved, attempt to grab it. + # Determine if the saved file needs to be updated. + local needs_update=0 if [[ ! -e "$save_path" ]]; then + needs_update=1 + elif [[ "$file" =~ ^http(s|)\:\/\/ ]]; then + # Re-download remote files if older than the refresh interval. + if (( $(date +%s) - $(stat --format='%Y' "$save_path") > icons_repo_refresh )); then + needs_update=1 + fi + elif [[ -f $file ]]; then + # Re-copy local files if the source mtime differs. + if (( $(stat --format='%Y' "$file") != $(stat --format='%Y' "$save_path") )); then + needs_update=1 + fi + fi + + if (( needs_update )); then # If http, use curl to grab the image. if [[ "$file" =~ ^http(s|)\:\/\/ ]]; then - # If failure, and is not the default image, attempt to grab the default file. # --fail suppresses writing the response body on HTTP errors; rm -f cleans up - # any partial file so the fallback recursion isn't blocked by the [[ ! -e ]] guard. + # any partial file so the fallback recursion isn't blocked by the needs_update check. if ! http_code=$(curl -sf --write-out "%{http_code}" -o "$save_path" "$file") \ || ( ((http_code!=200)) && [[ "$file" != "$icons_default_img" ]] \ && [[ "$file" != "$icons_default_source/$icons_default_img" ]] ); then @@ -123,17 +141,17 @@ image_copy() { image_copy "$icons_default_img" "$file_name" return fi - # If the file exists, copy it. + # If the file exists, copy it preserving mtime. elif [[ -f $file ]]; then - cp "$file" "$save_path" + cp -p "$file" "$save_path" else # Check to see if a template file exists with the file name. local t_file t_file=$(template_file "$file") - # If the file exists, copy it. + # If the file exists, copy it preserving mtime. if [[ -f $t_file ]]; then - cp "$t_file" "$save_path" - elif [[ "$file" != "$icons_default_source/$file" ]]; then + cp -p "$t_file" "$save_path" + elif [[ "$file" != /* ]] && [[ "$file" != "$icons_default_source/$file" ]]; then # If nothing else exists, try grabbing from the default source. image_copy "$icons_default_source/$file" "$file_name" return @@ -199,7 +217,7 @@ while (( $# > 0 )); do # If the mirror wasn't found, quit. if [[ -z $foundMirror ]]; then - echo "Unknown mirror '$1'" + echo "Unknown mirror '$mirror'" echo print_help "$@" fi @@ -230,13 +248,13 @@ if [[ -f $PIDFILE ]]; then # Check if PID is active. if ps -p "$PID" >/dev/null; then - log "A sync is already in progress for ${MODULE} with pid ${PID}." + log "A sync is already in progress (pid ${PID})." exit 1 fi fi # Create a new pid file for this process. -echo $BASHPID >"$PIDFILE" +echo "$BASHPID" >"$PIDFILE" # On exit, remove pid file. trap 'rm -f "$PIDFILE"' EXIT @@ -259,6 +277,28 @@ if (( ${#selected_mirrors[@]} == 0 )); then done fi +# Ensure the local dashboard-icons repo is present and fresh (pulled at most weekly). +if [[ -n $icons_local_repo ]]; then + if [[ -n $icons_repo_url ]] && [[ ! -d "$icons_local_repo/.git" ]]; then + log "Cloning dashboard-icons to $icons_local_repo" + git clone --depth=1 "$icons_repo_url" "$icons_local_repo" \ + || log "Warning: failed to clone dashboard-icons, falling back to remote URLs" + elif [[ -n $icons_repo_url ]] && [[ -d "$icons_local_repo/.git" ]]; then + fetch_head="$icons_local_repo/.git/FETCH_HEAD" + if [[ ! -f "$fetch_head" ]] \ + || (( $(date +%s) - $(stat --format='%Y' "$fetch_head") > icons_repo_refresh )); then + log "Updating dashboard-icons at $icons_local_repo" + git -C "$icons_local_repo" pull --ff-only \ + || log "Warning: failed to update dashboard-icons" + fi + fi + + # Prefer the local clone over the remote URL. + if [[ -d "$icons_local_repo/png" ]]; then + icons_default_source="$icons_local_repo/png" + fi +fi + # Keep track of repos which sizes were updated for to # avoid updating sizes in multi mirror situations. repo_sizes_updated=() @@ -347,7 +387,7 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do eval sync_method="\${${MODULE}_sync_method:-rsync}" # If is this module. - if [[ "${repo:?}" == "$real_dir" ]]; then + if [[ -n $repo ]] && [[ "$repo" == "$real_dir" ]]; then found_repo=1 # If QFM module, we need to determine sub path using QFM logic. elif [[ "${sync_method:?}" == "qfm" ]]; then @@ -391,13 +431,13 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do read_config # If a timestamp file exists, grab and format the date. - if [[ -f ${timestamp:?} ]]; then - repo_sync_time=$(date -d "@$(cat "${timestamp:?}")" '+%c') + if [[ -n $timestamp ]] && [[ -f $timestamp ]]; then + repo_sync_time=$(date -d "@$(cat "$timestamp")" '+%c') fi # If a directory usage summary exists and we're not skipping, parse the size. - if [[ -f ${dusum:?} ]] && ((${repo_skip:-0} == 0)); then - repo_size_kb=$(grep "$real_dir" "${dusum:?}" | awk '{print $1}') + if [[ -n $dusum ]] && [[ -f $dusum ]] && ((${repo_skip:-0} == 0)); then + repo_size_kb=$(grep "$real_dir" "$dusum" | awk '{print $1}') if [[ -n $repo_size_kb ]]; then totalKBytes=$((totalKBytes+repo_size_kb)) repo_size=$(echo "$repo_size_kb*1024" | bc | numfmt --to=iec) @@ -409,12 +449,12 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do if ((found_repo == 0)); then # To allow customization of non synced modules, check each module. - for MODULE in ${CUSTOM_MODULES:?}; do + for MODULE in ${CUSTOM_MODULES:-}; do # Get the repo with trailing slash removed. eval repo="\${${MODULE}_repo%/}" # Confirm if this custom module is this repo, and parse configs if it is. - if [[ "${repo:?}" == "$real_dir" ]]; then + if [[ -n $repo ]] && [[ "$repo" == "$real_dir" ]]; then log "Found custom configurations" read_config # Stage/prod tiers populated by mirror-promote.sh land here, @@ -527,7 +567,7 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do if ((dir_sizes_human_readable)); then printf "%-5s %s\n" "$repo_size" "$dir_name" >> "$dir_sizes_file_path" else - printf "%-12s %s\n" "$repo_size_kb" "$dir_name" >> "$dir_sizes_file_path" + printf "%-12s %s\n" "${repo_size_kb:-0}" "$dir_name" >> "$dir_sizes_file_path" fi fi @@ -539,7 +579,7 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do # If the index should be generated, add each section and footer. if ((index_generate)); then - # Add all sections and remove teh temp file. + # Add all sections and remove the temp file. for SECTION in $SECTIONS; do cat "$index_file_temp.$SECTION" >> "$index_file_temp" rm -f "$index_file_temp.$SECTION" @@ -552,6 +592,8 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do if grep -q "Last Sync:" "$index_file_temp"; then [[ -f $index_file_path ]] && rm -f "$index_file_path" mv "$index_file_temp" "$index_file_path" + else + rm -f "$index_file_temp" fi fi @@ -564,7 +606,7 @@ for ((i=0; i<${#selected_mirrors[@]}; i++)); do fi fi - # If we should generate the gloabl footer, do so. + # If we should generate the global footer, do so. if ((footer_generate)); then log "Generating footer for $mirror at $path/$footer_file_name" envsubst < "$(template_file footer.txt)" > "$path/$footer_file_name" diff --git a/mirror-sync.sh b/mirror-sync.sh index 59ce7b3..fdcf316 100644 --- a/mirror-sync.sh +++ b/mirror-sync.sh @@ -5,7 +5,7 @@ PATH="/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:$HOME/.local/ # Variables for trace generation. PROGRAM="mirror-sync" -VERSION="20240219" +VERSION="20260602" TRACEHOST=$(hostname -f) mirror_hostname=$(hostname -f) DATE_STARTED=$(LC_ALL=POSIX LANG=POSIX date -u -R) @@ -122,7 +122,7 @@ quick_fedora_mirror_install() { exit 1 fi fi - ) + ) || return 1 } # Installs jigdo image tool. @@ -180,7 +180,7 @@ EOF fi fi make install - ) + ) || return 1 fi } @@ -211,7 +211,10 @@ s5cmd_install() { fi # Extract and check that s5cmd extracted correctly. - tar -xf s5cmd.tar.gz + if ! tar -xf s5cmd.tar.gz; then + echo "Unable to extract s5cmd." + exit 1 + fi if ! [[ -f s5cmd ]]; then echo "Unable to extract s5cmd." exit 1 @@ -242,11 +245,11 @@ jigdo_hook() { currentVersion="${currentVersion##* -> }" versionDir="$(realpath "$repo")/${currentVersion}" - # For each archetecture, run jigdo to build iso files. + # For each architecture, run jigdo to build iso files. for a in "$versionDir"/*/; do arch=$(basename "$a") - # Determine what releases are needed for this archetecture. + # Determine what releases are needed for this architecture. sets=$(cat "${repo}/project/build/${currentVersion}/${arch}") # For each set, build iso files. @@ -299,7 +302,7 @@ build_trace_content() { echo "Date: ${rfc822date}" echo "Date-Started: ${DATE_STARTED}" - if [[ -e $TRACEFILE_MASTER ]]; then + if [[ -e $TRACE_MASTER_FILE ]]; then echo "Archive serial: $(extract_trace_field 'Archive serial' "$TRACE_MASTER_FILE" || echo unknown )" fi @@ -326,7 +329,7 @@ build_trace_content() { echo "Trigger: ${INFO_TRIGGER}" fi - # Depending on repo type, find archetectures supported. + # Depending on repo type, find architectures supported. ARCH_REGEX='(source|SRPMS|amd64|mips64el|mipsel|i386|x86_64|aarch64|ppc64le|ppc64el|s390x|armhf)' if [[ $repo_type == "deb" ]]; then ARCH=$(find "${repo}/dists" \( -name 'Packages.*' -o -name 'Sources.*' \) 2>/dev/null | @@ -364,18 +367,21 @@ build_trace_content() { total=$(( total + bytes )) done elif [[ -f $LOGFILE_STAGE1 ]]; then - bytes=$(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE1") - total=$(( total + bytes )) + for bytes in $(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE1"); do + total=$(( total + bytes )) + done fi if [[ -f $LOGFILE_STAGE2 ]]; then - bytes=$(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE2") - total=$(( total + bytes )) + for bytes in $(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE2"); do + total=$(( total + bytes )) + done fi if (( total > 0 )); then echo "Total bytes received in rsync: ${total}" fi # Calculate time per rsync stage and print both stages if both were started. + total_time=0 if [[ $sync_started ]]; then STATS_TOTAL_RSYNC_TIME1=$(( sync_ended - sync_started )) total_time=$STATS_TOTAL_RSYNC_TIME1 @@ -444,6 +450,7 @@ acquire_lock() { fi # Redirect stdout to both stdout and log file. + mkdir -p "$LOGPATH" exec 1> >(tee -a "$LOGFILE") # Redirect errors to stdout so they also are logged. exec 2>&1 @@ -465,7 +472,7 @@ acquire_lock() { fi # Create a new pid file for this process. - echo $BASHPID >"$PIDFILE" + echo "$BASHPID" >"$PIDFILE" # On exit, remove pid file. trap 'rm -f "$PIDFILE"' EXIT @@ -493,15 +500,15 @@ rebuild_dusum_totals() { { date totalKBytes=0 - for MODULE in ${MODULES:?}; do - eval dusum="\${${MODULE}_dusum:-}" - if [[ -n $dusum ]] && [[ -f $dusum ]]; then + for _DUSUM_MODULE in ${MODULES:?}; do + eval _dusum="\${${_DUSUM_MODULE}_dusum:-}" + if [[ -n $_dusum ]] && [[ -f $_dusum ]]; then while read -r size path; do if [[ -n $size ]]; then totalKBytes=$((totalKBytes+size)) printf "%-12s %s\n" "$size" "$path" fi - done < "$dusum" + done < "$_dusum" fi done printf "%-12s %s\n" "$totalKBytes" "total" @@ -513,15 +520,15 @@ rebuild_dusum_totals() { { date totalKBytes=0 - for MODULE in ${MODULES:?}; do - eval dusum="\${${MODULE}_dusum:-}" - if [[ -n $dusum ]] && [[ -f $dusum ]]; then + for _DUSUM_MODULE in ${MODULES:?}; do + eval _dusum="\${${_DUSUM_MODULE}_dusum:-}" + if [[ -n $_dusum ]] && [[ -f $_dusum ]]; then while read -r size path; do if [[ -n $size ]]; then totalKBytes=$((totalKBytes+size)) printf "%-5s %s\n" "$(echo "$size*1024" | bc | numfmt --to=iec)" "$path" fi - done < "$dusum" + done < "$_dusum" fi done printf "%-5s %s\n" "$(echo "$totalKBytes*1024" | bc | numfmt --to=iec)" "total" @@ -577,7 +584,7 @@ post_failed_sync() { # Remove the error count file so that the count resets. rm -f "$ERRORFILE" - # Exit not to not save the updated count. + # Exit to not save the updated count. exit 1 fi @@ -610,20 +617,18 @@ git_sync() { # Start the module. module_config "$1" - ( - # Do a git pull within the repo folder to sync. - if ! cd "${repo:?}"; then - echo "Failed to access '${repo:?}' git repository." - exit 1 - fi - eval git pull "$options" - RT=${PIPESTATUS[0]} - if (( RT == 0 )); then - post_successful_sync - else - post_failed_sync - fi - ) + # Do a git pull within the repo folder to sync. + if ! cd "${repo:?}"; then + echo "Failed to access '${repo:?}' git repository." + exit 1 + fi + eval git pull ${options:+$options} + RT=$? + if (( RT == 0 )); then + post_successful_sync + else + post_failed_sync + fi log_end_header } @@ -708,7 +713,7 @@ s5cmd_sync() { # Run AWS client to sync the S3 bucket. eval "$sync_timeout" "$S5CMD_BIN" "$options" \ - sync "${sync_options:-}" \ + sync ${sync_options:+$sync_options} \ --no-follow-symlinks \ --delete \ "'${bucket:?}'" "'${repo:?}'" @@ -751,6 +756,7 @@ wget_sync() { fi ( + trap - EXIT # Make sure the repo directory exists and we are in it. if ! [[ -e $repo ]]; then mkdir -p "$repo" @@ -758,6 +764,7 @@ wget_sync() { if ! cd "$repo"; then echo "Unable to enter repo directory." + exit 1 fi # Run wget with configured options. @@ -769,6 +776,10 @@ wget_sync() { post_failed_sync fi ) + RT=$? + if (( RT != 0 )); then + exit "$RT" + fi log_end_header } @@ -823,20 +834,31 @@ rsync_sync() { # when we detect its needed or when last rsync was a long time ago. if [[ $upstream_check ]] && (( force == 0 )); then now=$(date +%s) - last_timestamp=$(cat "${timestamp:?}") + if [[ ! -f ${timestamp:?} ]]; then + echo "Timestamp file not found, skipping upstream check." + else + last_timestamp=$(cat "$timestamp") - # If last update was not that long ago, we should check if upstream was updated recently. - if (( now-last_timestamp < ${upstream_timestamp_min:?} )); then - echo "Checking upstream's last modified." + # If last update was not that long ago, we should check if upstream was updated recently. + if (( now-last_timestamp < ${upstream_timestamp_min:?} )); then + echo "Checking upstream's last modified." - # Get the last modified date. - IFS=': ' read -r _ last_modified < <(curl -sI HEAD "${upstream_check:?}" | grep Last-Modified) - last_modified_unix=$(date -u +%s -d "$last_modified") + # Get the last modified date. + IFS=': ' read -r _ last_modified < <(curl -sI "${upstream_check:?}" | grep -i Last-Modified) + last_modified="${last_modified//$'\r'/}" - # If last modified is greater than our max age, it wasn't modified recently and we should not rsync. - if (( now-last_modified_unix > ${upstream_max_age:-0} )); then - echo "Skipping sync as upstream wasn't updated recently." - exit 88 + # If last modified couldn't be determined, proceed with sync. + if [[ -z $last_modified ]]; then + echo "Could not determine upstream last-modified, proceeding with sync." + else + last_modified_unix=$(date -u +%s -d "$last_modified") + + # If last modified is greater than our max age, it wasn't modified recently and we should not rsync. + if (( now-last_modified_unix > ${upstream_max_age:-0} )); then + echo "Skipping sync as upstream wasn't updated recently." + exit 88 + fi + fi fi fi fi @@ -844,19 +866,26 @@ rsync_sync() { # If a time file check was defined, and check if needed. if [[ ${time_file_check:-} ]] && (( force == 0 )); then now=$(date +%s) - last_timestamp=$(cat "${timestamp:?}") + if [[ ! -f ${timestamp:?} ]]; then + echo "Timestamp file not found, skipping time file check." + else + last_timestamp=$(cat "$timestamp") - # Only check time file if the timestamp was recently updated. - if (( now-last_timestamp < ${upstream_timestamp_min:?} )); then - echo "Checking if time file has changed since last sync." - checkresult=$($sync_timeout rsync \ - --no-motd \ - --dry-run \ - --out-format="%n" \ - "${source:?}/${time_file_check:?}" "${repo:?}/${time_file_check:?}") - if [[ -z $checkresult ]]; then - echo "The time file has not changed since last sync, we are not updating at this time." - exit 88 + # Only check time file if the timestamp was recently updated. + if (( now-last_timestamp < ${upstream_timestamp_min:?} )); then + echo "Checking if time file has changed since last sync." + checkresult=$($sync_timeout rsync \ + --no-motd \ + --dry-run \ + --out-format="%n" \ + "${source:?}/${time_file_check:?}" "${repo:?}/${time_file_check:?}") + rsync_rt=$? + if (( rsync_rt != 0 )); then + echo "time_file_check rsync failed (exit ${rsync_rt}), proceeding with sync." + elif [[ -z $checkresult ]]; then + echo "The time file has not changed since last sync, we are not updating at this time." + exit 88 + fi fi fi fi @@ -882,13 +911,10 @@ rsync_sync() { touch "$mirror_update_file" LOGFILE_STAGE1="${LOGFILE}.stage1" echo -n > "$LOGFILE_STAGE1" - LOGFILE_STAGE2="${LOGFILE}.stage2" - echo -n > "$LOGFILE_STAGE2" # Run the rsync. Using eval here so extra_args expands and is used as arguments. stage1_started=$(date +%s) eval "$sync_timeout" rsync -avH \ - --human-readable \ --progress \ --safe-links \ --delay-updates \ @@ -900,11 +926,11 @@ rsync_sync() { --exclude "Archive-Update-in-Progress-${mirror_hostname:?}" \ --exclude "project/trace/${mirror_hostname:?}" \ "'${source:?}'" "'${repo:?}'" | tee -a "$LOGFILE_STAGE1" - RT=${PIPESTATUS[0]} stage1_ended=$(date +%s) # Check if run was successful. if [[ $(grep -c '^total size is' "$LOGFILE_STAGE1") -ne 1 ]]; then + rm -f "$mirror_update_file" post_failed_sync fi @@ -939,6 +965,8 @@ rsync_sync() { # Add stage 2 options from configurations. extra_args="${options_stage2:-}" + LOGFILE_STAGE2="${LOGFILE}.stage2" + echo -n > "$LOGFILE_STAGE2" echo echo "Running rsync stage 2:" @@ -946,7 +974,6 @@ rsync_sync() { # Run the rsync. Using eval here so extra_args expands and is used as arguments. stage2_started=$(date +%s) eval "$sync_timeout" rsync -avH \ - --human-readable \ --progress \ --safe-links \ --delete \ @@ -960,11 +987,11 @@ rsync_sync() { --exclude "Archive-Update-in-Progress-${mirror_hostname:?}" \ --exclude "project/trace/${mirror_hostname:?}" \ "'${source:?}'" "'${repo:?}'" | tee -a "$LOGFILE_STAGE2" - RT=${PIPESTATUS[0]} stage2_ended=$(date +%s) # Check if run was successful. if [[ $(grep -c '^total size is' "$LOGFILE_STAGE2") -ne 1 ]]; then + rm -f "$mirror_update_file" post_failed_sync fi fi @@ -983,7 +1010,7 @@ rsync_sync() { save_trace_file fi rm -f "$LOGFILE_STAGE1" - rm -f "$LOGFILE_STAGE2" + [[ -n ${LOGFILE_STAGE2:-} ]] && rm -f "$LOGFILE_STAGE2" # Remove archive update file. rm -f "$mirror_update_file" @@ -1033,13 +1060,6 @@ quick_fedora_mirror_sync() { eval filterexp="\$${MODULE}_filterexp" eval rsync_options="\$${MODULE}_rsync_options" - # If configuration is not set, exit. - if [[ ! $repo ]]; then - echo "No configuration exists for ${MODULE}" - exit 1 - fi - log_start_header - # Install QFM if not already installed. quick_fedora_mirror_install @@ -1088,11 +1108,13 @@ EOF eval "$sync_timeout" "$QFM_BIN" \ -c "'$conf_path'" \ "$extra_args" | tee -a "$LOGFILE_SYNC" - RT=${PIPESTATUS[0]} sync_ended=$(date +%s) # Check if run was successful. - if [[ $(grep -c '^total size is' "$LOGFILE_SYNC") -lt 1 ]]; then + if ! grep -q '^total size is' "$LOGFILE_SYNC"; then + for module in $modules; do + rm -f "$docroot$(module_dir "$module")/Archive-Update-in-Progress-${mirror_hostname:?}" + done post_failed_sync fi