Initial commit

GRMrGecko 2023-11-07 13:20:19 -06:00
commit aa67146dab
3 changed files with 1357 additions and 0 deletions

LICENSE Normal file

@ -0,0 +1,19 @@
Copyright (c) 2023 Mr. Gecko's Media (James Coleman). http://mrgeckosmedia.com/
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md Normal file

@ -0,0 +1,348 @@
# mirror-sync
A tool to mirror repositories for Linux and other similar software. It is designed to help you follow upstream mirror instructions and implement the features upstream expects from an official downstream mirror. It also includes features that keep you in the loop when a situation needs manual intervention.
## Configuration
It is suggested that you mirror using a dedicated sub user account; this tool refuses to run as root to protect you. Once you have a user account dedicated to mirror activities, you can create the log directory, configure logrotate, and add a configuration file.
### Making log directory
```bash
mkdir -p /var/log/mirror-sync/
chown mirror: /var/log/mirror-sync/
```
### Configuration for logrotate
```
/var/log/mirror-sync/*.log {
rotate 7
create 644 mirror mirror
daily
missingok
notifempty
sharedscripts
copytruncate
compress
}
```
### Configuring mirror-sync
The configuration file is located at `/etc/mirror-sync.conf` and is written in bash syntax; the script sources it at startup.
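As a rough illustration, a minimal configuration file might look like the following. The module name `example`, the paths, and the contact details are placeholders chosen for this sketch, not values that ship with the project.

```bash
# /etc/mirror-sync.conf (sketch; every value below is a placeholder)

# Space-separated list of modules to sync.
MODULES="example"

# Where error emails are sent.
MAILTO="admin@example.com"

# Information published in trace files.
INFO_MAINTAINER="Example Admin <admin@example.com>"
INFO_COUNTRY="US"

# Settings for the "example" module, each prefixed by the module name.
example_sync_method="rsync"
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="rsync://rsync.example.org/module/"
example_type="rpm"
```

Because the file is plain bash, you can check it for syntax errors with `bash -n /etc/mirror-sync.conf` before the first run.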
## Main configurations
### MODULES
A space-separated list of the available modules. Each module is a separate repository to sync, and this list tells the script which module configurations to look for.
### TRACEHOST
The hostname to show in project trace files. It defaults to the FQDN hostname of the server.
### mirror_hostname
The hostname of this mirror server. It defaults to the FQDN hostname of the server. If you have a public domain for your mirror, you may wish to set this configuration to that domain.
### PIDPATH
The directory where pid files are stored; these files prevent duplicate syncs of the same module. The default is `/tmp`, and the mirror user must have write access to the directory.
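The idea behind the pid file can be sketched as follows. This is a simplified illustration, not the script's exact code: the function name `try_lock` is made up for this example, and it uses `kill -0` to test for a live process, which assumes the lock holder runs as the same user.

```bash
# try_lock PIDFILE: succeed if no live process holds the lock.
# Hypothetical helper for illustration only.
try_lock() {
    local pidfile=$1 pid
    if [[ -f $pidfile ]]; then
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests whether the process exists.
        if [[ -n $pid ]] && kill -0 "$pid" 2>/dev/null; then
            echo "A sync is already in progress with pid ${pid}."
            return 1
        fi
    fi
    # Record our own pid so later runs can detect this one.
    echo "$$" > "$pidfile"
}

# Example use (paths are placeholders):
#   try_lock "/tmp/example-mirror-sync.pid" || exit 1
#   trap 'rm -f "/tmp/example-mirror-sync.pid"' EXIT
```

A stale pid file left behind by a crashed run is simply overwritten, because no process with the recorded pid exists any more.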
### LOGPATH
The directory where logs are stored. The default is `/var/log/mirror-sync`, and the mirror user must have write access to the directory.
### max_errors
The number of failed syncs allowed before an email is sent regarding the issue. This allows you to ignore anomalies. Default is 3.
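The counting behind this threshold can be sketched roughly as below. The function names `record_failure` and `record_success` are hypothetical, and the sketch prints `notify` where the real script would send an email.

```bash
# record_failure ERRORFILE MAX_ERRORS: bump the stored failure count and
# report whether the threshold has been exceeded. Illustration only.
record_failure() {
    local errorfile=$1 max_errors=$2 count=0
    if [[ -e $errorfile ]]; then
        count=$(cat "$errorfile")
    fi
    count=$((count + 1))
    echo "$count" > "$errorfile"
    if (( count > max_errors )); then
        echo "notify"   # stand-in for sending the error email
    else
        echo "quiet"
    fi
}

# record_success ERRORFILE: a successful sync clears the stored count.
record_success() {
    rm -f "$1"
}
```

With a threshold of 3, the first three failures stay quiet and the fourth reports `notify`; any successful sync in between resets the count.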
### upstream_max_age
If the upstream last modified date is older than the defined number of seconds, the upstream check will skip syncing. Default is 5 hours.
### upstream_timestamp_min
If an upstream check is configured, this defines how old, in seconds, the last successful sync must be before the next sync skips the upstream check and runs regardless. Default is 24 hours.
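How `upstream_max_age` and `upstream_timestamp_min` interact can be sketched as a small decision function. The name `should_sync` and its argument order are assumptions made for this example; all three arguments are unix timestamps.

```bash
# should_sync NOW LAST_SYNC LAST_MODIFIED: exit status 0 means "sync now".
# Hypothetical helper; falls back to the documented defaults.
should_sync() {
    local now=$1 last_sync=$2 last_modified=$3
    local max_age=${upstream_max_age:-18000}
    local min_interval=${upstream_timestamp_min:-86400}
    # The last successful sync is old enough: sync without asking upstream.
    if (( now - last_sync >= min_interval )); then
        return 0
    fi
    # Upstream has not changed recently: skip this run.
    if (( now - last_modified > max_age )); then
        return 1
    fi
    return 0
}
```

With the defaults, a mirror that synced two hours ago skips the run when upstream last changed a day ago, but syncs when upstream changed within the last five hours.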
### QFM_PATH
The path where quick-fedora-mirror is located and where its configurations are saved. If you already have QFM installed but want configurations stored separately, you can use the `QFM_BIN` configuration to set the QFM binary path.
### QFM_BIN
The binary path for quick-fedora-mirror. If you override `QFM_PATH`, you will likely also have to override this path. Default:
```bash
QFM_BIN="$QFM_PATH/quick-fedora-mirror"
```
### JIGDO_FILE_BIN
If you installed jigdo outside of the home directory, you need to manually configure the `jigdo-file` binary path here.
### JIGDO_MIRROR_BIN
If you installed jigdo outside of the home directory, you need to manually configure the `jigdo-mirror` binary path here.
### jigdoConf
If you use jigdo to build ISO images, this is the base configuration file name. The jigdo hook saves configurations in `${jigdoConf:?}.${arch}.${s}` format.
### MAILTO
The email address to which errors are mailed.
### INFO_MAINTAINER
The maintainer of this repository, defined in `name <email>` format.
### INFO_SPONSOR
If this repo is sponsored, you may define the sponsors here.
### INFO_COUNTRY
The country in which this server resides.
### INFO_LOCATION
The region in which this server resides (state/province).
### INFO_THROUGHPUT
How fast the pipes to your repository are.
### INFO_TRIGGER
How the sync was triggered: by a cron job or manually via ssh. This is auto-detected, and you do not need to define this configuration.
## Module specific configurations
Each module is configured via configurations prefixed by the module name. The one configuration used by all modules is the `_sync_method` configuration which defines what sync method to use. Each sync method has different configurations available. The default sync method is rsync.
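To make the prefix convention concrete, here is a small sketch of looking up a module's setting by name. The script in this commit reads these values with `eval`; the sketch uses bash indirect expansion instead, and the helper name `module_setting` is made up for this example. It assumes module names are valid bash identifiers.

```bash
# module_setting MODULE NAME [DEFAULT]: print the value of ${MODULE}_${NAME},
# or DEFAULT when that variable is unset or empty. Hypothetical helper.
module_setting() {
    local var="${1}_${2}"
    # ${!var} expands to the value of the variable whose name is stored in var.
    printf '%s\n' "${!var:-${3:-}}"
}

example_sync_method="git"
example_repo="/home/mirror/http/example"

module_setting example sync_method        # prints: git
module_setting example repo               # prints: /home/mirror/http/example
module_setting other sync_method rsync    # prints: rsync (the default)
```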
Each repo has at bare minimum the following configurations:
- sync_method - rsync, git, aws, ftp, wget, or qfm.
- repo - The destination directory of the repository.
- timestamp - Path to a file to store the last successful sync unix time stamp. Can be used by a monitoring system to confirm each repo is syncing successfully.
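A monitoring check built on the timestamp file could look something like the sketch below. The function name `check_timestamp`, the messages, and the exit codes (loosely following the common 0 = OK, 1 = warning, 2 = critical convention) are assumptions for illustration and are not part of this tool.

```bash
# check_timestamp FILE MAX_AGE [NOW]: succeed if the last successful sync
# recorded in FILE is at most MAX_AGE seconds old. Hypothetical helper.
check_timestamp() {
    local file=$1 max_age=$2 now=${3:-$(date +%s)} last
    if [[ ! -r $file ]]; then
        echo "CRITICAL: ${file} is missing"
        return 2
    fi
    last=$(cat "$file")
    if (( now - last > max_age )); then
        echo "WARNING: last sync was $(( now - last )) seconds ago"
        return 1
    fi
    echo "OK: last sync was $(( now - last )) seconds ago"
}

# Example use (path and threshold are placeholders):
#   check_timestamp /home/mirror/timestamp/example 172800
```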
### git
Synchronizes a git repository via git pull. To use this method, you need to have the git package installed.
#### options
Extra options appended to `git pull`.
#### Example
```bash
example_sync_method="git"
example_repo="/home/mirror/http/example"
example_timestamp="/home/mirror/timestamp/example"
```
### aws
Synchronize with an s3 bucket using aws cli. To use this, you need the aws cli package installed.
#### aws_bucket
The bucket URL to sync with.
#### aws_access_key
The access key for the s3 bucket.
#### aws_secret_key
The secret key for the s3 bucket.
#### aws_endpoint_url
If you are using a third party S3 compatible service, you can enter their endpoint URL here.
#### options
Extra options to append to `aws s3 sync`.
#### Example
```bash
example_sync_method="aws"
example_repo="/home/mirror/http/example"
example_timestamp="/home/mirror/timestamp/example"
example_aws_bucket="s3://bucket/directory"
example_aws_access_key="RANDOM_KEY_FROM_PROVIDER"
example_aws_secret_key="RANDOM_SECRET_FROM_PROVIDER"
```
### ftp
Synchronize both http and ftp sources to a repo. This sync method requires the lftp package to be installed.
#### source
The source url to mirror from.
#### options
Extra options to append to the mirror command of lftp.
#### Example
```bash
example_sync_method="ftp"
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="https://repos.example.com/rhel/7/x86_64/stable"
```
### wget
Synchronizes using wget to a repository. To use this, you need the wget package installed.
#### source
The source url to mirror from.
#### options
The options passed to wget. Defaults to `--mirror --no-host-directories --no-parent`.
#### Example
```bash
example_sync_method="wget"
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="https://repos.example.com/rhel/7/x86_64/stable"
example_options="--mirror --no-host-directories --no-parent --cut-dirs=4"
```
### rsync
By far, the most common mirror method is rsync. While not perfect, it is more efficient than mirroring with wget or ftp. You will need the rsync package installed for this to function. There is an extra CLI argument available for this sync method, `--force`, which allows you to bypass upstream checks and synchronize immediately.
#### pre_hook
A hook to run prior to the first stage sync.
#### source
The rsync server or ssh server URL.
#### options
Synchronization options for the first rsync stage.
#### options_stage2
If your repo needs a 2 stage rsync, define some options here. The most basic option you can use, if you want to force stage 2 to occur, would be `--exclude '.~tmp~'`.
#### pre_stage2_hook
A hook to run prior to the second stage sync.
#### upstream_check
An http URL whose last modified date is checked to judge whether the upstream mirror was modified recently. This option is mainly here to lower the impact on upstream mirrors so that mirroring happens less often. See `upstream_timestamp_min` and `upstream_max_age` for global configuration options of this check.
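As a sketch of what such a check involves, the function below turns the `Last-Modified` header of an HTTP response into a unix timestamp. The function name `last_modified_epoch` is hypothetical, the header match is case-insensitive because some servers send lowercase header names, and `date -d` assumes GNU date.

```bash
# last_modified_epoch: read HTTP response headers on stdin and print the
# Last-Modified value as a unix timestamp. Hypothetical helper.
last_modified_epoch() {
    local name value
    while IFS=': ' read -r name value; do
        if [[ ${name,,} == "last-modified" ]]; then
            # Header lines end in CR LF; drop the trailing carriage return.
            value=${value%$'\r'}
            date -u -d "$value" +%s
            return
        fi
    done
    # No Last-Modified header was found.
    return 1
}

# Example use (the URL is a placeholder):
#   curl -sI "https://repos.example.com/timestamp.txt" | last_modified_epoch
```

The resulting timestamp can then be compared against the current time and `upstream_max_age` to decide whether a sync is worthwhile.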
#### report_mirror
If you have Fedora report mirror installed, and need to report back to Fedora about the status of your repository, you can provide this option a configuration path for the `report_mirror` utility to run the report after a successful sync.
#### rsync_password
If you have an rsync password and need to authenticate with an rsync server, this is where you define the password.
#### post_hook
Any hooks to call after a successful sync are defined here. If you are using jigdo, the hook is `jigdo_hook`.
#### jigdo_pkg_repo
If you are using jigdo to build ISO images, you need to define the path to the repo of packages.
#### arch_configurations
Information for trace files on what architectures are synchronized to this mirror.
#### type
For the trace file saving, this defines what type of repo is being synced. Options are deb, rpm, iso, or source.
#### Example
Example for RPM based mirror:
```bash
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="rsync://rsync.example.org/module/"
example_options="--exclude '.~tmp~' --exclude 'repodata/*'"
example_options_stage2="--exclude '.~tmp~'"
example_type="rpm"
```
Example for DEB based mirror:
```bash
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="rsync://rsync.example.org/module/"
example_options="--exclude '.~tmp~' --include=*.diff/ --exclude=*.diff/Index --exclude=Packages* --exclude=Sources* --exclude=Release* --exclude=InRelease --include=i18n/by-hash --exclude=i18n/* --exclude=ls-lR*"
example_options_stage2="--exclude '.~tmp~'"
example_type="deb"
```
Example with jigdo:
```bash
example_repo="/home/mirror/http/example/"
example_timestamp="/home/mirror/timestamp/example"
example_source="rsync://rsync.example.org/module/"
example_options="--exclude '.~tmp~' --exclude '*.iso'"
example_pre_stage2_hook="jigdo_hook"
example_jigdo_pkg_repo="/home/mirror/http/debian/"
example_options_stage2="--exclude '.~tmp~'"
example_type="iso"
```
### qfm
Quick Fedora Mirror is a tool to help Fedora mirrors distribute changes faster and save on resources when trying to discover what needs to be synced. To use this method, you must have both the rsync and zsh package installed. This tool automatically downloads QFM if you do not already have it installed.
This tool requires that the upstream mirror has a module with sub modules designed for use with quick-fedora-mirror. You can use this tool with non-Fedora mirrors, however they must follow the Fedora module configurations. For Fedora mirrors, you can utilize [tier 1 mirrors](https://fedoraproject.org/wiki/Infrastructure/Mirroring/Tiering#Tier_1_mirrors).
You can list modules available on an rsync server with:
```bash
rsync --list-only rsync://SERVER
```
And to check a module out, you can list the files with:
```bash
rsync --list-only rsync://SERVER/MODULE
```
#### pre_hook
A hook to run prior to running QFM.
#### source
The source rsync server, without any modules appended.
#### master_module
The main rsync module under which the fedora sub module directories exist. Defaults to `fedora-buffet`.
#### module_mapping
If you are using this with a non-fedora mirror, you can define your own custom sub module mapping.
#### mirror_manager_mapping
The names for custom module mapping.
#### modules
The sub modules to sync. It is recommended that you only sync one sub module. The modules available by default are fedora-alt, fedora-archive, fedora-enchilada, fedora-epel, and fedora-secondary.
#### options
Extra options to pass to quick-fedora-mirror.
#### filterexp
If you wish to filter out particular directories/files, define a regular expression here.
#### rsync_options
Extra options to pass to rsync during sync.
#### report_mirror
If you have Fedora report mirror installed, and need to report back to Fedora about the status of your repository, you can provide this option a configuration path for the `report_mirror` utility to run the report after a successful sync.
#### rsync_password
If you have an rsync password and need to authenticate with an rsync server, this is where you define the password.
#### post_hook
Any hooks to call after a successful sync are defined here.
#### arch_configurations
Information for trace files on what architectures are synchronized to this mirror.
#### type
For the trace file saving, this defines what type of repo is being synced. Options are deb, rpm, iso, or source.
#### Example
```bash
example_sync_method=qfm
example_repo='/home/mirror/http/example/'
example_timestamp='/home/mirror/timestamp/example'
example_source='rsync://mirrors.example.com'
example_modules=fedora-enchilada
example_report_mirror='/home/mirror/report_mirror.conf'
example_type=rpm
```
## CLI Options
There are not many CLI options available. Usage is as follows:
```
[--help|--update-support-utilities] {module} [--force]
```
## Requirements list
- bash
- zsh
- sendmail
- git
- awscli
- lftp
- wget
- rsync
- jigdo - this tool auto installs.
- quick-fedora-mirror - this tool auto installs.
### Install on RPM based servers
```bash
yum install bash zsh sendmail git awscli lftp wget rsync
```
### Install on DEB based servers
```bash
apt install bash zsh sendmail git awscli lftp wget rsync
```
### Install on Arch
```bash
yay -S bash zsh sendmail git aws-cli-git lftp wget rsync
```

mirror-sync.sh Normal file

@ -0,0 +1,990 @@
#!/bin/bash
# This script is designed to handle mirror syncing tasks from external mirrors.
# Each mirror is handled within a module which can be configured via the configuration file /etc/mirror-sync.conf.
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/home/mirror/.local/bin:/home/mirror/bin
# Variables for trace generation.
PROGRAM="mirror-sync"
VERSION="20231107"
TRACEHOST=$(hostname -f)
mirror_hostname=$(hostname -f)
DATE_STARTED=$(LC_ALL=POSIX LANG=POSIX date -u -R)
INFO_TRIGGER=cron
if [[ $SUDO_USER ]]; then
INFO_TRIGGER=ssh
fi
# Pid file temporary path.
PIDPATH="/tmp"
PIDSUFFIX="-mirror-sync.pid"
PIDFILE="" # To be filled by acquire_lock().
# Log file.
LOGPATH="/var/log/mirror-sync"
LOGFILE="" # To be filled by acquire_lock().
ERRORFILE="" # To be filled by acquire_lock().
error_count=0
max_errors=3
tmpDirBase="$HOME/tmp"
# Do not check upstream unless it was updated in the last 5 hours.
upstream_max_age=18000
# Update anyway if last check was more than 24 hours ago.
upstream_timestamp_min=86400
# quick-fedora-mirror tool config.
QFM_GIT="https://pagure.io/quick-fedora-mirror.git"
QFM_PATH="$HOME/quick-fedora-mirror"
QFM_BIN="$QFM_PATH/quick-fedora-mirror"
# For installing Jigdo
JIGDO_SOURCE_URL="http://deb.debian.org/debian/pool/main/j/jigdo/jigdo_0.8.0.orig.tar.xz"
JIGDO_FILE_BIN="$HOME/bin/jigdo-file"
JIGDO_MIRROR_BIN="$HOME/bin/jigdo-mirror"
jigdoConf="$HOME/etc/jigdo/jigdo-mirror.conf"
# Prevent run as root.
if (( EUID == 0 )); then
echo "Do not mirror as root."
exit 1
fi
# Load the required configuration file or quit.
if [[ -f /etc/mirror-sync.conf ]]; then
# shellcheck disable=SC1090
source /etc/mirror-sync.conf
else
echo "No configuration file defined, please setup a proper configuration file."
exit 1
fi
# Print the help for this command.
print_help() {
echo "Mirror Sync"
echo
echo "Usage:"
echo "$0 [--help|--update-support-utilities] {module} [--force]"
echo
echo "Available modules:"
for MODULE in ${MODULES:?}; do
echo "$MODULE"
done
exit
}
# Send email to admins about error.
mail_error() {
if [[ -z $MAILTO ]]; then
echo "MAILTO is undefined."
return
fi
{
cat <<EOF
Subject: ${PROGRAM} Error
To: ${MAILTO}
Auto-Submitted: auto-generated
MIME-Version: 1.0
Content-Type: text/plain
Host: $(hostname -f)
Module: ${MODULE}
Logfile: ${LOGFILE}
$*
EOF
} | sendmail -i -t
}
# Installs quick-fedora-mirror and updates.
quick_fedora_mirror_install() {
if ! [[ -f $QFM_BIN ]]; then
echo "quick-fedora-mirror is not on this system, attempting to get it"
[[ -e $QFM_PATH ]] && rm -Rf "$QFM_PATH"
git clone "$QFM_GIT" "$QFM_PATH"
if ! [[ -f $QFM_BIN ]]; then
echo "Failed to get quick-fedora-mirror!"
exit 1
fi
fi
(
if [[ $1 == "-u" ]]; then
if ! cd "$QFM_PATH"; then
echo "Unable to enter QFM path."
exit 1
fi
if ! git pull; then
echo "Unable to update QFM."
exit 1
fi
fi
)
}
# Installs jigdo image tool.
jigdo_install() {
if [[ $1 == "-u" ]] || ! [[ -e $JIGDO_FILE_BIN ]]; then
if ! cd "$HOME"; then
echo "Unable to access home dir."
exit 1
fi
if [[ ! -d bin ]]; then
mkdir -p bin
fi
if ! wget "$JIGDO_SOURCE_URL" -O jigdo.tar.xz; then
echo "Unable to download jigdo utility."
exit 1
fi
if ! tar -xvf jigdo.tar.xz; then
echo "Unable to unarchive jigdo."
exit 1
fi
rm -f jigdo.tar.xz
(
if ! cd jigdo-*/; then
echo "Unable to enter extracted archive."
exit 1
fi
cat > jigdo.patch <<'EOF'
--- src/util/sha256sum.hh 2019-11-19 10:43:22.000000000 -0500
+++ src-fix/util/sha256sum.hh 2023-04-19 16:33:40.840831304 -0400
@@ -27,6 +27,7 @@
#include <cstring>
#include <iosfwd>
#include <string>
+#include <stdint.h>
#include <bstream.hh>
#include <debug.hh>
EOF
patch -u src/util/sha256sum.hh -i jigdo.patch
if ! ./configure --prefix="$HOME"; then
echo "Unable to configure jigdo."
exit 1
fi
# Build fails first few times due to docs, but clears after a few builds.
if ! make; then
if ! make; then
make
fi
fi
make install
)
fi
}
# Updates the mirror support utilities on the server from upstream.
update_support_utilities() {
quick_fedora_mirror_install -u
jigdo_install -u
}
# Acquire a sync lock for this command.
acquire_lock() {
MODULE=$1
# Pid file for this module sync.
PIDFILE="${PIDPATH}/${MODULE}${PIDSUFFIX}"
LOGFILE="${LOGPATH}/${MODULE}.log"
ERRORFILE="${LOGPATH}/${MODULE}.error_count"
if [[ -e $ERRORFILE ]]; then
error_count=$(cat "$ERRORFILE")
fi
# Redirect stdout to both stdout and log file.
exec 1> >(tee -a "$LOGFILE")
# Redirect errors to stdout so they also are logged.
exec 2>&1
# Check existing pid file.
if [[ -f $PIDFILE ]]; then
PID=$(cat "$PIDFILE")
# Prevent double locks.
if [[ $PID == "$BASHPID" ]]; then
echo "Double lock detected."
exit 1
fi
# Check if PID is active.
if ps -p "$PID" >/dev/null; then
echo "A sync is already in progress for ${MODULE} with pid ${PID}."
exit 1
fi
fi
# Create a new pid file for this process.
echo $BASHPID >"$PIDFILE"
# On exit, remove pid file.
trap 'rm -f "$PIDFILE"' EXIT
}
log_start_header() {
echo
echo "=========================================="
echo "Starting execution: $(date +"%Y-%m-%d %T")"
echo "=========================================="
echo
}
log_end_header() {
echo
echo "=========================================="
echo "Execution complete: $(date +"%Y-%m-%d %T")"
echo "=========================================="
}
# Sync git based mirrors.
git_sync() {
MODULE=$1
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval timestamp="\$${MODULE}_timestamp"
eval options="\$${MODULE}_options"
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
(
# Do a git pull within the repo folder to sync.
if ! cd "${repo:?}"; then
echo "Failed to access '${repo:?}' git repository."
exit 1
fi
eval git pull "$options"
RT=${PIPESTATUS[0]}
if (( RT == 0 )); then
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
else
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with git, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
fi
)
log_end_header
}
# Sync AWS S3 bucket based mirrors.
aws_sync() {
MODULE=$1
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval timestamp="\$${MODULE}_timestamp"
eval bucket="\$${MODULE}_aws_bucket"
eval AWS_ACCESS_KEY_ID="\$${MODULE}_aws_access_key"
eval AWS_SECRET_ACCESS_KEY="\$${MODULE}_aws_secret_key"
eval AWS_ENDPOINT_URL="\$${MODULE}_aws_endpoint_url"
eval options="\$${MODULE}_options"
export AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
if [[ -n $AWS_ENDPOINT_URL ]]; then
options="$options --endpoint-url='$AWS_ENDPOINT_URL'"
fi
# Run AWS client to sync the S3 bucket.
eval timeout 1d aws s3 sync \
--no-follow-symlinks \
--delete \
"$options" \
"'${bucket:?}'" "'${repo:?}'"
RT=${PIPESTATUS[0]}
if (( RT == 0 )); then
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
else
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with aws, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
fi
log_end_header
}
# Sync using FTP.
ftp_sync() {
MODULE=$1
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval timestamp="\$${MODULE}_timestamp"
eval source="\$${MODULE}_source"
eval options="\$${MODULE}_options"
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
# Run lftp to mirror the source into the repo.
timeout 1d lftp <<< "mirror -v --delete --no-perms $options '${source:?}' '${repo:?}'"
RT=${PIPESTATUS[0]}
if (( RT == 0 )); then
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
else
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with lftp, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
fi
log_end_header
}
# Sync using wget.
wget_sync() {
MODULE=$1
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval timestamp="\$${MODULE}_timestamp"
eval source="\$${MODULE}_source"
eval options="\$${MODULE}_options"
if [[ -z $options ]]; then
options="--mirror --no-host-directories --no-parent"
fi
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
(
# Make sure the repo directory exists and we are in it.
if ! [[ -e $repo ]]; then
mkdir -p "$repo"
fi
if ! cd "$repo"; then
echo "Unable to enter repo directory."
# Stop here so wget does not download into the wrong directory.
exit 1
fi
# Run wget with configured options.
eval timeout 1d wget "$options" "'${source:?}'"
RT=${PIPESTATUS[0]}
if (( RT == 0 )); then
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
else
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with wget, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
fi
)
log_end_header
}
# Jigdo hook - builds iso images from jigdo files.
jigdo_hook() {
jigdo_install
currentVersion=$(ls -l "${repo}/current")
currentVersion="${currentVersion##* -> }"
versionDir="$(realpath "$repo")/${currentVersion}"
for a in "$versionDir"/*/; do
arch=$(basename "$a")
sets=$(cat "${repo}/project/build/${currentVersion}/${arch}")
for s in $sets; do
jigdoDir="${repo}/${currentVersion}/${arch}/jigdo-${s}"
imageDir="${repo}/${currentVersion}/${arch}/iso-${s}"
if [[ ! -d $imageDir ]]; then
mkdir -p "$imageDir"
fi
# Sums are now SHA256SUMS and SHA512SUMS.
cp -a "${jigdoDir}"/*SUMS* "${imageDir}/"
cat >"${jigdoConf:?}.${arch}.${s}" <<EOF
LOGROTATE=14
jigdoFile="$JIGDO_FILE_BIN --cache=\$tmpDir/jigdo-cache.db --cache-expiry=1w --report=noprogress --no-check-files"
debianMirror="file:${jigdo_pkg_repo:-}"
nonusMirror="file:/tmp"
include='.' # include all files,
exclude='^$' # then exclude none
jigdoDir=${jigdoDir}
imageDir=${imageDir}
tmpDir=${tmpDirBase:?}/${arch}.${s}
#logfile=${LOGPATH}/${MODULE}-${arch}.${s}.log
EOF
echo "Running jigdo for ${arch}.${s}"
$JIGDO_MIRROR_BIN "${jigdoConf:?}.${arch}.${s}"
done
done
}
# Pull a field from a trace file or rsync stats.
extract_trace_field() {
value=$(awk -F': ' "\$1==\"$1\" {print \$2; exit}" "$2" 2>/dev/null)
[[ $value ]] || return 1
echo "$value"
}
# Build trace content.
build_trace_content() {
LC_ALL=POSIX LANG=POSIX date -u
rfc822date=$(LC_ALL=POSIX LANG=POSIX date -u -R)
echo "Date: ${rfc822date}"
echo "Date-Started: ${DATE_STARTED}"
if [[ -e $TRACE_MASTER_FILE ]]; then
echo "Archive serial: $(extract_trace_field 'Archive serial' "$TRACE_MASTER_FILE" || echo unknown )"
fi
echo "Used ${PROGRAM} version: ${VERSION}"
echo "Creator: ${PROGRAM} ${VERSION}"
echo "Running on host: ${TRACEHOST}"
if [[ ${INFO_MAINTAINER:-} ]]; then
echo "Maintainer: ${INFO_MAINTAINER}"
fi
if [[ ${INFO_SPONSOR:-} ]]; then
echo "Sponsor: ${INFO_SPONSOR}"
fi
if [[ ${INFO_COUNTRY:-} ]]; then
echo "Country: ${INFO_COUNTRY}"
fi
if [[ ${INFO_LOCATION:-} ]]; then
echo "Location: ${INFO_LOCATION}"
fi
if [[ ${INFO_THROUGHPUT:-} ]]; then
echo "Throughput: ${INFO_THROUGHPUT}"
fi
if [[ ${INFO_TRIGGER:-} ]]; then
echo "Trigger: ${INFO_TRIGGER}"
fi
# Depending on repo type, find architectures supported.
ARCH_REGEX='(source|SRPMS|amd64|mips64el|mipsel|i386|x86_64|aarch64|ppc64le|ppc64el|s390x|armhf)'
if [[ $repo_type == "deb" ]]; then
ARCH=$(find "${repo}/dists" \( -name 'Packages.*' -o -name 'Sources.*' \) 2>/dev/null |
sed -Ene 's#.*/binary-([^/]+)/Packages.*#\1#p; s#.*/(source)/Sources.*#\1#p' |
sort -u | tr '\n' ' ')
if [[ $ARCH ]]; then
echo "Architectures: ${ARCH}"
fi
elif [[ $repo_type == "rpm" ]]; then
ARCH=$(find "$repo" -name 'repomd.xml' 2>/dev/null |
grep -Po "$ARCH_REGEX" |
sort -u | tr '\n' ' ')
if [[ $ARCH ]]; then
echo "Architectures: ${ARCH}"
fi
elif [[ $repo_type == "iso" ]]; then
ARCH=$(find "$repo" -name '*.iso' 2>/dev/null |
grep -Po "$ARCH_REGEX" |
sort -u | tr '\n' ' ')
if [[ $ARCH ]]; then
echo "Architectures: ${ARCH}"
fi
elif [[ $repo_type == "source" ]]; then
echo "Architectures: source"
fi
echo "Architectures-Configuration: ${arch_configurations:-ALL}"
echo "Upstream-mirror: ${RSYNC_HOST:-unknown}"
# Total bytes synced per rsync stage.
total=0
if [[ -f $LOGFILE_SYNC ]]; then
for bytes in $(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_SYNC"); do
total=$(( total + bytes ))
done
elif [[ -f $LOGFILE_STAGE1 ]]; then
bytes=$(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE1")
total=$(( total + bytes ))
fi
if [[ -f $LOGFILE_STAGE2 ]]; then
bytes=$(sed -Ene 's/(^|.* )sent ([0-9]+) bytes received ([0-9]+) bytes.*/\3/p' "$LOGFILE_STAGE2")
total=$(( total + bytes ))
fi
if (( total > 0 )); then
echo "Total bytes received in rsync: ${total}"
fi
# Calculate time per rsync stage and print both stages if both were started.
if [[ $sync_started ]]; then
STATS_TOTAL_RSYNC_TIME1=$(( sync_ended - sync_started ))
total_time=$STATS_TOTAL_RSYNC_TIME1
elif [[ $stage1_started ]]; then
STATS_TOTAL_RSYNC_TIME1=$(( stage1_ended - stage1_started ))
total_time=$STATS_TOTAL_RSYNC_TIME1
fi
if [[ $stage2_started ]]; then
STATS_TOTAL_RSYNC_TIME2=$(( stage2_ended - stage2_started ))
total_time=$(( total_time + STATS_TOTAL_RSYNC_TIME2 ))
echo "Total time spent in stage1 rsync: ${STATS_TOTAL_RSYNC_TIME1}"
echo "Total time spent in stage2 rsync: ${STATS_TOTAL_RSYNC_TIME2}"
fi
echo "Total time spent in rsync: ${total_time}"
if (( total_time != 0 )); then
rate=$(( total / total_time ))
echo "Average rate: ${rate} B/s"
fi
}
# Save trace file.
save_trace_file() {
# Trace file/dir paths.
TRACE_DIR="${repo}/project/trace"
mkdir -p "$TRACE_DIR"
TRACE_FILE="${TRACE_DIR}/${mirror_hostname:?}"
TRACE_MASTER_FILE="${TRACE_DIR}/master"
TRACE_HIERARCHY="${TRACE_DIR}/_hierarchy"
# Parse the rsync host from the source.
RSYNC_HOST=${source/rsync:\/\//}
RSYNC_HOST=${RSYNC_HOST%%:*}
RSYNC_HOST=${RSYNC_HOST%%/*}
# Build trace and save to file.
build_trace_content > "${TRACE_FILE}.new"
mv "${TRACE_FILE}.new" "$TRACE_FILE"
# Build hierarchy file.
{
if [[ -e "${TRACE_HIERARCHY}.mirror" ]]; then
cat "${TRACE_HIERARCHY}.mirror"
fi
echo "$(basename "$TRACE_FILE") $mirror_hostname $TRACEHOST ${RSYNC_HOST:-unknown}"
} > "${TRACE_HIERARCHY}.new"
mv "${TRACE_HIERARCHY}.new" "$TRACE_HIERARCHY"
cp "$TRACE_HIERARCHY" "${TRACE_HIERARCHY}.mirror"
# Output all traces to _traces file. Disabling shell check because the glob in this case is used right.
# shellcheck disable=SC2035
(cd "$TRACE_DIR" && find * -type f \! -name "_*") > "$TRACE_DIR/_traces"
}
# Modules based on rsync.
rsync_sync() {
MODULE=$1
shift
# Check for any arguments.
force=0
while (( $# > 0 )); do
case $1 in
# Force rsync, ignore upstream check.
-f|--force)
force=1
shift
;;
*)
echo "Unknown option $1"
echo
print_help "$@"
;;
esac
done
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval pre_hook="\$${MODULE}_pre_hook"
eval timestamp="\$${MODULE}_timestamp"
eval source="\$${MODULE}_source"
eval options="\$${MODULE}_options"
eval options_stage2="\$${MODULE}_options_stage2"
eval pre_stage2_hook="\$${MODULE}_pre_stage2_hook"
eval upstream_check="\$${MODULE}_upstream_check"
eval report_mirror="\$${MODULE}_report_mirror"
eval RSYNC_PASSWORD="\$${MODULE}_rsync_password"
if [[ $RSYNC_PASSWORD ]]; then
export RSYNC_PASSWORD
fi
eval post_hook="\$${MODULE}_post_hook"
eval jigdo_pkg_repo="\$${MODULE}_jigdo_pkg_repo"
eval arch_configurations="\$${MODULE}_arch_configurations"
eval repo_type="\$${MODULE}_type"
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
# Check if upstream was updated recently if configured.
# This is designed to slow down rsync so we only rsync
# when we detect it's needed or when last rsync was a long time ago.
if [[ $upstream_check ]] && (( force == 0 )); then
now=$(date +%s)
last_timestamp=$(cat "${timestamp:?}")
# If last update was not that long ago, we should check if upstream was updated recently.
if [[ $((now-last_timestamp)) -lt ${upstream_timestamp_min:?} ]]; then
echo "Checking upstream's last modified."
# Get the last modified date.
IFS=': ' read -r _ last_modified < <(curl -sI "${upstream_check:?}" | grep -i Last-Modified)
last_modified_unix=$(date -u +%s -d "$last_modified")
# If last modified is greater than our max age, it wasn't modified recently and we should not rsync.
if (( now-last_modified_unix > ${upstream_max_age:-0} )); then
echo "Skipping sync as upstream wasn't updated recently."
exit 88
fi
fi
fi
# Run any hooks.
if [[ $pre_hook ]]; then
echo "Executing pre-hook:"
eval "$pre_hook"
fi
# Add arguments from configurations.
extra_args="${options:-}"
# If 2 stage, we do not want to delete in stage 1.
if [[ ! $options_stage2 ]]; then
extra_args+=" --delete --delete-after"
echo "Running rsync:"
else
echo "Running rsync stage 1:"
fi
# Create archive update file.
mirror_update_file="${repo:?}/Archive-Update-in-Progress-${mirror_hostname:?}"
touch "$mirror_update_file"
LOGFILE_STAGE1="${LOGFILE}.stage1"
echo -n > "$LOGFILE_STAGE1"
LOGFILE_STAGE2="${LOGFILE}.stage2"
echo -n > "$LOGFILE_STAGE2"
# Run the rsync. Using eval here so extra_args expands and is used as arguments.
stage1_started=$(date +%s)
eval timeout 1d rsync -avH \
--human-readable \
--progress \
--safe-links \
--delay-updates \
--stats \
--no-human-readable \
--itemize-changes \
--timeout=10800 \
"$extra_args" \
--exclude "Archive-Update-in-Progress-${mirror_hostname:?}" \
--exclude "project/trace/${mirror_hostname:?}" \
"'${source:?}'" "'${repo:?}'" | tee -a "$LOGFILE_STAGE1"
RT=${PIPESTATUS[0]}
stage1_ended=$(date +%s)
# Check if run was successful.
if [[ $(grep -c '^total size is' "$LOGFILE_STAGE1") -ne 1 ]]; then
echo "Rsync failed."
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with rsync, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
exit 1
fi
# If 2 stage, perform second stage.
if [[ $options_stage2 ]]; then
# Check if upstream is currently updating.
for aupfile in "${repo:?}/Archive-Update-in-Progress-"*; do
case "$aupfile" in
"$mirror_update_file")
:
;;
*)
if [[ -f $aupfile ]]; then
# Remove the file, it will be synced again if
# upstream is still not done
rm -f "$aupfile"
else
echo "AUIP file '${aupfile}' is not really a file, weird"
fi
echo "Upstream is currently updating their repo, skipping second stage for now."
rm -f "$mirror_update_file"
exit 0
;;
esac
done
# Run any hooks.
if [[ $pre_stage2_hook ]]; then
echo "Executing pre-stage2 hook:"
eval "$pre_stage2_hook"
fi
# Add stage 2 options from configurations.
extra_args="${options_stage2:-}"
echo
echo "Running rsync stage 2:"
# Run the rsync. Using eval here so extra_args expands and is used as arguments.
stage2_started=$(date +%s)
eval timeout 1d rsync -avH \
--progress \
--safe-links \
--delete \
--delete-after \
--delay-updates \
--stats \
--no-human-readable \
--itemize-changes \
--timeout=10800 \
"$extra_args" \
--exclude "Archive-Update-in-Progress-${mirror_hostname:?}" \
--exclude "project/trace/${mirror_hostname:?}" \
"'${source:?}'" "'${repo:?}'" | tee -a "$LOGFILE_STAGE2"
RT=${PIPESTATUS[0]}
stage2_ended=$(date +%s)
# Check if run was successful.
if [[ $(grep -c '^total size is' "$LOGFILE_STAGE2") -ne 1 ]]; then
echo "Rsync stage 2 failed."
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with rsync stage 2, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
exit 1
fi
fi
# At this point we are successful, update timestamp of last sync.
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
# Run any hooks.
if [[ $post_hook ]]; then
echo "Executing post hook:"
eval "$post_hook"
fi
# Save trace information.
if [[ $repo_type ]]; then
save_trace_file
fi
rm -f "$LOGFILE_STAGE1"
rm -f "$LOGFILE_STAGE2"
# Remove archive update file.
rm -f "$mirror_update_file"
# If a report_mirror configuration file is provided, run report_mirror.
if [[ $report_mirror ]]; then
echo
echo "Reporting mirror update:"
/bin/report_mirror -c "${report_mirror:?}"
fi
log_end_header
}
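# Illustrative sketch only (not called by the script): the skip decision made by
# the upstream check earlier in the function above, written as a standalone
# predicate. The helper name should_skip_sync and its argument order are
# assumptions made for this example.
# Usage: should_skip_sync NOW LAST_SYNC UPSTREAM_LAST_MODIFIED MIN_INTERVAL MAX_AGE
should_skip_sync() {
    # Skip only when our last sync is recent AND upstream has not changed recently.
    [ $(( $1 - $2 )) -lt "$4" ] && [ $(( $1 - $3 )) -gt "$5" ]
}
# Example with hypothetical limits of one hour and one day:
#   should_skip_sync "$(date +%s)" "$last_timestamp" "$last_modified_unix" 3600 86400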
# Modules based on quick-fedora-mirror.
quick_fedora_mirror_sync() {
MODULE=$1
acquire_lock "$MODULE"
# Read the configuration for this module.
eval repo="\$${MODULE}_repo"
eval pre_hook="\$${MODULE}_pre_hook"
eval timestamp="\$${MODULE}_timestamp"
eval source="\$${MODULE}_source"
eval master_module="\$${MODULE}_master_module"
eval module_mapping="\$${MODULE}_module_mapping"
eval mirror_manager_mapping="\$${MODULE}_mirror_manager_mapping"
eval modules="\$${MODULE}_modules"
eval options="\$${MODULE}_options"
eval filterexp="\$${MODULE}_filterexp"
eval rsync_options="\$${MODULE}_rsync_options"
eval report_mirror="\$${MODULE}_report_mirror"
eval RSYNC_PASSWORD="\$${MODULE}_rsync_password"
if [[ $RSYNC_PASSWORD ]]; then
export RSYNC_PASSWORD
fi
eval post_hook="\$${MODULE}_post_hook"
eval arch_configurations="\$${MODULE}_arch_configurations"
eval repo_type="\$${MODULE}_type"
# If configuration is not set, exit.
if [[ ! $repo ]]; then
echo "No configuration exists for ${MODULE}"
exit 1
fi
log_start_header
quick_fedora_mirror_install
conf_path="${QFM_PATH}/${MODULE}_qfm.conf"
cat <<EOF > "$conf_path"
DESTD="$repo"
TIMEFILE="${LOGPATH}/${MODULE}_timefile.txt"
REMOTE="$source"
MODULES=(${modules:?})
FILTEREXP='${filterexp:-}'
VERBOSE=7
LOGITEMS=aeEl
RSYNCOPTS=(-aSH -f 'R .~tmp~' --stats --no-human-readable --preallocate --delay-updates ${rsync_options:-} --out-format='@ %i %n%L')
EOF
if [[ $master_module ]]; then
echo "MASTERMODULE='$master_module'" >> "$conf_path"
fi
if [[ $module_mapping ]]; then
echo "MODULEMAPPING=($module_mapping)" >> "$conf_path"
fi
if [[ $mirror_manager_mapping ]]; then
echo "MIRRORMANAGERMAPPING=($mirror_manager_mapping)" >> "$conf_path"
fi
# Run any hooks.
if [[ $pre_hook ]]; then
echo "Executing pre-hook:"
eval "$pre_hook"
fi
# Create archive update file.
mirror_update_file="${repo:?}/Archive-Update-in-Progress-${mirror_hostname:?}"
touch "$mirror_update_file"
LOGFILE_SYNC="${LOGFILE}.sync"
echo -n > "$LOGFILE_SYNC"
# Add arguments from configurations.
extra_args="${options:-}"
# Run quick-fedora-mirror. Using eval here so extra_args expands and is used as arguments.
sync_started=$(date +%s)
eval timeout 1d "$QFM_BIN" \
-c "'$conf_path'" \
"$extra_args" | tee -a "$LOGFILE_SYNC"
RT=${PIPESTATUS[0]}
sync_ended=$(date +%s)
# Check if run was successful.
if [[ $(grep -c '^total size is' "$LOGFILE_SYNC") -lt 1 ]]; then
echo "Rsync failed."
error_count=$((error_count+1))
if ((error_count>max_errors)); then
mail_error "Unable to sync with rsync, check logs."
rm -f "$ERRORFILE"
fi
echo "$error_count" > "$ERRORFILE"
exit 1
fi
# At this point we are successful, update timestamp of last sync.
date +%s > "${timestamp:?}"
if [[ -e $ERRORFILE ]]; then
rm -f "$ERRORFILE"
fi
# Run any hooks.
if [[ $post_hook ]]; then
echo "Executing post hook:"
eval "$post_hook"
fi
# Save trace information.
if [[ $repo_type ]]; then
save_trace_file
fi
rm -f "$LOGFILE_SYNC"
# Remove archive update file.
rm -f "$mirror_update_file"
# If a report_mirror configuration file is provided, run report_mirror.
if [[ $report_mirror ]]; then
echo
echo "Reporting mirror update:"
/bin/report_mirror -c "${report_mirror:?}"
fi
log_end_header
}
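# Illustrative sketch only (not called by the script): the success check used
# above, which looks for rsync's "total size is" summary line in the captured
# log. The helper name rsync_log_ok is an assumption made for this example, and
# it follows the "at least one summary line" variant used for quick-fedora-mirror.
# Usage: rsync_log_ok LOGFILE
rsync_log_ok() {
    # grep -c prints 0 when nothing matches, so the comparison is always numeric.
    [ "$(grep -c '^total size is' "$1")" -ge 1 ]
}
# Example:
#   rsync_log_ok "$LOGFILE_SYNC" || echo "Rsync failed."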
# If no arguments are provided, we can print help.
if (( $# < 1 )); then
print_help "$@"
fi
# Parse arguments.
while (( $# > 0 )); do
case "$1" in
# Installs utilities used by this script which are not available in the standard repositories.
-u|--update-support-utilities)
update_support_utilities
exit 0
;;
# If help is requested, print it.
-h|h|help|--help)
print_help "$@"
;;
# Sync the named module, defaulting to rsync when no sync method is configured; if no module matches, give help.
*)
for MODULE in ${MODULES:?}; do
if [[ "$1" == "$MODULE" ]]; then
eval sync_method="\${${MODULE}_sync_method:-rsync}"
if [[ "${sync_method:?}" == "git" ]]; then
git_sync "$@"
elif [[ "${sync_method:?}" == "aws" ]]; then
aws_sync "$@"
elif [[ "${sync_method:?}" == "ftp" ]]; then
ftp_sync "$@"
elif [[ "${sync_method:?}" == "wget" ]]; then
wget_sync "$@"
elif [[ "${sync_method:?}" == "qfm" ]]; then
quick_fedora_mirror_sync "$@"
else
rsync_sync "$@"
fi
exit 0
fi
done
# No module was found, so give help.
echo "Unknown module '$1'"
echo
print_help "$@"
;;
esac
done
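# Illustrative sketch only (not called by the script): the method-to-function
# mapping used by the dispatch above, expressed as a case statement. The helper
# name sync_function_for is an assumption made for this example.
# Usage: sync_function_for METHOD
sync_function_for() {
    case "$1" in
        git|aws|ftp|wget) echo "${1}_sync" ;;
        qfm) echo "quick_fedora_mirror_sync" ;;
        # Anything else, including an empty method, falls back to rsync.
        *) echo "rsync_sync" ;;
    esac
}
# Example:
#   "$(sync_function_for "${sync_method:-rsync}")" "$@"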