ssync
ssync is a suite of utilities for syncing a remote device with a local one regardless of file structure. For example, if your remote storage's structure and design differ from your local one, rsync can't easily fetch everything 1:1. ssync instead indexes all of the files it hasn't yet seen, then queues them for download into a single local target dir. Reorganizing the files afterwards is a separate, local step that you handle however you want. I generally use automv for common/repeat fetches.
setup
ssync works via three main modules:
- ssync-index - a script that indexes a directory (run separately from remote and local target)
- ssync-queue - generates the queue for -fetch to fetch
- ssync-fetch - fetches off of the queue
An optional ssync wrapper bundles these into a single executable that runs the typical (aka "my desired") use case.
The typical process is:
1) have a cron job execute ssync on the local/target system
2) it runs index against the remote and local systems, refreshing the index for anything new
3) it then runs queue, which generates the queue of yet-to-be-downloaded files from the index, appending any new files to the queue
4) finally, it runs fetch to process the queue
logic flow
- (s) establish a lock
- (i) Generate the remote and local indexes
- (q) Compare remote to local, adding the left-side diff (files present only on the remote) to the queue
- (f) iterate over the queue
- (f) check if the file exists
- (f) if not download
- (s) report queue diff, process duration, and status
- (s) release lock
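The flow above can be sketched roughly as follows. This is a minimal illustration, not ssync's actual code: the names `file_lock`, `build_queue`, and `run_sync` are hypothetical, the indexes are assumed to be plain sets of remote-relative paths, and the download step is passed in as a callback so the transport (ssh, scp, etc.) stays out of the picture. Matching an existing local file by base name in a flat target dir is also an assumption.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(lock_path):
    """(s) Hold an exclusive lock file for the duration of the run."""
    # O_EXCL makes creation fail (FileExistsError) if another run holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # (s) release lock


def build_queue(remote_index, local_index):
    """(q) Left-side diff: entries in the remote index missing from the local one."""
    return sorted(set(remote_index) - set(local_index))


def run_sync(remote_index, local_index, target_dir, download, lock_path):
    """Run one pass; `download(remote_path, dest)` is a caller-supplied callback."""
    started = time.monotonic()
    with file_lock(lock_path):
        queue = build_queue(remote_index, local_index)
        fetched = []
        for remote_path in queue:  # (f) iterate over the queue
            # Assumption: everything lands in one flat target dir, keyed by base name.
            dest = Path(target_dir) / Path(remote_path).name
            if dest.exists():  # (f) check if the file exists
                continue
            download(remote_path, dest)  # (f) if not, download
            fetched.append(remote_path)
    # (s) report queue diff, process duration, and status
    return {
        "queued": len(queue),
        "fetched": len(fetched),
        "seconds": time.monotonic() - started,
    }
```

Because the lock is created with `O_EXCL`, a second run started while the first is still working fails immediately instead of fetching the same files twice.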
Notes on queue maintenance
Rather than maintain a persistent queue, each process generates its own queue from the index diffs. Each queue is placed in a specified directory, defaulting to $XDG_CACHE_DIR/ssync/queue/ (if that variable is not set, /home/$USER/.config/ is used if it exists; otherwise /tmp/ssync/queue/ is created and used).
Since each run works off of process-local indexes, the queue can be reaped between processes without risking data loss.
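The default-directory fallback described above might look something like this. It is only a sketch: `resolve_queue_dir` is a hypothetical name, the environment is passed in as a mapping, and appending ssync/queue under /home/$USER/.config/ is my assumption, since the text only names the directory itself.

```python
import os
from pathlib import Path


def resolve_queue_dir(env, dir_exists=os.path.isdir):
    """Pick the queue dir using the fallback order described in the notes."""
    cache = env.get("XDG_CACHE_DIR")  # variable name taken from the text above
    if cache:
        return Path(cache) / "ssync" / "queue"
    config = Path("/home") / env.get("USER", "") / ".config"
    if dir_exists(config):
        # Assumption: the queue lives in an ssync/queue subdirectory here too.
        return config / "ssync" / "queue"
    # Last resort; the caller is expected to create this directory.
    return Path("/tmp/ssync/queue")
```

Calling it as `resolve_queue_dir(os.environ)` would use the real environment. The function only returns a path, so creating the directory is left to the caller.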
Index window
So long as files on the local system are expected to persist longer than they do on the remote, you'll always be safe. The index window sets a maximum lookback, so that older files may be removed from the local system without being resynced.
In a previous implementation, not starting from scratch on each run led to cases where a set of sequential files had a gap: img-1, img-2, img-3, img-5, img-6, with img-4 missing. It was annoying and easy to overlook.
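One way to picture the index window is as a cutoff applied while indexing. The sketch below assumes the window is measured against each file's modification time, which the text does not state, and `within_window` is a hypothetical helper rather than part of ssync.

```python
import time


def within_window(entries, window_s, now=None):
    """Keep paths whose mtime falls inside the lookback window.

    `entries` is an iterable of (path, mtime_epoch_seconds) pairs.
    """
    now = time.time() if now is None else now
    cutoff = now - window_s
    return [path for path, mtime in entries if mtime >= cutoff]
```

With a 24-hour window, a file last modified two days ago is left out of the index entirely, so deleting it locally will not cause it to be queued again.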
Configuration
ssync.conf
remote_host=HOST
remote_root_dir=/path/to/sync/root/
keyfile=/path/to/key
local_root_dir=/path/to/local/sync/root/
index_window_s=86400 # 24 hours
index_dir=/path/to/index/dir # optional
queue_dir=/path/to/queue/dir # optional
lock_file=/path/to/desired/file.lock # optional
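For illustration, a file in this shape can be read with a few lines of parsing. This is a sketch under the assumption that the format is plain key=value with trailing # comments; `parse_conf` is a hypothetical helper, and ssync itself may read the file differently (for example by sourcing it from a shell).

```python
def parse_conf(text):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        # Drop anything after '#'. Note: this would also truncate a value
        # that legitimately contains '#'.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings
```

Values come back as strings, so a numeric setting such as index_window_s would still need an `int(...)` conversion by the caller.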
commands
ssync [options]
OPTIONS
-c [CONFIG_FILE] optional config file to use
default: ~/.config/ssync/ssync.conf
-l [LOCK_FILE] optional lock file
default: from config
-q [QUEUE_DIR] optional queue dir
default: from config
-k [KEY_FILE] optional key file
default: from config
ssync-index [options] -c [FILE]
REQUIRED
-c [CONFIG_FILE] config file to use
OPTIONS
-l local only (cannot be used in conjunction with -r)
-r remote only (cannot be used in conjunction with -l)
-o [OUTPUT_FILE] output file override
-k [KEY_FILE] key file override
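The ssync-index option rules above, in particular that -l and -r cannot be combined, could be expressed like this. It is a sketch of the interface only, using Python's argparse; it is not how ssync-index is implemented, and the `dest` names are made up for the example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the ssync-index options listed above."""
    parser = argparse.ArgumentParser(prog="ssync-index")
    parser.add_argument("-c", dest="config", required=True, metavar="CONFIG_FILE",
                        help="config file to use")
    # -l and -r are mutually exclusive: argparse rejects using both together.
    side = parser.add_mutually_exclusive_group()
    side.add_argument("-l", dest="local_only", action="store_true", help="local only")
    side.add_argument("-r", dest="remote_only", action="store_true", help="remote only")
    parser.add_argument("-o", dest="output", metavar="OUTPUT_FILE",
                        help="output file override")
    parser.add_argument("-k", dest="keyfile", metavar="KEY_FILE",
                        help="key file override")
    return parser
```

Passing both -l and -r, or omitting -c, makes the parser print a usage error and exit, which matches the REQUIRED and "cannot be used in conjunction" notes.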