zsh and bash scripts that combine the names and cleaned content of all non-hidden files in a folder into a single output file, excluding the output file and the script itself.
This project automates the creation of a combined file. The script reads every non-hidden file in the current folder and appends a header with the file name, followed by the file's cleaned content, to the output file.
Additional functionality includes:
- File extension filtering
- Recursive folder traversal
- Dynamic count of processed files by file type
- Compatible with macOS and Linux, using the zsh or bash shell
- Cleans file content by removing BOMs and carriage return characters
- Excludes hidden files, folders, the output file, and the script file itself
- Supports:
  - File extension filtering (e.g., only .txt files)
  - Recursive folder traversal
  - Dynamic count of processed files by extension
- Generates a single output file containing a header and the cleaned content for each processed file
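The cleaning step can be tried in isolation. The following is a minimal sketch, not the project's own code: the helper name clean_file and the sample file are hypothetical. It assumes a POSIX-style shell and builds the BOM and carriage return as raw bytes with printf, so the sed script does not depend on sed understanding \x or \r escapes itself.

```shell
# Hypothetical helper (not part of the original scripts): print a file with a
# leading UTF-8 BOM and trailing carriage returns removed.
clean_file() {
  bom=$(printf '\357\273\277')   # UTF-8 byte order mark as raw bytes (octal)
  cr=$(printf '\r')              # carriage return as a raw byte
  # Line 1 only: drop the BOM. Every line: drop a CR at the end of the line.
  LC_ALL=C sed -e "1s/^${bom}//" -e "s/${cr}\$//" "$1"
}

# Build a sample file with a BOM and Windows line endings, then clean it.
sample=$(mktemp)
printf '\357\273\277first line\r\nsecond line\r\n' > "$sample"
cleaned=$(clean_file "$sample")
printf '%s\n' "$cleaned"
rm -f "$sample"
```

Passing the bytes in literally is a design choice for portability; it is an assumption of this sketch rather than something the original scripts do.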
- Operating System: macOS or Linux
- Shell: zsh or bash (v4.0+)
- Tools Required: find, sed, printf
These tools are included by default in macOS and most Linux distributions.
- Open a terminal.
- Go to the folder with the files to process:
cd /path/to/your/folder
- Choose one of the following options:
- Run the command in the zsh shell:
outfile="combined.txt";extensions="";rm -f "$outfile";[[ "$0" == "-bash" || "$0" == "-zsh" ]] && echo "Executed as a command in the shell." || { script_name=$(basename "$0"); echo "Executed from a script file: $script_name"; };ext_filters=();typeset -A ext_counts;count=0;if [[ -n "$extensions" ]]; then ext_array=("${(@s/,/)extensions}");ext_filters+=('(');for ext in "${ext_array[@]}";do ext_filters+=(-name "*.$ext" -o);done;ext_filters+=(-false ')');fi;LC_ALL=C find . -type f "${ext_filters[@]}" -not -path '*/.*' -not -name "$outfile" -not -name "$script_name" -print0 | while IFS= read -r -d '' file;do printf "###\n### %s\n###\n\n" "$file" >> "$outfile";sed $'1s/^\xEF\xBB\xBF//; s/\r$//' "$file" >> "$outfile";printf "\n\n" >> "$outfile";ext="${file##*.}";((ext_counts["$ext"]++));count=$((count + 1));done;echo "Processing complete. $count files processed. Split by extension:";for ext in ${(k)ext_counts};do echo "$ext: ${ext_counts[$ext]} files";done;echo "Output written to $outfile."
- Run the following command in the bash shell:
outfile="combined.txt"; extensions=""; rm -f "$outfile"; [[ "$0" == "-bash" || "$0" == "-zsh" ]] && echo "Executed as a command in the shell." || { script_name=$(basename "$0"); echo "Executed from a script file: $script_name"; }; declare -A ext_counts; count=0; ext_filters=(); if [[ -n "$extensions" ]]; then IFS=',' read -ra ext_array <<< "$extensions"; ext_filters+=('('); for ext in "${ext_array[@]}"; do ext_filters+=(-name "*.$ext" -o); done; ext_filters+=(-false ')'); fi; while IFS= read -r -d '' file; do printf "###\n### %s\n###\n\n" "$file" >> "$outfile"; sed $'1s/^\xEF\xBB\xBF//; s/\r$//' "$file" >> "$outfile"; printf "\n\n" >> "$outfile"; ext="${file##*.}"; ext_counts["$ext"]=$((ext_counts["$ext"] + 1)); count=$((count + 1)); done < <(find . -type f "${ext_filters[@]}" -not -path '*/.*' -not -name "$outfile" -not -name "$script_name" -print0); echo "Processing complete. $count files processed. Split by extension:"; for ext in "${!ext_counts[@]}"; do echo "$ext: ${ext_counts[$ext]} files"; done; echo "Output written to $outfile."
- Save and execute the script as a file:
  - Save the zsh or bash script in the folder with the files to process
  - Make it executable:
    chmod +x ./file-content-aggregator.zsh
    or, for the bash script:
    chmod +x ./file-content-aggregator.sh
  - Run the script:
    ./file-content-aggregator.zsh
    or, for the bash script:
    ./file-content-aggregator.sh
- Initialisation:
  - Sets the output file and optional file extensions
  - Excludes hidden files, the output file, and the script file itself
- File Selection:
  - Dynamically includes all files if extensions are empty
  - Otherwise, filters files based on specified extensions
- File Processing:
  - Iterates through selected files
  - For each file:
    - Appends a header with the file name
    - Cleans the file content by removing BOM and carriage returns
    - Counts files by extension
- Completion:
  - Outputs a summary of processed files by extension
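The four steps above can be sketched as a readable multi-line function. This is an illustration under stated assumptions, not the project's one-liner: the function name aggregate, its two parameters, and the demo files are hypothetical, and it assumes bash 4.0+ for the associative array.

```bash
#!/usr/bin/env bash
# Sketch only: "aggregate" and its arguments are illustrative names.
aggregate() {
  local outfile=$1 extensions=$2   # e.g. "combined.txt" and "txt,md" (or "")
  local -a filter=() ext_array=()
  local -A ext_counts=()
  local file ext count=0

  # Initialisation: start from an empty output file.
  rm -f "$outfile"

  # File selection: build ( -name *.a -o -name *.b ) as separate arguments.
  if [[ -n $extensions ]]; then
    IFS=',' read -ra ext_array <<< "$extensions"
    for ext in "${ext_array[@]}"; do
      if (( ${#filter[@]} > 0 )); then filter+=(-o); fi
      filter+=(-name "*.$ext")
    done
    filter=('(' "${filter[@]}" ')')
  fi

  # File processing: header, cleaned content, spacing, per-extension count.
  # Reading from process substitution keeps the counters in this shell.
  while IFS= read -r -d '' file; do
    printf '###\n### %s\n###\n\n' "$file" >> "$outfile"
    LC_ALL=C sed $'1s/^\xEF\xBB\xBF//; s/\r$//' "$file" >> "$outfile"
    printf '\n\n' >> "$outfile"
    ext=${file##*.}
    ext_counts[$ext]=$(( ${ext_counts[$ext]:-0} + 1 ))
    count=$(( count + 1 ))
  done < <(find . -type f "${filter[@]}" -not -path '*/.*' \
             -not -name "$outfile" -not -name "${0##*/}" -print0)

  # Completion: summary by extension.
  echo "Processing complete. $count files processed. Split by extension:"
  for ext in "${!ext_counts[@]}"; do
    echo "$ext: ${ext_counts[$ext]} files"
  done
  echo "Output written to $outfile."
}

# Example run in a throwaway folder with two visible files and one hidden file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\r\n' > a.txt
printf 'beta\n' > b.md
printf 'secret\n' > .hidden
summary=$(aggregate combined.txt "txt,md")
printf '%s\n' "$summary"
```

Passing the filter to find as an array avoids re-parsing a command string, so patterns such as *.txt reach find unexpanded.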
- Permission denied: Update folder permissions with chmod if required and ensure the script has executable permissions:
  chmod +x ./file-content-aggregator.zsh
  or, for the bash script:
  chmod +x ./file-content-aggregator.sh
- Output file not generated: Verify that the find, sed, and printf commands are working on your system by testing them individually
- Empty output: Ensure the folder contains files that match your file extension filters
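A quick way to confirm that the required tools can be found is sketched below. This is a hypothetical pre-flight check, not part of the original scripts; it only verifies that each command resolves in the current shell, not that it behaves correctly.

```shell
# Hypothetical pre-flight check: collect any required tool that cannot be found.
missing=""
for tool in find sed printf; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    missing="$missing $tool"
  fi
done

if [ -n "$missing" ]; then
  echo "Missing tools:$missing"
else
  echo "All required tools are available."
fi
```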
Contributions are welcome! Please follow these steps:
- Fork this repository
- Create a feature branch:
git checkout -b feature-branch
- Commit your changes:
git commit -m "Add feature"
- Push the branch:
git push origin feature-branch
- Submit a pull request
Please ensure all changes are well-documented and tested.
Suggestions for improvements are highly encouraged! Please ensure that your contributions adhere to the project’s coding standards and include appropriate documentation.
This project is licensed under the MIT Licence. You are free to use, modify, and distribute this project in compliance with the licence terms.