This is a sample git project that demonstrates how to track progress on a thesis, report, or any other document written in the Microsoft Word .docx file format.
It is primarily intended to work on Microsoft Windows operating systems; functionality on other platforms has not been tested.
Tracking your Microsoft Office-based thesis with git works best with GitHub Desktop.
To generate meaningful and human-readable diffs, the following packages should be installed on your desktop:
- Git for Windows
  The Windows distribution of git. Make sure to enable Linux bash commands during the installation process.
- GitHub Desktop (optional)
  A handy git GUI client for Windows that eases repository management and committing. Its use is highly recommended, but any other git GUI will likely work, too.
Download the latest release or clone this repository to your local machine and run setup.sh.
This script creates the following entries in the project's .git\config:
    [diff "docx"]
        textconv = tools/pandoc --from=docx --to=markdown --track-changes=all
        prompt = false
        binary = true
    [diff "pptx"]
        textconv = sh -c 'tools/pptx2md --disable-image --disable-wmf "$0" -o ~/.cache/git/presentation.md >/dev/null && cat ~/.cache/git/presentation.md'
        cachetextconv = true
        prompt = false
        binary = true
    [diff "pdf"]
        textconv = sh -c 'tools/pdftotext -simple -enc UTF-8 "$0" -'
        cachetextconv = true
        prompt = false
        binary = true
    [core]
        hooksPath = tools/hooks
This configuration enables human-readable diffs inside GitHub Desktop.
Note: Do not forget to unpack pandoc.zip in the tools folder.
The hooksPath setting tells git to look for hooks in the tracked hooks folder. See Publishing for an explanation of the included post-commit hook.
You can also add these configuration settings to your global .gitconfig to make them available to all your projects.
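As a rough sketch of what such a setup step could look like, the entries from the listing above can be written with git config. The function name configure_diff_drivers and its scope argument are assumptions made for illustration and are not taken from setup.sh; the pptx driver is left out for brevity.

```shell
# Sketch only: registers the docx and pdf diff drivers and the hooks path.
# configure_diff_drivers is a hypothetical helper, not part of setup.sh.
configure_diff_drivers() {
  scope=${1:-"--local"}   # --local: this repository, --global: all repositories

  case "$scope" in
    --local|--global) ;;
    *) echo "usage: configure_diff_drivers [--local|--global]" >&2; return 2 ;;
  esac

  # Word documents: convert to markdown before diffing.
  git config "$scope" diff.docx.textconv \
    'tools/pandoc --from=docx --to=markdown --track-changes=all'
  git config "$scope" diff.docx.prompt false
  git config "$scope" diff.docx.binary true

  # PDFs: extract plain text before diffing and cache the result.
  git config "$scope" diff.pdf.textconv \
    "sh -c 'tools/pdftotext -simple -enc UTF-8 \"\$0\" -'"
  git config "$scope" diff.pdf.cachetextconv true
  git config "$scope" diff.pdf.prompt false
  git config "$scope" diff.pdf.binary true

  # Let git pick up the hooks that are tracked in the repository.
  git config "$scope" core.hooksPath tools/hooks
}

# Example (run inside the repository):
#   configure_diff_drivers --local
```

Calling the helper with --global would write the same entries to the global .gitconfig. Keep in mind that git applies a diff driver only to paths that a .gitattributes file assigns to it (for example `*.docx diff=docx`), and that the relative tools/ paths presumably resolve only in repositories that contain that folder.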
While writing your document in MS Word, follow a few basic rules to avoid long loading times, crashes, and other commonly reported problems. A good starting point:
- Insert your images as links instead of embedding them in the document.
  Managing your images in the images folder has several advantages:
  - .docx files are just archives. If you embed an image, it is stored inside the media folder of the archive and your document's file size increases. Large docx files are more prone to crashes than small ones.
  - Linked images are automatically updated in the docx file whenever you replace them with a new version in the images folder. No need to manually replace (and most often resize) them inside the document.
  - The images folder is tracked by git. If you replace an image and want to revert the changes later, just click through your history and revert the changes.
If you want to follow this guide, you can add a linked image to your document via
Insert > Quick Parts > Fields > IncludePicture
Enter images/YourImageName.extension into the File Name field and you're ready to go.
You can find several helpful scripts in the tools directory. Among them are the three binaries (pandoc, pptx2md, and pdftotext) that provide human-readable diffs for Microsoft Word documents, Microsoft PowerPoint presentations, and PDFs.
Further, you will find a small Python project called py-pdf, which includes scripts to manipulate PDF files generated from your document. These allow you to tailor the PDF TOC to your needs and to replace bitmap graphics embedded in the PDF with vector images.
See the corresponding ReadMe files for details.
This package comes with two pre-configured automatic release methods; manual releases are possible as well:
- Using the post-commit hook
  Inside the hooks folder you can find a post-commit hook. This is enabled by default when running the installation script and is the primary way to automate releases on a local machine.
  The hook is triggered by commits whose message has the form "Prepare v7.4", where "Prepare v" is the keyword the hook checks for and the version number that follows is used for the release. If a matching commit is detected and it contains changes to one or more .docx files, the hook copies these files to the releases folder and submits a "Release _file name_ v7.4" commit to the repository.
- Using the GitHub workflow
  Inside the .github/workflows folder you can find a GitHub action file release.yml.dist. This action triggers a release similar to the one executed by the git hook, but only when changes to a .docx file are pushed to the GitHub repository and the commit message contains "Prepare v" (see the description of the hook).
  To enable the action, simply remove the .dist extension from the workflow file.
- Manual releases
  Of course, you can also manually release a new version of a document by
  - committing all changes,
  - copying the document to be released to the releases folder,
  - renaming the new file with vX.x,
  - committing the new release to the repository.
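The trigger check described for the hook can be sketched in a few lines of shell. This is only an illustration under assumptions: the function names are hypothetical, the version is assumed to consist of dot-separated numbers, and the shipped post-commit hook may work differently.

```shell
# Sketch only: hypothetical helpers, not the code of the shipped hook.

# Print the version that follows the "Prepare v" keyword in a commit message.
# Returns a non-zero status if the keyword is missing.
extract_release_version() {
  printf '%s\n' "$1" \
    | sed -nE 's/.*Prepare v([0-9]+(\.[0-9]+)*).*/\1/p' \
    | head -n 1 \
    | grep .
}

# Build the message of the follow-up release commit.
release_commit_message() {
  printf 'Release %s v%s\n' "$1" "$2"
}

# List the .docx files changed by the latest commit (run inside a repository).
changed_docx_files() {
  git diff-tree --no-commit-id --name-only -r --root HEAD -- '*.docx'
}

# Example (inside a post-commit hook):
#   if version=$(extract_release_version "$(git log -1 --format=%B)"); then
#     changed_docx_files   # candidates to copy to the releases folder
#   fi
```

Splitting the check into small functions keeps the keyword matching testable without a repository; only changed_docx_files needs to run inside one.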
Improvements, suggestions, and questions about this project are welcome at any time.
Feel free to open an issue if you think you have found a mistake, have suggestions for improvements or extensions, or want to discuss anything else.
This project is published under the MIT License.
For more information, please refer to the LICENSE file.