Support git annex get/git annex copy --from over http(s):// #3

Closed
kousu opened this issue Apr 27, 2022 · 6 comments

Comments

@kousu
Member

kousu commented Apr 27, 2022

Currently #1 exposes git-annex-shell to the ssh environment that ./gitea serv presents to clients.

This is enough to make the following work:

git clone [email protected]:some/repo.git
cd repo
git annex get .

But this fails:

git clone https://data.example.com/some/repo.git
cd repo
git annex get .

So, this is fine for data curators and data collaborators, but it is not fine for downstream users trying to use our open access data.

It also means that the prototype we built over in https://github.com/neuropoly/computers/issues/167 that uses DroneCI is broken. Drone clones over https -- passing itself an HTTP Bearer token in lieu of adding an extra ssh user.

There are some instructions on how to make this work with git-annex, and a live repo that functions this way at https://downloads.kitenet.net/.git/.

Implement this into #1.

Also, make sure to respect access control. That demo is meant as a public repo, so there's no access control deployed -- it literally just exposes the .git/ folder, including .git/annex, to the whole web, using Apache -- so we need to instead layer Gitea's auth system and Gitea's built in web-server in. The use-case I have in mind is only our own public repos, but I don't want to do a half-job where security is involved.

@kousu
Member Author

kousu commented May 7, 2022

According to the instructions, the key points are really just

  1. make sure the raw .git/ folder is accessible over http
  2. run git update-server-info periodically, ideally from a git hook, though running it manually or from a cronjob works too (experimentally: this generates .git/info/refs, which the client needs to find the remote branches)

git clone works

client:

p115628@joplin:~/datasets/test$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git
Cloning into 'candice-fmri-'...
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 71 (delta 21), reused 0 (delta 0)
Unpacking objects: 100% (71/71), 9.70 KiB | 551.00 KiB/s, done.

server:

root@data:~# tail -f /var/log/nginx/access.log
132.207.65.211 - - [07/May/2022:21:30:31 -0400] "GET /uofc/candice-fmri-.git/info/refs?service=git-upload-pack HTTP/1.1" 200 567 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:21:30:31 -0400] "POST /uofc/candice-fmri-.git/git-upload-pack HTTP/1.1" 200 14904 "-" "git/2.25.1"

But

git annex get fails
p115628@joplin:~/datasets/test/candice-fmri-$ git annex get
(merging origin/git-annex origin/synced/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz 
  Remote origin not usable by git-annex; setting annex-ignore
(not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
(recording state in git...)
git-annex: get: 3 failed

server:

root@data:~# tail -f /var/log/nginx/access.log
132.207.65.211 - - [07/May/2022:21:37:20 -0400] "GET /uofc/candice-fmri-.git/config HTTP/1.1" 404 11 "-" "git-annex/8.20200226"

git-annex made only one request before giving up on our server: it couldn't fetch .git/config, so it marked the remote as unusable by setting annex-ignore.
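The same probe can be reproduced without git-annex. Below is a minimal Python sketch, assuming (from the access log above) that the repository's config file is the first thing git-annex asks for. The names `annex_probe_url`, `remote_looks_usable` and `http_status` are hypothetical helpers, not part of git-annex.

```python
import urllib.error
import urllib.request
from typing import Callable


def annex_probe_url(repo_url: str) -> str:
    """URL of the repo's config file, the first file requested in the log above."""
    return repo_url.rstrip("/") + "/config"


def http_status(url: str) -> int:
    """Fetch url with a plain GET and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report the code instead
        return err.code


def remote_looks_usable(repo_url: str,
                        fetch_status: Callable[[str], int] = http_status) -> bool:
    """True if the config file is served with HTTP 200.

    fetch_status is injectable so the check can be run against a stub.
    """
    return fetch_status(annex_probe_url(repo_url)) == 200


if __name__ == "__main__":
    # Stub mimicking the server before the nginx change: config is a 404.
    print(remote_looks_usable("https://data.example.com/some/repo.git",
                              lambda url: 404))
```

If this returns False for a remote, git-annex would be expected to behave as in the transcript above, although the exact sequence of requests may differ between git-annex versions.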

According to joeyh's instructions just exposing the .git/ folder to the web with Apache is enough, so I tried porting the idea to nginx by adding this into /etc/nginx/sites-available/gitea:

        location /uofc/candice-fmri-.git {
                root /srv/gitea/data/gitea-repositories/;
                autoindex on;
        }

(uploading over http requires some extra server-side CGI stuff but downloading just needs to be able to read the raw git files, so this is enough)

Now at https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git/ I can see

[Screenshot: "Index of uofc/candice-fmri-.git" directory listing]

And now this works:

git annex get

client:

p115628@joplin:~/datasets/test/candice-fmri-$ git config --unset remote.origin.annex-ignore # have to reset this or git-annex refuses
p115628@joplin:~/datasets/test/candice-fmri-$ git annex get
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz (from origin...) 
(checksum...) ok                  
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (from origin...) 
(checksum...) ok
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (from origin...) 
(checksum...) ok
(recording state in git...)

server:

132.207.65.211 - - [07/May/2022:21:55:53 -0400] "GET /uofc/candice-fmri-.git/config HTTP/1.1" 200 132 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:21:55:54 -0400] "GET /uofc/candice-fmri-.git/annex/objects/c34/98e/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz HTTP/1.1" 200 6472001 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:21:55:54 -0400] "GET /uofc/candice-fmri-.git/annex/objects/433/2b9/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz HTTP/1.1" 200 6677912 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:21:55:55 -0400] "GET /uofc/candice-fmri-.git/annex/objects/58d/8ea/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz HTTP/1.1" 200 5688327 "-" "git-annex/8.20200226"

It looks like git update-server-info is already handled by gitea, since info/refs is up to date after each push, e.g.

info/refs
gitea@data:~/data/gitea-repositories/uofc/candice-fmri-.git$ cat info/refs 
2c085864431e37dda1123fc001cae68ec87ad0ee        refs/heads/git-annex
f9a91462ee31bd349c278767d3c607852bc6996e        refs/heads/master
f5a8660ae33e725df347ff34f8e5e6a2ad866f1d        refs/heads/synced/git-annex
f9a91462ee31bd349c278767d3c607852bc6996e        refs/heads/synced/master
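To compare this file against what a client expects, it can be parsed into a ref-to-commit map. This is a small Python sketch, assuming the whitespace-separated "object id, ref name" layout shown above; `parse_info_refs` is a hypothetical helper name.

```python
from typing import Dict


def parse_info_refs(text: str) -> Dict[str, str]:
    """Map ref name -> object id from the contents of info/refs."""
    refs: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        object_id, ref_name = line.split(None, 1)
        refs[ref_name.strip()] = object_id
    return refs


SAMPLE = """\
2c085864431e37dda1123fc001cae68ec87ad0ee\trefs/heads/git-annex
f9a91462ee31bd349c278767d3c607852bc6996e\trefs/heads/master
f5a8660ae33e725df347ff34f8e5e6a2ad866f1d\trefs/heads/synced/git-annex
f9a91462ee31bd349c278767d3c607852bc6996e\trefs/heads/synced/master
"""

if __name__ == "__main__":
    for ref, oid in parse_info_refs(SAMPLE).items():
        print(ref, oid)
```

A stale info/refs would show up here as an object id that no longer matches the branch tip on the server.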

@kousu
Member Author

kousu commented May 8, 2022

I got lucky while googling and figured out how to get the nginx code to handle all repos:

        location / {
                root /srv/gitea/data/gitea-repositories/;
                try_files $uri @backend;
                #autoindex on;   # this is unusable because it's superseded by try_files
        }

        location @backend {
                proxy_pass http://gitea;
        }

How this works: for any request, nginx first looks directly in the git repo on the filesystem. If there's an exact match, it returns that file; otherwise it forwards the request to gitea.
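The lookup order can be modelled in a few lines of Python. This is only a sketch of the decision described above, not of nginx itself: the function name `try_files` and the "@backend" return value simply echo the config, and the normalisation step is an assumption about how the URI is cleaned before the lookup.

```python
import os
import posixpath


def try_files(root: str, uri: str) -> str:
    """Return the path of the file to serve, or "@backend" to forward to gitea."""
    # Drop the query string; only the path takes part in the file lookup
    path = uri.split("?", 1)[0]
    # Collapse "." and ".." so the lookup cannot climb above root
    clean = posixpath.normpath("/" + path.lstrip("/"))
    candidate = os.path.join(root, clean.lstrip("/"))
    # Only an existing regular file counts as a match;
    # directories and missing paths fall through to the backend
    if os.path.isfile(candidate):
        return candidate
    return "@backend"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as root:
        repo = os.path.join(root, "uofc", "repo.git")
        os.makedirs(repo)
        with open(os.path.join(repo, "config"), "w") as fh:
            fh.write("[core]\n")
        print(try_files(root, "/uofc/repo.git/config"))  # served from disk
        print(try_files(root, "/uofc/repo.git/"))        # directory: forwarded
        print(try_files(root, "/user/login"))            # missing: forwarded
```

One consequence visible in the model is that a request with a query string, such as info/refs?service=git-upload-pack, is answered from disk whenever the plain file exists.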

The autoindex (see the screenshot above) no longer works: try_files $uri only matches regular files, so a request for a directory falls through to gitea instead of producing a listing. That's not really a problem, though.

You can see it working here:

client

p115628@joplin:~/datasets/test$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git
Cloning into 'candice-fmri-'...
p115628@joplin:~/datasets/test$ cd candice-fmri-/
p115628@joplin:~/datasets/test/candice-fmri-$ git annex get 
(merging origin/git-annex origin/synced/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz (from origin...) 
(checksum...) ok                  
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (from origin...) 
(checksum...) ok
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (from origin...) 
(checksum...) ok
(recording state in git...)

server

132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/info/refs?service=git-upload-pack HTTP/1.1" 200 256 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/HEAD HTTP/1.1" 200 23 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/objects/f9/a91462ee31bd349c278767d3c607852bc6996e HTTP/1.1" 200 119 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/objects/56/60aab0cd71d24ec424e2e51a34dcaaaf491abb HTTP/1.1" 200 490 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/objects/2c/085864431e37dda1123fc001cae68ec87ad0ee HTTP/1.1" 200 146 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/objects/f5/a8660ae33e725df347ff34f8e5e6a2ad866f1d HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:21 -0400] "GET /uofc/candice-fmri-.git/objects/00/de965e4d9e205d4fe344c45fa60eaf938e6437 HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/f6/78a93d40e46ec02ce5e4786de44b1663ee097c HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/c4/00d059b86082c80a8f4ee65625dd3b9439f434 HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/88/4d0f271f009c26d2c8722a91c89aae59b58576 HTTP/1.1" 200 1631 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/5a/ebfef22e5ebdac46e6537630e0d893963723ec HTTP/1.1" 200 485 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/7c/4881e682e1c868ea5db7495a6c60f9c202b7ac HTTP/1.1" 200 460 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/df/37cb78125846845f7be40dfa3bc18cae5f4cb6 HTTP/1.1" 200 200 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/5e/ce39f79c20ed7881c84925072094db6baa66e3 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/00/2d227efd954e613cf9b81c42547736d0191f03 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/27/b33f4d608dad672d33c48bf2aa0c1d5fb9c15e HTTP/1.1" 200 368 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/54/b3f7f085209c9736ee5832d1a3e0e74f743b78 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/c3/a4b932e6788c221b33040f38ff95aef812073f HTTP/1.1" 200 316 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/44/87865ccee2b88a27c6306838eede14ad83e83a HTTP/1.1" 200 311 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/07/d617edc7fe49ac095990665d60a7cfb73fe714 HTTP/1.1" 200 64 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/7a/4441755b165d4d5bbc01f068db21424440f4ec HTTP/1.1" 200 352 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/a8/ef4455070dbece500d6d3da1216422f2eaf65a HTTP/1.1" 200 297 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/88/1220ec136cd1f5d2338d0ed8805f8a787c2940 HTTP/1.1" 200 309 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/44/f16703f8a3229b18e0e66be78b21ade6dc675c HTTP/1.1" 200 282 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/82/2b66268d639a756b7bfeea9bad19f97ada40a2 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/86/5368494aecfaa24d6c360baba11f329b704a08 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/49/9e9af8978134cbdcf186175dfbb8b37e5f3491 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/ab/acfc428f9e57d9b5d06252a161635589fc9f3c HTTP/1.1" 200 81 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/1e/1b185ab0265f4d13f1e0f055497c3aa2c38ff5 HTTP/1.1" 200 83 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/45/7be23d5c9785539a988cc02070a0433a365866 HTTP/1.1" 200 385 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/ca/2e03ea84785aa7031a96ef5a1be261db73e2da HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/86/70bc80eac806377184cf355118c0200d35a75a HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/80/aa69c7dbc433db6ad9a2580ecf51b3687adada HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/2a/5cef83c8176163fff6475a2b24bb86ef1009ed HTTP/1.1" 200 131 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/9c/abf59a5eba85c204dcba4a6a009d0bb0d158d8 HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/05/07133349c92819bbeb8e0103fb0114908962d2 HTTP/1.1" 200 144 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/b4/3974cf459a0c70156f95d31b42a3c8bafd696c HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/09/a2650d8024039872b6ad4f3aba2c83745d9e04 HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/05/ea163aa807e3f9008015231a002c9ecb0c7eb0 HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/74/44e9da2c12208e2c26eab230e10c48c33c49b7 HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/f8/46d15ea48ad71cbdb6ee963210d23f6b6cd411 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/4a/3c756803a27a935b985c00e688136f4a6bd4f9 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/e1/344628dbdea0b85ee1a8bc6c9da18c1916a902 HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/83/05a7ad0be5eca5128b6167cf43c9d64340677b HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/84/9b5e3cca8a2c078cd77b6ef9e41a359520797b HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/aa/5fa7ed7bbe17558cad7b742eef0e1296fbf495 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/c0/336dd31f573a8b09644355ac087d43f92754dd HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/8b/a4fcf3294d46acced8f1df1a3a2c5714d3d9a9 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/4c/6b304908bcee9013dbfd613fd25f035d45b275 HTTP/1.1" 200 114 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/5b/521858bcf8208499301160a8f1887a8b7dd0e9 HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/8d/fc37e7b630cb1718a002e0f2b53560fc0fd4a7 HTTP/1.1" 200 920 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/88/1e520cb47300085f45f2e90848dbbcb67a5fae HTTP/1.1" 200 110 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/ab/978970b05e7b7a88bcab8e7cbfbe5cddc93256 HTTP/1.1" 200 914 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/42/14c23191e3df0c73f393481a062588ab56b8f7 HTTP/1.1" 200 111 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/70/123d41759cf58ef0fc254506ed5266f0f039a5 HTTP/1.1" 200 906 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/86/7cfc1a1699d582d0f49920f47ce038f1a5a3cc HTTP/1.1" 200 110 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/c6/a8e55ab0a2f10bb6941e7e0531af1736f965a2 HTTP/1.1" 200 115 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/2a/3538e5b73b6acbf24557c6de2a7febf5ea818d HTTP/1.1" 200 116 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/3b/bb29962ca37b086790cb5cf9d9da30d88d0f0d HTTP/1.1" 200 116 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/2e/1b77bb58fdb3da9f0d12b94deb47c5737c3d80 HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/84/f3f487752dbb28dce1499d9b00198274a831fe HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/ee/ba42ac36e1d79251247b97f6d5317222c23b79 HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/e0/1f0b2bd727b1c73b84c9a3c9d2d2a9d91f1f7a HTTP/1.1" 200 137 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/25/0f51389d8aedb049c1ce4ddfed00b80e00f966 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/5c/b984eb14f4aaa676ff231ac1dc6644c1ee8037 HTTP/1.1" 200 137 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/da/9e86353104c7fd5702977183446d5d859886e3 HTTP/1.1" 200 82 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/31/a5306a18f68ac3e7ef548782b315839f9efaae HTTP/1.1" 200 144 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/24/48b50fcc3d5590b2d44e8f4ec04f3a03154240 HTTP/1.1" 200 70 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/68/c17fe9af5d6f778ea1a957985cb69d64da9f4b HTTP/1.1" 200 70 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/17/017d3d45365d9c41fbc07db9e56c8c487b4028 HTTP/1.1" 200 71 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/ba/5138b969792a0770dc71a0bfdf70f3888cd4b1 HTTP/1.1" 200 53 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:22 -0400] "GET /uofc/candice-fmri-.git/objects/bd/2311658ea3d543522c24d1bb0ea6509476673d HTTP/1.1" 200 118 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:46:38 -0400] "GET /uofc/candice-fmri-.git/config HTTP/1.1" 200 132 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:39 -0400] "GET /uofc/candice-fmri-.git/annex/objects/c34/98e/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz HTTP/1.1" 200 6472001 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:40 -0400] "GET /uofc/candice-fmri-.git/annex/objects/433/2b9/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz HTTP/1.1" 200 6677912 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:40 -0400] "GET /uofc/candice-fmri-.git/annex/objects/58d/8ea/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz HTTP/1.1" 200 5688327 "-" "git-annex/8.20200226"

Shortcomings

Directory Traversal?

I did a quick test to make sure this was safe against directory-traversal also, and it seems to be:

$ curl https://data.praxisinstitute.org.dev.neuropoly.org/../../../../../../../../../../etc/passwd
Not found.
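One caveat, offered as an assumption about the client rather than something observed in this thread: curl normally collapses /../ sequences in the URL before sending the request unless --path-as-is is given, so the server may only ever have seen /etc/passwd. The Python sketch below shows that collapsing, using `posixpath.normpath` as a stand-in for RFC 3986 dot-segment removal (the two agree on this input but are not identical in general); `path_after_dot_segment_removal` is a hypothetical helper name.

```python
import posixpath
from urllib.parse import urlsplit


def path_after_dot_segment_removal(url: str) -> str:
    """Approximate the request path a normalising HTTP client would send."""
    # urlsplit leaves the path untouched, so the ".." segments are still present here
    raw_path = urlsplit(url).path or "/"
    return posixpath.normpath(raw_path)


if __name__ == "__main__":
    url = ("https://data.praxisinstitute.org.dev.neuropoly.org"
           "/../../../../../../../../../../etc/passwd")
    print(path_after_dot_segment_removal(url))
```

Repeating the curl test with --path-as-is and checking the nginx access log for the literal ../ sequence would confirm what the server actually received.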

ACLs?

But this precludes Gitea's permission system, so even if I make this repo private

Screenshot 2022-05-07 at 22-23-18 candice-fmri-
Screenshot 2022-05-07 at 22-23-31 candice-fmri-

I can download it without a password:

client:

p115628@joplin:~/datasets/test$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git
Cloning into 'candice-fmri-'...

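(funny, this behaviour is different than before: before, it used POST git-upload-pack to download everything as a single pack, and had a progress bar to go with it; this time it's silent as it downloads the contents of .git/objects/ one-by-one. Presumably nginx is now serving info/refs as a static file, so git falls back to its "dumb" HTTP protocol:)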

server:

132.207.65.211 - - [07/May/2022:22:30:30 -0400] "GET /uofc/candice-fmri-.git/info/refs?service=git-upload-pack HTTP/1.1" 200 256 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:30 -0400] "GET /uofc/candice-fmri-.git/HEAD HTTP/1.1" 200 23 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:30 -0400] "GET /uofc/candice-fmri-.git/objects/f9/a91462ee31bd349c278767d3c607852bc6996e HTTP/1.1" 200 119 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/56/60aab0cd71d24ec424e2e51a34dcaaaf491abb HTTP/1.1" 200 490 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/2c/085864431e37dda1123fc001cae68ec87ad0ee HTTP/1.1" 200 146 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/f5/a8660ae33e725df347ff34f8e5e6a2ad866f1d HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/00/de965e4d9e205d4fe344c45fa60eaf938e6437 HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/f6/78a93d40e46ec02ce5e4786de44b1663ee097c HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/c4/00d059b86082c80a8f4ee65625dd3b9439f434 HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/88/4d0f271f009c26d2c8722a91c89aae59b58576 HTTP/1.1" 200 1631 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/5a/ebfef22e5ebdac46e6537630e0d893963723ec HTTP/1.1" 200 485 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/7c/4881e682e1c868ea5db7495a6c60f9c202b7ac HTTP/1.1" 200 460 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/df/37cb78125846845f7be40dfa3bc18cae5f4cb6 HTTP/1.1" 200 200 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/00/2d227efd954e613cf9b81c42547736d0191f03 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/5e/ce39f79c20ed7881c84925072094db6baa66e3 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/27/b33f4d608dad672d33c48bf2aa0c1d5fb9c15e HTTP/1.1" 200 368 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/54/b3f7f085209c9736ee5832d1a3e0e74f743b78 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/44/87865ccee2b88a27c6306838eede14ad83e83a HTTP/1.1" 200 311 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/c3/a4b932e6788c221b33040f38ff95aef812073f HTTP/1.1" 200 316 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/07/d617edc7fe49ac095990665d60a7cfb73fe714 HTTP/1.1" 200 64 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/7a/4441755b165d4d5bbc01f068db21424440f4ec HTTP/1.1" 200 352 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/a8/ef4455070dbece500d6d3da1216422f2eaf65a HTTP/1.1" 200 297 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/44/f16703f8a3229b18e0e66be78b21ade6dc675c HTTP/1.1" 200 282 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/88/1220ec136cd1f5d2338d0ed8805f8a787c2940 HTTP/1.1" 200 309 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/82/2b66268d639a756b7bfeea9bad19f97ada40a2 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/86/5368494aecfaa24d6c360baba11f329b704a08 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/49/9e9af8978134cbdcf186175dfbb8b37e5f3491 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/1e/1b185ab0265f4d13f1e0f055497c3aa2c38ff5 HTTP/1.1" 200 83 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/45/7be23d5c9785539a988cc02070a0433a365866 HTTP/1.1" 200 385 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/ab/acfc428f9e57d9b5d06252a161635589fc9f3c HTTP/1.1" 200 81 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/ca/2e03ea84785aa7031a96ef5a1be261db73e2da HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/86/70bc80eac806377184cf355118c0200d35a75a HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/2a/5cef83c8176163fff6475a2b24bb86ef1009ed HTTP/1.1" 200 131 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/9c/abf59a5eba85c204dcba4a6a009d0bb0d158d8 HTTP/1.1" 200 197 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/80/aa69c7dbc433db6ad9a2580ecf51b3687adada HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/05/07133349c92819bbeb8e0103fb0114908962d2 HTTP/1.1" 200 144 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/b4/3974cf459a0c70156f95d31b42a3c8bafd696c HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/05/ea163aa807e3f9008015231a002c9ecb0c7eb0 HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/74/44e9da2c12208e2c26eab230e10c48c33c49b7 HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/09/a2650d8024039872b6ad4f3aba2c83745d9e04 HTTP/1.1" 200 93 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/f8/46d15ea48ad71cbdb6ee963210d23f6b6cd411 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/4a/3c756803a27a935b985c00e688136f4a6bd4f9 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/83/05a7ad0be5eca5128b6167cf43c9d64340677b HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/84/9b5e3cca8a2c078cd77b6ef9e41a359520797b HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/e1/344628dbdea0b85ee1a8bc6c9da18c1916a902 HTTP/1.1" 200 138 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/aa/5fa7ed7bbe17558cad7b742eef0e1296fbf495 HTTP/1.1" 200 46 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/c0/336dd31f573a8b09644355ac087d43f92754dd HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/4c/6b304908bcee9013dbfd613fd25f035d45b275 HTTP/1.1" 200 114 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/5b/521858bcf8208499301160a8f1887a8b7dd0e9 HTTP/1.1" 200 143 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/8b/a4fcf3294d46acced8f1df1a3a2c5714d3d9a9 HTTP/1.1" 200 45 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/8d/fc37e7b630cb1718a002e0f2b53560fc0fd4a7 HTTP/1.1" 200 920 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/88/1e520cb47300085f45f2e90848dbbcb67a5fae HTTP/1.1" 200 110 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/42/14c23191e3df0c73f393481a062588ab56b8f7 HTTP/1.1" 200 111 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/70/123d41759cf58ef0fc254506ed5266f0f039a5 HTTP/1.1" 200 906 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/ab/978970b05e7b7a88bcab8e7cbfbe5cddc93256 HTTP/1.1" 200 914 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/86/7cfc1a1699d582d0f49920f47ce038f1a5a3cc HTTP/1.1" 200 110 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/c6/a8e55ab0a2f10bb6941e7e0531af1736f965a2 HTTP/1.1" 200 115 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/3b/bb29962ca37b086790cb5cf9d9da30d88d0f0d HTTP/1.1" 200 116 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/2a/3538e5b73b6acbf24557c6de2a7febf5ea818d HTTP/1.1" 200 116 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/84/f3f487752dbb28dce1499d9b00198274a831fe HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/2e/1b77bb58fdb3da9f0d12b94deb47c5737c3d80 HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/ee/ba42ac36e1d79251247b97f6d5317222c23b79 HTTP/1.1" 200 105 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/25/0f51389d8aedb049c1ce4ddfed00b80e00f966 HTTP/1.1" 200 139 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:31 -0400] "GET /uofc/candice-fmri-.git/objects/e0/1f0b2bd727b1c73b84c9a3c9d2d2a9d91f1f7a HTTP/1.1" 200 137 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/da/9e86353104c7fd5702977183446d5d859886e3 HTTP/1.1" 200 82 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/5c/b984eb14f4aaa676ff231ac1dc6644c1ee8037 HTTP/1.1" 200 137 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/31/a5306a18f68ac3e7ef548782b315839f9efaae HTTP/1.1" 200 144 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/24/48b50fcc3d5590b2d44e8f4ec04f3a03154240 HTTP/1.1" 200 70 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/68/c17fe9af5d6f778ea1a957985cb69d64da9f4b HTTP/1.1" 200 70 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/17/017d3d45365d9c41fbc07db9e56c8c487b4028 HTTP/1.1" 200 71 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/ba/5138b969792a0770dc71a0bfdf70f3888cd4b1 HTTP/1.1" 200 53 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:22:30:32 -0400] "GET /uofc/candice-fmri-.git/objects/bd/2311658ea3d543522c24d1bb0ea6509476673d HTTP/1.1" 200 118 "-" "git/2.25.1"

Case Sensitivity?

This setup is also case-sensitive, because nginx matches paths case-sensitively while Gitea is case-insensitive. For example, you are supposed to be able to use https://data.praxisinstitute.org.dev.neuropoly.org/UofC/CANDICE-fMRI-.git, but this fails:

client:

$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/UofC/CANDICE-fMRI-.git
Cloning into 'CANDICE-fMRI-'...
Username for 'https://data.praxisinstitute.org.dev.neuropoly.org': 

server:

132.207.65.211 - - [07/May/2022:23:35:11 -0400] "GET /UofC/CANDICE-fMRI-.git/info/refs?service=git-upload-pack HTTP/1.1" 401 13 "-" "git/2.25.1"

This happens because the first location block fails to match (because of the case mismatch), so nginx falls back to the second one, which passes the request to Gitea. Gitea does recognize the request, but it sees a private repo and so returns 401 Unauthorized.

Another way this can fail is that you can clone the git part but not the git-annex part, because .git/config and .git/annex/* are served by nginx directly and nginx doesn't match the differently-cased path:

client:

p115628@joplin:~/datasets/test$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/UofC/CANDICE-fMRI-.git
Cloning into 'CANDICE-fMRI-'...
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 71 (delta 21), reused 0 (delta 0)
Unpacking objects: 100% (71/71), 9.70 KiB | 522.00 KiB/s, done.
p115628@joplin:~/datasets/test$ cd CANDICE-fMRI-/
p115628@joplin:~/datasets/test/CANDICE-fMRI-$ get annex get

Command 'get' not found, but there are 18 similar ones.

p115628@joplin:~/datasets/test/CANDICE-fMRI-$ git annex get
(merging origin/git-annex origin/synced/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz 
  Remote origin not usable by git-annex; setting annex-ignore
(not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
(recording state in git...)
git-annex: get: 3 failed

server:

132.207.65.211 - - [07/May/2022:23:42:48 -0400] "GET /UofC/CANDICE-fMRI-.git/info/refs?service=git-upload-pack HTTP/1.1" 200 567 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:42:48 -0400] "POST /UofC/CANDICE-fMRI-.git/git-upload-pack HTTP/1.1" 200 14887 "-" "git/2.25.1"
132.207.65.211 - - [07/May/2022:23:43:46 -0400] "GET /UofC/CANDICE-fMRI-.git/config HTTP/1.1" 404 11 "-" "git-annex/8.20200226"

@kousu
Member Author

kousu commented May 8, 2022

The git-annex-specific endpoints are just

132.207.65.211 - - [07/May/2022:23:46:38 -0400] "GET /uofc/candice-fmri-.git/config HTTP/1.1" 200 132 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:39 -0400] "GET /uofc/candice-fmri-.git/annex/objects/c34/98e/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz/SHA256E-s6472001--453e19abdba0eaa546a2dcb3ff72f71b1540fe34b5ce9aa0df841cf00c790dc2.nii.gz HTTP/1.1" 200 6472001 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:40 -0400] "GET /uofc/candice-fmri-.git/annex/objects/433/2b9/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz/SHA256E-s6677912--29be5e1eef0d3b5fcd6817999c9055f6c088d58b37c7c09bb87c440fb8037c81.nii.gz HTTP/1.1" 200 6677912 "-" "git-annex/8.20200226"
132.207.65.211 - - [07/May/2022:23:46:40 -0400] "GET /uofc/candice-fmri-.git/annex/objects/58d/8ea/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz/SHA256E-s5688327--c9c8ad310aa906d9aea87f251932f7adba3120fd39f6a621c27b531c6ecfcce9.nii.gz HTTP/1.1" 200 5688327 "-" "git-annex/8.20200226"

which I think I can summarize as .git/config and .git/annex/*. So the task is just to add these as routes to Gitea's built-in web server, while making sure they respect its permission system.

And the only reason .git/config needs to be exposed, I think, is so that git-annex can read annex.uuid from it (and use that both to decide whether it's a usable remote and to determine which files might be in it). So instead of exposing the entire file, maybe I just need to generate a fake config that contains only that.
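
To make the "fake config" idea concrete, here is a minimal Go sketch that renders a config containing nothing but annex.uuid. `minimalAnnexConfig` is a hypothetical helper, not existing Gitea code. It assumes the UUID has the usual 8-4-4-4-12 lowercase hex form seen in the logs above, and rejects anything else so that a stray newline cannot inject extra config lines. Whether git-annex is satisfied with a config this small (and needs no other keys) is an untested assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidRe matches the 8-4-4-4-12 lowercase hex form of a UUID.
var uuidRe = regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// minimalAnnexConfig is a hypothetical helper. It returns the text of a
// git config file that exposes only annex.uuid.
// ASSUMPTION: git-annex needs nothing else from the remote's config.
func minimalAnnexConfig(uuid string) (string, error) {
	if !uuidRe.MatchString(uuid) {
		return "", fmt.Errorf("not a UUID: %q", uuid)
	}
	return "[annex]\n\tuuid = " + uuid + "\n", nil
}

func main() {
	// UUID taken from the git-annex output earlier in this thread.
	cfg, err := minimalAnnexConfig("d57145f0-6115-47c4-8278-994da90074c1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(cfg)
}
```

The upside of generating the file rather than serving the real one is that nothing else in .git/config (remotes, hooks paths, credentials helpers) can leak to anonymous clients.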

@kousu
Member Author

kousu commented May 8, 2022

I removed the nginx hack and tried to trace gitea instead to get a sense of where to start editing; I set LOG.LEVEL = Trace in ~gitea/custom/conf/app.ini, and ran it.

Here's what I got:

client:
p115628@joplin:~/datasets/test$ git clone https://data.praxisinstitute.org.dev.neuropoly.org/uofc/candice-fmri-.git
Cloning into 'candice-fmri-'...
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 71 (delta 21), reused 0 (delta 0)
Unpacking objects: 100% (71/71), 9.70 KiB | 496.00 KiB/s, done.
p115628@joplin:~/datasets/test$ cd candice-fmri-/
p115628@joplin:~/datasets/test/candice-fmri-$ git annex get 
(merging origin/git-annex origin/synced/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
get sub-BAN01/anat/sub-BAN01_T1w.nii.gz 
  Remote origin not usable by git-annex; setting annex-ignore
(not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN02/anat/sub-BAN02_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
get sub-BAN03/anat/sub-BAN03_T1w.nii.gz (not available) 
  Try making some of these repositories available:
        d57145f0-6115-47c4-8278-994da90074c1 -- [email protected]:~/data/gitea-repositories/muhc/candice-fmri-.git

  (Note that these git remotes have annex-ignore set: origin)
failed
(recording state in git...)
git-annex: get: 3 failed
server:
gitea@data:~$ ./gitea 
[...]
2022/05/08 00:05:48 Started GET /uofc/candice-fmri-.git/info/refs?service=git-upload-pack for 127.0.0.1:47812
2022/05/08 00:05:48 ...dules/git/command.go:143:RunWithContext() [D] /srv/gitea/data/gitea-repositories/uofc/candice-fmri-.git: /usr/bin/git -c credential.helper= -c protocol.version=2 -c uploadpack.allowfilter=true -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= upload-pack --stateless-rpc --advertise-refs .
2022/05/08 00:05:48 ...dules/git/command.go:229:RunInDirTimeoutEnv() [T] Stdout:
         0101f9a91462ee31bd349c278767d3c607852bc6996e HEAD\000multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/master filter agent=git/2.25.1
        00422c085864431e37dda1123fc001cae68ec87ad0ee refs/heads/git-annex
        003ff9a91462ee31bd349c278767d3c607852bc6996e refs/heads/master
        0049f5a8660ae33e725df347ff34f8e5e6a2ad866f1d refs/heads/synced/git-annex
        0046f9a91462ee31bd349c278767d3c607852bc6996e refs/heads/synced/master
        0000
2022/05/08 00:05:48 Completed GET /uofc/candice-fmri-.git/info/refs?service=git-upload-pack 200 OK in 13.768695ms
2022/05/08 00:05:48 Started POST /uofc/candice-fmri-.git/git-upload-pack for 127.0.0.1:47814
2022/05/08 00:05:48 Completed POST /uofc/candice-fmri-.git/git-upload-pack 200 OK in 21.155688ms
2022/05/08 00:05:54 Started GET /uofc/candice-fmri-.git/config for 127.0.0.1:47816
2022/05/08 00:05:54 ...s/context/context.go:304:PlainTextBytes() [E] PlainTextBytes: Not found.

2022/05/08 00:05:54 Completed GET /uofc/candice-fmri-.git/config 404 Not Found in 468.005µs

So, actually I'm not sure that helped me find where to edit, but it at least gives me a baseline to work against so that, when I do start editing, I know what to look for.

@kousu
Member Author

kousu commented May 8, 2022

I poked around the code and my first guess is that what I'm looking for starts here:

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L36-L40

"smart HTTP protocol" sounds about right.


Let's see if I can find it another way. Starting from the top:

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/init.go#L175

calls

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/init.go#L39

calls

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/web.go#L100-L102

includes

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/web.go#L1208-L1220

and repo.ServiceUploadPack and repo.ServiceReceivePack and repo.GetInfoRefs and repo.GetInfoPacks and repo.GetLooseObject all call httpBase:

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L490-L491

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L498-L499

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L530-L531

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L577-L578

https://github.com/neuropoly/gitea/blob/b31f800f5c060cc8985b018714e4d995d7491ca3/routers/web/repo/http.go#L586-L587

So that's indeed the right connection.


But I think what I actually need to edit is those routes. Maybe if I just add

m.GetOptions("/config", repo.GetTextFile("config"))
m.GetOptions("/annex/objects/{object:[^.]*}", repo.GetAnnexObject)

and add

func GetAnnexObject(ctx *context.Context) {
	h := httpBase(ctx)
	if h != nil {
		h.setHeaderCacheForever()
		h.sendFile("application/octet-stream", "annex/objects/"+ctx.Params("object")) // XXX fix the directory traversal here
	}
}

then maybe it will just work?
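
For the directory traversal flagged in the XXX comment above, one possible approach is to clean the joined path and then check that it still sits under annex/objects/. The following is a minimal, self-contained Go sketch of that check. `safeAnnexPath` is a hypothetical helper, not a Gitea function, and it is deliberately independent of httpBase and sendFile, so how it would be wired into the handler is left open.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// safeAnnexPath is a hypothetical helper (not part of Gitea). It maps the
// "object" route parameter to a path relative to the repository directory
// and reports false if the result would fall outside annex/objects/.
func safeAnnexPath(object string) (string, bool) {
	const root = "annex/objects"
	// Reject backslashes and NUL bytes outright; neither appears in the
	// object paths seen in the access logs.
	if strings.ContainsAny(object, "\\\x00") {
		return "", false
	}
	// path.Join cleans its result, resolving any "." and ".." elements.
	p := path.Join(root, object)
	// After cleaning, the path must be strictly inside root.
	if !strings.HasPrefix(p, root+"/") {
		return "", false
	}
	return p, true
}

func main() {
	for _, obj := range []string{
		"c34/98e/KEY/KEY",
		"../config",
		"../../../../etc/passwd",
		"c34/../../../objects/pack",
	} {
		p, ok := safeAnnexPath(obj)
		fmt.Printf("%q ok=%v path=%q\n", obj, ok, p)
	}
}
```

The handler would then call sendFile only when the second return value is true, and answer 404 otherwise.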

@kousu
Member Author

kousu commented May 8, 2022

My guesses were pretty close! See the final version at #6

kousu added a commit that referenced this issue Sep 20, 2022
I don't think git-annex supports uploading over http,
so I didn't try to. Strictly download only.

To support private repos, I had to hunt down and patch
a secret extra corner of security that Gitea only applies
to HTTP Basic Auth for some reason (in services/auth/basic.go)

Ref: https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3
kousu added a commit that referenced this issue Oct 19, 2023
This makes HTTP symmetric with SSH clone URLs.

This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.

Previously, to access "open access" data shared this way,
users would need to:

  1. Create an account on gitea.example.com
  2. Create ssh keys
  3. Upload ssh keys (and make sure to find and upload the correct file)
  4. `git clone [email protected]:user/dataset.git`
  5. `cd dataset`
  6. `git annex get`

This cuts that down to just the last three steps:

  1. `git clone https://gitea.example.com/user/dataset.git`
  2. `cd dataset`
  3. `git annex get`

This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.

Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See #7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.

This is not a major loss since no one wants uploading to be anonymous anyway.

To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).

This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/

Fixes #3

Co-authored-by: Mathieu Guay-Paquet <[email protected]>
gitea-sync bot pushed a commit that referenced this issue Oct 30, 2023
gitea-sync bot pushed a commit that referenced this issue Oct 31, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 1, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 2, 2023
kousu added a commit that referenced this issue Nov 4, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 5, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 6, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 7, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 8, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 9, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 10, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 11, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 12, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 13, 2023
gitea-sync bot pushed a commit that referenced this issue Nov 14, 2023
kousu added a commit that referenced this issue Nov 20, 2023
kousu added a commit that referenced this issue Nov 29, 2023