Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.23.0-RC6 segfault during upload of Google Photos #613

Closed
nickspacek opened this issue Jan 8, 2025 · 17 comments
Closed

0.23.0-RC6 segfault during upload of Google Photos #613

nickspacek opened this issue Jan 8, 2025 · 17 comments
Assignees

Comments

@nickspacek
Copy link

Getting this during upload of extracted Takeout from Google Photos:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0x8cc92f]
goroutine 156 [running]:
github.com/simulot/immich-go/adapters/googlePhotos.(*Takeout).handleDir.func1()
github.com/simulot/immich-go/adapters/googlePhotos/googlephotos.go:471 +0x48f
created by github.com/simulot/immich-go/adapters/googlePhotos.(*Takeout).handleDir in goroutine
github.com/simulot/immich-go/adapters/googlePhotos/googlephotos.go:425 +0x48d

Seems to be happening consistently early in my upload. I'm not seeing any specific error in the immich logs that might indicate what's going on, but if there's anything on that end that might help let me know.

@simulot
Copy link
Owner

simulot commented Jan 8, 2025

which version of immich-go?
What was the command line ?
Any logs?

@nickspacek
Copy link
Author

Version was 0.23.0-RC6:

2025-01-08 11:36:46 INF immich-go version:dev,  commit:550dac7fe0c192444a49ab7cc2fa5919378a7375, date:2025-01-05T07:55:16Z
2025-01-08 11:36:46 INF Command: from-google-photos [flags] <takeout-*.zip> | <takeout-folder>
2025-01-08 11:36:46 INF Flags:
2025-01-08 11:36:46 INF  --api-key=***************************************vlUg
2025-01-08 11:36:46 INF  --api-trace=false
2025-01-08 11:36:46 INF  --ban-file='@eaDir/', '@__thumb/', 'SYNOFILE_THUMB_*.*', 'Lightroom Catalog/', 'thumbnails/', '.DS_Store/', '._*.*', '.photostructure/'
2025-01-08 11:36:46 INF  --client-timeout=5m0s
2025-01-08 11:36:46 INF  --date-range=unset
2025-01-08 11:36:46 INF  --device-uuid=spacek-nas
2025-01-08 11:36:46 INF  --dry-run=false
2025-01-08 11:36:46 INF  --exclude-extensions=
2025-01-08 11:36:46 INF  --from-album-name=Family & friends
2025-01-08 11:36:46 INF  --help=false
2025-01-08 11:36:46 INF  --include-archived=true
2025-01-08 11:36:46 INF  --include-extensions=
2025-01-08 11:36:46 INF  --include-partner=true
2025-01-08 11:36:46 INF  --include-trashed=false
2025-01-08 11:36:46 INF  --include-unmatched=true
2025-01-08 11:36:46 INF  --include-untitled-albums=false
2025-01-08 11:36:46 INF  --log-file=/home/nick/.cache/immich-go/immich-go_2025-01-08_11-36-46.log
2025-01-08 11:36:46 INF  --log-level=INFO
2025-01-08 11:36:46 INF  --log-type=text
2025-01-08 11:36:46 INF  --manage-burst=
2025-01-08 11:36:46 INF  --manage-epson-fastfoto=false
2025-01-08 11:36:46 INF  --manage-heic-jpeg=
2025-01-08 11:36:46 INF  --manage-raw-jpeg=
2025-01-08 11:36:46 INF  --no-ui=false
2025-01-08 11:36:46 INF  --partner-shared-album=
2025-01-08 11:36:46 INF  --server=http://10.10.10.7:2283
2025-01-08 11:36:46 INF  --session-tag=false
2025-01-08 11:36:46 INF  --skip-verify-ssl=false
2025-01-08 11:36:46 INF  --sync-albums=true
2025-01-08 11:36:46 INF  --tag=[]
2025-01-08 11:36:46 INF  --takeout-tag=true
2025-01-08 11:36:46 INF  --time-zone=
2025-01-08 11:36:46 INF Arguments:
2025-01-08 11:36:46 INF   "."

I re-ran against the Family & friends album and it fails shortly into the upload phase. Logs don't seem to have any errors. For some reason there are two metadata.json files:

2025-01-08 11:37:30 INF scanned sidecar file file=Google Photos:Family & friends/metadata(1).json type=album metadata title=Family & friends
2025-01-08 11:37:30 INF scanned sidecar file file=Google Photos:Family & friends/metadata.json type=album metadata title=Family & friends

I was able to complete successfully when targeting a different specific album that had two metadata files.

@simulot simulot self-assigned this Jan 8, 2025
@simulot
Copy link
Owner

simulot commented Jan 8, 2025

i can't reproduce your error.
maybe change the "." to the real directory.

@nickspacek
Copy link
Author

Thanks; I will give that a try once the rest of the import completes and report back.

@nickspacek
Copy link
Author

I moved my "Family & friends" folder into a separate parent folder and targeted that parent folder directly:

immich-go -s http://10.10.10.7:2283 -k=$KEY upload from-google-photos -u tmp

It still fails when it completes metadata begins to upload. The final messages in the immich-go logs are:

2025-01-08 16:07:21 WRN missing associated metadata file file=tmp:Family & friends/13039_327707840323_537645323_9470255_27214_n(1).jpg
2025-01-08 16:07:21 WRN missing associated metadata file file=tmp:Family & friends/3AE83D9D-EAD0-4CDF-B5E4-FFB2109F0936_1_102_o(1).jpeg

I'm going to try to prune the contents to find the minimum reproducible case.

@nickspacek
Copy link
Author

The minimum case appears to be a single image in the "Family & friends" folder.

@nickspacek
Copy link
Author

I re-built on an x86 Mac and see the same behavior with a local Family & friends folder with a single image in it.

@simulot
Copy link
Owner

simulot commented Jan 9, 2025

Thanks for your help.

The problem is triggered with one folder named "Family & friends" containing 2 metadata.json files and one photo.
Is this correct?

Have you tried to change the photo by another?

@nickspacek
Copy link
Author

I can trigger it with an album called Family & friends with a single, normal-lookng metadata.json and a single image. I've tried replacing the image; any image triggers the behavior.

@nickspacek
Copy link
Author

nickspacek commented Jan 9, 2025

Digging in and debugging. It seems that a.FromApplication is being accessed when it is nil here: https://github.com/simulot/immich-go/blob/v0.23.0-RC6/adapters/googlePhotos/googlephotos.go#L471

However, I guess this is a little contrived when taking a folder that was part of a Takeout, removing the contents, and manually adding an image. I can see now that the expectation is that Google Takeout should always produce JSON metadata alongside the images? I'll have a look at the Takeout folders to see if there is an actual case of missing metadata.

@nickspacek
Copy link
Author

Think I sorted it out into two issues:

  1. If metadata is missing that exception occurs. If (since?) metadata is required for Takeout - Google Photos import, that should be enforced earlier in the flow.
  2. The logic for matching metadata should be expanded to cover some more edge cases

In my case, the warnings in the logs were showing the failure cases:

3AE83D9D-EAD0-4CDF-B5E4-FFB2109F0936_1_102_o(1).jpeg
3AE83D9D-EAD0-4CDF-B5E4-FFB2109F0936_1_102_o.j(1).json
13039_327707840323_537645323_9470255_27214_n(1).jpg
13039_327707840323_537645323_9470255_27214_n.j(1).json

I wonder if the strange formatting is due to a maximum file length?

@tetebueno
Copy link

Hi, bump here. As a workaround I executed the script excluding all Photos from... directories in order to cover all albums only:

$ ./immich-go upload from-google-photos --sync-albums --api-key XXX --server http://localhost:2283 --ban-file 'Photos from*' ${TAKEOUT_LOCATION}

Then took care of the rest using a glob for the path:

$ ./immich-go upload from-google-photos --sync-albums --api-key XXX --server http://localhost:2283 ${TAKEOUT_LOCATION}/Photos\ from*

I believe this is also related to the number of files being processed since the conflicting files from the original upload with all Takeout files didn't throw any errors doing the upload with this slice-strategy.

Just chimed in in case anyone was struggling with uploading everything even with the bug present.

Cheers.

P.S.: awesome project. Thanks a lot.

@simulot
Copy link
Owner

simulot commented Jan 9, 2025

Thank you for your deep analysis

13039_327707840323_537645323_9470255_27214_n.j(1).json
I wonder if the strange formatting is due to a maximum file length?

yes, 46 chars... but UTF16 chars when the zip is encoded with UTF-8. I m documenting google takeout issues here: https://github.com/simulot/immich-go/blob/main/docs/google-takeout.md

Digging in and debugging. It seems that a.FromApplication is being accessed when it is nil here: https://github.com/simulot/immich-go/blob/v0.23.0-RC6/adapters/googlePhotos/googlephotos.go#L471

I have also identified this line as a potential problem. I should understand when the FromApplication is nil, and at least not reference it when nil.

Takeout folders:
The folders Photos from YYYY contain photos for the year YYYY
The other folders are albums. They should contain a metadata.json file that describe the album. 2 metadata files is unexpected. But, it doesn't hurt.

$ ./immich-go upload from-google-photos --sync-albums --api-key XXX --server http://localhost:2283 --ban-file 'Photos from*' ${TAKEOUT_LOCATION}

This is a guru technique! It should work, but images files may be absent from the album folder. Immich-go searches them in the folder, then in the Photos from YYYY folder.

I believe this is also related to the number of files being processed since the conflicting files from the original upload with all Takeout files didn't throw any errors doing the upload with this slice-strategy.

There too many moving parts.... The content of the immich server and the flow of photos...

I'm glad you overcome the problem. Anyway, I'd like to fix the initial bug, or at least prevent the segfault.

P.S.: awesome project. Thanks a lot.

You're welcome. I have started it for fun 1.5 years ago for my personal takeout, and I'm still working on the project. I'm learning by doing.

@simulot
Copy link
Owner

simulot commented Jan 10, 2025

Hi @nickspacek

Think I sorted it out into two issues:

1. If metadata is missing that exception occurs. If (since?) metadata is required for Takeout - Google Photos import, that should be enforced earlier in the flow.

Could you share with me the minimal set of files that triggers the segfault?
I did some test, but none was giving this problem.

2. The logic for matching metadata should be expanded to cover some more edge cases

In my case, the warnings in the logs were showing the failure cases:

3AE83D9D-EAD0-4CDF-B5E4-FFB2109F0936_1_102_o(1).jpeg
3AE83D9D-EAD0-4CDF-B5E4-FFB2109F0936_1_102_o.j(1).json
13039_327707840323_537645323_9470255_27214_n(1).jpg
13039_327707840323_537645323_9470255_27214_n.j(1).json

I wonder if the strange formatting is due to a maximum file length?

Those cases should be handled properly. Could you share content of the 2 JSON files?

@nickspacek
Copy link
Author

nickspacek commented Jan 10, 2025

I will attach shortly; however, when I added those cases to the matchers test there was no matcher found for them out of the box.

Edit: tmp2.tgz reproduces with 0.23.0-RC6 with params ./immich-go -s $SERVER -k=$KEY upload from-google-photos -u tmp2 --from-album-name 'Family & friends'

Hope that helps @simulot

@simulot
Copy link
Owner

simulot commented Jan 10, 2025

@nickspacek : This helps!

  1. One image isn't associated to its JSON despite the obvious matching,
  2. _xxxx.JSON aren't discarded,
  3. You are using the -u option that force the integration of json less files,
  4. The SEGFAULT is triggered when the metadata associated to an image isn't loaded

Stay tuned

@nickspacek
Copy link
Author

I missed the -u flag, I'm not sure why I would have specified it! I see it in the debug logs of my second message in the thread though so it was definitely there in the beginning. Thanks for looking into this.

simulot added a commit that referenced this issue Jan 11, 2025
simulot added a commit that referenced this issue Jan 11, 2025
* Fix filterOnMetadata to handle empty directory case

* Add a test case to track #613

* split e2e tests

* Add end-to-end test for uploading from Google Photos to address issue #613

* Fix nil pointer dereference by checking FromApplication before assigning Albums.

* Refactor GP match functions for improved clarity and performance; add getFileIndex utility

* Refactor matcher tests to improve clarity and update expected results for various cases

* Refactor matchers in Google Photos adapter for improved clarity and organization

* Update end-to-end tests for Google Photos upload to support wildcard zip files and add album name parameter

* Merge branch 'next' into simulot/issue613
@simulot simulot closed this as completed Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants