Checksum collision with completely different photos #10546

Closed
1 of 3 tasks
miahi opened this issue Jun 22, 2024 · 9 comments


miahi commented Jun 22, 2024

The bug

Some photos are missing when viewing/syncing with the Android app. Looking through the logs, I found this entry (one of 200+ duplicate warnings). The full log entry is under 'Relevant log output' below.

2024-06-22 16:41:25.329649 | INFO     | SyncService          | Ignoring duplicate assets on device:
{
  "id": "N/A",
  "remoteId": "N/A",
  "localId": "1000034462",
  "checksum": "GNrECa7lbm6smQHTdW5Rbf99NNU=",
[...]
  "fileName": "20230919_211125.jpg",
[...]
}
{
  "id": "N/A",
  "remoteId": "N/A",
  "localId": "1000053533",
  "checksum": "GNrECa7lbm6smQHTdW5Rbf99NNU=",
[...]
  "fileName": "20240622_075114.jpg",
[...]
} 

I checked the photos: both are in the same directory on the phone, but they are completely different, taken at different times on different phones, with different file sizes. I extracted them from the phone and attached them in a zip to make sure they are not changed in any way (I'm not sure how the checksum is calculated).
immich_collision.zip

The newer photo (20240622_075114.jpg) does not show in the Immich app and is not synced to the server. The older one (20230919_211125.jpg) is backed up to the server and shows in the app.

The OS that Immich Server is running on

Debian

Version of Immich Server

v1.106.4

Version of Immich Mobile App

v1.106.3 build.143

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

version: "3.8"

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.yml
    #   service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:c5a607fb6e1bb15d32bbcf14db22787d19e428d59e31a5da67511b49bb0f1ccc
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/opt/immich-app/library

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=sure

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. Use the two photos in the zip to check if both are seen by the app and uploaded to server

Relevant log output

2024-06-22 16:41:25.329649 | INFO     | SyncService          | Ignoring duplicate assets on device:
{
  "id": "N/A",
  "remoteId": "N/A",
  "localId": "1000034462",
  "checksum": "GNrECa7lbm6smQHTdW5Rbf99NNU=",
  "ownerId": -1389388339680424900,
  "livePhotoVideoId": "N/A",
  "stackCount": "0",
  "stackParentId": "N/A",
  "fileCreatedAt": "2023-09-19 21:11:24.000",
  "fileModifiedAt": "2023-09-19 21:11:25.000",
  "updatedAt": "2023-09-19 21:11:25.000",
  "durationInSeconds": 0,
  "type": "AssetType.image",
  "fileName": "20230919_211125.jpg",
  "isFavorite": false,
  "isRemote": false,
  "storage": "AssetState.local",
  "width": 4000,
  "height": 3000,
  "isArchived": false,
  "isTrashed": false,
  "isOffline": false,
}
{
  "id": "N/A",
  "remoteId": "N/A",
  "localId": "1000053533",
  "checksum": "GNrECa7lbm6smQHTdW5Rbf99NNU=",
  "ownerId": -1389388339680424900,
  "livePhotoVideoId": "N/A",
  "stackCount": "0",
  "stackParentId": "N/A",
  "fileCreatedAt": "2024-06-22 07:51:14.000",
  "fileModifiedAt": "2024-06-22 07:51:16.000",
  "updatedAt": "2024-06-22 07:51:16.000",
  "durationInSeconds": 0,
  "type": "AssetType.image",
  "fileName": "20240622_075114.jpg",
  "isFavorite": false,
  "isRemote": false,
  "storage": "AssetState.local",
  "width": 4000,
  "height": 3000,
  "isArchived": false,
  "isTrashed": false,
  "isOffline": false,
}
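
With 200+ such warnings, it can help to list every checksum that the log attributes to more than one file name. The following is a rough sketch in Python, assuming the log contains flat `{ ... }` asset dumps like the ones above with quoted `checksum` and `fileName` fields; the function name `files_by_checksum` is made up for illustration and is not part of Immich.

```python
import re
from collections import defaultdict

# Matches the two quoted fields of interest inside one asset dump.
FIELD = re.compile(r'"(checksum|fileName)":\s*"([^"]*)"')


def files_by_checksum(log_text):
    """Return {checksum: [file names]} for checksums shared by 2+ file names."""
    groups = defaultdict(set)
    # The asset dumps are flat (no nested braces), so a non-greedy match
    # from "{" to the next "}" is enough to isolate each one.
    for block in re.findall(r"\{(.*?)\}", log_text, flags=re.DOTALL):
        fields = dict(FIELD.findall(block))
        if "checksum" in fields and "fileName" in fields:
            groups[fields["checksum"]].add(fields["fileName"])
    return {c: sorted(names) for c, names in groups.items() if len(names) > 1}
```

Calling `files_by_checksum(open("immich.log").read())` (the file name is a placeholder) would return one entry per suspicious checksum.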

Additional information

No response

Member

bo0tzz commented Jun 22, 2024

@fyfrey I'm having a look at the app's hashing code and there's a fair bit of array index juggling. Nothing stands out to me immediately, but how likely do you think it is that there's some edge case where indexes get misaligned or such?

@miahi can you pull the complete logs from the app? I'm particularly interested to see what got logged at the time it calculated all the hashes.

Author

miahi commented Jun 22, 2024

The logs might be a bit "dirty": I tried a lot of things today while trying to understand what is happening with #6196 (two of the photos missing there show up as duplicates in the logs).

immich_android_logs.zip

I tried uploading the two files on a different account directly to the web app and both were accepted. I can do other tests if needed.

Contributor

fyfrey commented Jun 22, 2024

I'll need to check. An edge-case bug is certainly possible. We can also just check whether these files happen to produce a hash collision for sha1 (unlikely, but we should rule this out first).

Author

miahi commented Jun 22, 2024

@bo0tzz I checked the code because I thought it might be a custom hash implementation, but it's just SHA1, so fat chance of a random collision there. The SHA1 hash of the old file is correct (GNrECa7lbm6smQHTdW5Rbf99NNU= in base64), but the second one should be JpWUN7FvqDoak41elMrIc7wsH64=; somehow the app has assigned it the first file's hash.
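
The same check can be repeated outside the app. This is a minimal sketch in Python, assuming the checksum in the log is the Base64 encoding of the raw 20-byte SHA-1 digest of the file contents; the helper name `sha1_base64` is only illustrative.

```python
import base64
import hashlib
from pathlib import Path


def sha1_base64(path, chunk_size=1 << 20):
    """Return the Base64-encoded SHA-1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large photos are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


if __name__ == "__main__":
    # File names taken from the log; run this next to the extracted photos.
    for name in ("20230919_211125.jpg", "20240622_075114.jpg"):
        if Path(name).exists():
            print(name, sha1_base64(name))
```

If the two printed values differ, the files do not collide and the duplicate checksum must come from somewhere else in the app.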

What I also did today (a lot) was to remove and re-add libraries into the app to try to find a pattern for #6196 - that might also be related somehow to indexing, as it seems that the app is not showing photos from one library when it is showing photos from another. I have 3 libraries on the phone with > 6000 photos each. Sometimes I added multiple libraries at the same time. But #6196 seems to be deterministic, the same photos were missing from the app every time I added the libraries. So it might be that the same bug that assigns the wrong hash to some of the photos is causing the missing photos too (maybe overwriting other data?).

Member

bo0tzz commented Jun 22, 2024

I checked as well, it's definitely not a hash collision.

Contributor

fyfrey commented Jun 22, 2024

I can think of one scenario causing issues:
Did you restore your entire phone, your photos, or the immich app from a Google cloud backup etc.?

Author

miahi commented Jun 22, 2024

You are right. I changed my phone a while back and used a migration app to move the data (via cable connection). It moved the photos and I think also the immich local storage. That was around 1st of March 2024. I remember that I had to do a login and resync at that point, but I don't think the local DB was cleaned.

I cleared the app data and I'm doing a new sync now. There were ~500 images in that Camera album that were not synced before and they were synced after the cleanup. Not all of them were reported as duplicates, but all of them were shot on the new phone.

Contributor

fyfrey commented Jun 23, 2024

Copying over the Immich data was the issue: we need to keep a mapping between each asset ID and its file hash (otherwise we'd need to recalculate all hashes on every app start). This mapping is wrong after moving the data to a new phone or restoring a backup. We have an open issue somewhere to detect this case and ask the user to delete the app data.
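
To illustrate why a copied mapping goes wrong, here is a minimal sketch in Python rather than the app's own code. The names (`HashCache`, `hash_for`), the in-memory dict, and the idea of storing the modification time next to the hash are all assumptions made for illustration, not a description of how Immich stores or validates its cache. It also assumes that local asset IDs are assigned by the device, so the same ID can point at a different file after a migration.

```python
import hashlib


class HashCache:
    """Hypothetical cache mapping a device-local asset ID to a file hash."""

    def __init__(self):
        self._entries = {}  # local_id -> (modified_at, sha1_hex)

    def hash_for(self, local_id, modified_at, read_bytes, validate=True):
        entry = self._entries.get(local_id)
        if entry is not None and (not validate or entry[0] == modified_at):
            return entry[1]  # cache hit: the file is not read again
        digest = hashlib.sha1(read_bytes()).hexdigest()
        self._entries[local_id] = (modified_at, digest)
        return digest


# Old phone: ID "42" belongs to photo A.
cache = HashCache()
hash_a = cache.hash_for("42", 1000, lambda: b"photo A")

# New phone, cache copied over: ID "42" now refers to photo B.
stale = cache.hash_for("42", 2000, lambda: b"photo B", validate=False)
fresh = cache.hash_for("42", 2000, lambda: b"photo B", validate=True)

print(stale == hash_a)  # True: photo B inherits photo A's hash
print(fresh == hash_a)  # False: the changed modification time forces a rehash
```

Without any validation, the stale entry makes two different photos look like duplicates, which matches the log in this issue. Clearing the app data discards the stale entries wholesale.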

Author

miahi commented Jun 23, 2024

I think this is one of them: #4939

bo0tzz closed this as completed Aug 7, 2024