immich/server/src/infra/sql
Sushain Cherivirala 7fc1954e2a
fix(server): add filename search (#6394)
Fixes https://github.com/immich-app/immich/issues/5982.

There are basically three options:

1. Search `originalFileName` by dropping a file extension from the query
(if present). Lower fidelity but very easy - just a standard index &
equality.
2. Search `originalPath` by adding an index on `reverse(originalPath)`
and using `starts_with(reverse(originalPath), reverse(query) || '/')`. A
weird index & query but high fidelity.
3. Add a new generated column called `originalFileNameWithExtension` or
something. More storage, kinda jank.
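
To make option (2) concrete, here is a minimal in-memory TypeScript sketch of the reversed-prefix check. The helper names `reverseString` and `pathEndsWithFileName` are hypothetical, and the real thing would presumably run inside Postgres against an index on `reverse(originalPath)` rather than in application code.

```typescript
// Hypothetical helper: reverse a string by code point.
function reverseString(value: string): string {
  return Array.from(value).reverse().join('');
}

// Hypothetical helper: true when `originalPath` ends with "/" + query,
// expressed as a prefix test on the reversed strings (the shape an index
// on reverse(originalPath) could serve).
function pathEndsWithFileName(originalPath: string, query: string): boolean {
  return reverseString(originalPath).startsWith(reverseString(query) + '/');
}

const examplePath = 'upload/library/sushain/2015/2015-08-09/IMG_275.JPG';
console.log(pathEndsWithFileName(examplePath, 'IMG_275.JPG')); // true
console.log(pathEndsWithFileName(examplePath, 'MG_275.JPG')); // false
```

The trailing `/` is what keeps a partial file name such as `MG_275.JPG` from matching. It also means a path with no directory component would never match in this sketch.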

TBH, I think (1) is good enough and easy to make better in the future.
For example, if I search "DSC_4242.jpg", I don't really think it matters
if "DSC_4242.mov" also shows up.
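
As a rough illustration of (1), the sketch below strips a trailing extension from the query and then does a plain equality check. It assumes `originalFileName` is stored without its extension, as option (1) implies. `stripExtension` and `findByFileName` are hypothetical names, and the in-memory filter only stands in for the indexed equality lookup.

```typescript
// Hypothetical helper: drop a trailing file extension from a query, if present.
function stripExtension(query: string): string {
  const dot = query.lastIndexOf('.');
  // Leave names with no dot, dotfiles (".hidden"), and trailing dots untouched.
  if (dot <= 0 || dot === query.length - 1) return query;
  // Ignore dots that belong to a directory component rather than the file name.
  if (query.indexOf('/', dot) !== -1) return query;
  return query.slice(0, dot);
}

// Hypothetical stand-in for an equality lookup on originalFileName.
function findByFileName(originalFileNames: string[], query: string): string[] {
  const target = stripExtension(query);
  return originalFileNames.filter((name) => name === target);
}

console.log(stripExtension('DSC_4242.jpg')); // "DSC_4242"
console.log(findByFileName(['DSC_4242', 'DSC_4243'], 'DSC_4242.mov')); // ["DSC_4242"]
```

Because the extension is discarded before comparison, `DSC_4242.jpg` and `DSC_4242.mov` resolve to the same lookup, which is the lower-fidelity trade-off described above.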

edit: There's a fourth approach that we discussed a bit in Discord and
decided we could switch to in the future: using a GIN index. The minor
issue is that Postgres doesn't tokenize paths in a useful way (they're a
single token and it won't match against partial components). We can
solve that by tokenizing it ourselves. For example:

```
immich=# with vecs as (select to_tsvector('simple', array_to_string(string_to_array('upload/library/sushain/2015/2015-08-09/IMG_275.JPG', '/'), ' ')) as vec)  select * from vecs where vec @@ phraseto_tsquery('simple', array_to_string(string_to_array('library/sushain', '/'), ' '));
                                      vec
-------------------------------------------------------------------------------
 '-08':6 '-09':7 '2015':4,5 'img_275.jpg':8 'library':2 'sushain':3 'upload':1
(1 row)
```

The query is also tokenized with the 'split-by-slash-join-with-space'
strategy. This strategy results in `IMG_275.JPG`, `2015`, `sushain` and
`library/sushain` matching. However, `08` and `IMG_275` do not match:
the former because the token is `-08`, and the latter because the
`img_275.jpg` token is matched exactly.
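
The split-by-slash-join-with-space step and the resulting match behaviour can be approximated in TypeScript. This is only a sketch: `pathToSearchText` and `matchesComponents` are hypothetical names, and the matcher compares whole path components rather than reproducing the Postgres parser (which, as the output above shows, splits `2015-08-09` further into `2015`, `-08` and `-09`).

```typescript
// Hypothetical helper: split a path on "/" and join the parts with spaces,
// mirroring array_to_string(string_to_array(path, '/'), ' ').
function pathToSearchText(path: string): string {
  return path
    .split('/')
    .filter((part) => part.length > 0)
    .join(' ');
}

// Rough stand-in for a phrase match: the query's components must appear
// as a contiguous, case-insensitive run of the path's components.
function matchesComponents(path: string, query: string): boolean {
  const haystack = pathToSearchText(path).toLowerCase().split(' ');
  const needle = pathToSearchText(query).toLowerCase().split(' ');
  if (needle[0] === '') return false; // empty query matches nothing
  for (let start = 0; start + needle.length <= haystack.length; start++) {
    if (needle.every((token, offset) => haystack[start + offset] === token)) {
      return true;
    }
  }
  return false;
}

const samplePath = 'upload/library/sushain/2015/2015-08-09/IMG_275.JPG';
console.log(matchesComponents(samplePath, 'library/sushain')); // true
console.log(matchesComponents(samplePath, 'IMG_275.JPG')); // true
console.log(matchesComponents(samplePath, 'IMG_275')); // false
console.log(matchesComponents(samplePath, '08')); // false
```

In this approximation `08` fails because the whole component is `2015-08-09`, whereas in Postgres it fails because the token is `-08`. The observable result for the examples above is the same.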
2024-01-15 14:40:28 -06:00
access.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
album.repository.sql feat(server, web): quotas (#4471) 2024-01-12 19:43:36 -05:00
api.key.repository.sql feat(server, web): quotas (#4471) 2024-01-12 19:43:36 -05:00
asset.repository.sql fix(server): add filename search (#6394) 2024-01-15 14:40:28 -06:00
audit.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
library.repository.sql feat(server, web): quotas (#4471) 2024-01-12 19:43:36 -05:00
move.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
partner.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
person.repository.sql fix: remove archived people from explore (#6091) 2024-01-01 11:07:42 -05:00
shared.link.repository.sql feat(server, web): quotas (#4471) 2024-01-12 19:43:36 -05:00
smart.info.repository.sql feat(server): search across own+partner assets (#5966) 2024-01-01 17:25:22 -05:00
system.config.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
system.metadata.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
tag.repository.sql chore(server): sql versioning (#5346) 2023-11-30 10:10:30 -05:00
user.repository.sql chore(web): quota enhancement (#6371) 2024-01-15 09:04:29 -06:00
user.token.repository.sql feat(server, web): quotas (#4471) 2024-01-12 19:43:36 -05:00