AI Prompt & metatags #25

Til555 · 2025-03-23T15:45:58Z

Hi Jabberjabberjabber :)

This ImageIndexer of yours is an awesome tool! I do have a few questions:

To caption lots of images of a cat called 'Felini', I tried using the prompt (in the ImageIndexer settings) to instruct the AI to mention Felini the cat in the description / keywords, which didn't work. This is maybe a bit along the lines of @shoverians request of being able to add a specific keyword to a batch of images?
When preparing images for the web, one of the most important tags is the Alt tag ("-AltTextAccessibility" in exiftool) because it helps people who cannot see to identify the image content. The descriptions/captions coming from ImageIndexer would just be perfect for this purpose. So I guess this is a bit like the suggestion from @eternalliving where those exif fields could be redirected...

I know these tags are super messy and it's probably very tough to direct... I dunno, just in case it helps: at the moment I have ChatGPT look at an image and instruct it to spit out a code snippet like the one below (giving it an example of all the tags to be used), which I copy and paste into the terminal to inject the tags with exiftool.

exiftool -Creator="Til Vogt" -Copyright="Til & Felini" -Source="https://felini.rocks" -CreatorWorkEmail="meow@felini.rocks" -PersonInImage="Felini the Kitty" -City="Agra" -Country="India" -Subject="Felini Cat Relaxing at the Taj Mahal" -Title="Felini Cat Lounges in Front of the Taj Mahal" -Headline="Felini the Cat Enjoys a Relaxing Moment Near the Taj Mahal" -Description="Felini, a playful black-and-white cat, stretches out comfortably near the Taj Mahal, basking in the golden sunlight of Agra, India. His relaxed pose perfectly captures the essence of a peaceful journey." -AltTextAccessibility="Felini, a black-and-white cat, lounges near the Taj Mahal, enjoying the warm light of Agra, India." -keywords+="Felini cat" -keywords+="Taj Mahal" -keywords+="Agra India" -keywords+="travel photography" -keywords+="relaxing cat" -IntellectualGenre="Travel Photography" Felini-cat-world-trip_India_Agra_Taj-Mahal_B_01.jpg

To cut a few corners and be able to process a batch of images, I was kind of hoping that, given this example, Qwen2 might be able to spit out a similar code snippet that ImageIndexer could forward to exiftool... Sorry for these naive questions - my coding knowledge is super basic.
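For illustration only, here is a minimal Python sketch of how a generated caption and keyword list could be turned into an exiftool call like the one above. It is not how ImageIndexer works internally; the helper name `build_exiftool_args` is hypothetical, and the tag names are simply the ones from the example command. The sketch only builds and prints the command, it does not run exiftool.

```python
import shlex


def build_exiftool_args(image_path, tags, keywords=()):
    """Build an exiftool argument list (hypothetical helper; nothing is executed)."""
    args = ["exiftool"]
    # One -TAG=value argument per metadata field.
    for name, value in tags.items():
        args.append(f"-{name}={value}")
    # -keywords+=value appends to the keyword list instead of replacing it.
    for keyword in keywords:
        args.append(f"-keywords+={keyword}")
    args.append(image_path)
    return args


caption = (
    "Felini, a black-and-white cat, lounges near the Taj Mahal, "
    "enjoying the warm light of Agra, India."
)
args = build_exiftool_args(
    "Felini-cat-world-trip_India_Agra_Taj-Mahal_B_01.jpg",
    {
        "PersonInImage": "Felini the Kitty",
        "Description": caption,
        "AltTextAccessibility": caption,  # reuse the caption as alt text
    },
    keywords=["Felini cat", "Taj Mahal"],
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(args))
```

To actually write the tags, the list could be passed to `subprocess.run(args, check=True)`, assuming exiftool is installed and on the PATH. Passing a list rather than a single string avoids shell quoting problems with captions that contain quotes or ampersands.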

Anyhow, thanks a lot for your work and congrats on the great tool already!

Cheers,
Til

jabberjabberjabber · 2025-03-24T05:29:50Z

Maintainer

Qwen 2 2B isn't the best image-capable model, but it is fast and small. I would try using Gemma-3 4B or even 12B to get some more intelligent output.

With that said, this is a script that was written to do a specific thing -- keyword and caption batches of images using a local LLM. Anything else is outside of those requirements.

I'm glad you like the script and find it useful!

1 reply

Til555 Mar 24, 2025
Author

Thanks a lot for your quick reply. Ok cool, will check out Gemma too!
No worries, still love ImageIndexer! 😊👍

ihsanfrr · 2025-03-24T06:03:02Z

Oh I see. Thank you for creating a metadata generator program; it's very useful for me.


0 replies

ihsanfrr · 2025-03-24T10:01:38Z

But the Gemma-3 4B model doesn't seem to have an image projector. Is it possible to combine Gemma-3 4B as the text model with the Qwen2-VL-2B image projector?


1 reply

jabberjabberjabber Mar 24, 2025
Maintainer

Projector: https://huggingface.co/bartowski/google_gemma-3-4b-it-GGUF/blob/main/mmproj-google_gemma-3-4b-it-f16.gguf

ihsanfrr · 2025-03-24T21:47:46Z

Thank you for your help. I've successfully implemented the Gemma-3 4B model on the Image Indexer program using an i3 10100f, 32GB RAM, and GTX 1080 Ti FTW. It outperforms the Qwen2-VL-2B model.

Tested models:
- google_gemma-3-4b-it-Q2_K.gguf: 10s/image
- google_gemma-3-4b-it-Q6_K.gguf: 15s/image
- Qwen2-VL-2B: 7s/image
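To put the per-image timings reported above in batch terms, here is a small Python sketch that converts seconds per image into approximate images per hour. The numbers are the ones from this post; the function name `images_per_hour` is just for illustration, and real throughput will vary with hardware, image size, and prompt.

```python
# Seconds per image as reported in the post above.
timings = {
    "google_gemma-3-4b-it-Q2_K.gguf": 10,
    "google_gemma-3-4b-it-Q6_K.gguf": 15,
    "Qwen2-VL-2B": 7,
}


def images_per_hour(seconds_per_image):
    """Convert a per-image time into an hourly throughput."""
    return 3600 / seconds_per_image


for model, seconds in timings.items():
    print(f"{model}: about {images_per_hour(seconds):.0f} images/hour")
```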
