image vision (cooks pcs)

Eri Ishihara 2025-08-02 21:39:23 +02:00
parent 8816fe5b39
commit b3c772cc20
4 changed files with 417 additions and 5 deletions

@@ -7,6 +7,8 @@ your friendly ai assistant. frontend for ollama.
- git clone repository
- install ollama from `https://ollama.ai/download`
- pull a model from ollama (i recommend gemma3n:e4b for laptops like mine (i7-10750h + rtx 3050ti laptop edition))
- for image stuff, you'll need a model like llava:7b
- for webcam features, install fswebcam (Linux) or imagesnap (macOS)
- copy config.example.toml to config.toml and edit it to have the model you selected, optionally set your name in [user]
- npm i
- node index.js
@@ -31,10 +33,22 @@ lydia is written to be easily configurable through a toml file which is easier t
- temperature = the temperature you want lydia to use. basically how random the model is. default is 0.8
- max_tokens = the max context tokens you want lydia to use. basically how far she can remember. default is 8192
## Camera settings
- width = webcam capture width. default is 1280
- height = webcam capture height. default is 720
- quality = webcam capture quality (0-100). default is 100
- device = specific camera device (false for default, or "/dev/video0", etc.)
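a rough sketch of how these defaults could be applied when the camera settings are missing or only partly filled in. the names `CAMERA_DEFAULTS` and `resolveCameraSettings` are made up for illustration and are not taken from lydia's source; it also assumes the settings arrive as a plain object already parsed from the toml file.

```javascript
// Hypothetical defaults, mirroring the values documented in the list above.
const CAMERA_DEFAULTS = { width: 1280, height: 720, quality: 100, device: false };

// Merge a parsed camera settings object (possibly partial or missing)
// over the defaults. Assumption: the toml file has already been parsed.
function resolveCameraSettings(camera = {}) {
  const merged = { ...CAMERA_DEFAULTS, ...camera };
  // Keep quality inside the documented 0-100 range.
  merged.quality = Math.min(100, Math.max(0, Number(merged.quality)));
  return merged;
}

// Example: only override the resolution and the device.
console.log(resolveCameraSettings({ width: 640, height: 480, device: "/dev/video0" }));
```

anything you leave out of the config falls back to its default, so a partial camera section should still work under this approach.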
## Runtime configuration
the prompt can be changed by running `l!prompt <text>` in the chatbox. this only applies to the current session; if you want a persistent change, edit the config file.
## Image & Webcam Commands
- `l!image <path>` or `l!img <path>` - send an image file (or just `l!image` to browse)
- `l!webcam` or `l!cam` - take and send a webcam snapshot
- press `ESC` to open menu for image options
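the commands above could be recognised with something like the sketch below. `parseImageCommand` and the shape of the object it returns are assumptions for illustration only, not lydia's actual parser.

```javascript
// Hypothetical parser for the image and webcam chat commands.
// Returns null when the input is not one of those commands.
function parseImageCommand(input) {
  const match = /^l!(image|img|webcam|cam)(?:\s+(.*))?$/.exec(input.trim());
  if (!match) return null;

  const [, name, rest] = match;
  if (name === "webcam" || name === "cam") {
    return { type: "webcam" };
  }

  // `l!image` with no path means "open the file browser".
  const path = (rest ?? "").trim();
  return path ? { type: "image", path } : { type: "browse" };
}

console.log(parseImageCommand("l!img ./cat.png"));
console.log(parseImageCommand("l!image"));
console.log(parseImageCommand("l!cam"));
```

treating the short and long forms as aliases in one regex keeps the two spellings from drifting apart, which is one reasonable way to do it.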
# Other stuff
by hitting escape you can tab out of the chatbox, here you can do cool things like:
by hitting escape you can open the menu, here you can do cool things like:
- send pictures and take webcam snapshots
- get help and see all commands
- hit Q or CTRL+C to quit lydia (but why would you wanna do that anyway?)
- yea thats it