lfm2-electronics-vlm-v2 / README.md

erikku-sama

metadata

license: other
base_model: LiquidAI/LFM2.5-VL-450M
tags:
  - vision
  - vlm
  - electronics
  - object-detection
  - grounding
  - gguf
  - edge-ai
  - fine-tuned
  - lfm2.5
  - liquid-ai
pipeline_tag: image-text-to-text
language:
  - en
library_name: transformers
metrics:
  - iou
model-index:
  - name: Electrocom VLM V2
    results:
      - task:
          type: image-text-to-text
          name: Visual Grounding
        dataset:
          name: ElectroCom61 + Electronic Detection (test split)
          type: erikku-sama/electrocom61
        metrics:
          - type: iou
            name: Grounding Recall (IoU > 0.5)
            value: 0.319

Electrocom VLM V2 (450M)

Electrocom VLM V2 is a specialized Vision-Language Model optimized for electronic component detection and recognition. It is a fine-tuned version of Liquid AI's LFM2.5-VL-450M.

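This "Phase 2" version was trained on a combined dataset of ElectroCom61 and Electronic Detection to improve visual robustness and detection accuracy. On the test split of that combined dataset it reaches 31.9% grounding recall (IoU > 0.5), compared with 0.0% for the base model.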

📊 Performance

| Metric | Base LFM2.5-VL-450M | Electrocom VLM V2 (Ours) |
| --- | --- | --- |
| Grounding Recall (IoU > 0.5) | 0.0% | 31.9% |
| Inference Speed (tok/s) | 81.6 | 85.2 |
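
For readers who want to reproduce a number like the grounding recall above, here is a minimal sketch of how such a metric can be computed. The exact evaluation protocol used for this model is not described in this card, so the matching rule below (a ground-truth box counts as found when any prediction with the same label overlaps it with IoU above the threshold) is an assumption, and the names `iou` and `grounding_recall` are illustrative.

```python
from typing import Dict, List, Sequence

Box = Sequence[float]  # [x1, y1, x2, y2], normalized to 0-1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_recall(
    preds: List[Dict], truths: List[Dict], threshold: float = 0.5
) -> float:
    """Fraction of ground-truth boxes matched by a same-label prediction.

    Assumption (not taken from the model card): a ground-truth box is
    counted as found when any prediction with the same label overlaps it
    with IoU strictly above `threshold`.
    """
    if not truths:
        return 0.0
    hits = 0
    for gt in truths:
        if any(
            p["label"] == gt["label"] and iou(p["bbox"], gt["bbox"]) > threshold
            for p in preds
        ):
            hits += 1
    return hits / len(truths)
```

Averaging this value over every image in a test split gives one plausible per-image recall; pooling hits and ground-truth counts across images before dividing is an equally common alternative, and the two can differ.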

Key Improvements:

7.5x Better Grounding: Compared with our V1 model, V2 is roughly 7.5 times better at correctly localizing components, reaching 31.9% grounding recall at IoU > 0.5.
Structured JSON Output: The model is trained to emit detections as a JSON list of objects, each with a label and a bbox given as [x1, y1, x2, y2] normalized to 0-1, which makes it suitable for programmatic integration.
Edge-Ready: At only 450M parameters, the model is small enough for modest hardware and generated 85.2 tok/s in the benchmark above.
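
Because the output is meant for programmatic use, it helps to validate it before trusting it. The following is a minimal sketch, assuming the reply contains a single JSON array of `{"label": ..., "bbox": [x1, y1, x2, y2]}` objects, possibly surrounded by extra text. The function name `parse_detections` and the choice to clamp coordinates and drop malformed entries are illustrative decisions, not behavior defined by this model.

```python
import json
from typing import Dict, List


def parse_detections(text: str) -> List[Dict]:
    """Extract and validate {"label", "bbox"} detections from model text."""
    # Take the outermost [...] span so leading or trailing chatter is ignored.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(text[start : end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    detections = []
    for item in items:
        if not isinstance(item, dict):
            continue
        label, bbox = item.get("label"), item.get("bbox")
        if not isinstance(label, str):
            continue
        if not isinstance(bbox, list) or len(bbox) != 4:
            continue
        if not all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in bbox
        ):
            continue
        # Clamp to [0, 1]: coordinates are expected to be normalized.
        x1, y1, x2, y2 = (min(1.0, max(0.0, float(v))) for v in bbox)
        # Keep only boxes with positive width and height.
        if x2 > x1 and y2 > y1:
            detections.append({"label": label, "bbox": [x1, y1, x2, y2]})
    return detections
```

Returning an empty list on malformed output keeps downstream code simple; a stricter pipeline might raise an error instead so that bad generations are noticed.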

🚀 Usage with llama.cpp

The GGUF version of this model is included in this repository for use with llama.cpp.

```bash
# Download the model and projector
hf download erikku-sama/lfm2-electronics-vlm-v2 electrocom-vlm-v2-f16.gguf --local-dir .
hf download LiquidAI/LFM2.5-VL-450M-GGUF mmproj-LFM2.5-VL-450m-F16.gguf --local-dir .

# Run inference
llama-mtmd-cli \
  --model electrocom-vlm-v2-f16.gguf \
  --mmproj mmproj-LFM2.5-VL-450m-F16.gguf \
  --image your_image.jpg \
  -p "Inspect the image and detect all electronic components. Provide result as a valid JSON: [{\"label\": str, \"bbox\": [x1,y1,x2,y2]}, ...]. Coordinates must be normalized to 0-1." \
  --temp 0.1 \
  --jinja
```
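
Since the prompt asks for coordinates normalized to 0-1, the boxes usually need to be scaled back to pixels before drawing or cropping. A minimal sketch follows, assuming the origin is the top-left corner, x scales with image width, and y scales with image height. The helper name `to_pixel_boxes` is hypothetical, and the image size must come from your own image loader.

```python
from typing import Dict, List


def to_pixel_boxes(detections: List[Dict], width: int, height: int) -> List[Dict]:
    """Scale normalized [x1, y1, x2, y2] boxes to integer pixel coordinates."""
    scaled = []
    for det in detections:
        x1, y1, x2, y2 = det["bbox"]
        scaled.append(
            {
                "label": det["label"],
                # Assumption: x values map to width, y values map to height.
                "bbox": [
                    round(x1 * width),
                    round(y1 * height),
                    round(x2 * width),
                    round(y2 * height),
                ],
            }
        )
    return scaled
```

For example, a box of `[0.25, 0.5, 0.75, 1.0]` on a 640x480 image maps to `[160, 240, 480, 480]` in pixels.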

🎯 Intended Use

Automated inventory of electronic components
Visual assistance for circuit board inspection
Educational tools for electronics identification

📝 License

This model is based on LFM2.5-VL-450M and is subject to the Liquid AI model license.