metadata
license: other
base_model: LiquidAI/LFM2.5-VL-450M
tags:
  - vision
  - vlm
  - electronics
  - object-detection
  - grounding
  - gguf
  - edge-ai
  - fine-tuned
  - lfm2.5
  - liquid-ai
pipeline_tag: image-text-to-text
language:
  - en
library_name: transformers
metrics:
  - iou
model-index:
  - name: Electrocom VLM V2
    results:
      - task:
          type: image-text-to-text
          name: Visual Grounding
        dataset:
          name: ElectroCom61 + Electronic Detection (test split)
          type: erikku-sama/electrocom61
        metrics:
          - type: iou
            name: Grounding Recall (IoU > 0.5)
            value: 0.319

Electrocom VLM V2 (450M)

Electrocom VLM V2 is a specialized Vision-Language Model optimized for electronic component detection and recognition. It is a fine-tuned version of Liquid AI's LFM2.5-VL-450M.

This "Phase 2" version was trained on the combined ElectroCom61 and Electronic Detection datasets, which improves its visual robustness and detection accuracy: grounding recall (IoU > 0.5) on the test split reaches 31.9%.

📊 Performance

| Metric | Base LFM2.5-VL-450M | Electrocom VLM V2 (Ours) |
|---|---|---|
| Grounding Recall (IoU > 0.5) | 0.0% | 31.9% |
| Inference Speed (tok/s) | 81.6 | 85.2 |
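
For readers who want to reproduce a metric like Grounding Recall (IoU > 0.5), the sketch below shows one plausible way to compute it. The card does not describe the exact matching protocol, so the rule used here (a ground-truth box counts as recalled if any predicted box overlaps it with IoU above the threshold, labels ignored) is an assumption, and the names `iou` and `grounding_recall` are illustrative only.

```python
from typing import Sequence

Box = Sequence[float]  # [x1, y1, x2, y2], normalized to 0-1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_recall(ground_truth, predictions, threshold=0.5):
    """Fraction of ground-truth boxes overlapped by some prediction.

    Assumption (not from the model card): labels are ignored and one
    prediction may match more than one ground-truth box.
    """
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for gt in ground_truth
        if any(iou(gt, pred) > threshold for pred in predictions)
    )
    return hits / len(ground_truth)
```

A stricter protocol (one-to-one matching, or requiring the label to agree) would give lower numbers, so treat this as a starting point only.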

Key Improvements:

  • 7.5x Better Grounding: Compared to our V1 model, V2 is 7.5x better at correctly localizing components.
  • Structured JSON Output: The model has been trained to output detections in a precise JSON format, suitable for programmatic integration.
  • Edge-Ready: At only 450M parameters, this model is small enough to run quickly on modest hardware (85.2 tok/s in the table above).
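
As an illustration of the programmatic integration mentioned above, here is a minimal sketch that parses the model's JSON output and converts the normalized boxes to pixel coordinates. It assumes the output follows the `[{"label": str, "bbox": [x1,y1,x2,y2]}, ...]` shape requested in the usage prompt, possibly surrounded by extra text. The helper name `parse_detections` is hypothetical and not part of this repository.

```python
import json
import re


def parse_detections(raw: str, width: int, height: int) -> list:
    """Extract the JSON array from model output and scale boxes to pixels.

    Malformed output or malformed entries are skipped rather than raised,
    since a small model may not always emit valid JSON.
    """
    match = re.search(r"\[.*\]", raw, flags=re.DOTALL)
    if match is None:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []

    detections = []
    for item in items:
        if not isinstance(item, dict):
            continue
        label, bbox = item.get("label"), item.get("bbox")
        if not isinstance(label, str) or not isinstance(bbox, list) or len(bbox) != 4:
            continue
        try:
            # Clamp to [0, 1] in case the model slightly overshoots.
            x1, y1, x2, y2 = (min(max(float(v), 0.0), 1.0) for v in bbox)
        except (TypeError, ValueError):
            continue
        detections.append({
            "label": label,
            "bbox_px": [
                round(x1 * width), round(y1 * height),
                round(x2 * width), round(y2 * height),
            ],
        })
    return detections
```

For example, `parse_detections(output_text, 640, 480)` returns a list of dictionaries with a `label` and a pixel-space `bbox_px`, ready to draw or store.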

🚀 Usage with llama.cpp

The GGUF version of this model is included in this repository for use with llama.cpp.

# Download the model and projector
hf download erikku-sama/lfm2-electronics-vlm-v2 electrocom-vlm-v2-f16.gguf --local-dir .
hf download LiquidAI/LFM2.5-VL-450M-GGUF mmproj-LFM2.5-VL-450m-F16.gguf --local-dir .

# Run inference
llama-mtmd-cli \
  --model electrocom-vlm-v2-f16.gguf \
  --mmproj mmproj-LFM2.5-VL-450m-F16.gguf \
  --image your_image.jpg \
  -p "Inspect the image and detect all electronic components. Provide result as a valid JSON: [{\"label\": str, \"bbox\": [x1,y1,x2,y2]}, ...]. Coordinates must be normalized to 0-1." \
  --temp 0.1 \
  --jinja
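
To call the same command from a script, something like the following sketch could work. It assumes `llama-mtmd-cli` is on your `PATH` and that both GGUF files are in the current directory. The flags are copied from the command above, while the function names `build_command` and `run_detection` are hypothetical.

```python
import subprocess

# Same prompt as in the CLI example above.
PROMPT = (
    "Inspect the image and detect all electronic components. "
    'Provide result as a valid JSON: [{"label": str, "bbox": [x1,y1,x2,y2]}, ...]. '
    "Coordinates must be normalized to 0-1."
)


def build_command(
    image_path: str,
    model: str = "electrocom-vlm-v2-f16.gguf",
    mmproj: str = "mmproj-LFM2.5-VL-450m-F16.gguf",
) -> list:
    """Build the llama-mtmd-cli argument list used in this README."""
    return [
        "llama-mtmd-cli",
        "--model", model,
        "--mmproj", mmproj,
        "--image", image_path,
        "-p", PROMPT,
        "--temp", "0.1",
        "--jinja",
    ]


def run_detection(image_path: str) -> str:
    """Run the CLI and return its standard output as text."""
    result = subprocess.run(
        build_command(image_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

The returned text may include more than the JSON array itself, so it is worth validating before use.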

🎯 Intended Use

  • Automated inventory of electronic components
  • Visual assistance for circuit board inspection
  • Educational tools for electronics identification

📝 License

This model is based on LFM2.5-VL-450M and is subject to the Liquid AI model license.