DEV Community

# vlm

Posts

NuMarkdown-8B-Thinking: The Open-Source Reasoning OCR that Converts PDFs to Auditable Markdown for Enterprise RAG Pipelines

10 min read
Journal of our experiments on VLM token pruning

15 min read
OCR - ID Card Scanner (VLM)

6 min read
VLM Pipeline with Docling

7 min read
Small Model from Huggingface with Video understanding

4 min read
Unlock the Magic of Images: A Quick and Easy Guide to Using the Cutting-Edge SmolVLM-500M Model

2 min read
Benchmarking Pixtral Large vs Pixtral 12B

3 min read
📊 Exploring Vision Language Models (VLMs) for Structured Data Extraction

2 min read
Stress Testing VLMs: Multi QnA and Description Tasks

4 min read
Benchmarking Pixtral 12B: MistralAI's New VLM

5 min read
Porting Phi-3-Vision to MLX: A Python Hobbyist's Journey into Advanced AI on Apple Silicon

5 min read
Part 1: Basic Implementation of Phi-3-Vision in MLX

8 min read
PixLab API Integration Guide: Quick Setup & Use

5 min read