Zhipu AI Introduces GLM-OCR: A 0.9B Multimodal OCR Model for Document Parsing and Key Information Extraction
Zhipu AI has launched GLM-OCR, a compact multimodal OCR model built for efficient document parsing and key information extraction. It pairs a 0.4B-parameter CogViT vision encoder with a 0.5B-parameter GLM language decoder (0.9B parameters in total), and is positioned around significantly higher throughput and structured output capabilities.
Details
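The announcement emphasizes structured output for key information extraction. As a minimal sketch of what consuming such output looks like downstream (the field names and JSON schema here are assumptions for illustration, not GLM-OCR's documented format), a KIE result is typically machine-readable JSON that calling code parses and validates:

```python
import json

# Hypothetical KIE output for an invoice document; the schema is
# illustrative only, not GLM-OCR's actual output format.
raw_output = '{"invoice_number": "INV-2024-001", "total": "1250.00", "currency": "USD"}'

def extract_fields(model_output: str, required: list[str]) -> dict:
    """Parse the model's JSON output and verify required keys are present."""
    fields = json.loads(model_output)
    missing = [k for k in required if k not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return fields

fields = extract_fields(raw_output, ["invoice_number", "total"])
print(fields["total"])  # prints 1250.00
```

Validating required keys up front lets a document pipeline fail fast on incomplete extractions instead of propagating partial records.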
This story is part of the daily NewsCube AI news stream. The detail page keeps the main summary easy to scan, while surfacing the original source links so readers can verify the reporting and dive deeper.