BRIEFLY.
Yandex Improves Archive Search Function with New Document Recognition Model
Briefly Editorial Team

TL;DR

  • Yandex has introduced an updated archive search function
  • The new document recognition model, Alice AI VLM, structures the recognized information and highlights the roles of event participants

Why it matters

The updated archive search lets users find data about their ancestors more quickly, because recognized documents are now presented as structured information rather than plain text.

Technical Details

The Yandex team has improved the archive search function by introducing a new document recognition model called Alice AI VLM. The service now not only recognizes the text of an archive file but also structures the information, highlighting the roles of event participants and connections between people. This allows users to immediately see the name of the person they need and find data about their ancestors more quickly.

Background and Context

The Yandex archive search service helps users quickly find mentions of people, places, and events in handwritten documents from the 18th to 20th centuries. The service's database contains over 20 million pages of historical documents from archives in various regions of Russia, as well as information from over 200 pre-revolutionary and Soviet newspapers and directories.

Industry Impact

The updated service is based on Yandex's multimodal model Alice AI VLM, which has a deep understanding of the Russian language and images. As the developers noted, this has made it possible to achieve high search accuracy: 90.5% on average and up to 92.7% for birth records. The new model also lets users set filters by event and role, such as 'born', 'father', and 'mother' for birth documents or 'groom', 'bride', and 'witness' for marriage certificates.
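
To make the idea of role-based filtering concrete, here is a minimal Python sketch of how structured records with event types and participant roles could be filtered. It is purely illustrative and does not reflect Yandex's actual implementation or API: the names Participant, ArchiveRecord, and filter_records, the field layout, and the sample people are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Participant:
    """Hypothetical structure: one person mentioned in a document."""
    name: str
    role: str  # e.g. "born", "father", "mother", "groom", "bride", "witness"


@dataclass
class ArchiveRecord:
    """Hypothetical structure: one recognized document."""
    event: str  # e.g. "birth" or "marriage"
    participants: list = field(default_factory=list)


def filter_records(records, event: Optional[str] = None,
                   role: Optional[str] = None,
                   name: Optional[str] = None):
    """Keep records of the given event type in which a single participant
    matches both the requested role and the (case-insensitive) name fragment.
    Any criterion left as None is ignored."""
    matches = []
    for record in records:
        if event is not None and record.event != event:
            continue
        if role is None and name is None:
            matches.append(record)
            continue
        for person in record.participants:
            role_ok = role is None or person.role == role
            name_ok = name is None or name.lower() in person.name.lower()
            if role_ok and name_ok:
                matches.append(record)
                break
    return matches


# Invented sample data, for illustration only.
records = [
    ArchiveRecord("birth", [
        Participant("Ivan Petrov", "born"),
        Participant("Pyotr Petrov", "father"),
        Participant("Maria Petrova", "mother"),
    ]),
    ArchiveRecord("marriage", [
        Participant("Pyotr Petrov", "groom"),
        Participant("Maria Ivanova", "bride"),
        Participant("Ivan Sidorov", "witness"),
    ]),
]

# Find birth records in which someone named "Petrov" appears as the father.
fathers = filter_records(records, event="birth", role="father", name="Petrov")
```

Matching the role and the name on the same participant is the design choice that matters here: a plain text search for "Petrov" would return both sample documents, while the role-aware filter returns only the one where he is the father.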