Asilla, Inc. (Headquarters: Chiyoda-ku, Tokyo; CEO: Go Onoue; hereinafter "Asilla"), a company developing proprietary video analytics AI and behavior recognition AI, announces the development of "AsillaVision-v1-4B," a proprietary Vision Language Model (VLM) specialized in anomalous behavior detection from security camera footage, leveraging its dataset of over 7 million security camera video records.
The model achieved 89% accuracy in identifying real-world anomalous behaviors such as falls, fights, and skateboard use within facilities, outperforming major VLMs including Google Gemini 3.1 Pro (84%), Alibaba Qwen3.5-9B (64%), and NVIDIA Nemotron Nano-12B-v2-VL (61%). *Based on Asilla's proprietary evaluation dataset.

In recent years, rapid advances in Vision Language Model (VLM) technology have made video analytics AI increasingly sophisticated. However, general-purpose VLMs developed by major tech companies such as Google, OpenAI, and NVIDIA are trained on large-scale internet data and lack specialized knowledge of security camera footage.
Security camera footage exists within closed networks at individual facilities and is rarely available on the internet. This "security camera data barrier" represents a structural limitation for general-purpose models.
Through AI Security asilla, which is deployed at facilities nationwide, Asilla has been continuously accumulating security camera footage data (CARD) since 2023. Using this proprietary dataset, which surpassed 7 million records as of February 2026, Asilla developed a VLM specialized for the security camera footage domain. The data was collected and used with the consent of the facilities where the system is deployed, and was anonymized.
Despite being a lightweight model with only 4 billion (4B) parameters, AsillaVision surpasses major general-purpose models at detecting real-world anomalous behaviors in security camera footage, such as falls, fights, and skateboard use.
*Compared based on identification performance for falls, fights, and skateboard use within facilities.
*Comparison models were selected from representative VLMs publicly available as of February 2026.

Its compact 4B-parameter design enables real-time inference in edge computing environments (on-premise servers) without relying on the cloud. This eliminates the need to transmit video data outside the facility, achieving both privacy protection and real-time performance.
The CARD data used to develop AsillaVision is collected with the consent of the facilities where the system is deployed and is used following a Privacy Impact Assessment (PIA) conducted by a third-party organization. Based on the PIA results, facial images and personally identifiable information are excluded from the training data, so the model is trained without personal information.
Asilla will continue to expand the range of behaviors AsillaVision can detect and to improve its accuracy, while accelerating real-world deployment through its own products and joint development with partner companies.
AI Security asilla is a system that uses AI to analyze existing security camera footage 24 hours a day, 365 days a year, instantly detecting anomalous behaviors such as violence, falls, and intrusion, as well as cautionary behaviors such as wandering, crowding, and physical distress.
Amid a deepening shortage of security personnel, the system catches irregularities that human monitoring easily misses and immediately notifies security guards and facility managers. Because it uses existing cameras, no additional equipment investment is required, making it a next-generation security solution that maintains a high level of safety even with limited personnel.

Representative: Go Onoue, CEO
Location: 1-4-2 Nakamachi, Machida City, Tokyo
Business: Development and provision of products and solutions based on behavior recognition AI
Official website: https://jp.asilla.com/
Asilla, Inc. PR Contact: Nakamura
Email: pr@asilla.jp