Human behavior recognition collects and measures information about a target using image, video, acceleration, magnetic, and other sensors. It then applies technologies such as data mining, machine learning (including deep learning), and pattern recognition to extract a person's behavioral features: posture and a variety of static and dynamic states such as standing, walking, running, and jumping.
Currently, human behavior recognition mainly covers head movement recognition, gesture recognition, gait recognition, abnormal posture recognition, and hand-eye recognition. There are also several methods; behavior recognition based on video input and computer vision is called "behavior recognition AI."
In this column, I would like to introduce how Asilla's "Behavior Recognition AI" works and how it can be used in business applications.
Asilla's Behavior Recognition AI, "Asilla Behavior Recognition," is a complex AI system developed for commercial use that can recognize and infer the behavior of a person captured by a security camera without using any sensors.
As shown in the figure above, video (RTSP streams, MP4 files, etc.) is the input; human behavior is estimated through posture estimation and time series analysis, and the inference results are output. To realize this mechanism for commercial use, it was necessary to combine the following four technologies, (1) to (4), in a well-balanced manner. Because the inference results of multiple AIs are passed between one another, we call the system "Bound-AI."
This AI technology does not use sensors. It takes video and images as input, detects each person's joint coordinates to capture the whole body, and estimates the posture of multiple people in real time. Asilla's posture estimation technology, AsillaPose®, is not dissimilar to the OpenPose® technology of Carnegie Mellon University, which has been widely seen at exhibitions since the early days of deep learning, but it was developed in-house from scratch based on the following development concepts.
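As a rough illustration of the flow described above (per-frame poses feeding a time series analysis), here is a minimal Python sketch. It is not Asilla's implementation: the names `BehaviorPipeline` and `toy_classifier`, the window length, and the fall rule are all hypothetical, and the poses with their track IDs are assumed to come from upstream posture estimation and tracking stages.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Pose = List[Tuple[float, float]]  # joint (x, y) coordinates for one person


class BehaviorPipeline:
    """Buffers poses per tracked person and classifies each full window."""

    def __init__(self, classify: Callable[[List[Pose]], str], window: int = 4):
        # `window` is a hypothetical number of frames per time series window.
        self.classify = classify
        self.window = window
        self.buffers: Dict[int, Deque[Pose]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def push_frame(self, tracked_poses: Dict[int, Pose]) -> Dict[int, str]:
        """Take {track_id: pose} for one frame; return labels for full windows."""
        results: Dict[int, str] = {}
        for track_id, pose in tracked_poses.items():
            buf = self.buffers[track_id]
            buf.append(pose)  # the oldest pose is dropped automatically
            if len(buf) == self.window:
                results[track_id] = self.classify(list(buf))
        return results


def toy_classifier(window: List[Pose]) -> str:
    """Hypothetical rule: 'fall' if the first joint moves sharply down the image."""
    drop = window[-1][0][1] - window[0][0][1]  # image y grows downward
    return "fall" if drop > 50.0 else "normal"
```

The point of the sketch is that nothing is classified until a person has a full window of consecutive poses, which is why a stable ID per person matters so much to the later stages.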
・Pose estimation for a single image alone has no business value.
・Business value is created by analyzing and processing "time series" of images.
・It is essential to combine robustness, high accuracy, and high-speed processing in time series processing.
In other words, Asilla's AsillaPose® is a posture estimation technology optimized for time series analysis.
AsillaPose® is evaluated with a performance score based on mAP and fps. In a simple comparison on COCO it is currently 2.1 times faster than OpenPose, and its performance is projected to improve by 26.8% per year over the period from 2016 to 2025. For the past few years we have benchmarked ourselves against a Canadian AI startup as a competitor; it was recently acquired by a U.S. healthcare company.
Person tracking is essential for time series processing of the posture estimation results described above. If the IDs assigned to each person get mixed up when people are occluded or cross paths, time series processing fails, so robust tracking is required to realize "behavior recognition AI." Asilla's tracking technology, code-named "TT," is integrated into AsillaPose® and provides robust, high-performance tracking.
"TT" stands for "Thai Tracking," named after the CTO of Asilla Vietnam.
The metric is MOTA, on which TT can deliver five times the performance of the aforementioned competitor.
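For readers unfamiliar with MOTA (Multiple Object Tracking Accuracy), the sketch below shows the standard definition: one minus the ratio of all tracking errors (misses, false positives, and ID switches) to the number of ground-truth objects, summed over frames. The `FrameCounts` type and the sample numbers are illustrative assumptions, not Asilla's evaluation code or data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameCounts:
    ground_truth: int     # annotated people in the frame
    misses: int           # people the tracker failed to report
    false_positives: int  # reported tracks with no matching person
    id_switches: int      # people whose assigned ID changed in this frame


def mota(frames: Iterable[FrameCounts]) -> float:
    """MOTA = 1 - (misses + false positives + ID switches) / ground truth."""
    frames = list(frames)
    total_gt = sum(f.ground_truth for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.misses + f.false_positives + f.id_switches for f in frames)
    return 1.0 - errors / total_gt
```

Because ID switches count as errors, the metric directly penalizes the ID confusion under occlusion and crossing described above. Note that MOTA can be negative when errors outnumber ground-truth objects.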
This AI model combines the posture and ID information output by the posture estimation AI, AsillaPose®, and either detects specific behaviors by vector matching against individual behaviors or detects abnormal behaviors with an anomaly detection model. Its performance depends on that of (1) and (2), and its metric is the F1 score.
Base AI algorithms utilized include Auto Encoder, One-Class Classification, LSTM, and Graph Convolutional Networks.
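To make the idea of detecting a specific behavior by vector matching more concrete, here is a minimal sketch that compares a feature vector against stored behavior templates using cosine similarity. The choice of cosine similarity, the threshold value, and the names `match_behavior` and `templates` are assumptions for illustration; the article does not state how Asilla's matching is actually performed.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_behavior(
    feature: Sequence[float],
    templates: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # hypothetical similarity cut-off
) -> Optional[str]:
    """Return the best-matching behavior label, or None if nothing is close."""
    best_label, best_score = None, threshold
    for label, template in templates.items():
        score = cosine_similarity(feature, template)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label
```

In practice the feature vector would be derived from a window of tracked poses. Raising or lowering the threshold trades precision against recall, which is what the F1 score summarizes.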
This technology minimizes the computing resources consumed by (1), (2), and (3) above. Asilla has demonstrated cost-effectiveness by running (1), (2), and (3) on a single server that processes 60 to 80 streams (= number of cameras) in real time, and this model compression technology is what makes that possible.
The metrics are mAP, fps, MOTA, and F1, the same ones used by each AI model, and the basic modeling philosophy is to "increase fps while maintaining accuracy as much as possible." Among the various methods, Asilla is particularly strong in knowledge distillation, which lets it adjust the balance of metrics to best suit your business.
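As background on knowledge distillation, the sketch below shows the commonly used classification form of the loss: a small "student" model is trained to match the temperature-softened outputs of a larger "teacher" while still fitting the true label. This is the generic textbook formulation, not Asilla's training code; `temperature` and `alpha` are illustrative hyperparameters, and how Asilla applies distillation to pose or tracking models is not described in this article.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    label: int,
    temperature: float = 2.0,  # assumed value; softens both distributions
    alpha: float = 0.5,        # assumed weight of the teacher-matching term
) -> float:
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    # Assumes moderate logits so that no student probability underflows to 0.
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # Ordinary cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard
```

Shifting `alpha` toward 1 makes the student imitate the teacher more closely, while shifting it toward 0 emphasizes the labeled data. This is one simple example of a knob for balancing a smaller, faster model against accuracy.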
Asilla's "Behavior Recognition AI" is currently being used by 9 major companies in the fields of crime prevention security and smart cities, 3 companies in the mobility field, 2 companies in the manufacturing industry, and 1 company in the healthcare and nursing care field (as of March 31, 2022).
All of these companies have rated the technology very highly: the QCD (quality, cost, delivery) and business value of the AI solution development projects aimed at PMF (product-market fit) scored 4.6 out of 5.0 points, and business growth is expected going forward.
Finally, we would like to introduce some examples of business applications of this technology (unfortunately limited to those that can be made public).
More than half of the public feels that public safety in Japan has "worsened," while the security industry faces a serious manpower shortage (approximately 90% of operators are understaffed). The gap between the demand for security and the capacity to provide it grows larger by the day, and the quality of facility security continues to decline. Real estate owners, facility operators, and social infrastructure operators are increasingly concerned about this situation.
In response, Asilla is bringing an "AI security system for facilities" to market. The system adds AI to existing camera systems and instantly detects and judges abnormal or suspicious behavior by facility users, helping to prevent crimes and speed up emergency and first-aid calls.
Asilla is putting "Behavior Recognition AI" to practical use in facilities to prevent incidents and accidents. We develop and sell this as our own original product.
・Detection of unusual behavior
・Fall detection
・Detection of fights and drunkenness
・Notification to the video management system
The system can be used with either ATM built-in cameras or security cameras installed in stores.
Asilla has built into the AI the knowledge and know-how of its product development advisor Akira Saito, a former detective and former National Police Agency official. The AI can detect elderly people who are being led by special fraud schemes to make transfers, as well as "withdrawers," that is, criminals who withdraw money from ATMs using cash cards entrusted to them.
This application is at the feasibility study stage and will start with a PoC. We look forward to hearing from anyone facing these issues.
Cheating at pachinko and slot machines caused a total of 28 billion yen in damage over six and a half years. The system can detect and alert on unnatural hand movements (other than inserting coins or cards), furtive gestures, and other suspicious behavior at the machines, and can continuously track a person through a store across multiple cameras. This application is also at the feasibility study stage and will start with a PoC. We look forward to hearing from anyone facing these issues.
"Behavior Recognition AI" is being used on production lines to improve quality and work efficiency by alerting operators to procedural errors and analyzing their work in detail. A feasibility study has been completed, and we are looking for a joint product development and sales partner.
We are looking for companies that have channels into the manufacturing industry, including food production lines and pharmaceuticals, and are interested in working with us. As the workforce ages, the skills of experienced workers, which cannot be automated, are at risk of not being passed on to the next generation. We would be very happy to commercialize this project and support small and medium-sized manufacturers in Japan.
Recently, the social implementation of artificial intelligence (AI) systems that "replace the human visual cortex" seems to be progressing steadily, but there is also a view that such systems remain at the PoC stage and do not actually create much business value.
We believe the reason is that these systems are limited to recognizing a single image. Most AI products launched worldwide that claim to offer an alternative to the visual cortex recognize single images using a CNN or similar methods.
Asilla, on the other hand, has believed since its foundation that replacing the human visual cortex is possible only by analyzing images as a time series, just as the human visual system does, and has worked day by day toward that ambitious technical goal.
Our goals were so lofty that it took us seven years to bring our "Behavior Recognition AI" to the commercial stage and eight years to release the product. However, we feel that holding on to our initial ambition was worthwhile, and we have been able to introduce a technology that many people rate highly.
We hope that you will utilize Asilla's "Behavior Recognition AI" and "AI Security System for Facilities" to improve your business value in the market.