Title:
VIDEO DATA PROCESSING ARRANGEMENT, PROCESS FOR MANAGING VIDEO DATA, COMPUTER PROGRAM AND COMPUTER PROGRAM PRODUCT
Document Type and Number:
WIPO Patent Application WO/2024/094311
Kind Code:
A1
Abstract:
The invention relates to a video data processing arrangement (1), the arrangement comprising: a platform operating system (4) for managing hardware and/or software resources, whereby the platform operating system (4) is adapted for providing common services for apps (5), whereby the platform operating system (4) comprises a system input interface (6) for receiving video data and a system output interface (7) for providing the video data as video data as received or as preprocessed video data, the video data processing arrangement (1) further comprising: an app (5), whereby the app (5) is adapted for using at least a part of the common services of the platform operating system (4), whereby the app (5) comprises an app input interface (10) for receiving the video data from the system output interface (7), whereby the platform operating system (4) provides a shared memory (11), whereby the system output interface (7) is adapted for providing the video data in the shared memory (11) and the app input interface (10) is adapted for receiving the video data from the shared memory (11).

Inventors:
PUVVULA BHARGAVA (NL)
VAN TIEL BAS (NL)
Application Number:
PCT/EP2022/080815
Publication Date:
May 10, 2024
Filing Date:
November 04, 2022
Assignee:
ROBERT BOSCH GMBH (DE)
International Classes:
G06F9/455; G06T1/00
Claims:

1. Video data processing arrangement (1), the arrangement comprising: a platform operating system (4) for managing hardware and/or software resources, whereby the platform operating system (4) is adapted for providing common services for apps (5), whereby the platform operating system (4) comprises a system input interface (6) for receiving video data and a system output interface (7) for providing the video data as video data as received or as preprocessed video data, the video data processing arrangement (1) further comprising: an app (5), whereby the app (5) is adapted for using at least a part of the common services of the platform operating system (4), whereby the app (5) comprises an app input interface (10) for receiving the video data from the system output interface (7), whereby the platform operating system (4) provides a shared memory (11), whereby the system output interface (7) is adapted for providing the video data in the shared memory (11) and the app input interface (10) is adapted for receiving the video data from the shared memory (11).

2. Video data processing arrangement (1) according to claim 1, characterized in that the system output interface (7) and the app input interface (10) are adapted to use the shared memory (11) as own memory areas.

3. Video data processing arrangement (1) according to claim 1 or 2, characterized in that the app (5) is adapted to run on the platform operating system (4).

4. Video data processing arrangement (1) according to one of the preceding claims, characterized in that the app (5) is adapted to perform video data analysis on the video data.

5. Video data processing arrangement (1) according to one of the preceding claims, characterized in that the app (5) is adapted to perform security video analysis tasks.

6. Video data processing arrangement (1) according to one of the preceding claims, characterized by comprising a camera system (2), whereby the camera system (2) is a hardware platform for the platform operating system (4) and for the app (5).

7. Video data processing arrangement (1) according to one of the preceding claims, characterized by comprising a cloud instance, whereby the cloud instance is a hardware platform for the platform operating system (4).

8. Video data processing arrangement according to one of the preceding claims 1 to 6, characterized by comprising a computer with a virtual system, whereby the computer with the virtual system is a hardware platform for the platform operating system (4).

9. Video data processing arrangement (1) according to one of the preceding claims, characterized in comprising a hardware platform with a CPU (12) and a GPU (13) and a further shared memory (14), whereby the CPU (12) and the GPU (13) are adapted to use the further shared memory (14) for transporting the video data.

10. Video data processing arrangement (1) according to one of the preceding claims, characterized in that the platform operating system (4) is adapted to convert the video data in the YUV-domain before transferring the video data to the system output interface (7).

11. Video data processing arrangement (1) according to one of the preceding claims, characterized in that the platform operating system (4) is based on an Android™ open source platform (AOSP).

12. Video data processing arrangement (1) according to claim 9, characterized in that the shared memory (11) is realized as an Android™ HardwareBuffer.

13. Process for managing video data in the video data processing arrangement (1) according to one of the preceding claims, whereby the video data is written in the shared memory (11) as the system output interface (7) by the platform operating system (4) and is read from the shared memory (11) as an app input interface (10) from the app (5).

14. Computer program having program code means in order to carry out all the steps of a process according to claim 13, particularly when the computer program is executed on the video data processing arrangement (1) according to one of the claims 1 to 12.

15. Computer program product having program code means that are stored on a computer-readable storage medium in order to carry out the steps of a method according to claim 13 when the computer program is executed on the video data processing arrangement (1) according to one of the claims 1 to 12.

Description:
Video data processing arrangement, process for managing video data, computer program and computer program product

State of the art

The invention relates to a video data processing arrangement according to claim 1. Furthermore, the invention relates to a process for managing video data, a computer program and a computer program product.

In conventional computer systems, a computer program (user-space program) has to share the memory bandwidth with the operating system and other programs, which affects the latency and the bandwidth of the memory available to such programs. Especially in real-time video applications, the data rates are quite high, so that the increased latency and the reduced bandwidth lead to a delay in the real-time video application.

Disclosure of the invention

The invention concerns a video data processing arrangement with the features of claim 1, a process for managing video data with the video data processing arrangement with the features of claim 13, a computer program with the features of claim 14 and a computer program product with the features of claim 15. Preferred or advantageous embodiments of the invention are disclosed by the dependent claims, the description and the figures as attached.

Subject-matter of the invention is a video data processing arrangement, which is adapted to process video data. The video data is especially realized as a video stream and/or a plurality of video frames, which represent a video stream. The video data may be provided by a storage, especially a storage medium, a cloud storage et cetera. Preferably, the video data is provided by a camera, especially a surveillance camera. It is especially preferred that the video data is realized as real-time video data, whereby the video data processing arrangement is adapted to process the video data in real-time.

The video data processing arrangement may be realized on a hardware platform, comprising a video camera, a computer, a cloud instance et cetera as the hardware platform. Optionally, the video data processing arrangement comprises the video camera as an input device. Alternatively or additionally, the video data processing arrangement comprises an output for providing the processed video data and/or meta data based on the processed video data. The output can be connected to a storage for storing the processed video data and/or meta data, to a display for displaying the processed video data and/or meta data and/or to a signaling device, for example a traffic light, for outputting signals on basis of the processed video data and/or meta data. The storage, display and signaling device may be a part of the video data processing arrangement.

The video data processing arrangement comprises a platform operating system for managing hardware and/or software resources. In case of physical hardware resources, the platform operating system is adapted for managing, for example, storage means, data buses et cetera. In case of virtual hardware resources, the platform operating system is adapted for managing virtualized hardware.

Additionally, the platform operating system is adapted for providing common services for apps. The term "app" is a common abbreviation of “application”. Especially, an app is a computer program designed to carry out a specific task.

The platform operating system comprises a system input interface for receiving the video data. The system input interface may be connected to the storage or to the camera. Furthermore, the platform operating system comprises a system output interface for providing the video data as the original video data as received or as preprocessed video data. Thus, it is possible that the platform operating system provides functions for preprocessing the video data.

The platform operating system may be embodied as a single instance. Alternatively, the platform operating system may provide a plurality of instances. In case a plurality of instances is provided, it is possible that the single instances provide only parts of the functions, especially managing hardware and/or software resources, providing common services etc. The platform operating system with all instances then provides the said functions. However, it is preferred that the instance, especially each instance, provides or encapsulates all functions.

The video data processing arrangement comprises an app adapted to the platform operating system. The app may be realized as a so-called third-party app. The app uses common services for apps of the platform operating system.

The app comprises an app input interface for receiving the video data, which is the original video data or the preprocessed video data from the system output interface. The app is especially adapted to process the video data and to produce processed video data and/or meta data based on the video data and/or on the processed video data. Optionally, the processed video data and/or meta data is output by the output as described above. The output is preferably part of the app.

Preferably, the app is within a user-space or sandbox of the platform operating system, which is separated from the system-space of the platform operating system, so that non-authorized accesses from the user-space to the system-space are prevented. It is especially preferred, that the platform operating system is adapted to control the user-rights of the app in the user-space.

According to the invention the platform operating system provides a shared memory, especially for temporarily storing video data, whereby the system output interface is adapted for providing, especially storing, the video data in the shared memory and the app input interface is adapted for receiving the video data from the shared memory. Especially, the platform operating system and the app are adapted to access the shared memory.

It is an idea of the invention, that storing and retrieving video data from memory means and especially copying the video data from one memory means to another memory means is time-consuming and burdensome for the bus system(s). Instead of providing an output memory means for the system output interface and an input memory means for the app input interface, whereby the video data must be transferred from the output memory means to the input memory means, it is proposed to use the shared memory as the system output memory means and the app input memory means. As a consequence, one copying process of the video data as a time-consuming and resource consuming process is omitted, so that latency is reduced and the available bandwidth of the platform is more efficiently used. The invention is about how to use software techniques to efficiently transfer large quantities of video data on the platform operating system and deliver it to the app, especially 3rd party applications.
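The saved copy described above can be illustrated with a minimal sketch. It is not the claimed implementation and does not use the Android™ HardwareBuffer; it assumes Python's standard `multiprocessing.shared_memory` module as a stand-in for the shared memory, and the names `write_frame`, `read_checksum` and `FRAME_BYTES` are hypothetical. The producer stores the frame directly in the shared block, and the consumer attaches to the same block by name and reads it in place instead of receiving a copy.

```python
from multiprocessing import shared_memory

# Size of one 8-bit grayscale frame; the resolution is an arbitrary assumption.
FRAME_BYTES = 640 * 480


def write_frame(shm_name: str, frame: bytes) -> None:
    """Platform side (hypothetical): store the frame directly in the shared block."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        shm.buf[:len(frame)] = frame
    finally:
        shm.close()


def read_checksum(shm_name: str, size: int) -> int:
    """App side (hypothetical): attach to the same block and read it in place."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        view = shm.buf[:size]  # memoryview slice over the shared block, no copy
        try:
            return sum(view) % 256
        finally:
            view.release()  # exported views must be released before close()
    finally:
        shm.close()


if __name__ == "__main__":
    block = shared_memory.SharedMemory(create=True, size=FRAME_BYTES)
    try:
        write_frame(block.name, bytes(FRAME_BYTES))
        print(read_checksum(block.name, FRAME_BYTES))
    finally:
        block.close()
        block.unlink()
```

In this sketch both sides address the same block by name, so no separate output memory and input memory exist between which the frame would have to be transferred.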

Processing of live and/or pre-recorded, encoded and/or decoded, especially high-resolution, high-frame-rate video data consumes a lot of hardware resources in terms of bandwidth and latency. By reducing the number of copying processes, the latency can be reduced and bandwidth can be saved.

It is especially preferred, that the system output interface and the app input interface are adapted to use the shared memory as own memory areas. For example, both interfaces may be adapted to have read and write rights.

Preferably, the app is adapted to run on the platform operating system. The app is especially embodied as a native app for the platform operating system.

In a preferred realization, the app is adapted to perform video analysis tasks and/or security video analysis tasks on the video data. For example, the app may be adapted for object and/or person detection and/or tracking, detection of events like burglary, entering prohibited zones, detection of fire, detection of fleeing persons, face detection et cetera. Other video analysis tasks concern Retail store management, Crowd management, People counting, COVID-19 detection, Queue management, Indoor detection, Motion detection, Object recognition, Anomaly detection, Crossline detection, Health & safety analysis, Heat mapping, Intrusion detection, Perimeter protection, Parking management, Traffic analysis, Demographic estimation, Dynamic masking of persons or objects, Loitering detection, Fall detection of persons, Face recognition, License plate recognition, Sabotage detection, Emotion estimation, Zone masking, Filtered search results, Filtered by Retail store management Retail, Out-of-shelf analytics (which provide near real-time information on product stock levels and can alert employees when restocking is required), Crowd Gathering Safety, Real Time Crowd Gathering, People Counting, PIN Terminal Blur, Mask Check, Container identification, Gun / active shooter detection, On camera VMS, Person re-identification, Smoke detection, Traffic incident detection, Vehicle identification et cetera.

The app may provide processed video data and/or meta data on the basis of the detected and/or tracked objects and/or persons and/or on basis of the other tasks as described above.

In a preferred embodiment, the video data processing arrangement comprises a camera system, whereby the camera system comprises a digital data processing unit, for example a computer, a micro-controller et cetera. Preferably the camera system with a digital data processing unit is realized as one common assembly, for example enclosed in one common housing. The camera system and especially the digital data processing unit forms a hardware platform for the platform operating system and for the app. In this embodiment, the shared memory is embodied within the common digital data processing unit. The camera system and especially the digital data processing unit is the common platform for the platform operating system and for the app.

In an alternative embodiment, the video data processing arrangement comprises a cloud instance, which can be run on a cloud server, whereby the cloud server is a hardware platform for the platform operating system. In this embodiment, the cloud server with the platform operating system can be coupled or is coupled with another digital processing apparatus, for example a camera system, whereby the platform operating system is running on the cloud instance and/or the cloud server and the app is running on the digital processing apparatus, whereby the cloud instance and/or the cloud server provides services to the digital processing apparatus. It is also possible, that the platform operating system additionally runs on the digital processing apparatus.

In yet another embodiment, the video data processing arrangement comprises a computer with a virtual system, whereby the computer with a virtual system is a hardware platform for the platform operating system. The platform operating system runs on the virtual system. The app can additionally run on the virtual system.

It is further preferred that the video data processing arrangement comprises a hardware platform, whereby the hardware platform comprises a CPU and a GPU and a further shared memory. The hardware platform may be realized as the camera system or the cloud server or the computer with a virtual system. The platform operating system is adapted so that the CPU and the GPU both use the further shared memory for transporting the video data. The CPU provides a CPU output interface and the GPU provides a GPU input interface, whereby the CPU output interface is adapted for providing the video data in the further shared memory and the GPU input interface is adapted for receiving the video data from the further shared memory. Especially, the platform operating system is adapted so that the CPU and the GPU are both able to access the further shared memory. Instead of providing a CPU output memory means for the CPU output interface and a GPU input memory means for the GPU input interface, whereby the video data must be transferred from the CPU output memory means to the GPU input memory means, it is proposed to use the further shared memory as the CPU output memory means and the GPU input memory means. As a consequence, one copying process of the video data as a time-consuming and resource-consuming process is omitted, so that latency is reduced and the available bandwidth of the platform is more efficiently used. The further shared memory may be embodied as an internal system memory (known as DDR, SDRAM, volatile system memory).

In a further embodiment, it is proposed that the platform operating system is adapted to convert and/or to subsample the video data in the YUV domain as preprocessed video data to decrease the data size per image. It is possible that the YUV video data as preprocessed video data is provided in the shared memory. Alternatively, only the Y-channel is provided in the shared memory and the UV-channel is discarded, because the Y-channel is often sufficient for computer-vision applications such as the video analytics described above. The conversion is preferably performed by the GPU. In this embodiment the data stream of the video data is significantly downsized, so that latency is reduced and the available bandwidth of the platform is more efficiently used.

In a preferred realization, the platform operating system is based on an Android™ open source platform and/or is realized as AZENA operating system. Both systems provide the possibility of sharing memory as needed for realization of the claimed subject matter. It is especially preferred that the shared memory is realized as an Android™ HardwareBuffer.
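The reduction in data size per image can be made concrete with a small worked calculation. The sketch below is illustrative only: it assumes 8 bits per sample, a planar layout in which the Y plane is stored first, and 2x2 chroma subsampling (commonly called 4:2:0) as one possible subsampling scheme; the description does not prescribe a specific scheme. The function names `frame_bytes` and `y_plane` are hypothetical.

```python
def frame_bytes(width: int, height: int, layout: str) -> int:
    """Bytes needed for one frame with 8 bits per sample in the given layout."""
    pixels = width * height
    if layout in ("rgb24", "yuv444"):
        # Three full-resolution channels.
        return pixels * 3
    if layout == "yuv420":
        # Full-resolution Y plane plus U and V planes subsampled 2x2.
        return pixels + 2 * ((width // 2) * (height // 2))
    if layout == "y_only":
        # Luma only; the UV information is not transferred at all.
        return pixels
    raise ValueError(f"unknown layout: {layout}")


def y_plane(frame, width: int, height: int) -> memoryview:
    """Return the Y plane of a planar YUV frame as a view, without copying.

    Assumes the Y plane occupies the first width * height bytes of the frame.
    """
    return memoryview(frame)[: width * height]


if __name__ == "__main__":
    for layout in ("rgb24", "yuv420", "y_only"):
        print(layout, frame_bytes(1920, 1080, layout))
```

Under these assumptions a 1920x1080 frame needs 6,220,800 bytes with three full-resolution channels, 3,110,400 bytes with 2x2 chroma subsampling and 2,073,600 bytes when only the Y-channel is provided, that is, one half and one third of the original size respectively.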

The Android™ HardwareBuffer has a property to set the timestamp in a respective field. Preferably, the video data processing arrangement, especially the platform operating system and/or the app, is adapted to set the timestamp in this field. With this embodiment, the video data processing arrangement is capable of replaying prerecorded video data at a much higher frame rate while still producing frame-accurate metadata. In the video data processing arrangement, it is beneficial to have the timestamp field as a 'tag' for easily locating the original frames of the video data. The app can use this tag to embed it properly in the metadata.
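How a timestamp tag keeps metadata frame-accurate during fast replay can be sketched as follows. This is a simplified model under stated assumptions, not the Android™ API: `FrameBuffer`, `analyse` and `find_frame` are hypothetical names, and the "analysis" is a placeholder mean-brightness computation. The point shown is that the tag travels with the buffer, so the metadata refers to the recording time of the frame and not to the moment at which the frame was replayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameBuffer:
    """Hypothetical stand-in for a buffer carrying a timestamp field."""
    timestamp_us: int  # recording time of the frame, used as a tag
    pixels: bytes      # Y-channel samples of the frame


def analyse(frame: FrameBuffer) -> dict:
    """Hypothetical app step: produce metadata that embeds the frame's tag."""
    mean_luma = sum(frame.pixels) // max(len(frame.pixels), 1)
    return {"timestamp_us": frame.timestamp_us, "mean_luma": mean_luma}


def find_frame(frames, metadata: dict) -> FrameBuffer:
    """Locate the original frame that a metadata record refers to."""
    by_tag = {frame.timestamp_us: frame for frame in frames}
    return by_tag[metadata["timestamp_us"]]


if __name__ == "__main__":
    # Frames recorded 40 ms apart; replay speed does not change their tags.
    recording = [FrameBuffer(i * 40_000, bytes([i * 10] * 4)) for i in range(3)]
    records = [analyse(frame) for frame in recording]
    print(find_frame(recording, records[1]))
```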

A further subject matter of the invention is a process for managing video data in the video data processing arrangement as described above, whereby the video data is written in the shared memory as the system output interface by the platform operating system and is read from the shared memory as an app input interface from the app.

A further subject matter of the invention is a computer program having program code means in order to carry out all steps of the method according to claim 13, especially when the computer program is executed on the video data processing arrangement as described above. Preferably, the computer program comprises the platform operating system and the app.

A further subject matter of the invention is a computer program product having program code means that are stored on a computer readable storage medium. The computer program is realized as described above.

Further embodiments, features and advantages of the invention are described by the following description of preferred embodiments and the figures as attached. The figures show:

Figure 1 a block diagram of a video data processing arrangement as a first embodiment of the invention;

Figure 2 a block diagram of a video data processing arrangement as a second embodiment of the invention;

Figure 1 shows a block diagram of a video data processing arrangement 1 as an embodiment of the invention. The video data processing arrangement 1 comprises a camera system 2 as a hardware platform of the video data processing arrangement 1. The camera system 2 comprises a camera sensor 3 for capturing images.

The video data processing arrangement 1 comprises a platform operating system 4, which runs on the camera system 2. The platform operating system 4 is adapted for managing hardware and/or software resources of the camera system 2 as well as to provide common services for apps 5. The apps 5 may be realized as third-party user space applications. The platform operating system 4 comprises a system input interface 6 for receiving video data captured by the camera sensor 3.

The platform operating system 4 further comprises a system output interface 7 for providing the video data as the original video data or as preprocessed video data. The platform operating system 4 provides a user space 8 on the hardware platform, especially the camera system 2. The video data processing arrangement 1 comprises at least one of the apps 5, which runs on the video data processing arrangement 1 within the user space 8. The user space 8 and thus the app 5 is restricted from accessing the space 9 of the platform operating system 4. The app 5 uses at least a part of the common services provided by the platform operating system 4. The app 5 may run on the same or a further instance of the platform operating system 4. The app 5 comprises an app input interface 10 for receiving the video data from the system output interface 7. The tasks of the app 5 are described above and concern especially video data analysis tasks and especially security video analysis tasks.

The platform operating system 4 provides a shared memory 11, whereby the shared memory 11 is used by the system output interface 7 to write the video data in the shared memory 11 and is used by the app input interface 10 to read the video data from the shared memory 11.

By using the shared memory 11 by the system output interface 7 as well as by the app input interface 10, one copying process of the video data during transferring the video data from the space 9 and/or from the platform operating system 4 can be saved, so that the bandwidth of the camera system 2 is saved and the latency during transferring the video data is reduced.

The camera system 2 optionally comprises a CPU 12, whereby the CPU 12 manages the transfer including optionally preprocessing the captured images as video data to the system input interface 6. The camera system 2 comprises a memory means 13, which is embodied as a volatile memory, for example a DDR memory. The platform operating system 4 optionally comprises a further shared memory 14, which is embodied by the memory means 13. The video data is transferred from the system input interface 6 to the memory means 13 and especially to the further shared memory 14.

The camera system 2 comprises a GPU (graphics processing unit) 15, whereby the GPU 15 is adapted to preprocess the video data. The GPU 15 provides a GPU input interface 16, which is adapted to read the video data from the further shared memory 14. The GPU 15 can for example be adapted for converting the video data in the YUV-domain or in another domain as a preprocessing in order to reduce the data amount needed to transfer the video data. Optionally the GPU 15 can reduce the data amount by only transferring the Y-channel as video data, because the Y-channel is often enough for video analysis as used by the apps 5.

By using the further shared memory 14 by the system input interface 6 as well as by the GPU input interface 16, one copying process of the video data during transferring the video data from the camera sensor 3 and/or from CPU 12 to the GPU 15 can be saved, so that the bandwidth of the camera system 2 is saved and the latency during transferring the video data is reduced. The GPU 15 provides a GPU output interface 17, which is equal to the system output interface 7 and which provides the video data as the original video data or the preprocessed video data to the shared memory 11.

The app 5 is adapted to process the video data and to generate processed video data and/or meta data based on the processed video data. The video data processing arrangement 1 comprises an output 18 for providing the processed video data and/or meta data based on the processed video data from the app 5. The output 18 can be connected to a storage for storing the processed video data and/or meta data, to a display for displaying the processed video data and/or meta data and/or to a signaling device, for example a traffic light, for outputting signals on basis of the processed video data and/or meta data.

In an overall view, the video data processing arrangement 1 comprises a video pipeline 19, starting from system input interface 6 and ending at the app 5, whereby two copying processes are saved by means of the shared memory 11 and the further shared memory 14.

In the embodiment as shown, the platform operating system 4 is realized as Azena operating system, which is based on an Android™ platform. All memory copies in software are minimized to the bare-essential of exactly 1 user-space 6 memory copy for the video-data from the source to the destination executed by the CPU 12 and provided as a common service by the platform operating system 4. Thereby, the Hardware Buffer is used.

Figure 2 shows a block diagram of a video data processing arrangement 1 as a further embodiment of the invention. The video data processing arrangement 1 comprises a cloud server 21 as a hardware platform and/or a physical machine of the video data processing arrangement 1. The cloud server 21 provides cloud services for the video data processing arrangement 1.

The video data processing arrangement 1 comprises a platform operating system 4, which runs on the cloud server in four instances I - IV. The platform operating system 4 is adapted for managing hardware and/or software resources of the cloud server 21 as well as to provide common services for apps 5. The apps 5 may be realized as third-party user space applications. The platform operating system 4 comprises the system input interface 6 for receiving video data provided by a network interface or a storage means or by a camera. Furthermore, the platform operating system 4 and especially the apps 5 provide the output 18 for providing the processed video data and/or meta data based on the processed video data from the apps 5.

Each of the four instances of the platform operating system 4 provides the following functions:

Function I provides the system input interface 6 for receiving the video data from the network interface and/or from the storage and transferring the video data in an input shared memory 20.

Function II provides the transfer of the video data from the input shared memory 20 to the further shared memory 14 including optionally preprocessing the video data. The function II can be handling a physical or virtual CPU 12. Especially, the function II may provide the functions of the CPU 12 of the first embodiment as described above.

Function III provides the transfer of the video data from the further shared memory 14 to the shared memory 11 including optionally preprocessing the video data. The function III can be handling a physical or virtual GPU 15. The function III may comprise the GPU input interface 16 for receiving the video data from the further shared memory 14 and/or the GPU output interface 17 for transferring the video data to the shared memory 11, thus realizing the system output interface 7. Especially, the function III may provide the functions of the GPU 15 of the first embodiment as described above.

Function IV provides the user space 8 for the apps 5 and can be arranged on the cloud server 21 or on a separate device. The function IV provides the app input interface 10, which is embodied as the shared memory 11. Especially, the function IV and the app 5 may provide the functions of the user space 8 and the app 5 of the first embodiment as described above. The other functions I - III are arranged in space 9, which is separated from user space 8 as described above.

In an overall view, the video data processing arrangement 1 comprises the video pipeline 19, starting from system input interface 6 and ending at the system output interface 7 or the app 5, whereby three copying processes are saved by means of the shared memory 11, the further shared memory 14 and the input shared memory 20.

The Azena OS as an example for the platform operating system 4 can be hosted as a guest in a cloud environment embodied as the cloud server 21. The hardware peripherals like GPU and memory can be shared from the main platform OS to different guest OSs.

The physical CPUs will be shared as virtual CPUs with the Azena OS. Multiple instances of the Azena OS, each with its own identity, can run in parallel on one physical system in the cloud. The basic idea is that video enters at the system interface and metadata comes back, which might be processed further by the system receiving it. In the example, four Azena OS instances are shown, each having four functions.

If, for example, the same pre-recorded footage is presented to different instances of the Azena OS, the input shared memory 20 as a buffer, the further shared memory 14 as a buffer and optionally the shared memory 11 can be reference-counted. Instead of making four copies, a refcount is put onto the said two buffers, which saves additional memory copies. The guest virtual address of the Android™ HardwareBuffer needs to be mapped to the same virtual (physical) address of the platform OS that runs the different guest OS instances. Additionally, a refcount across the different guest OSs is beneficial for controlling the lifetime of the buffers. By saving those memory copies, scaling in cloud environments can be improved for certain use cases. The guest id, which is unique for each virtual machine, can be used as a tag to identify the metadata origin.
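The refcount idea can be sketched in a few lines. The following is a simplified, single-process model under stated assumptions and not the mechanism of any particular hypervisor or of the Android™ HardwareBuffer: `RefCountedBuffer` is a hypothetical class, guest ids are plain strings, and "freeing" the buffer is modelled by releasing a `memoryview`. It shows that several guests hold the same underlying buffer instead of one copy each, and that the buffer lives until the last guest releases it.

```python
class RefCountedBuffer:
    """Hypothetical buffer shared by several guest OS instances without copying."""

    def __init__(self, payload: bytes) -> None:
        self._view = memoryview(payload)  # one underlying buffer for all guests
        self._holders = set()             # guest ids currently holding the buffer
        self.freed = False

    @property
    def refcount(self) -> int:
        return len(self._holders)

    def acquire(self, guest_id: str) -> memoryview:
        """Hand the same view to the guest and count the reference."""
        if self.freed:
            raise RuntimeError("buffer already freed")
        self._holders.add(guest_id)
        return self._view

    def release(self, guest_id: str) -> None:
        """Drop the guest's reference; free the buffer with the last one."""
        if guest_id not in self._holders:
            raise KeyError(guest_id)
        self._holders.remove(guest_id)
        if not self._holders:
            self._view.release()
            self.freed = True


if __name__ == "__main__":
    footage = RefCountedBuffer(bytes(1024))
    guests = ["guest-1", "guest-2", "guest-3", "guest-4"]
    views = [footage.acquire(g) for g in guests]
    print(footage.refcount, all(v is views[0] for v in views))
    for g in guests:
        footage.release(g)
    print(footage.freed)
```

The guest id used as the key here corresponds to the tag mentioned above; in this sketch it only controls the lifetime of the buffer.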