The digital universe is expanding, fed by a multitude of sources including, but not limited to, video footage from CCTV cameras, pictures uploaded from mobile phones to social media sites, digital movies, and all other forms of images and video.
According to IDC, a majority of the information in the digital universe, 68% in 2012, is created and consumed by consumers: watching digital TV, interacting with social media, sending camera phone images and videos between devices and around the Internet, and so on. Yet enterprises have liability or responsibility for nearly 80% of the information in the digital universe. Only a tiny fraction of the digital universe has been explored for analytic value.
One of the main reasons for this is that video is unstructured data, which makes it difficult to extract information from and make sense of. In fact, today’s technology for creating searchable data from video is manual, slow, and ineffective: titles and manually created tags cannot capture the relevant content and context of the video.
iData Sciences has developed a revolutionary Image & Video Search framework that extracts unstructured data from images and video and converts it to structured data, making it meaningful and understandable. Using patented technology, the entire audio, image, text, and any meta-information in the video is extracted and distilled down to machine-readable words indexed by time, making it easy for search engines, NLP engines, databases, and analytical tools to discover.
This automated and distributed video distillation and meta-tagging engine processes all the kinds of information handled by most video search services, but goes a few steps further by applying proprietary processes, parallel computation, and cloud-based processing power.
The core technology splits a video into two components: the audio and the video frames. Both components are then processed using speech-to-text transcription, text extraction from images, facial recognition, and object recognition. The output is not only the transcript of the video and its image frames, but also metadata that is time-stamped with frame references.
For example, if a brand logo is found an hour and a half into a video, the metadata would include that time reference at 01:30:00, with the frame marking the logo with a bounding box.
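A detection record like the one described above can be sketched as follows. The field names and the 25 fps frame rate are illustrative assumptions, not the product's actual schema; the point is how a timestamp maps to a frame reference alongside a bounding box.

```python
# Hypothetical metadata record for a logo detection. Field names and the
# frame rate are assumptions for illustration only.

def timestamp_to_frame(hh_mm_ss: str, fps: float) -> int:
    """Convert an HH:MM:SS offset into a zero-based frame index."""
    h, m, s = (int(part) for part in hh_mm_ss.split(":"))
    return int((h * 3600 + m * 60 + s) * fps)

detection = {
    "label": "brand_logo",
    "timestamp": "01:30:00",                          # an hour and a half in
    "frame": timestamp_to_frame("01:30:00", fps=25.0),
    "bbox": [412, 96, 118, 64],                       # x, y, width, height (px)
}

print(detection["frame"])  # 135000 at 25 fps
```

Keeping both the human-readable timestamp and the derived frame index in the record lets search tools query by time while the renderer can jump straight to the marked frame.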
This technological approach makes possible video classification and clustering, search-engine indexing, and content personalization, including targeted advertisements.
Individual frames will be extracted from the video automatically. The frequency of frame extraction will be soft-coded, with the default value determined during development. Some of the major considerations include:
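Soft-coding the extraction frequency means the sampling interval is a configurable parameter rather than a fixed constant. A minimal sketch, assuming a hypothetical default of one frame per second (the source leaves the actual default to be determined during development):

```python
# Sketch of soft-coded frame sampling. The default interval below is an
# assumption for illustration; the real default is set during development.

DEFAULT_INTERVAL_SECONDS = 1.0  # hypothetical, configurable default

def frames_to_extract(duration_s: float, fps: float,
                      interval_s: float = DEFAULT_INTERVAL_SECONDS) -> list:
    """Return the frame indices sampled every `interval_s` seconds."""
    step = max(1, round(fps * interval_s))
    total = int(duration_s * fps)
    return list(range(0, total, step))

# A 10-second clip at 25 fps, sampled once per second -> 10 frames
print(len(frames_to_extract(10, 25)))  # 10
```

A wider interval lowers processing cost but risks missing short-lived objects; a narrower one does the opposite, which is presumably why the value is left tunable.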
Each frame will be analyzed by the image-processing subsystem for the following:
The expectation is that there will be a network of reference objects against which detected objects, faces, and text can be compared.
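Comparison against a reference library can be sketched as a nearest-neighbor match on feature vectors. The cosine-similarity metric, the toy vectors, and the threshold below are all assumptions for illustration; a real system would use learned embeddings and a tuned threshold.

```python
import math

# Minimal sketch of matching a detected item's feature vector against a
# library of reference objects. Vectors, labels, and the threshold are
# toy values, not the product's actual representation.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

reference_library = {
    "brand_logo": [0.9, 0.1, 0.2],
    "face_anchor": [0.1, 0.8, 0.3],
}

def best_match(detected, library, threshold=0.8):
    """Return the best-matching reference label, or None below threshold."""
    label, score = max(((k, cosine(detected, v)) for k, v in library.items()),
                       key=lambda kv: kv[1])
    return label if score >= threshold else None

print(best_match([0.88, 0.12, 0.18], reference_library))  # brand_logo
```

Returning None for weak matches keeps unrecognized objects out of the index rather than forcing a wrong label.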