SMW 17: Microsoft's Media & Machine Learning

Machine learning targeted to media has evolved greatly in the last 12-18 months, and Microsoft Cognitive Services tools now provide full-text audio transcription, face detection, video stabilization, video optical character recognition, face redaction, motion detection, facial emotion detection, video summarization, content moderation, and object detection for VOD content. At Streaming Media West on Thursday, Andy Beach, Microsoft principal software development engineer for communications and media, shared what the company is doing to give content owners, developers, and data scientists access to AI tools that make it easy to index and search hours of video content.

"The tools will extract metadata from video content and curate the information found within the metadata. The intelligence is output to an embedded player, where a set of widgets provide interactive functionality for viewers," Beach said. "We created a series of APIs that tie into machine learning and productized the offering to make it easy for anyone to get started."

These "products" will help improve content discoverability, increase user engagement, and hopefully increase content value. Online dating company Match.com tried out the AI tools for content moderation, identifying videos or images that were too racy to publish. Nexx.Tv used the AI tools to build a better advertising use case, analyzing the full-text metadata of its content to deliver targeted ad overlays: if the content was about cars, for example, it could match related advertisements to the video on the fly to deliver more personalized overlays.
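The ad-matching idea can be illustrated with a minimal sketch. It assumes a plain-text transcript is already available and scores it against hand-written keyword sets; the category names, the keyword lists, and the `pick_ad_category` helper are all hypothetical and are not part of any Microsoft or Nexx.Tv API.

```python
import re
from collections import Counter

# Hypothetical ad categories and trigger words, invented for illustration.
AD_KEYWORDS = {
    "automotive": {"car", "cars", "engine", "sedan"},
    "travel": {"flight", "hotel", "resort"},
}


def pick_ad_category(transcript, ad_keywords=AD_KEYWORDS):
    """Return the category whose keywords occur most often, or None."""
    if not ad_keywords:
        return None
    # Lowercase the transcript and count each word once per occurrence.
    words = Counter(re.findall(r"[a-z']+", transcript.lower()))
    # Score a category by the total number of keyword hits.
    scores = {
        category: sum(words[word] for word in keywords)
        for category, keywords in ad_keywords.items()
    }
    best = max(scores, key=scores.get)
    # No keyword hit at all means no ad category is a sensible match.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(pick_ad_category("This car has a new engine; cars like it are rare."))
```

A production system would presumably rely on the richer metadata the indexing service extracts rather than raw word counts; the sketch only shows the matching step.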

Training the AI

The initial step is training the AI. "The first version is OK, but it's not great because you have to train it," Beach said. For example, to identify all the people within a piece of content, the AI needs to learn who each person is. Once this has been done, however, it becomes possible to use what Beach calls a people heat map. All instances of a specific person can be identified in a video clip, and these are then represented graphically within the video scroll bar. In the chart below, Julia White appears in 4% of the video, and a viewer can go directly to each clip she appears in. The most common keywords used within her clips are also shown onscreen, and these too are clickable.
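The arithmetic behind a people heat map like the one described above can be sketched as follows. This is a minimal illustration, assuming the indexer returns a list of start and end times (in seconds) for each recognized person; the `Appearance` type, the function names, and the sample timings are hypothetical and do not reflect the actual output format of the Microsoft service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    start: float  # seconds from the start of the video
    end: float


def merge_appearances(appearances):
    """Merge overlapping or touching intervals so time is not counted twice."""
    merged = []
    for a in sorted(appearances, key=lambda a: a.start):
        if merged and a.start <= merged[-1].end:
            last = merged[-1]
            merged[-1] = Appearance(last.start, max(last.end, a.end))
        else:
            merged.append(a)
    return merged


def screen_time_percent(appearances, video_duration):
    """Percentage of the video in which the person is on screen."""
    if video_duration <= 0:
        raise ValueError("video_duration must be positive")
    on_screen = sum(a.end - a.start for a in merge_appearances(appearances))
    return 100.0 * on_screen / video_duration


def heat_map(appearances, video_duration, buckets=10):
    """One boolean per scroll-bar segment: True if the person appears in it."""
    width = video_duration / buckets
    merged = merge_appearances(appearances)
    return [
        any(a.start < (i + 1) * width and a.end > i * width for a in merged)
        for i in range(buckets)
    ]


if __name__ == "__main__":
    # Invented timings for a ten-minute (600 s) video.
    clips = [Appearance(30, 42), Appearance(40, 48), Appearance(300, 306)]
    print(screen_time_percent(clips, 600))
    print(heat_map(clips, 600))
```

The merged intervals double as the jump points a player could expose, since each one marks the start of a clip in which the person appears.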

Publishers: Plug and Play

Microsoft offers three flavors of AI product. The easiest to use is for content publishers. "Upload content and we will index it, create all the metadata, create a full-text searchable transcript, and provide widgets so you can provide an interactive viewing experience that's custom to your content," Beach said. Analysis of video content can be completed in close to real time: ten minutes of content should take about ten minutes to process.

Developer: Customization

Microsoft has an a la carte option that gives developers access to some or all of the video AI APIs. These APIs focus on computer vision, content moderation, emotion recognition, face recognition, full-text video indexing, Bing Speech, and speaker recognition.

Data scientist: Infrastructure

For those who want to roll their own, the machine learning platform is available for data scientists to train their own neural networks. "You can use our platform as an infrastructure to do the compute or processing," Beach said.

Whatever their preference, users can get 40 hours of free access to try out the tools at http://vi.microsoft.com/
