Delivering Multimedia Data Correctly at the User Interface

Document format: PPT | 13 pages | Size: 398.51 KB | Points: 16 | Published 2024-12-12 | Document ID: 253388171


Presentation requirements

For delivering multimedia data correctly at the user interface, synchronization is essential. It is not possible to provide an objective measurement for synchronization from the viewpoint of subjective human perception. As human perception varies from person to person, only heuristic criteria can determine whether a stream presentation is correct or not.

Lip synchronization requirements

Lip synchronization refers to the temporal relationship between an audio and a video stream for the particular case of humans speaking.

The time difference between related audio and video logical data units (LDUs) is known as the skew. Streams that are perfectly in sync have a skew of zero.

Figure 15.17: Left: head view; middle: shoulder view; right: body view.

Figure 15.18: Detection of synchronization errors for the three different views.

Figure 15.18 provides an overview of the experimental results. The vertical axis denotes the relative number of test candidates who detected a synchronization error, regardless of whether they could tell if the audio was ahead of or behind the video. The experimenters initially assumed that the three curves for the different views would differ considerably, but this turned out not to be the case (see Figure 15.18).

Pointer synchronization requirements

In a Computer-Supported Cooperative Work (CSCW) environment, cameras and microphones are usually attached to the users' workstations.

In the next experiment, the experimenters looked at a business report that contained some data with accompanying graphics. All participants had a window with these graphics on their desktop, and a shared pointer was used in the discussion. Using this pointer, speakers pointed out individual elements of the graphics that were relevant to the discussion taking place. This required synchronization of the audio and the remote telepointer.

The experimenters conducted two experiments. The first was to explain some technical parts of a sailing boat while a pointer located the area under discussion (Figure 15.21, right side). The shorter the explanation, the more crucial the synchronization.

The experimenters therefore selected a fast-speaking person who used fairly short words. In the second experiment, a traveling route was explained on a map (Figure 15.21, left side), which involved continuous movement of the pointer.

From the human perception point of view, pointer synchronization is very different from lip synchronization: it is much more difficult to detect the "out of sync" error at skew values near the error-free case. While a lip synchronization error is a matter of discussion for skews between 40 ms and 160 ms, for a pointer ...

Elementary media synchronization

Lip synchronization and pointer synchronization were investigated because the results available from other sources were inconsistent. The following summarizes other synchronization results to give a complete picture of synchronization requirements.

Since the beginning of digital audio, the jitter to be tolerated by dedicated hardware has been studied. Dannenberg provided some references and explanations of these studies. In [Ble78], the maximum allowable jitter for 16-bit quality audio in a sample period is 200 ps, which is the error equivalent to the magnitude of the LSB (least-significant bit) of a full-level, maximum-frequency 20 kHz signal. In [Sto72], some perception experiments recommended an allowable jitter in an audio sample period of between 5 and 10 ns. Further perception experiments were carried out by [Lic51] and [Woo51]: the maximum spacing of short clicks that still fuse into one continuous tone was given as 2 ms (as cited by [RM80]).
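The 200 ps figure for 16-bit audio can be sanity-checked with a short calculation. The sketch below is not taken from the cited studies. It assumes a full-level sine wave and a maximum frequency of 20 kHz (an assumed value for the upper limit of the audible band), and asks how large a timing error can be before the signal changes by one LSB at its steepest point. The function name `max_jitter_for_one_lsb` is hypothetical.

```python
import math


def max_jitter_for_one_lsb(bits: int, max_freq_hz: float) -> float:
    """Timing error in seconds that moves a full-level sine by one LSB
    at its steepest point (the zero crossing)."""
    # Full-level sine x(t) = A * sin(2*pi*f*t); the peak-to-peak range 2A
    # is divided into 2**bits quantization steps.
    amplitude = 1.0
    lsb = 2.0 * amplitude / (2 ** bits)
    # The slope of the sine is largest at the zero crossing: 2*pi*f*A.
    max_slope = 2.0 * math.pi * max_freq_hz * amplitude
    return lsb / max_slope


# Assumed parameters: 16-bit samples, 20 kHz maximum frequency.
jitter_s = max_jitter_for_one_lsb(bits=16, max_freq_hz=20_000.0)
print(f"{jitter_s * 1e12:.0f} ps")  # about 243 ps
```

Under these assumptions the result is roughly 240 ps, the same order of magnitude as the 200 ps quoted above.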

The combination of audio and animation is usually not as stringent as lip synchronization. A multimedia course on dancing, for example, could show the dancing steps as animated sequences with accompanying music. By making use of the interactive capabilities, individual sequences can be viewed over and over again. In this particular example, the synchronization between music and animation is particularly important. Experience showed that a skew of ±80 ms fulfills the user demands despite some possible jitter. The most challenging issue, however, is the correlation between a noisy event and its visual presentation (for example, two cars colliding); here the same constraint as for lip synchronization is assumed, namely 80 ms.

Two audio tracks can be tightly or loosely coupled; the effect of combining them depends strongly on their content.
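The skew figures quoted in this section can be collected into a simple check. The following is a minimal sketch, not a method from the source. The sign convention (positive skew means the audio unit is presented after the related video unit), the function names, and the three-way labels for lip synchronization are all assumptions made for illustration. In particular, the text only says that skews between 40 ms and 160 ms are "a matter of discussion"; treating smaller skews as unnoticed and larger ones as noticed is an interpretation.

```python
# Tolerances quoted in the text, in milliseconds.
AUDIO_ANIMATION_MAX_SKEW_MS = 80.0      # "+/-80 ms fulfills the user demands"
LIP_SYNC_DEBATABLE_MS = (40.0, 160.0)   # range described as "a matter of discussion"


def skew_ms(audio_ts_ms: float, video_ts_ms: float) -> float:
    """Signed skew between related audio and video LDUs.

    Assumed convention: positive when the audio LDU is presented later
    than the related video LDU.
    """
    return audio_ts_ms - video_ts_ms


def audio_animation_in_sync(skew: float) -> bool:
    """True if the skew is within the +/-80 ms tolerance for audio and animation."""
    return abs(skew) <= AUDIO_ANIMATION_MAX_SKEW_MS


def classify_lip_sync(skew: float) -> str:
    """Label a lip synchronization skew (labels are illustrative assumptions)."""
    low, high = LIP_SYNC_DEBATABLE_MS
    magnitude = abs(skew)
    if magnitude < low:
        return "unlikely to be noticed"
    if magnitude <= high:
        return "debatable"
    return "likely to be noticed"


# Example: the audio unit is presented 100 ms after the related video frame.
s = skew_ms(audio_ts_ms=1100.0, video_ts_ms=1000.0)
print(s, audio_animation_in_sync(s), classify_lip_sync(s))
```

For the example values, the 100 ms skew falls outside the ±80 ms tolerance for audio and animation and inside the debatable range for lip synchronization.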
