Without it, VR is destined to be a “farce”

Lei Feng Network note: This article was contributed by Deeperblue and published on Lei Feng Network.

A Baidu search for "VR 寒冬" ("VR winter") returns 1,660,000 results.

“Capital Changes Face: In 6 Months, VR Goes from Carnival to Gloom” (GeekPark, September 7, 2016), “The VR industry that attracted countless capital and entrepreneurs seems to have entered winter” (Chinese Entrepreneurs, August 26, 2016), “Is winter coming for VR startups? Hear what the big names in capital have to say!” (Netease News, May 27, 2016), “Careful not to get trapped! VR is about to enter winter” (Sohu News, May 11, 2016)...

People still remember the VR fever that ran from the second half of 2015 to the beginning of 2016: from entrepreneurs to investors, everyone had unlimited enthusiasm and hope for VR. Quite a few people with a computing background became VR players: they spotted this "hot land", left their big companies, and plunged into the VR industry.

But the industry lacks high-quality standards, and its key technology nodes are far from a breakthrough. VR is like a baby still learning to talk and walk: unsteady and far from mature. Rather than saying VR has entered its winter, it is more accurate to say that the earlier bubble is being squeezed back to a reasonable size.

One of the "key technology nodes" here is spatial audio technology. Today, people prefer to call it 3D audio.

Apple Music and VR studio Vrse teamed up to create a VR music video, "Song for Someone", for the band U2. Pictured: a clip from the video in which U2 performs in an empty Toronto stadium.

As Adam Somers, chief audio engineer at Jaunt, the famous VR production company, said: “(in the VR case), hearing accounts for 50%, and vision accounts for the remaining 50%.”

Hearing tells people where they are in space, how far away objects are, and so on. Vision offers a clue, and hearing verifies whether that clue is real. Without auditory immersion, the authenticity of everything on screen disappears. Until the hearing problem is solved, virtual reality cannot truly feel like reality. It is fair to say that 3D audio determines whether the VR era we keep talking about will really arrive.

First, a primer on 3D audio

What is 3D audio? Put simply, 3D audio is technology that simulates sound as realistically as possible, so that the listener experiences a sound field close to that of the actual scene. It also goes by names such as panoramic sound and Immersive Audio. The goal is sound that is "real" rather than merely "realistic".

Ramani Duraiswami, one of the founders of VisiSonics, one of the world's most mature spatial audio production companies, puts it this way: “When the sound you hear is extremely real, the headset disappears.”

The human auditory system has its own way of analyzing sound signals in order to locate a sound. The path a signal takes from any point in space to the human ear (just in front of the tympanic membrane) can be described as a filtering system. Applying that filter (a transfer function) to the source yields the signal that arrives in front of the tympanic membrane of each ear.

HRTF icon


We do not need to worry about exactly how the sound reaches both ears. We only need to know that the signal arriving at our ears differs from the source, and that the left and right ears do not hear the same thing. This is probably a product of evolution: vision cannot locate anything in the dark, but the ears can locate a threat, and let us defend against it, from the difference between the sounds at the left and right ears.

This filter (transfer function) is called the HRTF (head-related transfer function). If we have a bank of filters covering every direction in space for both ears, we obtain a filter matrix that can reproduce a sound signal arriving from any direction.
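As a rough illustration of "source + filter", the sketch below applies a separate filter to one mono source for each ear. In the time domain an HRTF corresponds to an impulse response (often called an HRIR), and filtering is a convolution. The impulse responses here are made-up toy values, not measured data, and the helper names `convolve` and `render_binaural` are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution: the time-domain way to apply a filter."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(source, hrir_left, hrir_right):
    """Filter one mono source with a left-ear and a right-ear impulse response."""
    return convolve(source, hrir_left), convolve(source, hrir_right)


# Toy impulse responses for a source on the listener's left (assumed values):
# the left ear receives the sound at once and at full level, the right ear
# receives it two samples later and quieter.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.5]

click = [1.0, 0.0, 0.0, 0.0]  # a single-sample click as the source
left, right = render_binaural(click, HRIR_LEFT, HRIR_RIGHT)
print("left ear: ", left)
print("right ear:", right)
```

A real renderer would load measured impulse responses for many directions and would normally use FFT-based convolution for speed; the direct loop is kept here only because it is easy to read.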

The HRTF is very personal. Each of us has a different head size and ear spacing, and the contours and inner folds of our ears differ as well. On top of that, everyone develops their own listening habits and sense of hearing while growing up. As a result, the same sound source is heard subtly differently by each person.
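One way to see why head size matters is the difference in arrival time between the two ears. The sketch below uses Woodworth's spherical-head approximation, a standard textbook model that the article itself does not mention; the head radii are assumed example values and `woodworth_itd` is a hypothetical helper name.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C


def woodworth_itd(head_radius_m, azimuth_rad):
    """Interaural time difference for a rigid spherical head (Woodworth model).

    Intended for azimuths between 0 (straight ahead) and pi/2 (directly to one side).
    """
    return head_radius_m / SPEED_OF_SOUND * (azimuth_rad + math.sin(azimuth_rad))


side = math.pi / 2  # source directly to one side
for radius in (0.080, 0.0875, 0.095):  # assumed small, typical, and large head radii
    itd_us = woodworth_itd(radius, side) * 1e6
    print(f"radius {radius * 100:.2f} cm -> ITD {itd_us:.0f} microseconds")
```

Even in this simplified model, a few millimetres of head radius shift the time difference by tens of microseconds, which is one reason a filter set measured on one head does not fit every listener equally well.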

How to restore the true sound of human ears?

Scientists have been exploring this for a long time. Nearly a century ago, in 1933, AT&T Bell Labs brought this technology to the Chicago World's Fair. The company's acoustics research department built a mechanical dummy head, which they called "Oscar". Oscar had a microphone in each ear and sat in the exhibition room recording the surrounding sounds. Whatever Oscar "heard" was what got recorded.

The solution provided by AT&T Bell Labs is called Binaural Audio.

Binaural recording mimics the shape of a real human head and the spacing between the left and right ears, so it can record sound that is almost true to human hearing. It is an effective brute-force method: the HRTF is reproduced at the physical level. Following this path, the German microphone company Neumann made successive advances in binaural recording between 1973 and 1992, such as better recording equipment and placing the microphones at the position of the dummy's tympanic membranes.

Oscar, the binaural recording dummy at AT&T Bell Labs, 1933

Binaural recording developed slowly over the following decades because there was no strong industrial demand. Only with the current wave of VR has it taken the stage. Thanks to the popularity of VR brought by Oculus Rift, Sony Morpheus, and Samsung Gear, 3D audio technology is enjoying its "Renaissance", which is why it is also called VR audio.

Where is the difficulty of 3D audio technology?

VR needs 3D audio to produce more realistic immersion. In an interview with The Verge, Jaunt's chief audio engineer Adam Somers put it this way: “In terms of immersion, hearing accounts for 50%, and vision accounts for the remaining 50%.”

Jaunt is a well-known VR production company in the United States. Last year it received a US$65 million investment from the Shanghai Chinese Cultural Industry Fund (CMC) and Disney.

However, binaural recording by itself cannot handle the listener turning their head. When someone playing a VR game hears a sound from behind, the instinctive reaction is to look back. If the sound does not shift to match the new head orientation at that moment, immersion drops sharply.

Surround sound, another technique for reproducing a real sound field, also cannot handle head turns. It uses multiple physical speakers to create a 360-degree sound field, with sound from different directions played through different speakers. The best-known companies for this technology are DTS and Dolby.

A surround sound theater, for example, places many speakers around the audience. If there is an explosion on the left side of the screen, the left speakers sound rather than the right ones. Because the speakers are in fixed positions, the listener hears the most realistic simulated sound field only at a fixed point.

A truly immersive experience comes from fully reproducing high, mid, and low frequencies in space: recording the sound arriving from every angle on a sphere centered on the human head, and then reproducing it.

How to solve the immersive experience problem?

Computation becomes the top priority.

The sound is first captured with binaural recording, which reproduces the HRTF, and then the computation is performed: the HRTFs for all directions are combined to synthesize spatial audio that changes naturally as the listener rotates and moves within the sound field.
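A minimal sketch of the head-tracking part of that computation: work out where the source lies relative to the rotated head, then pick the closest filter from a bank of measured directions. The 30-degree bank, the sign convention (angles in degrees, counterclockwise positive), and all function names are assumptions for illustration.

```python
def wrap_degrees(angle):
    """Wrap an angle to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def relative_azimuth(source_azimuth, head_yaw):
    """Direction of the source as seen from the rotated head, in degrees."""
    return wrap_degrees(source_azimuth - head_yaw)


def nearest_hrtf_direction(rel_azimuth, measured_azimuths):
    """Pick the measured direction closest to the requested one."""
    return min(measured_azimuths, key=lambda az: abs(wrap_degrees(az - rel_azimuth)))


# Hypothetical filter bank measured every 30 degrees around the listener.
MEASURED = list(range(-180, 180, 30))

# A sound directly behind the listener (180 degrees) while the head turns toward it.
for yaw in (0.0, 90.0, 170.0):
    rel = relative_azimuth(180.0, yaw)
    print(f"head yaw {yaw:5.1f} -> source at {rel:6.1f}, "
          f"use filter measured at {nearest_hrtf_direction(rel, MEASURED)}")
```

Production engines typically interpolate between neighbouring filters instead of snapping to the nearest one, and they track elevation and position as well as yaw.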

Of the three stages, capture (recording), computation (processing and rendering), and playback, the strongest technical barrier lies in computation.

The core algorithm is what tests the capabilities of the various spatial audio companies. The corporate slogan of Two Big Ears says it all: "We do mathematics so you can focus on being awesome." This Edinburgh-based company is currently a leader in spatial audio technology.

Some teams use brute-force approaches to reduce the amount of computation. 3dio, for example, built recording equipment that can capture HRTFs from all directions at the same time.

A recording device with four pairs of artificial ears, made by 3dio


The Verge showcases their recording equipment in the short film "New York in 3D Audio". The device simulates how sound travels through the unique physical structure of the human ear and reproduces the real sounds of the streets of New York.

In general, three indicators can be used to determine the technical level of a company's 3D audio core algorithm:

1. Localization: the positioning of sound. Surround sound simulates the full 360 degrees of the horizontal plane well, but has difficulty with the vertical dimension, that is, sounds above and below the listener. Covering that vertical 360 degrees is where the difficulty of VR audio lies, and being able to do so marks the more advanced technology.


2. Propagation: in an enclosed space, sound does not reach the listener only once; it bounces back countless times, much like an echo. Propagation describes whether the audio makes users feel that they really are in a physical space, and the stronger that sense of reality, the better.


3. Occlusion: if an obstacle sits in the path of the sound, it affects how the sound is transmitted. A VR audio technology that simulates the effect of obstacles on sound well is a good one.
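To make the second and third indicators more concrete, here is a deliberately crude sketch. Propagation is reduced to a single delayed, quieter echo, and occlusion to a one-pole low-pass filter plus attenuation, on the common assumption that obstacles muffle high frequencies more than low ones. The function names and parameter values are hypothetical and are not taken from any engine named in this article.

```python
def add_reflection(signal, delay_samples, gain):
    """Propagation sketch: mix in one delayed, quieter copy (a single echo)."""
    out = list(signal) + [0.0] * delay_samples
    for i, s in enumerate(signal):
        out[i + delay_samples] += gain * s
    return out


def occlude(signal, smoothing, gain):
    """Occlusion sketch: a one-pole low-pass filter plus overall attenuation.

    smoothing lies in (0, 1]; smaller values dull the sound more.
    """
    out, state = [], 0.0
    for s in signal:
        state += smoothing * (s - state)  # low-pass: move part of the way to the input
        out.append(gain * state)
    return out


click = [1.0, 0.0, 0.0, 0.0]
print("with one echo:", add_reflection(click, delay_samples=2, gain=0.5))
print("behind a wall:", occlude(click, smoothing=0.5, gain=0.8))
```

A real engine models many reflections derived from the room geometry and varies the occlusion filter with the obstacle, which is a large part of why the computation is the hard part.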

Beyond these three points, the most advanced solution for spatial audio today is Ambisonic technology. Whether a company can work with Ambisonic is therefore another indicator of its technical level.

Ambisonic is also a sound field simulation method, and it tests a team's combined abilities in physics, mathematics, and computing. Imagine the listening position in space as a balloon filled with air: sound waves arriving from everywhere in the space exert a force on the surface of the balloon.

Ambisonic applies this simple principle: a set of speakers is placed in the space to simulate the force that the sound waves would exert at each point in the real situation, and the HRTF is then calculated and reproduced.

The audio data obtained through Ambisonic is the most comprehensive, and it can be converted down to any other audio format. By analogy with images, if Ambisonic is the jpg, then Dolby 7.0, Dolby 5.1, and other audio formats are the pixels.
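The sketch below shows, under simplifying assumptions, how a first-order Ambisonic signal can be encoded, rotated to compensate for a head turn, and converted down to plain stereo. It uses the traditional B-format convention in which the W channel is scaled by 1/sqrt(2); other normalizations exist, and the function names are hypothetical.

```python
import math


def encode_b_format(sample, azimuth, elevation=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Angles are in radians; azimuth is counterclockwise from straight ahead.
    """
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z


def compensate_head_yaw(b_format, head_yaw):
    """Rotate the sound field by -head_yaw so sources stay fixed in the world."""
    w, x, y, z = b_format
    c, s = math.cos(head_yaw), math.sin(head_yaw)
    return w, x * c + y * s, -x * s + y * c, z


def decode_stereo(b_format):
    """Convert down to two virtual cardioid microphones aimed hard left and hard right."""
    w, x, y, z = b_format
    left = 0.5 * (math.sqrt(2.0) * w + y)
    right = 0.5 * (math.sqrt(2.0) * w - y)
    return left, right


# A source directly to the listener's left.
field = encode_b_format(1.0, azimuth=math.pi / 2)
print("facing forward:", decode_stereo(field))

# After the listener turns 90 degrees to the left, the source is straight ahead,
# so both stereo channels end up at roughly the same level.
turned = compensate_head_yaw(field, head_yaw=math.pi / 2)
print("facing source: ", decode_stereo(turned))
```

The rotation touches only a few numbers per sample regardless of how many sources were mixed into the field, which is one practical reason the format suits head-tracked playback.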

How to evaluate the industry status of 3D audio technology?

3D audio technology is bound to change all application scenarios. The entire spatial audio industry has two opportunities:

The first opportunity is to build an audio engine. The core of an audio engine is the HRTF, propagation, and related techniques, that is, how to reproduce stereo sound in a game as realistically as possible. The simple approach today is to place spatial audio in VR games by positioning virtual speakers with the Oculus Audio SDK (which implements HRTF and other effects, including reflections).

The less simple approaches require a team with strong computational skills. Two Big Ears, founded in Edinburgh in 2013, is one of the best. Their plug-in may be one of the most technically accomplished in the industry today.

The second opportunity is to apply 3D audio to real-life footage, that is, to combine panoramic video recording with VR audio captured from multiple angles. That way, when the user turns around while watching a panoramic video, the sound adjusts dynamically. In terms of application scenarios, 3D audio technology is particularly important for virtual concerts.

Looking at the global market, no company or team has yet emerged with a perfect 3D audio solution, owing to the difficulty of the technology itself. The technology of the big-name audio company DTS comes relatively close, but the final version is still not public and is likely to be expensive.

The seven most-watched 3D audio technology teams worldwide are:

VisiSonics:

VisiSonics grew out of a team at the University of Maryland laboratories and reached a cooperation agreement with Oculus in October 2015. Oculus purchased their technology, called RealSpace 3D, for the Oculus Audio SDK. They also designed a device for recording 3D audio and video simultaneously, consisting of a column topped with a sphere that carries 64 microphones. At the hardware level, their technology is top-notch and world-leading.

Two Big Ears:

The team, based in Edinburgh, Scotland, was recently acquired by Facebook. Their core technology is a 3D audio engine called 3Dception. At the plug-in level, this is currently the best technical team in the world. Before the Facebook acquisition, the HTC Vive's audio was rumored to be computed and rendered with this team's technology.

3dio sound:

This is currently the best company in the field of spatial audio capture and recording. They produced a recording system with eight ears.

Thrive Audio:

This company was acquired by Google together with Tilt Brush and is part of Google's VR strategy. The team comes from Trinity College, Dublin, Ireland. They claim to have applied for two patents.

Mint Muse:

A team previously based in San Diego that recently relocated to Shanghai. The founding team comes from Qualcomm. They focus on rendering quality, algorithm optimization, and latency, and they design and build tools for spatial audio post-production, encoding, and software plug-ins. Hardware is not currently part of their work. Their target users are sound engineers. They are developing professional VR panoramic sound mixing software that lets the mixer monitor the finished result in real time while editing, simplifying the entire workflow.

Waves:

A spatial audio technology company based in Israel. It previously made professional studio tools such as plug-ins and sound effects, and it is a partner of many famous studios such as Abbey Road.

Impulsonic:

A team incubated in the Department of Computer Science at the University of North Carolina, USA, and established in 2012. Its main product, a 3D audio development tool for VR games and applications, is called Phonon. The founder, Dr. Anish Chandak, said his idea at the founding was very simple: “Helping game designers and developers make it easier to make high-quality spatial audio.” Most of the company's income comes from Phonon licenses. Last year they accepted an investment from the National Science Foundation, and their research has also been applied to some government projects.

Most studios making VR games have not begun to use 3D audio technology on a large scale, mainly because they are "poor". With the "winter" arriving, that will only get worse. In practice, these game content studios still use surround sound technology; only a few geek-style studios have begun to apply some spatial audio technology. 3D audio remains very much at the cutting edge.

However, 3D audio is a foundational technology for the entire VR industry.

With the core technology in hand, it becomes possible to wait for the next breakthrough at the application level. One example is DJI, the drone maker: long before drones became a viable business, DJI's core team had already put years of research and effort into flight control technology.

If VR is to be worth a part of every user's day, the technology must first mature to the point of delivering a truly "immersive experience."

The wait for a revival starts with 3D audio.
