Tokyo, Japan, Dec 27, 2005 - Nippon Telegraph and Telephone Corporation (NTT) started research on lossless coding technology for audio signals, and on its international standardization, in 2002, aiming at high-quality services suited to broadband networks. Yesterday, according to the final national body ballot disclosed by ISO/IEC, the specification for this technology was officially approved as an MPEG (1) standard (MPEG-4 ALS).
This standard includes a number of elementary technologies proposed by NTT, one of which is based on the PARCOR coefficients (2) invented by NTT more than 30 years ago. It also includes technologies developed under the collaborative research contract between NTT and the University of Tokyo (Prof. Shigeki Sagayama, Graduate School of Information Science and Technology).
1. Background
In parallel with the evolution of broadband networks and digital audio equipment, information rates for delivery and storage have risen rapidly owing to the demand for high-quality audio signals (high sampling rates, high word resolution, and multiple channels). NTT Communication Science Labs recognized the importance of lossless compression technology for audio signals and of its standardization, considering interoperability, long-term maintenance, and clear IPR status. The Laboratories took the initiative in promoting this technology as a standard in the ISO/IEC (3) MPEG group.
2. Progress of international standardization
For this standardization work, NTT initiated discussions on the need and the requirements, and prepared the technical call for technologies. In line with the normal standardization process, a number of improvements and integration steps were carried out on top of the initial reference model. Partners in this standardization work included the Technical University of Berlin (Germany), RealNetworks Corp. (USA), and I2R (Singapore).
After the specification had been tentatively defined, it was voted on twice by 23 national bodies. The last ballot closed last week, and it has been disclosed that the standard was approved. This means that the specification for lossless coding has now been officially established as [14496-3 3rd ED AMD 2 (ALS: Audio Lossless) (4)].
It is expected that this standard will be used in common tools for various applications, and that it will continue to be maintained so that compressed files can be perfectly decoded even after 100 years. The MPEG group will continue working on the reference software and conformance testing. It is also expected that a consortium of essential patent holders will be organized for the collection and distribution of patent royalties.
3. Technical merits
- Assured perfect reconstruction after compression
- State-of-the-art compression performance
- Significant reduction of transmission and storage cost with little decoding time
Standard audio coding schemes such as MP3 and AAC (5), or the one used for MiniDisc, are already in widespread use. These are all perceptual coding schemes that offer a high compression ratio at the cost of minor waveform distortion at the decoder. They carefully control the quantization distortion based on the characteristics of human hearing. The decoded waveform differs from the original, although it is perceptually very close to it.
In contrast to perceptual coding, lossless coding assures perfect reconstruction of the waveform without a single bit of difference. This is very important for applications such as waveform editing and the archiving of high-quality audio signals. The price of perfect reconstruction is a limited compression ratio: the compressed file size ranges from 15 to 70% of the original, depending on the statistical properties of the original waveform.
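The perfect-reconstruction property can be illustrated with a minimal sketch in Python. It is not the MPEG-4 ALS algorithm: the fixed second-order predictor and the function names `predict`, `encode_residual`, and `decode_residual` are assumptions chosen for illustration. What it shows is that when the encoder and decoder compute the same integer prediction from already known samples, storing only the prediction error is enough for the decoder to rebuild every sample exactly.

```python
def predict(prev1, prev2):
    """Fixed second-order integer predictor (an illustrative assumption)."""
    return 2 * prev1 - prev2


def encode_residual(samples):
    """Replace each integer sample by its prediction error."""
    residual = []
    prev1 = prev2 = 0  # history starts at zero on both sides
    for x in samples:
        residual.append(x - predict(prev1, prev2))
        prev1, prev2 = x, prev1
    return residual


def decode_residual(residual):
    """Rebuild the samples by adding the same prediction back."""
    samples = []
    prev1 = prev2 = 0
    for e in residual:
        x = e + predict(prev1, prev2)
        samples.append(x)
        prev1, prev2 = x, prev1
    return samples


pcm = [0, 100, 198, 292, 380, 460]   # hypothetical integer PCM samples
residual = encode_residual(pcm)
print(residual)                      # small values after the start-up samples
assert decode_residual(residual) == pcm  # bit-exact reconstruction
```

For a smooth waveform the residual values are much smaller than the samples themselves, so they need fewer bits; the actual saving depends on the signal statistics, as the text notes.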
The compression performance of MPEG-4 ALS nevertheless exceeds that of ZIP (6). Figure 1 compares the compression performance of MPEG-4 ALS with that of other available compression tools for audio signals. The vertical axis denotes the compression ratio (the compressed file size divided by the original size; smaller means lower cost), and the horizontal axis shows the decoding time (shorter is more convenient). The standardized specification offers wide flexibility in selecting the operation mode at the encoder. One can select a very fast mode with lower compression performance, or a very high compression mode at the cost of slower encoding and decoding. A proprietary decoder can improve the speed. The comparison shows that the standardized specification provides state-of-the-art technology.
MPEG-4 ALS accepts a variety of input formats:
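The compression ratio defined above (compressed size divided by original size) can be measured as in the following sketch. It uses Python's `zlib` module, which implements the DEFLATE method commonly used in ZIP archives, as a stand-in for a general-purpose compressor; it is not the tool behind Figure 1, and the 440 Hz test tone and the helper name `compression_ratio` are assumptions for illustration.

```python
import math
import struct
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(compressed) / len(original)


# Hypothetical test signal: one second of a 440 Hz tone, 16-bit mono PCM.
sample_rate = 44100
samples = [round(10000 * math.sin(2 * math.pi * 440 * n / sample_rate))
           for n in range(sample_rate)]
pcm = struct.pack("<%dh" % len(samples), *samples)  # little-endian int16

packed = zlib.compress(pcm, 9)
assert zlib.decompress(packed) == pcm  # lossless: bit-exact round trip

print("original bytes:  ", len(pcm))
print("compressed bytes:", len(packed))
print("ratio: %.3f" % compression_ratio(pcm, packed))
```

A general-purpose compressor looks for repeated byte patterns and knows nothing about waveforms, which is one reason a coder designed for audio signals can reach a smaller ratio on the same input.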
- Sampling rates of up to 192 kHz (44.1 kHz for CD)
- Various integer PCM formats of up to 32 bits per sample (16 bits for CD)
- 32-bit floating-point data in the IEEE 754 format (integer for CD)
- Up to 65536 channels (2 channels for CD)
With this range of input formats, it can be used for almost all applications. Decoding is generally very fast, at least 10 times faster than real-time playback of the music. File compression obviously reduces the size of archive files. It is also useful for downloads, since the download time can be significantly reduced and the decoding time is much shorter than the playback or download time.
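The effect on transmission cost can be estimated with simple arithmetic. The sketch below computes the uncompressed PCM bit rate as sampling rate times bits per sample times channels, then applies the 15 to 70% size range quoted earlier. The 5-minute track length, the 8 Mbit/s link speed, the 192 kHz / 24-bit stereo example, and the function names are assumptions for illustration only.

```python
def raw_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def transfer_seconds(duration_s, bit_rate, size_ratio, link_bits_per_s):
    """Time to move a recording whose size is size_ratio of the original."""
    return duration_s * bit_rate * size_ratio / link_bits_per_s


cd = raw_bit_rate(44_100, 16, 2)       # CD audio
hires = raw_bit_rate(192_000, 24, 2)   # hypothetical high-resolution stereo

print("CD bit rate:", cd, "bit/s")
print("192 kHz / 24-bit stereo:", hires, "bit/s")

# Hypothetical 5-minute CD track over a hypothetical 8 Mbit/s link.
for ratio in (1.0, 0.70, 0.15):
    t = transfer_seconds(300, cd, ratio, 8_000_000)
    print("size ratio %.2f -> %.1f s" % (ratio, t))
```

Under these assumptions the uncompressed track takes about 53 seconds to transfer, and the compressed one between roughly 8 and 37 seconds depending on where the signal falls in the quoted range.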
The specification features a number of technologies for reducing the rate. In particular, NTT contributed to the development of the following elementary tools:
- Time domain linear prediction based on PARCOR coefficients.
- Multi-channel coding (collaborative work between NTT and the University of Tokyo)
- Long-term prediction (collaborative work between NTT and the University of Tokyo)
- Common factor coding and masked compression for floating-point data
- Progressive order prediction for random accessibility.
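As background for the first item in the list, the following sketch shows how PARCOR (reflection) coefficients arise in linear prediction. It uses the textbook Levinson-Durbin recursion on the autocorrelation of a signal; it is not taken from the standard, and how MPEG-4 ALS quantizes and transmits the coefficients is not covered here. The test signal and the function names are assumptions for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations of linear prediction.

    Returns (a, parcor): a[i] weights x[n-1-i] in the prediction, and
    parcor[m-1] is the reflection (PARCOR) coefficient of stage m.
    Assumes r[0] > 0. Sign conventions for PARCOR differ between texts.
    """
    a, parcor, error = [], [], float(r[0])
    for m in range(1, order + 1):
        # Correlation not yet explained by the order m-1 predictor.
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / error
        # Order update: old coefficients are corrected by their mirror image.
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        parcor.append(k)
        error *= 1.0 - k * k  # remaining prediction error energy
    return a, parcor


# Hypothetical test signal: two sinusoids, 200 samples.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(200)]
coeffs, parcor = levinson_durbin(autocorrelation(signal, 4), 4)
print("predictor:", coeffs)
print("PARCOR:   ", parcor)
print("all |k| < 1:", all(abs(k) < 1.0 for k in parcor))
```

Each PARCOR coefficient computed this way has magnitude below 1, which corresponds to a stable synthesis filter. Keeping a coefficient inside that bounded range is straightforward even after quantization, which is the stability and easy-quantization property mentioned in the Terminology section.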
In parallel with its standardization activities, NTT labs have developed proprietary technologies for efficient algorithms and efficient implementation while maintaining compliance with the standard.
4. Future task
NTT Communication Science Labs will continue to support the standardization of the conformance and reference software and the enhancement of the encoder performance.
In parallel, NTT Communications Corp. will design and provide integrated delivery and archiving systems that make use of practical software compliant with this standard. In addition, NTT group companies will develop products, through collaboration with partners or through licensing, for various applications, including professional audio editing tools, portable music players, and the editing or archiving of medical or environmental data.
Terminology
(1) MPEG
Moving Picture Experts Group: a standardization group in ISO/IEC JTC1/SC29/WG11. This group has established a number of important compression schemes for video and audio since 1988.
(2) PARCOR coefficient
Partial Auto Correlation: a set of predictive parameters invented by NTT Musashino Lab in 1972. The set has the properties of stability and easy quantization, and is therefore widely used in speech coding and synthesis and in other signal processing areas.
(3) ISO/IEC
ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) are organizations that establish international standards in various fields.
(4) 14496-3 3rd ED AMD 2 (ALS)
MPEG-4 Audio 3rd edition, amendment 2. It is usually called ALS.
(5) AAC
Advanced Audio Coding: an efficient multi-channel audio coder established in 1997. Its perceptual quality is better than that of MP3. The coder is used in the Japanese digital broadcasting system and in some portable music players.
(6) ZIP
A general-purpose lossless compression tool that adaptively updates its codebook depending on the input sequence. It can compress text and program sources and has been incorporated into operating systems.
About Nippon Telegraph and Telephone (NTT)
Nippon Telegraph and Telephone Corporation (TSE: 9432; NYSE: NTT) was established in 1952 as a state-owned telecommunications public corporation and was converted into a private company in 1985. It is the largest telecommunications company in Japan and the second largest in the world. NTT and its 430 group companies provide a wide range of telecommunications services. One of the important missions of the NTT group is to contribute to the achievement of a Ubiquitous Broadband society. For more information, please visit http://www.ntt.co.jp/index_e.html.