Tech Kaizen: 11/1/14

OpenCV is a library of programming functions mainly aimed at real-time computer vision. The library is cross-platform and focuses mainly on real-time image processing. If the library finds Intel's Integrated Performance Primitives (IPP) on the system, it will use these proprietary optimized routines to accelerate itself.

OpenVX is an open, royalty-free standard for cross-platform acceleration of computer vision applications. It is designed by the Khronos Group to facilitate portable, optimized, and power-efficient processing for vision algorithms. It is aimed at embedded and real-time programs in computer vision and related scenarios. It uses a connected graph representation of operations.

OpenMAX Integration Layer (IL) is a standard API for accessing multimedia components on mobile platforms. It focuses mainly on audio, video, and still-image processing, and it has been defined by the Khronos Group. By means of the OpenMAX IL API, multimedia frameworks can access hardware accelerators on platforms that provide it. Bellagio is an open-source implementation of the OpenMAX IL API that runs on Linux. It is intended to show the usage of the IL API and to allow people to start developing components. The package provides the OpenMAX IL core shared library with a "reference" component.

OpenMAX DL (Development Layer) is a set of APIs containing a comprehensive set of audio, video, and imaging functions that can be implemented and optimized on new CPUs, hardware engines, and DSPs, and then used for a wide range of accelerated codec functionality.

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. They currently cover the two following feature sets, and more will come in the future: , . Eigenfaces is a popular .

The Festival library is a C++ text-to-speech library.

Digital audio is the most commonly used method of representing sound inside a computer. In this method, sound is stored as a sequence of samples taken from the audio signal at constant time intervals. The length of this interval determines the sampling rate. Commonly used sampling rates range from 8 kHz (telephone quality) to 48 kHz (DAT tapes). A sample represents the volume of the signal at the moment it was measured. The quality of a digital audio signal depends on the time resolution (the sampling rate) and the voltage resolution (usually a linear integer representation whose basic unit is one bit). In uncompressed digital audio, each sample requires one or more bytes of storage. The number of bytes required depends on the number of channels (mono, stereo) and the sample format (8 or 16 bits, mu-Law, etc.).
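To make the sampling idea concrete, here is a minimal Python sketch. It assumes a 440 Hz sine wave as the "audio signal" and 16-bit signed little-endian samples; the function names `sample_sine` and `pcm_size_bytes` are made up for this illustration and do not come from any audio library.

```python
import math
import struct


def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=0.5):
    """Sample a sine wave at constant time intervals as 16-bit signed integers."""
    n_samples = int(round(sample_rate_hz * duration_s))
    interval_s = 1.0 / sample_rate_hz  # time between two consecutive samples
    peak = 32767                       # largest positive 16-bit signed value
    return [
        int(round(amplitude * peak * math.sin(2 * math.pi * freq_hz * n * interval_s)))
        for n in range(n_samples)
    ]


def pcm_size_bytes(sample_rate_hz, duration_s, channels, bytes_per_sample):
    """Storage needed for uncompressed audio: samples x channels x bytes per sample."""
    return int(round(sample_rate_hz * duration_s)) * channels * bytes_per_sample


# One second of mono audio at telephone quality (8 kHz), 16 bits per sample.
samples = sample_sine(440.0, 8000, 1.0)
raw = struct.pack("<%dh" % len(samples), *samples)  # little-endian 16-bit PCM bytes

print(len(samples), "samples,", len(raw), "bytes")
print("1 s of 48 kHz stereo 16-bit:", pcm_size_bytes(48000, 1.0, 2, 2), "bytes")
```

The byte counts follow directly from the paragraph above: one second at 8 kHz mono with 2 bytes per sample needs 16000 bytes, while one second at 48 kHz stereo with 2 bytes per sample needs 192000 bytes.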

Digital audio can be stored in a wide range of formats. Generally speaking, audio comes in two flavors: compressed and uncompressed. Compressed audio can be further subdivided by kind of compression: lossless, which preserves the original content exactly, and lossy, which achieves more compression at the expense of degrading the audio. Uncompressed PCM audio, on the other hand, is defined by two parameters: the sample rate and the bit-depth. Loosely speaking, the sample rate limits the maximum frequency that the format can represent, and the bit-depth determines the maximum dynamic range that the format can represent. You can think of bit-depth as determining how much noise there is compared to signal.
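The two PCM parameters can be turned into rough numbers with the usual rules of thumb. This sketch assumes linear integer PCM and uses the textbook approximations: the highest representable frequency is half the sample rate (the Nyquist limit), and the dynamic range is 20·log10(2^bits), about 6 dB per bit. The helper names are illustrative only, and real converters fall somewhat short of these theoretical figures.

```python
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of linear integer PCM, roughly 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bit_depth)


for rate, bits in [(8000, 8), (44100, 16), (48000, 16)]:
    print("%d Hz, %d-bit: up to %.0f Hz, about %.1f dB"
          % (rate, bits, nyquist_hz(rate), dynamic_range_db(bits)))
```

For example, 8 kHz telephone-quality audio can represent frequencies up to 4 kHz, and 16-bit samples give a theoretical dynamic range of about 96 dB.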

The Linux kernel has historically had two uniform sound APIs. One is OSS (Open Sound System); the other is ALSA (Advanced Linux Sound Architecture). ALSA is available for Linux only, and because there is only one implementation of the ALSA interface, "ALSA" refers equally to that implementation and to the interface itself.

Android native code makes use of OpenSL ES (Open Sound Library for Embedded Systems) for audio processing.

ref:

The ABCs of PCM (Uncompressed) digital audio - http://blog.bjornroche.com/2013/05/the-abcs-of-pcm-uncompressed-digital.html

WAVE PCM soundfile format - https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

An introduction to Linux sound systems and APIs - http://archive09.linux.com/articles/113775

A Tutorial on Using the ALSA Audio API - http://equalarea.com/paul/alsa-audio.html

How to write an ALSA driver - http://c-qs.blogspot.com/2014/05/writing-of-alsa-driver.html

alsa vs tinyalsa - http://blog.csdn.net/myzhzygh/article/details/8468210

Open Sound System (OSS) -

Advanced Linux Sound Architecture (ALSA) -

ALSA pcm - http://www.alsa-project.org/alsa-doc/alsa-lib/pcm.html
ALSA - alsa-lib, libasound.so - http://www.alsa-project.org/alsa-doc/alsa-lib/pcm.html
ALSA Pulse Audio - http://freedesktop.org/software/pulseaudio/doxygen/simple.html
Pulse Audio(libpulse.so) - http://freedesktop.org/software/pulseaudio/doxygen/simple.html
Linux sound drivers - http://www.tldp.org/HOWTO/Sound-HOWTO/

Introduction to Audio programming -

How Audio Data is Represented - http://blogs.msdn.com/b/dawate/archive/2009/06/22/intro-to-audio-programming-part-1-how-audio-data-is-represented.aspx
Demystifying the WAV Format - http://blogs.msdn.com/b/dawate/archive/2009/06/23/intro-to-audio-programming-part-2-demystifying-the-wav-format.aspx
An Introduction to Audio Processing Objects - http://msdn.microsoft.com/en-us/magazine/dn201755.aspx
Synthesizing Simple Wave Audio using C# - http://blogs.msdn.com/b/dawate/archive/2009/06/24/intro-to-audio-programming-part-3-synthesizing-simple-wave-audio-using-c.aspx
Algorithms for Different Sound Waves in C# - http://blogs.msdn.com/b/dawate/archive/2009/06/25/intro-to-audio-programming-part-4-algorithms-for-different-sound-waves-in-c.aspx

C/C++ wav processing sample code - https://android.googlesource.com/platform/frameworks/av/+/jb-dev/cmds/stagefright/

Linux pcm audio sample code - http://www.alsa-project.org/alsa-doc/alsa-lib/_2test_2pcm_8c-example.html

Android audio native programming using OpenSL ES - http://audioprograming.wordpress.com/2012/03/03/android-audio-streaming-with-opensl-es-and-the-ndk/

Android APIs for recording/playing audio (for example, PCM): AudioRecord, AudioTrack -
