believed to be a protective doctrine preached by the Lord Buddha in the Pali language. The aim of this study is
to analyze acoustic properties of Pirith
using computer-aided methods and identify special characteristics and patterns.
In this study, two methods were used to identify special characteristics of the Angulimala Sutta. The first method calculates
the voiced to unvoiced ratio using the zero crossing rate and energy content associated
with the acoustic signal, while the second method recognizes the vowel distribution
using the first and second formant frequencies. Results
of the first method indicate that approximately 96% of frames are voiced, while the
second method suggests approximately
in the square region of
chanting the Angulimala sutta, most of the time the tongue is positioned low and back while the lips
Keywords: Formant frequencies,
Voiced to unvoiced ratio, Zero-Crossing rate, Vowel distribution
process begins at the point of converting an idea developed in the speaker's
mind into a language code. With the aid of articulatory motion and vocal tract
movement, the phonemes, which are lined up in a sequence, propagate
outside as an acoustic waveform.
means protection from all aspects and this protection is to be obtained by
reciting or listening to Pirith suttas.
The practice of reciting and listening to Pirith
suttas began very early in the history of Buddhist culture. As reported by
Jayaratne 2007, an experiment was performed at Kanduboda International
Meditation Centre, Sri Lanka, to understand the effect of Pirith on human beings. When a sample of human subjects was allowed
to listen to Pirith chants, it was
observed that within
of the commencement of the chanting, their heart rate decreased, their heart pulse amplitude
halved, and they reached an alpha state similar to that obtained under a
The voiced to unvoiced ratio (V/UV ratio) is an
important parameter as it indicates how far the speech production system
involves vibration of the vocal cords. In this work, we combined the V/UV
ratio with the zero crossing rate (ZCR) and the energy of short time segments of the
signal to strengthen the analysis.
In voiced speech, the
vibrating glottis generates periodic pulses which resonate in the vocal tract.
Therefore, when vowels are pronounced, similar frequencies are generated.
However, in unvoiced speech, the vocal cords are held open and a continuous air
stream flows through them. The air stream becomes turbulent because of the
narrowed vocal tract and creates non-periodic, noise-like sounds 8.
The zero crossing rate measures the number
of intersections a given signal makes with the time axis per unit time in an
amplitude-time plot. Voiced speech shows a low
zero-crossing rate due to the excitation of the vocal tract by the periodic air
flow, whereas unvoiced speech shows a high zero-crossing count as it is
produced by turbulent airflow flowing through the narrowed vocal tract 1.
Additionally, the voiced part of the speech has high energy content because of
According to the acoustic theory of speech production, the vocal
tract is modeled as a non-uniform tube closed at the vocal folds and open at the
lip end 9. The cross-sectional area of
the vocal tract depends on the position of the tongue, lips, jaw and velum. Due to the varying
cross section along the vocal tract, different resonance frequencies
(harmonics) are generated in response to varying vocal fold vibrations.
Consequently, the complex output voice signal is composed of several harmonics
called formants, which are clearly visible in spectrographic displays
of voice segments. Normally, they occur on average at intervals of
$c/(2L)$, where $c$ is the speed of sound and $L$ is the length
of the vocal tract 3.
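
As a rough illustration of formant spacing, the sketch below assumes the simplest case of a uniform tube closed at one end and open at the other, which is a simplification of the non-uniform tube described above. Under that assumption the resonances fall at odd multiples of c/(4L), so neighbouring resonances are c/(2L) apart. The function name, the tube length of 0.175 m and the speed of sound of 350 m/s are illustrative assumptions, not values taken from this study.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Assumption: idealised quarter-wavelength resonator, F_n = (2n - 1) * c / (4L).
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]


# Hypothetical tube length of 17.5 cm (an assumed, typical adult value).
formants = tube_formants(0.175)
spacing = formants[1] - formants[0]  # equals c / (2L) for this idealised tube
```

For these assumed values the resonances fall near 500, 1500 and 2500 Hz, about 1000 Hz apart. A real vocal tract changes shape during articulation, so measured formants deviate from this regular pattern.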
Vowels can be mapped using the
relationship of lip opening width
to the first formant frequency,
and of tongue constriction width to the second
vowels, which do not belong to any particular language but form a measuring system for
describing the sounds of languages, are used as a set of reference vowels in this
work. These vowel sounds correspond to extreme tongue positions,
either front or back, high or low. The current system was modified by Daniel
Jones * based on the original idea proposed by earlier phoneticians, notably
Ellis ** and Bell **. The standard International Phonetic Alphabet (IPA) vowel
trapezium is shown in figure (*).
the analysis process, samples of the
Angulimala Sutta recited by
male monk chanters were recorded under high precision conditions and 15 samples
were subjected to analysis. Each voice recording was then
split into smaller voiced segments of frame length
using sampling rate of
. This specific frame length was selected
because the vocal tract has fixed characteristics over a time interval of the order of
The voiced to unvoiced ratio is calculated by
counting frames whose zero-crossing rate is below a reference value and whose
short time energy is above a reference value as voiced frames, and all others as unvoiced
frames, according to the algorithm shown in Figure 1.
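
The frame classification described above can be sketched as follows. This is a minimal illustration, not the algorithm of Figure 1 itself: the function names, the use of NumPy, and the reference values `zcr_ref` and `energy_ref` are assumptions, and suitable thresholds would have to be tuned to the recordings.

```python
import numpy as np


def split_frames(signal, frame_len):
    """Cut a 1-D signal into non-overlapping frames, dropping any remainder."""
    n_frames = len(signal) // frame_len
    trimmed = np.asarray(signal[:n_frames * frame_len], dtype=float)
    return trimmed.reshape(n_frames, frame_len)


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)


def voiced_mask(frames, zcr_ref, energy_ref):
    """A frame counts as voiced when its ZCR is low AND its energy is high."""
    return (zero_crossing_rate(frames) < zcr_ref) & (short_time_energy(frames) > energy_ref)


def voiced_fraction(signal, frame_len, zcr_ref, energy_ref):
    """Share of frames classified as voiced (between 0 and 1)."""
    frames = split_frames(signal, frame_len)
    return float(np.mean(voiced_mask(frames, zcr_ref, energy_ref)))
```

The voiced to unvoiced ratio then follows from the fraction `p` as `p / (1 - p)`, provided at least one frame is unvoiced.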
In the computational speech model, a
pre-emphasis filter is applied to the sampled time series of the voiced segment to
cancel out the effect of the glottis. Frame-by-frame
analysis was then carried out with Hamming
windows, linear predictive coding
(LPC) and autocorrelation to
extract the formant values. In the vowel analysis, the frequency values of the first
and second formants,
were extracted and the vowel
distribution was obtained by plotting
. In the analysis of vowel distribution, primary cardinal vowels
introduced by Daniel Jones were used as a reference. *
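
The formant extraction chain described above (pre-emphasis, Hamming window, autocorrelation LPC) can be sketched as below. This is an illustrative outline and not the implementation used in this study: the pre-emphasis coefficient of 0.97, the LPC order, the bandwidth limit and all function names are assumptions, and the Levinson-Durbin recursion is one common way of solving the autocorrelation equations.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """LPC polynomial [1, a1, ..., ap] from autocorrelation via Levinson-Durbin."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)                # updated prediction error
    return a


def estimate_formants(frame, sample_rate, order=12, pre_emphasis=0.97,
                      min_hz=90.0, max_bw_hz=400.0):
    """Formant candidates (Hz, ascending) for one frame.

    Assumed parameters: pre_emphasis, order, min_hz and max_bw_hz are
    illustrative defaults, not values reported in the text.
    """
    frame = np.asarray(frame, dtype=float)
    # First-order pre-emphasis: y[n] = x[n] - pre_emphasis * x[n - 1]
    emphasized = np.append(frame[0], frame[1:] - pre_emphasis * frame[:-1])
    windowed = emphasized * np.hamming(len(emphasized))
    a = lpc_autocorr(windowed, order)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]       # keep one root of each conjugate pair
    freqs = np.angle(roots) * sample_rate / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sample_rate / np.pi
    keep = (freqs > min_hz) & (bandwidths < max_bw_hz)
    return np.sort(freqs[keep])
```

The first two values returned for a vowel frame can be taken as estimates of the first and second formants, although spurious candidates are possible and the order and bandwidth limit usually need adjusting to the sampling rate and the speaker.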
is used for scripting, calculations and analysis. In the frame-by-frame analysis, speech signals are divided into non-overlapping frames
of samples. Figure 2 shows the vowel distribution for all 15
samples. Percentage distribution is shown in Figure 3, while a further analysis
of denser areas is indicated by Figure 4 and Figure 5. Figure 6 offers a
comparison of vowel distribution with primary cardinal vowels.
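
A percentage distribution of this kind can be obtained by counting how many (first formant, second formant) pairs fall inside a rectangular frequency region. The sketch below is a minimal illustration with hypothetical names and made-up example values; the numbers are not results from this study.

```python
def percent_in_region(f1_values, f2_values, f1_range, f2_range):
    """Percentage of (F1, F2) points inside a rectangular region (bounds inclusive)."""
    if len(f1_values) != len(f2_values):
        raise ValueError("F1 and F2 lists must have the same length")
    if len(f1_values) == 0:
        return 0.0
    f1_lo, f1_hi = f1_range
    f2_lo, f2_hi = f2_range
    inside = sum(1 for f1, f2 in zip(f1_values, f2_values)
                 if f1_lo <= f1 <= f1_hi and f2_lo <= f2 <= f2_hi)
    return 100.0 * inside / len(f1_values)


# Made-up formant pairs in Hz, for illustration only.
example_f1 = [500.0, 600.0, 700.0, 300.0]
example_f2 = [1000.0, 1100.0, 1200.0, 2200.0]
share = percent_in_region(example_f1, example_f2, (450.0, 750.0), (900.0, 1300.0))
```

Repeating the count over a grid of such regions gives a density map from which the most populated areas can be picked out.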
Voiced to unvoiced ratio calculation, when
combined with zero-crossing rate and energy content, demonstrated
as voiced while
as unvoiced. Further, this demonstrates a clear tendency to pronounce vowels when
chanting the Angulimala sutta.
A previous study on formant
frequency tuning in professional Byzantine chanters shows clear evidence that
chanters have a special ability to use personal formant tuning when chanting 2. In
this analysis, the vowel distribution shows a most common area for all chanters, as
shown in Figure 2. The calculation of the percentage values indicated that
of vowels concentrate
around the frequency range of
as shown in Figure 3.
Further analysis demonstrated the area bounded by,
of vowel distribution respectively as showing
The Angulimala sutta is rich in vowels, as it shows approximately
voiced to unvoiced ratio. Analysis of these
vowels suggests that
of vowels concentrate around the frequency
showing a high proportion of low back unrounded
When comparing the results with the cardinal vowel chart, the densest vowel area shows the
qualities of cardinal vowels a and ?, as shown in Table 2.
It can be concluded
that, when chanting the Angulimala sutta, the articulation tends to place
the tongue low and back while the lips are
unrounded. Fewer vowels are represented by cardinal vowels i and u,
which are a high front unrounded vowel and a high back rounded vowel respectively.