<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 * Copyright 2003 Sun Microsystems, Inc.
 * See the file "license.terms" for information on usage and
 * redistribution of this file, and for a DISCLAIMER OF ALL
<head><title>FestVox to FreeTTS</title></head>
<table bgcolor="#FFCC66" width="100%">
<td align=center width="100%">
<h1>FestVox To FreeTTS</h1>
<p>As of FreeTTS 1.2, FreeTTS can import voice data directly
from FestVox. The process currently works well for US English
voices, and we encourage you to help us make it work for other
locales. This page describes the overall import process.</p>
<h3>Creating a Voice</h3>
<p>You must first create a voice using
<a href="http://festvox.org">FestVox</a>. We've had success
using FestVox 2.0 on both Linux (RedHat 9.0) and Solaris (use
gcc 3.2.2 to compile FestVox and Festival on Solaris).
<b>NOTE that we did not create FestVox, nor can we provide
support for it.</b> The creators of FestVox, however, did a
great job, and you can refer to their documentation for where
to send any questions or comments.</p>
<p>FestVox currently supports creating two types of voices:
diphone and unit selection. Diphone voices support general
domain synthesis (i.e., they try to speak any text you throw at
them). They are time consuming to create and are usually not a
good first choice when learning how to create voices. Unit
selection, or limited domain, voices only support a limited
domain (e.g., telling the time), and generally sound very
good.</p>
<p>If you want to experiment with voice creation and
conversion, we recommend you start with creating a time
<p>Please refer to the <a href="http://festvox.org/bsv/">
FestVox Documentation</a> for information on creating a voice.
<a href="http://www.festvox.org/bsv/bsv-usukdiphone-ch.html">
Section IV.19</a> of the FestVox documentation provides a
good tutorial on making a US diphone voice, and
<a href="http://www.festvox.org/bsv/x1003.html">
Section II.5.6</a> provides a good tutorial on recording a
cluster unit voice for the limited domain of telling
the time. <a href="http://www.festvox.org/bsv/bsv-ldom-ch.html">
Section II.5</a> provides a good general explanation of
creating a limited domain voice.</p>
<h3>Importing a FestVox Voice into FreeTTS</h3>
<p>FreeTTS follows many of the same steps that
<a href="http://cmuflite.org">Flite</a> follows for importing
voices. For a more detailed description of the process,
<a href="http://www.speech.cs.cmu.edu/flite/doc/flite_8.html#SEC14">
<a href="http://www.speech.cs.cmu.edu/flite/doc/index.html">
Flite documentation</a>.
<p>To import a voice into FreeTTS, you first need to do the
following:
<li>Compile <a href="http://festvox.org">Festival 1.4.3 and
FestVox 2.0</a> as well as the speech tools that come with
Festival. Refer to the Festival documentation for details
on setting this up on your system. We've only built
Festival and FestVox on RedHat 9.0 and Solaris. For both
systems, we used gcc 3.2.2.
<li>"festival", "ant", "java", and "javac" must be in your path.
For example, we used the following command under bash on
RedHat (modify appropriately):
PATH=/usr/java/j2sdk1.4.2/bin:/home/jim/festival/bin:/usr/java/apache-ant-1.5.4/bin:$PATH</code>
<li>You must set the ESTDIR environment variable to point
to the speech tools. For example:
ESTDIR=/home/jim/speech_tools</code>
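<p>The two environment prerequisites above (tools on the path,
<code>ESTDIR</code> set) can be checked before starting a
conversion. The following is a minimal sketch, not part of the
FreeTTS tools: the function name <code>missing_tools</code> is
hypothetical, and which checks are worth making is an assumption.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeTTS): print each command name
# from the argument list that cannot be found on PATH, one per line.
# Prints nothing when every command is found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# The four commands this page says must be in your path.
missing=$(missing_tools festival ant java javac)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
fi

# ESTDIR should point at the speech tools directory.
if [ -z "${ESTDIR:-}" ] || [ ! -d "$ESTDIR" ]; then
    echo "ESTDIR is unset or does not name a directory"
fi
```

<p>The sketch only reports problems; it does not modify
<code>PATH</code> or <code>ESTDIR</code>, so it is safe to run
before each conversion.</p>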
<p>To convert a voice, run the
<code>FestVoxToFreeTTS.sh</code> script from a command line
prompt located in the <code>tools/FestVoxToFreeTTS</code>
directory:
<p><code>FestVoxToFreeTTS.sh &lt;voicedir&gt;</code>
<p>where &lt;voicedir&gt; is the directory the FestVox voice
resides in. The contents of &lt;voicedir&gt; will look something
like the following:</p>
bin/  etc/       FreeTTS/  lpc/     prompt-cep/  recording/  wav/
cep/  f0/        group/    mcep/    prompt-lab/  scratch/    wavn/
dic/  festival/  lab/      pm/      prompt-utt/  sts/        wrd/
emu/  festvox/   lar/      pm_lab/  prompt-wav/  versions/
<p>The script will automatically detect whether it is a
cluster unit voice or a diphone voice by looking at the
&lt;voicedir&gt;/etc/voice.defs file. If no such file exists,
you will need to create it. An example for a time-telling
voice would be something like the following:
FV_VOICENAME=$FV_INST"_"$FV_LANG"_"$FV_NAME
FV_FULLVOICENAME=$FV_VOICENAME"_"$FV_TYPE
<p>If possible, you can let festival automatically generate
this file by running
<code>&lt;festvoxdir&gt;/src/general/guess_voice_defs</code>.
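<p>Because <code>voice.defs</code> consists of shell variable
assignments, one plausible way to inspect it is to source it. The
sketch below writes a sample file into a scratch directory and reads
it back. The variable names are the ones used in the example above;
the values <code>sun</code>, <code>time</code>, <code>dtv</code>, and
<code>ldom</code> are assumptions chosen to match the
<code>sun_time_dtv</code> name used on this page, and whether the
conversion script reads the file this way is also an assumption.</p>

```shell
#!/bin/sh
# Create a scratch voice directory with a sample etc/voice.defs.
# The four values below are illustrative assumptions, not taken from
# a real FestVox voice.
voicedir=$(mktemp -d)
mkdir -p "$voicedir/etc"
cat > "$voicedir/etc/voice.defs" <<'EOF'
FV_INST=sun
FV_LANG=time
FV_NAME=dtv
FV_TYPE=ldom
FV_VOICENAME=$FV_INST"_"$FV_LANG"_"$FV_NAME
FV_FULLVOICENAME=$FV_VOICENAME"_"$FV_TYPE
EOF

# Source the file so its assignments become shell variables.
. "$voicedir/etc/voice.defs"

echo "voice name: $FV_VOICENAME"
echo "full name:  $FV_FULLVOICENAME"
echo "voice type: $FV_TYPE"

# Remove the scratch directory again.
rm -rf "$voicedir"
```

<p>With a real voice, you would point <code>voicedir</code> at your
FestVox voice directory instead of creating a scratch one.</p>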
<p>FreeTTS will create a new directory
<code>&lt;voicedir&gt;/FreeTTS/</code>. That directory holds the
text file containing all the data for the voice (along with a
few other intermediate files). The voice file will have a
name such as <code>sun_time_dtv.txt</code>.
<p>The various stages of the conversion process can be called
directly by passing a second argument to
<code>FestVoxToFreeTTS.sh</code>, such as "sts" or "mcep".
Use these with care. More information on these
stages can be found in the Flite documentation.
<p>If you do not pass a second argument (recommended), the
conversion tool will run the processing stages in the
following order: "lpc", "sts", "mcep" (only for a cluster unit
voice), "idx", "install", and "compile". The "install" and
"compile" stages are specific to FreeTTS and are not mentioned
in the Flite documentation. They construct the framework for
the voice within FreeTTS and compile the voice.
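<p>The default stage order can be summarized in a few lines of
shell. This is only an illustration of the ordering described on
this page, not code from <code>FestVoxToFreeTTS.sh</code>; the
function name <code>stage_order</code> and the type labels
<code>clunit</code> and <code>diphone</code> are hypothetical.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the default stages, in order, for a
# voice type. The "mcep" stage is included only for cluster unit
# voices, as described in the text.
stage_order() {
    case "$1" in
        clunit)  echo "lpc sts mcep idx install compile" ;;
        diphone) echo "lpc sts idx install compile" ;;
        *)       echo "unknown voice type: $1" >&2; return 1 ;;
    esac
}

# Show the order for both voice types.
for voice_type in clunit diphone; do
    echo "$voice_type: $(stage_order "$voice_type")"
done
```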
<p>When the process gets to the install phase, you will
encounter a menu. The install phase only knows how to handle
US English voices. If your voice is for any other language or
locale, you should probably exit at this step. Unfortunately,
adding new languages or locales is beyond the scope of this
<p>The menu allows you to define various features about the
voice:
<li><b>Name</b>: The name you want to call this voice.
For example "kevin", "kevin16", "alan", or "dave".
<li><b>Gender</b>: The gender of the voice. Select
help from the menu for a full listing of genders.
<li><b>Age</b>: The age of the voice. Select help
from the menu for a full listing of ages.
<li><b>Description</b>: A sentence or so that
describes this voice for others.
<li><b>Full Name</b>: The name that will be used to
name the voice files and directory. DON'T USE SPACES.
to this installation of FreeTTS as well as any other
copy of FreeTTS you expect to use this voice. For the
sake of consistency with other voices, we highly
recommend that you not change this property unless it
conflicts with an existing voice. The name follows the
convention
<code>&lt;domain&gt;_&lt;locale&gt;_&lt;name&gt;</code>.
The &lt;name&gt; does not have to match the Name
property. The domain generally matches an Internet
domain or some other globally unique identity. For
limited domain voices, you might use the limited domain
name instead of the locale. Example names include
<code>cmu_us_kal</code>, <code>cmu_time_awb</code>,
and <code>sun_us_dtv</code>.
<li><b>Domain</b>: The domain if this is a limited
(ldom) voice, otherwise it must be set to "general".
<li><b>Organization</b>: The organization which
recorded the voice. For example "cmu" or "sun".
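<p>The Full Name convention described in the list above can be
checked with a short shell function. This is a sketch under stated
assumptions: the function name <code>is_valid_full_name</code> is
hypothetical, the install menu is not known to perform this check,
and treating the convention as exactly three non-empty parts joined
by underscores is a strict reading of the text.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument contains no spaces and
# has exactly three non-empty parts separated by underscores.
is_valid_full_name() {
    case "$1" in
        *" "*)    return 1 ;;  # spaces are not allowed
        *_*_*_*)  return 1 ;;  # more than three parts
        ?*_?*_?*) return 0 ;;  # three non-empty parts
        *)        return 1 ;;  # fewer than three parts, or an empty part
    esac
}

# Try the example names from this page plus two malformed ones.
for name in cmu_us_kal cmu_time_awb sun_us_dtv "my voice" cmu_kal; do
    if is_valid_full_name "$name"; then
        echo "ok:  $name"
    else
        echo "bad: $name"
    fi
done
```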
<p>If a voice with the same Full Name already exists,
you are given the option to overwrite it, cancel, or change
<p>When this is done, the voice is put into the FreeTTS
<code>&lt;FreeTTSdir&gt;/com/sun/speech/freetts/en/us/&lt;voice
Full Name&gt;</code>. We recommend visiting this directory
to confirm that everything looks correct; there should be
four files similar to the following:
README                 - Information about the voice
sun_time_dtv.txt       - The imported voice data in ASCII format
voice.Manifest         - The Manifest file with which to create the jar file
DtvVoiceDirectory.java - The VoiceDirectory for this new voice
limited domain voice for something other than the cmu time
domain, then you will likely have to make some changes to make
it look at the correct lexicon.
<p>As part of the import process, the FestVoxToFreeTTS.sh
script will create the jar file for the voice. If you wish
to create the jar file manually, you can run one of the
following commands, depending upon the type of voice you
have imported (substitute the Full Name of the voice you
ant -Dclunit_voice=sun_time_dtv -find build.xml
ant -Ddiphone_voice=sun_us_dtv -find build.xml
<p>The compiled voice is put in
<code>&lt;FreeTTSdir&gt;/lib/&lt;voice Full Name&gt;.jar</code>.
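<p>When building a jar manually, the choice between the two
<code>ant</code> commands and the resulting jar path can be
expressed as a small helper. The sketch below only prints the
command and the expected path; it does not run <code>ant</code>. The
function names and the type labels <code>clunit</code> and
<code>diphone</code> are hypothetical; the property names
<code>clunit_voice</code> and <code>diphone_voice</code> are the ones
shown in the commands above.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the ant command for a voice type
# (clunit or diphone) and a voice Full Name.
ant_command() {
    case "$1" in
        clunit)  echo "ant -Dclunit_voice=$2 -find build.xml" ;;
        diphone) echo "ant -Ddiphone_voice=$2 -find build.xml" ;;
        *)       echo "unknown voice type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print where the compiled jar should appear,
# relative to the FreeTTS directory.
jar_path() {
    echo "lib/$1.jar"
}

ant_command clunit sun_time_dtv
jar_path sun_time_dtv
ant_command diphone sun_us_dtv
jar_path sun_us_dtv
```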
<p>The voice will automatically be added to the list of
available voices for FreeTTS.
<p>You can now test your voice with:
<p><code>java -jar lib/freetts.jar myvoicename</code>
<p><code>java -jar bin/JTime.jar myvoicename</code>
<p>where myvoicename is the name property you assigned
to your voice in the "install" phase. If you've forgotten
the name, you can always retrieve it by executing the jar
<p><code>java -jar lib/&lt;voice Full Name&gt;.jar</code>
<h3>Files in this directory</h3>
<li><b>FestVoxToFreeTTS.sh</b>: The bash script that
performs the conversion process.
<li><b>FestVoxClunitsToFreeTTS.scm</b>: Performs the idx
stage of the conversion for cluster unit voices.
<li><b>FestVoxDiphoneToFreeTTS.scm</b>: Performs the idx
stage of the conversion for diphone voices.
<li><b>qsort.scm</b>: A simple quicksort implementation in
Scheme.
<li><b>FindSTS.java</b>: Generates the sts file for a
given recording. Used by FestVoxToFreeTTS.sh.
<li><b>FindSTS.jar</b>: A compiled version of FindSTS.java
(automatically generated).
<li><b>README</b>: This file.
<li><b>CMU_USDiphoneTemplate.java</b>: A template voice
directory for en/us diphone voices.
<li><b>CMU_USTimeTemplate.java</b>: A template voice
directory for en/us time limited domain cluster unit
voices.
<li><b>VoiceMakefileTemplate.txt</b>: A template Makefile for
both ldom and diphone voices.
<p>See the <a href="../../license.terms">license terms</a>
and <a href="../../acknowledgments.txt">acknowledgments</a>.
Copyright 2003 Sun Microsystems, Inc. All Rights
Reserved. Use is subject to license terms.</p>