Configuration
This guide will walk you through configuring the voice service, which includes both Text-to-Voice and Voice-to-Text settings. The available options for each service are listed below.
Text-to-Voice Configuration
OpenAI Text-to-Voice Configuration
To use OpenAI for Text-to-Voice conversion, you must set the OpenAI API key.
Follow these steps to configure OpenAI:
- Obtain your OpenAI API key.
- Enter your OpenAI API key in the General settings.
- Navigate to the voice settings in your application.
- Under the OpenAI Voice Configuration section, you can change the voice type.
For more detailed instructions on how to access the configuration panel, please refer to the Configuration Panel Access Guide.
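To see how the API key and voice type fit together, here is a minimal sketch of the request a client could assemble for OpenAI's public speech endpoint. The function name `build_speech_request` is hypothetical, and the default `tts-1` model and `alloy` voice are assumptions chosen for illustration. The application may build its requests differently.

```python
import json

# Public OpenAI speech endpoint. How the application itself calls it is not
# documented here, so treat this as an illustration only.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key: str, text: str,
                         voice: str = "alloy", model: str = "tts-1") -> dict:
    """Assemble (but do not send) a text-to-speech request.

    `voice` corresponds to the voice type chosen in the voice settings;
    `api_key` corresponds to the key entered in the General settings.
    """
    if not api_key:
        raise ValueError("OpenAI API key is not set")
    return {
        "url": OPENAI_SPEECH_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"model": model, "voice": voice, "input": text}),
    }


if __name__ == "__main__":
    # "sk-placeholder" is a dummy value, not a real key.
    request = build_speech_request("sk-placeholder", "Hello from the voice service.")
    print(request["body"])
```

An empty key fails before any request is built, which mirrors the requirement that the API key must be set first.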
GPT-SoVits Configuration
This section outlines how to configure the GPT-SoVits port and set up a new voice using the advanced settings.
1. Ensure GPT-SoVits Port is Set
First, ensure that the GPT-SoVits port is properly set in the general settings.
You can find instructions on how to build and configure the port in the GPT-SoVits installation guide.
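If you are unsure whether anything is listening on the configured port, a quick connectivity check can help. The sketch below is a generic TCP probe. The helper name `is_port_open` is hypothetical, and the host and port in the example are placeholders that you should replace with the values from your own general settings.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    # Placeholder values: use the host and port you configured for GPT-SoVits.
    host, port = "127.0.0.1", 9880
    state = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

A successful connection only shows that some process is listening on the port. It does not confirm that the process is GPT-SoVits.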
2. Navigate to Voice Settings
Once the port is set up, follow these steps:
- Navigate to the Voice Settings menu in the app.
- Locate and click the GPT-SoVits Advanced Settings option.
3. Create a New Voice
After opening the GPT-SoVits Advanced Settings, you can create a new voice profile by providing the following details:
- Name: The name that will appear in the voice selection list.
- Refer WAV Path: Path to the reference voice file in .wav format. This file will be used to model the new voice.
- Refer Text: A transcription of what is spoken in the reference WAV file.
- Prompt Language: The language spoken in the reference WAV file (e.g., English, Chinese, Japanese, etc.).
If your reference WAV file is in Chinese, the transcription will work both in Chinese and English.
After filling in these fields, save your new voice profile.
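Before saving, it can help to sanity-check the four fields. The following is a minimal sketch that assumes a hypothetical `VoiceProfile` structure mirroring the fields above. The application's real storage format and validation rules are not documented here and may differ.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class VoiceProfile:
    """Hypothetical container mirroring the four fields in the settings form."""
    name: str
    refer_wav_path: str
    refer_text: str
    prompt_language: str


def validate_profile(profile: VoiceProfile) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not profile.name.strip():
        problems.append("Name is empty")
    wav = Path(profile.refer_wav_path)
    if wav.suffix.lower() != ".wav":
        problems.append("Refer WAV Path must point to a .wav file")
    elif not wav.is_file():
        problems.append(f"Reference file not found: {wav}")
    if not profile.refer_text.strip():
        problems.append("Refer Text is empty")
    if not profile.prompt_language.strip():
        problems.append("Prompt Language is empty")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    profile = VoiceProfile("Narrator", "samples/narrator.wav",
                           "This is my reference sentence.", "English")
    for problem in validate_profile(profile):
        print("Problem:", problem)
```

The check confirms only that the fields are filled in and that the file exists with a .wav extension. It does not verify that Refer Text actually matches the audio.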
4. Configure the Voice
Once the new voice is created, navigate back to the Voice Settings menu. Under the GPT-SoVits Voice Configuration section, select the newly created voice from the list to use it for your operations.
By following these steps, you'll be able to properly configure GPT-SoVits and create a new voice profile.
Voice-to-Text Configuration
OpenAI and Groq Voice-to-Text Configuration
1. Check if SoX is Installed and Configured
Before running Groq or OpenAI Voice-to-Text, make sure that SoX (Sound eXchange) is installed and available in your system environment. You can verify this by running:
sox --version
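The same check can be scripted. The sketch below looks for the executable on your PATH and then runs it with `--version`. The helper name `tool_version` is hypothetical and is not part of the voice service.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(executable: str = "sox") -> Optional[str]:
    """Return the tool's version banner, or None if it is not on PATH
    or prints nothing when asked for its version."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"],
                            capture_output=True, text=True, check=False)
    # Some tools print their version banner to stderr rather than stdout.
    output = (result.stdout or result.stderr).strip()
    return output or None


if __name__ == "__main__":
    version = tool_version("sox")
    if version is None:
        print("SoX was not found on PATH; install it or fix your PATH first.")
    else:
        print("Found:", version)
```

A `None` result usually means SoX is either not installed or its install directory is missing from PATH, which are the two cases to rule out before troubleshooting the voice service itself.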
2. Run the Groq or OpenAI Voice-to-Text Service
Once SoX is installed and properly configured, you can start audio recording with Groq or OpenAI.
If SoX is not set up correctly, Groq or OpenAI will likely display an error message related to audio processing.
If you encounter errors while running Groq or OpenAI Voice-to-Text, check your SoX installation first.
VSCode Built-In Configuration
To use VSCode Built-In for Voice-to-Text, follow these steps:
- Open VS Code.
- Navigate to the VS Code Speech extension on the VS Code Marketplace.
- Install the extension by clicking on the "Install" button.
- After installation, configure the extension within your VS Code settings as needed.
Always restart VS Code after installing the extension to apply the changes.
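To confirm that the extension is installed, you can inspect the output of `code --list-extensions`. The sketch below assumes the extension's Marketplace identifier is `ms-vscode.vscode-speech` and that the `code` command is on your PATH. Both are assumptions to verify on your machine, and the function names are hypothetical.

```python
import shutil
import subprocess

# Assumed Marketplace identifier for the VS Code Speech extension.
SPEECH_EXTENSION_ID = "ms-vscode.vscode-speech"


def is_extension_listed(listing: str, extension_id: str = SPEECH_EXTENSION_ID) -> bool:
    """Check whether `extension_id` appears in `code --list-extensions` output."""
    installed = {line.strip().lower() for line in listing.splitlines() if line.strip()}
    return extension_id.lower() in installed


def speech_extension_installed() -> bool:
    """Run the VS Code CLI and look for the speech extension."""
    code = shutil.which("code")
    if code is None:
        # The VS Code command-line launcher is not on PATH.
        return False
    result = subprocess.run([code, "--list-extensions"],
                            capture_output=True, text=True, check=False)
    return is_extension_listed(result.stdout)


if __name__ == "__main__":
    print("VS Code Speech installed:", speech_extension_installed())
```

Parsing is kept separate from the CLI call so that the matching logic can be checked without VS Code installed.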