Configuration

This guide will walk you through configuring the voice service, which includes both Text-to-Voice and Voice-to-Text settings. The available options for each service are listed below.


Text-to-Voice Configuration

OpenAI Text-to-Voice Configuration

To use OpenAI for Text-to-Voice conversion, you must set the OpenAI API key.

Follow these steps to configure OpenAI:

  1. Obtain your OpenAI API key.
  2. Enter your OpenAI API key in the general settings.
  3. Navigate to the voice settings in your application.
  4. Under the OpenAI Voice Configuration section, select the voice type you want.

OpenAI-Voice-Selection

note

For more detailed instructions on how to access the configuration panel, please refer to the Configuration Panel Access Guide.
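
If you want to confirm that an API key works for speech synthesis outside the app, the following is a minimal sketch using only the Python standard library. It assumes the public OpenAI speech endpoint with the `tts-1` model and the `alloy` voice. The app itself may use different values, and the function names here are illustrative only.

```python
import json
import os
import urllib.request

# Assumption: the public OpenAI speech endpoint; the app may call it differently.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key: str, text: str, voice: str = "alloy",
                         model: str = "tts-1") -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request."""
    payload = json.dumps({"model": model, "voice": voice, "input": text})
    return urllib.request.Request(
        SPEECH_URL,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, out_path: str = "sample.mp3") -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request, timeout=30) as response:
        with open(out_path, "wb") as audio_file:
            audio_file.write(response.read())


# Example (needs network access and a valid key in OPENAI_API_KEY):
# synthesize(os.environ["OPENAI_API_KEY"], "Hello from the voice service.")
```

A failed call with an HTTP 401 error usually points to a missing or invalid key rather than a problem in the voice settings.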

GPT-SoVits Configuration

This section outlines how to configure the GPT-SoVits port and set up a new voice using the advanced settings.

1. Ensure GPT-SoVits Port is Set

First, ensure that the GPT-SoVits port is properly set in the general settings.

GPT-SoVits-port-setting

You can find instructions on how to build and configure the port in the GPT-SoVits installation guide.
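
Before continuing, it can help to confirm that something is actually listening on the configured port. The sketch below is one way to check this from Python. The host `127.0.0.1` and port `9880` are assumptions for illustration, so substitute the values you entered in the general settings.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: replace with the host and port from your general settings.
    host, port = "127.0.0.1", 9880
    state = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"GPT-SoVits port {port} on {host} is {state}")
```

A "not reachable" result usually means the GPT-SoVits server is not running or is bound to a different port.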

2. Navigate to Voice Settings

Once the port is set up, follow these steps:

  1. Navigate to the Voice Settings menu in the app.
  2. Locate and click the GPT-SoVits Advanced Settings option.

GPT-SoVits-Advanced-Settings

3. Create a New Voice

After opening the GPT-SoVits Advanced Settings, you can create a new voice profile by providing the following details:

voice-profile

  • Name: The name shown for this voice in the voice selection list.
  • Refer WAV Path: Path to the reference voice file in .wav format. This file will be used to model the new voice.
  • Refer Text: A transcription of what is spoken in the reference WAV file.
  • Prompt Language: The language spoken in the reference WAV file (e.g., English, Chinese, Japanese).

tip

If your reference WAV file is in Chinese, the transcription will work in both Chinese and English.

After filling in these fields, save your new voice profile.
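
To catch common mistakes before saving, the four fields above can be sanity-checked with a short script. This is only a sketch: `VoiceProfile` and its field names are hypothetical and mirror the form labels, not the app's internal schema. It uses Python's standard `wave` module, which reads uncompressed PCM WAV files only.

```python
import wave
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class VoiceProfile:
    # Hypothetical field names mirroring the form labels, not the app's schema.
    name: str
    refer_wav_path: str
    refer_text: str
    prompt_language: str


def check_profile(profile: VoiceProfile) -> List[str]:
    """Return a list of problems; an empty list means the profile looks usable."""
    problems = []
    if not profile.name.strip():
        problems.append("Name is empty.")
    if not profile.refer_text.strip():
        problems.append("Refer Text is empty.")
    if not profile.prompt_language.strip():
        problems.append("Prompt Language is empty.")

    path = Path(profile.refer_wav_path)
    if path.suffix.lower() != ".wav":
        problems.append("Refer WAV Path does not end in .wav.")
    elif not path.is_file():
        problems.append("Refer WAV Path does not point to an existing file.")
    else:
        try:
            with wave.open(str(path), "rb") as wav_file:
                if wav_file.getnframes() == 0:
                    problems.append("Reference WAV file contains no audio frames.")
        except (wave.Error, EOFError):
            problems.append("Reference file is not a readable PCM WAV file.")
    return problems
```

For example, `check_profile(VoiceProfile("demo", "ref.mp3", "hello", "English"))` reports that the path does not end in .wav.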

4. Configure the Voice

Once the new voice is created, return to the Voice Settings menu. Under the GPT-SoVits Voice Configuration section, select the newly created voice from the list to use it for Text-to-Voice output.

GPT-SoVits-Voice-Selection

By following these steps, you'll be able to properly configure GPT-SoVits and create a new voice profile.


Voice-to-Text Configuration

OpenAI and Groq Voice-to-Text Configuration

1. Check if SoX is Installed and Configured

Before running Groq or OpenAI Voice-to-Text, make sure that SoX (Sound eXchange) is installed and available in your system environment. You can verify this by running:

sox --version
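
The same check can be scripted, for example as part of a setup routine. The sketch below assumes only that the executable is named `sox` and is discoverable on your PATH. The function name is illustrative.

```python
import shutil
import subprocess
from typing import Optional


def sox_version() -> Optional[str]:
    """Return the output of `sox --version`, or None if SoX is not on PATH."""
    executable = shutil.which("sox")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    # Some builds print the version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = sox_version()
    if version is None:
        print("SoX was not found on PATH. Install it or fix your PATH.")
    else:
        print(f"SoX found: {version}")
```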

2. Run the Groq or OpenAI Voice-to-Text Service

Once SoX is installed and properly configured, you can record audio with Groq or OpenAI. If SoX is not set up correctly, Groq or OpenAI will likely display an error message related to audio processing. If you encounter errors during SoX installation or while running Groq or OpenAI, check your SoX installation first.

VSCode Built-In Configuration

To use VSCode Built-In for Voice-to-Text, follow these steps:

  1. Open VS Code.
  2. Navigate to the VS Code Speech extension on the VS Code Marketplace.
    vscode-speech-installation
  3. Install the extension by clicking the "Install" button.
  4. After installation, configure the extension in your VS Code settings as needed.

note

Always restart VS Code after installing the extension to apply the changes.
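
To confirm from a script that the extension is installed, you can parse the output of the VS Code command-line interface. The sketch below assumes the `code` command is on your PATH and that the Marketplace ID of the VS Code Speech extension is `ms-vscode.vscode-speech`. Verify the ID on the extension's Marketplace page before relying on it.

```python
import shutil
import subprocess
from typing import Optional

# Assumption: Marketplace ID of the VS Code Speech extension.
SPEECH_EXTENSION_ID = "ms-vscode.vscode-speech"


def extension_listed(listing: str, extension_id: str = SPEECH_EXTENSION_ID) -> bool:
    """Return True if extension_id appears in `code --list-extensions` output."""
    installed = {line.strip().lower() for line in listing.splitlines()}
    return extension_id.lower() in installed


def speech_extension_installed() -> Optional[bool]:
    """Return None if the `code` CLI is missing, else whether the extension is listed."""
    code_cli = shutil.which("code")
    if code_cli is None:
        return None
    result = subprocess.run(
        [code_cli, "--list-extensions"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return extension_listed(result.stdout)
```

A `None` result means the `code` command is not available on PATH, so the check could not run at all.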