PlayHT - LowCodeAPI

PlayHT

7 APIs

Streaming

Stream Audio From Text

POST https://api.lowcodeapi.com/playht/api/v2/tts/stream

Request Payload

{
  "text": "",
  "voice": ""
}

The payload is sent as application/json in the request body.
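As a minimal sketch of the request described above, the following Python builds (but does not send) a POST to the endpoint, with the payload serialized as JSON. The helper name `build_stream_request` is hypothetical, and authentication is not covered in this section, so any token or key the service requires is left out.

```python
import json
import urllib.request

STREAM_URL = "https://api.lowcodeapi.com/playht/api/v2/tts/stream"


def build_stream_request(text: str, voice: str, **optional) -> urllib.request.Request:
    """Build a POST request whose body is the JSON payload.

    Hypothetical helper: authentication is not described in this section
    and is deliberately omitted.
    """
    # "text" and "voice" are the two fields in the sample payload;
    # any other documented field can be passed as a keyword argument.
    payload = {"text": text, "voice": voice, **optional}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        STREAM_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately makes the method, header, and body easy to inspect first.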

Payload

seed (number)

An integer number greater than or equal to 0. If equal to null or not provided, a random seed will be used. Useful to control the reproducibility of the generated audio. Assuming all other properties didn't change, a fixed seed should always generate the exact same audio file.

text (string, required)

The text to be converted to speech. (Defaults to Hello from a realistic voice.)

speed (number)

Controls how fast the generated audio is. A number greater than 0 and less than or equal to 5.0.

voice (string, required)

The unique ID for a PlayHT or Cloned Voice. (Defaults to s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json)

emotion (enum)

An emotion to be applied to the speech. Only supported when voice_engine is set to PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine.

quality (enum)

The quality of the audio.

sample_rate (number)

A number greater than or equal to 8000 and less than or equal to 48000.

temperature (number)

A floating point number between 0 and 2, inclusive. If equal to null or not provided, the model's default temperature will be used. The temperature parameter controls variance. Lower temperatures give more predictable results; higher temperatures allow each run to vary more, so the voice may sound less like the baseline voice.
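The numeric limits stated so far (seed, speed, sample_rate, temperature) can be checked on the client before sending. The sketch below is one way to do that. The function name `check_basic_ranges` is hypothetical, and treating `None` as "not provided" for speed and sample_rate is an assumption, since the text only mentions null for seed and temperature.

```python
from typing import Any, Dict, List


def check_basic_ranges(payload: Dict[str, Any]) -> List[str]:
    """Return a list of range problems found in the payload (empty if none)."""
    problems: List[str] = []

    seed = payload.get("seed")
    # bool is a subclass of int in Python, so reject it explicitly.
    if seed is not None and (
        isinstance(seed, bool) or not isinstance(seed, int) or seed < 0
    ):
        problems.append("seed must be an integer >= 0")

    speed = payload.get("speed")
    if speed is not None and not (0 < speed <= 5.0):
        problems.append("speed must be > 0 and <= 5.0")

    sample_rate = payload.get("sample_rate")
    if sample_rate is not None and not (8000 <= sample_rate <= 48000):
        problems.append("sample_rate must be between 8000 and 48000")

    temperature = payload.get("temperature")
    if temperature is not None and not (0 <= temperature <= 2):
        problems.append("temperature must be between 0 and 2")

    return problems
```

An empty list means none of the documented limits were violated. How the service itself responds to out-of-range values is not described here.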

voice_engine (enum)

The voice engine used to synthesize the voice.

output_format (enum)

The format for the output audio. Note that PlayHT1.0 engine voices and JSON output format only support 'mp3' and 'mulaw'. (Defaults to mp3)

text_guidance (number)

A number between 1 and 2. This number influences how closely the generated speech adheres to the input text. Use lower values to create more fluid speech, but with a higher chance of deviating from the input text. Higher numbers will make the generated speech more accurate to the input text, ensuring that the words spoken align closely with the provided text. Only supported when voice_engine is set to PlayHT2.0, and voice uses that engine.

style_guidance (number)

A number between 1 and 30. Use lower numbers to reduce how strong your chosen emotion will be. Higher numbers will create a very emotional performance. Only supported when voice_engine is set to PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine.

voice_guidance (number)

A number between 1 and 6. Use lower numbers to reduce how unique your chosen voice will be compared to other voices. Higher numbers will maximize its individuality. Only supported when voice_engine is set to PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine.
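Several fields above are tied to specific voice engines, which is easy to get wrong when assembling a payload by hand. The following sketch encodes those rules as a lookup table. The names `ENGINE_RULES` and `check_guidance` are hypothetical, the bounds are assumed to be inclusive, and the engine check is skipped when voice_engine is omitted because the default engine is not stated here. It cannot verify that the chosen voice itself uses the engine.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

# For each field: the engines it is documented to work with,
# and its numeric range (None for the non-numeric emotion enum).
ENGINE_RULES: Dict[str, Tuple[Set[str], Optional[Tuple[float, float]]]] = {
    "emotion": ({"PlayHT2.0", "PlayHT2.0-turbo"}, None),
    "text_guidance": ({"PlayHT2.0"}, (1, 2)),
    "style_guidance": ({"PlayHT2.0", "PlayHT2.0-turbo"}, (1, 30)),
    "voice_guidance": ({"PlayHT2.0", "PlayHT2.0-turbo"}, (1, 6)),
}


def check_guidance(payload: Dict[str, Any]) -> List[str]:
    """Return engine and range problems for the engine-gated fields."""
    problems: List[str] = []
    engine = payload.get("voice_engine")
    for field, (engines, bounds) in ENGINE_RULES.items():
        value = payload.get(field)
        if value is None:
            continue  # field not set, nothing to check
        # Assumption: with no explicit engine we cannot tell, so do not flag.
        if engine is not None and engine not in engines:
            problems.append(
                f"{field} is not supported with voice_engine={engine!r}"
            )
        if bounds is not None and not (bounds[0] <= value <= bounds[1]):
            problems.append(
                f"{field} must be between {bounds[0]} and {bounds[1]}"
            )
    return problems
```

For example, a payload that sets text_guidance together with PlayHT2.0-turbo is flagged, because the text above lists only PlayHT2.0 for that field.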


Created by @samalgorai