Releases: fishaudio/fish-speech

V1.4.1

15 Sep 09:49

This release includes bug fixes and container optimizations.

Fish Speech V1.4 Release

12 Sep 14:38
a817507

Fish Speech V1.4 is a leading text-to-speech (TTS) model trained on roughly 700k hours of multilingual audio data.

Supported languages:

  • English (en) ~300k hours
  • Chinese (zh) ~300k hours
  • German (de) ~20k hours
  • Japanese (ja) ~20k hours
  • French (fr) ~20k hours
  • Spanish (es) ~20k hours
  • Korean (ko) ~20k hours
  • Arabic (ar) ~20k hours
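
As a rough illustration of the data mix, the per-language figures listed above can be tallied as in the sketch below. The numbers are the approximate values from the list, and the names `HOURS_K` and `language_shares` are hypothetical helpers for this example, not part of Fish Speech. The listed figures add up to about 720k hours, which is consistent with the rounded 700k total.

```python
# Approximate training hours per language, in thousands of hours,
# copied from the list above. These are rounded figures, not exact counts.
HOURS_K = {
    "en": 300,
    "zh": 300,
    "de": 20,
    "ja": 20,
    "fr": 20,
    "es": 20,
    "ko": 20,
    "ar": 20,
}


def language_shares(hours):
    """Return each language's fraction of the total hours (hypothetical helper)."""
    total = sum(hours.values())
    return {lang: h / total for lang, h in hours.items()}


total_k = sum(HOURS_K.values())
shares = language_shares(HOURS_K)

if __name__ == "__main__":
    print(f"total: ~{total_k}k hours")
    for lang, share in shares.items():
        print(f"{lang}: {share:.1%}")
```

By these figures, English and Chinese together account for most of the training audio, with the remaining six languages contributing equal, much smaller portions.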

Have fun :)

V1.2.1

10 Sep 00:24
237f4fd

This is the final stable release before the 1.4 release on Sep 10.

Fish Speech V1.2 Release

18 Jul 16:41
dc250ab

In this release, we roll out both the 1.2 pretrained and SFT models, and add support for auto-reranking for more stable generation.

V1.1.2

02 Jul 04:55
97e8e3c

This is the final stable release before 1.2.

v1.1.1

08 Jun 16:58
46dae9b

This release improves overall performance and user experience, and includes many bug fixes.

v1.1.0

11 May 14:20
46594de

In this release, we added the VITS decoder module, which provides better phone-level accuracy and semantic similarity.

v1.0.0

30 Apr 06:49
44d5fc7

This is a major release of Fish Speech. Models can be found on HuggingFace.
A live demo is available on HuggingFace Space and Fish Audio.

Models are released under the CC BY-NC-SA 4.0 License.

v0.2.0

25 Dec 11:55
1ac1938

This version provides a basic model (architecture) implementation, inference acceleration, and a pretrained model.
Most functions and pipelines are tested and working properly.