thunder出版

ada-dataset vol.002

¥11,000 JPY

Tax included.

291 in stock

Translated: JA, CN, TW

Timestamp-aligned English speech-to-text translation data, delivered on a standard CD-ROM. Each disc contains 5–6 long-form English audio recordings (WAV, 16kHz mono) with word-level and segment-level alignment annotations — ready for real-time subtitle training, simultaneous translation research, and ASR fine-tuning.

What's on the disc

Audio	5–6 WAV files, ~10 min each, totaling ~50–60 min per disc
Annotations	JSONL — segment-level timestamps, source transcript, aligned translation, word-level timing with confidence scores
Subtitles	SRT + VTT — bilingual subtitle files for every audio track
Capacity	~580–700 MB per disc

Data structure per audio file

{
  "audio_path": "wavs/013429.wav",
  "source_language": "english",
  "duration_seconds": 142.5,
  "segments": [
    {
      "start": 0.000,
      "end": 4.230,
      "source_text": "We won't feel compelled...",
      "words": [
        {"word": "We", "start": 0.00, "end": 0.15, "score": 0.98},
        {"word": "won't", "start": 0.18, "end": 0.42, "score": 0.95}
      ]
    }
  ]
}
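
A record with this layout can be read with Python's standard `json` module. The sketch below is a minimal example, assuming each JSONL line holds one object shaped like the sample above. The helper name `low_confidence_words` and the 0.96 threshold are illustrative choices, not part of the dataset.

```python
import json

# Sample record mirroring the structure shown above (values copied from it).
SAMPLE_RECORD = {
    "audio_path": "wavs/013429.wav",
    "source_language": "english",
    "duration_seconds": 142.5,
    "segments": [
        {
            "start": 0.000,
            "end": 4.230,
            "source_text": "We won't feel compelled...",
            "words": [
                {"word": "We", "start": 0.00, "end": 0.15, "score": 0.98},
                {"word": "won't", "start": 0.18, "end": 0.42, "score": 0.95},
            ],
        }
    ],
}


def low_confidence_words(record, threshold):
    """Return (word, start, end, score) tuples whose score is below threshold."""
    flagged = []
    for segment in record["segments"]:
        for word in segment["words"]:
            if word["score"] < threshold:
                flagged.append(
                    (word["word"], word["start"], word["end"], word["score"])
                )
    return flagged


# Simulate one JSONL line, then parse it back as a reader would.
line = json.dumps(SAMPLE_RECORD)
record = json.loads(line)
flagged = low_confidence_words(record, threshold=0.96)
print(flagged)
```

Filtering on the per-word `score` is one way to drop uncertain alignments before training. The right threshold depends on the task.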

Series & catalog

This disc is part of an ongoing CD series. Each volume is a self-contained dataset — no other volumes are required. New volumes ship regularly, covering different speech domains: business meetings, news broadcasts, lectures, and casual conversation. Collect them individually or build a comprehensive corpus over time.

License

Once you purchase this CD, the data is yours to use — freely and permanently.

Commercial use — products, SaaS, internal tools, client work
Model training — fine-tune, distill, or train from scratch. No royalties
Modify & derive — transform, augment, merge with your own datasets
No expiration — perpetual license, no recurring fees, no strings attached

Source dataset license is documented in the included LICENSE file on each disc.

Specs

Media	CD-ROM (700 MB)
Audio format	WAV 16kHz 16-bit mono
Audio length	~50–60 min per disc
Annotation	JSONL + SRT + VTT
Alignment	Word-level + segment-level timestamps
Shipping	Physical CD
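
The audio format in the table above can be checked after copying files off the disc. This is a minimal sketch using Python's standard `wave` module. The function names `describe_wav` and `matches_spec` are hypothetical, and the demo runs on a generated one-second silent file rather than on a file from the disc.

```python
import os
import tempfile
import wave

# Expected format taken from the spec table: 16 kHz, 16-bit (2 bytes), mono.
EXPECTED = {"sample_rate": 16000, "sample_width_bytes": 2, "channels": 1}


def describe_wav(path):
    """Read the WAV header and return its format plus duration in seconds."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_seconds": frames / rate,
        }


def matches_spec(info):
    """True when every expected format field matches."""
    return all(info[key] == value for key, value in EXPECTED.items())


# Demo on a generated file: one second of silence in the expected format.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    with wave.open(demo_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(16000)
        out.writeframes(b"\x00\x00" * 16000)
    info = describe_wav(demo_path)

print(info, matches_spec(info))
```

To check a real track, pass its path (for example the `audio_path` value from an annotation record) to `describe_wav`.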

Who this is for

ML engineers building real-time subtitle or live translation systems
Researchers benchmarking long-form speech translation models
Teams training or evaluating ASR with forced-alignment ground truth
Anyone who wants clean, timestamp-aligned English speech data they actually own

Data structure per audio file (with Japanese translation)

{
  "audio_path": "wavs/013429.wav",
  "source_language": "english",
  "target_language": "japanese",
  "duration_seconds": 142.5,
  "segments": [
    {
      "start": 0.000,
      "end": 4.230,
      "source_text": "We won't feel compelled...",
      "target_text": "私たちは強制されることはないだろう…",
      "words": [
        {"word": "We", "start": 0.00, "end": 0.15, "score": 0.98},
        {"word": "won't", "start": 0.18, "end": 0.42, "score": 0.95}
      ]
    }
  ]
}
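
Segment timings map directly onto subtitle cues. The sketch below shows how one segment's `start` and `end` values could be turned into a bilingual SRT cue. The exact layout of the SRT files shipped on the disc is not shown on this page, so the two-line cue format and the helper names `srt_timestamp` and `srt_cue` are assumptions for illustration.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index, start, end, lines):
    """Build one SRT cue: index, time range, then one text line per language."""
    header = f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}"
    return header + "\n" + "\n".join(lines) + "\n"


# Timings and texts copied from the sample segment above.
cue = srt_cue(
    1,
    0.000,
    4.230,
    ["We won't feel compelled...", "私たちは強制されることはないだろう…"],
)
print(cue)
```

Passing the text lines as a list keeps the helper independent of the annotation field names, so the same function works for monolingual or bilingual cues.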
