thunder出版

ada-dataset vol.001

Regular price: ¥11,000 JPY
Tax included.

291 in stock

Timestamp-aligned English speech-to-text translation data, delivered on a standard CD-ROM. Each disc contains 5–6 long-form English audio recordings (WAV, 16kHz mono) with word-level and segment-level alignment annotations — ready for real-time subtitle training, simultaneous translation research, and ASR fine-tuning.

What's on the disc

Audio: 5–6 WAV files, ~10 min each, totaling ~50–60 min per disc
Annotations: JSONL with segment-level timestamps, source transcript, aligned translation, and word-level timing with confidence scores
Subtitles: SRT + VTT, bilingual subtitle files for every audio track
Capacity: ~580–700 MB per disc

Data structure per audio file

{
  "audio_path": "wavs/013429.wav",
  "source_language": "english",
  "duration_seconds": 142.5,
  "segments": [
    {
      "start": 0.000,
      "end": 4.230,
      "source_text": "We won't feel compelled...",
      "words": [
        {"word": "We", "start": 0.00, "end": 0.15, "score": 0.98},
        {"word": "won't", "start": 0.18, "end": 0.42, "score": 0.95}
      ]
    }
  ]
}
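
As a rough illustration of how a record like the one above might be consumed, here is a minimal Python sketch. It assumes each line of the JSONL file holds one such record; the function names and the 0.9 score threshold are hypothetical choices for this example, not part of the product.

```python
import json


def load_records(jsonl_path):
    """Read one annotation record per non-empty line of a JSONL file."""
    records = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def confident_words(record, min_score=0.9):
    """Yield (word, start, end) for words whose score is at least min_score.

    The 0.9 default is an arbitrary example threshold.
    """
    for segment in record["segments"]:
        for w in segment.get("words", []):
            if w["score"] >= min_score:
                yield w["word"], w["start"], w["end"]
```

Filtering on the word-level score is one simple way to keep only well-aligned words when building training targets; the right threshold depends on the task.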

Series & catalog

This disc is part of an ongoing CD series. Each volume is a self-contained dataset — no other volumes are required. New volumes ship regularly, covering different speech domains: business meetings, news broadcasts, lectures, and casual conversation. Collect them individually or build a comprehensive corpus over time.

License

Once you purchase this CD, the data is yours to use — freely and permanently.

  • Commercial use — products, SaaS, internal tools, client work
  • Model training — fine-tune, distill, or train from scratch. No royalties
  • Modify & derive — transform, augment, merge with your own datasets
  • No expiration — perpetual license, no recurring fees, no strings attached

The license of the source dataset is documented in the LICENSE file included on each disc.

Specs

Media: CD-ROM (700 MB)
Audio format: WAV, 16 kHz, 16-bit, mono
Audio length: ~50–60 min per disc
Annotation: JSONL + SRT + VTT
Alignment: Word-level + segment-level timestamps
Shipping: Physical CD
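
For anyone who wants to confirm that a file copied from the disc matches the audio format listed above, the following is a small sketch using Python's standard `wave` module. The helper names and the `EXPECTED` dictionary are hypothetical and simply restate the listed spec (16 kHz, 16-bit, mono).

```python
import wave

# Values restated from the Specs list; 16-bit means 2 bytes per sample.
EXPECTED = {"sample_rate": 16000, "sample_width_bytes": 2, "channels": 1}


def describe_wav(path):
    """Return the basic PCM parameters and duration of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }


def matches_spec(info):
    """True if sample rate, sample width and channel count match EXPECTED."""
    return all(info[key] == value for key, value in EXPECTED.items())
```

Calling `matches_spec(describe_wav(path))` on each track is a quick sanity check before feeding the audio into a training pipeline.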

Who this is for

  • ML engineers building real-time subtitle or live translation systems
  • Researchers benchmarking long-form speech translation models
  • Teams training or evaluating ASR with forced-alignment ground truth
  • Anyone who wants clean, timestamp-aligned English speech data they actually own

 

Data structure per audio file (with Japanese translation)

{
  "audio_path": "wavs/013429.wav",
  "source_language": "english",
  "target_language": "japanese",
  "duration_seconds": 142.5,
  "segments": [
    {
      "start": 0.000,
      "end": 4.230,
      "source_text": "We won't feel compelled...",
      "source_text": "私たちは強制されることはないだろう…",
      "words": [
        {"word": "We", "start": 0.00, "end": 0.15, "score": 0.98},
        {"word": "won't", "start": 0.18, "end": 0.42, "score": 0.95}
      ]
    }
  ]
}
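
The segment timestamps map directly onto subtitle cues. The sketch below shows one possible way to turn a record's `segments` into SRT text in Python. It is an illustration only, not the tool used to produce the SRT files on the disc. The name of the field that carries the translated text is not assumed here: `text_keys` is a hypothetical parameter, so pass whichever key or keys the files on the disc actually use.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments, text_keys=("source_text",)):
    """Build SRT text with one cue per segment.

    text_keys lists the segment fields to print, one per subtitle line;
    keys missing from a segment are skipped.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        lines = [seg[key] for key in text_keys if key in seg]
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            + "\n".join(lines)
        )
    return "\n\n".join(blocks) + "\n"
```

Passing two keys in `text_keys` yields a bilingual cue with the source line on top and the translation underneath.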