Title:
METHOD AND APPARATUS FOR PERFORMING SPEAKER DIARIZATION ON MIXED-BANDWIDTH SPEECH SIGNALS
Document Type and Number:
WIPO Patent Application WO/2023/101343
Kind Code:
A1
Abstract:
An apparatus for processing speech data may include a processor configured to: separate an input speech into speech signals; identify a bandwidth of each of the speech signals; extract speaker embeddings from the speech signals based on the bandwidth of each of the speech signals, using at least one neural network configured to receive the speech signals and output the speaker embeddings; and cluster the speaker embeddings into one or more speaker clusters, each speaker cluster corresponding to a speaker identity.
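
The four steps in the abstract (separate, identify bandwidth, extract embeddings by bandwidth, cluster) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the application: the 4 kHz cutoff and energy-ratio test for bandwidth, the names `high_band_ratio`, `identify_bandwidth`, `cluster_embeddings`, and `diarize`, the caller-supplied `extractors` mapping that stands in for the neural network(s), and the greedy cosine-similarity clustering. The input is assumed to be already separated into segments.

```python
import cmath
import math


def high_band_ratio(frame, sample_rate, cutoff_hz=4000.0):
    """Fraction of spectral energy above cutoff_hz, using a naive DFT.

    Assumption: a short frame (a few hundred samples), so O(n^2) is fine.
    """
    n = len(frame)
    total = high = 0.0
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n > cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0


def identify_bandwidth(frame, sample_rate, threshold=0.01):
    """Label a frame 'wideband' if it has noticeable energy above 4 kHz.

    The threshold is a hypothetical value, not one from the application.
    """
    ratio = high_band_ratio(frame, sample_rate)
    return "wideband" if ratio > threshold else "narrowband"


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def cluster_embeddings(embeddings, threshold=0.8):
    """Greedy clustering: join the first cluster whose representative
    (its first member) is similar enough, otherwise open a new cluster.

    This is a simple stand-in, not the clustering method of the application.
    """
    representatives = []
    labels = []
    for emb in embeddings:
        for label, rep in enumerate(representatives):
            if cosine(emb, rep) >= threshold:
                labels.append(label)
                break
        else:
            representatives.append(emb)
            labels.append(len(representatives) - 1)
    return labels


def diarize(segments, sample_rate, extractors, threshold=0.8):
    """Return one speaker-cluster label per already-separated segment.

    `extractors` maps 'narrowband'/'wideband' to a callable that turns a
    segment into an embedding; in a real system these would be neural
    networks, here they are supplied by the caller.
    """
    embeddings = []
    for segment in segments:
        bandwidth = identify_bandwidth(segment, sample_rate)
        embeddings.append(extractors[bandwidth](segment))
    return cluster_embeddings(embeddings, threshold)
```

Passing the extractors in as a mapping keeps the bandwidth-dependent choice explicit: the same `diarize` call works whether one shared model or separate narrowband and wideband models are plugged in.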

Inventors:
KIM MYUNGJONG (US)
ANAPSINGEKAR VIJENDRA RAJ (US)
ANSHU AVIRAL (US)
KI TAEYEON (US)
Application Number:
PCT/KR2022/018957
Publication Date:
June 08, 2023
Filing Date:
November 28, 2022
Assignee:
SAMSUNG ELECTRONICS CO LTD (KR)
International Classes:
G10L17/02; G10L17/04; G10L17/18; G10L21/0272; G10L25/18
Foreign References:
JP2000330590A    2000-11-30
KR20190092379A    2019-08-07
Other References:
QUAN WANG; CARLTON DOWNEY; LI WAN; PHILIP ANDREW MANSFIELD; IGNACIO LOPEZ MORENO: "Speaker Diarization with LSTM", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 28 October 2017 (2017-10-28), XP080996820
LEE KONG AIK; YAMAMOTO HITOSHI; OKABE KOJI; WANG QIONGQIONG; GUO LING; KOSHINAKA TAKAFUMI; ZHANG JIACEN; SHINODA KOICHI: "NEC-TT System for Mixed-Bandwidth and Multi-Domain Speaker Recognition", COMPUTER SPEECH AND LANGUAGE, ELSEVIER, LONDON, GB, vol. 61, 13 November 2019 (2019-11-13), XP085977603, ISSN: 0885-2308, DOI: 10.1016/j.csl.2019.101033
LARCHER ANTHONY; MEHRISH AMBUJ; TAHON MARIE; MEIGNIER SYLVAIN; CARRIVE JEAN; DOUKHAN DAVID; GALIBERT OLIVIER; EVANS NICHOLAS: "Speaker Embeddings for Diarization of Broadcast Data In The Allies Challenge", ICASSP 2021 - 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), IEEE, 6 June 2021 (2021-06-06), pages 5799-5803, XP033955445, DOI: 10.1109/ICASSP39728.2021.9414215
Attorney, Agent or Firm:
Y.P.LEE, MOCK & PARTNERS (KR)