Abstract: In complex multi-speaker scenarios with significant speaker overlap and background noise, extracting the target speaker's speech remains a major challenge. This capability is crucial for ...