Question

I am using the Google speech_recognition API. It automatically filters out profanity and returns strings such as "F***" or "P******".

Here is my code. It runs without errors, but how can I get the original, unfiltered text transcribed from my audio?

    from gtts import gTTS
    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')

Answer 1

profanity_filter

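Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character of each filtered word with asterisks, e.g. "F***". If set to false or omitted, profanities are not filtered out.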
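To make the masking rule concrete, here is a small local illustration of what the description says: keep the first character, replace the rest with asterisks. This is only a sketch and not Google's implementation; `BLOCKED_WORDS`, `mask_word` and `apply_profanity_filter` are hypothetical names made up for this example.

```python
import re

# Hypothetical word list, for illustration only; the real service uses its own.
BLOCKED_WORDS = {"darn", "heck"}


def mask_word(word):
    """Keep the first character and replace every other character with '*'."""
    if not word:
        return word
    return word[0] + "*" * (len(word) - 1)


def apply_profanity_filter(text, enabled=False):
    """Mask blocked words when enabled; return the text unchanged otherwise."""
    if not enabled:
        return text

    def replace(match):
        word = match.group()
        return mask_word(word) if word.lower() in BLOCKED_WORDS else word

    return re.sub(r"[A-Za-z]+", replace, text)


print(apply_profanity_filter("well darn it", enabled=True))  # well d*** it
print(apply_profanity_filter("well darn it"))                # well darn it
```

With the filter disabled or omitted the text comes back untouched, which is the behaviour the question is asking for.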

Search: https://googlecloudplatform.github.io/google-cloud-python/latest/search.html?q=profanity_filter&check_keywords=yes&area=default

Example:

https://googlecloudplatform.github.io/google-cloud-python/latest/speech/index.html?highlight=profanity_filter#synchronous-recognition

An example that uses the profanity filter:

>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='LINEAR16',
...         language_code='en-US',
...         profanity_filter=True,
...         sample_rate_hertz=44100,
...     ),
... )
>>> for result in results:
...     for alternative in result.alternatives:
...         print('=' * 20)
...         print('transcript: ' + alternative.transcript)
...         print('confidence: ' + str(alternative.confidence))
====================
transcript: Hello, this is a f****** test
confidence: 0.81

Nice example ;-)

(I have not tested this.)
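For the speech_recognition code in the question, a possible route is the `pfilter` argument of `recognize_google`, where 0 means no filtering and 1 means masking. This is a sketch under the assumption that the installed SpeechRecognition release accepts `pfilter`; older releases may not, so check your version. `transcribe_unfiltered` is a hypothetical helper name.

```python
def transcribe_unfiltered(recognizer, audio):
    """Return the lowercased transcript, asking for no profanity masking."""
    # Assumption: this SpeechRecognition release supports the `pfilter`
    # keyword (0 = no filter, 1 = mask all but the first character).
    return recognizer.recognize_google(audio, pfilter=0).lower()


# Usage with the question's code (needs a microphone and network access):
#   import speech_recognition as sr
#   r = sr.Recognizer()
#   with sr.Microphone() as source:
#       r.adjust_for_ambient_noise(source, duration=1)
#       audio = r.listen(source)
#   print('You said: ' + transcribe_unfiltered(r, audio))
```

If your version rejects the `pfilter` keyword with a `TypeError`, upgrading the package is probably the simplest thing to try.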
