recognize_google(audio) filters out bad words

Time: 2018-04-14 02:58:08

Tags: python speech-recognition

I am using the Google speech_recognition API for this. It automatically filters out bad words and returns a string such as "F***" or "P******".

Here is my code. It runs without errors, but please help me get the original, unfiltered text transcribed from my audio.

    from gtts import gTTS
    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')

1 answer:

Answer 0 (score: 3)

  

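profanity_filter

Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. "F***". If set to false or omitted, profanities will not be filtered out.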
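To make the masking rule described above concrete, here is a minimal sketch in plain Python. The helper names `mask_word` and `looks_filtered` are hypothetical and are not part of any Google library. They only imitate the documented behaviour (keep the first character, replace the rest with asterisks) and offer a rough heuristic for spotting a transcript that has already been filtered.

```python
import re


def mask_word(word):
    """Keep the first character and replace every other character with '*'.

    Hypothetical helper that mimics the masking described in the docs.
    """
    return word[:1] + "*" * (len(word) - 1)


def looks_filtered(transcript):
    """Heuristic: True if some token is a word character followed by asterisks."""
    return re.search(r"\b\w\*+", transcript) is not None
```

A check like `looks_filtered(command)` can be used to confirm whether the service is still masking words after you change the filter setting. It is only a heuristic, so text that legitimately contains a character followed by an asterisk will also match.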

Search: https://googlecloudplatform.github.io/google-cloud-python/latest/search.html?q=profanity_filter&check_keywords=yes&area=default

Example:

https://googlecloudplatform.github.io/google-cloud-python/latest/speech/index.html?highlight=profanity_filter#synchronous-recognition

  

Example of using the profanity filter.

>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='LINEAR16',
...         language_code='en-US',
...         profanity_filter=True,
...         sample_rate_hertz=44100,
...     ),
... )
>>> for result in results:
...     for alternative in result.alternatives:
...         print('=' * 20)
...         print('transcript: ' + alternative.transcript)
...         print('confidence: ' + str(alternative.confidence))
====================
transcript: Hello, this is a f****** test
confidence: 0.81

Nice example ;-)

(I have not tested this.)
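For the `speech_recognition` package used in the question, newer releases appear to accept a `pfilter` keyword on `recognize_google`, where `0` means no filtering and `1` masks all but the first character. This is an assumption to verify against your installed version, since older releases do not have the keyword. The sketch below uses a hypothetical helper, `recognize_without_filter`, which passes `pfilter=0` only when the installed method advertises that parameter.

```python
import inspect


def recognize_without_filter(recognizer, audio, language="en-US"):
    """Call recognizer.recognize_google, asking for no profanity filter if possible.

    Assumption: newer speech_recognition releases expose a `pfilter` keyword
    (0 = no filter). If the keyword is missing, fall back to the plain call.
    """
    params = inspect.signature(recognizer.recognize_google).parameters
    kwargs = {"language": language}
    if "pfilter" in params:
        kwargs["pfilter"] = 0  # 0 is assumed to mean "do not mask profanity"
    return recognizer.recognize_google(audio, **kwargs)
```

With the question's code this would be called as `command = recognize_without_filter(r, audio).lower()`. If the installed version has no `pfilter` keyword, the result may still be masked, and upgrading the package or using the Cloud Speech client with `profanity_filter=False` would be the alternatives to try.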