Pattern matching in a file using a shell script

Date: 2016-03-27 12:02:37

Tags: regex bash shell awk sed

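I have a file containing hundreds of words. Some parts of the file contain words that start with an underscore (_) and are separated by commas (,). I want to find the comma-separated words that start with an underscore and save them in an array. How can I do this?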

I tried cat <filename> | grep _*

But it lists whole lines rather than words.

Sample file (an Apple tbd file). I want to get the list of words that start with an underscore.

 archs:           [ armv7, armv7s, arm64 ]
platform:        ios
install-name:    /System/Library/Frameworks/AVFoundation.framework/AVFoundation
current-version: 2.0
objc-constraint: retain_release
exports:
  - archs:           [ armv7, armv7s, arm64 ]
    re-exports:      [ /System/Library/Frameworks/AVFoundation.framework/libAVFAudio.dylib ]
    symbols:         [ _AVAssetAssociatedSubtitlesTrackReferencesKey, _AVAssetChapterListTrackReferencesKey,
                       _AVAssetChapterMetadataGroupsDidChangeNotification,
                       _AVAssetDownloadSessionAirPlayAuthorizationInfoKey,
                       _AVAssetDownloadSessionCachePrimingDownloadTokenKey,
                       _AVAssetDownloadSessionClientAuditTokenKey, _AVAssetDownloadSessionClientBundleIdentifierKey,
                       _AVAssetDownloadSessionCurrentLoadedTimeRangesKey,
                       _AVAssetDownloadSessionDeleteDownloadWhenAssetFinalizesKey,
                       _AVAssetDownloadSessionDidResolveMediaSelectionNotification,
                       _AVAssetDownloadSessionDownloadFailedNotification,
                       _AVAssetDownloadSessionDownloadSucceededNotification,
                       _AVAssetDownloadSessionFileSizeAvailableNotification,
                       _AVAssetDownloadSessionHTTPCookiesKey, _AVAssetDownloadSessionHTTPHeaderFieldsKey,
                       _AVAssetDownloadSessionLoadedTimeRangesChangedNotification,
                       _AVAssetDownloadSessionMaxSizeAllowedForCellularAccessKey,
                       _AVAssetDownloadSessionMediaSelectionKey, _AVAssetDownloadSessionMinimumRequiredMediaBitrateKey,
                       _AVAssetDownloadSessionNewlyLoadedTimeRangeKey,
                       _AVAssetDownloadSessionOptimizeAccessForLinearMoviePlaybackKey,
                       _AVAssetDownloadSessionPreferredAudibleMediaCharacteristicKey,
                       _AVAssetDownloadSessionPreferredLegibleMediaCharacteristicKey,
                       _AVAssetDownloadSessionPreferredVisualMediaCharacteristicKey,
                       _AVAssetDownloadSessionPriorityKey, _AVAssetDownloadSessionProtectedContentSupportStorageURLKey,
                       _AVAssetDownloadSessionPurchaseBundleKey, _AVAssetDownloadSessioniTunesStoreContentDSIDKey,
                       _AVAssetDownloadSessioniTunesStoreContentDownloadParametersKey,
                       _AVAssetDownloadSessioniTunesStoreContentIDKey,
                       _AVAssetDownloadSessioniTunesStoreContentInfoKey,
                       _AVAssetDownloadSessioniTunesStoreContentPurchasedMediaKindKey,
                       _AVAssetDownloadSessioniTunesStoreContentTypeKey,
                       _AVAssetDownloadSessioniTunesStoreContentUserAgentKey,
                       _AVAssetDownloadTaskMediaSelectionKey, _AVAssetDownloadTaskMinimumRequiredMediaBitrateKey,
                       _AVAssetDurationDidChangeNotification, _AVAssetExportPreset1280x720,
                       _AVAssetExportPreset1920x1080, _AVAssetExportPreset3840x2160,
                       _AVAssetExportPreset3GPRelease6MMS, _AVAssetExportPreset640x480,
                       _AVAssetExportPreset960x540, _AVAssetExportPresetAppleM4A,
                       _AVAssetExportPresetAudioOnlyMMS, _AVAssetExportPresetAuxSmall,

1 answer:

Answer 0 (score: 2)

You can try:

$ grep -o ' _[a-zA-Z0-9_]*' <filename>

Output:

 _AVAssetAssociatedSubtitlesTrackReferencesKey
 _AVAssetChapterListTrackReferencesKey
 _AVAssetChapterMetadataGroupsDidChangeNotification
 _AVAssetDownloadSessionAirPlayAuthorizationInfoKey
 _AVAssetDownloadSessionCachePrimingDownloadTokenKey
 _AVAssetDownloadSessionClientAuditTokenKey
 _AVAssetDownloadSessionClientBundleIdentifierKey
 _AVAssetDownloadSessionCurrentLoadedTimeRangesKey
 _AVAssetDownloadSessionDeleteDownloadWhenAssetFinalizesKey
 _AVAssetDownloadSessionDidResolveMediaSelectionNotification
 ...
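
The character class in the pattern decides where each match ends. Symbols further down the sample file contain digits (for example _AVAssetExportPreset1280x720), and a letters-only class such as [a-zA-Z] stops at the first digit. The following is a minimal sketch of that difference; the temporary file and its three lines are made up for illustration, in the style of the sample above.

```shell
#!/usr/bin/env bash
# Hypothetical sample: three lines in the style of the tbd file above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
objc-constraint: retain_release
    symbols:         [ _AVAssetDurationDidChangeNotification, _AVAssetExportPreset1280x720,
                       _AVAssetExportPreset3GPRelease6MMS, _AVAssetExportPreset640x480,
EOF

# Letters only: each match stops at the first digit.
letters_only=$(grep -o ' _[a-zA-Z]*' "$sample")

# Letters, digits and underscores: the whole symbol is matched.
full=$(grep -o ' _[a-zA-Z0-9_]*' "$sample")

printf '%s\n' "$letters_only"
echo "----"
printf '%s\n' "$full"
rm -f "$sample"
```

The leading space in the pattern keeps `_release` inside `retain_release` from matching, because that underscore follows a letter rather than a space.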

Store the results in an array arr:

$ # word splitting puts each match in its own element and drops the leading space
$ arr=( $(grep -o ' _[a-zA-Z0-9_]*' <filename>) )
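
Since `read -a` consumes only one line of input, whether every match ends up in the array depends on how the shell flattens the text it is given. A rough alternative is to read grep's output one line at a time and append each match. This is only a sketch: the temporary file and its contents are made up for illustration, and the variable name `word` is arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical input file, in the style of the sample above.
file=$(mktemp)
cat > "$file" <<'EOF'
    symbols:         [ _AVAssetChapterListTrackReferencesKey, _AVAssetExportPreset1280x720,
                       _AVAssetExportPresetAppleM4A,
EOF

arr=()
# grep -o prints one match per line; append each line to the array.
while IFS= read -r word; do
    arr+=("${word# }")   # drop the leading space that the pattern matched
done < <(grep -o ' _[a-zA-Z0-9_]*' "$file")

echo "${#arr[@]} words"
printf '%s\n' "${arr[@]}"
rm -f "$file"
```

The loop feeds from process substitution rather than a pipe so that `arr` is filled in the current shell and is still set after the loop ends.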

Print the elements of arr:

$ for elem in "${arr[@]}"
> do
>    echo "${elem}"
> done
_AVAssetAssociatedSubtitlesTrackReferencesKey
_AVAssetChapterListTrackReferencesKey
_AVAssetChapterMetadataGroupsDidChangeNotification
_AVAssetDownloadSessionAirPlayAuthorizationInfoKey
_AVAssetDownloadSessionCachePrimingDownloadTokenKey
_AVAssetDownloadSessionClientAuditTokenKey
_AVAssetDownloadSessionClientBundleIdentifierKey
_AVAssetDownloadSessionCurrentLoadedTimeRangesKey
_AVAssetDownloadSessionDeleteDownloadWhenAssetFinalizesKey
_AVAssetDownloadSessionDidResolveMediaSelectionNotification
...
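
To check that the array holds individual words rather than whole lines, it can help to look at the element count and at single elements by index. A small sketch, assuming `arr` has already been filled; the three values below are copied from the output above and stand in for the real array.

```shell
#!/usr/bin/env bash
# Stand-in contents; in practice arr comes from the grep step.
arr=(_AVAssetAssociatedSubtitlesTrackReferencesKey
     _AVAssetChapterListTrackReferencesKey
     _AVAssetChapterMetadataGroupsDidChangeNotification)

count=${#arr[@]}        # number of elements
first=${arr[0]}         # indices start at 0
last=${arr[count-1]}    # subscripts are evaluated arithmetically

echo "count: $count"
echo "first: $first"
echo "last:  $last"

# Print every element together with its index.
for i in "${!arr[@]}"; do
    printf '%d: %s\n' "$i" "${arr[$i]}"
done
```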