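I have successfully set up a Glue crawler in the AWS console. I now have a CloudFormation template that reproduces the whole setup, except that I can't get the Exclusions: field to work from the template. Background: in the AWS Glue API, the Exclusions: field holds glob patterns used to exclude files or folders in the data store (an S3 data store in my case) that match a given pattern.
Every other value in the template shows up in the crawler's configuration in the Glue console: the S3 target, crawler name, IAM role, and grouping behavior. The only one that does not is the Exclusions field, which the Glue console calls "Exclude patterns". My CFN template passes validation, and I have run the crawler in the hope that the exclusion globs would still have some effect even though they are not displayed, but unfortunately I still can't seem to populate the Exclusions field. What am I missing?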
Here's the S3Target Exclusion AWS Glue API guide
Here's an AWS sample YAML CFN for a Glue Crawler
Here's a helpful YAML string array guide
YAML
CFNCrawlerSecDeraNUM:
  Type: AWS::Glue::Crawler
  Properties:
    Name: !Ref CFNCrawlerName
    Role: !GetAtt CFNRoleSecDERA.Arn
    #Classifiers: none, use the default classifier
    Description: AWS Glue crawler to crawl SecDERA data
    #Schedule: none, use default run-on-demand
    DatabaseName: !Ref CFNDatabaseName
    Targets:
      S3Targets:
        - Exclusions:
            - "*/readme.htm"
            - "*/sub.txt"
            - "*/pre.txt"
            - "*/tag.txt"
        - Path: "s3://sec-input"
    TablePrefix: !Ref CFNTablePrefixName
    SchemaChangePolicy:
      UpdateBehavior: "UPDATE_IN_DATABASE"
      DeleteBehavior: "LOG"
    # Added single schema grouping Glue API option
    Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}},\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"
JSON
"CFNCrawlerSecDeraNUM": {
  "Type": "AWS::Glue::Crawler",
  "Properties": {
    "Name": {
      "Ref": "CFNCrawlerName"
    },
    "Role": {
      "Fn::GetAtt": [
        "CFNRoleSecDERA",
        "Arn"
      ]
    },
    "Description": "AWS Glue crawler to crawl SecDERA data",
    "DatabaseName": {
      "Ref": "CFNDatabaseName"
    },
    "Targets": {
      "S3Targets": [
        {
          "Exclusions": [
            "*/readme.htm",
            "*/sub.txt",
            "*/pre.txt",
            "*/tag.txt"
          ]
        },
        {
          "Path": "s3://sec-input"
        }
      ]
    },
    "TablePrefix": {
      "Ref": "CFNTablePrefixName"
    },
    "SchemaChangePolicy": {
      "UpdateBehavior": "UPDATE_IN_DATABASE",
      "DeleteBehavior": "LOG"
    },
    "Configuration": "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}},\"Grouping\":{\"TableGroupingPolicy\":\"CombineCompatibleSchemas\"}}"
  }
}
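One thing that stands out in the template as posted, and may explain the behavior: S3Targets contains two separate list items, one holding only Exclusions and one holding only Path (the JSON rendering shows them as two objects). CloudFormation would then treat them as two distinct S3 targets, so the exclusions are never attached to the s3://sec-input target. A minimal sketch of the likely fix, assuming the intent is a single target with both properties; it reuses the names from the template above, is trimmed to the relevant properties, and has not been deployed:

```yaml
# Sketch only: same resource and parameter names as the template above.
# Assumed fix: Path and Exclusions merged into ONE S3Targets entry.
CFNCrawlerSecDeraNUM:
  Type: AWS::Glue::Crawler
  Properties:
    Name: !Ref CFNCrawlerName
    Role: !GetAtt CFNRoleSecDERA.Arn
    DatabaseName: !Ref CFNDatabaseName
    Targets:
      S3Targets:
        # A single "-" starts the target; Exclusions is a sibling key
        # of Path inside that same item, not a list item of its own.
        - Path: "s3://sec-input"
          Exclusions:
            - "*/readme.htm"
            - "*/sub.txt"
            - "*/pre.txt"
            - "*/tag.txt"
```

In the JSON form, the equivalent change would be to move "Exclusions" into the same object as "Path" so that the S3Targets array holds one object instead of two.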