【Question Title】: Spring Batch - Read multiple files from S3
【Posted】: 2021-01-04 07:17:33
【Question】:

To read a single file from S3 in Spring Batch, we use:

@Bean
public FlatFileItemReader<Map<String, Object>> itemReader() {
    FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
    reader.setLineMapper(new JsonLineMapper());
    reader.setRecordSeparatorPolicy(new JsonRecordSeparatorPolicy());
    reader.setResource(resourceLoader.getResource("s3://" + amazonS3Bucket + "/" + file));
    return reader;
}

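But if I want to read all the files under a particular folder/key prefix, is there an equivalent for MultiResourceItemReader, like the following (which we use for the local file system)?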

MultiResourceItemReader<UserData> reader = new MultiResourceItemReader<>();
reader.setResources(resources);

【Comments】:

    Tags: spring spring-boot spring-batch spring-batch-tasklet


    【Solution 1】:

    Create a MultiResourceItemReader like this:

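    import java.util.ArrayList;
    import java.util.List;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.S3ObjectSummary;
    import org.springframework.batch.item.file.MultiResourceItemReader;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.core.io.Resource;
    import org.springframework.core.io.ResourceLoader;
    
    @Autowired
    private AmazonS3 s3;
    
    @Autowired
    private ResourceLoader resourceLoader;
    
    
    public MultiResourceItemReader<String> fileItemReader() throws Exception {
    
        List<Resource> resourceList = new ArrayList<>();
    
        String bucket = "bucket"; // put your bucket name here
        String prefix = "path/";  // key prefix ("folder") inside the bucket, without "s3://bucket/"
    
        // Warning: a single listObjects call returns at most 1000 objects
        List<S3ObjectSummary> s3objects = s3.listObjects(bucket, prefix).getObjectSummaries();
    
        // Wrap every listed key in an s3:// resource
        for (S3ObjectSummary it : s3objects)
            resourceList.add(resourceLoader.getResource("s3://" + bucket + "/" + it.getKey()));
    
        Resource[] resources = resourceList.toArray(new Resource[0]);
    
        MultiResourceItemReader<String> reader = new MultiResourceItemReader<>();
        reader.setResources(resources);
        reader.setDelegate(flatFileItemReader());
    
        return reader;
    }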
    

    This reader needs a delegate and a lineMapper, which you can implement like this:

    private FlatFileItemReader<String> flatFileItemReader() throws Exception {
    
        FlatFileItemReader<String> reader = new FlatFileItemReader<>();
        JsonLineMapper lineMapper = new JsonLineMapper();
        reader.setLineMapper(lineMapper);
        reader.afterPropertiesSet();
    
        return reader;
    }
    
    
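    // Pass-through mapper: returns each line unchanged as a String.
    // It shares its simple name with Spring Batch's own JsonLineMapper, so check your imports.
    public class JsonLineMapper implements LineMapper<String> {
    
        @Override
        public String mapLine(String line, int lineNumber) throws Exception {
    
            return line;
        }
    }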
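
The warning in the code above notes that a single listing call returns at most 1000 objects. The usual way around that is to keep fetching pages until the listing reports that nothing is left. The following is only a minimal, dependency-free sketch of that loop: the `Page` class, the `listAllKeys` method, and the fake listing in `main` are hypothetical stand-ins, not part of any library. With the AWS SDK for Java v1 client used in the answer, the same roles would be played by `ObjectListing.isTruncated()` and `AmazonS3.listNextBatchOfObjects(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PagedListing {

    /** One page of a listing: some keys plus a token for the next page (null when done). */
    public static final class Page {
        final List<String> keys;
        final String nextToken;

        public Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    /**
     * Calls fetchPage repeatedly, starting with a null token, until a page
     * reports no next token, and returns all keys in listing order.
     */
    public static List<String> listAllKeys(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.keys);
            token = page.nextToken;
        } while (token != null);
        return all;
    }

    public static void main(String[] args) {
        // Fake two-page listing standing in for the real S3 call (assumption for illustration).
        Function<String, Page> fake = token -> token == null
                ? new Page(Arrays.asList("path/a.json", "path/b.json"), "page-2")
                : new Page(Arrays.asList("path/c.json"), null);
        System.out.println(listAllKeys(fake));
    }
}
```

Collecting every key before building the Resource array keeps the MultiResourceItemReader setup unchanged; only the listing step grows a loop.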
    

    【Comments】:

    • The same strategy doesn't seem to work in my question
    【Solution 2】:

    No, it is up to you to create the Resource array and pass it to MultiResourceItemReader
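
When building that array from a bucket listing, it may help to decide first which keys should become resources at all. A prefix listing can include a zero-byte "folder" placeholder key such as `path/`, and possibly files of other types. The sketch below uses only the JDK and shows one way to turn listed keys into `s3://bucket/key` location strings; the class `S3Locations`, the method `toLocations`, the bucket name, and the `.json` suffix are all assumptions made for illustration. Each resulting string would then be handed to `resourceLoader.getResource(...)` as in Solution 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class S3Locations {

    /**
     * Builds "s3://bucket/key" locations for the given keys, skipping
     * folder placeholders (keys ending in "/") and keys without the suffix.
     */
    public static List<String> toLocations(String bucket, List<String> keys, String suffix) {
        List<String> locations = new ArrayList<>();
        for (String key : keys) {
            if (key.endsWith("/") || !key.endsWith(suffix)) {
                continue; // not a data file we want to read
            }
            locations.add("s3://" + bucket + "/" + key);
        }
        return locations;
    }

    public static void main(String[] args) {
        // Hypothetical keys as they might come back from a prefix listing.
        List<String> keys = Arrays.asList("path/", "path/a.json", "path/notes.txt", "path/b.json");
        for (String location : toLocations("my-bucket", keys, ".json")) {
            System.out.println(location);
        }
    }
}
```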

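    【Comments】:

    • What I'm asking is: can I use MultiResourceItemReader to read multiple S3 files, and if so, how?
    • Yes, you can. You need to create an array of S3 resources and pass it to MultiResourceItemReader
    • @GauravRaghav - Were you able to implement this? Could you show some code?
    • @Pra_A - Yes, I just posted my answer
    • @GauravRaghav If my answer helped, please accept it: stackoverflow.com/help/someone-answers. Thanks.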