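【Posted】: 2019-05-31 09:26:35
【Question】:
I have the following Spring Batch application (table names and queries have been changed to generic names).
When I run this program, it reads 7500 events, i.e. 3 times the chunk size, and then fails to read the remaining records in the Oracle database. The table contains 50 million records, and it has to be copied to another NoSQL database.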
@EnableBatchProcessing
@SpringBootApplication
@EnableAutoConfiguration
public class MultiThreadPagingApp extends DefaultBatchConfigurer {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    @Bean
    public DataSource dataSource() {
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("oracle.jdbc.OracleDriver");
        dataSource.setUrl("jdbc:oracle:thin:@***********");
        dataSource.setUsername("user");
        dataSource.setPassword("password");
        return dataSource;
    }

    @Override
    public void setDataSource(DataSource dataSource) {}

    @Bean
    @StepScope
    ItemReader<UserModel> dbReader() throws Exception {
        JdbcPagingItemReader<UserModel> reader = new JdbcPagingItemReader<UserModel>();
        final SqlPagingQueryProviderFactoryBean sqlPagingQueryProviderFactoryBean = new SqlPagingQueryProviderFactoryBean();
        sqlPagingQueryProviderFactoryBean.setDataSource(dataSource);
        sqlPagingQueryProviderFactoryBean.setSelectClause("select * ");
        sqlPagingQueryProviderFactoryBean.setFromClause("from user");
        sqlPagingQueryProviderFactoryBean.setWhereClause("where id>0");
        sqlPagingQueryProviderFactoryBean.setSortKey("name");
        reader.setQueryProvider(sqlPagingQueryProviderFactoryBean.getObject());
        reader.setDataSource(dataSource);
        reader.setPageSize(2500);
        reader.setRowMapper(new BeanPropertyRowMapper<>(UserModel.class));
        reader.afterPropertiesSet();
        reader.setSaveState(true);
        System.out.println("Reading users anonymized in chunks of {}"+ 2500);
        return reader;
    }

    @Bean
    public Dbwriter writer() {
        return new Dbwriter(); // I had another class for this
    }

    @Bean
    public Step step1() throws Exception {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.afterPropertiesSet();
        return this.stepBuilderFactory.get("step1")
                .<UserModel, UserModel>chunk(2500)
                .reader(dbReader())
                .writer(writer())
                .taskExecutor(taskExecutor)
                .build();
    }

    @Bean
    public Job multithreadedJob() throws Exception {
        return this.jobBuilderFactory.get("multithreadedJob")
                .start(step1())
                .build();
    }

    @Bean
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public JobRepository getJobRepo() throws Exception {
        return new MapJobRepositoryFactoryBean(getTransactionManager()).getObject();
    }

    public static void main(String[] args) {
        SpringApplication.run(MultiThreadPagingApp.class, args);
    }
}
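Can you help me read all the records efficiently with Spring Batch, or suggest any other approach to this problem? I tried one approach mentioned here: http://techdive.in/java/jdbc-handling-huge-resultset. With it, a single-threaded application takes 120 minutes to read and save all the records. Since Spring Batch is well suited to this kind of work, I assumed it could handle this case quickly.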
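Regarding other approaches: one option often used for tables of this size is to split the id range into contiguous slices and let each worker read only its own slice, instead of having several threads share one paging reader. The sketch below is plain Java with no Spring dependency and only shows the range arithmetic. The class `IdRangePartitionSketch`, its `Range` type, and the assumption that ids run densely from 1 to 50,000,000 are all hypothetical; the 4 workers mirror the core pool size in the code above.

```java
import java.util.ArrayList;
import java.util.List;

class IdRangePartitionSketch {

    /** Inclusive id range [from, to] assigned to one worker (hypothetical type). */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + "]";
        }
    }

    /**
     * Splits the inclusive range [minId, maxId] into at most gridSize
     * contiguous, non-overlapping ranges of near-equal width.
     */
    static List<Range> split(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long width = (total + gridSize - 1) / gridSize; // ceiling division
        List<Range> ranges = new ArrayList<>();
        for (long from = minId; from <= maxId; from += width) {
            long to = Math.min(from + width - 1, maxId);
            ranges.add(new Range(from, to));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumption: ids are dense and run from 1 to 50 million; 4 workers.
        for (Range r : split(1, 50_000_000L, 4)) {
            System.out.println(r);
        }
    }
}
```

With these inputs it prints four slices of 12,500,000 ids each, starting with [1, 12500000]. Each slice could then drive a query such as `where id between ? and ?`. In Spring Batch, a `Partitioner` implementation is the usual place to hand such bounds to each worker step, though that wiring is not shown here and has not been tested against this job.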
【Comments】:
-
Why doesn't the job finish? Is an exception thrown? We need more information.
-
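It doesn't throw any exception. It just completes the job after reading three times the chunk size and then stops. Here is the log:
Job: [SimpleJob: [name=multithreadedJob]] launched with the following parameters: [{}]
INFO 15008 --- [main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step1]
INFO 15008 --- [main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=multithreadedJob]] completed with the following parameters: [{}] and the following status: [COMPLETED]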
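A job that ends as COMPLETED with fewer rows than expected and no exception is also what silent row loss during paging would look like. This is only one thing worth checking, not a confirmed cause here. The reader in the question sorts by `name`, and as far as I understand, Spring Batch's paging query providers fetch each later page with a "sort key greater than the last value seen" condition, which is why the `JdbcPagingItemReader` documentation asks for a unique sort key. The plain-Java simulation below shows the effect on an in-memory list. `KeysetPagingSketch`, `Row`, and the ten sample rows are hypothetical and stand in for the real table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class KeysetPagingSketch {

    /** Hypothetical stand-in for one row of the table. */
    static final class Row {
        final int id;
        final String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Ten rows with unique ids but only two distinct names. */
    static List<Row> sampleTable() {
        List<Row> rows = new ArrayList<>();
        for (int id = 1; id <= 10; id++) {
            rows.add(new Row(id, id <= 4 ? "ann" : "bob"));
        }
        return rows;
    }

    /**
     * Counts the rows returned by keyset paging: each page holds the first
     * pageSize rows whose key is strictly greater than the last key seen.
     */
    static <K extends Comparable<K>> int countRead(
            List<Row> table, Function<Row, K> key, int pageSize) {
        List<Row> sorted = new ArrayList<>(table);
        sorted.sort(Comparator.comparing(key));
        int read = 0;
        K last = null;
        while (true) {
            final K boundary = last;
            List<Row> page = sorted.stream()
                    .filter(r -> boundary == null || key.apply(r).compareTo(boundary) > 0)
                    .limit(pageSize)
                    .collect(Collectors.toList());
            if (page.isEmpty()) {
                break; // no rows beyond the boundary: the reader reports end of data
            }
            read += page.size();
            last = key.apply(page.get(page.size() - 1));
        }
        return read;
    }

    public static void main(String[] args) {
        List<Row> table = sampleTable();
        System.out.println("sorted by name: " + countRead(table, r -> r.name, 3));
        System.out.println("sorted by id:   " + countRead(table, r -> r.id, 3));
    }
}
```

With a page size of 3, paging by `name` returns 6 of the 10 rows, because every row that shares a name with the last row of a page is skipped, while paging by the unique `id` returns all 10. If `name` is not unique in the real table, switching the sort key to a unique column such as `id` would rule this effect out.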
Tags: spring spring-boot spring-integration spring-batch