【Posted】: 2020-08-10 18:48:59
【Question】:
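I need to insert a large amount of data (about 100k rows) into MySQL, so I am trying to use batch inserts with Spring Data JPA. To test this, I am using a simple example with 30 records.
The first thing I did was remove @GeneratedValue, and my entity implements Persistable so that no SELECT is needed before the insert: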
@Entity
public class User implements Persistable {
    @Id
    private Integer id;
    // properties here...
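For context on why Persistable matters here: the repository's save path persists an entity directly when it reports itself as new and merges it otherwise, and a merge of an entity that is not yet loaded needs the current row first. The following is only a minimal sketch of that decision in plain Java. `NewAware`, `SketchUser`, `markPersisted` and `SaveDecisionSketch` are hypothetical stand-ins (not Spring or Hibernate types), declared locally so the example runs without any framework on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two methods of Spring Data's Persistable<ID>.
interface NewAware<ID> {
    ID getId();
    boolean isNew();
}

// Simplified user (assumption): treated as new until markPersisted() is called.
class SketchUser implements NewAware<Integer> {
    private final Integer id;
    private boolean isNew = true;

    SketchUser(Integer id) {
        this.id = id;
    }

    @Override
    public Integer getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return isNew;
    }

    // Hypothetical helper; a real entity would flip this flag itself after saving or loading.
    void markPersisted() {
        this.isNew = false;
    }
}

class SaveDecisionSketch {
    // Mirrors the branch taken on save: "persist" for new entities, "merge" otherwise.
    static String save(NewAware<?> entity, List<String> log) {
        String op = entity.isNew() ? "persist" : "merge";
        log.add(op + " id=" + entity.getId());
        return op;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SketchUser user = new SketchUser(1);
        save(user, log);          // new entity: persist
        user.markPersisted();
        save(user, log);          // no longer new: merge
        System.out.println(log);  // prints [persist id=1, merge id=1]
    }
}
```

If `isNew()` returned false for a freshly created entity, every save in this sketch would take the merge branch, which is the case the Persistable approach is meant to avoid.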
Then, in my application.yml:
spring:
  jpa:
    properties:
      hibernate.jdbc.batch_size: 30
      hibernate.generate_statistics: true
    show-sql: true
    hibernate:
      ddl-auto: validate
  datasource:
    driverClassName: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://localhost:3306/db?cachePrepStmts=true&reWriteBatchedInserts=true
    # user and password
I have a simple repository:
public interface UserRepository extends JpaRepository<User, Integer> { }
And the insert method:
public void process() {
    List<User> users = new ArrayList<>();
    for (int i = 1; i <= 30; i++) {
        User user = new User();
        user.setId(i);
        // set properties
        users.add(user);
        if (i % 30 == 0) {
            userRepository.saveAll(users);
            users.clear();
        }
    }
}
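One detail of the loop above: saveAll is only called when i is an exact multiple of 30, so with a total that is not a multiple of 30 the last partial list would never be saved. Below is a minimal sketch of the same chunking with a final flush. `ChunkedSaveSketch` and `processInChunks` are hypothetical names, the `Consumer` stands in for `userRepository.saveAll`, and plain `Integer` ids stand in for `User` objects so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ChunkedSaveSketch {
    // Feeds ids 1..total to `saver` in chunks of at most `batchSize`,
    // including a final partial chunk. Returns the number of chunks sent.
    static int processInChunks(int total, int batchSize, Consumer<List<Integer>> saver) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Integer> buffer = new ArrayList<>(batchSize);
        int chunks = 0;
        for (int i = 1; i <= total; i++) {
            buffer.add(i);
            if (buffer.size() == batchSize) {
                // Pass a copy so that clearing the buffer does not affect the saver's list.
                saver.accept(new ArrayList<>(buffer));
                buffer.clear();
                chunks++;
            }
        }
        // Flush whatever is left when total is not a multiple of batchSize.
        if (!buffer.isEmpty()) {
            saver.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        int chunks = processInChunks(100, 30, chunk -> sizes.add(chunk.size()));
        System.out.println(chunks + " chunks, sizes " + sizes); // prints 4 chunks, sizes [30, 30, 30, 10]
    }
}
```

With exactly 30 records and a chunk size of 30, as in the test case here, this produces a single chunk, which matches the expectation of one batch.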
So I would expect just 1 batch operation, but I am getting 29 statements:
1745893 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
3524622 nanoseconds spent preparing 30 JDBC statements;
68290171 nanoseconds spent executing 29 JDBC statements;
215125391 nanoseconds spent executing 1 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
240389888 nanoseconds spent executing 1 flushes (flushing a total of 29 entities and 29 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
Any ideas?
Thanks!
【Comments】:
- Have you confirmed whether those 29 executed statements are for the id check, i.e. before the insert, or for some other column, i.e. after the insert?