A project developed at Opower that uses javax.persistence
annotations to load objects in MySQL using its infile stream format. This component is meant for importing large amount of data at a high throughput rate. It supports many of the same features Hibernate supports except with up to 10x performance gain. So for example, if your data takes 60 seconds to be imported into a MySQL database, with jPile it would only take 6 seconds! You don't have to change anything on your model objects. jPile will read the persistence annotations automatically at start up.
The following annotations are supported:
- @Table
- @SecondaryTable
- @SecondaryTables
- @Embedded
- @EmbeddedId
- @Id
- @Column
- @OneToMany
- @ManyToOne
- @OneToOne
- @JoinColumn
- @PrimaryKeyJoinColumn
- @GeneratedValue
- @Temporal
- @Enumerated
jPile cannot rely on MySQL auto_generated
option. Typical database operations save a new row and fetch the last auto generated id. This is not possible when flushing an infile stream to the database. Instead jPile tries to generate its own auto generated ids for any column definition that has @GeneratedValue(strategy = GenerationType.AUTO)
.
jPile allows the client to configure whether entities are updated when inserting into an existing row with a duplicate primary/unique key. There is a slight decrease in performance when using this feature: persisting entities takes around 30-40% longer. Performance of replacing entities decreases, as the number of rows that need to be updated increases.
jPile needs a local MySQL running and Apache Maven. Create a new database schema called 'jpile' using CREATE DATABASE jpile CHARACTER SET utf8 COLLATE utf8_general_ci
. The test classes use root
with no password to login. The username and password is located in AbstractIntTestForJPile
class.
All test cases will automatically create and drop the required tables for integration tests. After creating the local database, you should be able to run mvn clean install
to run all the tests and install locally.
The project is still under development. One of the reasons we decided to go open source was so that other people could improve this project. If you find any bugs, please create a new issue or contact the lead developer on the project. If you have a fix, then please submit a patch. Make sure that you have added new test cases that show what the patch fixes.
jPile is very easy to use. If you are using Maven, then add the following dependency:
<dependency>
<groupId>com.opower</groupId>
<artifactId>jpile</artifactId>
<version>1.7.11</version>
</dependency>
The most common use case is to create a new instance of HierarchicalInfileObjectLoader
. You have to provide a valid database Connection
. HierarchicalInfileObjectLoader
doesn't rely on a database pool because it needs to disable foreign key constraints. Using multiple connections would fail because each new connection would have foreign key constraints enabled by default. Below shows how to do this.
Connection connection = ...;
HierarchicalInfileObjectLoader hierarchicalInfileObjectLoader = new HierarchicalInfileObjectLoader();
try {
hierarchicalInfileObjectLoader.setConnection(connection);
hierarchicalInfileObjectLoader.persist(myEntity);
// Add more using persist()
} finally {
hierarchicalInfileObjectLoader.close();
connection.close(); // Don't forget to close the connection
}
jPile is released on the MIT license which is available in license.txt
to read.
By running the performance test: mvn clean install -Dperformance
25,000 fake objects were created. Each object has a Customer, Contact (One-to-one) and 4 Products (One-to-many) which have a Supplier (Many-to-one). All these objects were saved using simple MySQL prepared statements, Hibernate, and jPile. The results were as follows:
- Prepared Statements - 60s
- Hibernate - 40s
- jPile - 6s