Massive data import! - Part 1
Fabrice Bernhard2 min read
We often have to face the problem of importing data off an Excel file with thousands lines.
PHP is not suited for this task, it’s slow and there is a high risk for the import to crash due to “memory limit” or some other annoying stuff like that!
So instead we chose a better way by using pure SQL which is much faster at this kind of operation.
At first, you must convert your Excel file to CSV (Excel does it very well). Be careful to choose the right field separator: I generally use ”~” because there is little chance of finding this character in your written data.
Steps:
- Create a temporary table that matches exactly the structure of the Excel file
- Fill the temporary table with the CSV file
- Run SQL queries to fill your database
Practical example:
Suppose we have an Excel file containing thousands of users that must be dispatched to several tables depending on their type.
CSV file sample:
User 1~user1@theodo.fr~0987564321~user~~~
User 2~user2@theodo.fr~0134256789~user~~~
User 3~user3@theodo.fr~0128971271~user~~~
Agent 1~agent1@company.com~0486282688~agent~Company 1~Role 1~0987654321
Agent 2~agent2@company.com~0176254621~agent~Company 2~Role 2~0445664332
User 4~user4@company.com~0456789856~user~~~
1. Create the temporary table
We will create a table contain the following fields:
- name
- phone
- type
- company_name
- agent_role
- company_phone
DROP TABLE IF EXISTS user_tmp; CREATE TABLE user_tmp ( name varchar(127), email varchar(127), phone varchar(20), type varchar(20), company_name varchar(127), agent_role varchar(127), company_phone varchar(20), id int(11) NOT NULL auto_increment, PRIMARY KEY (`id`), UNIQUE KEY `IDX_ATTRIBUTE_VALUE` (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
2. Fill the temporary table
Import your CSV file into the temporary table:
LOAD DATA LOCAL INFILE 'PATH\_TO\_YOUR\_CSV\_FILE/users.csv'
INTO TABLE user\_tmp CHARACTER SET 'utf8' FIELDS TERMINATED BY '~' LINES TERMINATED BY '\\n';
3. Fill your own tables
Suppose you have the following two tables:
User
- name
- phone
Agent
- name
- phone
- company_name
- role
- company_phone
Insert data with SQL queries:
INSERT INTO user (name, phone, email) SELECT name, phone, email FROM user_tmp WHERE type = ‘user’;
INSERT INTO agent (name, phone, email, company_name, role, company_phone) SELECT name, phone, email, company_name, agent_role, company_phone FROM user_tmp WHERE type = ‘agent’;
All done! Your tables are complete.
This is a simple example, you can use this method to make more complex data imports (with joins). All you need to do is to adapt your SQL queries.
Here we have seen how we can leverage something fast but apparently limited (LOAD DATA) and make it powerful, by using a temporary table and SQL requests inserting data into the actual tables.