Importing

Basics of Importing

The following is a basic overview of the import process. Please see the subsequent example for commands that are to be used. Complete documentation for the LOAD DATA command to be run within MariaDB may be found at [https://mariadb.com/kb/en/library/load-data-infile/].

The general process is as follows:

(1) Move your data to the appropriate import folder

(2) Within MariaDB, create the database table that will hold the imported data

(3) Within MariaDB, import the data

(4) Validate the import

(5) Remove your data from the import folder

In order to (2) create the database table, you will need to have an understanding of your data. You will need to know the name of all columns, as well as each column’s data type (integer, numeric with decimals, string of characters, etc) and each column’s maximum width. As an example, if one of the columns in your data is US phone numbers of the format 6174953292, then you may opt to use int(10). This tells us that all entries will be integers with up to 10 digits. However, if you suspect some entries have dashes such as 617-495-3292, then you will need to use char(12) which stores the data as a string of characters, up to 12 characters in length.

Please feel free to contact RCS at research@hbs.edu with any questions you may have regarding data import.

Example of Importing

We will use the following to illustrate importing:

MariaDB username = jharvard

MariaDB database = jharvard_database

MariaDB database table = table_import

Import filename = SampleData.txt

Data: 

Column_1

Column_2

Column_3

25

Harvard Way

Boston, MA

which as a file appears as:

Line 1:   Column_1|Column_2|Column_3

Line 2:   25|Harvard Way|Boston, MA

import

Before importing, we will need to create our database table. Creating a table includes specifying the maximum size of each column. We will keep this example simple by specifying each column be char and have a maximum length of 20. Please note that you can modify your table at a later time, for instance if you need to example a column from char(20) to char(30).

To begin, prepare the import folder as described above in the section Prevent Others From Accessing Your Data

            mkdir /export/mdb_external/import/jharvard

            chmod 700 /export/mdb_external/import/jharvard

Move your data to this import folder

            mv SampleData.txt /export/mdb_external/import/jharvard

                        ! Note that mv moves the file, as opposed to cp which copies the file!

                                    cp SampleData.txt /export/mdb_external/import/jharvard

Log into MariaDB

            mysql -h HOSTNAME -u jharvard -p

Within MariaDB, create table that will hold imported data

            use jharvard_database;

            create table table_import (Column_1 char(20), Column_2 char(20), Column_3 char(20));

import2

Within MariaDB, import data

load data local infile ‘/export/mdb_external/import/jharvard/SampleData.txt’ into table table_import fields terminated by ‘|’ lines terminated by ‘\n’ ignore 1 lines;

import3

Please note our command has 4 sections:

(1) load data local infile ‘/export/mdb_external/import/jharvard/SampleData.txt’

specify file to import

(2) into table table_import

specify table that will hold the imported data

(3) fields terminated by ‘|’ lines terminated by ‘\n’

specify delimiters (Click here for more information)

(4) ignore 1 lines

include this only if your file includes column header information

Official documentation for this command may be found at [https://mariadb.com/kb/en/mariadb/load-data-infile/].

We can check do a preliminary check on the first 10 rows of our data via

 select * from table_import limit 10;
import4

Log out of MariaDB

            exit;

Delete import folder  

            rm -rf /export/mdb_external/import/jharvard