5 Best Ways to Import CSV Data into MySQL Table Using Python

💡 Problem Formulation: We often face the challenge of transferring data stored in a CSV file into a MySQL database table. Whether for data analysis, feeding an application, or data migration, moving data from a CSV file into a MySQL table efficiently is a common task. We want to take a CSV file such as ‘data.csv’ and import its contents into a MySQL table named ‘data_table’ in an existing database.

Method 1: Using Python’s csv and MySQLdb Libraries

This method reads the CSV file row by row with Python’s built-in csv module and inserts the data into the MySQL table via the MySQLdb library (on Python 3, provided by the mysqlclient package). It offers a straightforward approach but may not be the fastest for large datasets.

Here’s an example:

import csv
import MySQLdb

# Connect to the database
db = MySQLdb.connect(host='localhost', user='user', passwd='password', db='database_name')
cursor = db.cursor()

# Open the CSV file
with open('data.csv', 'r') as csvfile:
    csv_data = csv.reader(csvfile)
    for row in csv_data:
        # the three %s placeholders assume data_table has exactly three columns
        cursor.execute('INSERT INTO data_table VALUES(%s, %s, %s)', row)

# Commit and close
db.commit()
db.close()

Output:

Rows from ‘data.csv’ are inserted into the ‘data_table’ MySQL table.

The snippet above opens a connection to the MySQL database, then opens the CSV file and reads each row. Using a cursor provided by MySQLdb, it inserts each row into the specified table. After all rows are inserted, the changes are committed and the connection is closed.
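If per-row inserts prove too slow, the rows can be grouped into batches and sent with cursor.executemany() in fewer round trips. The sketch below assumes a small helper, chunked, and uses an in-memory sample CSV in place of ‘data.csv’ (both are illustrative, not part of the original example):

```python
import csv
import io

def chunked(rows, size):
    """Yield successive batches of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit any leftover partial batch
        yield batch

# In-memory stand-in for 'data.csv' (three columns, no header)
sample = "1,Alice,30\n2,Bob,25\n3,Carol,41\n"
rows = csv.reader(io.StringIO(sample))

batches = list(chunked(rows, 2))
# Each batch could then be sent in a single call:
#   cursor.executemany('INSERT INTO data_table VALUES(%s, %s, %s)', batch)
```

Batching keeps memory bounded while cutting the number of network round trips roughly by the batch size.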

Method 2: Using pandas and SQLAlchemy

Pandas offers high-level data manipulation tools, and in conjunction with SQLAlchemy it can read the CSV into a DataFrame and write that DataFrame to a MySQL table. This method is typically faster than the first one and better suited to larger datasets.

Here’s an example:

import pandas as pd
from sqlalchemy import create_engine

# Create engine; a plain 'mysql://' URL uses the default MySQLdb driver,
# while 'mysql+pymysql://' would select the pure-Python PyMySQL driver
engine = create_engine('mysql://user:password@localhost/database_name')

# Read CSV to a DataFrame
df = pd.read_csv('data.csv')

# Insert the dataframe into MySQL
df.to_sql('data_table', con=engine, if_exists='replace', index=False)

Output:

All rows from ‘data.csv’ are inserted into ‘data_table’, with the table being created or replaced if it already exists.

The code creates a SQLAlchemy engine, reads the CSV into a Pandas DataFrame, and then uses the to_sql method to write the DataFrame’s contents to the MySQL table. The if_exists='replace' argument drops and recreates the table if it already exists; use if_exists='append' to keep existing rows.
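One practical detail: if the password contains characters such as ‘@’, ‘:’ or ‘/’, the connection URL must be percent-encoded or create_engine will misparse it. A minimal sketch, with made-up credentials for illustration:

```python
from urllib.parse import quote_plus

user = 'user'
password = 'p@ss:word/!'  # hypothetical password with URL-special characters

# quote_plus escapes '@', ':' and '/' so they don't break the URL structure
url = f"mysql+pymysql://{user}:{quote_plus(password)}@localhost/database_name"
# url == 'mysql+pymysql://user:p%40ss%3Aword%2F%21@localhost/database_name'
```

The resulting string can be passed straight to create_engine().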

Method 3: Using Python’s csv and mysql-connector-python

Using the official mysql-connector-python driver maintained by MySQL, this method emphasizes compatibility and vendor support. It is also straightforward and uses parameterized queries for security.

Here’s an example:

import csv
import mysql.connector

# Connect to the database
cnx = mysql.connector.connect(user='user', password='password', host='localhost', database='database_name')
cursor = cnx.cursor()

# Open the CSV file
with open('data.csv') as csvfile:
    csv_data = csv.reader(csvfile)
    for row in csv_data:
        cursor.execute('INSERT INTO data_table VALUES(%s, %s, %s)', row)

# Commit and close
cnx.commit()
cnx.close()

Output:

Rows from ‘data.csv’ are inserted into the MySQL table ‘data_table’.

After establishing a connection using mysql.connector, the CSV data is read and inserted into the database table row by row. The parameterized query (with %s placeholders) lets the driver escape each value, guarding against SQL injection.
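Rather than hard-coding three %s placeholders, the INSERT statement can be built from the CSV header row so it adapts to however many columns the file has. A sketch using an in-memory sample CSV (the column names and data are illustrative):

```python
import csv
import io

# In-memory stand-in for 'data.csv', this time with a header row
sample = "id,name,age\n1,Alice,30\n2,Bob,25\n"
reader = csv.reader(io.StringIO(sample))

header = next(reader)  # consume the header; it also gives the column count
placeholders = ", ".join(["%s"] * len(header))
sql = f"INSERT INTO data_table ({', '.join(header)}) VALUES ({placeholders})"

rows = [tuple(r) for r in reader]
# With a live connection: cursor.executemany(sql, rows)
```

Note that the column names come from the file here; in production they should be validated against the table schema, since identifiers cannot be passed as %s parameters.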

Method 4: Using the MySQL LOAD DATA INFILE Command

The LOAD DATA INFILE SQL command is a built-in MySQL utility that provides a very fast way to import data from a CSV file directly into a MySQL table. This is generally the fastest method for large imports, but the file must be readable by the server itself (subject to the secure_file_priv setting), unless you use the LOCAL variant to stream the file from the client.

Here’s an example:

import mysql.connector

# Connect to the database
cnx = mysql.connector.connect(user='user', password='password', host='localhost', database='database_name')
cursor = cnx.cursor()

# Execute LOAD DATA INFILE; the path is resolved on the *server*,
# and the secure_file_priv setting may restrict readable directories
cursor.execute(
    "LOAD DATA INFILE 'data.csv' INTO TABLE data_table "
    "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
    "LINES TERMINATED BY '\\n'"
)

# Commit and close
cnx.commit()
cnx.close()

Output:

‘data.csv’ content is bulk-inserted into the ‘data_table’.

This code uses MySQL’s native bulk-import functionality. The LOAD DATA INFILE command specifies the path to the CSV, the target table, and the delimiters used in the file. It is highly efficient for large datasets and typically runs much faster than inserting rows one at a time.
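If secure_file_priv blocks server-side reads, the LOCAL variant streams the file from the client instead; with mysql.connector you must connect with allow_local_infile=True, and the server must have local_infile enabled. Building the statement itself is plain string work (the path below is a hypothetical client-side location):

```python
csv_path = '/tmp/data.csv'  # hypothetical path on the *client* machine

# LOCAL makes the client, not the server, read the file;
# IGNORE 1 LINES skips a header row if the CSV has one
stmt = (
    f"LOAD DATA LOCAL INFILE '{csv_path}' INTO TABLE data_table "
    "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
    "LINES TERMINATED BY '\\n' "
    "IGNORE 1 LINES"
)
# Executed as before: cursor.execute(stmt); cnx.commit()
```

LOCAL trades a little speed for not needing filesystem access on the database host.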

Bonus One-Liner Method 5: Using mysqlimport Utility

If you have shell access and prefer a command-line approach, mysqlimport provides a simple yet powerful one-liner to import CSV files. It’s fast and efficient, but access to the terminal is required. It’s installed by default with the MySQL client programs.

Here’s an example:

mysqlimport --ignore-lines=1 --fields-terminated-by=, --verbose --local -u user -ppassword database_name data_table.csv

Output:

The terminal will show verbose output as the CSV file is imported into the corresponding table in ‘database_name’.

The example shows how to use mysqlimport with options to define field terminators, skip the header line, and pass login credentials. Note that mysqlimport derives the target table name from the CSV file’s base name (minus the extension), so the file must be named after the table — e.g. ‘data_table.csv’ for the table ‘data_table’. The --local option imports the file from the client machine.

Summary/Discussion

  • Method 1: Python’s csv and MySQLdb. Good for small-scale imports with direct control over insert operations. Can be slow for big data.
  • Method 2: pandas with SQLAlchemy. Suited for data scientists and analysts familiar with Pandas. Offers simplicity and efficiency, especially for complex data operations. Requires additional libraries.
  • Method 3: Python’s csv with mysql-connector-python. Official MySQL support and compatibility with well-implemented security features. Moderate speed.
  • Method 4: MySQL LOAD DATA INFILE. The fastest method, but the file must be readable by the MySQL server itself (or the LOCAL variant must be enabled). Not suitable without filesystem access to, or LOCAL support on, the server.
  • Method 5: mysqlimport Utility. Great for quick imports directly from the command line. Requires shell access and is convenient for database administrators.