5 Best Ways to Convert CSV to MySQL Using Python

💡 Problem Formulation: You have a CSV file containing data that needs to be imported into a MySQL database. For example, you want to take "data.csv", which includes several columns of data, and import it directly into an existing MySQL table with matching schema, ensuring data types and character encoding are correctly handled during the import process.
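
The problem statement assumes an existing MySQL table whose columns line up with the CSV. As a minimal sketch, suppose 'data.csv' has three columns named id, name, and email (hypothetical names, not taken from the original file); a matching table could be created with mysql-connector-python like this:

import mysql.connector

# Connect to the target database (adjust credentials to your environment)
db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1', database='mydatabase')
cursor = db_connection.cursor()

# Hypothetical schema matching a three-column data.csv: id, name, email
cursor.execute("""
CREATE TABLE IF NOT EXISTS mytable (
    id INT PRIMARY KEY,
    name VARCHAR(255),
    email VARCHAR(255)
) CHARACTER SET utf8mb4
""")

cursor.close()
db_connection.close()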

Method 1: Using Python's CSV and MySQL Connector Libraries

This method reads the CSV file with Python's built-in csv module and writes the data to MySQL using the mysql-connector-python library. It gives you control over individual row processing and error handling.

Here's an example:

import csv
import mysql.connector

db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1', database='mydatabase')
cursor = db_connection.cursor()

with open('data.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    next(csvreader) # Skip header row
    for row in csvreader:
        # Parameterized insert; the connector safely escapes each value
        cursor.execute('INSERT INTO mytable VALUES (%s, %s, %s)', row)

db_connection.commit()
cursor.close()
db_connection.close()

As a result, each row from the 'data.csv' file is inserted into the 'mytable' table within the 'mydatabase' database on the MySQL server.

In this code snippet, we create a database connection and a cursor to execute SQL commands. We then open the CSV file, skip the header row, and iterate over the remaining rows, inserting each one into the MySQL table. Finally, we commit the transaction and close the cursor and connection. For production code, make sure the MySQL user has the required privileges and add proper error handling.
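
One way to add that error handling is to wrap the inserts in a try/except block so that a failure rolls back the whole import instead of leaving a partial load. Here is a minimal sketch of that pattern, reusing the same hypothetical table and credentials:

import csv
import mysql.connector

db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1', database='mydatabase')
cursor = db_connection.cursor()

try:
    with open('data.csv', 'r', newline='', encoding='utf-8') as csvfile:
        csvreader = csv.reader(csvfile)
        next(csvreader)  # Skip header row
        for row in csvreader:
            cursor.execute('INSERT INTO mytable VALUES (%s, %s, %s)', row)
    db_connection.commit()  # Commit only if every insert succeeded
except mysql.connector.Error as err:
    db_connection.rollback()  # Undo the partial import on any database error
    print(f'Import failed: {err}')
finally:
    cursor.close()
    db_connection.close()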

Method 2: Using the pandas Library

pandas is a powerful Python data analysis toolkit that can read a CSV file directly into a DataFrame and then write the data to a MySQL database with the to_sql method, using an SQLAlchemy engine for the connection.

Here's an example:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql+mysqlconnector://user:password@localhost/mydatabase')  # SQLAlchemy engine for MySQL
df = pd.read_csv('data.csv')  # Load the CSV into a DataFrame
df.to_sql('mytable', con=engine, if_exists='append', index=False)  # Append rows to the existing table

Your data from 'data.csv' is now appended to the 'mytable' table in your 'mydatabase' MySQL database.

The pandas to_sql method is a convenient way to write a DataFrame to a SQL table. Setting index=False prevents the DataFrame's index from being written as an extra column; adjust this based on how you want your data indexed in MySQL.
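
For larger files, to_sql also accepts a chunksize argument so the rows are written in batches rather than in a single statement. A sketch of that variant, using the same hypothetical connection string:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql+mysqlconnector://user:password@localhost/mydatabase')

# Write the CSV in batches of 1,000 rows to keep memory usage and packet sizes modest
df = pd.read_csv('data.csv')
df.to_sql('mytable', con=engine, if_exists='append', index=False, chunksize=1000)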

Method 3: Using Python's csvkit Library

csvkit is a suite of command-line tools for working with CSV files. Its csvsql command can generate SQL statements from a CSV file and can even execute them directly against a database.

Here's an example:

!csvsql --db mysql+mysqlconnector://user:password@localhost/mydatabase --insert data.csv

This example assumes you have csvkit installed and you're running the command in a shell (the exclamation point is a Jupyter notebook convention for running shell commands).

Running the csvsql command reads 'data.csv' and inserts the data into the 'mydatabase' database on the MySQL server. It is a quick method for users comfortable with command-line tools and automation scripts.
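
If you would rather trigger csvsql from a plain Python script instead of a notebook or shell, one option (a sketch, assuming csvkit is installed and on your PATH) is to call it through the subprocess module:

import subprocess

# Run the same csvsql command as above; check=True raises CalledProcessError if it fails
subprocess.run(
    ['csvsql',
     '--db', 'mysql+mysqlconnector://user:password@localhost/mydatabase',
     '--insert',
     'data.csv'],
    check=True,
)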

Method 4: Using MySQL's LOAD DATA INFILE Command

MySQL natively supports loading data from CSV files with the LOAD DATA INFILE SQL command. Though not a Python-specific method, the command can be executed from a Python script by using a connector.

Here's an example:

import mysql.connector

db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1', database='mydatabase')
cursor = db_connection.cursor()

query = """
LOAD DATA INFILE 'absolute/path/to/data.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\\n'
IGNORE 1 ROWS;
"""

cursor.execute(query)
db_connection.commit()
cursor.close()
db_connection.close()

This loads the rows from 'data.csv' into 'mytable', following the specified options for field delimiters, line terminators, and skipping the header line.

This snippet shows how to execute a native MySQL command through Python. Be aware that the file is read by the MySQL server host, so the path should be absolute and accessible to the server process, and the MySQL user must have the FILE privilege. Additionally, security settings such as secure_file_priv might prevent using `LOAD DATA INFILE` on some systems.
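
If the FILE privilege or secure_file_priv gets in the way, a common workaround is the LOCAL variant, where the client reads the file and sends it to the server. A sketch using mysql-connector-python's allow_local_infile connection option (the server must also have local_infile enabled):

import mysql.connector

# allow_local_infile lets the client send the file; the server must also have local_infile=1
db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1',
                                        database='mydatabase', allow_local_infile=True)
cursor = db_connection.cursor()

query = """
LOAD DATA LOCAL INFILE 'data.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\\n'
IGNORE 1 ROWS;
"""

cursor.execute(query)
db_connection.commit()
cursor.close()
db_connection.close()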

Bonus One-Liner Method 5: Using a Shell Command

If Python is not a strict requirement and you have access to the command line, you can use a combination of shell commands to directly send CSV data to MySQL.

Here's an example:

!mysql -u user -p --local-infile=1 -e "LOAD DATA LOCAL INFILE 'data.csv' INTO TABLE mydatabase.mytable FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\\n' IGNORE 1 ROWS" -D mydatabase

This shell one-liner loads 'data.csv' directly into the 'mytable' table in the 'mydatabase' MySQL database, applying the specified format for the CSV file.

This one-liner is succinct and, because it bypasses Python, can be faster, but it requires direct shell access and the MySQL server configuration must permit `LOCAL INFILE`.
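
If you are not sure whether the server permits it, you can check the local_infile system variable before attempting the load; a quick sketch from Python:

import mysql.connector

db_connection = mysql.connector.connect(user='user', password='password', host='127.0.0.1', database='mydatabase')
cursor = db_connection.cursor()

# local_infile must be ON for LOAD DATA LOCAL INFILE to be accepted by the server
cursor.execute("SHOW GLOBAL VARIABLES LIKE 'local_infile'")
print(cursor.fetchone())  # e.g. ('local_infile', 'ON')

cursor.close()
db_connection.close()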

Summary/Discussion

  • Method 1: CSV and MySQL Connector Libraries. Provides fine-grained control and error handling. Requires more code.
  • Method 2: pandas Library. Simplifies the conversion process with powerful data manipulation features. Can be less efficient with large datasets.
  • Method 3: csvkit Library. Ideal for shell-based data operations and automation. Requires separate installation and familiarity with command-line tasks.
  • Method 4: MySQL's LOAD DATA INFILE Command. Leverages MySQL's efficient bulk data-loading capabilities. Security restrictions and privilege requirements may apply.
  • Method 5: Shell One-Liner. Fast and direct, bypassing Python. Requires shell access and specific MySQL configuration settings.