5 Best Ways to Convert CSV to GeoJSON in Python

💡 Problem Formulation: You have a CSV file containing geographical data such as coordinates, place names, or other location-specific information, and you require it in GeoJSON format for use in web mapping applications or geospatial data processing. The CSV input might have columns for latitude and longitude, and the desired output is a GeoJSON feature collection where each row in the CSV becomes a point feature in the GeoJSON object.
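To make the target concrete, here is a minimal sketch of the input and the expected output. The sample rows and the column names (id, lat, lon, name) are assumptions used for illustration, and row_to_feature is a hypothetical helper, not part of any library. Note that GeoJSON orders coordinates as [longitude, latitude].

```python
import csv
import io
import json

# Hypothetical sample input; the column names are assumptions.
SAMPLE_CSV = """id,lat,lon,name
1,48.8584,2.2945,Eiffel Tower
2,51.5007,-0.1246,Big Ben
"""


def row_to_feature(row):
    """Turn one CSV row (a dict of strings) into a GeoJSON point feature."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON puts longitude first, then latitude.
            "coordinates": [float(row["lon"]), float(row["lat"])],
        },
        "properties": {"id": row["id"], "name": row["name"]},
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
collection = {
    "type": "FeatureCollection",
    "features": [row_to_feature(r) for r in rows],
}

# Show the feature produced from the first row.
print(json.dumps(collection["features"][0], indent=2))
```

Each method below produces a structure of this shape; they differ mainly in which library does the reading and writing.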

Method 1: Using Python’s csv and json Modules

This method reads the CSV file with Python’s built-in csv module and builds the GeoJSON with the json module. It gives you manual control over how each row is processed and how the GeoJSON is structured, which makes it a good fit for custom data processing workflows.

Here’s an example:

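import csv
import json

# data.csv is assumed to have no header row, so the column order is given here.
# If the file does have a header, drop the fieldnames argument instead.
fieldnames = ("id", "lat", "lon", "name")

data = {"type": "FeatureCollection", "features": []}
with open('data.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        feature = {"type": "Feature",
                   "geometry": {"type": "Point",
                                # GeoJSON order is [longitude, latitude]
                                "coordinates": [float(row['lon']), float(row['lat'])]},
                   "properties": row}
        data["features"].append(feature)

# The with blocks close both files, so the output is fully written to disk.
with open('data.geojson', 'w', encoding='utf-8') as jsonfile:
    json.dump(data, jsonfile, indent=4)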

Output will be a GeoJSON file with points representing each row of the CSV.

This code snippet reads from a CSV file named ‘data.csv’ and writes a GeoJSON object to ‘data.geojson’. Because the column names are passed to DictReader explicitly, the file is assumed to have no header row. The script builds a Python dictionary that follows the GeoJSON specification: each row becomes a point feature whose coordinates are ordered [longitude, latitude] and whose properties are taken directly from the CSV columns. The dictionary is then written to a file as JSON.
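Real-world CSV files often contain empty or malformed coordinates. The following is a minimal sketch, assuming a file with a header row, of one way to skip such rows and keep the coordinate columns out of the properties. The function name csv_to_features and its parameters are hypothetical, chosen for illustration.

```python
import csv
import io


def csv_to_features(csv_text, lat_field="lat", lon_field="lon"):
    """Return (feature_collection, skipped_line_numbers) for CSV text with a header."""
    features, skipped = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        try:
            lon = float(row[lon_field])
            lat = float(row[lat_field])
        except (KeyError, TypeError, ValueError):
            # Missing column, short row, empty cell, or non-numeric text.
            skipped.append(reader.line_num)
            continue
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            # Out-of-range (or NaN) coordinates are treated as bad input.
            skipped.append(reader.line_num)
            continue
        # Keep every column except the two coordinate columns.
        properties = {k: v for k, v in row.items() if k not in (lat_field, lon_field)}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}, skipped


sample = (
    "id,lat,lon,name\n"
    "1,48.8584,2.2945,Eiffel Tower\n"
    "2,not-a-number,0,Broken\n"
    "3,95.0,10.0,Out of range\n"
)
collection, skipped = csv_to_features(sample)
print(len(collection["features"]), "features;", "skipped lines:", skipped)
```

Returning the skipped line numbers instead of raising lets the caller decide whether a few bad rows are acceptable.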

Method 2: Using GeoPandas Library

GeoPandas is a powerful tool that simplifies working with geospatial data in Python. It can read a variety of data formats, including CSV files, and has built-in methods to convert data frames to GeoJSON. It is particularly well-suited for those who are already familiar with pandas data structures.

Here’s an example:

import geopandas as gpd
import pandas as pd

# Read the table with pandas so lat and lon are parsed as numbers
df = pd.read_csv('data.csv')

# Build point geometries from the coordinate columns (x = lon, y = lat)
# and declare WGS 84, the coordinate reference system GeoJSON uses
gdf = gpd.GeoDataFrame(df,
                       geometry=gpd.points_from_xy(df['lon'], df['lat']),
                       crs='EPSG:4326')
gdf.to_file('data.geojson', driver='GeoJSON')

Output will be a GeoJSON file with points representing each CSV row.

This snippet reads the CSV into a pandas DataFrame, builds Point geometries from the longitude and latitude columns with points_from_xy, and wraps the result in a GeoDataFrame tagged with the WGS 84 coordinate reference system (EPSG:4326) used by GeoJSON. The GeoDataFrame is then exported to a GeoJSON file. It’s a concise way to handle geospatial data conversions.
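If the GeoJSON is needed as a string rather than a file (for example, to return it from a web endpoint), GeoDataFrame.to_json can be used instead of to_file. The sketch below assumes the same hypothetical columns (id, lat, lon, name) and uses in-memory sample data; it also drops the coordinate columns so they are not repeated in each feature's properties.

```python
import io
import json

import geopandas as gpd
import pandas as pd

# Hypothetical sample data standing in for data.csv.
csv_text = (
    "id,lat,lon,name\n"
    "1,48.8584,2.2945,Eiffel Tower\n"
    "2,51.5007,-0.1246,Big Ben\n"
)
df = pd.read_csv(io.StringIO(csv_text))

gdf = gpd.GeoDataFrame(
    df.drop(columns=["lat", "lon"]),  # keep only non-coordinate columns as properties
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",
)

# to_json() returns the FeatureCollection as a JSON string.
geojson_text = gdf.to_json()
parsed = json.loads(geojson_text)
print(parsed["features"][0]["properties"])
```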

Method 3: Using Pandas with simplejson or json

This method uses Pandas to handle the CSV data and the simplejson or json library to write the DataFrame out as GeoJSON. It is ideal for those who want to use the data manipulation strengths of Pandas before exporting to GeoJSON.

Here’s an example:

import pandas as pd
import simplejson as json  # the standard-library json module can be used the same way

df = pd.read_csv('data.csv')
features = []
# to_dict(orient='records') yields one plain dict per row
for row in df.to_dict(orient='records'):
    feature = {"type": "Feature",
               "geometry": {"type": "Point",
                            # GeoJSON order is [longitude, latitude]
                            "coordinates": [float(row['lon']), float(row['lat'])]},
               "properties": row}
    features.append(feature)
geojson = {"type": "FeatureCollection", "features": features}
with open('data.geojson', 'w') as f:
    json.dump(geojson, f, indent=4)

Output will be a formatted GeoJSON file with point features.

This code combines the ease of DataFrame manipulation in Pandas with the flexibility of the json library. After converting the CSV to a DataFrame, it iterates over each row to construct a GeoJSON feature and then writes a FeatureCollection to a file.
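One reason to bring in Pandas is to clean the data before export. The sketch below, which assumes the same hypothetical columns and uses in-memory sample data, coerces the coordinate columns to numbers, drops rows without usable coordinates, and replaces remaining missing values with None, since NaN is not valid JSON.

```python
import io
import json

import pandas as pd

# Hypothetical sample data: row 2 has no latitude, row 3 has no name.
csv_text = (
    "id,lat,lon,name\n"
    "1,48.8584,2.2945,Eiffel Tower\n"
    "2,,-0.1246,Missing latitude\n"
    "3,40.6892,-74.0445,\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Non-numeric coordinates become NaN, then rows lacking either one are dropped.
for col in ("lat", "lon"):
    df[col] = pd.to_numeric(df[col], errors="coerce")
df = df.dropna(subset=["lat", "lon"])

features = []
for rec in df.to_dict(orient="records"):
    lon, lat = float(rec.pop("lon")), float(rec.pop("lat"))
    # Missing values in the remaining columns are written as JSON null.
    properties = {k: (None if pd.isna(v) else v) for k, v in rec.items()}
    features.append({
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    })

# allow_nan=False makes json.dumps raise if any NaN slipped through.
geojson_text = json.dumps(
    {"type": "FeatureCollection", "features": features}, allow_nan=False
)
print(len(features), "features written")
```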

Method 4: Using csv2geojson Tool

csv2geojson is a command-line tool that can convert CSV to GeoJSON without having to write any Python code. This is best suited for quick conversions without much need for data manipulation or customization during the process.

Here’s an example:

!npm install -g csv2geojson
!csv2geojson data.csv > data.geojson

Output will be a new GeoJSON file named ‘data.geojson’.

The example uses shell commands in a Jupyter notebook or a similar environment (the leading ! passes the line to the shell). Note that the csv2geojson command-line tool published by Mapbox is a Node.js package, so it is installed with npm rather than pip and needs Node.js on the machine. After installation, a single command converts the input CSV file, which makes this method a good choice for straightforward conversions that need no custom processing.
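Outside a notebook, the same command can be driven from a Python script with the standard subprocess module. The following is a minimal sketch that assumes a csv2geojson executable is already installed and available on PATH; convert_with_cli is a hypothetical helper name, and the only CLI behavior it relies on is the one shown above (CSV path in, GeoJSON on standard output).

```python
import shutil
import subprocess


def convert_with_cli(csv_path, geojson_path, executable="csv2geojson"):
    """Run the CLI on csv_path and save its standard output to geojson_path."""
    exe = shutil.which(executable)
    if exe is None:
        # Fail early with a clear message if the tool is not installed.
        raise FileNotFoundError(f"{executable} was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(
        [exe, csv_path], capture_output=True, text=True, check=True
    )
    with open(geojson_path, "w", encoding="utf-8") as f:
        f.write(result.stdout)
    return geojson_path
```

Calling convert_with_cli('data.csv', 'data.geojson') would then mirror the shell redirection used in the notebook example.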

Bonus One-Liner Method 5: Using pandas and geojson

Pandas can be combined with the geojson library to perform the CSV to GeoJSON conversion in a single line after initial setup. This is the most compact code for quick conversion tasks within a Python script.

Here’s an example:

import pandas as pd
import geojson

df = pd.read_csv('data.csv')
points = [geojson.Point((row['lon'], row['lat'])) for idx, row in df.iterrows()]
with open('data.geojson', 'w') as f:
    geojson.dump(geojson.FeatureCollection([geojson.Feature(geometry=point) for point in points]), f)

Output will be a ‘data.geojson’ file containing the GeoJSON data.

This is perhaps the most concise way to perform the conversion within Python, using list comprehensions to create GeoJSON Point objects from a pandas DataFrame and then dumping a FeatureCollection of these Points into a file. Note that only the geometry is exported: the other CSV columns are not copied into the features’ properties.

Summary/Discussion

  • Method 1: Using Python’s csv and json Modules. Strengths: Full control over the conversion process, no third-party dependencies, ideal for customized workflows. Weaknesses: More verbose and coding-intensive.
  • Method 2: Using GeoPandas Library. Strengths: Simplifies geospatial data manipulation, integrates well with pandas. Weaknesses: Requires additional library installation, may be overkill for simple tasks.
  • Method 3: Using Pandas with simplejson or json. Strengths: Utilizes pandas’ data handling capabilities, flexible output formatting. Weaknesses: Still requires some manual data structure setup.
  • Method 4: Using csv2geojson Tool. Strengths: Fastest conversion with minimal coding. Weaknesses: Less flexibility in data manipulation, external tool dependency.
  • Method 5: Using pandas and geojson (One-Liner). Strengths: Extremely concise code for conversion. Weaknesses: Exports geometry only unless properties are added, and might be less clear and harder to debug or extend.