Automate Backup to Google Drive with Python

Project Description

Every day I need to back up a folder named Reports to my Google Drive. This folder is essential to me because it holds all my important documents (especially reports), so keeping a copy in the cloud gives me a cushion if anything goes wrong with my local system.

However, this involves a long and repetitive manual process: I have to sign in to my Google account, navigate to the folder where I store my backup files, and then copy and paste the new files into it. This takes time, so I decided to automate the entire process. The result is an automation script that saves me the hassle of backing up my files manually and does so without any fancy third-party backup application.

So, this is what my script does –

  • It connects me to Google Drive automatically.
  • Uploads a fresh copy of my backup folder every time by replacing the old files.
  • The entire backup script is triggered every day at 12:00 AM, thereby ensuring that I do not have to run it manually on a daily basis.

Thus, in this article, I will guide you through implementing the same automated backup technique to store files from your local system in Google Drive on a daily basis using Python. So, without further delay, let’s discover the power of Python automation through our project.

Step 1: Setting up Google Drive API

Before you start writing your script, it is essential to ensure that you have set up your Google Drive API properly. Simply put, the Drive API allows you to interact with Google Drive storage and supports several popular programming languages, such as Java, JavaScript, and Python.

  • Go ahead and log in to your Google Cloud Platform (GCP) console using your Gmail ID.
  • After logging in, create a new Project. Let’s name it FileBackup.
  • Now, you will need to configure and download the client configuration secrets JSON file. So, head over to APIs and Services ⮕ Select OAuth consent screen.
  • Select User Type as External and click Create.
  • Fill in the necessary App Information and click on Save and Continue.
  • There is no need to fill in Scopes for now, so proceed to the next step by clicking on Save and Continue.
  • Add a Test User. Simply enter your email ID here, or you may choose to add another user. However, to keep things simple, I entered my email ID.

You are done with the first step. Now we need the Credentials that will authorise us when we talk to the Google Drive API from our script.

  • Click on Credentials ⮕ CREATE CREDENTIALS ⮕ Select OAuth client ID.
  • Fill in the required details and click on Create.
  • A screen pops up, and you can see your client ID and secret. Download the JSON and make sure to store it in your Project folder with the name client_secrets.json.

The final step is to enable the Google Drive API so that you can communicate with it using the credentials from your script.

  • Select Enabled APIs and services ⮕ click ENABLE APIS AND SERVICES ⮕ Search Google Drive ⮕ Select the Google Drive API ⮕ Click on Enable

That’s it! You are now all set to communicate with the Google Drive API using your script to connect and store files in your Google Drive.
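Before moving on, it can help to confirm that the downloaded file is where the script will look for it and that it resembles an OAuth client file. The following is a minimal sketch, assuming the usual layout in which a top-level "installed" (desktop client) or "web" (web client) section holds client_id and client_secret. The helper name check_client_secrets is hypothetical and is not part of any Google library.

```python
import json
from pathlib import Path


def check_client_secrets(path="client_secrets.json"):
    """Return a list of problems with the OAuth client file (empty if it looks fine)."""
    file = Path(path)
    if not file.is_file():
        return [f"{path} not found"]
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    # Assumption: desktop clients use an "installed" section, web clients use "web".
    section = None
    if isinstance(data, dict):
        section = data.get("installed") or data.get("web")
    if not isinstance(section, dict):
        return ['no "installed" or "web" section found']
    # Report each required field that is absent or empty.
    return [f"missing field: {key}"
            for key in ("client_id", "client_secret") if not section.get(key)]
```

Calling check_client_secrets() from your project folder returns an empty list when the file looks usable, and a short description of each problem otherwise.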

Step 2: Create the Automation Script

Now that everything is ready, we can write a program to generate daily backups in the cloud. To talk to the API, you will need the PyDrive library. So, go ahead and install it from your terminal with pip3 install PyDrive

After installing the PyDrive library, create a new file in your project directory and name it backup.py. This is our driver file.

  • First, import the necessary libraries and modules that will help you throughout the course of your script.
from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth
import os
  • Secondly, you need to set up authentication with the Google Drive API. The first time this code is executed, your default browser will open and ask you for permission to access your Drive contents.
    • IMPORTANT NOTE: Ensure that the client_secrets.json file is in the same directory as our Python file. Also, keep the JSON file a secret and take special care that it is not leaked online.
gauth = GoogleAuth()
gauth.LoadCredentialsFile("mycreds.txt")
if gauth.credentials is None:
    gauth.LocalWebserverAuth()
elif gauth.access_token_expired:
    gauth.Refresh()
else:
    gauth.Authorize()
gauth.SaveCredentialsFile("mycreds.txt")
drive = GoogleDrive(gauth)

Note that even after you have set up PyDrive and the Google API so that client_secrets.json works, calling LocalWebserverAuth on its own would still require web authentication for Google Drive access every time you run your script. To deal with this issue, you need a minor adjustment so that your app doesn’t have to ask you to authenticate on every run: use LoadCredentialsFile and SaveCredentialsFile, as we did in the code above, to cache the credentials in mycreds.txt. This way, you grant access the first time you run your script, and later runs reuse the saved credentials and refresh the access token when it has expired. Google will ask you to authenticate again only if the saved credentials are missing or can no longer be refreshed. Treat mycreds.txt like client_secrets.json and keep it private.

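  • Next, it is time to access the folder/file you want to back up from the local drive and store it in your Google Drive. The first part of the function looks for copies that already exist in the Drive folder and deletes them.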
def upload_file_to_drive():
    for x in os.listdir(path):
        file_list = drive.ListFile(
            {'q': "'16jhq7j-SWZmKF_vU0MmKNX' in parents and trashed = False"}).GetList()
        try:
            for file1 in file_list:
                if file1['title'] == os.path.join(path, x):
                    file1.Delete()
        except:
            pass

Explanation: The 'q' key holds a search query, and the string '16jhq7j-SWZmKF_vU0MmKNX' inside it is the folder ID of the folder within your Google Drive where you want to save the files. To get the folder ID of the required folder, open the folder in Google Drive in your browser. The ID is the last part of the address, after folders/.

Simply copy it and paste it in your script.
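If you prefer to paste the whole address instead of picking the ID out by hand, a small helper can extract it. This is a minimal sketch, assuming the folder address contains /folders/ followed by the ID (optionally followed by a query string such as ?usp=sharing). The function name folder_id_from_url is hypothetical.

```python
import re

# Assumption: a Drive folder address contains "/folders/<ID>", where the ID
# is made of letters, digits, underscores and hyphens.
_FOLDER_URL = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def folder_id_from_url(url):
    """Extract the folder ID from a Google Drive folder address."""
    match = _FOLDER_URL.search(url)
    if match is None:
        raise ValueError(f"no folder ID found in {url!r}")
    return match.group(1)
```

For example, passing the address of the backup folder used in this article would return the same ID string that appears in the query.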

A major problem here is that every time you copy the files to your Google Drive folder, files that already exist there are uploaded again, and the duplicates take up space unnecessarily. Thus, instead of copying the files blindly, you can replace them. This is a two-step process: (i) delete the pre-existing files, and (ii) store a fresh copy of all the files in the folder. The snippet above performs the first step, and the complete script below adds the upload.
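The delete-then-upload idea can be tried out without touching Google Drive at all. The sketch below is only an illustration: both function names are hypothetical, the titles are plain strings standing in for file1['title'], and the query follows the lowercase boolean spelling used in the Drive documentation.

```python
def folder_query(folder_id):
    """Build the search string for non-trashed files directly inside a folder."""
    return f"'{folder_id}' in parents and trashed = false"


def titles_to_delete(local_titles, remote_titles):
    """Return the remote titles that the next upload would otherwise duplicate.

    Drive allows several files with the same title, so every matching
    remote entry is returned, not just the first one.
    """
    local = set(local_titles)
    return [title for title in remote_titles if title in local]
```

With two local files a.pdf and b.pdf and a Drive folder that already holds a.pdf twice plus c.pdf, only the two a.pdf entries are selected for deletion, while c.pdf is left alone.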

Putting it all Together

  • All that remains now is to put all the bits and pieces together. This is what the complete script looks like –
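from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth
import os

# Authenticate, reusing the cached credentials when they exist.
gauth = GoogleAuth()
gauth.LoadCredentialsFile("mycreds.txt")
if gauth.credentials is None:
    # First run: open the browser and ask for permission.
    gauth.LocalWebserverAuth()
elif gauth.access_token_expired:
    # The cached access token has expired, so refresh it.
    gauth.Refresh()
else:
    gauth.Authorize()
gauth.SaveCredentialsFile("mycreds.txt")
drive = GoogleDrive(gauth)

path = r"C:\reports"  # local folder to back up
folder_id = '16jhq7j-SWZmKF_vUehGalNf6Yr0MmKNX'  # ID of the Drive backup folder


def upload_file_to_drive():
    # List the current contents of the Drive folder once.
    file_list = drive.ListFile(
        {'q': f"'{folder_id}' in parents and trashed = false"}).GetList()
    for x in os.listdir(path):
        local_file = os.path.join(path, x)
        if not os.path.isfile(local_file):
            continue  # only files can be uploaded this way, so skip subfolders
        # Step (i): delete old copies. Earlier uploads were titled with the local path.
        try:
            for file1 in file_list:
                if file1['title'] == local_file:
                    file1.Delete()
        except Exception as exc:
            print(f"Could not delete old copy of {local_file}: {exc}")
        # Step (ii): upload a fresh copy into the backup folder.
        f = drive.CreateFile({'parents': [{'id': folder_id}]})
        f.SetContentFile(local_file)
        f.Upload()
        f = None  # drop the reference to the uploaded file object


upload_file_to_drive()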

Schedule Regular Backup

One final task is to make sure that the backup script runs at a particular time every day. There are different ways of doing this. You can use the schedule library in Python to do so within the script itself. However, schedule has a downside: the script has to keep running in the background for the scheduled job to fire.
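To see why an in-script scheduler has to stay alive, here is a minimal sketch that uses only the standard library rather than schedule itself. The helper seconds_until is a hypothetical name; it computes how long the process would have to sleep before the next run.

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the number of seconds from `now` until the next hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed (or is right now), so use tomorrow's.
        target += timedelta(days=1)
    return (target - now).total_seconds()


# A process using this approach would have to loop forever, for example:
#
#     import time
#     while True:
#         time.sleep(seconds_until(0, 0))   # wait for 12:00 AM
#         upload_file_to_drive()
```

If the process is closed or the machine restarts, the loop is gone and no backup happens until someone starts the script again.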

Hence, I would suggest using Windows Task Scheduler. It is a built-in Windows program that lets you schedule and automate tasks by running scripts or programs automatically at a given time.

  • Search for “Task Scheduler”.
  • Click Actions ⮕ Create Task
  • Give your task a Name
  • Then select Actions ⮕ New
  • Find the Python path using where python in the command line, copy it, and paste it into the “Program/script” box.
  • Go to the folder where your Python script is located. Hold the Shift key on your keyboard, right-click on the file, and select Copy as path.
  • In the “Add arguments (optional)” box, add the name of your Python file.
  • In the “Start in (optional)” box, paste the path that you copied previously, then remove the file name and the surrounding quotes so that only the folder remains.
  • Click “OK”
  • Go to “Triggers” ⮕ Select “New”
  • Choose the repetition that you want. Here you can schedule Python scripts to run daily, weekly, monthly or just one time.
  • Click “OK”

Once you have set this up, your trigger is active and your Python script will run automatically every day.
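If you would rather create the task from a script than click through the dialog, Windows also ships the schtasks command-line tool. The sketch below only builds the command. The function name build_schtasks_command, the task name, and both paths are hypothetical placeholders, and the quoting of the /TR value is an assumption you should verify on your own machine.

```python
def build_schtasks_command(task_name, python_exe, script_path, start_time="00:00"):
    """Build a schtasks command that runs a Python script once a day."""
    # Quote both paths so that spaces in folder names do not split the command.
    task_run = f'"{python_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # name shown in Task Scheduler
        "/TR", task_run,     # program to run, with its argument
        "/SC", "DAILY",      # repeat every day
        "/ST", start_time,   # start time in 24-hour HH:mm format
        "/F",                # overwrite a task with the same name
    ]


# Example (hypothetical paths); run it on Windows with:
#
#     import subprocess
#     subprocess.run(build_schtasks_command(
#         "FileBackup", r"C:\Python311\python.exe", r"C:\scripts\backup.py"),
#         check=True)
```

Note that this command does not set a “Start in” folder, so a task created this way may not find client_secrets.json and mycreds.txt unless the script uses absolute paths or changes to its own folder first.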

Summary

Hurrah! You have successfully set up your automated backup script to take the necessary backups at a scheduled time every day without having to do anything manually. I hope the complete walkthrough of the project helped you! Subscribe and stay tuned for more interesting projects in the future.