Automate Your Excel Workflow: Accessing Google Drive Files with Python

BigDataBall

Instead of manually downloading BigDataBall’s in-season Excel files, let’s automate the process. This post will guide you through fetching the files from Google Drive with Python, making your workflow smoother and more efficient.

Before We Begin

PRE-REQUIREMENTS:
1) Make sure you already have a season pass! Monthly pass and one-off pass plans are not eligible for automation.
2) Unfortunately, we’re not able to provide step-by-step troubleshooting assistance. Reach out to a developer you know for hands-on help in resolving your automation issues.

1. Create a Service Account: Head to https://cloud.google.com/iam/docs/service-accounts-create and follow the instructions to create a dedicated service account for your script.

2. Generate a Service Account Key: Download a JSON key file for your service account from https://cloud.google.com/iam/docs/keys-create-delete. This file contains the credentials your Python script will use to access Google Drive.

3. Get Access to the Shared Folder: Please ask us to provide your service account with access to the shared Google Drive folder containing the target in-season Excel files.

4. Install Necessary Libraries: Use pip to install the required Python libraries:


pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib

Python Automation in Action

Here’s the Python code to automate your Excel file workflow:


from googleapiclient.discovery import build
from google.oauth2 import service_account
import io
from googleapiclient.http import MediaIoBaseDownload

# Replace with the path to your downloaded service account key file
SERVICE_ACCOUNT_FILE = '/path/to/your/credentials.json'

# Replace with the desired download directory
DOWNLOAD_DIR = '/path/to/download/directory'

# Define the scopes
SCOPES = ['https://www.googleapis.com/auth/drive']

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)

# Build the Drive API client
service = build('drive', 'v3', credentials=credentials)

# Customize the search query if needed
query = "name contains '-feed.xlsx'"

# Search for the file
results = service.files().list(
    q=query,
    supportsAllDrives=True,
    includeItemsFromAllDrives=True,
    fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])

if not items:
    raise SystemExit("No matching files found.")

# Find the most recently modified file (assuming alphabetical order reflects modification)
latest_file = sorted(items, key=lambda x: x['name'])[-1]

# Download the file under its original name
request = service.files().get_media(fileId=latest_file['id'])
fh = io.FileIO(f"{DOWNLOAD_DIR}/{latest_file['name']}", 'wb')
downloader = MediaIoBaseDownload(fh, request)
done = False
while not done:
    status, done = downloader.next_chunk()
    print(f"Download {int(status.progress() * 100)}%.")
fh.close()

print(f"Downloaded file: {latest_file['name']}")

Explanations

1. Initialization: We import necessary libraries, define paths to the service account key file (`SERVICE_ACCOUNT_FILE`) and download directory (`DOWNLOAD_DIR`), and specify the required scopes.

2. Authentication: We create a service account credentials object using the key file and build the Google Drive API client.

3. File Search: We define a search query to find files containing “-feed.xlsx” in their name. You can customize this query to match your specific needs.
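A few illustrative variations, using the Drive API v3 search syntax (the date value below is a placeholder, not a real upload schedule):

```python
# Drive API v3 search-query variations (values are illustrative placeholders).

# Match by name fragment (the default used above)
query = "name contains '-feed.xlsx'"

# Only files modified after a given date (RFC 3339 timestamp)
query = "name contains '-feed.xlsx' and modifiedTime > '2024-01-01T00:00:00'"

# Exclude files sitting in the trash
query = "name contains '-feed.xlsx' and trashed = false"
```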

4. Finding the Latest File: The code sorts the found files by name and selects the last one, assuming it’s the most recently modified.
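If your file names don’t happen to sort chronologically, a more robust variant is to pick the latest file by its Drive modification timestamp instead. This sketch assumes you also request `modifiedTime` in the `fields` parameter of the search:

```python
def pick_latest(items):
    """Return the file dict with the greatest Drive modifiedTime.

    modifiedTime is an RFC 3339 string (e.g. '2024-01-15T03:21:07.000Z'),
    so plain string comparison is already chronological.
    """
    if not items:
        raise ValueError("No matching files found.")
    return max(items, key=lambda f: f['modifiedTime'])
```

To use it, change the search’s `fields` argument to `"nextPageToken, files(id, name, modifiedTime)"` and replace the `sorted(...)` line with `latest_file = pick_latest(items)`.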

5. Downloading the File: We use the `files().get_media()` method to download the latest file to our specified directory with progress updates.

Next Steps

Now that you’ve automated the download, consider these enhancements:

* Process the Excel Data: Utilize libraries like `openpyxl` or `pandas` to read and manipulate the downloaded Excel data.
* Schedule the Script: Set up a cron job or use a task scheduler to run your script automatically at the times BigDataBall uploads the updates.
* Integrate with Other Tools: Send notifications, update databases, or trigger other actions based on the processed data.

Note that this automation can also be written in other programming languages. This workflow saves you time, reduces errors, and allows you to focus on more critical tasks. Happy automating!
