How do I access (read, write) Google Sheets spreadsheets with Python?
I am wondering if you can point me to an example of reading/writing to/from a google doc/spreadsheet using python.
I did look at google docs API here https://developers.google.com/google-apps/spreadsheets/ but not sure if I hit the right link. Also an example will be of great help.
What I am trying to do is query spreadsheets based on the different columns more like a SQL query and then do some downstream parsing with the data and put it into another spreadsheet or doc at google docs.
Best, -Abhi
(Jun-Dec 2016) Most answers here are now out-of-date as: 1) GData APIs are the previous generation of Google APIs, and that's why it was hard for @Josh Brown to find that old GData Docs API documentation. While not all GData APIs have been deprecated, all newer Google APIs do not use the Google Data protocol; and 2) Google released a new Google Sheets API (not GData). In order to use the new API, you need to get the Google APIs Client Library for Python (it's as easy as pip install -U google-api-python-client
[or pip3
for Python 3]) and use the latest Sheets API v4+, which is much more powerful & flexible than older API releases.
Here's one code sample from the official docs to help get you kickstarted. However, here are slightly longer, more "real-world" examples of using the API you can learn from (videos plus blog posts):
- Migrating SQL data to a Sheet plus code deep dive post
- Formatting text using the Sheets API plus code deep dive post
- Generating slides from spreadsheet data plus code deep dive post
- Those and others in the Sheets API video library
The latest Sheets API provides features not available in older releases, namely giving developers programmatic access to a Sheet as if you were using the user interface (create frozen rows, perform cell formatting, resizing rows/columns, adding pivot tables, creating charts, etc.), but NOT as if it was some database that you could perform searches on and get selected rows from. You'd basically have to build a querying layer on top of the API that does this. One alternative is to use the Google Charts Visualization API query language, which does support SQL-like querying. You can also query from within the Sheet itself. Be aware that this functionality existed before the v4 API, and that the security model was updated in Aug 2016. To learn more, check my G+ reshare to a full write-up from a Google Developer Expert.
Also note that the Sheets API is primarily for programmatically accessing spreadsheet operations & functionality as described above, but to perform file-level access such as imports/exports, copy, move, rename, etc., use the Google Drive API instead. Examples of using the Drive API:
- Listing your files in Google Drive and code deep dive post
- Google Drive: Uploading & Downloading Files plus "Poor man's plain text to PDF converter" code deep dive post (*)
- Exporting a Google Sheet as CSV blog post only
(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.
To learn more about how to use Google APIs with Python in general, check out my blog as well as a variety of Google developer videos (series 1 and series 2) I'm producing.
ps. As far as Google Docs goes, there isn't a REST API available at this time, so the only way to programmatically access a Doc is by using Google Apps Script (which like Node.js is JavaScript outside of the browser, but instead of running on a Node server, these apps run in Google's cloud; also check out my intro video.) With Apps Script, you can build a Docs app or an add-on for Docs (and other things like Sheets & Forms).
UPDATE Jul 2018: The above "ps." is no longer true. The G Suite developer team pre-announced a new Google Docs REST API at Google Cloud NEXT '18. Developers interested in getting into the early access program for the new API should register at https://developers.google.com/docs.
UPDATE Feb 2019: The Docs API launched to preview last July is now available generally to all... read the launch post for more details.
UPDATE Nov 2019: In an effort to bring G Suite and GCP APIs more inline with each other, earlier this year, all G Suite code samples were partially integrated with GCP's newer (lower-level not product) Python client libraries. The way auth is done is similar but (currently) requires a tiny bit more code to manage token storage, meaning rather than our libraries manage storage.json
, you'll store them using pickle
(token.pickle
or whatever name you prefer) instead, or choose your own form of persistent storage. For you readers here, take a look at the updated Python quickstart example.
Have a look at GitHub - gspread.
I found it to be very easy to use and since you can retrieve a whole column by
first_col = worksheet.col_values(1)
and a whole row by
second_row = worksheet.row_values(2)
you can more or less build some basic select ...
where ... = ...
easily.
I know this thread is old now, but here is some decent documentation on Google Docs API. It was ridiculously hard to find, but useful, so maybe it will help you some. http://pythonhosted.org/gdata/docs/api.html.
I used gspread recently for a project to graph employee time data. I don't know how much it might help you, but here's a link to the code: https://github.com/lightcastle/employee-timecards
Gspread made things pretty easy for me. I was also able to add logic in to check for various conditions to create month-to-date and year-to-date results. But I just imported the whole dang spreadsheet and parsed it from there, so I'm not 100% sure that it is exactly what you're looking for. Best of luck.