How to Download Kaggle Data with Python and requests.py
23 Nov 2012
Recently I started playing with Kaggle. I quickly became frustrated that downloading their data meant going through their website; I would rather download it programmatically. After some Googling, the best recommendation I found was to use lynx. My friend Anthony suggested that I write a Python script instead.
Although Python is not my primary language, I was intrigued by how simple it was to write the script using requests.py. In this example, I download the training data set from Kaggle’s Digit Recognizer competition.
The idea is simple:
- Attempt to download a file from Kaggle but get blocked because you are not logged in.
- Log in with requests.py.
- Download the data.
Here’s the code:
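A minimal sketch of those three steps might look like the following. It is an illustration under stated assumptions rather than a definitive listing: the download URL, the form field names UserName and Password, the my_username placeholder, and the helper name download_kaggle_file are all hypothetical, and Kaggle's actual login flow may differ from what is assumed here.

```python
# Assumed values for illustration only: the URL and the form field names
# ("UserName", "Password") are guesses, not confirmed details of Kaggle's site.
DATA_URL = "https://www.kaggle.com/c/digit-recognizer/download/train.csv"
CREDENTIALS = {"UserName": "my_username", "Password": "my_password"}


def download_kaggle_file(data_url, credentials, local_path,
                         chunk_size=512 * 1024, session=None):
    """Request a file, log in at the page we land on, and save the response.

    Returns the number of bytes written to local_path.
    """
    if session is None:
        # Imported here so the function can also be run with a stand-in session.
        import requests
        session = requests.Session()

    # Step 1: the first request is redirected to a login page
    # because we are not logged in yet.
    blocked = session.get(data_url)

    # Step 2: post the credentials to the URL we ended up at.
    # stream=True defers fetching the body until we iterate over it.
    response = session.post(blocked.url, data=credentials, stream=True)
    response.raise_for_status()

    # Step 3: write the body to disk one chunk at a time.
    bytes_written = 0
    with open(local_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                bytes_written += len(chunk)
    return bytes_written


# Example call (needs network access and real credentials):
# download_kaggle_file(DATA_URL, CREDENTIALS, "train.csv")
```

The file is opened in binary mode because iter_content yields bytes, and the default chunk size of 512 KB keeps memory use small even for a large training set.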
Change the username and my_password placeholders to your Kaggle login info. Feel free to tune the chunk size to your liking.