memo: load CelebA on Colab

Use the built-in dataset

import torch
from torchvision import datasets, transforms

# Root directory for the dataset
data_root = 'data'
# Spatial size of training images, images are resized to this size.
image_size = 64

celeba_data = datasets.CelebA(
    data_root,
    download=True,
    transform=transforms.Compose([
        transforms.Resize(image_size),
        transforms.CenterCrop(image_size),
        transforms.ToTensor(),
        # Map each RGB channel from [0, 1] to [-1, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5],
                             std=[0.5, 0.5, 0.5]),
    ]),
)

With download=True, torchvision downloads the CelebA files into data/celeba and extracts img_align_celeba.zip there.

SOURCE CODE FOR TORCHVISION.DATASETS.CELEBA


Extract the zip file

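import zipfile

# Folder that should end up containing img_align_celeba/
data_root = 'data/celeba'

# Path to the zip on Google Drive (add a shortcut to the dataset
# to your Drive and mount Drive in Colab first)
zip_path = '/content/drive/MyDrive/CelebA/Img/img_align_celeba.zip'

with zipfile.ZipFile(zip_path, 'r') as ziphandler:
    # Extract into data_root so the images land in data/celeba/img_align_celeba/
    ziphandler.extractall(data_root)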
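
Re-running the cell above extracts the whole archive again. A minimal sketch that skips the extraction when the images are already there; the helper name extract_once and its marker argument are hypothetical (not part of zipfile), and the default marker assumes the zip contains a top-level img_align_celeba folder:

```python
import zipfile
from pathlib import Path


def extract_once(zip_path, dest_dir, marker="img_align_celeba"):
    """Extract zip_path into dest_dir unless dest_dir/marker already exists.

    Returns True if the archive was extracted, False if it was skipped.
    """
    dest = Path(dest_dir)
    if (dest / marker).exists():
        # Assumption: the marker folder existing means a previous run finished.
        return False
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as ziphandler:
        ziphandler.extractall(dest)
    return True
```

Usage with the variables above would be extract_once(zip_path, data_root). Note that a run interrupted halfway still leaves the marker folder behind, so delete the folder before retrying in that case.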

How do I load the CelebA dataset on Google Colab, using torch vision, without running out of memory?

ZipFile - GfG


Download with gdown

  1. Install the package: pip install gdown.
  2. Copy the file's URL from the browser address bar.
import gdown
url = "https://drive.google.com/u/0/uc?id=1m8-EBPgi5MRubrm6iQjafK2QMHDBMSfJ&export=download"
output = "celeba.zip"
gdown.download(url, output)
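
The URL copied from the address bar can come in more than one shape. A small sketch, using only the standard library, that pulls the file ID out of the URL and rebuilds the plain uc?id= form; the helper name drive_file_id is hypothetical, and only the two URL shapes named in the comments are handled:

```python
from urllib.parse import parse_qs, urlparse


def drive_file_id(url):
    """Return the Google Drive file ID from a URL, or None if none is found."""
    parsed = urlparse(url)
    # Shape 1: .../uc?id=<ID>&export=download
    query = parse_qs(parsed.query)
    if "id" in query:
        return query["id"][0]
    # Shape 2: .../file/d/<ID>/view
    parts = parsed.path.split("/")
    if "d" in parts and parts.index("d") + 1 < len(parts):
        return parts[parts.index("d") + 1]
    return None


url = "https://drive.google.com/u/0/uc?id=1m8-EBPgi5MRubrm6iQjafK2QMHDBMSfJ&export=download"
file_id = drive_file_id(url)
direct_url = f"https://drive.google.com/uc?id={file_id}"
```

direct_url can then be passed to gdown.download in place of url.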

Then unzip it, and unzip the nested archive inside it:

unzip celeba.zip
cd celeba
unzip img_align_celeba.zip

(Python) Use the gdown package to download files from Google Drive


(2024-02-21)

gdown 4.7.1 fails to download the large dataset dtu.zip (554 MB) and reports the following error:

(base) zi@lambda-server:~/Downloads$ gdown 135oKPefcPTsdtLRzoDAQtPpHuoIrpRI_
Access denied with the following error:

        Cannot retrieve the public link of the file. You may need to change
        the permission to 'Anyone with the link', or have had many accesses. 

You may still be able to access the file from the browser:

         https://drive.google.com/uc?id=135oKPefcPTsdtLRzoDAQtPpHuoIrpRI_ 

Updating gdown to 5.1.0 avoids this error: issue

pip install --upgrade gdown
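
To confirm the upgrade took effect before retrying the download, the installed version can be checked from Python. A rough sketch using importlib.metadata; the helper names version_tuple and gdown_is_recent are hypothetical, and the version parsing is deliberately simple (it stops at the first non-numeric part, such as a release-candidate suffix):

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '5.1.0' into (5, 1, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def gdown_is_recent(minimum="5.1.0"):
    """Return True if the installed gdown is at least `minimum`, else False."""
    try:
        installed = version("gdown")
    except PackageNotFoundError:
        # gdown is not installed at all
        return False
    return version_tuple(installed) >= version_tuple(minimum)
```

In Colab, restart the runtime after upgrading if gdown was already imported, otherwise the old module stays loaded.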

Download GoogleDrive

(2024-05-25)

isl-org/TanksAndTemples includes a function for downloading datasets from Google Drive.