memo: Python | pathlib

Real Python Tutorial

Construct path

Represent a file path as an object instead of a string.

from pathlib import Path

# Instantiate an object:
datadir = Path("./datadir")

# Joining Paths
new_path = datadir / "sampled.npy"
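
As a small sketch of why an object is handier than a string, the snippet below reuses the `datadir` and `new_path` names from above and reads a few standard `Path` attributes. The variable names `name`, `stem`, `suffix`, `parent`, and `same_path` are only illustrative.

```python
from pathlib import Path

datadir = Path("./datadir")
new_path = datadir / "sampled.npy"

# A Path exposes its components as attributes, so no string splitting is needed.
name = new_path.name      # final component, including the suffix
stem = new_path.stem      # final component without the suffix
suffix = new_path.suffix  # file extension, including the leading dot
parent = new_path.parent  # the containing directory, itself a Path

# joinpath() is the method form of the / operator.
same_path = datadir.joinpath("sampled.npy")
```

Nothing here touches the file system; these are pure path manipulations, so they work even if `datadir` does not exist.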

Iterate dir

  1. .iterdir() iterates over all entries (files and subdirectories) directly inside the given directory.

    from pathlib import Path
    from collections import Counter
    
    # count number of files of different types
    Counter(path.suffix for path in Path.cwd().iterdir())
    

    Return: Counter({'.md': 2, '.txt': 4, '.pdf': 2, '.py': 1})

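  2. .glob(pattern) yields the entries in the directory that match the pattern. For example, .glob("*.txt") matches all files with a .txt suffix in that directory (not in its subdirectories).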

    >>> Counter(path.suffix for path in Path.cwd().glob("*.p*"))
    Counter({'.pdf': 2, '.py': 1})
    
  3. .rglob() recursively finds all files matching a pattern in the directory and its subdirectories.

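    def tree(directory):
        directory = Path(directory)  # accept either a string or a Path
        print(f"+ {directory}")
        for path in sorted(directory.rglob("*")):
            # depth = number of path components below the root directory
            depth = len(path.relative_to(directory).parts)
            spacer = "    " * depth
            print(f"{spacer}+ {path.name}")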
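
To compare the three methods side by side, here is a minimal sketch that builds a throwaway directory and counts suffixes with each one. The helper `count_suffixes` and the file names are made up for illustration; unlike the one-liners above, it skips directories via `is_file()`, which is an added assumption about what should be counted.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_suffixes(paths):
    """Hypothetical helper: count file suffixes, skipping directories."""
    return Counter(p.suffix for p in paths if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ["a.txt", "b.txt", "notes.md", "sub/c.txt", "sub/run.py"]:
        (root / name).touch()

    top_level = count_suffixes(root.iterdir())     # direct children only
    txt_only = count_suffixes(root.glob("*.txt"))  # pattern match, not recursive
    recursive = count_suffixes(root.rglob("*"))    # descends into sub/ as well
```

Only `recursive` sees the files inside `sub/`; the other two stop at the top level.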
    

glob two patterns

python - How to glob two patterns with pathlib? - Stack Overflow

from pathlib import Path

exts = [".jl", ".jsonlines"]
mainpath = "/path/to/dir"

# Same directory
files = [p for p in Path(mainpath).iterdir() if p.suffix in exts]

# Recursive
files = [p for p in Path(mainpath).rglob('*') if p.suffix in exts]

# 'files' is a list of Path objects; convert them to strings if needed:
[str(p) for p in files]
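
The same suffix filter can be wrapped in a small function. This is only a sketch: `find_by_suffix` and its `recursive` flag are hypothetical names, and the temporary directory and file names exist just to make the example self-contained.

```python
import tempfile
from pathlib import Path


def find_by_suffix(directory, exts, recursive=False):
    """Hypothetical helper: sorted files under `directory` whose suffix is in `exts`."""
    root = Path(directory)
    candidates = root.rglob("*") if recursive else root.iterdir()
    return sorted(p for p in candidates if p.is_file() and p.suffix in exts)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    for name in ["a.jl", "b.jsonlines", "c.txt", "nested/d.jl"]:
        (root / name).touch()

    exts = {".jl", ".jsonlines"}  # a set gives fast membership tests
    flat = [p.name for p in find_by_suffix(root, exts)]
    deep = [p.name for p in find_by_suffix(root, exts, recursive=True)]
```

Filtering on `p.suffix` needs a single directory walk regardless of how many extensions are listed, whereas calling `.glob()` once per pattern would walk the directory once for each.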