Speed up SimpleHTTPRequestHandler.list_directory() by using os.scandir() #151788

@mjbommar

SimpleHTTPRequestHandler.list_directory() calls os.listdir() and then, for every entry, os.path.isdir() (a stat) and os.path.islink() (an lstat) — two stat-family syscalls per entry. This is wasted work on any filesystem and dominates listing time for large directories; on network filesystems like NFS, where each call is a round-trip, it becomes severe.

os.scandir() returns the entry type from the directory read itself (POSIX d_type / NFS READDIRPLUS), eliminating the per-entry stats in the common case. CPython already did this migration for os.walk(), glob, and pathlib.Path.iterdir() (gh-117727); http.server was missed.
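A minimal sketch of the two patterns side by side, to make the difference concrete. The helper names `display_names_listdir` and `display_names_scandir` are hypothetical and this is not the actual patch; it only mirrors how the listing decorates directories with `/` and symlinks with `@`:

```python
import os


def display_names_listdir(path):
    """Current pattern: os.listdir() plus one stat and one lstat per entry."""
    names = sorted(os.listdir(path), key=str.lower)
    result = []
    for name in names:
        fullname = os.path.join(path, name)
        display = name
        if os.path.isdir(fullname):    # stat(), follows symlinks
            display = name + "/"
        if os.path.islink(fullname):   # lstat()
            display = name + "@"
        result.append(display)
    return result


def display_names_scandir(path):
    """Proposed pattern: take the entry type from the directory read."""
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name.lower())
    result = []
    for entry in entries:
        display = entry.name
        if entry.is_dir():             # usually answered from d_type
            display = entry.name + "/"
        if entry.is_symlink():         # usually answered from d_type
            display = entry.name + "@"
        result.append(display)
    return result
```

Both helpers should return the same list for the same directory; only the number of filesystem calls differs.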

Benchmark

Directory with 1000 files + 1000 dirs (plus a few symlinks):

  • stat-family syscalls (strace): 4088 → 88 (the 88 is constant interpreter startup; the per-entry loop drops from ~2 syscalls to ~0)
  • local filesystem wall-clock: ~10× faster
  • emulating NFS by injecting per-stat latency: the listing goes from seconds to ~2 ms

In the worst case, a mount that returns DT_UNKNOWN, os.scandir() falls back to one cached lstat per entry, which is still fewer calls than today.
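For anyone who wants to reproduce the local wall-clock comparison, here is a rough sketch of a timing harness. It is not the script used for the numbers above; the function names (`classify_listdir`, `classify_scandir`, `build_tree`, `run_benchmark`) are hypothetical, it omits the symlinks, and absolute timings will vary with filesystem and cache state:

```python
import os
import tempfile
import timeit


def classify_listdir(path):
    # Current pattern: isdir() and islink() each hit the filesystem.
    result = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        result.append((name, os.path.isdir(full), os.path.islink(full)))
    return result


def classify_scandir(path):
    # Proposed pattern: the entry type usually comes with the directory read.
    with os.scandir(path) as it:
        return [(e.name, e.is_dir(), e.is_symlink()) for e in it]


def build_tree(root, n_files, n_dirs):
    # Populate root with empty files and empty subdirectories.
    for i in range(n_files):
        open(os.path.join(root, f"file{i}.txt"), "w").close()
    for i in range(n_dirs):
        os.mkdir(os.path.join(root, f"dir{i}"))


def run_benchmark(n_files=1000, n_dirs=1000, repeat=5):
    # Return the best-of-`repeat` time in seconds for each pattern.
    with tempfile.TemporaryDirectory() as root:
        build_tree(root, n_files, n_dirs)
        t_listdir = min(timeit.repeat(lambda: classify_listdir(root),
                                      number=1, repeat=repeat))
        t_scandir = min(timeit.repeat(lambda: classify_scandir(root),
                                      number=1, repeat=repeat))
    return t_listdir, t_scandir


if __name__ == "__main__":
    old, new = run_benchmark()
    print(f"listdir + isdir/islink: {old * 1e3:.2f} ms")
    print(f"scandir:                {new * 1e3:.2f} ms")
```

Running it under `strace -c` should also show the difference in stat-family syscall counts, although the exact totals depend on the interpreter version and platform.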

The change is behavior-preserving: DirEntry.is_dir()/is_symlink() match os.path.isdir/os.path.islink semantics (follow-symlinks behavior and return-False-on-error), verified across real dirs/files, symlink-to-dir, symlink-to-file, and broken symlinks. The existing test_httpservers suite passes unchanged.
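The equivalence claim can be spot-checked with a small script along these lines. This is an illustrative sketch, not the verification actually used; the function name `check_equivalence` and the entry names are made up for the example, and it assumes a platform where `os.symlink()` is permitted:

```python
import os
import tempfile


def check_equivalence(root):
    """Build the five cases under root and compare both APIs on each entry."""
    os.mkdir(os.path.join(root, "realdir"))
    open(os.path.join(root, "realfile"), "w").close()
    os.symlink(os.path.join(root, "realdir"), os.path.join(root, "link_to_dir"))
    os.symlink(os.path.join(root, "realfile"), os.path.join(root, "link_to_file"))
    # Target does not exist, so this symlink is broken.
    os.symlink(os.path.join(root, "missing"), os.path.join(root, "broken_link"))

    results = {}
    with os.scandir(root) as it:
        for entry in it:
            old = (os.path.isdir(entry.path), os.path.islink(entry.path))
            new = (entry.is_dir(), entry.is_symlink())
            assert old == new, f"mismatch for {entry.name}: {old} != {new}"
            results[entry.name] = new
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, (is_dir, is_link) in sorted(check_equivalence(tmp).items()):
            print(f"{name:14} is_dir={is_dir!s:5} is_symlink={is_link}")
```

Each value in the returned mapping is an `(is_dir, is_symlink)` pair; a broken symlink should report `(False, True)` under both APIs.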

I have a patch ready and will open a PR.


This issue was prepared with AI assistance (Claude Code); the analysis and benchmarks were reviewed by me.

Labels: pending, performance, stdlib, type-feature