SimpleHTTPRequestHandler.list_directory() calls os.listdir() and then, for every entry, os.path.isdir() (a stat) and os.path.islink() (an lstat) — two stat-family syscalls per entry. This is wasted work on any filesystem and dominates listing time for large directories; on network filesystems like NFS, where each call is a round-trip, it becomes severe.
os.scandir() returns the entry type from the directory read itself (POSIX d_type / NFS READDIRPLUS), eliminating the per-entry stats in the common case. CPython already did this migration for os.walk(), glob, and pathlib.Path.iterdir() (gh-117727); http.server was missed.
Benchmark
Directory with 1000 files + 1000 dirs (plus a few symlinks):
- stat-family syscalls (
strace): 4088 → 88 (the 88 is constant interpreter startup; the per-entry loop drops from ~2 syscalls to ~0)
- local filesystem wall-clock: ~10× faster
- emulating NFS by injecting per-
stat latency: the listing goes from seconds to ~2 ms
Worst case — a mount that returns DT_UNKNOWN — falls back to one cached lstat per entry, which is still fewer calls than today and never worse.
The change is behavior-preserving: DirEntry.is_dir()/is_symlink() match os.path.isdir/os.path.islink semantics (follow-symlinks behavior and return-False-on-error), verified across real dirs/files, symlink-to-dir, symlink-to-file, and broken symlinks. The existing test_httpservers suite passes unchanged.
I have a patch ready and will open a PR.
This issue was prepared with AI assistance (Claude Code); the analysis and benchmarks were reviewed by me.
Linked PRs
SimpleHTTPRequestHandler.list_directory()callsos.listdir()and then, for every entry,os.path.isdir()(astat) andos.path.islink()(anlstat) — two stat-family syscalls per entry. This is wasted work on any filesystem and dominates listing time for large directories; on network filesystems like NFS, where each call is a round-trip, it becomes severe.os.scandir()returns the entry type from the directory read itself (POSIXd_type/ NFSREADDIRPLUS), eliminating the per-entry stats in the common case. CPython already did this migration foros.walk(),glob, andpathlib.Path.iterdir()(gh-117727);http.serverwas missed.Benchmark
Directory with 1000 files + 1000 dirs (plus a few symlinks):
strace): 4088 → 88 (the 88 is constant interpreter startup; the per-entry loop drops from ~2 syscalls to ~0)statlatency: the listing goes from seconds to ~2 msWorst case — a mount that returns
DT_UNKNOWN— falls back to one cachedlstatper entry, which is still fewer calls than today and never worse.The change is behavior-preserving:
DirEntry.is_dir()/is_symlink()matchos.path.isdir/os.path.islinksemantics (follow-symlinks behavior and return-False-on-error), verified across real dirs/files, symlink-to-dir, symlink-to-file, and broken symlinks. The existingtest_httpserverssuite passes unchanged.I have a patch ready and will open a PR.
This issue was prepared with AI assistance (Claude Code); the analysis and benchmarks were reviewed by me.
Linked PRs