Skip to content

"Fixed" tracing bug if module name and current working directory are identically named#137

Open
danielryckman wants to merge 1 commit into
mainfrom
daniel/module-name-bug
Open

"Fixed" tracing bug if module name and current working directory are identically named#137
danielryckman wants to merge 1 commit into
mainfrom
daniel/module-name-bug

Conversation

@danielryckman
Copy link
Copy Markdown
Collaborator

For Python projects named reponame/packagename where reponame equals packagename (i.e. torchtitan/torchtitan), .split() will improperly split at the first instance in the string. The path after the root module will then still have module name on it. For example, it splits "/home/daniel/torchtitan/torchtitan/config/manager.py" into ['/home/daniel', 'torchtitan/config/manager.py'] instead of ['/home/daniel/torchtitan', 'config/manager.py']. Thus, after prepending the module name, we get torchtitan/torchtitan/config/manager.py, which doesn't exist.

@danielryckman danielryckman requested a review from Essoz June 4, 2026 18:25
@Essoz
Copy link
Copy Markdown
Collaborator

Essoz commented Jun 5, 2026

✅ The fix solves the immediate problem, but I share your concern about potential edge cases.

Potential issue:
This still relies on string matching for the module separator pattern f"/{root_module}/". If a root module has the same name as a submodule or class within it (e.g., a package structure like mylib.submodule.mylib), the rsplit could still grab the wrong segment.

Better approach: Instead of parsing the file path string directly, we:

Determine the absolute path of the root module (using importlib.util.find_spec() or pathlib)
Use pathlib.Path.relative_to() to remove the root module path from the file path
Convert the resulting relative path to a module path
This avoids string-based parsing altogether and is much more robust to naming collisions.

Example Code (you can probably use a coding agent to help you implement this, should be a pretty lightweight change).

from pathlib import Path
from importlib.util import find_spec

def get_module_path_from_file_path(file_path: str, root_module: str) -> str | None:
    """
    Get the module path from a file path using importlib and pathlib.
    """
    try:
        # Find the root module's location
        spec = find_spec(root_module)
        if spec is None or spec.origin is None:
            return None
        
        root_module_path = Path(spec.origin).parent
        file_path_obj = Path(file_path).resolve()
        
        # Get the relative path from root module to file
        try:
            relative_path = file_path_obj.relative_to(root_module_path)
        except ValueError:
            # file_path is not under the root module
            return None
        
        # Convert to module path
        module_name = str(relative_path.with_suffix(''))
        module_path = f"{root_module}.{module_name.replace('/', '.')}"
        return module_path
    except Exception:
        return None

This approach is import-aware and doesn't rely on fragile string parsing.

return None
# get the path of the module from the file path
path_after_root_module = file_path.split(f"/{root_module}/")[1].split(".py")[0]
path_after_root_module = file_path.rsplit(f"/{root_module}/", 1)[-1].split(".py")[0]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants