Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9' #901


Open
dejoma opened this issue Jan 20, 2025 · 2 comments
Labels
bug: Something isn't working
stale: Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed

Comments


dejoma commented Jan 20, 2025

Describe the bug 🐛

Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9'

The browser is not closed after running. I am running this in a Lambda function that gets invoked multiple times, so it eventually runs out of disk space.

See this GitHub issue, with the fix in the comments:
microsoft/playwright-java#526

My code 💻 🐊

import os

SCRAPE_CONFIG = {
    "llm": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "openai/gpt-4o-mini",
    },
    "search_engine": "serper",
    "serper_api_key": os.environ["SERPER_API_KEY"],
    # "num_results": 5,
    "loader_kwargs": {
        # https://github.com/microsoft/playwright/issues/14023
        "args": ["--single-process", "--disable-gpu", "--disable-dev-shm-usage"],
    },
    "force": True,
    "verbose": True,
    "headless": True,
}


scraper = SearchGraph(prompt=prompt, config=SCRAPE_CONFIG, schema=ScraperOutput)  # type: ignore
scrape_results = scraper.run()
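By default, Lambda's /tmp is 512 MB (configurable up to 10 GB), so repeated invocations on a warm container can exhaust it quickly. A preflight free-space check lets the handler trigger cleanup, or fail fast, before Playwright hits ENOSPC. A minimal sketch using only the standard library; the helper name and the 1 GB threshold are assumptions, not part of the issue's code:

```python
import shutil


def tmp_has_space(path: str = "/tmp", required_bytes: int = 1024 ** 3) -> bool:
    """Return True if `path` has at least `required_bytes` of free space."""
    # shutil.disk_usage returns a (total, used, free) named tuple in bytes.
    return shutil.disk_usage(path).free >= required_bytes
```

Calling this at the top of the handler and cleaning /tmp when it returns False avoids launching the browser into a full filesystem.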

Hotfix update 🧯

So I've tried to empty some directories when they surpass 1.0GB, and it seems to work for now.

import os
import shutil


def cleanup_temp_files():
    """Clean up large temporary files and directories in /tmp.

    This function checks for files/directories larger than 1GB in /tmp
    and removes them to prevent disk space issues.
    """
    cleaned_paths = []
    ONE_GB = 1024 * 1024 * 1024  # 1GB in bytes

    try:
        # Get all items in /tmp
        tmp_items = os.listdir("/tmp")

        for item in tmp_items:
            full_path = os.path.join("/tmp", item)
            try:
                # Get size of file/directory
                if os.path.isdir(full_path):
                    total_size = sum(
                        os.path.getsize(os.path.join(dirpath, filename))
                        for dirpath, _, filenames in os.walk(full_path)
                        for filename in filenames
                    )
                else:
                    total_size = os.path.getsize(full_path)

                # Remove if larger than 1GB
                if total_size > ONE_GB:
                    if os.path.isdir(full_path):
                        shutil.rmtree(full_path)
                    else:
                        os.remove(full_path)
                    cleaned_paths.append(f"{full_path} ({total_size / ONE_GB:.2f}GB)")

            except Exception as e:
                print(f"Failed to process {full_path}: {e}")

    except Exception as e:
        print(f"Error accessing /tmp directory: {e}")

    if cleaned_paths:
        print(f"Cleaned up {len(cleaned_paths)} large files/directories: {cleaned_paths}")
    return cleaned_paths
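The hardcoded "/tmp" path and 1 GB threshold make the hotfix awkward to test in isolation. The same logic can be parameterized over the root directory and threshold (hypothetical helper names; a sketch, not the issue's code) so it can be exercised against a throwaway directory:

```python
import os
import shutil


def entry_size(path: str) -> int:
    """Total size in bytes of a file, or of all files under a directory."""
    if os.path.isdir(path):
        return sum(
            os.path.getsize(os.path.join(dirpath, f))
            for dirpath, _, files in os.walk(path)
            for f in files
        )
    return os.path.getsize(path)


def cleanup_large_entries(root: str, threshold_bytes: int) -> list:
    """Remove direct children of `root` larger than `threshold_bytes`.

    Returns the list of removed paths.
    """
    removed = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        try:
            if entry_size(path) > threshold_bytes:
                if os.path.isdir(path):
                    shutil.rmtree(path)
                else:
                    os.remove(path)
                removed.append(path)
        except OSError as e:
            print(f"Failed to process {path}: {e}")
    return removed
```

Calling `cleanup_large_entries("/tmp", 1024 ** 3)` reproduces the hotfix's behavior.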

My log file:

Cleaned up 2 large files/directories: ['/tmp/core.headless_shell.5405 (1.01GB)', '/tmp/core.headless_shell.5910 (1.01GB)']
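The cleaned-up entries (`core.headless_shell.*`) are core dumps written when the headless Chromium process crashes. Rather than deleting them after the fact, the process can be told not to write core dumps at all; child processes inherit the limit. A sketch using only the standard library, assuming a Linux runtime (the `resource` module is Unix-only):

```python
import resource


def disable_core_dumps() -> None:
    """Set the core-dump size limit to 0 so crashing child processes
    (e.g. headless_shell) don't leave core.* files in /tmp."""
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
```

Calling this once at the start of the handler, before launching the browser, would prevent the `core.headless_shell.*` files from being created in the first place.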
dosubot bot added the bug label Jan 20, 2025
VinciGit00 (Collaborator) commented

ok @dejoma can you make the pull request please?

dosubot bot commented Apr 25, 2025

Hi, @dejoma. I'm Dosu, and I'm helping the Scrapegraph-ai team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The issue involves a failure in launching a browser using the Playwright library due to insufficient disk space.
  • The problem arises because the browser does not close properly, leading to memory exhaustion when a lambda function is repeatedly called.
  • A temporary fix involves cleaning up temporary files larger than 1GB to mitigate disk space issues.
  • @VinciGit00 has requested you to make a pull request for the fix.

Next Steps:

  • Please let us know if this issue is still relevant to the latest version of the Scrapegraph-ai repository. If so, you can keep the discussion open by commenting on the issue.
  • Otherwise, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot bot added the stale label Apr 25, 2025