
Show HN: An API that takes a URL and returns a file with browser screenshots by gkamer8

17 Comments

  • Post Author
    _nolram
    Posted February 6, 2025 at 7:31 pm

    [dead]

  • Post Author
    tantaman
    Posted February 6, 2025 at 7:37 pm

    us ai?

  • Post Author
    xnx
    Posted February 6, 2025 at 7:39 pm

    For anyone who might not be aware, Chrome also has the ability to save screenshots from the command line using:
    chrome --headless --screenshot="path/to/save/screenshot.png" --disable-gpu --window-size=1280,720 "https://www.example.com"
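
    A minimal sketch of driving that same command from a Node script, for anyone who wants to batch it (this assumes the binary is named chrome and is on PATH; on Linux it may be google-chrome or chromium):

        // screenshot-cli.ts: shell out to headless Chrome (a sketch, not the API from the post)
        import { execFile } from "node:child_process";

        function screenshot(url: string, out: string): Promise<void> {
          return new Promise((resolve, reject) => {
            execFile(
              "chrome",
              ["--headless", `--screenshot=${out}`, "--disable-gpu", "--window-size=1280,720", url],
              (err) => (err ? reject(err) : resolve())
            );
          });
        }

        screenshot("https://www.example.com", "example.png").catch(console.error);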

  • Post Author
    aspeckt-112
    Posted February 6, 2025 at 7:40 pm

    I’m looking forward to giving this a go. Great idea!

  • Post Author
    manmal
    Posted February 6, 2025 at 7:54 pm

    Being a bit frustrated with Linkwarden's resource usage, I've thought about making my own self-hosted bookmarking service. This could be a low-effort way of loading screenshots for these links, very cool! It'll be interesting to see how many concurrent requests this can process.

  • Post Author
    synthomat
    Posted February 6, 2025 at 8:13 pm

    That's nice and everything, but what do you do about the EU cookie banners? Does hosting outside of the EU help?

  • Post Author
    quink
    Posted February 6, 2025 at 8:17 pm

    > SCREENSHOT_JPEG_QUALITY

    Not two words that should be near each other, and JPEG is the only option.

    Almost like it’s designed to nerd-snipe someone into a PR to change the format based on Accept headers.
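
    For what it's worth, the Accept-header idea is only a few lines; a minimal sketch (ignoring q-values) of mapping the header to an output format:

        // pick-format.ts: naive Accept-header to image-format mapping (sketch, ignores q-values)
        type Format = "webp" | "png" | "jpeg";

        function pickFormat(accept: string | undefined): Format {
          const wanted = (accept ?? "").toLowerCase();
          if (wanted.includes("image/webp")) return "webp";
          if (wanted.includes("image/png")) return "png";
          return "jpeg";                       // fall back to the current JPEG-only behaviour
        }

        console.log(pickFormat("image/avif,image/webp,image/png,*/*")); // "webp"
        console.log(pickFormat(undefined));                             // "jpeg"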

  • Post Author
    mpetrovich
    Posted February 6, 2025 at 8:45 pm

    Reminds me of this open source library I wrote to do the same thing: https://github.com/nextbigsoundinc/imagely

    It uses Puppeteer and headless Chrome behind the scenes.
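
    The core of that approach is small; a minimal Puppeteer sketch (not imagely's actual code):

        // shot.ts: take a screenshot with Puppeteer and headless Chrome
        import puppeteer from "puppeteer";

        async function shot(url: string, path: string): Promise<void> {
          const browser = await puppeteer.launch();              // spawns headless Chrome
          try {
            const page = await browser.newPage();
            await page.setViewport({ width: 1280, height: 720 });
            await page.goto(url, { waitUntil: "networkidle2" }); // let the page settle
            await page.screenshot({ path, fullPage: true });
          } finally {
            await browser.close();
          }
        }

        shot("https://www.example.com", "example.png").catch(console.error);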

  • Post Author
    joshstrange
    Posted February 6, 2025 at 8:49 pm

    This is cool, but at this point MCP is the clear choice for exposing tools to LLMs; I'm sure someone will write a wrapper around this to provide the same functionality as an MCP-SSE server.

    I want to try this out, though, and see how I like it compared to the Puppeteer MCP server I'm using now (which does a great job of visiting pages, taking screenshots, interacting with the page, etc.).
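
    A rough sketch of what such a wrapper might look like with the TypeScript MCP SDK, using the stdio transport for brevity rather than SSE; the local /screenshot endpoint here is hypothetical, not the project's actual API:

        // mcp-screenshot.ts: hypothetical MCP tool that proxies a screenshot service (sketch only)
        import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
        import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
        import { z } from "zod";

        const server = new McpServer({ name: "screenshot", version: "0.1.0" });

        server.tool(
          "screenshot_url",
          { url: z.string().url() },                 // tool input schema
          async ({ url }) => {
            // Assumed endpoint of the screenshot service; adjust to the real API.
            const res = await fetch(`http://localhost:8000/screenshot?url=${encodeURIComponent(url)}`);
            const data = Buffer.from(await res.arrayBuffer()).toString("base64");
            return { content: [{ type: "image" as const, data, mimeType: "image/jpeg" }] };
          }
        );

        server.connect(new StdioServerTransport()).catch(console.error);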

  • Post Author
    jot
    Posted February 6, 2025 at 8:56 pm

    If you're worried about the security risks, edge cases, maintenance pain, and scaling challenges of self-hosting, there are various solid hosted alternatives:

    https://browserless.io – low level browser control

    https://scrapingbee.com – scraping specialists

    https://urlbox.com – screenshot specialists*

    They’re all profitable and have been around for years so you can depend on the businesses and the tech.

    * Disclosure: I work on this one and was a customer before I joined the team.

  • Post Author
    morbusfonticuli
    Posted February 6, 2025 at 9:01 pm

    Similar project: gowitness [1].

    A really cool tool I recently discovered. Besides scraping and taking screenshots of websites and saving them in multiple formats (including sqlite3), it can grab and save the headers, console logs & cookies, and it has a super cool web GUI for accessing all the data and comparing, e.g., the different records.

    I'm planning to build my personal archive.org/waybackmachine-like web-log tool via gowitness in the not-so-distant future.

    [1] https://github.com/sensepost/gowitness

  • Post Author
    westurner
    Posted February 6, 2025 at 9:35 pm

    simonw/shot-scraper has a number of CLI args, a GitHub Actions repo template, and docs:
    https://shot-scraper.datasette.io/en/stable/

    From https://news.ycombinator.com/item?id=30681242 :

    > "Awesome Visual Regression Testing" lists quite a few tools and online services: https://github.com/mojoaxel/awesome-regression-testing

    > "visual-regression": https://github.com/topics/visual-regression

  • Post Author
    kevinsundar
    Posted February 6, 2025 at 11:07 pm

    I'm looking for something similar that can also extract the diff of content on the page over time, in addition to screenshots. Any suggestions?

    I have a homegrown solution using an LLM and scrapegraphai for https://getchangelog.com but would rather offload that to a service that does a better job of rendering websites. There are some websites I get error pages from when using Playwright, but they load fine in my usual Chrome browser.
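
    One low-tech way to approximate that diff without an LLM is to snapshot the rendered text alongside the screenshot and compare runs; a sketch with Playwright (a real pipeline would no doubt be more involved):

        // text-diff.ts: snapshot rendered text with Playwright and diff it against the last run
        import { chromium } from "playwright";
        import { readFile, writeFile } from "node:fs/promises";

        async function renderedText(url: string): Promise<string> {
          const browser = await chromium.launch();
          try {
            const page = await browser.newPage();
            await page.goto(url, { waitUntil: "networkidle" });
            return await page.innerText("body");
          } finally {
            await browser.close();
          }
        }

        async function diffAgainstLastRun(url: string, cachePath: string): Promise<string[]> {
          const current = await renderedText(url);
          const previous = await readFile(cachePath, "utf8").catch(() => "");
          await writeFile(cachePath, current);
          // Naive line-level diff: report lines that are new since the previous snapshot.
          const old = new Set(previous.split("\n"));
          return current.split("\n").filter((line) => line.trim() && !old.has(line));
        }

        diffAgainstLastRun("https://www.example.com", "example.txt").then(console.log);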

  • Post Author
    mlunar
    Posted February 6, 2025 at 11:47 pm

    A similar one I wrote a while ago using Puppeteer for IoT low-power display purposes. A neat trick is that it learns the refresh interval, so it takes a snapshot just before it's requested :) https://github.com/SmilyOrg/website-image-proxy
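
    One plausible way to implement that interval-learning trick (a guess at the mechanism, not the linked project's code): keep an exponential moving average of the gaps between client requests and schedule the next render just before the next request is expected, so clients always hit a warm cache.

        // refresh-learner.ts: learn the client's poll interval and pre-render ahead of it (sketch)
        class RefreshLearner {
          private lastRequest = 0;
          private intervalMs = 60_000;          // initial guess for the poll interval
          private timer?: NodeJS.Timeout;

          constructor(private render: () => Promise<void>) {}

          onRequest(): void {
            const now = Date.now();
            if (this.lastRequest > 0) {
              const gap = now - this.lastRequest;
              // Exponential moving average keeps the estimate stable across jittery polls.
              this.intervalMs = 0.7 * this.intervalMs + 0.3 * gap;
            }
            this.lastRequest = now;
            clearTimeout(this.timer);
            // Re-render about 5 seconds before the next request is expected.
            this.timer = setTimeout(() => void this.render(), Math.max(this.intervalMs - 5_000, 1_000));
          }
        }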

  • Post Author
    rpastuszak
    Posted February 6, 2025 at 11:52 pm

    Cool! I'm using something similar on my site to generate screenshots of tweets (for privacy purposes):

    https://untested.sonnet.io/notes/xitterpng-privacy-friendly-…

  • Post Author
    ranger_danger
    Posted February 7, 2025 at 12:04 am

    No license?

  • Post Author
    jchw
    Posted February 7, 2025 at 12:04 am

    One thing to be cognizant of: if you're planning to run this sort of thing against potentially untrusted URLs, the browser might be able to make requests to internal hosts on whatever network it is on. It would be wise, on Linux, to use network namespaces and block any local IP range in the namespace, or to use a network namespace to limit the browser to a WireGuard VPN tunnel to some other network.
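
    The network-namespace approach is the robust one. As a complementary application-level check, here is a sketch that refuses URLs resolving to private or loopback addresses before handing them to the browser; note it is not a substitute, since it does not cover post-check redirects or DNS rebinding:

        // ssrf-check.ts: reject URLs that resolve to private/loopback addresses (partial mitigation)
        import { lookup } from "node:dns/promises";
        import { isIP } from "node:net";

        const PRIVATE_V4 = [/^10\./, /^127\./, /^169\.254\./, /^192\.168\./, /^172\.(1[6-9]|2\d|3[01])\./];

        function isPrivateAddress(addr: string): boolean {
          if (isIP(addr) === 6) {
            const a = addr.toLowerCase();
            return a === "::1" || a.startsWith("fc") || a.startsWith("fd") || a.startsWith("fe80");
          }
          return PRIVATE_V4.some((re) => re.test(addr));
        }

        export async function assertPublicUrl(raw: string): Promise<void> {
          const { hostname } = new URL(raw);
          const { address } = await lookup(hostname);   // resolve once, before launching the browser
          if (isPrivateAddress(address)) {
            throw new Error(`refusing to screenshot ${raw}: resolves to ${address}`);
          }
        }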
