
Show HN: HTML visualization of a PDF file’s internal structure by desgeeko


11 Comments

  • Post Author
    Muromec
    Posted February 10, 2025 at 2:07 pm

    That's pretty cool! I would have used it a lot at my previous job if it existed back then. In my ideal world it should work somewhat like https://lapo.it/asn1js/ — you drop a file and it does all the stuff locally.

  • Post Author
    SSLy
    Posted February 10, 2025 at 2:13 pm

    Damn, this is also convenient for forensics and finding watermarks.

  • Post Author
    xeon06
    Posted February 10, 2025 at 2:18 pm

    Wow, I've been doing some PDF parsing at work and this is going to come in SO handy.

  • Post Author
    est
    Posted February 10, 2025 at 2:19 pm

    I remember there was a similar project on GitHub that lets you visualize any type of binary data according to a given schema. There was a TCP/IP example, IIRC.

  • Post Author
    nonrandomstring
    Posted February 10, 2025 at 2:21 pm

    Well done. This is a very useful security previewing tool. PDFs are a menace.

  • Post Author
    swsieber
    Posted February 10, 2025 at 2:23 pm

    I've used the iText RUPS (free) for a while for debugging PDFs (as I have the "privilege" to work on code that extracts data from PDFs…). It looks like your introspection stuff might be a bit stronger, which would be great. I'll take it for a whirl.

  • Post Author
    tyilo
    Posted February 10, 2025 at 2:41 pm

    Looks nice.

    Would be better if all of the PDF's bytes were shown. Seems like `endobj` and `xref` are not shown.

  • Post Author
    escapecharacter
    Posted February 10, 2025 at 2:57 pm

    I’ve been shopping for something that does a per-byte description of the content of visual media formats (jpeg, png, avi, mp4, etc). Anyone know of one?

  • Post Author
    tekkk
    Posted February 10, 2025 at 3:08 pm

    This would be really nice as a browser library. You could just drag and drop a file and see its insides. But impressive nonetheless.

  • Post Author
    kevmo314
    Posted February 10, 2025 at 3:14 pm

    Is the UI tooling that does the visualization a library? I really like the UI format, would love to use this for breaking down and debugging video byte streams too.

    EDIT: Oh it's actually reasonably simple, great use of CSS! https://github.com/desgeeko/pdfsyntax/blob/main/docs/simple_…

  • Post Author
    LegionMammal978
    Posted February 10, 2025 at 3:33 pm

    If you're interested in manipulating PDFs, I've found QPDF [0] to be a useful tool. Its "QDF mode" lays out the objects in a form where you can directly edit them, and it can automatically fix up the xref table afterwards. It can also convert to and from a JSON format that you can manipulate with your own scripts.

    [0] https://github.com/qpdf/qpdf, https://qpdf.readthedocs.io/en/stable/
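
Several comments above refer to the structural keywords of a PDF file, such as `endobj` and `xref`. As a rough illustration of where those markers sit in the byte stream, here is a minimal Python sketch that scans raw bytes for them. It is not how pdfsyntax, RUPS, or QPDF work; the function name `scan_keywords` and the `SAMPLE` fragment are hypothetical, and the scan is deliberately naive (it would also report keywords that happen to appear inside stream data or strings).

```python
import re

# Structural keywords used by the PDF file format. The \b anchors keep
# "obj" from matching inside "endobj" and "xref" inside "startxref".
KEYWORD_RE = re.compile(
    rb"\b(obj|endobj|stream|endstream|xref|trailer|startxref)\b"
)


def scan_keywords(data: bytes) -> list[tuple[int, str]]:
    """Return (byte offset, keyword) pairs in file order.

    Naive by design: no real parsing is done, so keywords embedded in
    binary stream content or string literals are reported as well.
    """
    return [
        (m.start(), m.group(1).decode("ascii"))
        for m in KEYWORD_RE.finditer(data)
    ]


# Hypothetical fragment for illustration only; it is not a valid PDF
# (the xref entries are omitted and the startxref offset is made up).
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"xref\n0 2\n"
    b"trailer\n<< /Root 1 0 R >>\n"
    b"startxref\n9\n%%EOF\n"
)

if __name__ == "__main__":
    for offset, keyword in scan_keywords(SAMPLE):
        print(f"{offset:6d}  {keyword}")
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example `scan_keywords(open(path, "rb").read())`, where `path` is whatever PDF you want to inspect.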

© 2025 HackTech.info. All Rights Reserved.
