Validating PDF contents with Cypress

Hi all! I’ve got a couple of places on the website I’m working on where I validate that a file has been downloaded by intercepting the XHR, and this works really well. I would now like to extend this to reading the contents of the PDF (perhaps saving them to a JSON file) and asserting that the information is correct. My issue: all the examples I’ve seen online only check very basic information, such as “PDF-1.4” being on line 0 and the name being X.
However, I want to be able to assert that the date displayed is correct, that the “To” person is correct, that the line description is correct, and so on (all the content of an invoice, for example).

So I thought about getting the PDF information from something like pdf-parser and saving it as JSON, but I have no idea how to do this.

Has anyone implemented this, or seen something they can point me towards?
Thanks in advance.

For anyone wondering, this is how I work with the XHR:

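static VerifyInvoiceHasBeenDownloaded() {
    // Requires `import path from "path";` at the top of the file.
    cy.intercept("POST", "**/InvoiceDetailsPdf").as("postDownload");

    cy.get(".download-pdf-button").click({ multiple: true });

    // cy.wait (rather than cy.get) blocks until the intercepted request has completed.
    cy.wait("@postDownload").then(xhr => {
      const downloadsFolder = Cypress.config('downloadsFolder');
      // @ts-ignore
      const fileName = xhr.response.headers['content-disposition'].split('filename=')[1];
      const downloadedFilename = path.join(downloadsFolder, fileName);

      // Retries until the file exists; the non-empty check is an assumed minimal assertion.
      cy.readFile(downloadedFilename, 'binary', { timeout: 15000 })
        .should((buffer) => {
          expect(buffer.length).to.be.greaterThan(0);
        });
      cy.log(`**File ${fileName} exists in downloads folder**`);

      cy.readFile(downloadedFilename).should((text) => {
        const lines = text.split('\n');
        // (the assertions on `lines` are cut off in the original post)
      });
    });
}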

And then I call it like this in my integration file:


Have you tried this?

If I understand it correctly then it should be just what you need.

Another option for you might be to integrate visual testing.


Thanks for your answer. I have had a look through it but can’t figure out how to view the data to make assertions on it. I would need to know that it is bringing through the correct content first. Would it display in the console?
Visual testing would be a great alternative, but we put timestamps on all of our documents, so it would fail every day because the document would technically be changing.

You could try printing the text here:

it('tests a pdf', () => {
  cy.task('getPdfContent', 'mypdf.pdf').then(content => {
    // print the text first, so you can see what the task returned
    cy.log(JSON.stringify(content));
    // then test your pdf content here, with expect(this and that)...
  });
});
Samantha Louw


Most visual testing tools will allow you to exclude a certain area, or allow a certain amount of difference without marking the test as failed. So it might still be an option!
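
To make that concrete, here is a rough sketch of the idea behind "exclude an area" and "allow some difference". This is not any particular tool's API; `diffRatio`, `imagesMatch`, `ignore` and `threshold` are made-up names, and real tools do smarter comparisons than exact pixel equality. It just shows how a timestamp region can be masked out and a tolerance applied.

```javascript
// Fraction of compared pixels that differ between two same-sized RGBA buffers.
// Pixels inside any rectangle in `ignore` (e.g. the timestamp area) are skipped.
function diffRatio(a, b, width, height, ignore = []) {
  let compared = 0;
  let different = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const masked = ignore.some(
        (r) => x >= r.x && x < r.x + r.width && y >= r.y && y < r.y + r.height
      );
      if (masked) continue;
      compared++;
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      if (a[i] !== b[i] || a[i + 1] !== b[i + 1] ||
          a[i + 2] !== b[i + 2] || a[i + 3] !== b[i + 3]) {
        different++;
      }
    }
  }
  return compared === 0 ? 0 : different / compared;
}

// Passes when the differing fraction is within the allowed threshold (0 = exact match).
function imagesMatch(a, b, width, height, { ignore = [], threshold = 0 } = {}) {
  return diffRatio(a, b, width, height, ignore) <= threshold;
}
```

With the timestamp rectangle listed in `ignore`, the rest of the document can still be compared strictly, so a wrong date or recipient elsewhere on the page would still fail the test.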
