HTML to PDF error

I would like to export an HTML file to a PDF file. I am attempting to use go-wkhtmltopdf from
https://pkg.go.dev/github.com/SebastiaanKlippert/go-wkhtmltopdf
When I attempt to use this, I get the error:
fork/exec : no such file or directory

	var body bytes.Buffer
	str := "<!DOCTYPE html>\n<html>\n    test\n</html>"
	body.WriteString(str)

	pdfGenerator, err := pdf.NewPDFGenerator()

	page := pdf.NewPageReader(bytes.NewReader(body.Bytes()))
	pdfGenerator.AddPage(page)

	err = EnsureBaseDir("./pdfFiles/test.pdf")
	if err != nil {
		fmt.Println("EnsureBaseDir failed error: ", err)
	}

	file, err := os.Create("./pdfFiles/test.pdf")
	if err != nil {
		fmt.Println("os.Create error: ", err)
	}
	defer file.Close()

	pdfGenerator.SetOutput(file)
	err = pdfGenerator.Create()
	if err != nil {
		fmt.Println("pdfGenerator.Create error: ", err)
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}

The call to pdfGenerator.Create generates the error. My eventual intention is to create the HTML from a Go html template. I began with the output from templates.ExecuteTemplate, but to simplify I switched to this simple HTML string, and I still get the error.

Maybe there is a better package for accomplishing this, but this one is mentioned in a number of websites as the right one.

You do know that this requires you to have the wkhtmltopdf program and that you point the package to it via
wkhtmltopdf package - github.com/SebastiaanKlippert/go-wkhtmltopdf - pkg.go.dev?

Yeah - at first glance it looks like he forgot to install wkhtmltopdf. I’ve used that wkhtmltopdf wrapper before, and creating a dependency on that executable is always a pain on new environment setups.
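
For anyone hitting the same fork/exec error, one way to confirm whether the executable is reachable is to look it up on PATH before generating anything. This is a minimal sketch using only the standard library. The helper name findBinary is made up for illustration, and the wrapper may have its own lookup rules beyond PATH. Note also that the code in the question never checks the err returned by pdf.NewPDFGenerator(), because it is overwritten by the next assignment, so an earlier and clearer error may be getting lost there.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findBinary (hypothetical helper) reports where an executable lives on PATH,
// or returns an error if it cannot be found.
func findBinary(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s not found on PATH: %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := findBinary("wkhtmltopdf")
	if err != nil {
		// Nothing downstream can work until the program is installed.
		fmt.Println("install wkhtmltopdf first:", err)
		return
	}
	fmt.Println("wkhtmltopdf found at", path)
}
```

Running this on a fresh machine before wiring up the PDF code makes the missing dependency obvious instead of surfacing it as an empty path in fork/exec.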

I did not know that. I’ll see if I can install it.

Hey, Dean you must be pretty active on this forum. I’ve made a ton of progress on my project since you last helped me. And Jeff, I believe you helped on that also so thanks. I’m 72 years old and learning all this new technology in semi-retirement and can figure out most of it, but it’s really nice to have forums to get past the ones I cannot figure out.

Well, installing wkhtmltopdf got me past that issue. I installed wkhtmltopdf and successfully created the test.pdf with the simple HTML, no error and the PDF is good. When giving it my more-complex HTML which renders fine in the browser and is about 30 lines long, I get a huge error. It does create a PDF and it is 2300 pages long, all empty until the last page which has some elements of the HTML. I will show the end:

Current assets
Bank savings account Kiribati
Long term assets
Equipment Kiribati
Current liabilities
Interest payable Kiribati Purchase Tax payable Kiribati
Long term liabilities
Mortgage loan Kiribati
Equity
Retained earnings Kiribati
Current assets total
Long term assets total
Assets total
Current liabilities total
Long term liabilities total
Equity total
Liabilities and equity total

The error is then sent to the browser via: http.Error(w, err.Error(), http.StatusInternalServerError)

Again, mostly empty pages, but here is the beginning and end:

Loading pages (1/6)
[>                                                           ] 0%
[======>                                                     ] 10%
Warning: Blocked access to file                                   
[=====================>                                      ] 35%
Error: Failed to load about:blank, with network status code 301 and http status code 0 - Protocol "about" is unknown
[============================================================] 100%
Counting pages (2/6)                                               
[============================================================] Object 1 of 1
Resolving links (4/6)                                                       
[============================================================] Object 1 of 1
Loading headers and footers (5/6)                                           
Printing pages (6/6)
[>                                                           ] Preparing
[>                                                           ] Page 1 of 2304
[>                                                           ] Page 2 of 2304
[>                                                           ] Page 3 of 2304
[>                                                           ] Page 4 of 2304

and the end:

[===========================================================>] Page 2301 of 2304
[===========================================================>] Page 2302 of 2304
[===========================================================>] Page 2303 of 2304
[============================================================] Page 2304 of 2304
Done                                                                            
Exit with code 1 due to network error: ProtocolUnknownError

This page does have a thumbnail image, but I removed that and get the same results. This page is a financial statement, a balance sheet, with links on each account to a transaction detail page for the account. Maybe the wkhtmltopdf doesn’t handle those. There are also 3 dropdowns and a button.

Maybe you aren’t passing valid HTML due to not following the 301 redirect? The browser does that automatically.

Or maybe the redirect url is broken.

I don’t understand, my server is not doing any redirects.

It may be related to this issue:
https://stackoverflow.com/questions/62315246/wkhtmltopdf-0-12-6-warning-blocked-access-to-file
It says to enable local file access but I haven’t found a way to set it.

Sorry, I misinterpreted the 301 as an http status. If I had read more carefully I would have seen that it wasn’t. Still, it seems to be trying to load “about:blank” as a URL. That could still be a red herring regarding the 2300 pages. I’d try to pare down the html just before the “Current assets” to identify what is causing all the page breaks.

I just re-ran it but removed everything starting with current assets, same error:

[===========================================================>] Page 2302 of 2304
[===========================================================>] Page 2303 of 2304
[============================================================] Page 2304 of 2304
Done                                                                            
Exit with code 1 due to network error: ProtocolUnknownError

I removed the link to the stylesheet, no error but still not rendering even close. The pdf is 2,000 pages long and the amounts don’t show up at all.
The last page in the pdf:


rendered in the browser:

BTW, it looks a lot nicer with the style sheet

I’m very frustrated with this package. I can’t get it to work on the simplest example.

here is the HTML:

<!DOCTYPE html>

<head>
    <meta charset="UTF-8">
    <link rel="stylesheet" type="text/css" href="../styles/tutorial.css">
</head>
<html>
    <h2>test</h2>
    <div class="container">
        container
    </div>
</html>

and stylesheet:

.container {
    border: 1px solid blue;
}

Looks fine in the browser, but the pdf looks like this:

Image 3-22-22 at 3.07 PM
Notice no blue border
And I get no error on pdfGenerator.Create()

Try this

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <link rel="stylesheet" type="text/css" href="../styles/tutorial.css">
</head>
<body>
    <h2>test</h2>
    <div class="container">
        container
    </div>
</body>
</html>

May not make a difference because html renderers are usually very lenient.

See also Css styles not applying in pdf · Issue #3676 · wkhtmltopdf/wkhtmltopdf · GitHub

Sorry, I guess I made it too simple and didn’t check it since it rendered fine. No, that wasn’t it. PDF still has no styling.

I added this line:
page.EnableLocalFileAccess.Set(true) which fixed the 301 error.

Have you used this package?

Did you change your HTML to give the absolute path to the CSS file as described in the link?

Confirmed this was resolved by adding absolute path of stylesheet
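
If "absolute path" is read as a full filesystem path (my reading, not something confirmed in this thread), a server-root path like /styles/tutorial.css would not count. A rough sketch of building such a path as a file:// URL in Go follows. The helper name fileURL and the directory layout are assumptions for illustration, and whether wkhtmltopdf accepts the result in this setup is untested here.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// fileURL (hypothetical helper) turns an absolute filesystem path into a
// file:// URL that could be used as a stylesheet href.
func fileURL(absPath string) string {
	u := url.URL{Scheme: "file", Path: filepath.ToSlash(absPath)}
	return u.String()
}

func main() {
	// "styles/tutorial.css" is resolved against the directory the program
	// was started from, so the result depends on where the server runs.
	abs, err := filepath.Abs("styles/tutorial.css")
	if err != nil {
		fmt.Println("filepath.Abs error:", err)
		return
	}
	fmt.Println(fileURL(abs))
}
```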

You’ve moved beyond the Go wrapper and are bumping into issues with the underlying program.

I changed the href to:
href="/styles/tutorial.css"
This works in the browser, but the pdf still has no styling

If I change to
href="../styles/tutorial.css" the go wrapper does not style the pdf
If I run the command-line
wkhtmltopdf --enable-local-file-access tutorial.html test.pdf
from the templates folder, the pdf file is styled.
I have this structure:

serverRoot
    templates
         tutorial.html
    styles
         tutorialStyle.css

Rick, you are running up against a few problems. When you link to something using a relative path, the result depends on where you’re executing it. In your example, where you ran wkhtmltopdf --enable-local-file-access tutorial.html test.pdf, it looks like you were in the “templates” folder, so the relative path to ../styles is fine. (The same goes for the web browser, because I’m assuming you’re navigating to something like /tutorial.html.) However, if you were to run that same command in the serverRoot folder, it would look for that file in the parent folder of serverRoot.
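
To make the point about relative paths concrete, here is a small sketch of how the same href lands in different places depending on the directory it is resolved against. The /srv prefix and the helper name resolveFrom are made up for illustration. It only models plain path joining, not what wkhtmltopdf actually does internally.

```go
package main

import (
	"fmt"
	"path"
)

// resolveFrom (hypothetical helper) shows where a relative href ends up
// when it is resolved against a given directory.
func resolveFrom(dir, href string) string {
	return path.Join(dir, href)
}

func main() {
	href := "../styles/tutorial.css"

	// Resolved from the templates folder: stays inside serverRoot.
	fmt.Println(resolveFrom("/srv/serverRoot/templates", href))

	// Resolved from serverRoot itself: climbs out to the parent folder.
	fmt.Println(resolveFrom("/srv/serverRoot", href))
}
```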

It gets more complicated in the case of wkhtmltopdf because it’s executing in a headless browser. Check out this issue for more details.

If you want to get up and running quickly, one option is to place your styles directly in the chunk of HTML you’re sending to wkhtmltopdf. Instead of this in your header:

<head>
    <meta charset="UTF-8">
    <!-- External stylesheet -->
    <link rel="stylesheet" type="text/css" href="../styles/tutorial.css">
</head>

… put your styles directly in the header so you don’t have to resolve a file at all:

<head>
    <meta charset="UTF-8">
    <!-- Inline styles -->
    <style>
    .container {
        border: 1px solid blue;
    }
    </style>
</head>

Other than that, you could try adding a base tag. What happens if you do something like this?

<head>
    <meta charset="UTF-8">
    <!-- Replace this port with your actual development server port -->
    <base href="http://localhost:8080/">
    <!-- Now this will resolve to http://localhost:8080/styles/tutorial.css  
    which means that wkhtmltopdf should be able to retrieve the css from your
    go web server which is serving up the files-->
    <link rel="stylesheet" type="text/css" href="/styles/tutorial.css">
</head>

You could also try a base href of “./” with the stylesheet href of href="/styles/tutorial.css" and as long as your golang executable is running in serverRoot and you have successfully enabled local file access I think that should work.
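
To see what a given base href and stylesheet href combine into, the resolution can be checked in Go with net/url, which follows the standard URL reference rules. This is only a sketch: browsers and wkhtmltopdf apply roughly the same rules, but this does not prove the file is reachable. The helper name resolveHref and port 8080 are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveHref (hypothetical helper) resolves href against base roughly the
// way a <base> tag would, using the reference resolution in net/url.
func resolveHref(base, href string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	h, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(h).String(), nil
}

func main() {
	// Replace the port with whatever the development server uses.
	base := "http://localhost:8080/"
	for _, href := range []string{"/styles/tutorial.css", "styles/tutorial.css"} {
		resolved, err := resolveHref(base, href)
		if err != nil {
			fmt.Println("resolve error:", err)
			continue
		}
		fmt.Println(href, "=>", resolved)
	}
}
```

If the printed URL opens in a browser while the server is running, wkhtmltopdf should at least be asking for the right address.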

Hey, Dean, I tried all of your suggestions but no cigar, except for putting the styles in the <head>. I tried that and it did work, but my real stylesheet is very long and is used by almost all of my templates, about 20 of them, so maintaining it that way would be a nightmare. Next, I’m going to try stepping into the wrapper code. Maybe I can figure out where the path is getting messed up.

What happens if you set the style href to be the exact URL your browser is getting it from? E.g.:

<!-- Make sure you can paste the contents of href into browser and it works -->
<link rel="stylesheet" type="text/css" href="http://localhost:123/styles/tutorial.css">

Change port to whatever you’re using. And make sure the go app that is serving this content up is running when you go to generate the PDF.

Also, regarding inline styles: you could keep your stylesheet separate but just inject the contents of it inline in your template when you go to generate PDFs.

Also I found this exact issue here:

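It’s a problem with wkhtmltopdf, not the wrapper. In your setup code, set stderr:

pdfGenerator, err := pdf.NewPDFGenerator()
if err != nil {
	log.Fatal(err) // e.g. the wkhtmltopdf binary could not be found
}
// Forward wkhtmltopdf's own warnings and errors to the console
pdfGenerator.SetStderr(os.Stdout)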

… and you should see on the console where it tries to find your css file and can’t. I think your best options are:

  • Turn the header into a template like I mentioned above and inject contents of the CSS file as an inline style when generating a PDF.
  • Write the HTML to disk in a temp file as mentioned in issue above.

Finally something works.

    <base href="http://localhost">
    <link rel="stylesheet" type="text/css" href="/styles/tutorial.css">

I’m glad it works because my brain was boiling over stepping through the exec cmd code. I see your recent post and looked at the linked issue. I hope they are working on it. I wonder if your ‘base’ suggestion would work for that open issue.

It is finding the stylesheet and is using it. I have a thumbnail image which doesn’t show up in the pdf, but it is not loaded from a file; it is the base64 string I save in the database. The thumbnail is not that important.

However, it is also not executing the javascript, which I mainly use to format the numbers, and the pdf is 370 pages long with the actual HTML output on the last 2 pages. I found an option for disabling javascript, but not one for enabling it, so I guess the default is enabled.
browser output:
Image 3-23-22 at 1.31 PM

pdf output:
Image 3-23-22 at 1.30 PM

Looks like it doesn’t support the display: grid either.

I assume there’s a path issue with the javascript file. Be sure to set stderr:

pdfGenerator.SetStderr(os.Stdout)

That will print warnings to the console, of which I’m guessing there are some (that image file for example).

Looks like it doesn’t support the display: grid either.

Yep: