Options for writing back end persisting processes?

jdc-cunningham · May 2, 2017, 12:07am

I’ve been looking for a language for writing persistent background processes on servers that do tasks like computing, scraping, etc. My web tech ‘scope’ right now is, I guess, LAMP application developer. I’m behind in tech, but I’ve built some processes with PHP/HTMLSimpleDom, and unless you use things like CRON/nohup/other methods for persistence, these things may not continue running when you exit. Also, I’ve only successfully had CRON run on a Raspberry Pi, not on the VPSes I use or in my local environment. I’ll have to figure it out for Let’s Encrypt.

I’ve been listening to podcasts about Go and I keep hearing things like how it can replace servers and do many “concurrent” things. So far I’ve only gone as far as the basic “hello world”, but I’m wondering if I’m looking at the wrong language.

I’ve also heard of CGI background processes possibly written in C++? I’m not sure.

I guess I’m looking for thoughts from people who have experience doing this stuff with the application I have in mind.

In the future I’d also like to do mass processing of data like ML, but I don’t know if Go is the wrong tool for that, i.e., whether I should use Python. Definitely beyond my capabilities I realize as a procedural coder. I don’t mean this as object-oriented as the definition says, I mean thinking sequential processing/flow: this happens, then that.

lutzhorn · May 2, 2017, 6:49am

I don’t quite understand what you mean by that. What is a ‘process’ for you? What do you mean when you talk about ‘persisting’ it ‘in the background’?

jdc-cunningham · May 2, 2017, 7:00am

Hello lutzhorn,

Perhaps persisting is the wrong term to use here. I think an event-scheduler is more appropriate like CRON or Daemon(?).

By persisting I was referring to when you start a process by command line. For example one time I did a node tutorial with live chat and you had to start it, for it to start listening on command line. I’m not sure if you left, if it would continue running or exit. Usually you are told use something like nohup for ssh/command line.

My application would be for scrapers running at intervals, processing data, etc… but it runs once you start it and perhaps is possible to accept commands without stopping.

Edit: for the back end I currently use PHP, and with PHP you have to execute the script. So by persisting I mean keeping it running even when you exit the back end that started it, and not requiring a front-end/client trigger to start. There is nohup, but I’m wondering if you can do this with Go.

lutzhorn · May 2, 2017, 7:20am

Well, cron is not an event scheduler, it is a ‘daemon to execute scheduled commands’. You can run any program periodically using cron.

I have the impression that you should familiarize yourself with basic Unix concepts: start a process, put it into the background, get it back into the foreground, etc.

Making a process accept commands without stopping is possible, there are established concepts for this. Signals are one, listening for commands on a network socket is another.

But all this is not related to Go, it applies to any program no matter what programming language was used. It is something your OS does.

jdc-cunningham · May 2, 2017, 7:23am

Yes I probably posted too soon without reading enough. I can’t even explain the difference between Unix/Linux.

Also not exactly sure what Go is intended for despite hearing different capabilities (reducing servers, being able to talk to hardware level like Node can).

Yeah alright thanks, pointless post on my part but learned some things / have a direction to go.

nathankerr · May 2, 2017, 8:14am

In a response to another question I described the setup for my site, https://pocketgophers.com. It is written in Go and:

starts when the VPS starts/restarts
updated with git push
updates are graceful, meaning old connections are not dropped (until a timeout) when the new version starts
if the process dies it gets restarted

As far as I know, this is as persistent as a process can get.

jdc-cunningham · May 2, 2017, 8:20am

Thanks.

This is another thing I heard about too the Go front end/website capability.

What is the point of the persistence here: to make sure your site is always up? Or does it run processes? I see links but nothing seems to be “polling” I guess. Not on the front end anyway.

I’m also curious about your form’s action: is this an API endpoint? The leading double forward slash is intriguing. Not related to the question, I realize. It also does not seem to be part of your domain (front end), so it doesn’t have to deal with the CORS problem (not AJAX), not saying this is AJAX. Could be a page-refresh-on-submit sort of deal.

nathankerr · May 2, 2017, 9:23am

I want my site to always be up. As far as I know, this is the best that can be done with a single VPS.

The site is fairly static in its current state. I could almost serve it from S3, except for automatic HTTPS certificate updates, possibly some redirect handling, and backend-only analytics. However, I am just a git push (and writing the code) away from doing more complicated things such as APIs, background processes, etc. My deployment setup won’t need to change unless I need some other infrastructure (e.g., a different database (I currently use sqlite3)) as that would also need to be set up and deployed.

My form’s action is an API endpoint for https://mailchimp.com. I don’t want to manage the email list infrastructure myself as keeping an email list deliverable is a non-trivial job. The double forward slash is a way of specifying a url that uses the same scheme (http or https) as the page it is linked from. I don’t remember if I did that myself or if mailchimp did when it generated the form.
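The scheme-inheriting behaviour of a leading `//` can be checked with Go’s `net/url` package, which resolves references the same way a browser does. This is a small sketch; `list.example.com` is a placeholder host, not the real form action, and `resolve` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve returns ref resolved against the URL of the page it appears on.
func resolve(page, ref string) (string, error) {
	base, err := url.Parse(page)
	if err != nil {
		return "", err
	}
	target, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(target).String(), nil
}

func main() {
	// Placeholder host for illustration only.
	const action = "//list.example.com/subscribe"

	for _, page := range []string{"https://pocketgophers.com/", "http://pocketgophers.com/"} {
		abs, err := resolve(page, action)
		if err != nil {
			panic(err)
		}
		// The resolved URL takes its scheme from the page it is linked from.
		fmt.Println(page, "->", abs)
	}
}
```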

The form is not doing anything fancy. It POSTs to the specified url which starts the email confirmation process to subscribe to my list. Feel free to try the form out. You can unsubscribe at any time.

There is currently no AJAX done on Pocket Gophers. I think the only javascript on the site is on https://pocketgophers.com/go-release-timeline/ and is used because the content is interactive.

I try to keep my front-end as light as possible. My philosophy is that the site should help the reader as much as possible. For example, my subscription form does not get in the way of the content by popping up, running javascript, etc. Analytics is backend only, and is done after the requested content is served and therefore does not delay response time. The simple-looking design is responsive and accessible.

Hopefully this was not too much of an answer. If you want to know more, tell me.

jdc-cunningham · May 2, 2017, 9:34am

Regarding “I want my site to always be up…”

What is your concern, traffic load? Have you benchmarked your VPS to see how many requests it can support? I thought this was one of those things Go excels at, being able to serve like 25,000 concurrent requests per second or some figure I heard haha. Maybe that’s assuming a dedicated/powerful server.

I’m using Apache myself at this point, have not used Go at all (aside from the basic print demo and some loops) so yeah I’m definitely behind. I’m using LAMP/JavaScript myself and you can benchmark Apache but I have yet to do it.

I was thinking maybe you have some process that does an overflow to public cloud “copy of your code/static dumping” or some other solution to keep the site up.

Edit: you also mentioned automatic HTTPS, I think. Is this with regard to Let’s Encrypt? That’s something I have to do myself, as I’m kind of done shelling out $9 per SSL cert haha. Granted, a paid cert is set up once a year versus every 90 days, but with correct automation that’s not an excuse.

I am working on creating my own private Git thing; that’s a huge need, as I have so many copies of this web application I’m developing… It’s not intended to be open source/public, so using GitHub is not an option at this time, especially as the private repo costs more than my VPS haha (shout out to OVH).

You provided a lot of cool information I have to catch up.

Yeah regarding light sites I’m figuring that out myself trying to deploy a cloud-media oriented web app in a place where network averages between 0.5Mbps to 1.5Mbps holy cow versus the average 22Mbps in the US or for me at least 100 (1Gbps wired).

This is good stuff thanks a lot for your post.

nathankerr · May 2, 2017, 10:47am

I want my site to always be up because I want people to be able to use the information on the site. If the site is not up, the information can’t be accessed to be used.

I haven’t benchmarked my site because it doesn’t get enough traffic to impact the load on my VPS. My load average right now is 0.00 0.00 0.00.

Benchmarking is a useful tool, as demonstrated in my recent post “Concurrency Slower?”. Benchmarking Go or Apache doesn’t make much sense to me because your application or use case will always be different. I think a better approach is to instrument your application and then use that data to figure out where there are performance problems. Once a problem is located, benchmarks are useful for making sure the changes you make to the code actually improve things.
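As a rough illustration of that last step, the sketch below compares a "before" and "after" version of the same function using `testing.Benchmark` from the standard library. The two functions (`joinConcat`, `joinBuilder`) are made-up stand-ins for a located hotspot and its candidate fix, not code from the site. In a real project these would normally live in a `_test.go` file as `BenchmarkXxx` functions run with `go test -bench`.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinConcat is the hypothetical slow path found by instrumentation:
// it builds the result with repeated string concatenation.
func joinConcat(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder is the candidate improvement using strings.Builder.
func joinBuilder(parts []string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}

	before := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			joinConcat(parts)
		}
	})
	after := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			joinBuilder(parts)
		}
	})

	// Each result prints the iteration count and ns/op for that version.
	fmt.Println("concat: ", before)
	fmt.Println("builder:", after)
}
```

The numbers only mean something for this workload on this machine, which is the point made above about benchmarking your own application rather than Go or Apache in general.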

Using Let’s Encrypt with go is easy. Here is an example. I use a cache with the manager to avoid generating new certs when my server restarts/updates.

On another note, you aren’t “definitely behind”. There’s nothing to “catch up” on. Sure, there are things you don’t know. There will always be things you don’t know. In fact, the more I know, the more I know I don’t know. What I try to do is know about many things so that I can learn them as needed. For example, I was able to respond to your other post because I remembered I had seen a package that could help; then I had to find it so I could share it with you.

Running your own git server does not need to be difficult. If you are the only one who needs access, then a VPS with ssh access works well. If you need to manage multiple users, then something like Gogs would be easier. Just make sure you have backups handled.

To quickly get started writing web apps in go, you might like a book like Web Development with Go: Learn to Create Real World Web Applications using Go. The author, @joncalhoun, also runs a website and email list with beginner-appropriate material. You might, however, prefer some other book of the many available.

jdc-cunningham · May 2, 2017, 12:04pm

Thanks a lot for the links.

At the moment I’m looking to learn Go not for front end yet, just because I’m accustomed to the stack I learned, somewhat regretfully in a way. If I was a WordPress developer I could get jobs as they’re based on LAMP (I think so still). Regarding JavaScript I haven’t learned any of the new frameworks (not new but) so…

I’m after Go for the back end part, but yes good to know with front end. I’m definitely trying to build more efficient “server-side” rendered pages or templated anyway.

I mentioned benchmarking/site going down on your behalf because I wasn’t sure why else it would go down. Are you saying your server provider isn’t reliable? Otherwise why would it break or potential vulnerability?

Yeah the bench mark thing is interesting, it’s odd to think about ram consumption/cpu based on what is going on… I can see what you mentioned about profiling. I’m curious to see what my little single-core VPS with 2GB of RAM can do… apparently quite a bit… but one thing I built right now that is reallllyyy bad… in one visit it makes 11+ http requests haha. yeah it takes so much time and then you have to condense. Almost easier to burn it all and start over (slash/burn).

Regarding Git, ideally it would be nice to have multiple user capability but at this time I am the sole developer.

Do you think it makes sense to have the Git server separate from what hosts the site itself? We’re trying to cut costs at this time as we don’t really have any active users yet and having both things on the same server would be nice. I just have to figure out how you make something “private” yet you can connect to it by SSH from another ip/server/computer. I’ve got the mindset of Cors/ajax/public facing directory on that one but yeah I was going to follow a tutorial on setting up private Git directory.

Also thanks for the Let’s Encrypt mention, maybe this will be my first real “foray” into Go that is actually useful to me (not just print hello world). Though it would be interesting to see the Qualys test rating, as right now I have an A+ with my Apache config.

Anyway thanks a lot learned quite a bit.

nathankerr · May 2, 2017, 12:53pm

The book I linked to teaches you how to do the back-end stuff. Think of it like replacing Apache and PHP in LAMP. Your front end would still be in HTML, CSS, and JS; it’s just that the code generating and serving the front end is written in Go instead of PHP.

On reliability, there is no such thing as 100% reliable. Hardware fails, power fails, security updates require rebooting the VPS, etc. Most of the downtime I have is because I have to reboot to apply some update. Since my server starts when the VPS starts, the downtime is minimized. Sites like Facebook, Google, and Amazon are more reliable, but still experience downtime. Their increased reliability comes at the cost of much more complicated infrastructure including multiple servers and data centers. My current reliability needs do not justify the increased cost and complexity needed to achieve it.

You can host your git server on the same VPS as your site. If you serve it from a different subdomain (e.g., git.example.com) then it will be easy to move it if you need to. Since you are the only dev right now, I would just use git over ssh to your current VPS. This is what I am doing now. When your needs change, then you can figure out something different.

You probably don’t need to slash/burn your site to convert it to Go. I would start by having your Go version act as a reverse proxy to your current infrastructure. Then you can convert the handling of each request separately from PHP to Go, starting with the most frequent or most expensive request.

https://pocketgophers.com currently has a Qualys rating of A. I have not spent much time on it and just followed CloudFlare’s recommendations for exposing Go on the Internet. Some things have changed since then, like mandating DNS CAA, that I have not followed up on. Feel free to check it yourself.

jdc-cunningham · May 2, 2017, 2:47pm

Sounds good, exciting stuff.

Regarding the reverse proxy, not sure if I fully understand. One possible problem, we use CloudFlare (just a free account right now) I wonder how that will interact with the proxy.

I don’t know if it’s dumb, when the url builds the page (some parameter determines the posted title, meta tags, other dynamic parts of the page) rest of the html code echo’d out like a block of text… not sure if that’s a dumb way to do things… anyway not related.

I suppose regarding the vps reliability I think of it as “they are liable to keep it running” but at the same time I don’t have a pinging service that tells me “hey your server is down”. Even with cloudflare without the database which caches a lot of stuff to save on Free api calls (cloudinary haha), the site would not function.

Yeah this is great, sounds like being able to replace PHP/Apache with Go will really make me get into it/know it.

I’m just curious how crazy of a syntax/code change it will be as I do everything “full stack” (quote unquote) with regard to database schema/access/page building whatever with PHP/MySQL/JavaScript.

One time my hosting provider did go down for several hours which they compensated with proportional hosting credit, but yeah that would have sucked if it was vital. That was one time several months ago(year(s)?) but yes your point is right, I didn’t think about that, just assume all is well, they’re liable. It was a hardware failure too I think (fire).

The slash and burn… the project did not have a clear direction and I built the “modules” so to speak, and just kept tacking on code… god it’s a nightmare to trace the flow/execution of code… yeah gotta redo it specially with the network constraint of 0.5 to 1.5Mbps at best.

This was very informative thanks a lot I’m sure I’ll be referring to this page for a bit and hopefully can pick it up… I was looking to convert to Node before (thinking about it) regarding the ability to do what I do with PHP/MySQL with Node/JavaScript/Mongo. I do still feel that I’m using outdated technology but at this point I think I’m just going to focus on “rapid deployment” of different sites that hopefully produce traffic.

edit: yeah that book is a good place to start thanks, hopefully it is free WHAT!!! It costs money! haha that’s like 4 hours of work, or four days of snacks I can swing that.

edit: I keep hearing things about Go like “binary” I don’t get that, have to read. I think that is a good learning project to replicate a web app that I built in LAMP with Go and be able to empirically compare the resource usage.

joncalhoun · May 2, 2017, 3:12pm

To piggy back a tiny bit - Caddy Server makes SSL with Let’s Encrypt stupid easy regardless of what language your web app is written in. I definitely suggest looking into it. You can use it kinda like an nginx drop in with SSL in like 5 lines of code in a config file and it handles automatically renewing certs for you and everything else.

jdc-cunningham · May 2, 2017, 3:19pm

edit: sorry now I get what you meant by piggy back… oh wow you’re the author of that book holy ■■■■… the carp word is blocked? hmm

sorry I’m on the awake/should be asleep/was productive/can’t logically think end of a day

props for being an entrepreneur

Yeah I have to read a lot I have this impression of (well it used to be make from what I hear) I wonder how it is server side. You mentioned having to start/build I think (at least once on startup). I wonder if that’s a trade off as opposed to when you install Apache/other stuff and it’s just there. Not saying that’s an argument. Also the more resource usage… right now I hover at ~150MB ram rest, CPU I think is close to 0. Granted no back end processes other than processing mysql/php and serving the files.

I can’t use that argument of not using “libraries/frameworks” I mean I use jQuery extensively so what’s the difference between that and including something else (with regard to front end). I didn’t even know for example Angular ran in the browser I thought it was something you had to install. Anyway whatever makes it easier/straight forward.

I started renting GoDaddy servers that were overpriced 4-5x what I rent now so we all start somewhere.

nathankerr · May 2, 2017, 3:47pm

Go could replace Apache/PHP in your stack:

jdc-cunningham · May 2, 2017, 3:52pm

Right okay, from here on then I should return having done so, can’t say when but I see what you’re saying.

Problem with so many flavors there’s the Node.js possibility too, but I’m more sold on Go. I don’t do anything special with Apache and aside from the general logic flow/array/string/json_encode/url-parsing/referers operations with PHP, then interface with basic CRUD command with MySQL. yeah… probably does make sense to just replace it with Go.

Not sure if there is a PHP-PDO equivalent to Go and MySQL regarding injections.

It will probably take me a bit but excited to have an idea of what to do.

Thanks for everything.

nathankerr · May 2, 2017, 4:00pm

Using a reverse proxy to start changing from php to go would look like:

Initially all of the application work would be done by Apache/PHP. Then you can incrementally start handling the requests directly in Go.

This avoids needing to rewrite all your code at once. After all the requests are handled in go, you can stop using Apache/PHP.

jdc-cunningham · May 2, 2017, 4:07pm

Right I see it.

It’s actually mostly the front-end JavaScript that needs work, you know think you’ve got three things on a page each one does an HTTP request (dumb) it’s actually at least 11 requests right now per visit (user/page refresh). Granted I am using browser push state and using dynamic loading to load parts of a page. But the initial url-template building.

It’s bad a slash-burn is warranted when you’ve got multiple copies of functions in separate javascript files (scope/state problem), global state variables at least 100 lines long… jesus, 1 variable per line… but yeah…

It’s crazy though how even the time it takes for a PHP script to process say a look up request with MySQL and then return the value (actually MySQL taking time) and then on top of the latency with regard to the datacenter to the location being served… we’re talking 1KB takes 6 seconds to load… yeah I definitely have to redesign the application.

I shall return victorious! Haha but you know with day time job unrelated to web and then current project with LAMP not sure how long. But want to get into it. What is fmt anyway you know haha like <stdio.h> or something. Gotta get used to it.

nathankerr · May 2, 2017, 4:20pm

fmt contains the functions similar to the printf and scanf (from stdio.h) series of functions.
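For a feel of the parallel, here is a small sketch using `fmt.Sscanf`, `fmt.Sprintf`, and `fmt.Printf`, which play roughly the roles of `sscanf`, `sprintf`, and `printf` in C. The `parseEntry` and `describe` helpers and the sample input are made up for illustration.

```go
package main

import "fmt"

// parseEntry reads a name and a count from text shaped like "gopher 3",
// much as sscanf would in C.
func parseEntry(s string) (name string, count int, err error) {
	_, err = fmt.Sscanf(s, "%s %d", &name, &count)
	return name, count, err
}

// describe formats the pair back into a string, as sprintf would.
func describe(name string, count int) string {
	return fmt.Sprintf("%s has %d items", name, count)
}

func main() {
	name, count, err := parseEntry("gopher 3")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", describe(name, count)) // prints "gopher has 3 items"
}
```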