Machine learning in Go - future?

nyggus · August 11, 2021, 11:58am

Hi all. As a data scientist, I am interested in machine learning. I see that there are a few ML packages in Go, but they do not seem to be actively developed, and some do not even seem to be maintained. Not that long ago, TensorFlow supported Go, but not anymore. Hence, I am wondering whether Go has any chance of becoming an important player in the ML community, and I am afraid it does not look that way at the moment. Does anyone here have more knowledge about this and can share their thoughts? I’d appreciate it.

skillian · August 11, 2021, 5:26pm

Does this help you?

nyggus · August 11, 2021, 7:37pm

Thanks a lot, Sean. But frankly, this post does not help. The web is full of such posts, and they can even be misleading. They describe those packages as great options but miss the other side of the coin, the one I mentioned: the packages are not active, or are no longer maintained, or are developing very slowly compared with what we see in the machine learning community, for instance in Python. It is on the basis of those packages that I formed my main impression that machine learning in Go does not look promising.

Massimo.Costa · August 12, 2021, 8:54am

I’m not an expert in ML, but on the TensorFlow website you can find a link to the Go binding.

Isn’t it working?

nyggus · August 12, 2021, 9:00am

Thanks, Massimo. As I wrote, TensorFlow used to support Go bindings, but it no longer does. Look here: https://www.tensorflow.org/api_docs/. You will see something sad there (at least for me).

So, I suppose the binding works but is no longer maintained, and so its future is unclear.

samyak · August 13, 2021, 11:17am

I suppose you have already looked at Gorgonia.

The author/creator of Gorgonia @chewxy has written a couple of books on Machine Learning using Go. Perhaps he can provide more insights here.

nyggus · August 13, 2021, 11:21am

Yes, this is perhaps an exception to the rule, since it looks like a maintained and actively developed package. I am not sure, though, whether one active package is enough for Go to become an important player. I haven’t used it yet, so I can’t tell how deep it goes into ML, but I am going to explore it. Thanks, Samyak, for sharing!

Dean_Davidson · August 14, 2021, 2:07am

I’m having a hard time understanding your motivation for this post. If there’s a library written in Go that you want to use, use it. If there isn’t and you have the knowledge and the will to create it, create it. If you are already using something written in Python that works, keep using it unless you have a reason not to. If Gorgonia looks interesting to you, try it.

Nobody can predict the future. As a data scientist I would imagine you know that as well as anybody.

nyggus · August 14, 2021, 6:26am

Thanks, Dean, for sharing. I understand your point, and sometimes this is how I proceed. For instance, in proof-of-concept projects it does not make much of a difference which framework I choose. But I do claim that this is not so in production projects. There such a decision is crucial, and a wrong choice can lead to disasters, or at least to problems.

If a software product with an ML component is to be maintained and developed for years (and all such products I worked on were planned to work for years), then it’s important what you choose. Let me give you an example. Four years ago in my company someone started a project related to one disease. Then the project had another phase, for another disease. Then another, then another. It’s been four years, so software has changed a lot (in general, but in ML even more). Each new phase assumes that the product will be modified, not written from scratch. Much is already done, and stakeholders will not pay for more, such as rewriting significant parts of the product (adding a new disease means modifying the existing code). Imagine that the code had been written using an ML library that, at the start of the project, had already been unmaintained for three years. This would mean that we would now have to use software that is already seven years old. That’s a lot, particularly when there are so many other choices now, with so many newer solutions.

This is why I don’t agree that you can choose whatever you want, and I do claim that we data scientists should make wise decisions. And this is why I don’t agree that the discussion I started here lacks motivation. In our company, you cannot choose just whatever framework you want. We have a list of supported frameworks, and when we put a new item there, people discuss it, try it, and provide their opinions. After some time, a decision is made on whether or not the framework should be supported. I would not agree to support a framework that has not been maintained for two or three years. And we’re talking about machine learning, which has been developing extremely quickly in recent years.

No one can predict the future? Not with certainty, but I have worked on many projects using time series forecasting, and is that not predicting the future? Sometimes we fail, but quite often we do succeed, and in our lives we always try to predict the future.

I’d say that we data scientists should be able to choose a framework that will serve us not only today, but also tomorrow. At least this is important in my company, so I really think this discussion makes sense.

Dean_Davidson · August 17, 2021, 6:17pm

I understand the sentiment behind your post: you don’t want to adopt a library and then have it be abandoned by its creators. My point was that you should just evaluate things on a per-library basis, not necessarily a per-language basis. Also, you should pick what works for your team. Consider this article on Towards Data Science:

Our data shows that popularity is not a good yardstick to use when selecting a programming language for machine learning and data science. There is no such thing as a ‘best language for machine learning’ and it all depends on what you want to build, where you’re coming from and why you got involved in machine learning.

Do you have a team of 10 people who all know Python and have experience with TensorFlow? Then Python/TensorFlow is the obvious choice. Gorgonia is actively maintained and the authors seem committed to keeping it that way, but in my experience, team composition is going to be the most important factor here.

Scott_Cotton · August 17, 2021, 10:03pm

Nyggus, Dean,

In the earlier days of Go, there was more interest in science and math. One of the selling points was even built-in complex numbers. But I agree with your assessment that the community for data science (and science/math in general) is less lively than that of some other top players. I also find this disappointing at times. For example, there have been proposals to remove complex numbers from Go because no one uses them (except I was using them).

I think there are a number of historical and socio-economic reasons for this, such as the fact that Python’s scikit is actually largely based on ancient C. That kind of code base simply takes time to accumulate. Or the explosion of cloud stuff and our intrinsic need to associate a language with a fixed kind of application. To me, Go is a restart in the scientific arena and has distinguishing features which are quite amenable to scientific computing (see for example this talk).

For me, there are nonetheless some current success stories like Gonum, Bleve search, and Gorgonia. Google Cloud Platform also has some ML infrastructure with Go interfaces. I wonder if parts of that back end are in Go; who knows. There are lots of small scientific libraries of interest, but you need to have a very specific problem at hand to exploit them, not a big project.

[Data] Science in Go is growing, and with time I believe it will accelerate (for reasons similar to those in the Gonum talk I linked above). I also think the best way to help this acceleration is for us as individuals to give sciency things in Go a try (when it makes sense), and for us as a community to engage scientific and educational institutions more, and more visibly.

nyggus · August 18, 2021, 5:14am

Thanks, Dean. Yes, you get my point. I agree, though we need to remember that an ML component is always just one component of the whole application. Sure, we can use various languages for one application, but frankly, whenever I can use one language instead of two, I take that option. Hence I am thinking about Go, since I think it’s often a much better option for writing an app than Python. But then again, if this is an ML app, then perhaps Python is better?

I would be happy to see more ML packages in Go, and I would be happy to see them being actively maintained and developed. Hence my question. I am relatively new to Go, and hence I wanted to ask what those with more experience and knowledge think about it.

nyggus · August 18, 2021, 5:28am

Thanks a lot, Scott, for sharing your thoughts. Gonum does look nice indeed, and so does Gorgonia. Who knows, maybe Go will enter the science community at some point in the future. But this does not seem to be an easy task. For us, Go offers some great features, but I am afraid that for many scientists who are not programmers, languages like Python and R offer something that Go does not (though Tengo does): an interpreter. For many scientists, Python and R are not programming languages but software for data analysis and visualization. Often you do not need many programming skills to perform even complex analyses in them; your knowledge of the given method is more important. For instance, when you know a lot about generalized linear mixed effects modeling, using R for such a task will not be that complicated even when you do not know R too well. When you know a lot about neural networks, using Python and Keras can be a piece of cake. But when you need to be almost a programmer to do something, many such people will simply give up.

My company is hiring data scientists all the time, and I am often involved in this process. We often see educated data scientists who are also developers. But often we also see scientists who want to move to business and who have been using R or Python for years but who cannot be called developers. They just used these languages for some statistical or machine learning work. They don’t know what unit testing is, they have never written an application, things like that. For them, Go would be much more difficult than Python or R. In Python or R, it is enough to run the interpreter, read the data from a CSV file (for which you often have to run just one function, and sometimes you can do it directly from an IDE without actually running this function), and then run some standard code.

I think this is the reason why the science community (in which I have actually spent over 20 years) loves R and Python. But data science is also very active outside of science, and there data science means something else; it’s not just statistics, machine learning, and visualization. More of the work is related to actual programming, to creating software products. Perhaps Go will find its place there sooner or later, and I would be happy to see this happen.

True! I am trying to follow this path myself, since I have been working on a Go package to perform some complicated survey sampling. Whether or not anyone will ever use it… no idea. Maybe no one. But for me it’s a great way to learn Go, and besides, if we do not add things like that to Go, it will have no chance of being used.

system · November 16, 2021, 5:29am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.