sugarme / gotch

Go binding for Pytorch C++ API (libtorch)

example doesn't run on ubuntu

Arnold1 opened this issue

That error occurs because your Dockerfile didn't effectively update the CGO flags in the gotch package at $GOPATH/pkg/mod/github.com/sugarme/gotch@$GOTCH_VERSION/libtch/lib.go .

In your Dockerfile you should do these in order:

  1. Install compiler dependencies
  2. Install Go and set up the GOPATH environment
  3. Install libtorch
  4. Install gotch

Have a look at the Setup gotch CPU shell script for more detail (it needs to read the GOPATH env to know where the gotch package is located and update its lib.go file with the CGO flags before your Go example can compile and run).
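
To check where the compiler will actually read those CGO flags from, a minimal Go sketch like the one below can help. It only builds the expected path to lib.go inside the module cache and reports whether the file exists. The helper name libGoPath is hypothetical, and the sketch assumes the module cache lives under a single-entry GOPATH at pkg/mod (the default when GOMODCACHE is not overridden).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// libGoPath (hypothetical helper) returns where the Go module cache is
// expected to hold gotch's lib.go for a given GOPATH and gotch version,
// for example "v0.7.0".
func libGoPath(gopath, version string) string {
	return filepath.Join(gopath, "pkg", "mod", "github.com", "sugarme",
		"gotch@"+version, "libtch", "lib.go")
}

func main() {
	gopath := os.Getenv("GOPATH")
	if gopath == "" {
		// Go falls back to $HOME/go when GOPATH is unset.
		home, _ := os.UserHomeDir()
		gopath = filepath.Join(home, "go")
	}

	p := libGoPath(gopath, "v0.7.0")
	if _, err := os.Stat(p); err != nil {
		fmt.Println("lib.go not found, gotch may not be installed yet:", p)
		return
	}
	fmt.Println("cgo flags are read from:", p)
}
```

Running this inside the container before go build shows whether the setup script had a gotch package to patch at all.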

@sugarme I still have some issues with the demo example. Which compiler dependencies do I need to install?

Where do I find which CGO flags I need to set? I see there is something in https://github.com/sugarme/gotch/blob/master/setup-gotch.sh, but what sets SRCDIR in that script?
Also, the GOPATH in setup-gotch.sh is set automatically if it is not defined...

When I go build my app, do I also need to set the same flags as for the lib?

Do you have an example somewhere in this repo that shows how to compile main.go?

here is my updated Dockerfile:

FROM ubuntu:22.04

ENV DEBIAN_FRONTEND noninteractive

# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends build-essential ca-certificates cmake curl unzip nano wget g++

WORKDIR /home/developer
ENV HOME /home/developer
ENV GOPATH "$HOME/go"

# Install golang
RUN wget -c https://go.dev/dl/go1.19.5.linux-amd64.tar.gz \
&& rm -rf /usr/local/go && tar -C /usr/local -xzf go1.19.5.linux-amd64.tar.gz \
&& rm go1.19.5.linux-amd64.tar.gz
ENV PATH=$PATH:/usr/local/go/bin

# Install Libtorch - CPU
RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-libtorch.sh
RUN chmod +x setup-libtorch.sh
RUN sed -i 's/sudo//g' setup-libtorch.sh
ENV CUDA_VER=cpu
RUN bash setup-libtorch.sh

ENV GOTCH_LIBTORCH="/usr/local/lib/libtorch"
ENV LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib"
ENV CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GOTCH_LIBTORCH/lib"

# Setup gotch
RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-gotch.sh
RUN chmod +x setup-gotch.sh
RUN sed -i 's/sudo//g' setup-gotch.sh
ENV CUDA_VER=cpu
ENV GOTCH_VER=v0.7.0
RUN bash setup-gotch.sh

COPY main.go /home/developer/pytorch_demo/main.go

WORKDIR /home/developer/pytorch_demo
RUN go mod init main
RUN go mod tidy
#RUN go build main.go

ENTRYPOINT bash

my env variables:

$ printenv
HOSTNAME=17dbfb11a10c
PWD=/home/developer/pytorch_demo
HOME=/home/developer
CUDA_VER=cpu
TERM=xterm
LIBRARY_PATH=:/usr/local/lib/libtorch/lib
SHLVL=1
GOTCH_LIBTORCH=/usr/local/lib/libtorch
LD_LIBRARY_PATH=:/usr/local/lib/libtorch/lib
GOTCH_VER=v0.7.0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/go/bin
CPATH=:/usr/local/lib/libtorch/lib:/usr/local/lib/libtorch/include:/usr/local/lib/libtorch/include/torch/csrc/api/include
DEBIAN_FRONTEND=noninteractive
GOPATH=/home/developer/go
_=/usr/bin/printenv

go build main.go errors:

go build main.go
# command-line-arguments
/usr/local/go/pkg/tool/linux_amd64/link: running g++ failed: exit status 1
/usr/bin/ld: cannot find -lcuda: No such file or directory
/usr/bin/ld: cannot find -lcudart: No such file or directory
/usr/bin/ld: cannot find -lcublas: No such file or directory
/usr/bin/ld: cannot find -lcudnn: No such file or directory
/usr/bin/ld: cannot find -lcaffe2_nvrtc: No such file or directory
/usr/bin/ld: cannot find -lnvrtc-builtins: No such file or directory
/usr/bin/ld: cannot find -lnvrtc: No such file or directory
/usr/bin/ld: cannot find -lnvToolsExt: No such file or directory
/usr/bin/ld: cannot find -lc10_cuda: No such file or directory
/usr/bin/ld: cannot find -ltorch_cuda: No such file or directory
collect2: error: ld returned 1 exit status

@Arnold1,

Cgo flags located at

$GOPATH/pkg/mod/github.com/sugarme/gotch@v0.7.0/libtch/lib.go

Compilers can be either gcc or clang.

You don't need to set cgo flag for your own projects.
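
The linker errors above all name CUDA libraries, so one quick check is whether the lib.go being compiled asks for them in its #cgo LDFLAGS lines. The sketch below is one way to scan for that. The function name cudaLinkFlags and the marker list are assumptions made for illustration (the markers are taken from the library names in the error output), and the two input strings are samples, not the contents of any particular gotch release.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cudaMarkers are substrings of the library names seen in the linker
// errors; the list is illustrative, not exhaustive.
var cudaMarkers = []string{"cuda", "cublas", "cudnn", "nvrtc", "nvToolsExt"}

// cudaLinkFlags (hypothetical helper) returns every -l flag found on the
// "#cgo ... LDFLAGS:" lines of src whose name contains a CUDA marker.
func cudaLinkFlags(src string) []string {
	var found []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		// Strip the leading "//" of the cgo preamble comment.
		line := strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(sc.Text()), "//"))
		if !strings.HasPrefix(line, "#cgo") {
			continue
		}
		idx := strings.Index(line, "LDFLAGS:")
		if idx < 0 {
			continue
		}
		for _, flag := range strings.Fields(line[idx+len("LDFLAGS:"):]) {
			if !strings.HasPrefix(flag, "-l") {
				continue
			}
			for _, m := range cudaMarkers {
				if strings.Contains(flag, m) {
					found = append(found, flag)
					break
				}
			}
		}
	}
	return found
}

func main() {
	// Sample inputs: a CPU-style flag line and a made-up CUDA-style one.
	cpu := "// #cgo LDFLAGS: -lstdc++ -ltorch -lc10 -ltorch_cpu -L/lib64\n"
	gpu := "// #cgo LDFLAGS: -lstdc++ -ltorch -lc10 -lcuda -lcudart -ltorch_cuda\n"

	fmt.Println("cpu:", cudaLinkFlags(cpu))
	fmt.Println("gpu:", cudaLinkFlags(gpu))
}
```

In practice the input would be the contents of lib.go read with os.ReadFile. An empty result for the file on disk, combined with CUDA link errors, suggests the build is not using the flags currently in that file.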

@sugarme here my cflags:

cat $GOPATH/pkg/mod/github.com/sugarme/gotch@v0.7.0/libtch/lib.go
package libtch

// #cgo CFLAGS: -I -O3 -Wall -Wno-unused-variable -Wno-deprecated-declarations -Wno-c++11-narrowing -g -Wno-sign-compare -Wno-unused-function
// #cgo CFLAGS: -I/usr/local/include
// #cgo CFLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
// #cgo LDFLAGS: -lstdc++ -ltorch -lc10 -ltorch_cpu -L/lib64
// #cgo CXXFLAGS: -std=c++17 -I -g -O3
// #cgo CFLAGS: -I/libtorch/lib -I/libtorch/include -I/libtorch/include/torch/csrc/api/include -I/libtorch/include/torch/csrc
// #cgo LDFLAGS: -L/libtorch/lib
// #cgo CXXFLAGS: -I/libtorch/lib -I/libtorch/include -I/libtorch/include/torch/csrc/api/include -I/libtorch/include/torch/csrc
import "C"

and my go.mod:

go.mod 
module main

go 1.19

require github.com/sugarme/gotch v0.7.0

go.sum:

go.sum 
github.com/golang/freetype v0.0.0-20170609003504-e2365dfdc4a0/go.mod h1:E/TSTwGwJL78qG/PmXZO1EjYhfJinVAhrmmHX6Z8B9k=
github.com/sugarme/gotch v0.7.0 h1:vDQqLmuo5uhqNTfTyR7xbye9pPK9a4l57YWMKH41gGU=
github.com/sugarme/gotch v0.7.0/go.mod h1:ydo7fmsmT+2L5p8Am1YhOLSWGN9WV9nyrfk/RSmhTfo=
golang.org/x/image v0.0.0-20200927104501-e162460cd6b5/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=

@Arnold1,

The cgo flags seem to be correct for libtorch CPU.

Can you ssh into the Docker container and run printenv? The errors mean the linker cannot locate libtorch.

Also, try again

go clean
go clean -cache
go build . 

Another thing to try is the clang compiler (note that the CC and CXX variable names are case-sensitive):

apt install clang
export CC=clang
export CXX=clang++


go clean
go clean -cache
go build .

@sugarme I already ssh into the docker container and got the printenv:

printenv
HOSTNAME=17dbfb11a10c
PWD=/home/developer/pytorch_demo
HOME=/home/developer
CUDA_VER=cpu
TERM=xterm
LIBRARY_PATH=:/usr/local/lib/libtorch/lib
SHLVL=1
GOTCH_LIBTORCH=/usr/local/lib/libtorch
LD_LIBRARY_PATH=:/usr/local/lib/libtorch/lib
GOTCH_VER=v0.7.0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/go/bin
CPATH=:/usr/local/lib/libtorch/lib:/usr/local/lib/libtorch/include:/usr/local/lib/libtorch/include/torch/csrc/api/include
DEBIAN_FRONTEND=noninteractive
GOPATH=/home/developer/go
_=/usr/bin/printenv

oh that worked fine:

go clean
go clean -cache
go build . 

# ls -la
total 12496
drwxr-xr-x 1 root root     4096 Jan 21 11:16 .
drwxr-xr-x 1 root root     4096 Jan 21 09:07 ..
-rw-r--r-- 1 root root       62 Jan 21 09:07 go.mod
-rw-r--r-- 1 root root      473 Jan 21 09:07 go.sum
-rwxr-xr-x 1 root root 12770528 Jan 21 11:16 main
-rw-r--r-- 1 root root     1375 Jan 20 17:56 main.go

any idea why that was necessary?

thanks @sugarme - will close that ticket...

@Arnold1,

hey, finally! Not sure, maybe cgo kicked in too early.

@sugarme would you mind pointing me in the right direction to serve the following demo model in go?

import torch
import torch.nn as nn

# Define the model
class WeatherModel(nn.Module):
    def __init__(self):
        super(WeatherModel, self).__init__()
        self.fc1 = nn.Linear(3, 32) # input size = 3, output size = 32
        self.fc2 = nn.Linear(32, 2) # input size = 32, output size = 2
        
    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Create an instance of the model
model = WeatherModel()

# Define a loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Generate some dummy data
x = torch.randn(100, 3)
y = torch.randint(0, 2, (100,))

# Train the model
for epoch in range(100):
    # Forward pass
    output = model(x)
    loss = criterion(output, y)

    # Backward and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Save the model
torch.save(model.state_dict(), "model.pt") # or: torch.save(model, "model.pt")

@Arnold1,

Please close this issue and open a discussion, I will give you some suggestion there. Cheers.

@Arnold1 ,

Here is my solution:

package main

import (
	"fmt"

	"github.com/sugarme/gotch"
	"github.com/sugarme/gotch/nn"
	"github.com/sugarme/gotch/pickle"
	"github.com/sugarme/gotch/ts"
)

// WeatherModel mirrors the two-layer PyTorch model above.
type WeatherModel struct {
	fc1 *nn.Linear
	fc2 *nn.Linear
}

// NewWeatherModel registers both layers under the given var store path.
func NewWeatherModel(path *nn.Path) *WeatherModel {
	fc1 := nn.NewLinear(path.Sub("fc1"), 3, 32, nn.DefaultLinearConfig())
	fc2 := nn.NewLinear(path.Sub("fc2"), 32, 2, nn.DefaultLinearConfig())

	return &WeatherModel{
		fc1: fc1,
		fc2: fc2,
	}
}

func (m *WeatherModel) Forward(input *ts.Tensor) *ts.Tensor {
	fc1 := m.fc1.Forward(input)
	relu := fc1.MustRelu(true)
	fc2 := m.fc2.Forward(relu)
	relu.MustDrop()

	return fc2
}

func main() {
	device := gotch.CPU
	// If run on CUDA
	// device := gotch.CudaIfAvailable()

	vs := nn.NewVarStore(device)
	m := NewWeatherModel(vs.Root())

	// full path to your pretrained model `model.pt`
	modelFile := "FULL_PATH_TO_YOUR_MODEL_FILE"
	err := pickle.LoadAll(vs, modelFile)
	if err != nil {
		panic(err)
	}

	// Do inference
	x := ts.MustRandn([]int64{100, 3}, gotch.Float, device)
	logits := m.Forward(x) // [100, 2]

	// Do any interpretation with this logits.
	fmt.Printf("logits: %i\n", logits)
	fmt.Printf("logits: \n%#0.4f\n", logits.Float64Values())
}
# Output something like
logits: 
TENSOR INFO:
        Shape:          [100 2]
        DType:          float32
        Device:         {CPU 1}
        Defined:        true

logits: 
[0.4666 0.3637 0.3711 0.1435 0.2674 0.0968 0.2155 0.0747 0.3846 0.1310 0.1036 0.1068 0.2269 0.1374 0.2622 0.0966 0.3364 0.2419 0.2559 0.1239 0.2078 0.0541 0.1850 -0.0285 0.4018 0.0316 0.3130 0.1059 0.4992 0.1840 0.2705 0.0893 0.2177 0.0131 -0.0598 0.0564 0.4007 0.3565 0.1276 0.0877 0.5475 0.2313 0.1911 0.1193 0.5492 0.3044 0.2973 0.3399 0.1939 -0.0507 0.1650 0.1355 0.2012 -0.1244 0.2357 -0.0618 0.2346 -0.0272 0.3358 0.1037 0.2852 0.0731 0.1508 0.0010 0.1749 -0.0148 0.1111 0.0735 0.3220 0.0780 0.4949 0.2015 0.2710 0.0099 0.3836 0.1501 0.2353 0.2130 0.3477 0.0856 0.4268 0.1042 0.4200 0.3960 0.1855 0.0460 0.2461 0.1772 0.1192 -0.2242 0.0592 -0.0550 0.5307 0.2736 0.1645 -0.0829 0.3825 0.1801 0.2923 -0.0100 0.4155 0.3490 0.3317 0.0293 0.2269 0.0513 0.3326 0.1493 0.1725 0.0277 0.3002 0.1375 -0.0279 0.0550 0.2495 0.1646 0.4161 0.0835 -0.0442 -0.0546 0.2335 0.0150 0.3581 0.1400 0.4781 0.1753 -0.0736 -0.0748 0.1782 -0.1530 0.1994 0.3007 0.1893 0.1221 0.1346 -0.0705 0.0792 0.0261 0.5324 0.1616 -0.0935 -0.1747 0.2727 0.1253 0.1369 0.0070 0.2873 0.1538 0.1307 -0.0202 0.6271 0.3309 0.4274 0.1868 0.3671 -0.0449 0.1746 0.0626 0.3832 0.0547 0.3734 0.1453 0.0343 0.0024 0.3191 0.1516 0.1443 0.0324 0.0070 -0.0653 0.2069 0.0698 0.2630 0.1781 0.5994 0.2957 0.0504 0.0676 0.1586 0.0583 0.5069 0.2553 0.2641 0.2426 0.4484 0.4739 0.2172 0.0883 0.3559 0.1426 0.5432 0.2592 0.3352 0.1942 0.1889 0.0768 0.4852 0.1615 0.8677 0.4903]
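
One way to interpret these logits is to turn each row into class probabilities and pick the most likely class. The sketch below is plain Go with no gotch dependency. It assumes the flat slice returned by Float64Values is laid out row by row for the [100 2] tensor, and the helper names softmax and predict are made up for this example. Since the model in this thread was trained on random dummy data, the resulting classes carry no real meaning.

```go
package main

import (
	"fmt"
	"math"
)

// softmax converts one row of logits into probabilities. The row maximum
// is subtracted first to keep math.Exp numerically stable.
func softmax(row []float64) []float64 {
	maxV := row[0]
	for _, v := range row[1:] {
		if v > maxV {
			maxV = v
		}
	}
	out := make([]float64, len(row))
	sum := 0.0
	for i, v := range row {
		out[i] = math.Exp(v - maxV)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// predict splits flat into rows of numClasses values and returns, per row,
// the index of the most likely class and its probability. A trailing
// partial row is ignored.
func predict(flat []float64, numClasses int) (classes []int, probs []float64) {
	if numClasses <= 0 {
		return nil, nil
	}
	for start := 0; start+numClasses <= len(flat); start += numClasses {
		p := softmax(flat[start : start+numClasses])
		best := 0
		for j := range p {
			if p[j] > p[best] {
				best = j
			}
		}
		classes = append(classes, best)
		probs = append(probs, p[best])
	}
	return classes, probs
}

func main() {
	// First three rows of the sample output above.
	flat := []float64{0.4666, 0.3637, 0.3711, 0.1435, 0.2674, 0.0968}
	classes, probs := predict(flat, 2)
	for i := range classes {
		fmt.Printf("row %d: class %d (p=%.3f)\n", i, classes[i], probs[i])
	}
}
```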

@sugarme thanks a lot!

Load pretrained Pytorch models and run inference

@sugarme are there currently any limitation on the model you can serve with gotch?

@sugarme can I still ask you about the above?

@Arnold1 ,

Gotch can load pretrained models in different ways:

  1. Build your own model in Go and load pretrained weights (the example above).
  2. Convert the PyTorch model to JIT and run the JIT model in gotch (you don't need to know or build the model, just run the JIT model for inference).
  3. If you know the model and can build it, load the JIT model. This way you can load and fine-tune from a JIT model.

To date, we can build and load any model and load pretrained weights from Huggingface transformers and any pretrained vision model from PyTorch. I have also tried some Timm models without any issues.
If you use the first method with the 'pickle' subpackage, as in the example above, you may get some loading errors due to incomplete type casting in the pickle subpackage. Feel free to open an issue and provide the pretrained weights so that we can try to fix it.

Similarly with JIT models, you may get incorrect type-casting errors if the original PyTorch model was built with types that have no corresponding Go type in gotch. Again, provide the JIT model so we can try to fix it.

Also, note that with the first method you can always load pretrained weights partially and skip untrainable or unused parameters from the pretrained model.
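
To see ahead of time which parameters a partial load would cover, it can help to compare the names the Go model expects with the keys saved in the checkpoint. The sketch below does only that comparison in plain Go. It assumes weights are matched by name and that the Go model's variables are named like PyTorch's state_dict keys (fc1.weight, fc1.bias, and so on). The helper diffNames and the extra key aux.weight are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// diffNames (hypothetical helper) compares the variable names a Go model
// expects with the keys found in a checkpoint. It returns the names that
// the checkpoint lacks and the checkpoint keys the model never uses.
func diffNames(expected, checkpoint []string) (missing, unused []string) {
	have := make(map[string]bool, len(checkpoint))
	for _, k := range checkpoint {
		have[k] = true
	}
	want := make(map[string]bool, len(expected))
	for _, k := range expected {
		want[k] = true
		if !have[k] {
			missing = append(missing, k)
		}
	}
	for _, k := range checkpoint {
		if !want[k] {
			unused = append(unused, k)
		}
	}
	sort.Strings(missing)
	sort.Strings(unused)
	return missing, unused
}

func main() {
	// Names for the two Linear layers of the WeatherModel in this thread.
	expected := []string{"fc1.weight", "fc1.bias", "fc2.weight", "fc2.bias"}
	// "aux.weight" is a made-up extra key to show an unused parameter.
	checkpoint := []string{"fc1.weight", "fc1.bias", "fc2.weight", "fc2.bias", "aux.weight"}

	missing, unused := diffNames(expected, checkpoint)
	fmt.Println("missing from checkpoint:", missing)
	fmt.Println("unused checkpoint keys:", unused)
}
```

An empty missing list means every variable in the Go model has a counterpart in the checkpoint; unused keys are the ones a partial load would skip.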

Hope you get the idea. Feel free to open a discussion instead of an issue here for easier tracking.

@sugarme I want to build a classification model with PyTorch in python and serve it with gotch in golang - is there anything I need to look out for building the model using PyTorch in python and serving with gotch?

what is the recommended format for that scenario of the model and what datatypes are supported?

so from what I get from your prev reply is - there is nothing which prevents going this approach - train model in python and serve model with gotch?

@Arnold1 ,

Yes, you can train a model in Python and serve it in Go using gotch. So far, all the models I have tried work without any issues.

If you already have a Python model, just initialize the weights and save them, then try to load them with gotch and see if there are any issues.

Furthermore, read the other closed issues, as I believe someone has asked this previously. Thanks.

@sugarme thanks again - my bad I didn't see that question.

@Arnold1,

go clean
go clean -cache
go build .

This just saved me hours of headache! Thank you! No idea why it's required but it is 😄