dataiku / solutions-contrib

Autofix Libraries Path

adamnietopfizer opened this issue · comments

Hi @anaslaaroussi1,

Why do we look for project-python-libs here as the target directory?

target_directory = "project-python-libs"

For some reason, my DSS instance cannot find that directory in any of the paths listed in the PYTHONPATH environment variable. Is it guaranteed that project-python-libs will be present in PYTHONPATH? The only time I see project-python-libs is when running os.listdir(None) in the standard web app's Python backend. However, on my instance the Dataiku DSS project's library folder is not actually present in that project-python-libs directory. It is only accessible through DSS_DATA_DIR/config/projects/DSS_PROJECT_KEY/lib/webapps. I believe this may be the issue you mention here...
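To check what the backend actually sees, a small diagnostic like the one below can help. It is only a sketch: the function name `describe_python_path` and its parameters are hypothetical and not part of the library, and it assumes the environment variable in question is `PYTHONPATH`. It reports, for each entry, whether the directory exists and whether its last component is the target directory name.

```python
import os


def describe_python_path(target_directory, env_var="PYTHONPATH", environ=None):
    """Return a list of (entry, exists, matches) tuples.

    Hypothetical helper for debugging only:
    - entry:   one path taken from the environment variable
    - exists:  True if that path is an existing directory
    - matches: True if its last component equals target_directory
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(env_var, "")
    report = []
    for entry in raw.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a trailing separator
        last_component = os.path.basename(os.path.normpath(entry))
        report.append((entry, os.path.isdir(entry), last_component == target_directory))
    return report


if __name__ == "__main__":
    for entry, exists, matches in describe_python_path("project-python-libs"):
        print(f"{entry}  exists={exists}  matches={matches}")
```

Running this inside the web app's Python backend shows whether any entry ends with project-python-libs at all, which is the condition the lookup depends on.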

## can be autofixed in DSS new versions by reading the external libs and adding relative wabapps paths
## TODO : Should it be auto-fixed

I am hitting the exception raised just below line 116. Perhaps an autofix along these lines would resolve it?

    def __get_execution_main_path(self):
        ## If code studio the path should exist
        if self.context == ExecutionContext.DATAIKU_DSS_CODE_STUDIO:
            try:
                root_path = self.__get_root_path()
                exec_path = os.path.join(root_path, self.relative_path)
                if self.prefix:
                    exec_path = os.path.join(exec_path, self.prefix)
                if os.path.exists(exec_path):
                    return exec_path
                else:
                    logger.warning(f"{exec_path} path does not exist")
            except Exception as e:
                raise e from None

        ## The path should both exist and be in python libs
        elif self.context == ExecutionContext.DATAIKU_DSS:
            root_relative_path = self.relative_path.split("/")[0]
            root_path = self.__get_root_path()
            if root_relative_path in os.listdir(root_path):
                return os.path.join(root_path, self.relative_path)
            else:
                paths = os.environ.get(self.dss_python_path_env_var)
                if paths:
                    paths_splitted = paths.split(":")
                    for path in paths_splitted:
                        if root_relative_path in path:
                            return os.path.join(path, self.relative_path.lstrip("webapps").lstrip("/").rstrip("/"), "")
                raise WebaikuError(
                    f"You should add {root_relative_path} to your pythonPath in external-libraries.json of the current project lib folder"
                )
        else:
            ## 1- find the caller frame and abs path
            caller_frame = None
            calling_file_path = None
            try:
                for frame_info in inspect.stack():
                    if frame_info.filename != __file__:
                        caller_frame = frame_info
                        break
            finally:
                del frame_info

            if caller_frame is not None:
                calling_file_path = os.path.abspath(caller_frame.filename)
            else:
                logger.warning("Execution path was not found")

            if calling_file_path:
                return find_relative_path(calling_file_path, self.relative_path)
        return None

The lines I added, shown below, resolve the issue for me:

                paths = os.environ.get(self.dss_python_path_env_var)
                if paths:
                    paths_splitted = paths.split(":")
                    for path in paths_splitted:
                        if root_relative_path in path:
                            return os.path.join(path, self.relative_path.lstrip("webapps").lstrip("/").rstrip("/"), "")

The last line with os.path.join could be more readable. I also made the check safer by only matching a path whose last component is root_relative_path:

                paths = os.environ.get(self.dss_python_path_env_var)
                if paths:
                    paths_splitted = paths.split(":")
                    for path in paths_splitted:
                        if root_relative_path == path.split("/")[-1]:
                            return os.path.join(path.rstrip(root_relative_path), self.relative_path)
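One caveat about the snippets above: `str.lstrip` and `str.rstrip` remove any of the given characters rather than a literal prefix or suffix, so they only work here because a `/` happens to stop the stripping. A standalone sketch of the same lookup that avoids this is shown below. The function name `resolve_from_python_path` is hypothetical, and it assumes POSIX-style paths separated by `:`, as in the snippets above.

```python
import posixpath


def resolve_from_python_path(relative_path, python_path, sep=":"):
    """Resolve relative_path against the PYTHONPATH entry whose last
    component equals the first component of relative_path.

    Hypothetical helper; returns None when no entry matches.
    """
    parts = [part for part in relative_path.split("/") if part]
    if not parts:
        return None
    root = parts[0]
    for entry in python_path.split(sep):
        entry = entry.rstrip("/")  # only drops trailing slashes
        if entry and posixpath.basename(entry) == root:
            # The parent of the matching entry plus the full relative path
            return posixpath.join(posixpath.dirname(entry), *parts)
    return None


print(resolve_from_python_path("webapps/my_app/dist", "/data/lib/python:/data/lib/webapps"))
# prints /data/lib/webapps/my_app/dist
```

Comparing the last path component exactly also avoids matching unrelated entries that merely contain the directory name, which the earlier `in` check would accept.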

Hello @adamnietopfizer, there is indeed a bug that prevents this from working as it should when the User Isolation Framework is not set up. Thank you for pointing this out; it will be updated soon.