semgrep / semgrep

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

Home Page: https://semgrep.dev

Dockerfile syntax parsing error | dockerfile language | Engine(PartialParsing)

kariaelo opened this issue

Describe the bug
When I run a new Dockerfile rule on a YAML file of mine for building a Docker image, I get: Syntax error at line target.dockerfile:24: --> Engine(PartialParsing)

To Reproduce
Steps to reproduce the behavior: https://semgrep.dev/playground/s/kxK2P

What is the priority of the bug to you?

  • P1: important to fix or quite annoying

Environment
All environments

The YAML fails to parse because the shell command on line 24 escapes the closing square bracket in nine places without escaping the corresponding opening square bracket.

If this inconsistency is fixed then the file parses as expected:
RUN echo "PS1="[\e[91m]\u[\e[0m][\e[2m]@[\e[0m][\e[94m]\h[\e[0m]:[\e[32m]\w[\e[0m] [\e[2m]=>[\e[0m] ""

@jkinsfather's code above fixes the code for the Bash prompt (which originally showed spurious closing brackets) and it also serves as a workaround for semgrep's parsing bug.
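
The escape mismatch described above can be detected mechanically. The following is a minimal Python sketch, not part of semgrep: the function names and the sample lines are hypothetical (the original line 24 is not shown in this thread), and the check only compares how many opening and closing square brackets are preceded by a backslash.

```python
def bracket_escape_counts(line: str) -> dict:
    """Count escaped and unescaped square brackets in one line of text."""
    counts = {"\\[": 0, "[": 0, "\\]": 0, "]": 0}
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            if nxt in "[]":
                counts["\\" + nxt] += 1
            # Skip the backslash and the character it escapes (e.g. \e, \u).
            i += 2
            continue
        if ch in "[]":
            counts[ch] += 1
        i += 1
    return counts


def has_escape_mismatch(line: str) -> bool:
    """True when the number of escaped '[' differs from the number of escaped ']'."""
    counts = bracket_escape_counts(line)
    return counts["\\["] != counts["\\]"]


if __name__ == "__main__":
    # Hypothetical sample lines, written as raw strings so backslashes are literal.
    samples = [
        r'RUN echo "PS1=[\e[91m\]\u"',   # only the closing bracket is escaped
        r'RUN echo "PS1=\[\e[91m\]\u"',  # both brackets are escaped
        r'RUN echo "\e[91m"',            # no escaped brackets at all
    ]
    for sample in samples:
        print(has_escape_mismatch(sample), sample)
```

A check like this only flags the inconsistency in the prompt string. It says nothing about whether semgrep will accept the line, since the minimal example below has no mismatch and still fails to parse.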

I'm reopening the issue because there's still a bug in Semgrep's Dockerfile parser. The following Dockerfile is accepted by docker but not by semgrep:

FROM ubuntu

# sh command to set text color to red (does nothing useful but is valid)
RUN echo "\e[91m"

Edit: the command echo "\e[91m" turns the text red in sh (the default shell Docker uses to process RUN instructions) but not in bash, although it is syntactically valid in both.

docker is happy:

$ docker build -t test .
[+] Building 0.0s (6/6) FINISHED                                                
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 112B                                       0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest           0.0s
 => [1/2] FROM docker.io/library/ubuntu                                    0.0s
 => CACHED [2/2] RUN echo "\e[91m"                                         0.0s
 => exporting to image                                                     0.0s
 => => exporting layers                                                    0.0s
 => => writing image sha256:c0f4394029c2ee3b806186b37a86ccf79c5bf6057f85d  0.0s
 => => naming to docker.io/library/test                                    0.0s

But semgrep fails to parse the RUN line:

$ semgrep -l docker -e 'RUN ...' Dockerfile --verbose
No .semgrepignore found. Using default .semgrepignore rules. See the docs for the list of default ignores: https://semgrep.dev/docs/cli-usage/#ignoring-files
Rules:
- -
[WARN] Syntax error at line Dockerfile:4:
 `RUN echo "\e[91m"` was unexpected

...