Nixinova / LinguistJS

Analyse and list all languages used in a folder. Implementation of and powered by GitHub's Linguist.

Passing a string as raw input does not work - the string is interpreted as a file name

princefishthrower opened this issue

As the title states, passing a string as the raw input to interpret / analyse doesn't work. For example, this simple snippet of JavaScript code:

import linguist from "linguist-js";

const results = await linguist('console.log("Hello, world!");');

throws an exception:

null false Error: ENOENT: no such file or directory, scandir '/Users/chris/enterprise/codevideo/syntax-spy/console.log("Hello, world!");'
    at Object.readdirSync (node:fs:1514:26)
    at walk (/Users/chris/enterprise/codevideo/syntax-spy/node_modules/linguist-js/dist/helpers/walk-tree.js:25:36)
    at analyse (/Users/chris/enterprise/codevideo/syntax-spy/node_modules/linguist-js/dist/index.js:86:46)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'scandir',
  path: '/myrepository/path/root/console.log("Hello, world!");'
}

I took a look at the types:

async function analyse(path?: string, opts?: T.Options): Promise<T.Results>
async function analyse(paths?: string[], opts?: T.Options): Promise<T.Results>
async function analyse(rawInput?: string | string[], opts: T.Options = {}): Promise<T.Results> {
	const useRawContent = opts.fileContent !== undefined;
	const input = [rawInput ?? []].flat();
	const manualFileContent = [opts.fileContent ?? []].flat();

// ....

So then I assumed that passing an empty object as the options, i.e. {}, would tell linguist-js to interpret the input as rawInput:

const results = await linguist(code, {});

but unfortunately this also results in the same error.

Have I interpreted the types wrong? Do I need to pass some special flag to signal that the string I am passing is raw input and not a file name?
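
For reference, going only by the two lines quoted from analyse() above, raw-content mode seems to be switched on by the presence of opts.fileContent, not by the presence of an options object. A minimal sketch of that check, where usesRawContent is a hypothetical name used here for illustration and not part of linguist-js:

```javascript
// Hypothetical helper mirroring the quoted line
// `const useRawContent = opts.fileContent !== undefined;`.
// It is not exported by linguist-js; it only illustrates the check.
function usesRawContent(opts = {}) {
  return opts.fileContent !== undefined;
}

// An empty options object does not enable raw-content mode,
// so the first argument is still treated as a path.
console.log(usesRawContent({})); // false

// Only an explicit fileContent key flips the switch.
console.log(usesRawContent({ fileContent: 'console.log("Hello, world!");' })); // true
```

If that reading is right, it would explain why linguist(code, {}) fails with the same ENOENT error as linguist(code).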

Hi, when parsing raw file content you need to use the fileContent key in the options.
The first argument is only for the list of file names.

This is how you use the raw input feature:

const fileNames = ['file1.ts', 'file2.ts', 'ignoreme'];
const fileContent = ['#!/usr/bin/env node', 'console.log("Example");', '"ignored"'];
const options = { ignoredFiles: ['ignore*'] };
const { files, languages, unknown } = await linguist(fileNames, { fileContent, ...options });

I'll update the readme to make this clearer.
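
In the example above, fileNames and fileContent look like parallel arrays, with each name matched to the content at the same index. Assuming that is how they are paired, a small helper can build both arrays from one object so they cannot drift out of step. toLinguistArgs is a hypothetical name, not something linguist-js provides:

```javascript
// Hypothetical helper (not part of linguist-js): turns a
// { fileName: content } object into two index-aligned arrays.
function toLinguistArgs(snippets) {
  const fileNames = Object.keys(snippets);
  const fileContent = fileNames.map((name) => snippets[name]);
  return { fileNames, fileContent };
}

const { fileNames, fileContent } = toLinguistArgs({
  'file1.ts': '#!/usr/bin/env node',
  'file2.ts': 'console.log("Example");',
});

console.log(fileNames);   // [ 'file1.ts', 'file2.ts' ]
console.log(fileContent); // [ '#!/usr/bin/env node', 'console.log("Example");' ]

// The arrays can then be passed as in the example above:
// const { files, languages, unknown } = await linguist(fileNames, { fileContent });
```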

Hi @Nixinova , thanks for this info... the problem is I don't "know" what the file name / extension could be. I'm really trying to determine the language from any snippet of code - just as a string. Something like this:

const {files, languages, unknown} = await linguist(["test"], {fileContent: ["console.log('Hello, world!');"]});
console.log("files:", files);
console.log("languages:", languages);
console.log("unknown:", unknown);

yields

files: { count: 1, bytes: 29, results: { test: null }, alternatives: {} }
languages: { count: 0, bytes: 0, results: {} }
unknown: { count: 1, bytes: 29, extensions: {}, filenames: { test: 29 } }

I'd expect to see at least 'JavaScript' or 'TypeScript' in the languages results...

This is out of scope for this program then, unfortunately. Not even GitHub.com with all its might does analysis like that for files. This program requires at least one classification hint, such as a file name, hashbang, or modeline; otherwise it hasn't a clue.
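
As a rough illustration of what "at least one hint" means in practice, the sketch below checks a snippet for the kinds of hints mentioned above before it is handed to linguist-js. This is not Linguist's actual detection logic: findHints is a hypothetical helper, the patterns are deliberately loose, and special file names without an extension are not covered.

```javascript
// Hypothetical pre-check (NOT how Linguist itself classifies files):
// reports which kinds of classification hint a name/content pair carries.
function findHints(fileName, content) {
  const hints = [];
  // A file extension, e.g. "snippet.js".
  if (/\.[^./\\]+$/.test(fileName)) hints.push('extension');
  // A hashbang at the very start, e.g. "#!/usr/bin/env node".
  if (content.startsWith('#!')) hints.push('hashbang');
  // A Vim-style ("vim: set ft=ruby:") or Emacs-style ("-*- mode: ruby -*-")
  // modeline, matched loosely.
  const vimModeline = /\b(?:vim?|ex):\s.*\b(?:ft|filetype|syntax)=\w+/;
  const emacsModeline = /-\*-.+-\*-/;
  if (vimModeline.test(content) || emacsModeline.test(content)) hints.push('modeline');
  return hints;
}

// The case from this thread: no extension, no hashbang, no modeline.
console.log(findHints('test', "console.log('Hello, world!');")); // []

// Adding a hashbang or an extension gives the classifier something to work with.
console.log(findHints('test', '#!/usr/bin/env node\nconsole.log(1);')); // [ 'hashbang' ]
console.log(findHints('snippet.js', 'console.log(1);')); // [ 'extension' ]
```

An empty result corresponds to the situation in this thread, where the file ends up under unknown and nothing appears in languages.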