kbrsh / wade

:ocean: Blazing fast 1kb search library

Make stopWords translatable

jens1o opened this issue · comments

For example, in an upcoming project I'd need to translate the stopWords to German, but unfortunately this isn't possible yet. :/

But I really like the library! Keep up the good work!

Glad you like the library! I don't think English stop words will translate exactly to German, which is the problem with trying to translate them. If you have a list of German stop words, you can do something like this:

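var stopWords = ["one", "two"];

// Replace the stop-word step (index 2) of the Wade pipeline
Wade.pipeline[2] = function(str) {
  var words = str.split(" ");
  // Iterate backwards so that removing a word never skips the one after it
  for(var i = words.length - 1; i >= 0; i--) {
    if(stopWords.indexOf(words[i]) !== -1) {
      words.splice(i, 1);
    }
  }
  return words.join(" ");
}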

The third item in the Wade pipeline removes stop words (hence replacing index 2). If you have an array of German stop words, you can store it in the stopWords variable. Let me know if you have any trouble or any more questions.

EDIT: I have found a list of German stop words online, feel free to use them:

German Stop Words Array
var stopWords =
 ["aber","alle","allem","allen","aller","alles","als","also","am","an","ander","andere","anderem","anderen","anderer","anderes","anderm","andern","anderr","anders","auch","auf","aus","bei","bin","bis","bist","da","damit","dann","das","dasselbe","dazu","daß","dein","deine","deinem","deinen","deiner","deines","dem","demselben","den","denn","denselben","der","derer","derselbe","derselben","des","desselben","dessen","dich","die","dies","diese","dieselbe","dieselben","diesem","diesen","dieser","dieses","dir","doch","dort","du","durch","ein","eine","einem","einen","einer","eines","einig","einige","einigem","einigen","einiger","einiges","einmal","er","es","etwas","euch","euer","eure","eurem","euren","eurer","eures","für","gegen","gewesen","hab","habe","haben","hat","hatte","hatten","hier","hin","hinter","ich","ihm","ihn","ihnen","ihr","ihre","ihrem","ihren","ihrer","ihres","im","in","indem","ins","ist","jede","jedem","jeden","jeder","jedes","jene","jenem","jenen","jener","jenes","jetzt","kann","kein","keine","keinem","keinen","keiner","keines","können","könnte","machen","man","manche","manchem","manchen","mancher","manches","mein","meine","meinem","meinen","meiner","meines","mich","mir","mit","muss","musste","nach","nicht","nichts","noch","nun","nur","ob","oder","ohne","sehr","sein","seine","seinem","seinen","seiner","seines","selbst","sich","sie","sind","so","solche","solchem","solchen","solcher","solches","soll","sollte","sondern","sonst","um","und","uns","unse","unsem","unsen","unser","unses","unter","viel","vom","von","vor","war","waren","warst","was","weg","weil","weiter","welche","welchem","welchen","welcher","welches","wenn","werde","werden","wie","wieder","will","wir","wird","wirst","wo","wollen","wollte","während","würde","würden","zu","zum","zur","zwar","zwischen","über"]
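
With a list like the one above, the replacement step can also be built by a small factory, which makes it easy to swap lists per language. This is only a minimal sketch: it assumes, as described earlier in this comment, that index 2 of `Wade.pipeline` is the stop-word step, and `makeStopWordFilter` is a hypothetical helper name, not part of Wade.

```javascript
// Hypothetical helper (not part of Wade): builds a pipeline step that
// drops every word found in the given stop-word list.
function makeStopWordFilter(stopWords) {
  // Store the words as object keys so each lookup avoids scanning the array
  var lookup = Object.create(null);
  for (var i = 0; i < stopWords.length; i++) {
    lookup[stopWords[i]] = true;
  }
  return function(str) {
    return str.split(" ").filter(function(word) {
      return !lookup[word];
    }).join(" ");
  };
}

// A short excerpt of the German list, for illustration only
var removeGermanStopWords = makeStopWordFilter(["aber", "der", "die", "und"]);

// Only touch the pipeline when Wade is actually loaded
if (typeof Wade !== "undefined") {
  Wade.pipeline[2] = removeGermanStopWords;
}
```

Calling `removeGermanStopWords("der hund und die katze")` returns `"hund katze"`. Using `filter` instead of `splice` inside a loop also avoids skipping a word when two stop words appear next to each other.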

Maybe you could implement the German stop words, and we could somehow set a language for the search? That would be quite cool...

I can make the stopWords variable configurable if you're interested, so that the user can change what stop words are being used.

That's what I want :)

Cool, I'll add it in the next version.

Thanks!