Ad detection in Russian Wikipedia

An ML pipeline that produces a model capable of detecting ads in Russian Wikipedia (or in any other corpus you can think of). Work is still in progress: the model already shows some results, but further improvement is required.

As part of the pipeline, an automatically extracted corpus of Russian-language ad samples is generated at step 2.


