brennanpincardiff / eksiR

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

# eksiR: An R package to scrape data from eksisozluk.com

## Installation

```{r}
devtools::install_github("sefabey/eksiR",force=T)
library(eksiR)
```

## Example usage

The goal of this package is to make scraping data from eksisozluk.com easier in R. 

You can scrape entries by providing entry IDs:

```{r}
eksiR::eksi_scrape_entry(1789751)
```

You can also export scraped entries into a .csv file:
```{r}
eksiR::eksi_scrape_entry(1789751,export_csv = T)
```

If you would like to scrape multiple entries and return a neat tibble:

```{r}
entry_ids <- 1:20
purrr::map_df(entry_ids, eksi_scrape_entry)
```

As default, removed/deleted entries are retained. If you want to remove them use:

```{r}
entry_ids <- 1:20
purrr::map_df(entry_ids, eksi_scrape_entry) %>% 
  na.omit()
```

You can also scrape topics:
```{r}
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837")
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837",export_csv = T)
```

You can also use utility functions to parse eksisozluk specific date/time objects. This function adds two extra columns to tibbles, namely `original_date_time` and `edited_date_time`
```{r}
eksi_scrape_topic("https://eksisozluk.com/tayyip-erdogan-fatih-sarac-ses-kaydi--4224837") %>% 
  eksi_date_time_parse()
```



## But, What is eksisozluk?

[Wikipedia](https://en.wikipedia.org/wiki/Ek%C5%9Fi_S%C3%B6zl%C3%BCk) defines eksisozluk as:

*Ekşi Sözlük (Turkish pronunciation: [ecˈʃi søzˈlyc]; "Sour Dictionary") is a collaborative hypertext "dictionary" based on the concept of Web sites built up on user contribution.[1] However, Ekşi Sözlük is not a dictionary in the strict sense; users are not required to write correct information. It is currently one of the largest online communities in Turkey with over 400,000 registered users.[2] The number of writers is about 54,000. As an online public sphere, Ekşi Sozluk is not only utilized by thousands for information sharing on various topics ranging from scientific subjects to everyday life issues, but also used as a virtual socio-political community to communicate disputed political contents and to share personal views.[3]*

*The website's founder is Sedat Kapanoğlu. He founded the website for communicating with his friends in 1999 as he was inspired by The Hitchhiker's Guide to the Galaxy.[4][5] Ekşi Sözlük has been successful, many other websites that use this concept has emerged, like İTÜ Sözlük in Turkish.[6]*

*Enrollment periods to the dictionary and the criteria of acceptance are changeable.[7] Ekşi Sözlük does not accept new authors regularly; there are specific times in which new authors are accepted. There is a waiting period for new members who want to become authors in which they must post at least 10 entries. All entries are inspected according to the dictionary rules and their quality, and if they are pass inspection, the new user becomes an author. However, this process might take from months to years.*


AFAIK, there is no popular platform outside Turkey thay may act as eksisozluk's exact counterpart. Although closest candidate would be urbandictionary.com, it must be noted that there are major differences between both platforms. Despite the lack of global attention to platform's format, eksisozluk is one of the 16th most visited websites in Turkey, according to [Alexa](https://www.alexa.com/topsites/countries/TR).

About

License:Other


Languages

Language:R 100.0%