Database implementation

PeeHaa opened this issue 12 years ago

I'm thinking of implementing database support for the backlog. This would prevent possible issues with Stack Overflow's throttling limits as well as pave the way for some new features I want to implement.

It would also introduce a way to better integrate the cv-pls plugin.

My question is: would it be hard for you to set up a database for the backlog?

Gordon Oheim · Answer 1 · Tue Dec 25 2012 06:36:14 GMT+0800 (China Standard Time)

Setting up a db would be fairly easy. How much access to it would you need?

Pieter Hordijk · Answer 2 · Wed Dec 26 2012 04:52:30 GMT+0800 (China Standard Time)

Don't think we need that much. I'm still working out with the cv-pls plugin team what we want / need to implement. I first wanted to find out how big a pain it would be to enable database access before making concrete decisions. Will update this ticket when the requirements are clearer.

Jocelyn · Answer 3 · Fri May 10 2013 10:03:08 GMT+0800 (China Standard Time)

What do you have in mind? A server-side database like MySQL, or a client-side database like Local Storage?

Pieter Hordijk · Answer 4 · Fri May 10 2013 15:44:17 GMT+0800 (China Standard Time)

@J0celyn I was thinking about server-side storage, because that would open up a lot of possibilities we cannot easily have when relying on client-side storage.

Gordon Oheim · Answer 5 · Fri May 10 2013 21:37:43 GMT+0800 (China Standard Time)

There are still no concrete ideas for why the CVBacklog would need this, though. The throttling issues are gone since I added the API key. So while I am still open to the idea, I am hesitant to implement this just because I can.

Jocelyn · Answer 6 · Sun May 12 2013 02:08:40 GMT+0800 (China Standard Time)

I think the first use of a database would be to store the list of questions already extracted from the chat transcripts. If the dates of the earliest and latest chat lines parsed are stored too, then the script knows exactly what to browse the next time the data is updated.
The database could also be useful to show the list of questions in a different order. Right now, the latest questions posted in the chat are the first ones to appear in the list.
Instead, the list could show the questions with the most close-votes or del-votes first. Or first show questions that need to be close-voted or del-voted to avoid automatic deletion of close-votes/del-votes.
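
To make the idea concrete, here is a minimal sketch of what such storage could look like, using Python's built-in `sqlite3` module. Nothing here comes from the actual CVBacklog code: the table names (`questions`, `parse_state`), the columns, and the helper functions are all hypothetical, and timestamps are assumed to be Unix epoch seconds.

```python
import sqlite3


def open_backlog_db(path=":memory:"):
    """Open (or create) a hypothetical backlog database."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS questions (
            question_id  INTEGER PRIMARY KEY,
            close_votes  INTEGER NOT NULL DEFAULT 0,
            delete_votes INTEGER NOT NULL DEFAULT 0
        );
        -- Single-row table holding the chat time range already parsed.
        CREATE TABLE IF NOT EXISTS parse_state (
            id              INTEGER PRIMARY KEY CHECK (id = 1),
            earliest_parsed INTEGER NOT NULL,
            latest_parsed   INTEGER NOT NULL
        );
    """)
    return conn


def record_batch(conn, questions, earliest, latest):
    """Store (question_id, close_votes, delete_votes) tuples and widen the parsed range."""
    conn.executemany(
        "INSERT OR REPLACE INTO questions (question_id, close_votes, delete_votes) "
        "VALUES (?, ?, ?)",
        questions,
    )
    row = parsed_range(conn)
    if row is not None:
        earliest = min(earliest, row[0])
        latest = max(latest, row[1])
    conn.execute(
        "INSERT OR REPLACE INTO parse_state (id, earliest_parsed, latest_parsed) "
        "VALUES (1, ?, ?)",
        (earliest, latest),
    )
    conn.commit()


def parsed_range(conn):
    """Return (earliest, latest) parsed timestamps, or None if nothing was parsed yet."""
    return conn.execute(
        "SELECT earliest_parsed, latest_parsed FROM parse_state WHERE id = 1"
    ).fetchone()


def by_close_votes(conn):
    """List question ids, most close-votes first (ties broken by id)."""
    rows = conn.execute(
        "SELECT question_id FROM questions ORDER BY close_votes DESC, question_id"
    )
    return [r[0] for r in rows]
```

With something like this, the next update would only need to browse chat lines outside the range returned by `parsed_range`, and the alternative ordering becomes a plain `ORDER BY`.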

Gordon Oheim · Answer 7 · Sun May 12 2013 19:34:41 GMT+0800 (China Standard Time)

Scraping is not an issue. It's always only 25 pages and that's not a bottleneck in terms of speed. Sorting is also not an issue. We can easily sort the results from the SE.API in memory. The questions are currently listed by creation date btw and not by how they appear in the Chat Search (if they are, it's coincidence or a bug). So the only thing a database would give us is a longer backlog. Do we need that? I don't think so.
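
As a rough illustration of the in-memory sorting mentioned here, the sketch below sorts a list of question records without any database. It is written in Python for brevity and is not the backlog's actual code; the function name `sort_backlog` is hypothetical, and the keys `creation_date` and `close_vote_count` are assumed to be present on each record returned by the API.

```python
def sort_backlog(questions, key="creation_date", descending=True):
    """Return a new list of question dicts sorted by the given field.

    Records missing the field are treated as having the value 0.
    """
    return sorted(questions, key=lambda q: q.get(key, 0), reverse=descending)


# Hypothetical sample data, shaped like decoded API results.
sample = [
    {"question_id": 1, "creation_date": 300, "close_vote_count": 1},
    {"question_id": 2, "creation_date": 100, "close_vote_count": 4},
    {"question_id": 3, "creation_date": 200, "close_vote_count": 2},
]

newest_first = sort_backlog(sample)
most_close_votes_first = sort_backlog(sample, key="close_vote_count")
```

For a few hundred questions this kind of sort is effectively instant, which is the point being made above.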

Like I said, it's not that I am against changing to a database. But right now, the backlog does nothing that would really warrant the effort. For the scope of this application, the current approach works quite well without a database.

Gordon Oheim · Answer 8 · Mon Jul 22 2013 16:25:56 GMT+0800 (China Standard Time)

Since we did not find a concrete reason why we require a database, I am closing this. I suggest opening a new ticket once we have a concrete use case.