FrankKair / polyglot-euler

📜 Project Euler solutions in various programming languages

Agnostic Omniscience Profiler

dsevero opened this issue · comments

It would be interesting to see comparisons between different versions automagically.

For example: the current solution to 001, which uses a list comprehension, in contrast to a generator expression:

# current
print(sum([n for n in range(1000) if n % 3 == 0 or n % 5 == 0]))

# new
print(sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0))

Storing a README in each subdirectory with the current stats (CPU time and memory consumption) would be nice.

This idea is very interesting, @daniel-severo!

Should we have multiple solutions of the same language or should we compare different languages? I'd like to hear your thoughts as well, @murilocamargos, @fredericojordan and @caian-gums.

In the particular case you illustrated above, I think we should just optimize the code that's already in the repository, since it's very clear that the generator expression is more memory efficient.

I think the generator version will consume less memory but may be slower (since the computation itself is simple and generators have some overhead), so that would be interesting to compare as well.
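
For reference, here is a minimal sketch (not code from the repo; Python standard library only, with arbitrary iteration counts) of how the two versions could be compared for CPU time and peak memory:

import timeit
import tracemalloc

list_comp = "sum([n for n in range(1000) if n % 3 == 0 or n % 5 == 0])"
gen_expr = "sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0)"

for label, stmt in (("list comprehension", list_comp), ("generator", gen_expr)):
    # CPU time: best of 5 repeats of 10,000 evaluations each
    best = min(timeit.repeat(stmt, number=10_000, repeat=5))
    # Peak Python-level memory while evaluating the expression once
    tracemalloc.start()
    eval(stmt)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{label:20s} {best:.4f} s per 10k runs, peak {peak} bytes")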

We could have different versions of the same problem solved in the same language (e.g. 001v1.py, 001v2.py, ...) and compare time and memory consumption both within a language and across languages. This way we wouldn't need a snippets directory as suggested in #6, because there would be room to add more efficient code without removing the old one.

From my point of view, there are some issues to discuss. The main point of the repo is either:

  • to have the most efficient solution (memory, time spent, etc.) in each language, or
  • to have as many solutions as possible.

Since I think the most important thing right now is having the most efficient solution, we should keep just one file per language. In the future, we could have something like:

002_caian-gums.cpp
002_FrankKair.cpp
...

And if someone finds a better or faster solution, they can just open a PR against the already submitted solution and change a small portion of it or the entire code.

I'm not sure it's that simple. "The best solution" doesn't exist. The example above illustrates this: one version consumes less RAM, the other less CPU time.

IMHO, we should strive to make the fastest running scripts, disregarding memory usage, since most of the problems are CPU-bound.
I also think that algorithm discussion and comparison should be kept either on the problem's README or the project's wiki, both seem like a nice place to me.

👍 to @fredericojordan's ideas (I think keeping the comparison on the wiki is the best option).

@FrankKair, can you open the project's wiki?

I just opened the wiki with a simple welcome message.


About the profiler, would a simple $ time suffice to get started?
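
A minimal sketch of such a wrapper, assuming each solution can be invoked as a plain command (Unix only, Python standard library; the solution path in the last line is hypothetical, not an actual file in the repo):

import resource
import subprocess
import time

def profile(cmd):
    # Resource usage of child processes, before and after the run
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    proc = subprocess.run(cmd)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "real": round(real, 3),
        "user": round(after.ru_utime - before.ru_utime, 3),
        "sys": round(after.ru_stime - before.ru_stime, 3),
        "max_rss": after.ru_maxrss,  # kilobytes on Linux, bytes on macOS
        "exit_code": proc.returncode,
    }

print(profile(["python3", "src/001/001.py"]))  # hypothetical path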

Maybe it's better to keep the time/CPU info in the README of each directory, so it's more easily accessible, and use the wiki for more complex solution strategies.

What do you guys think? An example with problem 001:

https://projecteuler.net/problem=001

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. 
The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.


Performance / Status / Profiling / Whatever
C++:      0.0s
Crystal:  0.1s
Erlang:   0.3s
Elixir:   0.3s

or

https://projecteuler.net/problem=001

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. 
The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.


Performance / Status / Profiling / Whatever
- C++
Real time: 11.570 s
User time: 11.377 s
Sys. time: 0.147 s
CPU share: 99.61 %
Exit code: 0

- Elixir
Real time: 11.570 s
User time: 11.377 s
Sys. time: 0.147 s
CPU share: 99.61 %
Exit code: 0

(The numbers are for illustration purposes only; they are not measurements of the actual code.)

Or maybe we could have a separate file in each folder called profiling or performance, like so: src/001/performance.md, where we would keep all these numbers.
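
A hypothetical sketch of that last option, writing src/001/performance.md from a dict of collected numbers (the layout and the values are placeholders, like the illustrative numbers above):

# Placeholder numbers, mirroring the illustrative examples above
results = {
    "C++":     {"real": 0.0, "user": 0.0, "sys": 0.0},
    "Crystal": {"real": 0.1, "user": 0.1, "sys": 0.0},
}

lines = [
    "https://projecteuler.net/problem=001",
    "",
    "Language | Real (s) | User (s) | Sys (s)",
]
for lang, r in sorted(results.items()):
    lines.append(f"{lang:8} | {r['real']:8.3f} | {r['user']:8.3f} | {r['sys']:7.3f}")

with open("src/001/performance.md", "w") as f:
    f.write("\n".join(lines) + "\n")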

Not sure what's the best way to go here 😅

@daniel-severo If you read my comment, you'll see that when I said 'best solution' I meant 'the best solution based on our parameters'. I agree with you that 'the best solution' on its own is too vague; we have to decide where to make the cut.

@FrankKair Thanks for opening the wiki! I think we can place that content in performance.md and link to it from readme.md. That's my opinion.

I noticed that in #69 we got a 'functional approach' to the JS code for solution 001. Different solutions with different approaches may not be the fastest or most efficient, but in this specific case (JS code with a functional approach) I can read the code and still find it elegantly written. This is totally a personal opinion, and I understand that.

I think we should discuss exactly how we approve or reject the 'best solution', if that isn't settled yet.
My opinion is that we should maintain one and only one solution per language. Should we keep the other solutions on a dedicated wiki page for each problem?