FrankKair / polyglot-euler

📜 Project Euler solutions in various programming languages

Agnostic Omniscience Profiler

dsevero opened this issue · comments

It would be interesting to see comparisons between different versions automagically.

For example: the current solution to 001, which uses a list comprehension, in contrast to a generator expression:

# current
print(sum([n for n in range(1000) if n % 3 == 0 or n % 5 == 0]))

# new
print(sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0))

Storing a README in each subdirectory with the current stats (CPU time and memory consumption) would be nice.

This idea is very interesting, @daniel-severo!

Should we have multiple solutions of the same language or should we compare different languages? I'd like to hear your thoughts as well, @murilocamargos, @fredericojordan and @caian-gums.

In the particular case you illustrated above, I think we should just optimize the code that's already in the repository, since it's very clear that the generator expression is more memory efficient.

I think the generator version will consume less memory but may be slower (since the computation itself is simple and generators have some overhead), so that would be interesting to compare as well.
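
For reference, here is a minimal sketch (not code from the repo; Python standard library only, with arbitrary iteration counts) of how the two versions could be compared for CPU time and peak memory:

import timeit
import tracemalloc

list_comp = "sum([n for n in range(1000) if n % 3 == 0 or n % 5 == 0])"
gen_expr = "sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0)"

for label, stmt in (("list comprehension", list_comp), ("generator", gen_expr)):
    # CPU time: best of 5 repeats of 10,000 evaluations each
    best = min(timeit.repeat(stmt, number=10_000, repeat=5))
    # Peak Python-level memory while evaluating the expression once
    tracemalloc.start()
    eval(stmt)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{label:20s} {best:.4f} s per 10k runs, peak {peak} bytes")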

We could have different versions of the same problem solved in the same language (e.g. 001v1.py, 001v2.py, ...) and compare time and memory consumption both within a language and across languages. This way we wouldn't need a snippets directory as suggested in #6, because there would be room to add more efficient code without removing the old one.

From my point of view, there are some issues to discuss. The main point of the repo is either:

  • to have the most efficient solution (memory, time spent, etc.) in each language, or
  • to have as many solutions as possible.

Since I think the most important thing right now is having the most efficient solution, we should keep just one file per language. In the future, we could have something like:

002_caian-gums.cpp
002_FrankKair.cpp
...

And if someone finds a better or faster solution, they can just open a PR against the already submitted solution and change a small portion of it or the entire code.

I'm not sure it's that simple. "The best solution" doesn't exist. The example above illustrates this: one version consumes less RAM, the other less CPU time.

IMHO, we should strive to make the fastest running scripts, disregarding memory usage, since most of the problems are CPU-bound.
I also think that algorithm discussion and comparison should be kept either on the problem's README or the project's wiki, both seem like a nice place to me.

👍 to @fredericojordan's ideas (I think keeping the comparison on the wiki is the best option).

@FrankKair, can you open the project's wiki?

I just opened the wiki with a simple welcome message.


About the profiler, would a simple $ time suffice to get started?
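
A minimal sketch of such a wrapper, assuming each solution can be invoked as a plain command (Unix only, Python standard library; the solution path in the last line is hypothetical, not an actual file in the repo):

import resource
import subprocess
import time

def profile(cmd):
    # Resource usage of child processes, before and after the run
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    proc = subprocess.run(cmd)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "real": round(real, 3),
        "user": round(after.ru_utime - before.ru_utime, 3),
        "sys": round(after.ru_stime - before.ru_stime, 3),
        "max_rss": after.ru_maxrss,  # kilobytes on Linux, bytes on macOS
        "exit_code": proc.returncode,
    }

print(profile(["python3", "src/001/001.py"]))  # hypothetical path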

Maybe it's better to keep the time/CPU info in the README of each directory, so it's more easily accessible, and use the wiki for more complex solution strategies.

What do you guys think? An example with problem 001:

https://projecteuler.net/problem=001

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. 
The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.


Performance / Status / Profiling / Whatever
C++:      0.0s
Crystal:  0.1s
Erlang:   0.3s
Elixir:   0.3s

or

https://projecteuler.net/problem=001

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. 
The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.


Performance / Status / Profiling / Whatever
- C++
Real time: 11.570 s
User time: 11.377 s
Sys. time: 0.147 s
CPU share: 99.61 %
Exit code: 0

- Elixir
Real time: 11.570 s
User time: 11.377 s
Sys. time: 0.147 s
CPU share: 99.61 %
Exit code: 0

(The numbers are for illustration purposes only; they are not measurements of the actual code.)

Or maybe we could have a separate file in each folder called profiling or performance, like so: src/001/performance.md, where we would keep all these numbers.
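
A hypothetical sketch of that last option, writing src/001/performance.md from a dict of collected numbers (the layout and the values are placeholders, like the illustrative numbers above):

# Placeholder numbers, mirroring the illustrative examples above
results = {
    "C++":     {"real": 0.0, "user": 0.0, "sys": 0.0},
    "Crystal": {"real": 0.1, "user": 0.1, "sys": 0.0},
}

lines = [
    "https://projecteuler.net/problem=001",
    "",
    "Language | Real (s) | User (s) | Sys (s)",
]
for lang, r in sorted(results.items()):
    lines.append(f"{lang:8} | {r['real']:8.3f} | {r['user']:8.3f} | {r['sys']:7.3f}")

with open("src/001/performance.md", "w") as f:
    f.write("\n".join(lines) + "\n")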

Not sure what's the best way to go here 😅

@daniel-severo If you read my comment, you'll see that when I said 'best solution' I meant 'the best solution based on our parameters'. I agree with you that 'the best solution' on its own is too vague; we have to decide where to make the cut.

@FrankKair Thanks for opening the wiki! I think we can place that content in performance.md and link to it from readme.md. That's my opinion.

I noticed that in #69 we got a 'functional approach' to the JS code for solution 001. Different solutions with different approaches may not be the fastest or most efficient, but in this specific case (JS code with a functional approach) I can read the code and still find it elegantly written. This is totally a personal opinion, and I understand that.

I think we should discuss exactly how we approve or reject the 'best solution', if that isn't settled yet.
My opinion is that we should maintain one and only one solution per language. Should we keep the other solutions on a dedicated wiki page for each problem?