Speedup of day 16
marhoy opened this issue
Thanks for a nice explanation of the reverse cumsum approach for part 2!
You complain about the speed, but it can in fact be sped up even further:
Since we know we are only going to look at a small part of the whole list, we don't need to copy the input 10_000 times. With my input, I only had to copy it 806 times.
This function solves my input in 7s with standard Python:
def part2(input_string, phases=100):
    # Find the start index and convert input to list of ints
    start_index = int(input_string[:7])
    numbers = list(map(int, input_string))

    # Make sure that start index is indeed in the second half,
    # otherwise the trick won't work
    assert start_index > len(input_string)*10_000 / 2

    # We are only going to compute the numbers from just before start_index
    # to the end (in reverse). So we don't need to copy the input 10_000 times.
    n_repeats = (len(input_string)*10_000 - start_index) // len(input_string) + 1

    # Compute new start index for this shorter list:
    start_index -= (10_000 - n_repeats)*len(input_string)
    numbers = numbers*n_repeats

    for _ in range(phases):
        cumsum = 0
        for i in range(len(numbers) - 1, start_index - 1, -1):
            cumsum += numbers[i]
            numbers[i] = cumsum % 10

    return "".join(map(str, numbers[start_index:start_index + 8]))
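The index arithmetic for the shortened list can be checked in isolation. The following is a minimal sketch: the helper name `tail_window`, its `total_repeats` parameter, and the sample numbers are all made up for illustration and are not part of the function above.

```python
def tail_window(length, start_index, total_repeats=10_000):
    """Return (n_repeats, new_start) for the shortened list.

    length        -- number of digits in one copy of the input
    start_index   -- offset into the full, repeated list
    total_repeats -- how many copies the full list would contain
    """
    # Copies needed to cover everything from start_index to the end.
    n_repeats = (length * total_repeats - start_index) // length + 1
    # Shift the offset by the number of whole copies that were dropped.
    new_start = start_index - (total_repeats - n_repeats) * length
    return n_repeats, new_start


# Small check: the tail of the short list matches the tail of the full list.
digits = [1, 2, 3, 4, 5]
full = digits * 20
n_repeats, new_start = tail_window(len(digits), 87, total_repeats=20)
short = digits * n_repeats
assert short[new_start:] == full[87:]

# Hypothetical puzzle-sized numbers: a 650-digit input and offset 5_976_500
# need 806 copies, and the offset inside the short list becomes 400.
print(tail_window(650, 5_976_500))
```

When the offset falls exactly on a copy boundary the formula keeps one extra copy, which is harmless because the loop never goes below the new start index.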
Hey, thank you very much for reading my walkthrough and for the time you took to open an Issue here. I already replied to your Reddit comment, but I'm answering here too.
The behavior you're observing is actually a false positive. Even though working with a smaller list is better, it does not give a noticeable performance boost. The speedup of your second part over mine only comes from the fact that you enclosed it inside a function. If I enclose my second part in a function, the speed is the same as yours. Multiplying the list 10k times is not a big deal.
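This happens because, inside functions, local variables are read with the LOAD_FAST Python opcode, which is much faster than the name-based lookups (LOAD_NAME at module level, LOAD_GLOBAL for globals accessed from inside a function) used for global variables and therefore all over the place in the main body of the script.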
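The opcode difference can be inspected with the standard `dis` module. This is a minimal sketch assuming CPython. Exact opcode names vary between versions (newer releases emit fused variants such as LOAD_FAST_LOAD_FAST), so it only prints the FAST and NAME families. The loop and the names `LOOP`, `loop_in_function` and `opnames` are made up for illustration.

```python
import dis

# The same loop, once as module-level source and once inside a function.
LOOP = """
total = 0
for i in range(3):
    total += i
"""


def loop_in_function():
    total = 0
    for i in range(3):
        total += i
    return total


def opnames(code):
    """Collect the set of opcode names used by a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}


# Compile the module-level version without running it.
module_ops = opnames(compile(LOOP, "<script>", "exec"))
function_ops = opnames(loop_in_function.__code__)

# Module-level variables go through name lookups in a dictionary.
print(sorted(op for op in module_ops if "NAME" in op))
# Function locals are stored in indexed slots and use the FAST opcodes.
print(sorted(op for op in function_ops if "FAST" in op))
```

Inside the function, only `range` is still resolved through a global lookup. `total` and `i` are locals, which is why wrapping a hot loop in a function helps.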
With this said, thanks for reminding me of this, I added the explanation in my walkthrough 👍
Fixed in b32c06a moving day 16 part 2 code inside a function.