[2023 Day 12] I feel like I might be missing a trick regarding combinations
So I managed to get part 1 of the day, but it took 2 seconds to run on the real input, which is a bad sign.
I can't see any kind of optimisation that means I can skip checks and know how many combinations are in those skipped checks (aside from 0.) I can bail out of branches of combinations if the info so far won't fit, but that still leads me to visiting every valid combination which in one of the examples is 500k. (And probably way more in the input, since if I can't complete the example near instantly the input is not happening.)
Right now I take the string, then replace the first instance of a ? with the two possible options. I check that each result matches the check digits so far, then recurse on those two strings.
I can try to optimise the matching, but I don't think that solves the real problem of visiting every combination.
>!Iterate over the string. Whenever you hit a non-empty character (anything but a .), check whether the next N characters could all be # (N being the first element of the sequence) and that the character right after them isn't a #. If so, truncate the first N+1 characters, drop the first element of the sequence, and recurse. If the character you hit is a #, you know the first element has to start here at the latest, so you can break. With this method, memoization is enough to get part 2 down to 25 ms. To make the memoization more efficient, you can also truncate all the way up to the next non-empty character when recursing.!<
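For anyone who wants to see that in code, here is a minimal sketch of the approach as I read it, not necessarily the commenter's actual implementation. The name `count` and the choice to pass the row as a string and the sequence as a tuple (so both are hashable for `functools.cache`) are my own assumptions.

```python
from functools import cache


@cache
def count(row: str, groups: tuple[int, ...]) -> int:
    """Count arrangements by choosing where the first group starts."""
    if not groups:
        # No groups left: valid only if no forced '#' remains.
        return 0 if "#" in row else 1
    n, rest = groups[0], groups[1:]
    total = 0
    for i, ch in enumerate(row):
        if ch == ".":
            continue
        window = row[i:i + n]
        fits = len(window) == n and "." not in window
        # The character after the group must be able to act as a gap.
        gap_ok = row[i + n:i + n + 1] != "#"
        if fits and gap_ok:
            # Drop the group plus its gap, then skip leading dots so that
            # equivalent states share one cache entry.
            total += count(row[i + n + 1:].lstrip("."), rest)
        if ch == "#":
            # The first group cannot start any later than this '#'.
            break
    return total


print(count("?###????????", (3, 2, 1)))
```

Because the cache key is the remaining (row, groups) pair, each distinct suffix is only solved once, which is what keeps the larger part 2 inputs manageable.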
> I can bail out of branches of combinations if the info so far won’t fit, but that still leads me to visiting every valid combination which in one of the examples is 500k.
By "every valid combination" do you mean every substitution of '?' with a '#' or '.'? If yes, then you're wrong: you can bail out early of branches that don't fit, and cut a lot of them this way.
Consider the following example:
???????????? [1, 2, 2]
When you substitute the first two question marks with ##, the row already can't match the group sizes (the first group has length 1), so you can throw away all 1,024 (2^10) completions of the remaining ten question marks at once.
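A rough sketch of that early check, assuming the prefix passed in has already had its question marks substituted. The helper names `runs` and `prefix_can_fit` are made up for illustration.

```python
def runs(prefix: str) -> list[int]:
    """Lengths of the runs of '#' in a string that contains no '?'."""
    return [len(run) for run in prefix.split(".") if run]


def prefix_can_fit(prefix: str, groups: list[int]) -> bool:
    """Return False as soon as a fully substituted prefix contradicts groups."""
    found = runs(prefix)
    if not found:
        return True
    if len(found) > len(groups):
        return False
    *closed, last = found
    # Every run that has already ended must match its group exactly.
    if closed != groups[:len(closed)]:
        return False
    target = groups[len(closed)]
    if prefix.endswith("#"):
        # The last run may still grow, so it only must not exceed the target.
        return last <= target
    return last == target


row = "?" * 12
groups = [1, 2, 2]
print(prefix_can_fit("##", groups))   # False: first group would be at least 2
print(2 ** (row.count("?") - 2))      # 1024 completions skipped by that one check
```

The point is that one failed check on a two-character prefix removes a whole subtree, so the cost is nowhere near one visit per substitution.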
Also, while you're at it, avoid bare generic type annotations (e.g. list); try to always specify the type argument (e.g. list[str]) :)
> When you substitute the first two question marks with ##, the answer already doesn’t match the input string, so you can throw away 1,024 of the combinations that don’t fit.
You know, I figured I was already doing that, but printing your example shows I was not. I also added some logic to the other end, since 1,2,2 needs a space of 7 (1+2+2 plus two gaps), and if the check is all dots and I only have 6 chars left, I know it can't fit.
Still taking a long time for the real data, so it must be something inefficient in my code rather than the method.
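That "other end" check can be written down in a couple of lines. This is only a sketch; `min_length` and `can_still_fit` are hypothetical names, not anything from the code being discussed.

```python
def min_length(groups: list[int]) -> int:
    """Fewest characters that can hold all groups: sizes plus one gap between each."""
    return max(0, sum(groups) + len(groups) - 1)


def can_still_fit(remaining: str, groups: list[int]) -> bool:
    """Cheap length test to run before exploring what is left of the row."""
    return len(remaining) >= min_length(groups)


print(min_length([1, 2, 2]))
print(can_still_fit("??????", [1, 2, 2]))
```

It only looks at lengths, so it is a necessary condition rather than a full check, but it is cheap enough to run at every step.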
> Also, while you’re at it, avoid generic type annotations
Good point. Recently figured that one out, still not automatic as you can see.
I don't think there are many significant optimizations with regard to reducing the search tree. It took me long enough to figure it out, but the "solution" to part 2 (not saying there aren't other ways) is to not calculate anything more than once. Instead, put partial solutions in a dict indexed by the current state and use the cached value whenever you need it again.
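To make the dict idea concrete, here is one possible sketch. The state I picked, (position in the row, index of the next group), is an assumption on my part; other state definitions work too, and `count_arrangements` and `solve` are illustrative names.

```python
def count_arrangements(row: str, groups: tuple[int, ...]) -> int:
    # Partial results, indexed by (position in row, index of next group).
    cache: dict[tuple[int, int], int] = {}

    def solve(pos: int, g: int) -> int:
        if pos >= len(row):
            # Ran off the end: valid only if every group was placed.
            return 1 if g == len(groups) else 0
        key = (pos, g)
        if key in cache:
            return cache[key]
        total = 0
        if row[pos] in ".?":
            # Treat this character as '.' and move on.
            total += solve(pos + 1, g)
        if row[pos] in "#?" and g < len(groups):
            # Treat this character as the start of group g.
            end = pos + groups[g]
            fits = end <= len(row) and "." not in row[pos:end]
            if fits and row[end:end + 1] != "#":
                total += solve(end + 1, g + 1)
        cache[key] = total
        return total

    return solve(0, 0)


print(count_arrangements("?###????????", (3, 2, 1)))
```

There are at most len(row) times (len(groups) + 1) distinct states, so the work stays small even when the number of arrangements itself is huge.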
It seems like you are actually constructing every row with its ? characters replaced. This won't be viable for part 2, because your memory usage will explode. I have a recursive function that calls itself twice whenever a ? is encountered, once assuming it's a ., and once a #.
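A sketch of what such a function could look like, under my own assumptions about the state it carries: the index into the row `i`, the index of the current group `g`, and the length of the '#' run in progress `run`. It never builds a substituted string, it only returns counts.

```python
def count(row: str, groups: list[int], i: int = 0, g: int = 0, run: int = 0) -> int:
    """Count arrangements by walking the row one character at a time."""
    if i == len(row):
        if run == 0:
            return 1 if g == len(groups) else 0
        # A run is still open: it must be exactly the last group.
        return 1 if g == len(groups) - 1 and run == groups[g] else 0
    total = 0
    ch = row[i]
    if ch in ".?":
        # Branch 1: treat the character as '.'.
        if run == 0:
            total += count(row, groups, i + 1, g, 0)
        elif run == groups[g]:
            # The dot closes a run of exactly the right size.
            total += count(row, groups, i + 1, g + 1, 0)
    if ch in "#?":
        # Branch 2: treat the character as '#', if the current group has room.
        if g < len(groups) and run < groups[g]:
            total += count(row, groups, i + 1, g, run + 1)
    return total


print(count(".??..??...?##.", [1, 1, 3]))
```

As written this is uncached, so it is fine for part 1 but would still need memoizing on (i, g, run) to cope with part 2.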