adventofcode2024

package module
v0.17.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 7, 2025 License: GPL-3.0 Imports: 11 Imported by: 0

README

= Advent of code 2024
:toc:

image:https://godoc.org/gitlab.com/jhinrichsen/adventofcode2024?status.svg["godoc", link="https://godoc.org/gitlab.com/jhinrichsen/adventofcode2024"]
image:https://goreportcard.com/badge/gitlab.com/jhinrichsen/adventofcode2024["Go report card", link="https://goreportcard.com/report/gitlab.com/jhinrichsen/adventofcode2024"]
image:https://gitlab.com/jhinrichsen/adventofcode2024/badges/main/pipeline.svg[link="https://gitlab.com/jhinrichsen/adventofcode2024/-/commits/main",title="pipeline status"]
image:https://gitlab.com/jhinrichsen/adventofcode2024/badges/main/coverage.svg[link="https://gitlab.com/jhinrichsen/adventofcode2024/badges/main/coverage.svg",title="coverage report"]


My take on https://adventofcode.com/2024/ in Go. As usual, i don't particularly
care if i provide my solutions _fast_, i try to be _correct_ on the first
answer, and care for being runtime efficient.
All puzzles are backed by unit testing the examples and the puzzles.
Results are hard coded into the unit tests, so you might not want to peek at `_test.go` files.

== Environment

- Go 1.23
- vim, vim-go, gopls, fed by an HHKB
- VisualStudio Code for debugging
- Fedora 41 @ Framework 16" (AMD Ryzen 7 7840HS w/ Radeon 780M Graphics), 256 KiB L1d, 256 KiB L1i
- Fedora 41 @ custom built AMD Ryzen 5 3400G on a Gigabyte B450
- macOS @ 16" Macbook Pro 2019 (Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz)

== Project structure

Each puzzle for day `n` lives in a `day{{.n}}.go` and `day{{.n}}_test.go` file.
Unit test data, both examples and puzzle input, is in
`testdata/day{{.n}}.txt`, and `testdata/day{{.n}}_example.txt`.

== Code quality

Gitlab is starting to support the concept of code quality.
Historically, you'd be using third party tools such as _goreportcard_ or _SonarQube_.
Problem with this, you need to commit your change to the release branch, so third party tools can pick it up.
Using staticcheck to prepare a mostly
https://docs.gitlab.com/ee/ci/testing/code_quality.html#code-quality-report-format[Code Climate]
compliant format, you get code quality in merge requests (all editions):

image::img/code-quality.png[]

Unfortunately, the code quality dashboard, Gitlab calls it the
https://docs.gitlab.com/ee/ci/testing/code_quality.html#project-quality-view[Project quality view],
is only available in Gitlab Ultimate.

== Overview

Number of tries for a correct answer:

|===
| Day | Part 1 | Part 2
| 1   |   1    |   1
| 2   |   1    |   1
| 3   |   1    |   3
| 4   |   1    |   1
| 5   |   1    |   1
| 6   |   1    |   1
| 7   |   1    |   1
| 8   |   1    |   1
| 9   |   1    |   1
| 10  |   1    |   1
| 11  |   2<1> |   1
| 12  |   1    |   1
| 13  |   1    |   1
| 14  |   1    |   1
| 15  |   1    |   1
| 16  |   1    |   1
| 17  |   1    |   1
| 18  |        |
| 19  |        |
| 20  |        |
| 21  |        |
| 22  |        |
| 23  |        |
| 24  |        |
| 25  |        |
|===
<1> error in bit shifting logic determining number of digits in a number

Total runtime so far:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 5 3400G with Radeon Vega Graphics
BenchmarkDay01Part1-8   	   10000	    133619 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part2-8   	   10000	    122981 ns/op	   57374 B/op	       4 allocs/op
BenchmarkDay02Part1-8   	    4788	    275458 ns/op	  109453 B/op	    1007 allocs/op
BenchmarkDay02Part2-8   	    1605	    645229 ns/op	  309248 B/op	    4509 allocs/op
BenchmarkDay03Part1-8   	     934	   1359496 ns/op	  157584 B/op	    1543 allocs/op

BenchmarkDay03Part2-8   	     583	   2208954 ns/op	  165925 B/op	    1711 allocs/op
BenchmarkDay04Part1-8   	    2104	    568110 ns/op	   48544 B/op	     154 allocs/op
BenchmarkDay04Part2-8   	    3135	    417356 ns/op	   48544 B/op	     154 allocs/op
BenchmarkDay05Part1-8   	      55	  20832291 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay05Part2-8   	      21	  51428549 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay06Part1-8   	    1938	    701061 ns/op	  663580 B/op	       2 allocs/op

BenchmarkDay07Part1-8   	     445	   2474109 ns/op	       0 B/op	       0 allocs/op

BenchmarkDay08Part1-8   	    1798	    672954 ns/op	   90136 B/op	       2 allocs/op
BenchmarkDay08Part2-8   	    2086	    655172 ns/op	   90139 B/op	       2 allocs/op
----

That's 82 ms total.

----
go test -run=^$ -bench=Day -benchmem | tee benches/all.txt
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay01Part1-16    	   36889	     30237 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part2-16    	   31315	     36905 ns/op	   53328 B/op	       7 allocs/op
BenchmarkDay02Part1-16    	  188276	      6090 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay02Part2-16    	   12307	     95254 ns/op	  199737 B/op	    3502 allocs/op
BenchmarkDay03Part1-16    	    2896	    383072 ns/op	  157598 B/op	    1543 allocs/op
BenchmarkDay03Part2-16    	    1574	    717401 ns/op	  165976 B/op	    1711 allocs/op
BenchmarkDay04Part1-16    	   10624	    115248 ns/op	   23616 B/op	     141 allocs/op
BenchmarkDay04Part2-16    	   45325	     26349 ns/op	   23616 B/op	     141 allocs/op
BenchmarkDay05Part1-16    	      99	  11922306 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay05Part2-16    	      36	  29796143 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay06Part1-16    	    3771	    296899 ns/op	  895652 B/op	     196 allocs/op

BenchmarkDay07Part1-16    	     843	   1364444 ns/op	       0 B/op	       0 allocs/op

BenchmarkDay08Part1-16    	    4962	    244642 ns/op	  109216 B/op	       9 allocs/op
BenchmarkDay08Part2-16    	    4339	    273104 ns/op	  109216 B/op	       9 allocs/op
BenchmarkDay09Part1-16    	      60	  17348834 ns/op	  327680 B/op	       1 allocs/op

BenchmarkDay10Part1-16    	    1741	    606704 ns/op	 1478965 B/op	   13295 allocs/op
BenchmarkDay10Part2-16    	    1994	    536712 ns/op	 1380583 B/op	   13290 allocs/op
BenchmarkDay11Part1-16    	    3156	    335200 ns/op	  308456 B/op	     305 allocs/op
BenchmarkDay11Part2-16    	     111	  10603092 ns/op	 8801646 B/op	    1974 allocs/op
PASS
ok  	gitlab.com/jhinrichsen/adventofcode2024	27.748s
awk -f total.awk < benches/all.txt
74738636 ns, 74738 μs, 74 ms, 0 s
----

== Day 1: Historian Hysteria

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay01Part1-16    	   40551	     29019 ns/op	   16384 B/op	       2 allocs/op <1>
BenchmarkDay01Part1-16    	   39687	     28998 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part1-16    	   41421	     28890 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part1-16    	   40461	     28838 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part1-16    	   40435	     29237 ns/op	   16384 B/op	       2 allocs/op
BenchmarkDay01Part2-16    	   29532	     40068 ns/op	   57374 B/op	       4 allocs/op <2>
BenchmarkDay01Part2-16    	   30114	     40194 ns/op	   57374 B/op	       4 allocs/op
BenchmarkDay01Part2-16    	   29638	     39958 ns/op	   57374 B/op	       4 allocs/op
BenchmarkDay01Part2-16    	   29630	     40373 ns/op	   57374 B/op	       4 allocs/op
BenchmarkDay01Part2-16    	   29608	     40189 ns/op	   57374 B/op	       4 allocs/op
----
<1> left and right slice, each len(1000) X 64 bit integer, memory is O(2*N)
<2> the above plus a histogram for the right list, memory is (O(2*N + N))

By using smaller int types (uint -> uint32, uint16) `B/op` could be decreased, but 16 KiB fits into L1 cache.

== Day 2: Red-Nosed Reports

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay02Part1-16    	    9501	    123002 ns/op	  191264 B/op	    2552 allocs/op
BenchmarkDay02Part2-16    	    5326	    221674 ns/op	  396720 B/op	    6266 allocs/op
PASS
----

== Day 3: Mull it over

Usually i try to avoid regular expressions, as they quickly become hard to understand.
No way around `regexp` package in this puzzle.

People keep praising Python for its brevity in code golf.

[quote]
Me gopher staring at 5 lines of Python code
- Reddit user

They surely have a good point, although sometimes statically typed Go doing just fine.

=== Part 1

[source, Go]
----
func Day03(lines []string) (sum uint) {
        atoi := func(s string) (n uint) { <1>
                for i := range s {
                        n = 10*n + uint(s[i]-'0')
                }
                return <2>
        }
        re := regexp.MustCompile(`mul\((\d{1,3}),(\d{1,3})\)`)
        for i := range lines {
                gs := re.FindAllStringSubmatch(lines[i], -1)
                for j := range gs {
                        sum += atoi(gs[j][1]) * atoi(gs[j][2])
                }
        }
        return <2>
}
----
<1> custom atoi() instead of strconv.Atoi()
<2> yes, that is rather opinionated

I am using a fast custom parser because `strconv.Atoi()` is way more elaborate (error handling, negative values, `e` notation, ...) than our use case requires.
The regexp already makes sure of 1 to 3 digits, so no additional logic required.

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay03Part1-16    	    6162	    192923 ns/op	  125182 B/op	    1564 allocs/op
BenchmarkDay03Part1-16    	    6196	    194894 ns/op	  125120 B/op	    1564 allocs/op
BenchmarkDay03Part1-16    	    5199	    193495 ns/op	  125184 B/op	    1564 allocs/op
BenchmarkDay03Part1-16    	    5224	    193238 ns/op	  125096 B/op	    1564 allocs/op
BenchmarkDay03Part1-16    	    6243	    191711 ns/op	  125127 B/op	    1564 allocs/op
PASS
----

=== Part 2

Using a `do()|don't()` RE to build hot and cold regions, then apply part 1 in hot regions.

----
xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5)
++++++++++++++++++++                                       +++++++++++++ <1>
 ******** <2>
                            ******** <3>
                                                *********
                                                                ********
----
<1> `+` indicates enabled region
<2> match group in an enabled region
<3> match group in a disabled region

My first try was wrong, i was intentionally treating multiple lines separately.
Many regexp parsers have an option for single string/ multiple lines.
I was just misguided because most puzzles are line oriented, obviously this one is not.

My second guess was also wrong, i pass part 1 or part 2 as a parameter into the `Day03` function,
and stupidly passed the wrong value. More coffee?

After a coffee, i removed all region housekeeping, extended the RE by `|do()|don't`,
and just toggled `enabled` flag.

When treating the puzzle input as one program, not multiple lines, stats go up for part 1.

----
name           old time/op    new time/op    delta
Day03Part1-16     193µs ± 1%     411µs ± 3%  +112.47%  (p=0.008 n=5+5)

name           old alloc/op   new alloc/op   delta
Day03Part1-16     125kB ± 0%     158kB ± 0%   +25.92%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
Day03Part1-16     1.56k ± 0%     1.54k ± 0%    -1.34%  (p=0.008 n=5+5)
----

Benchmark for part 2:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay03Part2-16    	    1332	    851205 ns/op	  165921 B/op	    1710 allocs/op
BenchmarkDay03Part2-16    	    1305	    938589 ns/op	  165923 B/op	    1710 allocs/op
BenchmarkDay03Part2-16    	    1395	    825859 ns/op	  165929 B/op	    1710 allocs/op
BenchmarkDay03Part2-16    	    1414	    877336 ns/op	  165951 B/op	    1710 allocs/op
BenchmarkDay03Part2-16    	    1472	    813813 ns/op	  165901 B/op	    1710 allocs/op
PASS
----

== Day 4: Ceres Search

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay04Part1-16    	    5364	    224012 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5380	    223078 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5467	    222404 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    4488	    222880 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5368	    219925 ns/op	   48560 B/op	     154 allocs/op
PASS
----

=== Part 2

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay04Part1-16    	    5394	    219828 ns/op	   48561 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5421	    222877 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5377	    220939 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    4857	    220370 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part1-16    	    5455	    223552 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part2-16    	   12849	     93220 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part2-16    	   12699	     93705 ns/op	   48560 B/op	     154 allocs/op
BenchmarkDay04Part2-16    	   12814	     94808 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part2-16    	   12862	     94259 ns/op	   48559 B/op	     154 allocs/op
BenchmarkDay04Part2-16    	   12758	     93285 ns/op	   48559 B/op	     154 allocs/op
PASS
----

== Day 5: Print Queue

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay05Part1-16    	     378	   3107355 ns/op	    2688 B/op	       1 allocs/op
BenchmarkDay05Part1-16    	     386	   2970485 ns/op	    2688 B/op	       1 allocs/op
BenchmarkDay05Part1-16    	     393	   2939942 ns/op	    2688 B/op	       1 allocs/op
BenchmarkDay05Part1-16    	     403	   3054037 ns/op	    2688 B/op	       1 allocs/op
BenchmarkDay05Part1-16    	     398	   2887172 ns/op	    2688 B/op	       1 allocs/op
PASS
----

Removing GC...

----
name           old time/op    new time/op    delta
Day05Part1-16    2.99ms ± 4%    2.99ms ± 0%      ~     (p=0.690 n=5+5)

name           old alloc/op   new alloc/op   delta
Day05Part1-16    2.69kB ± 0%    0.00kB       -100.00%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
Day05Part1-16      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
----

=== Part 2

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay05Part2-16    	      39	  29821854 ns/op	       0 B/op	       0 allocs/op <1>
BenchmarkDay05Part2-16    	      34	  30269073 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay05Part2-16    	      38	  30446051 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay05Part2-16    	      36	  30861993 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay05Part2-16    	      32	  31323378 ns/op	       0 B/op	       0 allocs/op
PASS
----
<1> That's 30 ms for part 2, no heap allocation -> no garbage collection

== Day 6: Guard Gallivant

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay06Part1-16    	    4609	    243924 ns/op	  895651 B/op	     196 allocs/op
----

=== Part 2

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay06Part2-16    	      21	  55184398 ns/op	  969380 B/op	     197 allocs/op
----

That's 55ms for part 2.

=== Optimization Analysis

Original brute force approach (14.5 seconds):
- Tests all 17,161 grid positions as potential obstructions
- Uses `map[state]struct{}` for loop detection with expensive hash operations
- Allocates new map for each simulation (4.1M allocations total)

Optimized approach (55ms):

1. Path-only testing: Only test obstruction positions on the guard's original path (~4,900 positions vs 17,161 total). This optimization is valid because only positions the guard visits can potentially cause loops when blocked.

2. Array-based state tracking: Replace `map[state]struct{}` with `[]bool` using packed state indices:
   ```
   stateIdx = pos.Y*dimX*4 + pos.X*4 + direction
   ```
   Array lookups are ~10x faster than map hash operations.

3. State reuse: Clear and reuse the same visited array instead of allocating new maps for each simulation, reducing GC pressure.

**Benchstat comparison**:
----
              │ bench_before.txt │         bench_after.txt         │
              │      sec/op      │    sec/op     vs base           │
Day06Part2-16    14513.03m ± ∞ ¹   54.56m ± ∞ ¹  ~ (p=0.333 n=1+5)

              │  bench_before.txt  │         bench_after.txt         │
              │        B/op        │     B/op       vs base          │
Day06Part2-16   98804508.5Ki ± ∞ ¹   946.7Ki ± ∞ ¹  -100.00% (n=1+5)

              │ bench_before.txt │        bench_after.txt         │
              │    allocs/op     │  allocs/op   vs base           │
Day06Part2-16    4133206.0 ± ∞ ¹   197.0 ± ∞ ¹  ~ (p=1.667 n=1+5)
----

Performance improvement: ~266x faster runtime, ~104,000x less memory usage

== Day 7: Bridge Repair

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay07Part1-16              1312            916007 ns/op               0 B/op          0 allocs/op
----

=== Part 2

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay07Part2-16                66          18897873 ns/op               0 B/op          0 allocs/op
----

That's ~0.9ms for part 1 and ~19ms for part 2.

=== Optimization Analysis

Original approach (full enumeration):
- Tests all 2^n combinations for Part 1, 3^n for Part 2
- Systematic but inefficient for large equation sets

Optimized approach (stack-based traversal):

1. Stack-based exploration: Uses fixed-size stack to traverse only viable solution paths
2. Aggressive pruning: Immediately discards branches where `result > target`
3. Early termination: Stops as soon as first valid solution is found
4. Zero allocations: Uses stack array instead of heap allocations

Benchstat comparison:
----
              │ day07_benchmarks.txt │         day07_optimized.txt         │
              │        sec/op        │    sec/op     vs base               │
Day07Part1-16          4979.1µ ± ∞ ¹   899.0µ ± ∞ ¹  -81.95% (p=0.008 n=5)
Day07Part2-16          482.80m ± ∞ ¹   18.79m ± ∞ ¹  -96.11% (p=0.008 n=5)
geomean                 49.03m         4.110m        -91.62%
----

Performance improvement: ~82% faster Part 1, ~96% faster Part 2, ~92% overall improvement

== Day 8: Resonant Collinearity

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 5 3400G with Radeon Vega Graphics
BenchmarkDay08Part1-8   	    1564	    661012 ns/op	   90139 B/op	       2 allocs/op
BenchmarkDay08Part1-8   	    1797	    631212 ns/op	   90136 B/op	       2 allocs/op
BenchmarkDay08Part1-8   	    2593	    630481 ns/op	   90136 B/op	       2 allocs/op
BenchmarkDay08Part1-8   	    1716	    678508 ns/op	   90136 B/op	       2 allocs/op
BenchmarkDay08Part1-8   	    1820	    648858 ns/op	   90136 B/op	       2 allocs/op
goos: linux
----

=== Part 2

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 5 3400G with Radeon Vega Graphics
BenchmarkDay08Part2-8   	    2032	    647094 ns/op	   90139 B/op	       2 allocs/op
BenchmarkDay08Part2-8   	    2167	    616031 ns/op	   90140 B/op	       2 allocs/op
BenchmarkDay08Part2-8   	    2073	    652428 ns/op	   90139 B/op	       2 allocs/op
BenchmarkDay08Part2-8   	    1794	    638548 ns/op	   90139 B/op	       2 allocs/op
BenchmarkDay08Part2-8   	    1544	    649896 ns/op	   90140 B/op	       2 allocs/op
----

Being an electrical engineer, i misread part 1 to stop after one complete wave.
The unit test was my safety net.
I lazily used

[source, Go]
----
for range 1 {
    ....
}
----

to stop iterating.
Part 2 was exactly that, my part 1.

Part 1 decreased slightly

----
name          old time/op    new time/op    delta
Day08Part1-8     711µs ± 5%     650µs ± 4%  -8.55%  (p=0.016 n=5+5)

name          old alloc/op   new alloc/op   delta
Day08Part1-8    90.1kB ± 0%    90.1kB ± 0%    ~     (all equal)

name          old allocs/op  new allocs/op  delta
Day08Part1-8      2.00 ± 0%      2.00 ± 0%    ~     (all equal)
----

Performancewise, part 1 and 2 are very similar, suggesting that the major part is locating antennas on the same frequency, as opposed to spreading the wave.

== Day 9: Disk Fragmenter

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 5 3400G with Radeon Vega Graphics
BenchmarkDay09Part1-8   	      42	  27858720 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay09Part1-8   	      42	  27579218 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay09Part1-8   	      40	  28169173 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay09Part1-8   	      40	  28437979 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay09Part1-8   	      40	  28237083 ns/op	       0 B/op	       0 allocs/op
----

That's 27 ms.

== Day10: Hoof It

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay10Part1-16    	     996	   1124976 ns/op	 1459165 B/op	   13245 allocs/op
BenchmarkDay10Part1-16    	     981	   1129558 ns/op	 1459171 B/op	   13245 allocs/op
BenchmarkDay10Part1-16    	    1702	    656466 ns/op	 1459166 B/op	   13245 allocs/op
BenchmarkDay10Part1-16    	    1764	    642919 ns/op	 1459166 B/op	   13245 allocs/op
BenchmarkDay10Part1-16    	    1790	    638223 ns/op	 1459165 B/op	   13245 allocs/op
PASS
----

I misread part 1 and counted _all_ trails, not just _distinct_ start and end trails.
For part 2, this turns out to be just what we need, lucky me...

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay10Part2-16    	    1028	   1007849 ns/op	 1377221 B/op	   13243 allocs/op
BenchmarkDay10Part2-16    	    1116	   1018253 ns/op	 1377216 B/op	   13243 allocs/op
BenchmarkDay10Part2-16    	    1926	    587530 ns/op	 1377217 B/op	   13243 allocs/op
BenchmarkDay10Part2-16    	    1875	    569882 ns/op	 1377217 B/op	   13243 allocs/op
BenchmarkDay10Part2-16    	    1897	    581965 ns/op	 1377217 B/op	   13243 allocs/op
PASS
----

== Day 11: Plutonian Pebbles

The stone transformation rules require frequent digit counting and splitting of even-digit numbers.
A naive approach would use `strconv.Itoa()` + `len()` or `math.Log10()`, but both are expensive:

- `math.Log10()`: ~100-200 CPU cycles (floating-point operations)
- String conversion: Memory allocation + GC pressure

Instead, we use a branchless digit counting algorithm that operates purely on integers:

[source, Go]
----
func digits_branchless(n uint64) int {
	d := 1
	d += int((n - 10) >> 63) ^ 1           // >= 10?
	d += int((n - 100) >> 63) ^ 1          // >= 100?
	d += int((n - 1000) >> 63) ^ 1         // >= 1000?
	// ... up to 10^11 <1>
	return d
}
----
<1> This is what GPT-5 came up with, but 10^11^ is wrong, it must be 10^18^

Each comparison uses bit manipulation to avoid branches:

- `(n - threshold) >> 63` extracts the sign bit (1 if n < threshold, 0 if n >= threshold)
- `^ 1` flips it (0 if n < threshold, 1 if n >= threshold)
- Sum all results to get digit count

Performance characteristics:

- Theoretical: ~45 CPU cycles (11 comparisons × 4 ops each + setup)
- Actual (modern AMD64): ~8-12 cycles due to superscalar execution and instruction-level parallelism
- Advantage: 10-20x faster than `math.Log10()`, completely branchless (no branch prediction misses)

Implementation note: An earlier attempt used a binary-tree style approach with power-of-10 thresholds (10^10, 10^5, 10^2, 10^1) and complex arithmetic to combine results, which was theoretically more elegant but contained subtle bugs in the threshold calculations and number reduction logic. The simpler linear approach above proved more reliable while maintaining equivalent performance.

The algorithm also handles the iteration challenge: when splitting stones during iteration, we iterate backwards through the slice to avoid index shifting issues when inserting new elements.

=== Optimization: Frequency Counting vs Slice Operations

Initial Implementation (Slice-based):
----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay11Part1-16    	       2	2026317200 ns/op	 7208584 B/op	      31 allocs/op
PASS
----

~2.0 seconds for part 1 (25 blinks). The exponential stone growth dominated performance through `runtime.memmove` (99.33% CPU time in slice insertions).

Optimized Implementation (Frequency Counting):
----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay11Part1-16    	    3186	    342996 ns/op	  308458 B/op	     305 allocs/op
BenchmarkDay11Part2-16    	      96	  10807918 ns/op	 8823100 B/op	    1976 allocs/op
PASS
----

Performance Improvement:

- Part 1: 345µs (~5,900x faster, from 2.0s to 0.3ms)
- Part 2: 10.9ms (75 blinks, 207+ trillion stones)

Key Optimizations:

1. Frequency Counting: Track stone counts instead of individual stones
2. Iterative Processing: Pure iteration, zero recursion
3. Mathematical Insight: Identical stones transform identically
4. Memory Efficiency: Constant memory usage regardless of stone count

The optimized approach scales with unique stone types (~100-1000) rather than total stone count (trillions), achieving subsecond performance for both parts.

== Day 13: Claw Contraption

Day 13 involves solving systems of linear equations using Cramer's rule. The puzzle presents claw machines where button presses move a claw by fixed amounts, and we need to find integer solutions to reach prize coordinates.

=== Cramer's Rule Extraction

Initially implemented inline within `solveMachine()`, the Cramer's rule logic was extracted into a reusable `matrix.go` module for better code organization and reusability.

The API design makes Cramer's rule prerequisites explicit through the function signature:

[source, Go]
----
func Cramer(eq1, eq2 Eq) (int, int, bool)
func Bareiss(eqs []Eq) ([]int, bool)
----

The signature `Cramer(eq1, eq2 Eq)` makes it immediately clear that:

- Exactly 2 equations are required (no more, no less)
- 2×2 systems only - the compiler enforces this constraint
- No additional comments needed to explain the mathematical requirements

This contrasts with `Bareiss(eqs []Eq)` which accepts any number of equations via slice. The API itself documents the algorithmic constraints without relying on comments or return code inspection.

**Original inline implementation:**
----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay13Part1-16            109476             10216 ns/op               1 B/op          0 allocs/op
BenchmarkDay13Part2-16            105620             11266 ns/op               2 B/op          0 allocs/op
----

**Generalized Cramer implementation:**
----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay13Part1-16             87717             13299 ns/op               2 B/op          0 allocs/op
BenchmarkDay13Part2-16             85485             13943 ns/op               2 B/op          0 allocs/op
----

=== Runtime Impact Analysis

Performance Impact:

- Part 1: +30.2% slower (3,083 ns overhead)
- Part 2: +23.8% slower (2,677 ns overhead)
- Average: +27% slower

The ~27% performance overhead comes from:

1. Function call overhead: The inline code had zero function calls, while our generalized version calls `Cramer()`
2. Struct creation: Creating two `Eq` structs adds minor overhead
3. Parameter passing: Passing 6 integers vs direct variable access

Trade-offs:

Cost:
- ~3μs overhead per machine (negligible for human perception)
- 27% relative performance decrease

Benefits:
- Reusable: Cramer's rule available for other 2×2 systems
- Testable: Mathematical logic isolated and unit tested
- Maintainable: Clear separation of concerns
- Type-safe: Compiler enforces 2×2 constraint
- Extensible: Ready for Bareiss when n×n systems needed

The 27% overhead is acceptable because:

- Absolute impact is tiny (~3μs per machine)
- Day13 still runs in ~14μs total (very fast)
- Code quality improvements outweigh the minor performance cost
- The overhead would be amortized across multiple uses of the matrix library

This is a classic performance vs maintainability trade-off where the benefits of clean, reusable code justify the small performance cost.

=== Matrix Library Performance

Comparison of Cramer's rule vs Bareiss algorithm for 2×2 systems:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkCramerVsBareiss/Cramer-16              44439138                26.80 ns/op
BenchmarkCramerVsBareiss/Bareiss-16              2078890               577.5 ns/op
----

Cramer's rule is ~21x faster than Bareiss for 2×2 systems, making it the optimal choice for this specific problem size.

== Day 14: Restroom Redoubt

=== Part 1

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay14Part1-16    	    3807	    281629 ns/op	  124567 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3806	    270840 ns/op	  124563 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    4398	    272542 ns/op	  124563 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    4167	    274931 ns/op	  124563 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3675	    273548 ns/op	  124563 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3879	    272872 ns/op	  124564 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3878	    270117 ns/op	  124564 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3709	    276097 ns/op	  124564 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3718	    273230 ns/op	  124564 B/op	    1146 allocs/op
BenchmarkDay14Part1-16    	    3981	    272449 ns/op	  124564 B/op	    1146 allocs/op
----

Lots of allocs, lets check where they come from:

----
      flat  flat%   sum%        cum   cum%
   79.01MB 35.01% 35.01%   146.51MB 64.93%  regexp.(*Regexp).FindStringSubmatch
   67.51MB 29.92% 64.93%    67.51MB 29.92%  regexp.(*Regexp).doOnePass
   54.58MB 24.19% 89.11%   221.11MB 97.99%  gitlab.com/jhinrichsen/adventofcode2024.NewDay14
    7.02MB  3.11% 92.22%     7.02MB  3.11%  regexp.onePassCopy
       5MB  2.22% 94.44%        5MB  2.22%  regexp/syntax.(*compiler).inst (inline)
       4MB  1.77% 96.21%        4MB  1.77%  regexp/syntax.(*parser).newRegexp (inline)
    1.50MB  0.67% 96.88%     1.50MB  0.67%  runtime.allocm
    1.50MB  0.67% 97.55%     2.50MB  1.11%  regexp.makeOnePass
    1.16MB  0.51% 98.06%     1.16MB  0.51%  runtime/pprof.StartCPUProfile
    0.50MB  0.22% 98.28%     5.50MB  2.44%  regexp/syntax.parse
----

Primarily in the parser, not the solver.
Custom parser to the rescue:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay14Part1-16    	    3303	    329691 ns/op	   16385 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7905	    152471 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7828	    151359 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7762	    153447 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7844	    152640 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    6765	    153720 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7816	    152940 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    6735	    153580 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    7830	    153114 ns/op	   16384 B/op	       1 allocs/op
BenchmarkDay14Part1-16    	    6556	    153674 ns/op	   16384 B/op	       1 allocs/op
----

- R4 struct is 32 bytes (4 × 8-byte ints)
- Slice capacity is exactly 500 robots for part 1
- 500 × 32 = 16,000 bytes ≈ 16KB

And runtime drops 50%.

The framework's AMD Ryzen 7 7840HS has

- L1d cache: 256 KiB (8 instances) = 32 KiB per core
- L1i cache: 256 KiB (8 instances) = 32 KiB per core

But let's compress further, because our solution space is within 8 bits.
Switching from `int` to `int8` shows expected memory reduction, plus a negligible runtime progression:

goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
              │  bench16KB  │              bench2KB              │
              │   sec/op    │   sec/op     vs base               │
Day14Part1-16   153.3µ ± 1%   151.4µ ± 1%  -1.25% (p=0.007 n=10)

              │   bench16KB   │               bench2KB               │
              │     B/op      │     B/op      vs base                │
Day14Part1-16   16.000Ki ± 0%   2.000Ki ± 0%  -87.50% (p=0.000 n=10)

              │ bench16KB  │            bench2KB            │
              │ allocs/op  │ allocs/op   vs base            │
Day14Part1-16   1.000 ± 0%   1.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

Although the benchmark looks good, we still must use some `int` cases for additions before the modulo operation.
Both position and velocity use the full range of two digits (-99..99), so the `int8` range is not large enough.
`int16` is safe for all possible combination, and this is the sweet spot for our puzzle:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
              │  bench16KB   │              bench2KB               │              bench4KB               │
              │    sec/op    │    sec/op     vs base               │   sec/op     vs base                │
Day14Part1-16   153.28µ ± 1%   151.37µ ± 1%  -1.25% (p=0.007 n=10)   93.47µ ± 2%  -39.02% (p=0.001 n=10)

              │   bench16KB   │               bench2KB               │               bench4KB               │
              │     B/op      │     B/op      vs base                │     B/op      vs base                │
Day14Part1-16   16.000Ki ± 0%   2.000Ki ± 0%  -87.50% (p=0.000 n=10)   4.000Ki ± 0%  -75.00% (p=0.000 n=10)

              │ bench16KB  │            bench2KB            │            bench4KB            │
              │ allocs/op  │ allocs/op   vs base            │ allocs/op   vs base            │
Day14Part1-16   1.000 ± 0%   1.000 ± 0%  ~ (p=1.000 n=10) ¹   1.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
----


Compared to our baseline, **huge** performance gains, although the Go `regexp` package is top notch.

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
              │    bench     │              bench4KB               │
              │    sec/op    │   sec/op     vs base                │
Day14Part1-16   273.05µ ± 1%   93.47µ ± 2%  -65.77% (p=0.000 n=10)

              │     bench      │               bench4KB               │
              │      B/op      │     B/op      vs base                │
Day14Part1-16   121.645Ki ± 0%   4.000Ki ± 0%  -96.71% (p=0.000 n=10)

              │     bench     │              bench4KB              │
              │   allocs/op   │ allocs/op   vs base                │
Day14Part1-16   1146.000 ± 0%   1.000 ± 0%  -99.91% (p=0.000 n=10)
----

So, stepping back from our micro optimizations, let's try to use a different algorithm.

Given that

	p(t) = p~0~ + v×t
	((a % n) + (b % n)) % n = (a+b) % n

We can directly compute `p(100)` eliminating the seconds loop, reducing from O(t) to O(1).
Then, compress the sector detection from `0..N` into `0..1` by dividing position by half size of field,
inline the parser, remove the `Day14Puzzle` transporter Parser -> Solver:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
BenchmarkDay14Part1-16    	  114633	      9353 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  240643	      4791 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  238321	      4784 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  237918	      4799 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  232068	      4814 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  239653	      4794 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  240558	      4777 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  239126	      4803 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  235606	      4781 ns/op	       0 B/op	       0 allocs/op
BenchmarkDay14Part1-16    	  218532	      4763 ns/op	       0 B/op	       0 allocs/op
----

We're now down to using Go as a Macro Assembler, single digit microsecond runtime, no garbage allocation.

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
              │     bench     │               bench1                │
              │    sec/op     │   sec/op     vs base                │
Day14Part1-16   273.051µ ± 1%   4.840µ ± 2%  -98.23% (p=0.000 n=10)

              │    bench     │               bench1                │
              │     B/op     │    B/op     vs base                 │
Day14Part1-16   121.6Ki ± 0%   0.0Ki ± 0%  -100.00% (p=0.000 n=10)

              │    bench    │                bench1                │
              │  allocs/op  │  allocs/op   vs base                 │
Day14Part1-16   1.146k ± 0%   0.000k ± 0%  -100.00% (p=0.000 n=10)
----

=== Part 2

----
Robot positions at t=6668:
Densest area around (62,58) with 221 robots in 20x20 window
#.............................
#.............................
#.............................
#.............................
#..............#..............
#.............###.............
#............#####............
#...........#######...........
#..........#########..........
#............#####............
#...........#######...........
#..........#########..........
#.........###########.........
#........#############........
#..........#########..........
#.........###########.........
#........#############........
#.......###############.......
#......#################......
#........#############........
#.......###############.......
#......#################......
#.....###################.....
#....#####################....
#.............###.............
#.............###.............
#.............###.............
#.............................
#.............................
#.............................
----

== Day 16: Reindeer Maze

Day 16 presents a maze navigation problem with **state-dependent movement costs**:

- Forward movement: 1 point per tile
- 90-degree rotation: 1000 points per turn
- Start facing East, find lowest score path to end

=== Algorithm Choice: Dijkstra's

**Recommended: Dijkstra's Algorithm**

Characteristics:
- **Non-negative edge weights**: All costs (movement=1, rotation=1000) are positive
- **State-dependent graph**: Position + direction create the state space
- **Single-source shortest path**: Find minimum cost from start to end
- **Optimal substructure**: Shortest path property holds

Why not alternatives:

- **BFS**: Inadequate - uniform edge weights assumption violated (1 vs 1000 costs)
- **A***: Overkill - requires heuristic design, Dijkstra sufficient for unweighted optimization
- **DFS**: Suboptimal - no guarantee of shortest path

Implementation considerations:
- State representation: `(x, y, direction)` tuples
- Priority queue: Min-heap ordered by cumulative cost
- Visited tracking: Mark `(position, direction)` pairs to avoid cycles
- Cost function: `cost = steps + (turns × 1000)`

Expected complexity: O((V + E) log V) where V = positions × 4 directions

=== Performance Results

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
BenchmarkDay16Part1-28    	      10	 100309101 ns/op	29003684 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      12	  97076521 ns/op	29003665 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      13	 100633463 ns/op	29003675 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      12	 101567947 ns/op	29003656 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      13	 104575342 ns/op	29003712 B/op	  488818 allocs/op
BenchmarkDay16Part1-28    	      10	 112985890 ns/op	29003686 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      12	 111309220 ns/op	29003696 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      12	 106102885 ns/op	29003664 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      12	 100597466 ns/op	29003632 B/op	  488817 allocs/op
BenchmarkDay16Part1-28    	      10	 114607734 ns/op	29003732 B/op	  488818 allocs/op
BenchmarkDay16Part2-28    	       9	 113297370 ns/op	29214856 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	      12	 112791667 ns/op	29214872 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	      10	 107404384 ns/op	29214888 B/op	  490497 allocs/op
BenchmarkDay16Part2-28    	      13	 104006267 ns/op	29214858 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	      13	 108982748 ns/op	29214880 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	       9	 115011110 ns/op	29214898 B/op	  490497 allocs/op
BenchmarkDay16Part2-28    	      12	 115122870 ns/op	29214880 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	      13	 118729444 ns/op	29214843 B/op	  490496 allocs/op
BenchmarkDay16Part2-28    	      10	 114593633 ns/op	29214897 B/op	  490497 allocs/op
BenchmarkDay16Part2-28    	      10	 115151624 ns/op	29214830 B/op	  490496 allocs/op
PASS
----

== Day17: Chronospatial Computer

Again one of these wonderful CPU emulators, by far too lovely to let an AI agent work on it.

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
BenchmarkDay17Part1-28    	 5075667	       201.3 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6431104	       193.1 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6278082	       191.8 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 5981952	       190.7 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6154922	       193.8 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 5958874	       191.9 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6388510	       191.2 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6315153	       193.8 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 6342436	       189.5 ns/op	      56 B/op	       3 allocs/op
BenchmarkDay17Part1-28    	 5808722	       189.6 ns/op	      56 B/op	       3 allocs/op
----

That's faster than 190 ns, or inverse, 5.3 MHz.

=== Optimization step #1

Memory allocations are from `strings.Builder` that handles output, replacing with a simple `[]byte`:

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
              │     b0      │                 b1                  │
              │   sec/op    │   sec/op     vs base                │
Day17Part1-28   190.1n ± 1%   166.1n ± 1%  -12.67% (p=0.000 n=10)

              │     b0     │                 b1                 │
              │    B/op    │    B/op     vs base                │
Day17Part1-28   56.00 ± 0%   24.00 ± 0%  -57.14% (p=0.000 n=10)

              │     b0     │                 b1                 │
              │ allocs/op  │ allocs/op   vs base                │
Day17Part1-28   3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
----

166.1 ns is equivalent to 6.02 MHz.

=== Optimization step #2

The operand to combo transformation right now looks like this:

----
switch operand {
case 0:
        combo = 0
case 1:
        combo = 1
case 2:
        combo = 2
case 3:
        combo = 3
case 4:
        combo = a
case 5:
        combo = b
case 6:
        combo = c
case 7:
        // NOP
default:
        panic(fmt.Sprintf("illegal operand %d", operand))
}
----

We can treat these as registers directly, with register 0-3 holding fixed values as in https://en.wikipedia.org/wiki/RISC-V#Register_sets[RISC-V].

----
const REGISTER_A = 4 <1>

registers := [8]uint{0, 1, 2, 3, 0, 0, 0, 0} <2>

parse(lines[1], REGISTER_B) <3>

combo := registers[operand] <4>
----
<1> index into registers
<2> mix of fixed values [0..3] and registers [4..]
<3> parse input program directly into register
<4> single indirection effectively removes a large portion of branch mispredictions, which will always clear the CPU pipeline

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
              │     b0      │                 b1                  │                 b2                  │
              │   sec/op    │   sec/op     vs base                │   sec/op     vs base                │
Day17Part1-28   190.1n ± 1%   166.1n ± 1%  -12.67% (p=0.000 n=10)   112.4n ± 4%  -40.89% (p=0.000 n=10)

              │     b0     │                 b1                 │                 b2                 │
              │    B/op    │    B/op     vs base                │    B/op     vs base                │
Day17Part1-28   56.00 ± 0%   24.00 ± 0%  -57.14% (p=0.000 n=10)   24.00 ± 0%  -57.14% (p=0.000 n=10)

              │     b0     │                 b1                 │                 b2                 │
              │ allocs/op  │ allocs/op   vs base                │ allocs/op   vs base                │
Day17Part1-28   3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)   1.000 ± 0%  -66.67% (p=0.000 n=10)
----

Runtime is now down to 112.4 ns equivalent to 8.9 MHz.

=== Optimization step #3

One final micro-optimization for the `LEAQ` operation in the loop termination expression

----
	for pc := 0; pc < len(cmd)-2; {
  0x20e5		4c8d49f5		LEAQ -0xb(CX), R9 <1>
  0x20e9		4c39c8			CMPQ AX, R9
  0x20ec		0f8d0e020000		JGE 0x2300
----
<1> load and compare for each loop iteration

----
	l := len(cmd) - 2
	for pc := 0; pc < l; {
----

Now we keep the loop terminator in `R9`.

----
	l := len(cmd) - 2
  0x211b		4c8d49f5		LEAQ -0xb(CX), R9 <1>
  0x211f		31c0			XORL AX, AX
  0x2121		31c9			XORL CX, CX
  0x2123		6690			NOPW
	for pc := 0; pc < l; {
  0x2125		eb06			JMP 0x212d
  0x2127		4c89d8			MOVQ R11, AX
			if idx > 0 {
  0x212a		4889d9			MOVQ BX, CX
	for pc := 0; pc < l; {
  0x212d		4c39c8			CMPQ AX, R9 <2>
----
<1> load once
<2> compare against R9

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
              │     b2      │                 b3                 │
              │   sec/op    │   sec/op     vs base               │
Day17Part1-28   112.4n ± 4%   108.9n ± 2%  -3.11% (p=0.045 n=10) <1>

              │     b2     │               b3               │
              │    B/op    │    B/op     vs base            │
Day17Part1-28   24.00 ± 0%   24.00 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

              │     b2     │               b3               │
              │ allocs/op  │ allocs/op   vs base            │
Day17Part1-28   1.000 ± 0%   1.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
----
<1> another 3% gain

All in all we're down to 108.9 ns ≙ 9.2 MHz.

=== Part 2

By looking at the number of digits, our cmd has length 31 (including separator), and the program function is monotonically increasing:

----
1 31 1
10 31 3
100 31 5
1000 31 7
10000 31 9
100000 31 11
1000000 31 13
10000000 31 15
100000000 31 17
1000000000 31 19
10000000000 31 23
100000000000 31 25
1000000000000 31 27
10000000000000 31 29
100000000000000 31 31
1000000000000000 31 33
10000000000000000 31 35
100000000000000000 31 37
1000000000000000000 31 39
10000000000000000000 31 43
----

So we should start digging deeper starting from 100000000000000, that's 1e14.

in addition, the output changes its front continuously, while the back stays more stable.
Exactly the opposite of our counting system, where the front is stable, and the back changes quickly.

----
goos: linux
goarch: amd64
pkg: gitlab.com/jhinrichsen/adventofcode2024
cpu: Intel(R) Core(TM) i7-14700K
BenchmarkDay17Part2-28    	     928	   1229557 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     938	   1266436 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     992	   1231424 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     979	   1220102 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     880	   1229319 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     980	   1231827 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     960	   1214826 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     984	   1218044 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     978	   1217073 ns/op	  270112 B/op	    7569 allocs/op
BenchmarkDay17Part2-28    	     985	   1211364 ns/op	  270112 B/op	    7569 allocs/op
----

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrNoStartFound = errors.New("no start position 'S' found in maze")
	ErrNoEndFound   = errors.New("no end position 'E' found in maze")
)

Day16 specific errors

View Source
var ErrNoSolutionFound = &NoSolutionError{}

Sentinel error for errors.Is() compatibility

Functions

func Bareiss added in v0.17.0

func Bareiss(eqs []Eq) ([]int, bool)

Bareiss solves an n×n system using Bareiss algorithm (fraction-free Gaussian elimination) Returns solution vector and whether solution exists

func Cramer added in v0.17.0

func Cramer(eq1, eq2 Eq) (int, int, bool)

Cramer solves a 2x2 system using Cramer's rule eq1: A*x + B*y = X eq2: A*x + B*y = X Returns x, y, and whether integer solution exists

func Day01

func Day01(buf []byte, part1 bool) uint

func Day02 added in v0.2.0

func Day02(puzzle Day02Puzzle, part1 bool) (n uint)

func Day03 added in v0.17.0

func Day03(program string, part1 bool) (sum uint)

func Day04 added in v0.17.0

func Day04(puzzle Day04Puzzle, part1 bool) uint

Day04 solves Day 4 using the puzzle struct (both parts)

func Day05 added in v0.17.0

func Day05(lines []string, part1 bool) (sum uint)

func Day06 added in v0.17.0

func Day06(puzzle Day06Puzzle, part1 bool) uint

Day06 solves Day 6 using the puzzle struct

func Day07 added in v0.17.0

func Day07(puzzle Day07Puzzle, part1 bool) (sum uint)

func Day08 added in v0.17.0

func Day08(lines []string, part1 bool) (total uint)

func Day09 added in v0.17.0

func Day09(puzzle Day09Puzzle, part1 bool) (checksum uint)

func Day10 added in v0.17.0

func Day10(puzzle Day10Puzzle, part1 bool) uint

Day10 solves Day 10 using the puzzle struct

func Day11 added in v0.17.0

func Day11(stones []uint64, part1 bool) uint64

func Day12 added in v0.17.0

func Day12(puzzle Day12Puzzle, part1 bool) uint

func Day13 added in v0.17.0

func Day13(puzzle Day13Puzzle, part1 bool) int

func Day14 added in v0.17.0

func Day14(buf []byte, dimX, dimY int, seconds uint, part1 bool) uint

func Day15 added in v0.17.0

func Day15(puzzle []byte, part1 bool) (uint, error)

func Day16 added in v0.17.0

func Day16(grid []byte, part1 bool) (uint, error)

func Day16Part1 added in v0.17.0

func Day16Part1(grid []byte) (uint, error)

func Day16Part2 added in v0.17.0

func Day16Part2(grid []byte) (uint, error)

func Day17 added in v0.17.0

func Day17(lines []string, part1 bool) (uint, uint, uint, string)

func Diff added in v0.2.0

func Diff[T uint](s1, s2 []T) uint

Diff returns number of differences between two slices. Unfortunately, the corresponding proposal has been rejected. https://github.com/golang/go/issues/58730, proposal: slices: add difference function

func DumpDay08 added in v0.17.0

func DumpDay08(lines []string, ps map[image.Point]bool)

func NewDay11 added in v0.17.0

func NewDay11(input string) []uint64

Types

type ClawMachine added in v0.17.0

type ClawMachine struct {
	ButtonA Point
	ButtonB Point
	Prize   Point
}

type Day02Puzzle added in v0.17.0

type Day02Puzzle struct {
	// contains filtered or unexported fields
}

Day02Puzzle represents the parsed Day 2 puzzle data

func NewDay02 added in v0.17.0

func NewDay02(filename string) Day02Puzzle

NewDay02 creates a Day02Puzzle from lines

type Day04Puzzle added in v0.17.0

type Day04Puzzle struct {
	// contains filtered or unexported fields
}

Day04Puzzle represents the parsed Day 4 puzzle data

func NewDay04 added in v0.17.0

func NewDay04(lines []string) Day04Puzzle

NewDay04 creates a Day04Puzzle from lines

type Day06Puzzle added in v0.17.0

type Day06Puzzle struct {
	// contains filtered or unexported fields
}

Day06Puzzle represents the parsed Day 6 puzzle data

func NewDay06 added in v0.17.0

func NewDay06(lines []string) Day06Puzzle

NewDay06 creates a Day06Puzzle from lines

type Day07Equation added in v0.17.0

type Day07Equation struct {
	// contains filtered or unexported fields
}

type Day07Puzzle added in v0.17.0

type Day07Puzzle []Day07Equation

func NewDay07 added in v0.17.0

func NewDay07(lines []string) Day07Puzzle

type Day09Puzzle added in v0.17.0

type Day09Puzzle struct {
	// contains filtered or unexported fields
}

func NewDay09 added in v0.17.0

func NewDay09(buf []byte) Day09Puzzle

type Day10Puzzle added in v0.17.0

type Day10Puzzle struct {
	// contains filtered or unexported fields
}

Day10Puzzle represents the parsed Day 10 puzzle data

func NewDay10 added in v0.17.0

func NewDay10(lines []string) Day10Puzzle

NewDay10 creates a Day10Puzzle from lines

type Day12Puzzle added in v0.17.0

type Day12Puzzle struct {
	// contains filtered or unexported fields
}

func NewDay12 added in v0.17.0

func NewDay12(lines []string) Day12Puzzle

type Day13Puzzle added in v0.17.0

type Day13Puzzle struct {
	Machines []ClawMachine
}

func NewDay13 added in v0.17.0

func NewDay13(lines []string) Day13Puzzle

type Eq added in v0.17.0

type Eq struct {
	A, B, X int
}

Eq represents a linear equation: A*x + B*y = X

type NoSolutionError added in v0.17.0

type NoSolutionError struct {
	Start image.Point
	End   image.Point
}

NoSolutionError provides detailed information when no path exists

func (*NoSolutionError) Error added in v0.17.0

func (e *NoSolutionError) Error() string

func (*NoSolutionError) Is added in v0.17.0

func (e *NoSolutionError) Is(target error) bool

type Point added in v0.17.0

type Point struct {
	X, Y int
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL