Day 3
Anddd we’re off! This time, we’re travelling down the slope at a certain angle, but need to account for the trees in our path. We’re given a map showing open squares (.
) and trees (#
), as such:
..##.......
#...#...#..
.#....#..#.
..#.#...#.#
.#...##..#.
..#.##.....
.#.#.#....#
.#........#
#.##...#...
#...##....#
.#..#...#.#
Of course, it’s a magical island, with infinite width - having the same pattern repeating multiple times.
..##.........##.........##.........##.........##.........##....... --->
#...#...#..#...#...#..#...#...#..#...#...#..#...#...#..#...#...#..
.#....#..#..#....#..#..#....#..#..#....#..#..#....#..#..#....#..#.
..#.#...#.#..#.#...#.#..#.#...#.#..#.#...#.#..#.#...#.#..#.#...#.#
.#...##..#..#...##..#..#...##..#..#...##..#..#...##..#..#...##..#.
..#.##.......#.##.......#.##.......#.##.......#.##.......#.##..... --->
.#.#.#....#.#.#.#....#.#.#.#....#.#.#.#....#.#.#.#....#.#.#.#....#
.#........#.#........#.#........#.#........#.#........#.#........#
#.##...#...#.##...#...#.##...#...#.##...#...#.##...#...#.##...#...
#...##....##...##....##...##....##...##....##...##....##...##....#
.#..#...#.#.#..#...#.#.#..#...#.#.#..#...#.#.#..#...#.#.#..#...#.# --->
Our task is to slide down the slope and count the number of trees we encounter along the way! But first, we import the data from our file:
# import data
with open("input.txt", 'r') as file:
arr = file.read().splitlines()
arr[0:3] # print first 3 lines
['...#.....#.......##......#.....',
'...#..................#........',
'....##....#.......#............']
Part 1
Righty, seems easy enough! We just need to keep track of our position and move 3 times to the right each line - if we hit the end of the file, we can “wrap around” this magic island by using a modulus (%
) to make the remainder our “wrapped around” position.
position = 0
count = 0
for line in arr:
if line[position] == "#":
count += 1
# wrap around if we hit the end of the line
position = (position + 3)%len(line)
print(count)
220
We simply count the number of trees we encounter and get our answer!
Ans: 220
Part 2
Time to check the rest of the slopes - you need to minimize the probability of a sudden arboreal stop, after all.
- Right 1, down 1.
- Right 3, down 1. (This is the slope that we just checked.)
- Right 5, down 1.
- Right 7, down 1.
- Right 1, down 2.
Alrighty! Most of the slopes are trivial (i.e. adding 1/5/7 to our position instead of 3), but the last slope is a little more interesting. With a slope of 1 right and 2 down, we need to skip alternating lines. That’s not bad though - we can just add a line of code to select the n
th alternating lines per n
down, as such:
def checkSlope(right, down):
# selects every line if down 1, selects alternating nth line for down = n
arr_selected=arr[::down]
position = 0
count = 0
for line in arr_selected:
if line[position] == "#":
count += 1
# wrap around if we hit the end of the line
position = (position + right)%len(line)
return count
With that done, we just need to run our function per given slope, to get our answer!
slopes=[[3,1],[1,1],[5,1],[7,1],[1,2]]
prod = 1
for slope in slopes:
treeNum = checkSlope(slope[0],slope[1])
print("Right: ", slope[0],"Down: ", slope[1], "Trees: ", treeNum)
prod *=treeNum
print("Ans:", prod)
Right: 3 Down: 1 Trees: 220
Right: 1 Down: 1 Trees: 70
Right: 5 Down: 1 Trees: 63
Right: 7 Down: 1 Trees: 76
Right: 1 Down: 2 Trees: 29
Ans: 2138320800
Day 4
Ooooo this one is interesting. We arrive at the airport and (as per usual) we’re needed for some help with the long lines at the passport scanners! (Okay, but ngl I’m getting throwback to those 3h long lines at TSA and… shudders). The automatic passport scanners are slow because they’re having trouble detecting which passports have all required fields. The expected fields are as follows:
byr
(Birth Year)iyr
(Issue Year)eyr
(Expiration Year)hgt
(Height)hcl
(Hair Color)ecl
(Eye Color)pid
(Passport ID)cid
(Country ID)
Passport data is validated in batch files (your puzzle input). Each passport is represented as a sequence of key:value
pairs separated by spaces or newlines. Passports are separated by blank lines. For example, in our input file, we have our first 3 entries:
eyr:2033
hgt:177cm pid:173cm
ecl:utc byr:2029 hcl:#efcc98 iyr:2023
pid:337605855 cid:249 byr:1952 hgt:155cm
ecl:grn iyr:2017 eyr:2026 hcl:#866857
cid:242 iyr:2011 pid:953198122 eyr:2029 ecl:blu hcl:#888785
Alrighty, let’s import these data! This was a little bit more convoluted than our import from our previous days - I used a couple of split operations to create a dictionary of key:value
pairs per passport, and threw that into our best friend, a dataframe.
import pandas as pd
import re
arr = []
# import data
with open("input.txt", 'r') as file:
curDict = dict()
for line in file:
if line == "\n" or line == "":
# new entry! add data to array and create empty dictionary
arr.append(curDict)
curDict = dict()
else:
# parse input into tuples and add into dictionary
tuples=line.rstrip().split(' ')
for i in tuples:
i = i.split(':')
curDict[i[0]]=i[1]
# Creates DataFrame.
df = pd.DataFrame(arr)
df.tail(3)
eyr | hgt | pid | ecl | byr | hcl | iyr | cid | |
---|---|---|---|---|---|---|---|---|
284 | 2039 | 72cm | #d80d30 | #cc4aff | 2025 | 5307c9 | 2025 | 224 |
285 | 2027 | 184 | 58151347 | NaN | 2015 | 98fb9d | 2029 | NaN |
286 | 2021 | 183cm | 164cm | xry | 2019 | #18171d | 2013 | 187 |
Part 1
Our first task is to check for the validity of passports. Specifically, those that have all required fields. We are to treat cid
as optional, but all other fields are necessary. In our batch file, how many passports are valid?
Thanks to the format of our data management from before, this is easy. We simply check for any rows that have null fields (ignoring the cid
field). And thus, we have:
df['valid_p1']=~(df.drop(['cid'], axis=1)).isnull().any(axis=1)
df['valid_p1'].value_counts()
True 219
False 68
Name: valid_p1, dtype: int64
Ans: 219
Part 2
The line is moving more quickly now, but you overhear airport security talking about how passports with invalid data are getting through. Better add some data validation, quick!
byr
(Birth Year) - four digits; at least 1920 and at most 2002.iyr
(Issue Year) - four digits; at least 2010 and at most 2020.eyr
(Expiration Year) - four digits; at least 2020 and at most 2030.hgt
(Height) - a number followed by either cm or in:- If cm, the number must be at least 150 and at most 193.
- If in, the number must be at least 59 and at most 76.
hcl
(Hair Color) - a # followed by exactly six characters 0-9 or a-f.ecl
(Eye Color) - exactly one of: amb blu brn gry grn hzl oth.pid
(Passport ID) - a nine-digit number, including leading zeroes.cid
(Country ID) - ignored, missing or not.
That’s more like it! For each field, we now have a specific format we want to match - sounds like a task perfect for RegEx.
A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern. RegEx can be used to check if a string contains the specified search pattern.
Here are some examples of how to use RegEx from the w3schools page:
Character | Description | Example |
---|---|---|
[] | A set of characters | “[a-m]” |
\ | Signals a special sequence (can also be used to escape special characters) | “\d” |
. | Any character (except newline character) | “he..o” |
^ | Starts with | “^hello” |
$ | Ends with | “world$” |
* | Zero or more occurrences | “aix*” |
+ | One or more occurrences | “aix+” |
{} | Exactly the specified number of occurrences | “al{2}” |
| | Either or | “falls|stays” |
Alrighty, let’s get right into it! We can start with the easier fields, like byr
. We need byr
to be a string that comprises of 4 digits and is between 1920 and 2002.
# byr (Birth Year) - four digits; at least 1920 and at most 2002.
byr_valid = re.findall("(19[2-9][0-9]|200[0-2])", byr)
Let me break down the above pattern: the first half (before |
) asks RegEx to match the format 19(any digit between 2 and 9)(any digit between 0 and 9)
, i.e. any number between 1920 and 1999. The second half does the same thing, but for 2000-2002!
Even the more complicated looking fields like hgt
aren’t too bad - we just need to add more “either” fields using |
!
# byr (Birth Year) - four digits; at least 1920 and at most 2002.
hgt_valid = re.findall("(1[5-8][0-9]cm|19[0-3]cm|59in|6[0-9]in|7[0-6]in)", byr)
Here, we’re checking if the height falls between 150-189cm, or 190-193cm, or 59in, or 60-69in, or 70-76in. Not too bad once we get used to it!
The hardest field to check is the pid
field - here, we want to make sure the field is a nine-digit number, including leading zeroes.
# pid (Passport ID) - a nine-digit number, including leading zeroes.
pid_valid = re.findall("^\d{9}$", pid))
The ^
operator signifies that the field has to start with the following expression, \d
is a special sequence that returns only digits (the equivalent of [0-9]
), {9}
specifies 9 (digits), and the $
“end with” metacharacter matches the EOL character \n
. This metacharacter is especially important to make sure the pattern ends with those 9 digits and nothing more - otherwise, we’ll end up accepting 10 digit PIDs!
Whew! With that all done, we can finish up our little passport validity function and output our answer.
def isPassportValid(eyr,hgt,pid,ecl,byr,hcl,iyr):
validity = False
if not(pd.isna(eyr) or pd.isna(hgt) or pd.isna(pid) or pd.isna(ecl) or pd.isna(byr) or pd.isna(hcl) or pd.isna(iyr)):
passport_valid = []
# byr (Birth Year) - four digits; at least 1920 and at most 2002.
passport_valid.append(len(re.findall("(19[2-9][0-9]|200[0-2])", byr)) == 1)
# iyr (Issue Year) - four digits; at least 2010 and at most 2020.
passport_valid.append(len(re.findall("(201[0-9]|2020)", iyr)) == 1)
# eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
passport_valid.append(len(re.findall("(202[0-9]|2030)", eyr)) == 1)
# hgt (Height) - a number followed by either cm or in:
# If cm, the number must be at least 150 and at most 193.
# If in, the number must be at least 59 and at most 76.
passport_valid.append(len(re.findall("(1[5-8][0-9]cm|19[0-3]cm|59in|6[0-9]in|7[0-6]in)", hgt)) == 1)
# hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.
passport_valid.append(len(re.findall("^#[a-f0-9]{6}", hcl)) == 1)
# ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.
passport_valid.append(len(re.findall("(amb|blu|brn|gry|grn|hzl|oth)", ecl)) == 1)
# pid (Passport ID) - a nine-digit number, including leading zeroes.
passport_valid.append(len(re.findall("^\d{9}$", pid)) == 1)
validity = not(False in passport_valid)
return validity
df['valid_p2']=df.apply(lambda x: isPassportValid(x['eyr'], x['hgt'], x['pid'], x['ecl'], x['byr'], x['hcl'], x['iyr']), axis=1)
df['valid_p2'].value_counts()
False 160
True 127
Name: valid_p2, dtype: int64
Since I was new to RegEx, I had trouble debugging this one, and using this RegEx debugging tool (highly recommended!!!) helped me to figure out exactly what was wrong. It’s a really useful tool that I’ll be sure to use in the future - very very cool!
Thoughts after day 3 & 4: mhmmm text parsing, not too hot. but also, a newfound love for regex!