Arrays & Hashes
Key Points
- Use hash to reduce time needed to find k/v pairs
Group Anagrams
First Attempt
- Use the array of words to populate a hash with letters as keys and the indexes of words containing those letters as values
- Then create an
untriedarray with all indexes ofstrsand while it has elements- retrieve the indexes for each letter from the hash and compare them to the previous iteration, carrying forward the union
- If the union is empty, go next, if it isn’t then return the word and the words at the union indexes
- Rather than adding the index to the array, why not the word? That’s what I need to return anyway, and easier to compare
- This is definitely a step in the right direction, lets you check the size is the same
- Stuck on how to keep duplicates when reducing by intersection though
- Solved by getting them back by selecting from untried before deleting them from there
- Maybe it’s a nested hash? First key is the letter, then the count of that letter in the word, then the words that it applies to
- once you have that hash, maybe you can just iterate over it to solve?
- or, key the hash by the strings, with a nested hash of letter/count k/v pairs (maybe length too)
- yeah this one with the length, because then if all the keys are equal they’re an anagram
- maybe need a count too, to handle duplicate values
- then iterate over and they’re equal if same length and all counts equal
- So in summary, a hash with structure
{ length: { string: { count: n, letter1_count: n, ... letterN_count: n, } }}- Then you iterate within each length
- add the string to anagrams array (
counttimes) because it’s an anagram of itself - if any of the letter counts are different you go next
- if they’re all the same
- it’s an anagram
- you add it to the anagrams array for this string
- you remove the word from the length hash
- and push anagrams array to results
- add the string to anagrams array (
- This was sooooo close to working, I could make it correct if I excluded the occurence count from the comparison of letter counts
- Necessary because otherwise
["tea", "eat", "tea"]will not recognise ‘eat’ and ‘tea’ as anagrams due to them having a different number of occurences
- Necessary because otherwise
- But the exclusion made it too slow to finish in the allotted time
- So I added some complexity to the initial construction of the hash by separating each word’s
countshash intocount(would have been better to call itoccurences) andchar_counts, then just comparing thechar_countsto check if it’s an anagram - This implementation just snuck in under the time limit
- Which coupled with the huge amount of code I wrote, means there’s definitely a better way
- But hey, at least I solved it myself
Optimised
- Can group anagrams by sorting them, but O(m * n log n) where n is the average string length and m is the number of strings
- Ahhhh, I was on the right track but didn’t realise you can use the hash of counted values as the key
- That lets you make the value an array of words which match the count, which you can just return
- Time complexity is O(m * n)
- He made the key an array of letter counts, but doesn’t that take extra unnecessary space/longer to compare?
- Was because Python has more restrictions on hash keys than Ruby, but in Ruby a hash as a key is fine (with
=>syntax)
- Was because Python has more restrictions on hash keys than Ruby, but in Ruby a hash as a key is fine (with
Product of Array Except Self
First Attempt
- Calculate the sum of all values, then map over the input array returning that sum divided by the current element
- But you can’t use division
- You really can’t because they added a 0 in haha
- If you don’t care about time complexity, could just map over (with index)
- Return the results of reducing the array with
:*, skipping the current index - Yeah gets the correct solution but takes too long
- Return the results of reducing the array with
- Follow up says to do it in constant extra space, so I assume the initial solution uses another array or hash in some way
- So how do I do this in O(n) using an array/hash/both?
- Splitting it into start and end arrays before and after the current number also works
- But is also too slow
Optimised
- Last idea of splitting into 2 arrays was exactly right, but I was doing a lot of extra multuplication
- Initialise
prefixarray of sizenums.size, each index will be the product of all values up to and including that index innums- You can do this incrementally by multiplying
nbyprefix[i - 1]
- You can do this incrementally by multiplying
postfixarray of sizenums.size, each index will be the product of all the values after and including that index innums- You can do this incrementally by multiplying
nbypostfix[i + 1]
- You can do this incrementally by multiplying
- Need to iterate over
numsonce, populatingprefixandpostfixincrementally from the start and finish respectively - Once you’ve filled those two, loop
nums.sizetimes- prefix is
before[i - 1]- Or 1 if
i.zero?
- Or 1 if
- postfix is
postfix[i + 1]- Or 1 if
i == nums.size - 1
- Or 1 if
results[i] = prefix * postfix
- prefix is
Top K frequent elements
First Attempt
- Should just be able to make a hash of the element counts, key is the count & value is value
- But sorting it is kinda a problem maybe?
- Also you’re not really looking for a specific value, but a specific place in order
- Or make a 2D array of counts & values, then sort by counts and take the first
k- Can speed it up by skipping processed values at cost of memory, store them in a hash?
- Was actually still pretty slow
- Or to save memory just count each of the elements returned by a
uniqcall- This was also slow
- Can speed it up by skipping processed values at cost of memory, store them in a hash?
- Maybe put the counts/values in an array already in order using binary search, to avoid the sort?
- Then just return the last
kvalues
- Then just return the last
Optimised
- Use bucket sort to do it in O(n) time
- Create a second array
freqswith lengthnums + 1- use counts of each value as indexes to store an array of values which occur
counttimes
- use counts of each value as indexes to store an array of values which occur
- Get the
countshash by iterating over nums and incrementing the count value for the currentneach time- This is faster than calling
nums.count(n), even with duplicate prevention
- This is faster than calling
- Map
countshash tofreqsarray, using count as the index and pushing the value to the values array at that index - Take the last
kvalues from freqs
Two Sum
First Attempt
- Just try all the combinations
- Start by subtracting current value from target, and searching for the remainder
Optimised
- Hash, O(n)
- Create a hash of value/index pairs by iterating over the array
- For each value, subtract it from the target and check if there’s a key for the remainder
- Check as you create the hash so you don’t have to loop again
- If yes, return the value for the remainder key and the index of the current iteration value
- If no, add the current iteration value and its index as a k/v pair
Valid Sudoku
First Attempt
- Initialise
cols, an array of columns (which are Sets)row, an set representing the current rowsquares, an array of squares (which are Sets)
- Iterate over the existing array (of rows) and for each cell
- next if it’s a dot
cols[j].add?(n)row.add?(n)squares[(i / 3 * 3) + (j / 3)].add?(n)/3*3works because it’s integer division- so for example
i = 4,4 / 3 = 1, 1 * 3 = 3
- If any of those return false, return false as the puzzle is invalid
- Nailed it :) Other than taking ages an a lot of googling to figure out the algorithm for finding the current square