This is just an informal, personal review of the Katas that I have done. I will go over what I liked and didn't like for each as well as any other thoughts I had. This is meant to be an honest review.
One thing to note: none of the code is truly production ready. I did this as an informal exercise, so there's a good chance it would fall down if, for some reason, someone decided to use it in production.
- All of these are done using the D Programming Language, for no other reason than that I enjoy working in this language, even though I typically work in Java for work. Granted, it's not perfect, but I just enjoy it more and there are some very nice benefits to the language.
- No external libraries were used. I only used the standard libraries.
- Some of the interfaces are done in the way the kata specified vs. how I may have done things. The Dependencies kata is a good example of this as I created a struct just to contain methods and an array of dependencies so that the add/get methods are called as such: deps.get('A');
- Probably because I used to do a lot of database work, I tend to rely on filtering a lot. I haven't investigated any alternatives, but I suspect there are some. Not saying that they're better or worse than filtering as I simply do not know yet.
- The standard library for D has been very useful, especially std.algorithm, which contains the map, filter, uniq, and sort methods amongst others.
- The new syntax for lambdas is very nice and easy to use. E.g.:
a => a.foo
(a, b) => a.equals(b)
- I really wish std.log were available in D.
This was a very simple kata where the goal was to find all of the words that are anagrams to each other. To do this, I created a dictionary to store the words with a key that consists of the sorted letters in each word. For example, if the word is "dog", then it would be stored with the key "dgo". Then, when the word "god" appears, it will be associated with the same key. So, to get the anagrams for a specific word, I simply generate a key and retrieve the words associated with the key. For further normalization, I ignored words in the wordlist with a single quote in them and lower-cased the keys.
Because there may be multiple instances of the same exact word in the dictionary due to case, I have a second associative array which holds all of the words. If the lower-case version of the word is already in the array, then I don't process it. Otherwise, I add it to the anagrams.
At this point, I see one thing I'd like to change: instead of lower-casing the words when I use them, it would be better to have used a map with the filters to lower-case the words as I read them in. This way, I only lower-case the words once vs. 1-3 times.
The rest of the program looks to see how many anagrams exist in the dictionary as well as the longest set of anagrams and prints them out.
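To make the key scheme concrete, here's a minimal sketch of the dictionary-building step (the function name and shape are mine, not the kata's; the duplicate-word handling described above is omitted):

```d
import std.algorithm : sort;
import std.string : representation, toLower;

// Build an anagram dictionary keyed by each word's sorted letters:
// "dog" and "god" both map to the key "dgo".
string[][string] buildAnagrams(string[] words)
{
    string[][string] anagrams;
    foreach (word; words)
    {
        auto lower = word.toLower();
        auto letters = lower.dup.representation; // sortable ubyte[]
        sort(letters);
        anagrams[cast(string) letters.idup] ~= lower;
    }
    return anagrams;
}
```

Looking up the anagrams of a word is then just generating its key and indexing the associative array.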
This is similar to the ANAGRAMS kata, however we are looking for all two-word anagrams of the word "documenting". Initially, I screwed up as I thought that there were no repeated letters in the word, so by checking for uniqueness, the algorithm was extremely fast; however, I got no results. The fixed version gets the following results using a built-in word list:
- Words: mg, continued
- Words: mg, unnoticed
- Words: dec, mounting
- Words: dem, counting
- Words: gin, document
- Words: cong, minuted
- Words: cunt, demoing
- Words: nome, ducting
- Words: omen, ducting
- Words: ming, counted
- Words: mont, deucing
- Words: count, deming
- Words: tuned, coming
- Words: gnome, induct
- Words: mount, ceding
As one may guess, I filtered out values by size first so that I looked at words that were 2 and 9 characters long, then 3 and 8, and so on. I also wrote a method that checked each character of the anagram mapping key to ensure that we only looked at anagrams that only had characters in the word "documenting". I then matched up each short anagram key with a long one, matched up the words for each key with each other, made sure that the letters in each matched the word "documenting", and printed out the ones that matched.
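The character check might look roughly like this (a simplified sketch; the real version also has to respect letter counts, which the later matching step handles):

```d
import std.algorithm : all, canFind;

// Reject any anagram key containing a character that never occurs in
// the target word; only the surviving keys can be half of a two-word
// anagram of that word.
bool fitsTarget(string key, string target)
{
    return key.all!(c => target.canFind(c));
}
```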
It's a slow algorithm from an algorithmic standpoint, O(n^5), however it ran very fast on my VM. There may well be a better algorithm; I just haven't thought of it yet.
On at least Debian, the american wordlist is: /usr/share/dict/american-english
- Try to find an algorithm that doesn't run in O(n^5) time. Perhaps look into using a different type of data structure?
In this task, numbers are read in from an "OCR" and transformed into strings that contain the translated number. The data looks like this:
_ _ _ _ _ _ _
| _| _||_||_ |_ ||_||_|
||_ _| | _||_| ||_| _|
Simply reading them isn't enough as you also have to check to ensure the number is valid. This is done via a simple checksum algorithm and ensuring that all of the "numbers" are valid numbers. However, if there is an error with the number, then you have to try to automatically fix the number by checking each individual digit to see if it's off by one character in the original text.
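The checksum, per the kata description, weights the digits by position; a sketch (with d1..d9 reading left to right, so the leftmost digit gets weight 9):

```d
// Valid when (1*d9 + 2*d8 + ... + 9*d1) mod 11 == 0, per the kata.
bool validChecksum(const int[] digits)
{
    assert(digits.length == 9);
    int sum = 0;
    foreach (i, d; digits)
        sum += (9 - cast(int) i) * d; // leftmost weighted 9, rightmost 1
    return sum % 11 == 0;
}
```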
Note: This is only done once per digit! The text originally confused me as I thought that we needed to examine every possible error on every digit, but instead it's supposed to check only the first digit to see if it could be something else, then check only the second digit, and so on. So, at most, each possible "fix" varies from the original string by one digit. Needless to say, I initially had a large amount of "solutions".
This was a fun little problem and a bit different. I didn't deal with reading from an actual file and have a very ugly set of unit tests, but it works. For fixing the number, I went with a recursive solution, though it could just as easily be done with an iterative solution.
I did recycle the wordDiff function from the word_chains kata as it served the same purpose here, except instead of real words I'm examining a string of spaces, underscores, and/or pipes. It is rewarding to see such methods being reused.
Outside of that, nothing of interest. It could use a bit of cleanup, especially to improve performance, but again, this was a very small data set, so it's fast enough. Even then, I can't imagine when such a program would be necessary.
In this kata, the goal was to create 5 different versions of the binary search algorithm. I ended up creating six. Before we get into the algorithms themselves, I want to talk about how I tested them. Because all of the algorithms would have the same interface, I created a single test function and passed in the search function as a parameter. Not a realistic scenario since normally you don't put several versions of the same algorithm in the same file/class/whatever, but it did cut down significantly on the amount of code that needed to be written.
The first version is a simple iterative version that "loops" over the array of data searching for the value. Nothing of note here, except that one possible change would be to move the last-index check into the loop condition.
The second version is a standard recursive version where we search for a value between a start and end pair of indices. This recursive version would work in any language as it leaves the array intact.
The third version is another recursive version, but instead of passing in a start and end index, it simply calls itself with a subset of the data. It's a bit simpler than the previous version because there was no need to write a second function to handle the start and end indices, however the logic is a bit more complicated since I had to keep in mind to calculate the actual index in the array since the slices start at 0 in the new call.
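A sketch of the slice-based version; carrying an offset parameter is one way to recover the index in the original array (my actual version tracked it slightly differently):

```d
// Recursive binary search over shrinking slices. Each slice restarts
// at index 0, so the slice's offset in the original array rides along.
long chop(int needle, const(int)[] haystack, long offset = 0)
{
    if (haystack.length == 0)
        return -1; // not found
    immutable long mid = cast(long) haystack.length / 2;
    if (haystack[mid] == needle)
        return offset + mid;
    if (haystack[mid] > needle)
        return chop(needle, haystack[0 .. mid], offset);
    return chop(needle, haystack[mid + 1 .. $], offset + mid + 1);
}
```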
The next two versions are a bit silly as they create and call new Fibers/Threads each time a match is not found. It was a good exercise in using these, however the implementation is silly and slow.
The sixth version is more interesting as it leverages message passing to do the work. It only sends one message and the spawnChop function is the same one used in the first version, however it was useful to show how this could be done. If nothing else, the algorithm can be called asynchronously.
As for benchmarks, which I did out of curiosity, the results of a run are below:
- Bench1: 15
- Bench2: 22
- Bench3: 20
- Bench4: 483361
- Bench5: 4628804
- Bench6: 377949
The names reflect the versions of the algorithms. As one can see, 4 and 5 are lousy implementations. Version 6 could be improved by spawning the search function once vs. every time the method is called, which should make it much faster. Versions 1-3 are consistently the fastest and, from the initial benchmarks using an array of 1,000,000 integers, all appear to be acceptable solutions. Memory may become an issue with the recursive versions, but that can be remedied pretty easily using pass by reference vs. pass by value.
Overall, this was a fun exercise. I may come back to it later to try to improve some of the algorithms as benchmarking is fun.
This was another fun exercise. First, though not part of the actual challenge, D lets you perform a word-count on a file with only one word per line using one line of code:
size_t wc = reduce!((a, b) => a + b)(0, File(args[1]).byLine().filter!(a => a != "").map!(a => 1));
Needless to say, it's pretty slick. Anyway, the reason for this was to determine the number of words so that the number of bits needed for the filter would be large enough. The size of the BitArray used was calculated as follows:
(2 ^^ size) - 1
After some experimentation, it appears that increasing the exponent by 3 made the ratio of set elements to the size of the bit array much better.
Following the advice of someone else who completed this kata, I used the MD5 algorithm to generate the hashes for the words and used the first 32-bits to generate the index.
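The index derivation looks roughly like this (the big-endian byte packing is my choice here; any consistent 32-bit slice of the hash would do):

```d
import std.digest.md : md5Of;

// First 32 bits of the word's MD5 hash, reduced to a filter index.
size_t bloomIndex(string word, size_t nbits)
{
    auto hash = md5Of(word); // ubyte[16]
    uint first = (hash[0] << 24) | (hash[1] << 16) | (hash[2] << 8) | hash[3];
    return first % nbits;
}
```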
The rest of the program generates 100 random 5-letter words and checks to see if they exist or not. It also checks the original dictionary to verify that each word is a real word. The final result is the number of hits against the filter as well as the number of false positives.
In this kata, the goal was to create three versions of same algorithm to concatenate 2, 3, and 4 letter words into real 6 letter words. One version is supposed to be readable, one version is supposed to be fast, and the other is supposed to be flexible. The purpose was to compare and contrast each solution.
The first solution is the readable one where I mostly used standard language constructs to create a solution. The performance isn't the best, but it does work well.
The second solution is the fastest. The largest speedup is due to the fact that the words are broken out into different associative arrays based on size. Then, each possible pairing (2 chars and 4 chars, 3 chars and 3 chars, 4 chars and 2 chars) is checked using two loops: one for 3-char words only, and the other checks both permutations of 2- and 4-char words per iteration. Other attempts to speed it up were tried, but this version appears to be the fastest of them.
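In outline, the pairing check looks something like this (a simplified sketch; the names and the bool-set representation are mine):

```d
// Words pre-sorted into sets keyed by length; check 3+3 in one loop
// and both orders of 2+4 in the other.
string[] findConcats(bool[string][size_t] byLen, bool[string] sixes)
{
    string[] found;
    auto twos = byLen.get(2, null);
    auto threes = byLen.get(3, null);
    auto fours = byLen.get(4, null);

    foreach (a, _; threes)
        foreach (b, _2; threes)
            if ((a ~ b) in sixes)
                found ~= a ~ b;

    foreach (a, _; twos)
        foreach (b, _2; fours)
        {
            if ((a ~ b) in sixes) found ~= a ~ b; // 2 + 4
            if ((b ~ a) in sixes) found ~= b ~ a; // 4 + 2
        }
    return found;
}
```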
Lastly was the extensible version. It uses a combination of techniques from the two previous versions to try to be relatively fast, but also flexible. The flexibility comes into play when you pass in a word size: it will gather all of the words that are at least 2 characters long up to the size of the word minus 2 characters. Then, it goes over all of the possible combinations, and I mean all, to see if the new word is legit.
Here are the benchmarks:
- Bench1: 32140
- Bench2: 4048
- Bench3: 63918
As one can see, the most flexible version is the least efficient. In retrospect, a stack or similar could be used to cut down on the number of words that get checked per iteration, but I'm not 100% sure how much that would save. Perhaps this is an exercise for another time.
Regardless, this shows that the more you can tailor your solution to the problem, the better your performance can be. In this case, since I knew what the word sizes would be, I could make some assumptions with respect to the algorithm to improve performance which, in this case, is pretty significant even if you discount the third version.
- Look into the stack solution to improve performance of the flexible version.
Another first, this kata is a basic implementation of Conway's Game of Life. I think I've looked at this before and it appears to be simpler than I recall. There's nothing really special about the code. It was just fun.
This was a two-part kata where two different solutions were created and then refactored to leverage as much code between the two as possible. In D, this was actually very simple since most of the logic leveraged built-in libraries, so there was only really one method that had to be abstracted out plus one data structure.
I'm particularly proud of this kata because it fit in very nicely with component programming. The beauty of this is that it creates very concise code that, if the methods are named correctly, is quite readable. Below is the core code for each program, one for football (soccer) data and the other for weather data.
Record!(string) rec;
File("football.dat").byLine()
.map!(a => a.split())
.filter!(a => a.length == 10)
.filter!(a => isNumeric(a[0]))
.map!(a => Record!(string)(a[1].idup, parse!int(a[6]), parse!int(a[8])))
.findMinDiff(rec);
writefln("Team %s, Spread %d", rec.id, rec.diff);
**
Record!(uint) rec;
File("weather.dat").byLine()
.map!(a => a.split())
.filter!(a => a.length >= 14)
.filter!(a => isNumeric(a[0]))
.map!(a => Record!(uint)(parse!uint(a[0]), parse!uint(a[1]), parse!uint(a[2])))
.findMinDiff(rec);
writefln("Day %d, spread %d", rec.id, rec.diff);
The only real differences between the two are how we determine if it's a line we care about (the two filters) and the data we extract (the last map line). Both versions are very clean and concise, especially since each line performs a different function.
The only aspect of this code that is unknown is whether this is also an efficient solution. The data set was very small, so no benchmarks were performed. It may be interesting to see how this would work with a larger data set vs. more custom code.
- Create a larger data set for one of the programs and try to determine if a
faster solution can be created.
- Create the data set using another program.
This kata requires that I create code that can determine all of the dependencies of an initial "key". By all, this means all of the dependencies of each dependency and so on down the chain. Essentially, this is an exercise in turning a graph into a unique list of elements.
The algorithm I used is to first get all of the dependencies into a single struct containing an associative array. The add method takes a key and an array of all its dependencies. The get method first initializes an associative array with all of the direct dependencies for the current key. Then, for every dependency, I keep digging down into their dependencies until no more dependencies are added.
It's an O(n^2) algorithm, therefore not terribly efficient, but good enough for the small data set I was working with. Not really sure how to better do this as there's the issue of circular dependencies, so graph traversal algorithms become problematic unless you have a good way of detecting circular dependencies, though perhaps the same solution I used for detecting duplicates would work.
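A sketch of the approach, with the "seen" set doubling as the circular-dependency guard (the add/get method names match the kata's interface; everything else here is illustrative, and the result is sorted only to make output deterministic):

```d
import std.algorithm : sort;
import std.array : array;

struct Deps
{
    string[][string] direct;

    void add(string key, string[] deps) { direct[key] = deps; }

    // Walk the graph via a work list; anything already seen is
    // skipped, so cycles like A -> B -> A terminate naturally.
    string[] get(string key)
    {
        bool[string] seen;
        string[] work = direct.get(key, null).dup;
        while (work.length)
        {
            auto d = work[0];
            work = work[1 .. $];
            if (d in seen)
                continue;
            seen[d] = true;
            work ~= direct.get(d, null);
        }
        return seen.keys.sort().array;
    }
}
```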
In this kata, the goal was to create a simple string replacement method. It was nothing special from a complexity standpoint, but I decided to take it up a notch by making it a bit smarter. I did use the built-in replace method for arrays, but I tried to make the code smart by using a regex to find all of the tokens to replace, get a unique list of them, then do the replacements. This way, for each token the replace method is only called once regardless of how many times it was captured by the regex.
After re-reading the description, this seemed like a very lame way of doing it, so I created a second version that examines every character individually and, with a few simple rules, does the replacement inline. If the character isn't part of a string to be replaced, then it is simply appended to a new character array. If it is, then after finding the end of the string to be replaced, the replacement string is appended to the new character array instead.
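The single-pass scan amounts to something like this (the "$(...)" token syntax is an assumption for illustration; the kata leaves the marker format open):

```d
// Scan for "$(name)" markers and splice in the substitution,
// appending everything else untouched.
string substitute(string input, string[string] values)
{
    char[] result;
    size_t i = 0;
    while (i < input.length)
    {
        if (input[i] == '$' && i + 1 < input.length && input[i + 1] == '(')
        {
            auto end = i + 2;
            while (end < input.length && input[end] != ')')
                end++;
            if (end < input.length)
            {
                result ~= values.get(input[i + 2 .. end], "");
                i = end + 1;
                continue;
            }
        }
        result ~= input[i];
        i++;
    }
    return result.idup;
}
```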
Amusingly, the second version crushes the first version in benchmarks (usecs).
- Bench1: 15565
- Bench2: 2177
The lesson to be learned here is that just because you have built-in tools to perform a task, it doesn't mean you should use them. The only real advantage of the first version over the second is that it takes up fewer lines of code.
This is another one of my personal katas. It was just an exercise in creating small encryption algorithms, most of which aren't very secure.
There was nothing too interesting about the exercise except that the first three algorithms are all related. For example, the Vigenère algorithm leverages the doShift method from the Caesar cipher since Vigenère is based on Caesar. Autokey is then based on Vigenère, which resulted in a very short encryption method (two lines); however, due to how the algorithm works, I needed to create a new decryption method.
Some other notes:
- Starting with the Caesar cipher, I refactored code to make it more concise. I may have been able to do more of that with the others, but there's so little complexity, it didn't really matter too much.
- Used structs to "scope" the methods as I wanted everything in one file for simplicity for the exercise, but wanted to ensure that I didn't have to do any weird naming to make it work.
- Still have no idea how to convert a string to a binary array (ubyte[]) properly. Perhaps I could have created my own, but I was hoping that std.conv would have something that would be regarded as correct.
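For what it's worth, the standard library does seem to cover this case via std.string.representation, which reinterprets the character data as bytes without copying (the function names below are my own wrappers for illustration):

```d
import std.string : representation;

// View a string's characters as bytes (no copy).
immutable(ubyte)[] asBytes(string s)
{
    return s.representation;
}

// A mutable copy, for ciphers that modify the buffer in place.
ubyte[] asMutableBytes(string s)
{
    return s.representation.dup;
}
```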
UPDATE: I added some code to perform quick-check like testing against the encryption functions. This found a bug with my substitution algorithm that I didn't have a unit test for. It also worked very well ensuring that my tests are passing 100% of the time...so far. The library is not part of this as I need to separate it out, but it does provide additional proof that it works.
This may be surprising, but I never did this one before. I've known about it for a while, but never actually implemented it as, well, it's dirt simple. However, I decided to do it for fun and the referenced version makes it a little more interesting.
Overall, the code is uninteresting except that since isFizz and isBuzz were so simple, I simply declared the methods using lambdas.
Just for laughs and giggles, here's the core of the program:
import std.algorithm : count;
import std.conv : text;
import std.stdio : writeln;

void main()
{
    foreach (i; 1 .. 101)
    {
        writeln((isFizz(i) && isBuzz(i)) ? "FizzBuzz"
                : isFizz(i) ? "Fizz"
                : isBuzz(i) ? "Buzz"
                : text(i));
    }
}

auto isFizz = (int a) => (a % 3 == 0) || (text(a).count("3") > 0);
auto isBuzz = (int a) => (a % 5 == 0) || (text(a).count("5") > 0);
The hardest part was figuring out a tiny bug that I had in the program with the count method. Normally, D is good about finding issues with data types, however in this case, I accidentally passed in the number 3 vs. the string "3", so it would never return the correct result.
So, since it was so straight-forward the first time, I decided to generate an alternative solution using OO, specifically a factory method and inheritance. It's much more verbose, but it does illustrate how it can be done pretty easily. I won't go too far into whether it's better or not as one can go on the internet to find out; however, if I were to abstract out the original solution into its own function, that would be the cleanest and perhaps best approach.
As for other approaches, I simply can't think of any. At least none that are significant in any way. I could use if statements vs. the ternary operator, but that accomplishes the same thing, just more verbosely. I could use a switch statement and have a method return a value of 0-3 or an enum, but that's just making the original solution more complex than needed. The OO solution is already there, but I just did it for illustration purposes. That, and I wanted to try to come up with something else.
However, D does have an interesting concept called a range where I can create a data structure that can be iterated on by simply providing a specific set of methods. There's more to it than that, but for the sake of this example, that's all that we really need to know. While it too is more verbose than the original version, this is perhaps the most useful version since it can be used anywhere a range can be used, such as with most of the methods in std.algorithm and std.range as well as foreach loops.
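For reference, the range version boils down to something like this (a sketch, not my exact code; it keeps the referenced rule where numbers containing a 3 or 5 also count):

```d
import std.algorithm : count;
import std.conv : text;

// A minimal input range: empty/front/popFront are all that foreach
// and the std.algorithm/std.range functions need to drive it.
struct FizzBuzzRange
{
    int current = 1;
    int limit = 100;

    bool empty() const { return current > limit; }

    string front() const
    {
        immutable fizz = current % 3 == 0 || text(current).count("3") > 0;
        immutable buzz = current % 5 == 0 || text(current).count("5") > 0;
        if (fizz && buzz) return "FizzBuzz";
        if (fizz) return "Fizz";
        if (buzz) return "Buzz";
        return text(current);
    }

    void popFront() { current++; }
}
```

With this, `foreach (s; FizzBuzzRange()) writeln(s);` covers 1 through 100 just like the original.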
Benchmarks indicate that all of the solutions are about the same in terms of performance. The original version is the fastest followed by the range version and lastly, the object oriented version. However, this only reflects a single run. Other runs have shown that the OO version is faster than the range version, so expecting one to be consistently faster is not recommended. Regardless, I believe the range version is the preferable version due to its flexibility.
- Bench1: 2171414 -- Original
- Bench2: 2358674 -- OO
- Bench3: 2342891 -- Range
This was the most complicated and difficult of the katas. It wasn't because I didn't know the game; I've played it enough. The issue was that I probably should have spent more time planning than I did. I just kept getting hit by bugs that could have been avoided if I had taken the time to design the algorithm.
This exercise was also a learning experience because I initially started with standard arrays, but they proved to be too memory inefficient. Not 100% sure why, but it's probably in part due to the fact that D doesn't have a good garbage collector yet. I worked around this by using the Array collection in the standard library. It doesn't work exactly the same, but it's close enough that it's a direct replacement in many cases. In the cases where it isn't, there's usually a simple change to be made.
The one case where it wasn't a simple change was when I tried to test the equality of my structs. Initially, I overrode the built-in equality operator so I could use == to do the comparisons. However, for some reason, the Array collection didn't like this, so I changed it to an equals method and changed the normal calls to use this method. Inconvenient, but it worked.
I also got to take advantage of CTFE (Compile Time Function Execution) to generate the initial deck. Granted, I had to write the code to generate the deck, but it was much less tedious than having to manually create the deck.
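The deck generation amounts to something like this (the names are illustrative, not my exact code); assigning the result to an enum is what forces the call to happen during compilation:

```d
import std.traits : EnumMembers;

enum Suit { hearts, diamonds, clubs, spades }

struct Card
{
    Suit suit;
    int value; // 1 (ace) through 13 (king)
}

Card[] makeDeck()
{
    Card[] deck;
    foreach (s; [EnumMembers!Suit])
        foreach (v; 1 .. 14)
            deck ~= Card(s, v);
    return deck;
}

enum deck = makeDeck(); // evaluated at compile time (CTFE)
static assert(deck.length == 52);
```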
Overall, the code works as it should, though the algorithm for playing the game is too simple and doesn't do very well. Testing it against 100,000 games results in 0 wins, so I'm not very happy with the results. With more time, I'm sure I can come up with a better algorithm, which is a side goal of the kata; the primary goal, at least, was accomplished. The good news is that a significant portion of the program should not require any refactoring, so it's just a matter of coming up with better playing strategies. However, before doing so, this needs to be broken out into separate modules. I didn't think that was necessary initially, but the file has now grown to the point (875 lines) where I don't want to work in it.
- Create more unit tests.
- Work on a better algorithm. Need to work this out on paper as there are
probably 20,000 different ones, so need to pick and choose.
- Try to stick with incremental improvements.
In this kata, I was to create three types of lists: a singly linked list, a doubly linked list, and a list that did not involve the use of pointers. The first two were very easy because I could simply use classes for the nodes. I wish I could have used structs, but it's probably quite infeasible since they are allocated on the stack; removing arbitrary nodes would be very difficult.
The third implementation uses an array to back the list; when adding new nodes, I expand the array to twice its length, up to size_t.max elements. The goal was a data structure that allows for faster iteration. Inserts are also pretty fast as long as they don't require a resize, since the index of the last element is stored at all times, so I always know where it is. Removal of elements is slow as it requires condensing the array to remove any holes.
Here are some benchmarks to show the performance of each solution:
- Benchmarking adds.
- Add1: 80376
- Add2: 13297
- Add3: 11665
- Benchmarking finds.
- Find1: 73909
- Find2: 72968
- Find3: 56075
Nothing surprising here. Singly linked lists have to iterate over the entire list before adding an element, therefore they are the slowest. As for searches, arrays are more compact in memory, therefore they are searched a bit faster. I didn't benchmark removes because I couldn't figure out how to get a consistent data set across all of the runs without skewing the results with the code for populating the lists.
This kata was quite simple. Not really much to say about this since it's very straight-forward. The only cool feature used that hasn't been seen in the katas yet is a static regex that's created at compile time. This is nice because the regex is only compiled once instead of every time it's used or when the program starts.
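The feature in question is std.regex.ctRegex; the date pattern below is just a placeholder, since the kata's actual pattern isn't shown here:

```d
import std.regex : ctRegex, matchFirst;

// The pattern is compiled while the program compiles; regex() would
// instead compile it at run time, on first use.
enum datePattern = ctRegex!`^\d{4}-\d{2}-\d{2}$`;

bool isIsoDate(string s)
{
    return !matchFirst(s, datePattern).empty;
}
```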
This is the first of my custom katas that I decided to work on. It was nice as I had to truly work on getting the range interface to work. While I did implement the InputRange properties, I'm not sure they were needed at all. Of course, the front property would probably be very useful, so I guess it was worth it.
Now, this is based on the doubly-linked list implementation I created before. Because of how the inserts need to work, there were a number of changes to be made. I did keep the find method as it is useful for finding an element in the list in the event that the element being added is already in the list. I also kept the values method as it is very useful for debugging/unit testing.
Now, I considered the last of the rules and came to the conclusion that having some sort of timer call the "clear" method is probably the best approach. My thought is that while it would be cleaner to do it when calling the put function, it would also slow things down considerably, especially if a large number of values are inserted very rapidly. Granted, the removal process should be pretty efficient since it starts at the end and stops removing elements when it finds one that doesn't need to be removed. Even so, it can add up quickly.
Looking up some more information, particularly the Wikipedia page for a Self-Organizing List, I noticed that I only considered one particular use case. I think this is fine for the initial exercise, though it does seem to confirm that my original data structure is the best one for the job. The article is also an interesting read with regards to different use cases and different algorithms.
In this case, I think it's done correctly. With the range interface, every time I do something with a value, I can "put" the value into the list and have it be the most current item. I can also see what the 10 most recently used items are.
The only other change to be considered is checking the length of the list on every insert and if the size of the lists exceeds a max size, we drop the last element. This would keep the size in check, but not eliminate the need for the clear method as if there is a significant amount of time between updates, then elements within those top N elements would need to be removed as well.
This was an interesting little kata where the goal is to generate every possible way N pairs of parentheses can be nested. It's a problem that is done more easily in a recursive fashion than an iterative one; I tried to come up with a way to do it iteratively, but I just didn't see a way to do it cleanly.
Now, my original approach was, I think, polluted by the first solution at the bottom of the page, so I tried to do something similar, but it didn't get all possible scenarios and ended up with a lot of duplicate data. I got close with four pairs of parentheses, but once I figured out which one was missing, I realized the whole algorithm was wrong. I ended up using a character buffer and added a single parenthesis at a time. This worked much better, though it did take some conditional logic.
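For comparison, here is the classic recursive formulation (not my buffer-based version, but it captures the same conditional logic: you may open a parenthesis while any remain, and close one while more are open than closed):

```d
// All balanced nestings of n pairs of parentheses.
string[] nestings(int n)
{
    string[] results;

    void build(string acc, int open, int closed)
    {
        if (acc.length == 2 * n)
        {
            results ~= acc;
            return;
        }
        if (open < n)
            build(acc ~ "(", open + 1, closed);
        if (closed < open)
            build(acc ~ ")", open, closed + 1);
    }

    build("", 0, 0);
    return results;
}
```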
This is a combination of two Katas:
The first focuses on the various poker hands to determine who is the winner. None of the methods were complicated to create, but a good number were written. All have unit tests, which is how I planned on testing the Kata. In the original description, the goal was to read from a text file and determine who won, however since I was going to use the same code for the Texas Hold'em kata, I decided unit tests were the way to go.
I still went with the string format specified in the Kata as it made writing the various tests much easier. I still used the Suit and CardValue enums from the Klondike kata with the change to put the Ace as the highest card. Since I planned on working with the structs, I had to create a method to convert the string to the cards. I also opted to sort the cards so that it was only done once vs. several times.
To determine the hand, instead of creating one method per hand, I created several methods that could be combined for some of the hands. Others are still specific to a particular hand, but not many. Overall, the simplicity and conciseness of the functions are very good. The one method I'd love to find a cleaner implementation for is handRank, where I have one if statement per possible hand. It seems less than ideal, but without pattern matching, like in functional programming, I'm not sure what else to do.
Now, the Texas Hold'em part of the kata was a bit more fun due to how frustrating it was to get a bit of code working right. The problem is that, given a set of 7 cards, you need the best hand you can make with those cards, which means trying every combination to see which is the best hand. We humans do this pretty well, but computers are a bit dumber, therefore we have to test each combination.
Note: I just realized that what is actually a combination I kept referring to in the code as a permutation. So, when reading the code: permutation -> combination.
So, my plan was to create all of the combinations at compile time so I can simply iterate over them when I execute the program. However, generating the correct combinations proved to be a bit trickier than I expected. I always seemed to be off by a small flaw. Turns out, all I had to do was turn some of my logic around and that took care of the issue.
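Generating the C(7,5) = 21 index combinations is small enough to run through CTFE; a sketch (names illustrative, not my exact code):

```d
// Index combinations of k items out of n, in lexicographic order.
int[][] combinations(int n, int k)
{
    int[][] results;

    void pick(int[] acc, int start)
    {
        if (acc.length == k)
        {
            results ~= acc;
            return;
        }
        foreach (i; start .. n)
            pick(acc ~ i, i + 1);
    }

    pick(null, 0);
    return results;
}

enum handIndices = combinations(7, 5); // computed at compile time
static assert(handIndices.length == 21);
```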
The rest of the code is pretty basic and uses the code that I wrote before. I just ran it through the test case that was part of the kata description, so I didn't do much testing nor did I do all of the logic for determining the winner, such as resolving ties by looking for the high-card. I solved it before for two players, so it's just a matter of extending it to several hands.
In this kata, I had to create another shopping cart, but in this case it had to be a bit more intelligent. Based on the set of books provided, I had to determine which was the cheapest way to apply the following discounts:
- 5% off two different books.
- 10% off three different books.
- 20% off four different books.
- 25% off five different books.
So, for example, if I bought eight books, it would be cheaper to apply the 20% discount twice (4 and 4) vs. the 10% and the 25% (3 and 5). This kata was also to be done iteratively to solve the simpler unit tests first, so that's what I did.
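Assuming the kata's price of 8 per book (an assumption carried over from the kata description, not stated above), the eight-book example works out like this:

```d
// Total price for a grouping, where each group is a count of distinct
// books and receives that size's discount.
double price(const int[] groupSizes)
{
    double[int] discount = [1: 0.0, 2: 0.05, 3: 0.10, 4: 0.20, 5: 0.25];
    double total = 0;
    foreach (g; groupSizes)
        total += g * 8.0 * (1.0 - discount[g]);
    return total;
}
```

Here price([4, 4]) comes to 51.20 while price([3, 5]) comes to 51.60, so two groups of four win.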
The first few cases were quite easy:
- If the array is empty, the cost is 0.
- If the array has one element, then simply return the cost of one book.
- If the array contains only 1 book multiple times, then simply multiply the number of books by the cost.
- If the books in the array are all unique, then apply the appropriate discount.
The last case is where it got tricky. At first, it was pretty simple because I could simply generate sets of unique books while maintaining the number of each book in a separate array. A book could only be part of the permutation if there were still instances of it left. This solved all but the last case where the most straight-forward solution isn't the right one.
After some thought, I came up with a solution where I would check various permutations to see which one provides the lowest cost. You can look at the code to see the algorithm as I won't describe it here. However, the part of interest is that I implemented it as a range. This was very simple and integrated well with the rest of the code.
The only other real change I had to make to the overall algorithm was that, when I got a unique set of books to try, I had to make sure there were as many books in it as required by the permutation. If I didn't check that, I would get sets of books that didn't match the permutations. This showed up when I had several copies of one book left, so I would get an array of 1 book for the set of unique books. Since this was associated with the last element in the permutation, it wouldn't take into account all of the remaining books, thus producing a total that was incorrectly low. E.g.
- Permutation: 5,5,5,3,3,3
- Last set of unique books: [4] (Length == 1, not 3)
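The pricing arithmetic behind the eight-book example can be sketched as follows, assuming the kata's usual base price of 8 per book (the function names are mine):

```d
import std.algorithm : map, sum;
import std.stdio : writeln;

// A grouping is a list of set sizes: how many distinct books share a
// discount. Base price 8 per book; discounts as listed above.
double groupCost(int size)
{
    // discount by set size; sizes 0 and 1 get no discount
    static immutable double[] discount = [0, 0, 0.05, 0.10, 0.20, 0.25];
    return size * 8.0 * (1.0 - discount[size]);
}

double totalCost(int[] groups)
{
    return groups.map!groupCost.sum;
}

void main()
{
    auto a = totalCost([4, 4]); // 2 * (4 * 8 * 0.80) = 51.20
    auto b = totalCost([5, 3]); // 5*8*0.75 + 3*8*0.90 = 51.60
    assert(a < b); // the 20% discount twice wins for eight books
    writeln(a, " vs ", b);
}
```

This is exactly why the straight-forward greedy answer (take the biggest set first) fails: the search has to compare groupings, not just grab the largest discount available.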
Another very simple exercise. Not much to this one, as I did it entirely in the main function in just over 50 lines of code. (Allman-style makes code longer.) I did create the output visually as suggested, only because it was pretty easy.
I'm not going to code this one because after doing the Roman numeral calculator, I've done the bulk of the work, therefore all I really need to do is implement the basic operations (addition, subtraction, multiplication, division). To really make this useful, I could parse the equation into a binary tree and then traverse the tree to do the calculations, but that's way beyond the kata since it only requires addition. In that case, I could simply create a new struct for the "Roman Numbers" and override the addition operator.
- Create a full-fledged calculator.
- Support RPN?
Simple kata to convert to/from Roman numerals. Converting to Roman numerals was more complicated than converting from them, but still pretty easy. I used a straight-forward algorithm where every time a Roman numeral could be used, I added it to the string. So, for the number 4, I would get "IIII". Then, I did some post-processing to convert the various patterns that would result in something similar to "IV" to their correct forms.
The other direction was very simple as I created an array of integers where, if the next value was greater than the previous value, I set the previous value to the new value minus the previous value. I then summed all of the numbers together.
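A minimal sketch of the from-Roman direction, using a slight variant of the step above (negating the smaller value rather than rewriting it, which sums to the same next-minus-previous result):

```d
import std.algorithm : sum;
import std.stdio : writeln;

int fromRoman(string s)
{
    // map each numeral to its value
    int[char] values =
        ['I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000];

    int[] vals;
    foreach (c; s)
        vals ~= values[c];

    // A value followed by a larger one counts negatively:
    // IV -> -1 + 5 = 4, i.e. next minus previous once summed.
    foreach (i; 1 .. vals.length)
        if (vals[i] > vals[i - 1])
            vals[i - 1] = -vals[i - 1];

    return vals.sum;
}

void main()
{
    assert(fromRoman("III") == 3);
    assert(fromRoman("XIV") == 14);
    assert(fromRoman("MCMXC") == 1990);
    writeln("ok");
}
```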
For the first time in D, I wrote conditions on a single line if it could be done. E.g.:
if (diff < 0) continue;
Normally, I always wrap braces around the statements within the block, but after trying it out, in cases such as that one it is cleaner. Unfortunately, if I have to modify it, I have to add the braces back in and reformat, but I think I like this form as long as I don't go nuts with it.
In this kata, the goal is to create a calculator of sorts where the price of the groceries is determined by a set of rules. These rules specify the cost per item, but also a special price if there is one, such as 3 for $1.00.
Nothing really tricky here. If, with the new item, you now have a multiple of the special quantity (e.g., 3 if it's 3 for $1.00), then you subtract the regular cost already charged for the items counted toward the special (e.g., 2 items) and replace it, along with the new item's cost, with the special price.
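That rule can be sketched as follows; the struct names and prices are made up for illustration:

```d
import std.stdio : writeln;

// Hypothetical pricing rule: an item costs `unitPrice` each, with an
// optional special of `specialQty` for `specialPrice`.
struct Rule { int unitPrice; int specialQty; int specialPrice; }

struct Checkout
{
    Rule rule;
    int count;
    int total;

    void scan()
    {
        count++;
        total += rule.unitPrice;
        // On every multiple of the special quantity, back out the unit
        // prices of that batch and charge the special instead.
        if (rule.specialQty > 0 && count % rule.specialQty == 0)
            total += rule.specialPrice - rule.specialQty * rule.unitPrice;
    }
}

void main()
{
    auto co = Checkout(Rule(50, 3, 100)); // 50 each, 3 for 100
    foreach (_; 0 .. 3) co.scan();
    assert(co.total == 100);  // 3 for 100, not 150
    co.scan();
    assert(co.total == 150);  // fourth item back at the unit price
    writeln("ok");
}
```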
In this kata, two sorting routines were created without using the built-in sort routines. First, sorting numbers by adding new numbers to a struct and getting back an array of the numbers in a sorted manner. I used a simple binary tree for this, though in retrospect, a self-balancing structure should be used in production so the tree stays balanced.
The second sorting algorithm was to take a string and turn it into a string where all the letters are lower-case and in sorted order. In this case, I just looped over each letter in the alphabet and, as I found a letter in the string that matched, I added it to a new array. Not good for a large data set, but good enough for a simple string.
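A sketch of that alphabet-walk approach, which is effectively a counting sort (O(26 * n) rather than O(n log n)):

```d
import std.ascii : lowercase; // "abcdefghijklmnopqrstuvwxyz"
import std.string : toLower;
import std.stdio : writeln;

// Walk the alphabet once and, for each letter, append every matching
// character of the lower-cased input to the result.
string sortLetters(string input)
{
    auto lowered = input.toLower;
    char[] result;
    foreach (letter; lowercase)
        foreach (c; lowered)
            if (c == letter)
                result ~= c;
    return result.idup;
}

void main()
{
    assert(sortLetters("Hello") == "ehllo");
    writeln("ok");
}
```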
Note: I can't remember where I originally got this from, but the link provided does describe the kata, just with more variations than I handled. The original version I looked at didn't have parts 6, 8, and 9. Also, part 7 was a little different in that the delimiter didn't have to be in square brackets. Perhaps one day I'll go back and complete this version, but for now, I consider this task complete. One person also suggested doing this without conditional statements, so perhaps I'll try that as well.
This was a somewhat interesting kata as it was implemented in phases, each phase adding a new twist. The goal is to create a calculator which accepts a delimited string and adds up all of the values. However, negative values are not allowed, and the delimiter can be either the comma or a user-specified character given at the beginning of the string. E.g.: //;\n1;2
The "\n" character is also a delimiter, and delimiters can be multiple characters. So, what I did is normalize all delimiters to either the comma or the specified delimiter, then split on it. I then checked for negative values and threw an exception if one was found. If not, I summed up all the values using a reduce method.
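The normalize-then-split flow can be sketched like this. It's an illustrative reconstruction rather than the kata code, and I've used sum instead of reduce for brevity:

```d
import std.algorithm : map, sum;
import std.array : array, replace, split;
import std.conv : to;
import std.stdio : writeln;
import std.string : format, indexOf;

int add(string input)
{
    string delim = ",";
    // Optional header "//<delim>\n" sets a custom delimiter, which may
    // be more than one character long.
    if (input.length >= 2 && input[0 .. 2] == "//")
    {
        auto nl = input.indexOf('\n');
        delim = input[2 .. nl];
        input = input[nl + 1 .. $];
    }

    // Normalize: every "\n" becomes the active delimiter, then split once.
    auto values = input.replace("\n", delim).split(delim).map!(to!int).array;

    foreach (v; values)
        if (v < 0)
            throw new Exception(format("negatives not allowed: %s", v));

    return values.sum;
}

void main()
{
    assert(add("1,2\n3") == 6);
    assert(add("//;\n1;2") == 3);
    writeln("ok");
}
```

Because everything is normalized to a single delimiter string before splitting, multi-character delimiters need no special handling.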
- Follow the link and implement all aspects of the kata.
- Try to write a version that doesn't use conditionals.
This was a fun kata. The goal is to take a string, divide it up into trigrams, and create a new string using those trigrams. Nothing special from an algorithm standpoint, as generating the new string just requires a random number generator and iterating over the trigrams, randomly picking the next word until you hit a stopping point. The real fun is just seeing the results.
- Update the code to work on larger sets of text. Keep punctuation in mind.
- May have to move beyond tri-grams to something larger.
This was another fun, but somewhat difficult, kata where, given two words, you must come up with the shortest chain between them. A lot of filtering was used to ensure we only looked at words that were the same size as the provided words and that we hadn't seen before.
The trick was not going too deep. It was very easy to create extremely long chains of words which, if we were looking for the longest chain, would be useful. However, since we were looking for the shortest, this was a waste of time and resources. The trick, since this is a recursive implementation, is to ensure that the difference between the current word and the target word is no more than the length of the word minus the depth. For example, if we're looking at the 3-letter words cat and dog, then the next word after cat must differ from dog by only two characters.
It took a while before I figured it out. However, once I did, performance improved considerably.
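The pruning test boils down to a letter-position distance between two equal-length words, which can be sketched as:

```d
import std.algorithm : count;
import std.range : zip;
import std.stdio : writeln;

// Number of positions where two equal-length words differ. A candidate
// at recursion depth d is only worth exploring if this distance to the
// target can still be closed in the remaining (length - d) steps.
size_t distance(string a, string b)
{
    return zip(a, b).count!(pair => pair[0] != pair[1]);
}

void main()
{
    assert(distance("cat", "dog") == 3);
    assert(distance("cat", "cot") == 1);
    // One step into a 3-letter chain toward "dog", a useful next word
    // should differ from "dog" in at most 2 positions:
    assert(distance("cot", "dog") == 2);
    writeln("ok");
}
```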
In this kata, we are to create some code that puts newlines into a string so that it wraps correctly at a specified number of columns. Originally, this was dirt simple because I simply went back to the first space before the column to wrap at and replaced it with a newline. However, this was unsatisfactory since there could be multiple spaces, so I had to rework the algorithm to handle that. The hardest part was making sure that I indexed the arrays properly, specifically the new start to the sub-string I was pulling from the input. Overall, not too bad since I only used two arrays: the input string and the char array in the function. Granted, I created a new string when I returned the string version of the output, but that's to be expected.
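The first, simple version (break at the last space before the column limit) can be sketched like this; the multi-space rework is omitted:

```d
import std.stdio : writeln;
import std.string : lastIndexOf;

// Insert newlines so no line exceeds `cols` characters, breaking at the
// last space before the limit when one exists.
string wrap(string text, size_t cols)
{
    string result;
    while (text.length > cols)
    {
        auto cut = text[0 .. cols + 1].lastIndexOf(' ');
        if (cut <= 0) cut = cols;              // no space found: hard break
        result ~= text[0 .. cut] ~ "\n";
        // skip the space we replaced with the newline, if there was one
        text = text[cut + (text[cut] == ' ' ? 1 : 0) .. $];
    }
    return result ~ text;
}

void main()
{
    assert(wrap("the quick brown fox", 10) == "the quick\nbrown fox");
    writeln("ok");
}
```

The tricky indexing mentioned above lives in that last slice: the start of the next sub-string depends on whether a space was consumed by the break.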
Create a data structure using a doubly linked list to have the most recently used element be the first element. Add code to periodically remove elements that are older than 10 seconds. Provide a method to retrieve the top 10 most recently used entries.
Rules:
- Do not use any built-in data structures.
- Provide a method to get all of the entries to be able to test that the expiration happened correctly.
- Consider the difference between a clever solution where expiration occurs during the insertion/update of a new value and one that uses a second thread to periodically remove expired data.
After the initial version is created, create another version using a different data structure to see if insertion speed can be improved. Provide benchmarks to show the speed improvements. Also, show benchmarks related to getting the top values or the first value.
Implement some basic encryption algorithms. Some potential algorithms:
- Caesar cipher
- Vigenère
- Autokey Cipher
- Basic substitution
- One time pad
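As a starting point, the simplest item on the list, a Caesar cipher, might look like this sketch:

```d
import std.ascii : isLower, isUpper;
import std.stdio : writeln;

// Shift each letter by a fixed key, wrapping around the alphabet, and
// leave everything else alone. Decryption is encryption with 26 - key.
string caesar(string text, int key)
{
    char[] result;
    foreach (c; text)
    {
        if (c.isUpper)      result ~= cast(char)('A' + (c - 'A' + key) % 26);
        else if (c.isLower) result ~= cast(char)('a' + (c - 'a' + key) % 26);
        else                result ~= c;
    }
    return result.idup;
}

void main()
{
    auto secret = caesar("Attack at dawn!", 3);
    assert(secret == "Dwwdfn dw gdzq!");
    assert(caesar(secret, 26 - 3) == "Attack at dawn!");
    writeln("ok");
}
```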