This week we went over binary search, complexity, Maps, and more Maps.
One of my favorite parts of this section is the infinite nature of Maps and Sets. They are endless, and we can taxonomize basically everything using a combination of them. This type of organization is in line with how my mind works. And I like things that work like my mind.
We start the week by going over binary search, whereby we guess the middle value of a sorted set of numbers, then move to the middle of whichever half must contain the answer, and so on. It is faster than a linear search, where we move down the values one by one.
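A minimal sketch of that guessing game in Java (the method and variable names are my own, and note that the array must already be sorted for this to work):

```java
// A minimal binary search sketch over a sorted int array.
// Returns the index of target, or -1 if it is absent.
public class BinarySearchDemo {
    public static int binarySearch(int[] data, int target) {
        int low = 0;
        int high = data.length - 1;
        while (low <= high) {
            int mid = (low + high) / 2;   // guess the middle value
            if (data[mid] == target) {
                return mid;
            } else if (data[mid] < target) {
                low = mid + 1;            // move into the upper half
            } else {
                high = mid - 1;           // move into the lower half
            }
        }
        return -1;                        // not found
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38};
        System.out.println(binarySearch(sorted, 16)); // prints 4
    }
}
```

Each guess throws away half of the remaining values, which is where the speedup over linear search comes from.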
We cover “complexity,” which has a different meaning in computer science land than in, say, relationships. Whereas in relationships “it’s complex” usually means things are not coming to fruition in any societally determined meaningful way, in computer science complexity implies that the code or algorithm does compute, and it is simply a matter of how much time and space (memory) is necessary for that. Generally, complexity is a trade-off. Instead of the pick-two-of-three of quality, time, and money, computer scientists are given a choose-one of time or space. Perhaps it is because quality is not a variable, as all code must execute.
As per societal constructs, time is the number one resource. There is, however, an issue in focusing on time: its optimization becomes paramount. How much is sacrificed to shave off smaller and smaller pieces of execution time? It is important to ask whether those savings are worth it. Don’t focus so much on optimizing a solution that you forget to look for a different one.
We discuss complexity classes, which is a little math-classy for my background. Then we move into the interesting problem of a maximum sum, whereby we move along an ArrayIntList, finding the largest sum of a range within it. The trick is to eliminate the inner loop by keeping a running sum. Basically, we keep track of each starting point and each stopping point, and are able to move through these totals in a way that preserves the max.
A further refinement is some way of disregarding certain sums, which keeps us from computing everything or spending time on ranges we know will not add up. We store the max sum for a given starting value alongside the starting value itself.
We are getting into some more mathy stuff.
And we are back into the original program. The first version is a pair of loops that generates each possible starting point and each possible stopping point, then adds up the numbers in each range and compares the sums.
Having the initial answer, how do we make it faster? We keep a running sum, which eliminates the need to redo the whole addition each time. Here we move the initialization of sum from the inner loop to the outer loop, so that the work we have already completed is not forgotten. Searching for a parallel in real life for this, I can’t say I can think of one. Maybe something about compound interest? Instead of going back to the original amount and re-adding every interest payment from scratch each time, we keep a running total and add each new payment to that.
Basically, we are seeing whether it is better to include previous values or to start fresh with the number we are on.
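The running-sum version described above might look something like this (a minimal sketch with made-up names, assuming a plain int array rather than an ArrayIntList, and at least one element):

```java
// Sketch of the max-sum idea: instead of re-adding a whole range for
// every (start, stop) pair, the inner loop extends a running sum.
public class MaxSumDemo {
    // For each start, sum is initialized once in the outer loop and
    // then extended as the stop point advances, so earlier work is kept.
    public static int maxSum(int[] data) {
        int max = data[0];
        for (int start = 0; start < data.length; start++) {
            int sum = 0;                       // initialized in the OUTER loop
            for (int stop = start; stop < data.length; stop++) {
                sum += data[stop];             // extend the running sum
                max = Math.max(max, sum);      // keep the best total seen
            }
        }
        return max;
    }

    public static void main(String[] args) {
        int[] data = {4, -5, 6, -2, 5};
        System.out.println(maxSum(data)); // prints 9 (from 6 + -2 + 5)
    }
}
```

Moving the initialization of sum out of the inner loop is exactly the compounding idea: each pass builds on the total already computed instead of starting over.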
We then move back into looking at which statements are executed most often, how that maps onto the algorithms that were brought up, and how those algorithms affect the time it takes to run the code.
Next we are into Maps, which was unbelievably convenient for me, as the second phase of the Ada Challenge happened this week and it involved maps and keys, otherwise known as Hashes in Ruby.
A map is a built-in collection class that keeps track of key/value pairs. A map has exactly one value for any given key, which is one of the things I find interesting about Java. We can make it more like real life by changing how a map is constructed: assigning sets as the map’s values. For some reason this rings very true with me, and it is one of the more interesting and memorable strategies I have learned so far.
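A small sketch of that strategy, with made-up names: the map still has one value per key, but that one value is a whole set, so a key can effectively be associated with many things.

```java
import java.util.*;

// Sketch of a map whose values are sets. The names here
// ("friends", "ada", etc.) are invented for illustration.
public class MapOfSetsDemo {
    public static Map<String, Set<String>> buildFriends() {
        Map<String, Set<String>> friends = new HashMap<>();
        friends.put("ada", new HashSet<>());       // one key, one set
        friends.get("ada").add("grace");           // grow the set, not the map
        friends.get("ada").add("katherine");
        return friends;
    }

    public static void main(String[] args) {
        // Still one value per key -- but that value holds two names.
        System.out.println(buildFriends().get("ada").size()); // prints 2
    }
}
```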
Maps will only remember the most recent association with a key. We look at a large file and build something that finds the unique values within it. We start by simply creating a set, and being able to look at the size of the set.
Next we add the set to a Map, which allows us to count the number of unique identifiers. Thinking over this, I would immediately try to create another field that numbers the entries, adding 1 for each line: like 1: unique identifier, 2: unique identifier.
But this is overly complicated for what the code needs. We can simply ask for the size, so it is not necessary to track each entry in a count. Additionally, if the size changed, or new values were placed in the middle, the entire count would need to be updated, which is again unnecessarily complicated.
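That point can be shown in a few lines (a sketch with an invented method name): the set already refuses duplicates, so its size IS the count of unique values, and no numbering field is needed.

```java
import java.util.*;

public class UniqueCountDemo {
    // Count unique words in some lines of input. The Set does the
    // deduplication for us, so no extra counter field is required.
    public static int countUnique(List<String> words) {
        Set<String> unique = new HashSet<>();
        unique.addAll(words);     // duplicates are silently ignored
        return unique.size();     // the size is the answer
    }
}
```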
One of the hardest things for me to grasp is the different vocabulary in computer science. It seems much more abstract than labeling things, as we have objects interacting with each other in a very 1D space, but you need to be able to conceptualize them in a way that is 3D. What is an interface, what is a constructor, what is an object, and how do they interact? We are basically creating scenes in our code that work a certain way within the code, but cannot be separated from the original context, or they mean something else entirely.
Going over map stuff we learn some methods like put and containsKey, how to put in new values and add to the count of old ones. We run for-each loops and look at the kinds of associations that are available for maps.
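Here is a sketch of that put/containsKey pattern, applied to the classic word-count task (method and variable names are my own):

```java
import java.util.*;

public class WordCountDemo {
    // put and containsKey in action: new keys go in with a count of 1,
    // and keys we have seen before get their old count bumped.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {            // for-each over the input
            if (!counts.containsKey(word)) {
                counts.put(word, 1);                      // first sighting
            } else {
                counts.put(word, counts.get(word) + 1);   // update old value
            }
        }
        return counts;
    }
}
```

Note that calling put on an existing key replaces the old value, which is exactly the “most recent association wins” behavior described above.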
Next we bring in the concept of a SortedMap, whereby keys are kept in sorted order.
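A tiny sketch of that, using TreeMap, Java’s usual SortedMap implementation (the keys and values here are made up):

```java
import java.util.*;

public class SortedMapDemo {
    // In a TreeMap, keys come back in sorted order
    // no matter what order they were inserted in.
    public static SortedMap<String, Integer> buildAges() {
        SortedMap<String, Integer> ages = new TreeMap<>();
        ages.put("zoe", 31);
        ages.put("ada", 28);
        ages.put("mia", 25);
        return ages;
    }

    public static void main(String[] args) {
        System.out.println(buildAges().firstKey()); // prints "ada"
    }
}
```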
Going over a sample program, we discuss how we want the keys in the map to be the thing we are looking for, then use that to return the relationships to the other elements in the Map. We speak in a more detailed manner about the objects within the Map, how they interact and what they contain. It is definitely a new way to think about storing data. We normally just take data storage for granted. If we have a folder that contains folders, it is the same as having a folder that contains files, from a high-level view. But when we are creating these systems, we need to think about folders and files in terms of multiplicity, which is a new concept for me. It isn’t 1 item to 1 item. It is “contains 5 items” to “contains 3 items,” or “contains 5 items” to “contains 1 file.” Next we bring in bidirectional relationships in this same context, and how to create a program that holds onto those relationships.
Finally, we cover string manipulation, a check back to CSE 142, fencepost problems specifically. We go back to Maps and cover the addAll method, which allows us to add all of one collection’s elements to another set, updating the Map to reflect new data that has been taken in, and how to keep track of old data and handle issues with repetition.
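Two tiny sketches of those last ideas, with made-up names: addAll merging newly taken-in data into old data (the set drops the repeats automatically), and a fencepost loop joining strings with commas so no extra comma dangles off the end.

```java
import java.util.*;

public class AddAllDemo {
    // addAll merges one collection's elements into another;
    // since the target is a Set, repeated values are dropped.
    public static Set<String> merge(Set<String> old, Set<String> incoming) {
        old.addAll(incoming);
        return old;
    }

    // Classic fencepost: handle the first element before the loop,
    // so separators go only BETWEEN elements, never after the last.
    // Assumes the list has at least one element.
    public static String join(List<String> items) {
        String result = items.get(0);
        for (int i = 1; i < items.size(); i++) {
            result += ", " + items.get(i);
        }
        return result;
    }
}
```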
This blog post was informed by reading and lecture notes from UW CSE 143, Autumn 2017.