Sometimes I get lost in the programming equivalent of throwing stuff at the wall to see what sticks. After some time with no progress in figuring out what sticks, it’s time to step away from that approach. It typically boils down to not being confident in what I am searching for, not quite understanding the examples I find, or both. I end up hunting for some kind of shortcut to just get my code working by copy-pasta-ing random bits of spaghetti and crossing my fingers. Stop. There’s a better way.

I’m currently working on the finance app for week 9 of CS50: Python in Flask with SQLite. The response from my database contains stock symbols, current values in USD, long company names, and random numbers of shares I “bought” while clicking through the UI. Great! But all that detail is not helping me when I focus in on a Python thing I can’t get working.

Reduce the problem to its simplest form

# I have this list
mylist = [
    {'thing': 'A', 'count': 4},
    {'thing': 'B', 'count': 2},
    {'thing': 'A', 'count': 6}]

# And I want to make this one where duplicates are merged
newlist = [
    {'thing': 'A', 'count': 10},
    {'thing': 'B', 'count': 2}]

Looking at this distilled version of what I am trying to do is very different from looking at the code in my application. I have limited Python chops, so let’s dig deeper on what is what.

Get crystal clear on the syntax

  • mylist = [] is a list 👀
  • {'thing': 'A', 'count': 10} is a dictionary 👀
  • 'thing' is a key and 'A' is a value

Dictionaries (or dicts for short) are a central data structure. Dicts store an arbitrary number of objects, each identified by a unique dictionary key.

Lists are a part of the core Python language. Despite their name, Python’s lists are implemented as dynamic arrays behind the scenes. This means a list allows elements to be added or removed, and the list will automatically adjust the backing store that holds these elements by allocating or releasing memory. Python lists can hold arbitrary elements — everything is an object in Python.
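
To make those two descriptions concrete, here is a small sketch of the basic operations. The `shopping` list and the extra `price` key are made up for illustration and are not part of the finance app.

```python
# A list keeps its elements in order and can grow or shrink
shopping = ['apples', 'pears']      # hypothetical example data
shopping.append('plums')            # add an element at the end
shopping.remove('pears')            # remove the first matching element

# A dictionary maps unique keys to values
item = {'thing': 'A', 'count': 4}
item['count'] = 10                  # change the value stored under an existing key
item['price'] = 2.5                 # add a new key (hypothetical)

# A list can hold dictionaries, which is the shape used in this post
mylist = [item, {'thing': 'B', 'count': 2}]
second_thing = mylist[1]['thing']   # index into the list first, then look up the key

print(shopping)
print(item)
print(second_thing)
```

The last lookup is the pattern that matters here: first pick a dictionary out of the list by its position, then read a value from it by its key.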

I could probably spend days studying this. Like, what is all this list comprehension I come across? A lot of the so-called elegant examples I find online are just not readable to me yet. Let’s crack open a terminal and run python3 to poke more at what I do understand.

Play around with isolated code in a terminal

>>> mylist = [{'thing': 'A', 'count': 4}, {'thing': 'B', 'count': 2}, {'thing': 'A', 'count': 6}]
>>> mylist[0]
{'thing': 'A', 'count': 4}

Seeing that makes me think that a previous spaghetti-on-the-wall attempt was to access keys in different dictionaries as if they were in the same dictionary. They are not. 🤔 But ok, my simplest loop is looping fine and checking for a specific value:

>>> for item in mylist:
...     if item['thing'] == 'A':
...         print(f" { item['count'] } comes with A ")
...     if item['thing'] == 'B':
...         print(f" { item['count'] } comes with B ")
...
 4 comes with A
 2 comes with B
 6 comes with A

So far so good. But can I add up the counts?

>>> for item in mylist:
...     if item['thing'] == 'A':
...         item['count'] += item['count']
...         print(f" { item['count'] } ")
...
 8
 12

Right. That will double each of the values individually, not add them to each other like I want. Which is also something I was messing up inside the app code, but here in isolation it is less confusing to me.
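
For comparison, a minimal sketch of adding the counts to each other instead: keep the running total in a separate variable outside the dictionaries. The name `total_a` is just an illustration.

```python
mylist = [
    {'thing': 'A', 'count': 4},
    {'thing': 'B', 'count': 2},
    {'thing': 'A', 'count': 6}]

# Keep the running total outside the dictionaries (hypothetical name)
total_a = 0
for item in mylist:
    if item['thing'] == 'A':
        # Add this item's count to the total, leaving the item itself untouched
        total_a += item['count']

print(total_a)  # 10
```

This only answers the question for one hard-coded thing, but it shows the difference between changing an item's own count and collecting counts somewhere else.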

Write up specific pseudo code

make an empty new list
loop through each item in my list
    if this is the first time the current item’s key value shows up
        append item to new list
    else (if new list already has an item with this key value)
        add item's count to the matching item's count in the new list

I found itemgetter in the operator module. It needs to be imported, which works in a terminal session the same way as in a file, and I can see in my actual app that this if-statement works nicely:

from operator import itemgetter

newlist = []
for item in mylist:
    if item['thing'] not in map(itemgetter('thing'), newlist):
        newlist.append(item)
    else:
        # this is where I get lost, how to reach out from
        # the current item in the loop to change a value
        # in a different dictionary than this one?!
        pass  # placeholder so the snippet runs
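
To see what that if-statement is doing, here is a small sketch that can be pasted into a python3 session, import included. The sample `newlist` and the variable names `get_thing`, `things`, `a_seen` and `c_seen` are only there for illustration.

```python
from operator import itemgetter

# A sample of what newlist might hold partway through the loop
newlist = [{'thing': 'A', 'count': 4}, {'thing': 'B', 'count': 2}]

# itemgetter('thing') builds a function that returns d['thing'] for a dict d
get_thing = itemgetter('thing')
first_thing = get_thing(newlist[0])

# map() applies that function to every dictionary in the list
things = list(map(get_thing, newlist))

# The membership test used in the if-statement
a_seen = 'A' in things
c_seen = 'C' in things

print(first_thing)
print(things)
print(a_seen, c_seen)
```

So the if-statement asks whether the current item's thing is missing from the things already collected in the new list.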

So… the else will handle the third time through the loop: the {'thing': 'A', 'count': 6} should not be added to the new list, it should only update the count of the A already in the new list. Can I do that if I get the index of the other dictionary with thing A? 🧐

Take naive steps in the right direction

>>> mylist = [{'thing': 'A', 'count': 4}, {'thing': 'B', 'count': 2}, {'thing': 'A', 'count': 6}]
>>> newlist = []
>>> for index, item in enumerate(mylist):
...     if index != 2:
...         newlist.append(item)
...     else:
...         newlist[0]['count'] += item['count']
...
>>> newlist
[{'thing': 'A', 'count': 10}, {'thing': 'B', 'count': 2}]

This builds up the new list I want, but with some very silly hacky workarounds, so let’s try to lose those.

The if index != 2 statement can be solved with the itemgetter I used previously. So omg the only hack left to shed is where I am using newlist[0]['count']. How to update that count value without knowing the index of that other dictionary in the new list? Can I loop through the new list…?!

from operator import itemgetter

# This is the database response
response = [
    {'thing': 'A', 'count': 4},
    {'thing': 'B', 'count': 2},
    {'thing': 'A', 'count': 6}]
newlist = []
for item in response:
    # Check that an item does NOT already exist in newlist
    # with the same value as this item's 'thing'
    if item['thing'] not in map(itemgetter('thing'), newlist):
        # Append a copy so the database response itself stays unchanged
        newlist.append(item.copy())
    else:
        # Loop through the new list to find the correct count to change
        for x in newlist:
            if x['thing'] == item['thing']:
                x['count'] += item['count']

Yup. This looks awful — and is error prone and slow and probably wins some kind of award for being unPythonic — but it works. And I wrote it and I can read it. 💪 We can improve tomorrow.

# I now have a list of dictionaries where duplicates are merged
newlist = [
    {'thing': 'A', 'count': 10},
    {'thing': 'B', 'count': 2}]
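
As for improving tomorrow, one possible direction, offered as a sketch rather than the final word: collect the counts in a dictionary keyed by thing, then turn that back into a list. The name `totals` is made up, and the sketch assumes each item only needs its 'thing' and 'count' keys, since any other keys would be dropped. It also relies on dictionaries keeping insertion order, which they do from Python 3.7 onward.

```python
response = [
    {'thing': 'A', 'count': 4},
    {'thing': 'B', 'count': 2},
    {'thing': 'A', 'count': 6}]

# Map each thing to its running total (hypothetical name)
totals = {}
for item in response:
    # get() returns 0 the first time a thing shows up
    totals[item['thing']] = totals.get(item['thing'], 0) + item['count']

# Turn the totals back into a list of dictionaries
newlist = []
for thing, count in totals.items():
    newlist.append({'thing': thing, 'count': count})

print(newlist)
```

Looking a thing up in a dictionary avoids scanning the new list for every item, and the original response is left untouched.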