I wanted to continue my analysis of my flickr photostream by comparing last week’s recently created dataset to an older set of photos. Since there was no way (that I could find) of “walking” backwards through a photoset with the python flickrapi module, I thought I would “walk” through my entire photostream of 20,000 photos while pushing necessary information into a list as I went, then reverse-feed that list into the Clarifai API for analysis.
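A minimal sketch of that collect-then-reverse idea, not the original script. It assumes each walked photo record exposes `id`, `server` and `secret` through a `.get()` method (plain dicts stand in here for whatever the walk yields), and that image URLs follow flickr's static URL pattern. The names `photo_url`, `collect_oldest_first` and `sample_walk` are made up for illustration.

```python
from typing import Any, Iterable, List


def photo_url(photo: Any) -> str:
    """Build a static image URL from a photo record.

    Assumption: the record has .get() and carries 'id', 'server', 'secret'.
    """
    return "https://live.staticflickr.com/{}/{}_{}.jpg".format(
        photo.get("server"), photo.get("id"), photo.get("secret")
    )


def collect_oldest_first(photos: Iterable[Any]) -> List[str]:
    """Walk newest-to-oldest, keep only the URL, then flip the order."""
    urls = [photo_url(p) for p in photos]  # push what we need as we go
    urls.reverse()                         # oldest photo now comes first
    return urls


# Hypothetical stand-in for a photostream walk, newest photo first.
sample_walk = [
    {"id": "3", "server": "65535", "secret": "ccc"},
    {"id": "2", "server": "65535", "secret": "bbb"},
    {"id": "1", "server": "65535", "secret": "aaa"},
]

oldest_first = collect_oldest_first(sample_walk)
```

Keeping only a URL per photo keeps the list small even for a stream of 20,000 photos, and the reversed list can then be handed to the image-analysis step in batches.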
After a few false starts (in retrospect, this was probably flickr warning me off), I was able to walk through about 11,000 photos before flickr shut me down for exceeding their rate limit. Now that I know flickr has a rate limit, I suppose the ideal way to get this job done would be to pay to lift Clarifai’s operations limit, and run my entire photostream through its API in one go. The one downside is that this would take hours, since simply looping through half my photostream with flickr’s API alone took almost two.
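One way to stay under a rate limit is to pace the loop. The sketch below is a generic throttle, not something flickrapi provides: `throttled` and `min_interval` are hypothetical names, and the right interval depends on whatever limit the service actually enforces. A walk may fetch photos a page at a time, so slowing the loop only spaces out the underlying requests indirectly.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    min_interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield items no faster than one per `min_interval` seconds."""
    last: Optional[float] = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)  # pause until the interval has elapsed
        last = clock()
        yield item


# Usage: wrap any iterable; an interval of 0 means no pausing at all.
paced = list(throttled(["a", "b", "c"], min_interval=0.0))
```

The `sleep` and `clock` parameters exist only so the pacing can be checked without actually waiting.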
The other obstacle is that I’m an inveterate miser with zero present income, and so am still waiting to hear back from Clarifai on a student “collaboration” discount. A girl can dream.
In the meantime, I crudely copied the last four thousand URLs that my aborted python script had printed and threw them into Clarifai. This second “dataset” covers a period running backwards from October 2015 to July 2013. I then loaded the results into the same d3 sketch as last week.
One thing to note is that for these new visualizations, I gathered more “concepts” per photo than last week, in hopes that more interesting/abstract concepts—like “togetherness”, which may inherently inspire less confidence than an objective prediction like “people”—might emerge. And they did:
Top 10 concepts in 2017: 1) people, 2) outdoors, 3) no person, 4) adult, 5) travel, 6) woman, 7) indoors, 8) man, 9) portrait, 10) nature
At least for the more recent set of photos. One thing that’s immediately obvious: I photographed way fewer “concepts” in 2014 than I did in 2017. I also took fewer photos.
Another striking observation is that “no person” is by far the most common concept of the 2014 photoset, while “people” (literally the opposite) is the most common for 2017. Looking at the top 10 concepts, one could definitely speculate that I had more company in 2017 than I did in 2014.
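For reference, a top-10 tally like the ones compared here can be sketched with a counter. This assumes each photo's predictions have already been reduced to `(concept, confidence)` pairs; the sample values are made up, and `top_concepts` and `min_confidence` are illustrative names rather than anything from the Clarifai API.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Prediction = Tuple[str, float]  # (concept name, confidence between 0 and 1)


def top_concepts(
    photos: Iterable[Iterable[Prediction]],
    n: int = 10,
    min_confidence: float = 0.0,
) -> List[Tuple[str, int]]:
    """Count how many photos mention each concept and return the n most common."""
    counts: Counter = Counter()
    for concepts in photos:
        for name, confidence in concepts:
            if confidence >= min_confidence:
                counts[name] += 1
    return counts.most_common(n)


# Made-up predictions for four photos.
sample = [
    [("people", 0.98), ("adult", 0.95), ("togetherness", 0.71)],
    [("people", 0.97), ("outdoors", 0.90)],
    [("no person", 0.99), ("outdoors", 0.93)],
    [("people", 0.90)],
]

top_two = top_concepts(sample, n=2)
confident_only = top_concepts(sample, n=1, min_confidence=0.95)
```

Lowering `min_confidence`, or asking for more concepts per photo, is what lets lower-confidence abstract concepts like “togetherness” show up in the counts at all.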
While this visualization does its job as an abstract overview of the data, I wanted the photos themselves to tell the story. So, on click, I had the page spit out the photos that were trapped inside their respective bars.
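Showing the photos behind a bar needs a lookup from each concept to the photos tagged with it. The sketch below builds that lookup in Python as a preprocessing step, which is an assumption: the original page may well have done this in the d3 code itself. `photos_by_concept` and the sample URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def photos_by_concept(
    tagged: Iterable[Tuple[str, List[str]]]
) -> Dict[str, List[str]]:
    """Map each concept name to the URLs of the photos tagged with it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for url, concepts in tagged:
        for name in concepts:
            index[name].append(url)
    return dict(index)


# Made-up (url, concepts) pairs.
sample_tagged = [
    ("photo_1.jpg", ["people", "outdoors"]),
    ("photo_2.jpg", ["no person", "outdoors"]),
    ("photo_3.jpg", ["people"]),
]

by_concept = photos_by_concept(sample_tagged)
```

Serialized to JSON, a mapping like this gives a click handler everything it needs: look up the clicked bar's concept and render the listed photos.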
Comparing the “people” category for both photosets, I clearly saw fewer unique people over a longer period of time (i.e., judging by how many photos fit in the window) in 2014, while in 2017, I saw more unique individuals over a shorter period of time, even though half the photos displayed were repeats.
Also notable was that the 2014 sample seemed to be entirely processed in Instagram, which may be coincidence; I probably just happened to have chosen a period where I backed up all my Instagram files at once? Will have to look into that one, but it’s amazing to me that I bothered to process so many mundane photos through Instagram, though they would never be posted publicly. Perhaps I truly thought a filtered reality looked better, or maybe I was just constantly looking for an opportunity for public validation.
So what’s with the discrepancy? It will surprise no one to learn that I was extremely depressed from 2013-2014, a period which overlaps with the second photoset in this experiment. This analysis truly corroborates the idea that mental health is a social issue.
For my next steps, I’d like to train a custom model to identify selfies, which I believe are a strong marker (at least, personally) of depression. I’d also like to incorporate Clarifai’s color model into the workflow, run my Instagram history through it, and display it as a time series visualization. I’m absolutely certain this will be able to map my history of depression with excruciating accuracy.