My penchant for patience is not well-known, because I don’t have it.
I began my pilot study today. I finally used the product key I had received at the #SMSociety conference last month, and I felt prepared for the process. I had watched videos and tutorials on using the software, I had reread some research on social network analysis, and the Twitter and Society book had most of my attention last week. I read it at the nail salon as well as on the treadmill. Motivation has been an issue for me ever since I returned from a getaway at the end of July, and today was dedicated to making pilot study inroads.
Right. I have a specific hashtag in mind. I need to pull data from something similar to the research I’ll be doing, but not quite the same. I’m looking for patterns, betweenness centrality, and influencers. I’m also interested in sentiment analysis.
I used NodeXL to import from a Twitter search network today and…waited.
But before I waited, I had the unpleasant task of troubleshooting. My Excel Office 2013 does not like NodeXL.
Me at 14:00
I can’t save the spreadsheet. I’m not sure I’ve fixed this problem yet, but I added the NodeXL folder to Excel’s list of trusted locations – add that to the list of things I’ve learned this week. I’ve restarted my computer, shut down my computer, and I’m still running into the same buffering issue. I’m going to go ahead and pull the data for the hashtag and set the parameters and save later.
According to the NodeXL Pro tutorial, I should have limited my data set to 1000 tweets. Reader, I did not follow this instruction. I inadvertently left the maximum at 18 000 tweets.
I waited for one and a half hours for my data set to be pulled from Twitter’s API. It would pause for 15 minutes because it had reached Twitter’s limits for data extraction. This happened 7 times.
I’ve had the instructions open for each step and continue to refer to them. Colouring all the vertices and getting the graph to look the way I want is not finished yet. I am experiencing a high level of frustration as I sit, click, and wait for something to happen, and refresh. I cannot focus on reading something while the waiting takes place, so I have resorted to watching how to splice yarn on YouTube.
I’ve also taken several screenshots of the finished data pull, but only after I prepared the data, grouped by cluster, and calculated metrics. I also decided to do a time series analysis, even though it is not important for my pilot study. I’m glad I did, as I realized Twitter gave me data from August and July, with a smattering of content all the way back to March. This will be interesting to examine in detail later.
NodeXL also does sentiment analysis of text, which is important for my needs. It already provides a selection of words in two groups; group 3 is left empty for personalising, so I added 65 terms to it. I don’t expect all of the words to appear in this data set, but these are the words I plan to use for my research, and they will be good to compare against later.
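For anyone curious about what word-list sentiment counting looks like outside of a spreadsheet, here is a rough Python sketch. It is not NodeXL's code, and I am not claiming it matches how NodeXL works internally. The group names mirror the three groups described above, but the terms and the sample sentence are made up for illustration, and `count_group_words` is a hypothetical helper.

```python
import re
from collections import Counter

# Hypothetical word lists standing in for the two built-in groups and the
# custom third group. None of these terms come from the actual study.
WORD_GROUPS = {
    "group 1": {"good", "great", "love"},
    "group 2": {"bad", "awful", "hate"},
    "group 3": {"thesis", "pilot", "data"},
}


def count_group_words(text, groups=WORD_GROUPS):
    """Count how many words in `text` appear in each word group."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for name, vocabulary in groups.items():
        counts[name] = sum(1 for word in words if word in vocabulary)
    return counts


# Invented sample text, not a tweet from the data pull.
sample = "Love the pilot data, hate the waiting."
print(dict(count_group_words(sample)))
```

Counting per group, rather than producing a single score, keeps the custom third group separate so it can be compared against later data sets.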
Anne from OISE’s Write-In is checking in with me via email, as I’m participating in the Write-In remotely today. I reply quickly to tell her what I’ve done and how it matches my goals for today (it doesn’t). I appreciate this program OISE has for doctoral students. No one in my house cares if I’ve set any goals for my work today; they are interested in filing complaints in person that SOMEONE ate all the bacon. Suggestions to fry some more are met with blank stares and stomping up the stairs.
I want a cheat sheet “top ten” list, so I’m asking NodeXL to find me the top words and word pairs, URLs, domains, and hashtags. More waiting and refreshing. I should stop clicking on the graph to see the changes, as it is slowing down my process. However, it’s interesting to see how communication takes place. Even in my small data pull, I can see some examples of the six types of Twitter networks. I used some of my time to colour and draw out Network Metrics Figure 3 from the Pew Research Report (Smith et al., 2014).
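The “top ten” idea for words and word pairs can be sketched in a few lines of Python. This is only an illustration of the counting, not NodeXL's method. `top_words_and_pairs` is a hypothetical helper, the sample tweets are invented, and the tokenizer is deliberately crude (it keeps hashtags and mentions together but would chop URLs into pieces).

```python
import re
from collections import Counter


def top_words_and_pairs(tweets, n=10):
    """Return the n most common words and adjacent word pairs."""
    word_counts = Counter()
    pair_counts = Counter()
    for tweet in tweets:
        # Keep a leading # or @ so hashtags and mentions stay intact.
        words = re.findall(r"[#@]?\w+", tweet.lower())
        word_counts.update(words)
        # A word pair is two words that sit next to each other in a tweet.
        pair_counts.update(zip(words, words[1:]))
    return word_counts.most_common(n), pair_counts.most_common(n)


# Invented sample tweets, not from the data pull.
tweets = [
    "Starting my data pull with #NodeXL",
    "Still waiting on my data pull",
    "The data pull finally finished #NodeXL",
]
top_words, top_pairs = top_words_and_pairs(tweets, n=3)
print(top_words)
print(top_pairs)
```

Pairs are counted within a single tweet only, so the last word of one tweet is never paired with the first word of the next.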
At 20:08 EDT, the spreadsheet finally saved.
As a result of my pilot project and all the waiting, I ingested 5 chocolate-covered marzipans I had no intention of eating today.
A shout out to the NodeXL Project: they DM-ed me during this process with an offer to help because I had tweeted that I was doing a data pull. Accepting help of any kind is anathema to me. That’s something I’ll have to adjust in my personal list of trusted locations.
Jestem skończona (“I’m finished” in Polish), as my kids like to say. Diving into the results will wait for another day.
Smith, M. A., Rainie, L., Shneiderman, B., & Himelboim, I. (2014). Mapping Twitter topic networks: From polarized crowds to community clusters. Pew Research Center, 20, 1-56. Retrieved from https://www.pewinternet.org/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/
Smith, M., Milic-Frayling, N., Shneiderman, B., Mendes Rodrigues, E., Leskovec, J., & Dunne, C. (2010). NodeXL: a free and open network overview, discovery and exploration add-in for Excel 2007/2010. From the Social Media Research Foundation, https://www.smrfoundation.org
Weller, K., Bruns, A., Burgess, J., Mahrt, M., & Puschmann, C. (2014). Twitter and society (Vol. 89). Peter Lang.