{"id":1847,"date":"2013-10-17T02:38:27","date_gmt":"2013-10-17T02:38:27","guid":{"rendered":"https:\/\/notebooks.dataone.org\/?p=1847"},"modified":"2013-10-22T14:03:11","modified_gmt":"2013-10-22T14:03:11","slug":"openscience-sentiment-analysis-via-twitter-data","status":"publish","type":"post","link":"https:\/\/notebooks.dataone.org\/data-science\/openscience-sentiment-analysis-via-twitter-data\/","title":{"rendered":"#OpenScience Sentiment Analysis via Twitter Data"},"content":{"rendered":"
In an earlier post I mentioned that I would like to look at positive\u00a0sentiments such as “I like @figshare” or “I prefer @figshare” or “I use @figshare” across twitter.<\/p>\n
A quick Web search on Google for “archive of past tweets” (without quotes) brought my attention to this September\u00a04, 2013\u00a0article on Mashable:<\/p>\n
“You Can Now Search for Any Tweet in History<\/a>”<\/p>\n I think it would then make sense to map\u00a0the sentiments via a crosswalk to the existing survey responses on use of @Figshare by early data management adopters.<\/p>\n It is also possible that there are some favorable or positive features that were not captured in the survey responses.<\/p>\n Navigated to the Topsy site – http:\/\/topsy.com\/<\/a><\/p>\n There is a free trial of Topsy Pro Analytics.\u00a0 I might use that after I get an idea of how Topsy’s basic features work.\u00a0 I understand there are more advanced analytical abilities for “Pro” users.<\/p>\n I searched for “@figshare” under the “influencers”\u00a0search option.<\/p>\n <\/a><\/p>\n Similar to the recommendations twitter sent to my personal e-mail when I initially followed @figshare, \u00a0I see @datadryad and @markhahnel.<\/p>\n The other top users\u00a0are unfamiliar to me, but it is important to bear in mind the “free” version of topsy (probably) has analytics for the past 30 days.<\/p>\n I’ll now try a tweet version of the search, directing my browser to http:\/\/topsy.com\/tweets<\/a><\/p>\n I’ve typed in “I like @figshare” (without quotes)<\/p>\n It appears true that the data is limited – in fact, it appears to be limited to “Past 25 Days.”\u00a0 I’m preserving a screen capture of that.<\/p>\n The first three results say explicitly (as in, they have the exact order of terms) “I like @figshare.”\u00a0 However, after the first three, some do not have that order, although they do have the word “like” in them.<\/p>\n I’m given the option to view “I like @figshare” results on “Topsy Analytics.”<\/p>\n http:\/\/topsy.com\/analytics?q1=I%20like%20%40figshare&via=Topsy<\/a><\/p>\n This gives me a view of “tweets per day” for the period September 11 – October 11.\u00a0 I am taking a screen capture of that. Note that the data output is copyright 2013 Topsy Labs, Inc. 
I am hopeful that my research constitutes “fair use.”<\/p>\n <\/a><\/p>\n The maximum number of tweets per day, presumably for any tweet that contains both “like” and “@figshare” and also “I,” is 2.<\/p>\n Topsy promises me I can “get the full picture with trends, geo, sentiment, and more” if I upgrade.<\/p>\n For fun, I removed “I Like” and just left @figshare in the search.<\/p>\n I’m impressed – there are 1,327 replies to “@figshare” for the same period.\u00a0 I’m also saving that screen output as figshare-only.png.<\/p>\n <\/a><\/p>\n For fun I added “@dataone” – which has 52 tweets during the same time period.<\/p>\n Now I’d like to add the “#openscience” hashtag.<\/p>\n This greatly impresses me.\u00a0 There are 3,173 tweets with the hashtag #openscience during the same time period.<\/p>\n I’m interested that there were over 125 tweets concerning @figshare from September 15 to the 18th.\u00a0 I had speculated in an\u00a0earlier post\u00a0that this “spike” might be related to the RDA conference in Lisbon, Portugal, during that same time period.<\/p>\n The #openscience hashtag seems to be consistently higher than the @figshare mention – but also experiences a spike – over 350 uses – during the 09\/15 to 09\/18 time period. I’m saving this screen capture as figshare-openscience-compare.png.<\/p>\n <\/a><\/p>\n I’m now removing the “#openscience” hashtag to just look at @figshare. I’m curious about the spike. 
Unfortunately there’s no way to look at the actual tweets using the basic, non-paid version of topsy.<\/p>\n I just clicked “Advanced search.” It appears that there are some options with “Operators” – not quite Boolean.<\/p>\n Importantly, there is the option to search for an exact phrase, by using the familiar operator of quotation marks surrounding the phrase.<\/p>\n As I already demonstrated, it is possible to query using a hashtag.<\/p>\n OR is another operator that might be useful.<\/p>\n And there is the option to use “site”<\/p>\n This might be useful for looking at references to the actual domain, figshare.com, or perhaps a “short” URL version.<\/p>\n For example, one of the more recent @figshare tweets concerns a presentation made by a researcher in environmental science.<\/p>\n The link is below:<\/p>\n http:\/\/figshare.com\/articles\/Coexistence_the_maintenance_of_biodiversity_and_its_consequences_for_ecosystem_functioning\/805172<\/a><\/p>\n Note: figshare suggests “cites coming soon.”<\/p>\n Consistent with what appears to be figshare’s social networking priorities – I’m given the options to “share” via facebook, “tweet” (obviously via twitter), or “+1” on Google+.\u00a0 I also have the option to embed.<\/p>\n What I’m interested in here are the short links.<\/p>\n The most obvious short link is the DOI link I am given – which sadly I cannot cut and paste with much ease – here’s how it turned out (no modification to the formatting – just a direct cut and paste from the figshare site using firefox to my DataONE open notebook browser editing pane):<\/p>\n So let me just type it out:<\/p>\n http:\/\/dx.doi.org\/10.6084\/m9.figshare.805172<\/a><\/p>\n I’ve confirmed this link works and that I’ve typed it correctly.<\/p>\n I now clicked “share” – note that I’m already logged into my personal facebook. 
The link that has been given to me is: https:\/\/www.facebook.com\/sharer\/sharer.php?u=http%3A%2F%2Fshar.es%2FE7XMb&t=Coexistence%2C+the+maintenance+of+biodiversity+and+its+consequences+for+ecosystem+functioning<\/a>#<\/p>\n There is also a QR code posted as my image thumbnail.<\/p>\n I’ve saved a screen capture as figshare-fb-share.png.<\/p>\n <\/a><\/p>\n When I shared it, it now has 1 share – but the link was unchanged.<\/p>\n I just deleted it from my timeline. Does not appear to impact the total shares – but anytime I re-load the page, the view count of the item increases.<\/p>\n Now I am not logged in to twitter – but when I click “tweet” I now have this:<\/p>\n Coexistence, the maintenance of biodiversity and its consequences for ecosystem functioning http:\/\/shar.es\/E73PL<\/a> via @figshare<\/p>\n So the key here for me is that the URL is indeed shortened – and it’s notable that “via @figshare” is appended. I’m not signing in to complete the tweet, but I might as well take a screen capture.\u00a0 I’m saving the screen capture as figshare-tweet-share.png.<\/p>\n <\/a><\/p>\n Next one to look at is “google plus.”\u00a0 I’m not logged in to my Google account at the moment. To share via Google Plus requires a login – I’m not just publicly “plus one-ing” the item.<\/p>\n I’m now logged in – and it seems to have automatically completed the +1. Doing this publicly recommended the item to my circles “Public” and “Friends.” – I’m told “I publicly recommended this as Tanner Jessel.” I have the option to add a comment.<\/p>\n The link that is shared via Google Plus is apparently the DOI version – http:\/\/dx.doi.org\/10.6084\/m9.figshare.805172<\/a><\/p>\n The title and the QR code are visible. 
It’s more difficult to take a screen capture because you must “hover” the mouse over the g+ icon, but I saved a screen capture as figshare-gplus-share.png.<\/p>\n <\/a><\/p>\n I’ll now look at “embed.” In very light print, I just noticed the following text:<\/p>\n *The embed functionality can only be used for non commercial purposes. In order to maintain its sustainability, all mass use of content by commercial or not for profit companies must be done in agreement with figshare.<\/strong><\/p><\/blockquote>\n This might be an interesting divergence in use from other hosting \/ sharing platforms for things like presentations – slideshare for example, compared to figshare.<\/p>\n Note that slideshare has an affiliation with linkedin, see article “Linkedin Acquires Professional Content Sharing Platform for $119 M<\/a>.” Figshare is affiliated with a publishing company, Digital Science<\/a>.<\/p>\n I’ve clicked on the “embed” icon and now have a screen offering customization and the code to pull out to embed.\u00a0 Might as well copy it here and paste it.\u00a0 I’ve saved a screen capture as figshare-embed-share.png.<\/p>\n While I’m observing this – I also want to point out that there are options to export to Ref. Manager, Endnote, and Mendeley. 
I think it’s worth revisiting that at a later point.<\/p>\n However: the point of the preceding foray into how sharing via figshare might\u00a0 affect the URL should not be lost, particularly concerning how it is related to twitter data:<\/p>\n Sharing natively via the “tweet” feature will produce and publish to the twitterverse\u00a0a short URL that begins with “shar.es” – NOT a custom “figshare” specific domain (such as “goo.gl” – Google’s “vanity” web shortener).<\/p>\n The shar.es function DOES append “via @figshare” and unless the user deletes the “credit” (which seems unlikely if the person sharing is a data sharing early adopter), then this should provide a reasonable way of tracking shares of figshare content via twitter – and hopefully insights into what exactly is being shared when considered en masse<\/em>.<\/p>\n It is also important for me to know the various ways that a URL referencing figshare might show up in the literature search, which is the second part of my analysis that will also require a more methodical approach to be a true “meta analysis of early adopter implementation of figshare” or something in that vein.<\/p>\n I will continue that exploration in a subsequent post.<\/p>\n For the moment – I’ll return to using the quotation marks to modify my initial search – I like @figshare becomes “I like @figshare”<\/p>\n I also realized I can search for “all time”<\/p>\n Perhaps because it is limited as a free service – I get 10 results with no total count of the data.\u00a0 There is also no export feature.<\/p>\n However there are 10 results per page, and 20 pages, for an expected 20 tweets in all time with the exact phrase “I like @figshare.”<\/p>\n It’s important to consider that some of these are RT (retweets), expressing the same information, however, here’s an example from twitter user “Jaime Headden @jaimeheadden<\/small><\/a>”<\/p>\n @rmounce<\/a> @figshare<\/a> If all I needed was a repository to share tables, notes, figures, raw data, 
then it works. But it’s not the paper itself.<\/p><\/blockquote>\n This is difficult to pull apart – I just hovered my mouse over the “retweet” option to get the tweet number for the original tweet – and since I have no physical notebook at my desk I just wrote it on a receipt!<\/p>\n 250984507371565057<\/p>\n Should be the number.<\/p>\n I want to embed the tweet or link to the status.<\/p>\n Instructions here:<\/p>\n https:\/\/dev.twitter.com\/docs\/embedded-tweets<\/a><\/p>\n Apparently wordpress will do this for me – I just need to paste the tweet URL. Did I transcribe the tweet’s number correctly? Yes.\u00a0 Confirmed the link works.<\/p>\n Ok maybe pulling out the status is not that hard:<\/p>\n https:\/\/twitter.com\/ethanwhite\/status\/235552921775923200<\/a><\/p>\n Getting the hang of pulling out individual statuses…<\/p>\n https:\/\/twitter.com\/PhilippKwon\/status\/235506844137844737<\/a><\/p>\n Here’s another potential search – but only a few potential comments:<\/p>\n “I love @figshare” – 4 comments from all time with that exact phrase.<\/p>\n “@figshare is good” – no tweets found<\/p>\n “@figshare is great” – 1 tweet found for all time.<\/p>\n “@figshare is ideal” – no tweets found<\/p>\n “@figshare is perfect”<\/p>\n Topsy might be tired of me searching these phrases – got hung up.<\/p>\n Might be worth coming up with some potential affirmative phrases and then systematically going through and executing them – I’ve done this before as sort of the “poor-man’s-hack” by kind of scripting a URL search – possibly using Topsy’s syntax:<\/p>\n http:\/\/topsy.com\/s?q=%22%40figshare%20is%20perfect%22&type=tweet<\/a><\/p>\n I just did this – http:\/\/topsy.com\/s?q=%22via%20%40figshare%22&window=a&type=tweet<\/a><\/p>\n That’s the “via @figshare” search.<\/p>\n For all time, there are definitely 100 results.<\/p>\n After clicking “Next 10 pages” once, definitely 200 – and probably pointless to look for any more, yet we can “substitute” offset=20 for something very high to see what 
happens:<\/p>\n http:\/\/topsy.com\/s?q=%22via%20%40figshare%22&window=a&type=tweet&offset=20<\/a><\/p>\n http:\/\/topsy.com\/s?q=%22via%20%40figshare%22&window=a&type=tweet&offset=90<\/a><\/p>\n Still going strong at <http:\/\/topsy.com\/s?q=%22via%20%40figshare%22&window=a&type=tweet&offset=200<\/a>> – and the data only goes back 5 months.<\/p>\n So, with access to the full dataset, sentiment analysis or even simple citation tracking quickly becomes a true “Data Science” topic.\u00a0 It will require some methodical looking around, compared to the poking around I’ve done so far.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":" In an earlier post I mentioned that I would like to look at positive\u00a0sentiments such as “I like @figshare” or “I prefer @figshare” or “I use @figshare” across twitter. A quick Web search on Google for “archive of past tweets” (without quotes) brought my attention to this September\u00a04, 2013\u00a0article on Mashable: Continue reading #OpenScience Sentiment Analysis via Twitter Data<\/span>
\nhttp:\/\/dx.doi.org\/10.6084\/m9.figshare.805172<\/a><\/p>\n
\n
\nhttps:\/\/twitter.com\/jaimeheadden\/statuses\/250984507371565057<\/pre>\n
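The “poor-man’s-hack” of scripting a URL search, mentioned in the post, could look something like the Python sketch below. It is only a sketch under assumptions: it takes the `q`, `window`, `type` and `offset` parameters exactly as they appear in the Topsy URLs quoted in the post and assumes they can be combined freely. The function name `topsy_url` and the `PHRASES` list are made up for illustration (the phrases themselves are the ones tried in the post). It only builds the URLs and does not fetch anything.

```python
from urllib.parse import quote

# Search endpoint as it appears in the URLs quoted in the post.
BASE = "http://topsy.com/s"

# Affirmative phrases tried by hand in the post; extend as needed.
PHRASES = [
    "I like @figshare",
    "I love @figshare",
    "@figshare is good",
    "@figshare is great",
    "@figshare is ideal",
    "@figshare is perfect",
]


def topsy_url(phrase, window=None, offset=None):
    """Build an exact-phrase tweet search URL in the style used in the post.

    `window` and `offset` are passed through unchanged; their meaning
    (e.g. window="a" for "all time") is inferred from the post's URLs.
    """
    # Wrap the phrase in double quotes for exact matching, then
    # percent-encode everything (safe="" also encodes '"' and '@').
    params = ["q=" + quote('"' + phrase + '"', safe="")]
    if window is not None:
        params.append("window=" + window)
    params.append("type=tweet")
    if offset is not None:
        params.append("offset=" + str(offset))
    return BASE + "?" + "&".join(params)


if __name__ == "__main__":
    # One "all time" search URL per candidate phrase.
    for phrase in PHRASES:
        print(topsy_url(phrase, window="a"))
    # Paging through the "via @figshare" results by offset.
    for offset in (20, 90, 200):
        print(topsy_url("via @figshare", window="a", offset=offset))
```

Each generated URL would still need to be opened and the results counted by hand (or by a separate scraper), since the free version offers no export feature.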