{"id":3602,"date":"2019-07-19T19:55:13","date_gmt":"2019-07-19T19:55:13","guid":{"rendered":"https:\/\/notebooks.dataone.org\/?p=3602"},"modified":"2019-07-19T19:55:13","modified_gmt":"2019-07-19T19:55:13","slug":"week-9-network-visualization-insights-final-summary-report","status":"publish","type":"post","link":"https:\/\/notebooks.dataone.org\/citation-dataone\/week-9-network-visualization-insights-final-summary-report\/","title":{"rendered":"Week 9: Network Visualization Insights & Final Summary Report"},"content":{"rendered":"\n
Hello!<\/p>\n\n\n\n
This week I had the chance to explore Gephi a bit more, derive some insights, look at referrer traffic from Google Analytics, start writing my final summary report, and scrape data from the web.<\/p>\n\n\n\n
Additional Gephi Visualizations and Insights<\/strong><\/p>\n\n\n\n I added some attributes to the network map, one of which was Publication Title Subject Area. This allowed me to look at the types of journals in which both DataONE articles and their citing articles are published. The visual below is the full network color-coded by publication title subject area, sized by impact score 2, and grouped\/connected by shared citing articles. There are two main clusters, indicating two main domains of knowledge and the dual nature of the DataONE articles being published. The majority are published in Life Sciences & Biomedicine (pink) and Technology (green) journals. Blue represents the Social Sciences, which are scattered throughout the network, indicating some crossover into other research areas. Brown represents multidisciplinary sciences journals, but the articles tend to be focused on Life Sciences & Biomedicine. Articles from Physical Sciences (orange) and Arts & Humanities (dark pink) journals are scarce and don\u2019t seem to contribute significantly to the clustering or the main network. <\/p>\n\n\n\n Within the set of cited articles there are clearly 5-10 that have a larger impact and a dominant contribution in the field. These articles could serve as gatekeepers to other DataONE articles since they share the most citations with other articles (i.e., they are the nodes with the most connections). The articles with the highest impact score are listed below:<\/p>\n\n\n\n Satellite Clusters<\/strong><\/p>\n\n\n\n There are several \u201csatellite clusters\u201d that are not connected to the main network (see figure below). These are articles that often fall into the same research area as the main part of the network (since they have the same color) but do not share many of the same citing articles (since they have few shared edges or links in the network). 
This could indicate the presence of disparate subfields within the main fields covered. Satellite clusters could also indicate areas for publication in the future. Here are some satellite clusters of note:<\/p>\n\n\n\n Sorting by Year<\/strong><\/p>\n\n\n\n The visual below represents the network map color-coded by year. There are a lot of citing articles published during 2017 and 2018 that cite older articles. This could indicate a boom in discovery of DataONE articles, since citing articles tend to be most numerous in the years surrounding publication of the cited article.<\/p>\n\n\n\n I also took a dive into the Google Analytics data of the dataone.org website to try to determine the types of websites that are linking to DataONE. Listed below is the breakdown of the main website types:<\/p>\n\n\n\n