Potential questions:
Objective 1: Researcher information
- What year did you earn your PhD?
- What professional position do you currently hold (professor, associate professor, researcher, emeritus, postdoctoral scholar)?
- What organization are you currently employed by (academic institution, other organization)?
Objective 2: Project details
- Has the project on which you were the primary investigator officially ended?
*Need to confirm that the project has officially ended and that no more data is currently being gathered. (After random selection of subjects, check the NSF active awards list to make sure the project was not renewed and is therefore still current.) This question may not be necessary.
*Project refers to the project that won the NSF award within the Division of Environmental Biology.
- How many researchers (graduate students, postdocs, collaborating investigators) were involved in and supported by the project?
- How many peer-reviewed publications did this project produce in total?
*This may be found in the NSF database, so it may not need to be asked, or it should be saved for the end.
- What is the size of the project? How many factors and replicates were included in the study?
*Experiments probably contain some form of replication for statistical purposes. It may be important to know how big their project was. This question may be too specific to quantify, as other differences in data types may come up as an issue. If relevant to the interview, this question may be asked.
- How long did the project last, from the start of planning to when you felt it was complete?
*Some projects are continuing grants and have an older start date. Project length varies, which affects how much data is gathered for the project overall.
- What percentage of the project’s term was spent planning? Collecting data? Analyzing data? Writing manuscripts for publication?
- What do you consider the most valuable data generated from your project?
- What do you consider to be the raw data underlying your most valuable data?
*Raw data is defined as data that has been gathered specifically for the purpose of the project. It is not data gathered for general use and available to many (e.g., weather station data, remote sensing data). Raw data is considered to be unmanipulated data gathered from field sites or from instruments for the specified project.
Objective 3: Data generation
- In what format is this raw data stored (notebooks, Excel spreadsheets, relational database, GIF images)?
- How big is the average file?
- If notebook data: How many notebooks did you produce? How many pages does an average notebook contain? What is the size of the average page?
- If Excel spreadsheet: How many Excel files did you produce? How many cells does an average Excel file contain (rows by columns)? How big (in bytes) is the average Excel file?
- If photograph data: How many GIF files did you produce? How big (in bytes) is the average GIF file?
- Is the raw data for this project publicly available?
- If so, where is it stored?
- public database
- home institution database
- organization database/website
- personal website
- What percentage of the total raw data is stored in the repository indicated above?
- Did you use all your raw data in peer-reviewed publications?
- If not, what percentage of the total raw data was used for publications?
- If not, is the raw data that was not used for publication kept or deleted?
- If kept, where is the unused raw data kept?
- Is it publicly available?
- If so, where is it stored?
- What percentage of the unused raw data is stored there?
- If deleted, what percentage of the unused raw data is deleted?
Objective 4: Data curation
- Approximately how much data did you produce in the past year? (Units depend on their data types.)
- How many projects do you run in a given year?
- On how many NSF projects are you listed as a primary investigator?
- In general, do you reuse your data?
- If so, what percentage of data do you reuse in a given year? Over the course of your career?
- Do you use publicly available data from other researchers who are not collaborators on your projects?
- In general, do you share your data?
- If so, in which repositories?
- public database
- home institution database
- organization database/website
- personal website
- In general, what percentage of your data is stored in the repositories indicated above?
- How much data (in GB) do you have on your hard drive?
- What percentage of your hard drive is filled with data?
- On a scale of 1 to 5 (1 = highly value, 5 = do not value at all), how much do you value:
- data management
- data reuse
- data sharing
- How much data does your lab consider unusable for publication, and how much of it is subsequently deleted?
- How much data do you archive?
- What types of data do you archive?
- What is your data management policy regarding used and unused data? etc.