Frequently Asked Questions
Q1: Who can use the Center for Biostatistics and Health Data Science?
Q2: How is the Center funded?
Q3: What is the difference between statistical consulting and statistical collaboration?
Q4: At what stage of my research should I seek statistical advice?
Q5: How do I request statistical and/or grant proposal assistance for my research?
Q6: What if I have a last-minute request?
Q7: What if I just want a second opinion?
Q8: How long will it take to complete my project?
Q9: Does the Center have any policies or guidelines that will make our collaboration more efficient?
Q10: What should I bring to the first collaborative meeting?
Q11: What is a “Table 1”?
Q12: What should I expect at the first collaborative meeting?
Q13: What should I do if I need to cancel a collaborative meeting?
Q14: What do I need to do before sharing data with the Center? How should I share my data with the Center?
Q15: Should I include my Center collaborator(s) as a co-author(s) on my abstract, poster, or manuscript?
Q16: What are the responsibilities of the Center's biostatistician?
Q17: What are my responsibilities as a client?
Q18: Are there ethical guidelines for statisticians?
Q19: What is the CBHDS policy for sharing code and scripts used for analyses?
Q20: What type of support is offered during your weekly Zoom walk-in hours?
Q21: Are there any guidelines for estimating biostatistician effort and resources on grant proposals?
Q22: I am a Virginia Tech researcher. Do you have any resources related to the new NIH Data Management and Sharing policy effective on January 25, 2023?
Answer: While our primary research partners are Fralin Biomedical Research Institute, Virginia Tech Carilion School of Medicine, Virginia-Maryland College of Veterinary Medicine, and the Virginia Tech College of Science, our services are available to researchers from across the university, health providers, other academic institutions, industry, and governmental agencies.
You can submit your request for support using the link here.
Answer: The Center is funded through a variety of sources, including an NIH Clinical and Translational Science Award (CTSA), Virginia Tech’s College of Science, and the Department of Statistics, as well as external grants, foundations, external contracts, and internal entities (departments, centers, etc.). We welcome contractual partnerships with outside entities, including those from industry, academia, and government.
Answer: A biostatistician serving as a collaborator is an academic biostatistician who works with an investigator to study a research question or research agenda, preferably from study conception through dissemination. A statistician serving as a consultant, on the other hand, provides input on a straightforward statistical question that can be solved in a single meeting lasting for a relatively short period of time, say 30 minutes to an hour. As with many things, there is no clear or definitive marker that distinguishes when general advice transitions to impactful collaboration. We typically use time as the metric for making this distinction.
Center biostatisticians are more than happy to offer brief, general advice on statistical topics that will allow investigators to move forward on a project via our weekly Zoom walk-in hours on Mondays and Wednesdays. For long-term collaborative projects and partnerships, including education and mentoring, the Center can contribute to your work in various ways, including:
- Rigorous Study Design
  - Randomization schemes
  - Power and sample size calculations
- Critical Research Support
  - Grant proposals
  - Data Coordinating Center (DCC) support, including data management, extraction, preparation, sharing
  - Statistical Programming
  - Data Analysis
  - Data Visualization
- Publications and Presentations
  - Abstract preparation
  - Manuscript preparation
  - Poster and Oral presentation support
- Community Building and Education
  - Guest lectures
  - Educational workshops
Answer: Please connect with us as soon as you have arrived at your primary research question. If the Center is included early on in a project, we can help with power and sample size estimation, randomization schemes, data management, data preparation, etc. In short, working with us early in your investigation ensures your best estimate of the sample size needed to demonstrate a clinically meaningful effect, the best study design that minimizes bias, the highest data security, and lead time for budgeting.
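To give a sense of what a sample size estimate involves, here is a minimal sketch in Python. It assumes a two-sided comparison of means between two equally sized groups and uses the normal approximation, which gives a slightly smaller number than an exact t-test calculation. The function name `n_per_group` and its defaults (5% significance level, 80% power) are illustrative assumptions, not the Center's method; a real calculation depends on your design and outcome.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference: the clinically meaningful
    difference divided by the common standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)  # quantile matching the desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Smaller meaningful effects require many more participants per group.
for d in (1.0, 0.5, 0.2):
    print(f"standardized effect {d}: about {n_per_group(d)} per group")
```

Because the required sample size grows with the inverse square of the effect size, the choice of a clinically meaningful effect drives the budget, which is one reason to discuss it early.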
Answer: Please fill out the request for support using the link here.
Answer: Regrettably, our last-minute advice is not our best advice. Even if the problem is relatively straightforward, your problem will not be the only one with a looming deadline, and our workload may prevent us from accommodating your last-minute request. In short, please plan ahead and give us ample time to provide you with the best support possible.
Answer: Most quantitative research questions can be answered using multiple statistical approaches, so it is not uncommon for two statisticians to disagree on primary analytic approaches to the same research question. When you come to the Center for support, please let us know if you are working with another statistician so that we can make sure we do not step on the toes of our friends and colleagues. Ethically and collegially, we are not open to statistician shopping.
Answer: Completing requests for analytic support is often more time-intensive than anticipated. Researchers outside of statistics may, quite understandably, underestimate the time that goes into each project. That said, we generally ask for a two-week lead time for straightforward analyses, and significantly longer for work involving grant preparation and collaboration.
Answer: Yes, we do! Please see the documents here, which give an overview of our collaborative process, data preparation, grant proposals, and manuscript preparation.
Additional guidelines are addressed in Q8 and Q15, and relate to lead time and authorship, respectively.
Answer: Ideally, we would like to see a clear statement of your research question(s) and hypotheses, a brief but detailed explanation of the background theory and previous work, and any data that may already exist. If possible, please bring a completed “Table 1” with you that represents your analytic sample. See Q11: “What is a Table 1” and guidance documents here.
Answer: The first table in a published manuscript typically describes the study sample. This “Table 1” is used to describe the population to which the findings may be generalized. Typically, a “Table 1” will show characteristics (both continuous and categorical data) of groups of participants, where the columns represent a “total” for all participants, along with strata defined by exposure or the comparison groups of interest. The sample statistics included in a “Table 1” typically reflect measures of central tendency and variation (mean and standard deviation, median and interquartile range, range) for continuous measures, along with frequencies and percentages for categorical measures.
For a detailed explanation of a useful “Table 1”, please see the following paper:
Hayes-Larson, Eleanor, Katrina L. Kezios, Stephen J. Mooney, and Gina Lovasi. “Who is in this study anyway? Guidelines for a useful Table 1.” Journal of Clinical Epidemiology Volume 114, October 2019, Pages 125-132. https://www.sciencedirect.com/science/article/abs/pii/S0895435618309867#sec6
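As a rough illustration of the summaries described above, the following Python sketch builds the contents of a small “Table 1” using only the standard library. The participants, variable names (`arm`, `age`, `sex`), and helper functions are hypothetical and invented for this example; in practice a statistical package would usually produce the table.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev

# Hypothetical participants, invented purely for illustration.
participants = [
    {"arm": "treatment", "age": 50, "sex": "F"},
    {"arm": "treatment", "age": 60, "sex": "M"},
    {"arm": "treatment", "age": 70, "sex": "F"},
    {"arm": "control", "age": 40, "sex": "M"},
    {"arm": "control", "age": 50, "sex": "M"},
    {"arm": "control", "age": 60, "sex": "F"},
]


def summarize_continuous(values):
    """Central tendency and variation; needs at least two observations."""
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "median": median(values),
        "iqr": (q1, q3),
        "range": (min(values), max(values)),
    }


def summarize_categorical(values):
    """Map each level to (frequency, percentage)."""
    counts = Counter(values)
    return {level: (n, 100 * n / len(values)) for level, n in sorted(counts.items())}


def table_one(rows, strata_key, continuous, categorical):
    """Summaries for all participants ("total") and for each stratum."""
    strata = {"total": list(rows)}
    for row in rows:
        strata.setdefault(row[strata_key], []).append(row)
    table = {}
    for name, members in strata.items():
        column = {"n": len(members)}
        for var in continuous:
            column[var] = summarize_continuous([m[var] for m in members])
        for var in categorical:
            column[var] = summarize_categorical([m[var] for m in members])
        table[name] = column
    return table


table = table_one(participants, "arm", continuous=["age"], categorical=["sex"])
for name, column in table.items():
    age = column["age"]
    print(f"{name} (n={column['n']}): age mean {age['mean']:.1f} (SD {age['sd']:.1f})")
```

Each key of `table` corresponds to one column of the printed table: a “total” column followed by one column per comparison group.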
Answer: See Q10 for what you should bring to your first collaborative meeting with a Center biostatistician. At the initial meeting, our goal is to understand your research question(s) and/or specific statistical needs. This could include, but is not limited to, assistance with a grant proposal, presentation/poster, analysis of an existing dataset, manuscript or conference abstract preparation, or setting up a REDCap database for your study. If your data are available, we will ask for a brief introduction to your dataset, including details of all independent (predictor) and dependent (outcome) variables of interest that align with your specific research questions. We will also discuss our statistical analytic plan for your work, deliverables (e.g., tables, figures, and summary documents), and the specific timeline for your study. Should your work lead to submitting a manuscript, authorship will be discussed. Please see Q15 for details about authorship. If you need assistance with a grant proposal, we will discuss salary to support Center statisticians should your grant be funded. Please refer to the guideline document Grant Proposals for details regarding recommended minimum effort for grant proposals.
Answer: Please let all relevant parties know as soon as you can, preferably at least 24 hours before the scheduled meeting. We will reciprocate the same courtesy, as we all strive to be respectful of one another’s time and maintain solid and trustful collaborative partnerships.
Answer: It is important to realize that no data should ever be shared without IRB approval (if applicable). For human research studies, you will need to add the Center's biostatisticians to the IRB study protocol before sharing your data.
Please ensure that your dataset is free of all protected health information (PHI) prior to sharing it with us. PHI may include: patient name, date of birth, phone number, address, email address, medical record number, health plan number, social security number, or other unique identifying number, characteristic or code. For a full listing of PHI, please refer to What is Considered Protected Health Information Under HIPAA?
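As a first pass before sharing, it can help to scan your column names for likely identifiers. The Python sketch below is a hypothetical helper, not a Center tool: the patterns in `PHI_NAME_PATTERNS` are assumptions loosely based on the examples listed above and are not a complete HIPAA list. It only inspects column names, so it cannot detect identifiers stored in free-text fields or under unrelated names, and it does not replace a manual review.

```python
import re

# Hypothetical name fragments based on the identifier examples above.
# This is NOT a complete list of HIPAA identifiers.
PHI_NAME_PATTERNS = [
    r"name",
    r"dob|birth",
    r"phone",
    r"address|street|zip",
    r"e[-_]?mail",
    r"mrn|medical[-_ ]?record",
    r"health[-_ ]?plan",
    r"ssn|social[-_ ]?security",
]


def flag_possible_phi(column_names):
    """Return the column names that match any PHI-like pattern."""
    pattern = re.compile("|".join(PHI_NAME_PATTERNS), re.IGNORECASE)
    return [name for name in column_names if pattern.search(name)]


# Example with made-up column names.
columns = ["study_id", "Patient_Name", "DOB", "sbp_mmhg", "Email"]
print(flag_possible_phi(columns))
```

The matching is deliberately broad, so expect some false positives (for example, a column called `filename`); flagged columns are prompts for review rather than a verdict.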
After approval to share the data has been given, please deliver the following files to our team:
1. The raw data set.
2. A tidy data set.
3. A codebook describing each variable and its values in the tidy data set.
4. The exact steps that were taken to get from the raw data (1) to the tidy data (2).
For a more detailed description of the files listed above, please visit: https://blogs.biomedcentral.com/bmcblog/2013/11/26/how-to-share-data-with-a-statistician/
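To make the codebook in item 3 more concrete, here is a minimal Python sketch that drafts a codebook skeleton from a tidy CSV file. The function `draft_codebook`, the example variables, and the data are all hypothetical; the descriptions still need to be written by someone who knows how each variable was measured.

```python
import csv
import io


def draft_codebook(csv_text: str, max_levels: int = 10) -> list:
    """Draft one codebook entry per column of a tidy CSV (one row per observation)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames or []:
        values = [(row[name] or "").strip() for row in rows]
        present = [v for v in values if v != ""]
        entry = {
            "variable": name,
            "n_missing": len(values) - len(present),
            "description": "TODO: units, coding, how measured",
        }
        if not present:
            entry["type"], entry["values"] = "empty", ""
        else:
            try:
                numbers = [float(v) for v in present]
                entry["type"] = "numeric"
                entry["values"] = f"{min(numbers):g} to {max(numbers):g}"
            except ValueError:
                # At least one value is not a number: treat as labels or text.
                levels = sorted(set(present))
                few = len(levels) <= max_levels
                entry["type"] = "categorical" if few else "text"
                entry["values"] = ", ".join(levels) if few else ""
        codebook.append(entry)
    return codebook


# Made-up tidy data set with one missing blood pressure value.
example = """participant_id,arm,age_years,sbp_mmhg
P001,control,54,128
P002,treatment,61,
P003,treatment,47,135
"""
codebook = draft_codebook(example)
for entry in codebook:
    print(entry)
```

Putting units in the variable names (for example, `age_years`) and reporting the number of missing values per variable are design choices in this sketch that tend to reduce back-and-forth later.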
Data files should be shared with the Center's team using one of the following acceptable formats:
- REDCap (.xml)
- Excel (.xls or .xlsx)
- comma-separated values (CSV) file (.csv)
- SAS (.sas7bdat)
- SPSS (.sav)
- Stata (.dta)
Please contact your biostatistician for guidance when using any other format. Information about data sharing and storage at Virginia Tech can be found on the university’s Research Data Management Guide website: https://guides.lib.vt.edu/RDM/storage.
Answer: Authorship is generally expected whenever the statistician contributes substantive input on the design or analysis. The Center follows the International Committee of Medical Journal Editors (ICMJE) recommendations for determining authorship. The ICMJE has set the following standards:
1. Substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data;
2. Drafting the article or revising it critically for important intellectual content; and
3. Final approval of the version to be published.
Authors should meet all of these conditions.
Often a statistical collaborator will be involved in all three levels of manuscript creation, and when this occurs, co-authorship is appropriate. Please note that if the Center biostatistician has made substantial contributions, second or last authorship is appreciated. Second authorship is a metric for promotion for a junior collaborative biostatistician, while last authorship is a metric for leadership and advising, more appropriate for a senior biostatistician.
Answer: For the Center's biostatisticians who are funded directly under NIH (or other granting institution) grants or retained contractually, the biostatistician is expected to provide a level of involvement commensurate with that agreed upon at the start of the relationship. A collaborating biostatistician is ethically bound to be honest about what needs to be done to successfully complete the study, to make every effort to fulfill any agreements about their role, and to acknowledge any limitations to expertise that can affect their ability to provide deliverables. The Center's biostatisticians are expected to take the necessary time to learn about the project and science, after which proper advice can be given on how best to carry out the research.
The Center's biostatisticians should be aware of all ethical and regulatory constraints, such as human subjects’ protection or financial privacy laws, and verify that no aspect of the study violates these.
Center biostatisticians are expected to explain statistical concepts and methods, including practical guidance on how they are implemented and interpreted, in a way that is understandable to those without statistical expertise. When a Center biostatistician performs an analysis, a summary report will be provided that includes details of the problem, coding, analytic methods, results, and what can be concluded about the available data. Additionally, the biostatistician will articulate underlying assumptions of the methods used and limitations of the findings.
Center biostatisticians will discuss authorship, timelines, effort, and deliverables at the onset of any collaborative project. They should disclose potential (financial and other) conflicts of interest and resolve them.
For more details, please review When You Consult a Statistician…What to Expect.
Answer: During the initial meeting with a Center biostatistician, please communicate any deadlines, along with the desired timeline for deliverables, and discuss authorship and effort expectations for budgeting. Additionally, please let us know if you are working with another statistician so that we can make sure not to step on the toes of our friends and colleagues. Ethically and collegially, we are not open to statistician shopping (see Q7).
Communication of the research question(s) is critical for ensuring that a good solution is provided for the right question. It is the responsibility of the client to make sure that the biostatistician has a solid understanding of the objectives of the project by providing them with relevant background information. It is helpful to ask for teach-back from the biostatistician to gauge whether your description and explanation have been understood. The client also has a responsibility to be complete and accurate in describing how the data were acquired, including any problems that occurred during data collection or deviations from the study protocol. Any type of missing data or procedural error (such as randomizing before baseline tests verified eligibility) should be documented and shared with the biostatistician, as this may have an impact on the conclusions that can be drawn from analysis results. Your biostatistician may offer a valid approach for proceeding despite these issues. When interpreting results, please be open-minded if the data conflict with your prior beliefs. Some of the greatest scientific breakthroughs have resulted from unexpected findings. Be aware that you may not be able to generalize your study results beyond the study population. Your Center biostatistician will help you draw valid conclusions based on your study results.
As a client, please ensure that you are observing all human subject protections, animal rights, and other research regulations. You should take precautions to make sure that the privacy of others is not violated in the material you provide to a biostatistician, including both human subjects and proprietary information. Information that can link data to a specific person, e.g., name, address, employee number, should be removed and the subject identified only by a code that is unique to the study.
Finally, please communicate anything that you wish to be kept confidential or any restrictions on the use of your data without your express permission. Please keep in mind that your biostatistician can provide confidentiality only within limits of the law (generally he/she cannot assure privacy and confidentiality from legal processes of discovery).
For more details, please refer to When You Consult a Statistician…What to Expect.
Answer: A biostatistician should adhere to professional and scientific ethics, which promote the integrity of the data analysis and conclusions.
Sometimes results of a valid statistical analysis will not conform with the expectations of the client. Applying pressure to your biostatistician to achieve a predetermined outcome may adversely affect the validity of study results as well as the biostatistician’s credibility. A biostatistician with a thorough knowledge and understanding of statistical methods is best equipped to establish and defend valid conclusions from the data and study design, as well as to identify and explain any limitations to the conclusions that can be drawn.
The American Statistical Association (https://www.amstat.org/ASA/Your-Career/Ethical-Guidelines-for-Statistical-Practice.aspx) and International Statistical Institute (https://www.isi-web.org/index.php/about-isi/policies/professional-ethics) have published ethical guidelines for professional statisticians.
For more details, please see When You Consult a Statistician…What to Expect.
Answer: Transparency and replicability depend on open access to the code and scripts used in generating statistical reports and analyses. To contribute to replicable and open research, while ensuring proper credit is given to our biostatisticians, our general policy is that code will be made available to collaborators when the results generated by the code in question are accepted for publication in a peer-reviewed journal. Deviations from this policy are typically a result of working agreements established a priori.
Answer: Our Center holds weekly Zoom walk-in hours on Mondays and Wednesdays offering brief, general advice on statistical topics such as study design, data collection/organization, statistical methodology, interpretation of results, and statistical software. Please note that our Zoom walk-in hours are intended to give researchers sufficient direction so that they can move forward on their research projects, and are not intended for “on-the-fly” statistical support. That is, Center biostatisticians do not perform any sample size/power calculations or data analyses during our Zoom walk-in hours but can advise investigators, students and research assistants on these topics. If you are looking for statistical support for a long-term collaborative project and partnership, please submit a request form here and expect to hear back from us within 24 hours.
Answer: Most granting institutions require the inclusion of a statistician as a co-investigator or key personnel on submitted grant proposals. Collaborating with a biostatistician early on in your proposed study is recommended for a successful and long-term collegial relationship, and ultimately results in improved funding, productivity, and stronger science.
UC Davis and Northwestern University have published annualized biostatistics effort allocation guidelines for collaborators. These documents provide guidance regarding the percent effort and the level of funding to allocate for biostatisticians on research projects based on their experience and the size of the project.
Answer: The National Institutes of Health (NIH) Data Management and Sharing Policy is effective as of January 25, 2023. Under this policy, all NIH grant applications or renewals that generate scientific data require a robust and detailed data management and sharing plan (DMSP) at the time of proposal submission. For more information, please refer to Virginia Tech's University Libraries webpage.