Clay Ford: Librarian for Statistics

Subject Liaisons are librarians who focus on specific topics. They have a robust knowledge of library resources and are happy to assist with research and answer questions, large and small!

Today we’re interviewing Clay Ford, who is the Senior Research Data Scientist for Statistics.

Subject Specialties

  • Statistics

Contact: Email | Visit Brown Library i-044

What are some of the specific ways you can help people learning and working in your subject specialty area(s)?


Clay Ford

I manage the UVA Library’s Statistical Consulting Service, StatLab. That means I can help people get up and running with statistical software such as R, Stata, SPSS and SAS. In particular I have a great deal of experience with “data wrangling”, which is manipulating and cleaning data so it is ready for analysis and visualization.

For example, imagine having 20 years of survey data spread across 20 spreadsheets. To visualize change in survey responses over time, we need to combine those 20 data sets into a single data set. That is something best done programmatically as opposed to by hand. I can also help with selecting, implementing and/or interpreting a statistical method when it comes to analyzing data. In our survey, some response values appear to increase over time. How do we quantify the increase? Is the increase real or perhaps due to random chance? Can we build a model to forecast future responses? Is our model any good?
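As a rough illustration of the combining step described above, here is a minimal Python sketch. It is not Clay's own code. The function name `combine_surveys`, the column names, and the tiny in-memory files are all hypothetical stand-ins for the yearly spreadsheets, which are assumed to have been exported as CSV.

```python
import csv
import io


def combine_surveys(files_by_year):
    """Stack per-year survey tables into one list of rows.

    files_by_year maps a year (int) to an open text file of CSV data.
    Each row is tagged with the year of the file it came from.
    """
    combined = []
    for year in sorted(files_by_year):
        reader = csv.DictReader(files_by_year[year])
        for row in reader:
            row["year"] = year  # record which file the row came from
            combined.append(row)
    return combined


# Hypothetical data standing in for two of the twenty yearly spreadsheets.
example_files = {
    2001: io.StringIO("respondent,score\na,3\nb,4\n"),
    2000: io.StringIO("respondent,score\na,2\nb,5\n"),
}
rows = combine_surveys(example_files)
```

With real files, the same loop would run over all 20 years, and the resulting single table with a `year` column is the shape most plotting and modeling tools expect for looking at change over time.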

Every fall and spring I teach workshops on statistical software and methods. I try my best to make the workshops self-contained so they’re suitable for self-study and reference. See what’s on tap and browse my past workshops.

I also write tutorials on various statistical and data wrangling topics.

If someone comes to you for help, what does that look like?

Since coming to the library in 2013 I have hosted hundreds of consultations. A few examples include helping a Curry faculty member determine sample sizes for experiments, helping a Darden faculty member respond to reviewer comments on the statistical analysis of an article, helping a nursing PhD candidate wrangle air quality data for visualizations, helping a statistics graduate student web scrape figure skating scores, helping a student health staff member analyze student survey data, and helping numerous undergraduates pull together a statistical analysis for their distinguished majors thesis.

Students and faculty will usually email with some questions and ask to meet. I’ll schedule an hour of time and we’ll meet in my office. Occasionally I’ll go to a faculty member’s office if that’s more convenient for providing assistance. If possible, I try to get as much information about their questions in advance. What’s the research question? What kind of questions do you have for me? What have you tried so far? Can you share a small sample of your data? Anything to help me prepare and ensure we hit the ground running and have a productive session.

Sometimes one meeting is all it takes, especially if it’s a technical question such as how to merge two data sets, or how to make a specific tweak to a graph. Other times we’ll continue to meet over and over throughout the semester. This often occurs when someone is working on a big project and they learn I am available as a resource. They’ll use me as a reference as they encounter difficulties or want a second set of eyes to review their statistical analysis.

What are some research challenges you enjoy?

The biggest challenge for me, and one that I am determined to enjoy, is evolving and keeping pace with statistical computing and methodologies. For example, ten years ago a compelling visualization was a static 2-dimensional plot with a bit of color. Now it’s an interactive web-based application. When I finished grad school, I thought SAS was the primary statistical programming language. Now it’s pretty much R and Python. It’s exciting to imagine where everything will be in 10 or 20 years. It’s going to be a challenge to keep up, but one that I embrace.

What’s something surprising you’ve found in the course of your work in this subject area?

Michele Claibourn (Director of Research Data Services) and I started a user group for the R statistical computing language. We knew there was interest in R around grounds, but had no idea if a user group would appeal to anyone. Well, 5 years and 500 members later, we know there is major interest! It’s fascinating to see how a seemingly niche language like R has found its way into such disciplines as education, archaeology, finance, physics and sociology.

What’s a resource you think people in your subject area(s) aren’t very aware of, but would find useful?

That’s easy. Our Licensed Data Sources. Students in statistics classes are often tasked with finding data to carry out an analysis for a project. Their first reaction is to start Googling for data sources. (That was mine when I was a statistics student at UVA!) While that can certainly turn up some free and open source data sets, it won’t get you access to licensed (read: not free) data sources. That’s where the UVA Library comes in. We provide access to a couple of dozen very large data sources spanning several disciplines. Do yourself a favor and browse our collection!

What’s a recent book you’ve read that you’d recommend?

The Power of Habit by Charles Duhigg

What’s a place or an activity you enjoy in Charlottesville?

My wife and I enjoy unwinding and reflecting at Champion Brewery.

Enthusiastic Endorsements…

“I wanted to thank you very much for taking the time to work through with me the data in the two conference papers. You were very patient and very helpful. As a result I have been able to convey the difference in the results, and their overall meaning, with much more confidence and detail than would have been possible without the time you gave to help me understand the issues in the data.”


“I just needed to extend my sincere gratitude and thanks to Clay Ford, for writing the page “Understanding Q-Q Plots” on the University of Virginia Library website. This is the best explanation of Q-Q plot understanding I have seen on the whole internet. It is amazing.”


“[Clay,]… I wanted to tell you that this particular project was finally accepted for publication at a very prestigious psych journal. So, I wanted to reach out to you to thank YOU for all the incredible help that you’ve given me over the past two years. We really couldn’t have done it without you.”


“I really appreciate your receptiveness, time and expertise. I would have been scrambling and panicking had you not been there at some critical moments. So, thank you.”


“Your Reading PDF Files into R [tutorial] was not only accessible, it was also immediately actionable.”

Visit Clay’s staff directory page.

