Don’t neglect your data!

This situation comes up alarmingly often, especially among social scientists:

A student has one year left before they have to submit their PhD thesis. They have written tens of thousands of words. They have read countless articles. They have an outline of their thesis and a few drafts of some of their chapters. They have some good ideas, they have a good knowledge of their field, and they have done all their data collection…

… But they have done zero data analysis. Sometimes, they won't have transcribed their interviews. Some won't have even listened to the interviews since they conducted them (perhaps several years ago).

This leaves them in a very precarious situation, with only a few months remaining, not knowing whether there is anything useful in their data, and not knowing how to find out. They have to learn how to do the analysis for the first time under enormous pressure, with no opportunity to redo any of the practical work should there be a problem with the data (or to further investigate anything interesting).

It's a nightmare situation, but so easily avoidable if you learn how to do data analysis as early as possible. This includes:

  • Learning how to use the software you need (e.g. NVivo)

  • Putting data in an appropriate format (transcription)

  • Basic analytical techniques (coding)

You don't need a full data set to get started. It doesn't even need to be real data. Starting early means that you’ll be more comfortable with the analysis once you do have a full data set, and having an understanding of the analytical process will help you get better quality data and better understand the literature.

Don't neglect your data, and don't treat the analysis as something you can throw together at the end.

What to do if you have neglected your data until now

First, make sure you know where the data is, then start on whatever formatting needs to be done.

For example, if you have audio recordings of interviews, these will probably need to be transcribed. Many underestimate how long this takes, so start immediately.

Once you have one transcribed file, that's enough to load into whatever software you are using so you can play around with the basics of analysis. If you know someone who has used the software before, ask them nicely if they can show you what their process is for analyzing data. If you don't know anyone, find some online tutorials to get you started.

You must then get all your data into a usable state. Until this is done, you don't really have anything to work with. It's time-consuming and can be tedious, but it has to be done. Try to put together a checklist so you have a consistent process to follow. Take note of where you save every file, and ALWAYS keep an unaltered copy of the original raw data.
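If you are comfortable running a small script, the "unaltered copy" rule can be partly automated. The following is only a rough sketch in Python, not a required part of the process: the function names and the idea of a checksum manifest are my own illustration, and it assumes your raw files sit in one folder and that the backup folder lives outside it.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_raw_data(raw_dir: Path, backup_dir: Path) -> dict:
    """Copy every file under raw_dir into backup_dir.

    Returns a manifest mapping each relative file path to the checksum
    of its backup copy. Hypothetical helper; backup_dir is assumed to
    be outside raw_dir.
    """
    manifest = {}
    for source in sorted(raw_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(raw_dir)
        target = backup_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():  # never overwrite an existing backup copy
            shutil.copy2(source, target)
        manifest[relative.as_posix()] = file_sha256(target)
    return manifest


def changed_files(raw_dir: Path, manifest: dict) -> list:
    """List files whose working copy is missing or differs from the backup."""
    changed = []
    for name, checksum in manifest.items():
        working_copy = raw_dir / name
        if not working_copy.is_file() or file_sha256(working_copy) != checksum:
            changed.append(name)
    return changed
```

You would call `back_up_raw_data` once, as soon as the recordings or transcripts are collected, and keep the returned manifest somewhere safe. Running `changed_files` later tells you which working files no longer match the originals, so you know what has been edited since the backup was made.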

Only once you have the data in an analyzable form can you start to figure out whether you have anything valuable. The earlier you do this, the better.

Remember:

The goal of a PhD thesis is to present your own original research. Yes, you need to write about existing literature, but this is just to provide context for your work. It does not matter how good your literature review is if you haven’t analyzed your own data. Make this the priority!

James Hayton

Recovering physicist. I used to work in nanoscience before moving on to bigger things. After finishing my PhD in 2007 I completed 2 postdoc contracts before starting to coach PhD students full-time in late 2010. In 2015 I published the book

https://amzn.to/32F4NeW