Researcher at 15 - Published at 19!
March 14, 2011
Sohum Shah and a sample of his statistical analysis
How did you spend your high school summers? Find out how one of our Zipperman Scholars students, Sohum Jayesh Shah, spent one of his high school summers and how he came to be a researcher at the age of 15 and published at 19.
What sparked your interest in research at such a young age?
“I took AP Human Geography my sophomore year in high school, which initially sparked my interest in correlating descriptive demographic statistics [from the 2000 Census] with observed data. This was when I was starting to think about colleges, career options, etc.”
How did you find out about this particular opportunity?
“I asked my family and friends if they could help me find some type of experience that would enhance my resume more than a typical high school summer job. I reached out to several people in academia and heard of an opportunity at the Children’s Hospital. Dr. Eric Simoes, an epidemiologist at The Children's Hospital in Denver and the University of Colorado - Denver's Health Sciences Center, offered to let me work in his office for the summer.
The first thing Dr. Simoes said to me was that this was pretty much my own project. He would guide me as needed, but would not direct me. He then summarized a handful of studies he was conducting and asked which one I would be interested in. The one I selected was a study using Geographic Information Systems (GIS) and elevation data to analyze the spread of Respiratory Syncytial Virus (RSV) among age groups.”
How did you know what to do and how to analyze the data?
“When I first started, I would just spend a few hours experimenting and playing with the GIS software (ArcView) to learn its interface and its capabilities. Initially, I wanted to analyze the spread of RSV at a national level, but Dr. Simoes explained that my analysis needed to be compared with statistical analysis in order to be presentable. He introduced me to Liu Xia, a statistician and graduate student, who was assisting with other projects in his office. Liu explained the parameters and controls that needed to be in place for a statistical analysis.”
What steps did you take to complete the study?
“Dr. Simoes obtained hospitalization data from the Colorado State Hospitalization Database for RSV for 10 years. The data was very raw at first. Each record had hundreds of fields, most of which meant nothing to me. The fields that I first isolated were Patient ID # (a unique number assigned to each patient in place of a name), date of admittance, zip code of residence and date of birth (DOB).
Using a combination of SPSS and Microsoft Excel, I manipulated the data as I needed. When the manipulation was beyond my scope of knowledge, I would talk to Liu and explain what I was trying to do. Then, she would write scripts in SAS and send me the output file.
First, I isolated multiple records for the same patient ID# (to avoid counting readmitted patients twice). Then, I used DOB to calculate the patient's age at the time of admittance. Dr. Simoes shared some age trends for RSV, so I analyzed two groups of cases (<1yr and 1-5 yrs) and eliminated all data for patients older than 5yrs. Then, I used date of admittance to assign each record to a Week (1-32) during the RSV season (Nov. 1 - May 31 for our study).
The next step was the most important: I had to do a frequency analysis for each zip code by week. I created a single dBASE IV file for each week with the number of cases in each zip code. In order to make the data more meaningful, I converted the number of cases in each zip code to a rate (per 1,000 in that age group), using data obtained from the census. After importing each database into ArcMap, I could create a visual representation of the number of cases (frequency analysis) that was meaningful even with hundreds of different bins. A map was created for each group and each week of the season for 5 years (1995-2000). All the maps shared a simple color spectrum (white=no cases, red=highest rate).
Dr. Simoes told me risk factors (demographic data) that I could compare visually with the RSV maps. First, I visually analyzed the maps on my own and looked for trends, hot spots of cases, etc. At first, I created PowerPoint presentations, but I soon realized it would be better to see them all at the same time, so I printed out two 25-ft long sheets of paper that had 32 columns (week in the season) and 5 rows (years) of maps. Then, Dr. Simoes joined me in analyzing the trends. After consulting with a few other doctors, we decided on the hypotheses we wanted to test against the statistical analysis. Dr. Simoes summarized the results of the statistical analysis for me and we started writing the paper.”
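For readers curious what the data preparation described above might look like in code, here is a minimal Python sketch. It is not the SPSS, Excel, or SAS work used in the study. The field names, the sample records, and the population figures are invented for illustration. It also makes three assumptions the interview does not spell out: only a patient's first admission is kept, the "1-5 yrs" group means children who have not yet reached their fifth birthday, and weeks are counted in 7-day blocks starting on Nov. 1. That last choice gives 31 weeks per season, so the study's 32-week numbering must have drawn its week boundaries differently.

```python
from collections import Counter
from datetime import date


def age_in_years(dob, admitted):
    """Completed years between date of birth and date of admittance."""
    years = admitted.year - dob.year
    if (admitted.month, admitted.day) < (dob.month, dob.day):
        years -= 1  # birthday not yet reached in the admission year
    return years


def age_group(years):
    """Return the study's age group label, or None for excluded patients."""
    if years < 1:
        return "<1yr"
    if years < 5:  # assumption: "older than 5" starts at the fifth birthday
        return "1-5yrs"
    return None


def season_week(admitted):
    """Week of the Nov. 1 - May 31 season (1-based), or None if off-season."""
    if admitted.month >= 11:
        start = date(admitted.year, 11, 1)
    elif admitted.month <= 5:
        start = date(admitted.year - 1, 11, 1)
    else:
        return None
    return (admitted - start).days // 7 + 1


def weekly_rates(records, population):
    """Cases per 1,000 children, keyed by (zip code, week, age group)."""
    seen = set()
    counts = Counter()
    for rec in sorted(records, key=lambda r: r["admitted"]):
        if rec["patient_id"] in seen:
            continue  # assumption: readmissions are dropped, first stay kept
        seen.add(rec["patient_id"])
        group = age_group(age_in_years(rec["dob"], rec["admitted"]))
        week = season_week(rec["admitted"])
        if group is None or week is None:
            continue
        counts[(rec["zip"], week, group)] += 1
    return {
        (zip_code, week, group): 1000 * n / population[(zip_code, group)]
        for (zip_code, week, group), n in counts.items()
    }


# Invented sample data, for illustration only.
records = [
    {"patient_id": 1, "zip": "80203", "dob": date(1996, 9, 15), "admitted": date(1996, 11, 3)},
    {"patient_id": 1, "zip": "80203", "dob": date(1996, 9, 15), "admitted": date(1996, 12, 20)},
    {"patient_id": 2, "zip": "80203", "dob": date(1996, 6, 1), "admitted": date(1996, 11, 5)},
    {"patient_id": 3, "zip": "80203", "dob": date(1993, 1, 10), "admitted": date(1997, 1, 15)},
    {"patient_id": 4, "zip": "80203", "dob": date(1985, 1, 1), "admitted": date(1997, 2, 1)},
]
population = {("80203", "<1yr"): 400, ("80203", "1-5yrs"): 2000}

rates = weekly_rates(records, population)
for key in sorted(rates):
    print(key, rates[key])
```

Each entry in the result corresponds to one shaded zip code on one weekly map: the key identifies the zip code, week, and age group, and the value is the rate that would set the color.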
What did you learn from this experience?
“I learned much from this experience in terms of critical thinking and analysis. In addition to learning technical skills, I learned what it is like to be in an office environment. I was 15 years old and working in an office with MDs, PhDs and graduate students. It taught me how to communicate with others (especially people my parents' age) in a professional manner. It taught me the value of data and the importance of gleaning actionable information from it. I can see how Information Systems (IS) is important to study in order to be successful in any area of business. I also learned that I had to follow through and see projects to completion. I believe I grew from this experience personally and academically and I am truly grateful that I had this opportunity.”